[petsc-users] Lookup error in PetscTableFind()
Luc Berger-Vergiat
lb2653 at columbia.edu
Tue Oct 28 12:30:35 CDT 2014
Ok, I'm recompiling PETSc in debug mode then.
Do you know what the call sequence should be on CETUS to get valgrind
attached to PETSc?
Would this work for example:
runjob --np 32 -p 8 --block $COBALT_PARTNAME --cwd /projects/shearbands/job1/200/4nodes_32cores/LU --verbose=INFO --envs FEAPHOME8_3=/projects/shearbands/ShearBands352 PETSC_DIR=/projects/shearbands/petsc-3.5.2 PETSC_ARCH=arch-linux2-c-opt : /usr/bin/valgrind --log-file=valgrind.log.%p /projects/shearbands/ShearBands352/parfeap/feap -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package mumps -ksp_diagonal_scale < /projects/shearbands/job1/yesfile
Best,
Luc
On 10/28/2014 12:33 PM, Barry Smith wrote:
> Hmm, this should never happen. In the code
>
> ierr = PetscTableCreate(aij->B->rmap->n,mat->cmap->N+1,&gid1_lid1);CHKERRQ(ierr);
> for (i=0; i<aij->B->rmap->n; i++) {
> for (j=0; j<B->ilen[i]; j++) {
> PetscInt data,gid1 = aj[B->i[i] + j] + 1;
> ierr = PetscTableFind(gid1_lid1,gid1,&data);CHKERRQ(ierr);
>
> Now mat->cmap->N+1 is the total number of columns in the matrix and gid1 are column entries which must always be smaller. Most likely there has been memory corruption somewhere before this point. Can you run with valgrind? http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>
> Barry
>
>> On Oct 28, 2014, at 10:04 AM, Luc Berger-Vergiat <lb2653 at columbia.edu> wrote:
>>
>> Hi,
>> I am running a code on CETUS and I use PETSc for as a linear solver.
>> Here is my submission command:
>> qsub -A shearbands -t 60 -n 4 -O 4nodes_32cores_Mult --mode script 4nodes_32cores_LU
>>
>> Here is "4nodes_32cores_LU":
>> #!/bin/sh
>>
>> LOCARGS="--block $COBALT_PARTNAME ${COBALT_CORNER:+--corner} $COBALT_CORNER ${COBALT_SHAPE:+--shape} $COBALT_SHAPE"
>> echo "Cobalt location args: $LOCARGS" >&2
>>
>> ################################
>> # 32 cores on 4 nodes jobs #
>> ################################
>> runjob --np 32 -p 8 --block $COBALT_PARTNAME --cwd /projects/shearbands/job1/200/4nodes_32cores/LU --verbose=INFO --envs FEAPHOME8_3=/projects/shearbands/ShearBands352 PETSC_DIR=/projects/shearbands/petsc-3.5.2 PETSC_ARCH=arch-linux2-c-opt : /projects/shearbands/ShearBands352/parfeap/feap -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package mumps -ksp_diagonal_scale -malloc_log mlog -log_summary time.log < /projects/shearbands/job1/yesfile
>>
>> I get the following error message:
>>
>> [7]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
>> [7]PETSC ERROR: Argument out of range
>> [7]PETSC ERROR: Petsc Release Version 3.5.2, unknown
>> [7]PETSC ERROR: key 532150 is greater than largest key allowed 459888
>> [7]PETSC ERROR: Configure options --known-mpi-int64_t=1 --download-cmake=1 --download-hypre=1 --download-metis=1 --download-parmetis=1 --download-plapack=1 --download-superlu_dist=1 --download-mumps=1 --download-ml=1 --known-bits-per-byte=8 --known-level1-dcache-assoc=0 --known-level1-dcache-linesize=32 --known-level1-dcache-size=32768 --known-memcmp-ok=1 --known-mpi-c-double-complex=1 --known-mpi-long-double=1 --known-mpi-shared-libraries=0 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-sizeof-char=1 --known-sizeof-double=8 --known-sizeof-float=4 --known-sizeof-int=4 --known-sizeof-long-long=8 --known-sizeof-long=8 --known-sizeof-short=2 --known-sizeof-size_t=8 --known-sizeof-void-p=8 --with-batch=1 --with-blacs-include=/soft/libraries/alcf/current/gcc/SCALAPACK/ --with-blacs-lib=/soft/libraries/alcf/current/gcc/SCALAPACK/lib/libscalapack.a --with-blas-lapack-lib="-L/soft/libraries/alcf/current/gcc/LAPACK/lib -llapack -L/soft/libraries/alcf/current/gcc/BLAS/lib -lblas" --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-fortran-kernels=1 --with-is-color-value-type=short --with-scalapack-include=/soft/libraries/alcf/current/gcc/SCALAPACK/ --with-scalapack-lib=/soft/libraries/alcf/current/gcc/SCALAPACK/lib/libscalapack.a --with-shared-libraries=0 --with-x=0 -COPTFLAGS=" -O3 -qhot=level=0 -qsimd=auto -qmaxmem=-1 -qstrict -qstrict_induction" -CXXOPTFLAGS=" -O3 -qhot=level=0 -qsimd=auto -qmaxmem=-1 -qstrict -qstrict_induction" -FOPTFLAGS=" -O3 -qhot=level=0 -qsimd=auto -qmaxmem=-1 -qstrict -qstrict_induction"
>>
>> [7]PETSC ERROR: #1 PetscTableFind() line 126 in /gpfs/mira-fs1/projects/shearbands/petsc-3.5.2/include/petscctable.h
>> [7]PETSC ERROR: #2 MatSetUpMultiply_MPIAIJ() line 33 in /gpfs/mira-fs1/projects/shearbands/petsc-3.5.2/src/mat/impls/aij/mpi/mmaij.c
>> [7]PETSC ERROR: #3 MatAssemblyEnd_MPIAIJ() line 702 in /gpfs/mira-fs1/projects/shearbands/petsc-3.5.2/src/mat/impls/aij/mpi/mpiaij.c
>> [7]PETSC ERROR: #4 MatAssemblyEnd() line 4900 in /gpfs/mira-fs1/projects/shearbands/petsc-3.5.2/src/mat/interface/matrix.c
>>
>> Well at least that is what I think comes out after I read all the jammed up messages from my MPI processes...
>>
>> I would guess that I am trying to allocate more memory than I should which seems strange since the same problem runs fine on 2 nodes with 16 cores/node
>>
>> Thanks for the help
>> Best,
>> Luc
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20141028/47c5a96d/attachment-0001.html>
More information about the petsc-users
mailing list