[petsc-dev] VecScatterBegin_1 with zero sized vectors and PETSC_HAVE_CUSP

Barry Smith bsmith at mcs.anl.gov
Fri Jan 20 16:20:21 CST 2012


On Jan 20, 2012, at 2:32 PM, Jed Brown wrote:

> On Fri, Jan 20, 2012 at 14:27, Barry Smith <bsmith at mcs.anl.gov> wrote:
> 
>   I do not understand the error traceback. It should NOT look like this. Is that really the exact output from a single failed run? There should not be multiple messages of ----Error Message ---- etc. Immediately after the first listing of the Configure options it should show the complete stack where the problem happened; instead it printed an initial error message again, and then again, and only then a stack. This is not supposed to be possible.
> 
> That's the kind of thing that happens if the error is raised on COMM_SELF.

    ???? I don't think so. Note that the entire error set comes from process 17; even with COMM_SELF it is not supposed to print the error message stuff multiple times on the same MPI node.

> Also, is this really supposed to use CHKERRCUSP()?

   No, that is wrong. I fixed it, but then had a nasty merge with Paul's updates to the PETSc GPU stuff. I don't think that caused the grief.

   Stefano,
 
      Anyway, since Paul updated all the cusp stuff, please run hg pull; hg update, rebuild the PETSc library, and try again. If there are still problems, send the entire error output again.

     If a similar thing happens, I'm tempted to ask you to run node 17 in the debugger to see why the error message comes up multiple times: -start_in_debugger -debugger_nodes 17


    Barry

 
> The function uses normal CHKERRQ() inside.
> 
> PetscErrorCode VecCUSPCopyFromGPUSome_Public(Vec v, PetscCUSPIndices ci)
> {
>   PetscErrorCode ierr;
> 
>   PetscFunctionBegin;
>   ierr = VecCUSPCopyFromGPUSome(v,&ci->indicesCPU,&ci->indicesGPU);CHKERRCUSP(ierr);
>   PetscFunctionReturn(0);
> }
> 
>  
> Are you running with multiple threads AND gpus? That won't work.
> 
>   Anyway, I cannot find anywhere a list of Cusp error messages that includes the numbers 46 and 76; why are the exception messages not strings?
> 
> 
>   Barry
> 
> 
> [17]PETSC ERROR: VecCUSPAllocateCheck() line 77 in src/vec/vec/impls/seq/seqcusp//work/adz/zampini/MyWorkingCopyOfPetsc/petsc-dev/include/../src/vec/vec/impls/seq/seqcusp/cuspvecimpl.h
> [17]PETSC ERROR: --------------------- Error Message ------------------------------------
> [17]PETSC ERROR: Error in external library!
> [17]PETSC ERROR: CUSP error 46!
> [17]PETSC ERROR: ------------------------------------------------------------------------
> [17]PETSC ERROR: Petsc Development HG revision:   HG Date:
> [17]PETSC ERROR: See docs/changes/index.html for recent updates.
> [17]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> [17]PETSC ERROR: See docs/index.html for manual pages.
> [17]PETSC ERROR: ------------------------------------------------------------------------
> [17]PETSC ERROR: ./bidomonotest on a gnu-4.4.3 named ella011 by zampini Fri Jan 20 19:01:30 2012
> [17]PETSC ERROR: Libraries linked from /work/adz/zampini/MyWorkingCopyOfPetsc/petsc-dev/gnu-4.4.3-debug-double-louis/lib
> [17]PETSC ERROR: Configure run at Fri Jan 20 15:29:21 2012
> [17]PETSC ERROR: Configure options --CUDAFLAGS=-m64 --with-cuda-dir=/caspur/local/apps/cuda/4.0 --with-cuda-arch=sm_20 --with-cusp-dir=/caspur/shared/gpu-cluster/devel/cusp/0.2/.. --with-thrust-dir=/caspur/local/apps/cuda/4.0/include --with-boost-dir=/caspur/shared/sw/devel/boost/1.44.0/intel/11.1.064 --with-pcbddc=1 --with-make-np=12 --with-debugging=1 --with-errorchecking=1 --with-log=1 --with-info=1 --with-cmake=/work/adz/zampini/cmake/2.8.7/bin/cmake --with-gnu-compilers=1 --with-pthread=1 --with-pthreadclasses=1 --with-precision=double --with-mpi-dir=/caspur/shared/sw/devel/openmpi/1.4.1/gnu/4.4.3 PETSC_DIR=/work/adz/zampini/MyWorkingCopyOfPetsc/petsc-dev PETSC_ARCH=gnu-4.4.3-debug-double-louis --with-shared-libraries=1 --with-c++-support=1 --with-large-file-io=1 --download-hypre=/work/adz/zampini/PetscPlusExternalPackages/hypre-2.7.0b.tar.gz --download-umfpack=/work/adz/zampini/PetscPlusExternalPackages/UMFPACK-5.5.1.tar.gz --download-ml=/work/adz/zampini/PetscPlusExternalPackages/ml-6.2.tar.gz --download-spai=/work/adz/zampini/PetscPlusExternalPackages/spai_3.0.tar.gz --download-metis=1 --download-parmetis=1 --download-chaco=1 --download-scotch=1 --download-party=1 --with-blas-lapack-include=/caspur/shared/sw/devel/acml/4.4.0/gfortran64/include/acml.h --with-blas-lapack-lib=/caspur/shared/sw/devel/acml/4.4.0/gfortran64/lib/libacml.a
> [17]PETSC ERROR: ------------------------------------------------------------------------
> [17]PETSC ERROR: VecCUSPCopyFromGPUSome() line 228 in src/vec/vec/impls/seq/seqcusp/veccusp.cu
> [17]PETSC ERROR: --------------------- Error Message ------------------------------------
> [17]PETSC ERROR: Error in external library!
> [17]PETSC ERROR: CUSP error 76!
> [17]PETSC ERROR: ------------------------------------------------------------------------
> [17]PETSC ERROR: Petsc Development HG revision:   HG Date:
> [17]PETSC ERROR: See docs/changes/index.html for recent updates.
> [17]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> [17]PETSC ERROR: See docs/index.html for manual pages.
> [17]PETSC ERROR: ------------------------------------------------------------------------
> [17]PETSC ERROR: ./bidomonotest on a gnu-4.4.3 named ella011 by zampini Fri Jan 20 19:01:30 2012
> [17]PETSC ERROR: Libraries linked from /work/adz/zampini/MyWorkingCopyOfPetsc/petsc-dev/gnu-4.4.3-debug-double-louis/lib
> [17]PETSC ERROR: Configure run at Fri Jan 20 15:29:21 2012
> [17]PETSC ERROR: Configure options --CUDAFLAGS=-m64 --with-cuda-dir=/caspur/local/apps/cuda/4.0 --with-cuda-arch=sm_20 --with-cusp-dir=/caspur/shared/gpu-cluster/devel/cusp/0.2/.. --with-thrust-dir=/caspur/local/apps/cuda/4.0/include --with-boost-dir=/caspur/shared/sw/devel/boost/1.44.0/intel/11.1.064 --with-pcbddc=1 --with-make-np=12 --with-debugging=1 --with-errorchecking=1 --with-log=1 --with-info=1 --with-cmake=/work/adz/zampini/cmake/2.8.7/bin/cmake --with-gnu-compilers=1 --with-pthread=1 --with-pthreadclasses=1 --with-precision=double --with-mpi-dir=/caspur/shared/sw/devel/openmpi/1.4.1/gnu/4.4.3 PETSC_DIR=/work/adz/zampini/MyWorkingCopyOfPetsc/petsc-dev PETSC_ARCH=gnu-4.4.3-debug-double-louis --with-shared-libraries=1 --with-c++-support=1 --with-large-file-io=1 --download-hypre=/work/adz/zampini/PetscPlusExternalPackages/hypre-2.7.0b.tar.gz --download-umfpack=/work/adz/zampini/PetscPlusExternalPackages/UMFPACK-5.5.1.tar.gz --download-ml=/work/adz/zampini/PetscPlusExternalPackages/ml-6.2.tar.gz --download-spai=/work/adz/zampini/PetscPlusExternalPackages/spai_3.0.tar.gz --download-metis=1 --download-parmetis=1 --download-chaco=1 --download-scotch=1 --download-party=1 --with-blas-lapack-include=/caspur/shared/sw/devel/acml/4.4.0/gfortran64/include/acml.h --with-blas-lapack-lib=/caspur/shared/sw/devel/acml/4.4.0/gfortran64/lib/libacml.a
> [17]PETSC ERROR: ------------------------------------------------------------------------
> [17]PETSC ERROR: VecCUSPCopyFromGPUSome_Public() line 263 in src/vec/vec/impls/seq/seqcusp/veccusp.cu
> [17]PETSC ERROR: VecScatterBegin_1() line 57 in src/vec/vec/utils//work/adz/zampini/MyWorkingCopyOfPetsc/petsc-dev/include/../src/vec/vec/utils/vpscat.h
> [17]PETSC ERROR: VecScatterBegin() line 1574 in src/vec/vec/utils/vscat.c
> [17]PETSC ERROR: PCISSetUp() line 46 in src/ksp/pc/impls/is/pcis.c
> [17]PETSC ERROR: PCSetUp_BDDC() line 230 in src/ksp/pc/impls/bddc/bddc.c
> [17]PETSC ERROR: PCSetUp() line 832 in src/ksp/pc/interface/precon.c
> [17]PETSC ERROR: KSPSetUp() line 261 in src/ksp/ksp/interface/itfunc.c
> [17]PETSC ERROR: PCBDDCSetupCoarseEnvironment() line 2081 in src/ksp/pc/impls/bddc/bddc.c
> [17]PETSC ERROR: PCBDDCCoarseSetUp() line 1341 in src/ksp/pc/impls/bddc/bddc.c
> [17]PETSC ERROR: PCSetUp_BDDC() line 255 in src/ksp/pc/impls/bddc/bddc.c
> [17]PETSC ERROR: PCSetUp() line 832 in src/ksp/pc/interface/precon.c
> [17]PETSC ERROR: KSPSetUp() line 261 in src/ksp/ksp/interface/itfunc.c
> 
> 
> On Jan 20, 2012, at 12:20 PM, Stefano Zampini wrote:
> 
> > Hi, I recently installed petsc-dev on a GPU cluster. I got an error in the external library CUSP when calling PCISSetUp: more precisely, when doing VecScatterBegin on SEQ (not SEQCUSP!) vectors (please see the attached traceback). I'm developing the BDDC preconditioner code inside PETSc, and this error occurred when doing multilevel: in that case some procs (like proc 17 in the attached case) have a local dimension (relevant to PCIS) equal to zero.
> >
> > Thus, I think the real problem lies on line 41 of src/vec/vec/utils/vpscat.h. If you tell me why you used the first condition in the if clause, I can patch the problem.
> >
> > Regards,
> > --
> > Stefano
> > <traceback>