[petsc-dev] VecScatterInitializeForGPU

Dominic Meiser dmeiser at txcorp.com
Wed Jan 22 11:32:08 CST 2014


Hey Paul,

Thanks for providing background on this.

On Wed 22 Jan 2014 10:05:13 AM MST, Paul Mullowney wrote:
>
> Dominic,
> A few years ago, I was trying to minimize the amount of data
> transferred to and from the GPU (for multi-GPU MatMult) by inspecting
> the indices of the data that needed to be messaged to and from the
> device. I would then call gather kernels on the GPU which pulled the
> scattered data into contiguous buffers that could then be transferred
> to the host asynchronously (while the MatMult was occurring).
> VecScatterInitializeForGPU was added in order to build the necessary
> buffers as needed; that was the motivation for its existence.
> An alternative approach is to message the smallest contiguous buffer
> containing all the data with a single cudaMemcpyAsync. This is the
> method currently implemented.
> I never found a case where the former implementation (with a GPU
> gather kernel) performed better than the alternative approach of
> messaging the smallest contiguous buffer. I looked at many, many
> matrices.
> Now, as far as I understand the VecScatter kernels, this method should
> only get called if the transfer is MPI_General (i.e., PtoP, parallel
> to parallel). Other VecScatter methods are called in circumstances
> where the scatter is not MPI_General. That assumption could be wrong,
> though.


I see. I figured there was some logic in place to make sure that this 
function only gets called in cases where the transfer type is 
MPI_General. I'm getting segfaults in this function when the todata and 
fromdata are of different types. This could easily be user error, but 
I'm not sure. Here is an example valgrind error:

==27781== Invalid read of size 8
==27781== at 0x1188080: VecScatterInitializeForGPU (vscatcusp.c:46)
==27781== by 0xEEAE5D: MatMult_MPIAIJCUSPARSE(_p_Mat*, _p_Vec*, _p_Vec*) 
(mpiaijcusparse.cu:108)
==27781== by 0xA20CC3: MatMult (matrix.c:2242)
==27781== by 0x4645E4: main (ex7.c:93)
==27781== Address 0x286305e0 is 1,616 bytes inside a block of size 1,620 
alloc'd
==27781== at 0x4C26548: memalign (vg_replace_malloc.c:727)
==27781== by 0x4654F9: PetscMallocAlign(unsigned long, int, char const*, 
char const*, void**) (mal.c:27)
==27781== by 0xCAEECC: PetscTrMallocDefault(unsigned long, int, char 
const*, char const*, void**) (mtr.c:186)
==27781== by 0x5A5296: VecScatterCreate (vscat.c:1168)
==27781== by 0x9AF3C5: MatSetUpMultiply_MPIAIJ (mmaij.c:116)
==27781== by 0x96F0F0: MatAssemblyEnd_MPIAIJ(_p_Mat*, MatAssemblyType) 
(mpiaij.c:706)
==27781== by 0xA45358: MatAssemblyEnd (matrix.c:4959)
==27781== by 0x464301: main (ex7.c:78)

This was produced by src/ksp/ksp/tutorials/ex7.c. The command line 
options are

./ex7 -mat_type mpiaijcusparse -vec_type cusp

In this particular case the todata is of type VecScatter_Seq_Stride and 
fromdata is of type VecScatter_Seq_General. The complete valgrind log 
(including configure options for petsc) is attached.

Any comments or suggestions are appreciated.
Cheers,
Dominic

>
> -Paul
>
>
> On Wed, Jan 22, 2014 at 9:49 AM, Dominic Meiser <dmeiser at txcorp.com
> <mailto:dmeiser at txcorp.com>> wrote:
>
> Hi,
>
> I'm trying to understand VecScatterInitializeForGPU in
> src/vec/vec/utils/veccusp/vscatcusp.c. I don't understand why
> this function can get away with casting the fromdata and todata in
> the inctx to VecScatter_MPI_General. Don't we need to inspect the
> VecScatterType fields of the todata and fromdata?
>
> Cheers,
> Dominic
>
> -- 
> Dominic Meiser
> Tech-X Corporation
> 5621 Arapahoe Avenue
> Boulder, CO 80303
> USA
> Telephone: 303-996-2036 <tel:303-996-2036>
> Fax: 303-448-7756 <tel:303-448-7756>
> www.txcorp.com <http://www.txcorp.com>
>
>



-- 
Dominic Meiser
Tech-X Corporation
5621 Arapahoe Avenue
Boulder, CO 80303
USA
Telephone: 303-996-2036
Fax: 303-448-7756
www.txcorp.com

-------------- next part --------------
==27786== Memcheck, a memory error detector
==27786== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==27786== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==27786== Command: ./ex7 -mat_type mpiaijcusparse -vec_type cusp
==27786== 
==27786== Syscall param writev(vector[...]) points to uninitialised byte(s)
==27786==    at 0x157E433B: writev (in /lib64/libc-2.12.so)
==27786==    by 0x1490D806: mca_oob_tcp_msg_send_handler (in /scr_ivy/dmeiser/PTSOLVE/openmpi-1.6.1-nodl/lib/libmpi.so.1.0.3)
==27786==    by 0x1490E82C: mca_oob_tcp_peer_send (in /scr_ivy/dmeiser/PTSOLVE/openmpi-1.6.1-nodl/lib/libmpi.so.1.0.3)
==27786==    by 0x14910BEC: mca_oob_tcp_send_nb (in /scr_ivy/dmeiser/PTSOLVE/openmpi-1.6.1-nodl/lib/libmpi.so.1.0.3)
==27786==    by 0x1492A315: orte_rml_oob_send (in /scr_ivy/dmeiser/PTSOLVE/openmpi-1.6.1-nodl/lib/libmpi.so.1.0.3)
==27786==    by 0x1492A55F: orte_rml_oob_send_buffer (in /scr_ivy/dmeiser/PTSOLVE/openmpi-1.6.1-nodl/lib/libmpi.so.1.0.3)
==27786==    by 0x148F8327: modex (in /scr_ivy/dmeiser/PTSOLVE/openmpi-1.6.1-nodl/lib/libmpi.so.1.0.3)
==27786==    by 0x147E7ACA: ompi_mpi_init (in /scr_ivy/dmeiser/PTSOLVE/openmpi-1.6.1-nodl/lib/libmpi.so.1.0.3)
==27786==    by 0x147FE57F: PMPI_Init_thread (in /scr_ivy/dmeiser/PTSOLVE/openmpi-1.6.1-nodl/lib/libmpi.so.1.0.3)
==27786==    by 0x4929D9: PetscInitialize (pinit.c:777)
==27786==    by 0x463B11: main (ex7.c:49)
==27786==  Address 0x16c8efc1 is 161 bytes inside a block of size 256 alloc'd
==27786==    at 0x4C27BE0: realloc (vg_replace_malloc.c:662)
==27786==    by 0x14943AC2: opal_dss_buffer_extend (in /scr_ivy/dmeiser/PTSOLVE/openmpi-1.6.1-nodl/lib/libmpi.so.1.0.3)
==27786==    by 0x14943C84: opal_dss_copy_payload (in /scr_ivy/dmeiser/PTSOLVE/openmpi-1.6.1-nodl/lib/libmpi.so.1.0.3)
==27786==    by 0x148F11D6: orte_grpcomm_base_pack_modex_entries (in /scr_ivy/dmeiser/PTSOLVE/openmpi-1.6.1-nodl/lib/libmpi.so.1.0.3)
==27786==    by 0x148F82DC: modex (in /scr_ivy/dmeiser/PTSOLVE/openmpi-1.6.1-nodl/lib/libmpi.so.1.0.3)
==27786==    by 0x147E7ACA: ompi_mpi_init (in /scr_ivy/dmeiser/PTSOLVE/openmpi-1.6.1-nodl/lib/libmpi.so.1.0.3)
==27786==    by 0x147FE57F: PMPI_Init_thread (in /scr_ivy/dmeiser/PTSOLVE/openmpi-1.6.1-nodl/lib/libmpi.so.1.0.3)
==27786==    by 0x4929D9: PetscInitialize (pinit.c:777)
==27786==    by 0x463B11: main (ex7.c:49)
==27786== 
==27786== Warning: set address range perms: large range [0x800000000, 0x1100000000) (noaccess)
==27786== Warning: set address range perms: large range [0x3aeed000, 0x33aeed000) (noaccess)
==27786== Warning: set address range perms: large range [0x1100000000, 0x1400000000) (noaccess)
==27786== Invalid read of size 8
==27786==    at 0x1188080: VecScatterInitializeForGPU (vscatcusp.c:46)
==27786==    by 0xEEAE5D: MatMult_MPIAIJCUSPARSE(_p_Mat*, _p_Vec*, _p_Vec*) (mpiaijcusparse.cu:108)
==27786==    by 0xA20CC3: MatMult (matrix.c:2242)
==27786==    by 0x4645E4: main (ex7.c:93)
==27786==  Address 0x28634560 is 1,616 bytes inside a block of size 1,620 alloc'd
==27786==    at 0x4C26548: memalign (vg_replace_malloc.c:727)
==27786==    by 0x4654F9: PetscMallocAlign(unsigned long, int, char const*, char const*, void**) (mal.c:27)
==27786==    by 0xCAEECC: PetscTrMallocDefault(unsigned long, int, char const*, char const*, void**) (mtr.c:186)
==27786==    by 0x5A5296: VecScatterCreate (vscat.c:1168)
==27786==    by 0x9AF3C5: MatSetUpMultiply_MPIAIJ (mmaij.c:116)
==27786==    by 0x96F0F0: MatAssemblyEnd_MPIAIJ(_p_Mat*, MatAssemblyType) (mpiaij.c:706)
==27786==    by 0xA45358: MatAssemblyEnd (matrix.c:4959)
==27786==    by 0x464301: main (ex7.c:78)
==27786== 
==27786== Conditional jump or move depends on uninitialised value(s)
==27786==    at 0xCF97DB: PetscAbsScalar(double) (petscmath.h:206)
==27786==    by 0xCF9878: PetscIsInfOrNanScalar (mathinf.c:67)
==27786==    by 0x542479: VecValidValues (rvector.c:32)
==27786==    by 0x1105581: PCApply (precon.c:434)
==27786==    by 0x1131693: KSP_PCApply(_p_KSP*, _p_Vec*, _p_Vec*) (kspimpl.h:227)
==27786==    by 0x1132479: KSPInitialResidual (itres.c:64)
==27786==    by 0xC33E36: KSPSolve_BCGS (bcgs.c:50)
==27786==    by 0xBAB71C: KSPSolve (itfunc.c:432)
==27786==    by 0xA9C0EF: PCApply_BJacobi_Multiblock(_p_PC*, _p_Vec*, _p_Vec*) (bjacobi.c:945)
==27786==    by 0x11057E6: PCApply (precon.c:440)
==27786==    by 0x1131693: KSP_PCApply(_p_KSP*, _p_Vec*, _p_Vec*) (kspimpl.h:227)
==27786==    by 0x1132479: KSPInitialResidual (itres.c:64)
==27786== 
==27786== Conditional jump or move depends on uninitialised value(s)
==27786==    at 0xCF9880: PetscIsInfOrNanScalar (mathinf.c:67)
==27786==    by 0x542479: VecValidValues (rvector.c:32)
==27786==    by 0x1105581: PCApply (precon.c:434)
==27786==    by 0x1131693: KSP_PCApply(_p_KSP*, _p_Vec*, _p_Vec*) (kspimpl.h:227)
==27786==    by 0x1132479: KSPInitialResidual (itres.c:64)
==27786==    by 0xC33E36: KSPSolve_BCGS (bcgs.c:50)
==27786==    by 0xBAB71C: KSPSolve (itfunc.c:432)
==27786==    by 0xA9C0EF: PCApply_BJacobi_Multiblock(_p_PC*, _p_Vec*, _p_Vec*) (bjacobi.c:945)
==27786==    by 0x11057E6: PCApply (precon.c:440)
==27786==    by 0x1131693: KSP_PCApply(_p_KSP*, _p_Vec*, _p_Vec*) (kspimpl.h:227)
==27786==    by 0x1132479: KSPInitialResidual (itres.c:64)
==27786==    by 0xC4E8DF: KSPSolve_GMRES(_p_KSP*) (gmres.c:234)
==27786== 
==27786== Conditional jump or move depends on uninitialised value(s)
==27786==    at 0xCF97DB: PetscAbsScalar(double) (petscmath.h:206)
==27786==    by 0xCF988B: PetscIsInfOrNanScalar (mathinf.c:67)
==27786==    by 0x542479: VecValidValues (rvector.c:32)
==27786==    by 0x1105581: PCApply (precon.c:434)
==27786==    by 0x1131693: KSP_PCApply(_p_KSP*, _p_Vec*, _p_Vec*) (kspimpl.h:227)
==27786==    by 0x1132479: KSPInitialResidual (itres.c:64)
==27786==    by 0xC33E36: KSPSolve_BCGS (bcgs.c:50)
==27786==    by 0xBAB71C: KSPSolve (itfunc.c:432)
==27786==    by 0xA9C0EF: PCApply_BJacobi_Multiblock(_p_PC*, _p_Vec*, _p_Vec*) (bjacobi.c:945)
==27786==    by 0x11057E6: PCApply (precon.c:440)
==27786==    by 0x1131693: KSP_PCApply(_p_KSP*, _p_Vec*, _p_Vec*) (kspimpl.h:227)
==27786==    by 0x1132479: KSPInitialResidual (itres.c:64)
==27786== 
==27786== Conditional jump or move depends on uninitialised value(s)
==27786==    at 0xCF9893: PetscIsInfOrNanScalar (mathinf.c:67)
==27786==    by 0x542479: VecValidValues (rvector.c:32)
==27786==    by 0x1105581: PCApply (precon.c:434)
==27786==    by 0x1131693: KSP_PCApply(_p_KSP*, _p_Vec*, _p_Vec*) (kspimpl.h:227)
==27786==    by 0x1132479: KSPInitialResidual (itres.c:64)
==27786==    by 0xC33E36: KSPSolve_BCGS (bcgs.c:50)
==27786==    by 0xBAB71C: KSPSolve (itfunc.c:432)
==27786==    by 0xA9C0EF: PCApply_BJacobi_Multiblock(_p_PC*, _p_Vec*, _p_Vec*) (bjacobi.c:945)
==27786==    by 0x11057E6: PCApply (precon.c:440)
==27786==    by 0x1131693: KSP_PCApply(_p_KSP*, _p_Vec*, _p_Vec*) (kspimpl.h:227)
==27786==    by 0x1132479: KSPInitialResidual (itres.c:64)
==27786==    by 0xC4E8DF: KSPSolve_GMRES(_p_KSP*) (gmres.c:234)
==27786== 
==27786== Conditional jump or move depends on uninitialised value(s)
==27786==    at 0xCF97DB: PetscAbsScalar(double) (petscmath.h:206)
==27786==    by 0xCF9878: PetscIsInfOrNanScalar (mathinf.c:67)
==27786==    by 0x5424F1: VecValidValues (rvector.c:34)
==27786==    by 0x1105972: PCApply (precon.c:442)
==27786==    by 0x1131693: KSP_PCApply(_p_KSP*, _p_Vec*, _p_Vec*) (kspimpl.h:227)
==27786==    by 0x1132479: KSPInitialResidual (itres.c:64)
==27786==    by 0xC33E36: KSPSolve_BCGS (bcgs.c:50)
==27786==    by 0xBAB71C: KSPSolve (itfunc.c:432)
==27786==    by 0xA9C0EF: PCApply_BJacobi_Multiblock(_p_PC*, _p_Vec*, _p_Vec*) (bjacobi.c:945)
==27786==    by 0x11057E6: PCApply (precon.c:440)
==27786==    by 0x1131693: KSP_PCApply(_p_KSP*, _p_Vec*, _p_Vec*) (kspimpl.h:227)
==27786==    by 0x1132479: KSPInitialResidual (itres.c:64)
==27786== 
==27786== Conditional jump or move depends on uninitialised value(s)
==27786==    at 0xCF9880: PetscIsInfOrNanScalar (mathinf.c:67)
==27786==    by 0x5424F1: VecValidValues (rvector.c:34)
==27786==    by 0x1105972: PCApply (precon.c:442)
==27786==    by 0x1131693: KSP_PCApply(_p_KSP*, _p_Vec*, _p_Vec*) (kspimpl.h:227)
==27786==    by 0x1132479: KSPInitialResidual (itres.c:64)
==27786==    by 0xC33E36: KSPSolve_BCGS (bcgs.c:50)
==27786==    by 0xBAB71C: KSPSolve (itfunc.c:432)
==27786==    by 0xA9C0EF: PCApply_BJacobi_Multiblock(_p_PC*, _p_Vec*, _p_Vec*) (bjacobi.c:945)
==27786==    by 0x11057E6: PCApply (precon.c:440)
==27786==    by 0x1131693: KSP_PCApply(_p_KSP*, _p_Vec*, _p_Vec*) (kspimpl.h:227)
==27786==    by 0x1132479: KSPInitialResidual (itres.c:64)
==27786==    by 0xC4E8DF: KSPSolve_GMRES(_p_KSP*) (gmres.c:234)
==27786== 
==27786== Conditional jump or move depends on uninitialised value(s)
==27786==    at 0xCF97DB: PetscAbsScalar(double) (petscmath.h:206)
==27786==    by 0xCF988B: PetscIsInfOrNanScalar (mathinf.c:67)
==27786==    by 0x5424F1: VecValidValues (rvector.c:34)
==27786==    by 0x1105972: PCApply (precon.c:442)
==27786==    by 0x1131693: KSP_PCApply(_p_KSP*, _p_Vec*, _p_Vec*) (kspimpl.h:227)
==27786==    by 0x1132479: KSPInitialResidual (itres.c:64)
==27786==    by 0xC33E36: KSPSolve_BCGS (bcgs.c:50)
==27786==    by 0xBAB71C: KSPSolve (itfunc.c:432)
==27786==    by 0xA9C0EF: PCApply_BJacobi_Multiblock(_p_PC*, _p_Vec*, _p_Vec*) (bjacobi.c:945)
==27786==    by 0x11057E6: PCApply (precon.c:440)
==27786==    by 0x1131693: KSP_PCApply(_p_KSP*, _p_Vec*, _p_Vec*) (kspimpl.h:227)
==27786==    by 0x1132479: KSPInitialResidual (itres.c:64)
==27786== 
==27786== Conditional jump or move depends on uninitialised value(s)
==27786==    at 0xCF9893: PetscIsInfOrNanScalar (mathinf.c:67)
==27786==    by 0x5424F1: VecValidValues (rvector.c:34)
==27786==    by 0x1105972: PCApply (precon.c:442)
==27786==    by 0x1131693: KSP_PCApply(_p_KSP*, _p_Vec*, _p_Vec*) (kspimpl.h:227)
==27786==    by 0x1132479: KSPInitialResidual (itres.c:64)
==27786==    by 0xC33E36: KSPSolve_BCGS (bcgs.c:50)
==27786==    by 0xBAB71C: KSPSolve (itfunc.c:432)
==27786==    by 0xA9C0EF: PCApply_BJacobi_Multiblock(_p_PC*, _p_Vec*, _p_Vec*) (bjacobi.c:945)
==27786==    by 0x11057E6: PCApply (precon.c:440)
==27786==    by 0x1131693: KSP_PCApply(_p_KSP*, _p_Vec*, _p_Vec*) (kspimpl.h:227)
==27786==    by 0x1132479: KSPInitialResidual (itres.c:64)
==27786==    by 0xC4E8DF: KSPSolve_GMRES(_p_KSP*) (gmres.c:234)
==27786== 
[0]PETSC ERROR: --------------------- Error Message ------------------------------------
[0]PETSC ERROR: Error in external library!
[0]PETSC ERROR: CUSP error 61!
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Petsc Development GIT revision: v3.4.3-2332-g54f71ec  GIT Date: 2014-01-20 14:12:11 -0700
[0]PETSC ERROR: See docs/changes/index.html for recent updates.
[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[0]PETSC ERROR: See docs/index.html for manual pages.
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: ./ex7 on a pargpudbg named ivy.txcorp.com by dmeiser Wed Jan 22 10:23:36 2014
[0]PETSC ERROR: Libraries linked from /scr_ivy/dmeiser/petsc-gpu-dev/build/pargpudbg/lib
[0]PETSC ERROR: Configure run at Tue Jan 21 16:53:42 2014
[0]PETSC ERROR: Configure options --with-cmake=/scr_ivy/dmeiser/PTSOLVE/cmake/bin/cmake --prefix=/scr_ivy/dmeiser/petsc-gpu-dev/build/pargpudbg --with-precision=double --with-scalar-type=real --with-fortran-kernels=1 --with-x=no --with-mpi=yes --with-mpi-dir=/scr_ivy/dmeiser/PTSOLVE/openmpi/ --with-openmp=yes --with-valgrind=1 --with-shared-libraries=0 --with-c-support=yes --with-debugging=yes --with-cuda=1 --with-cuda-dir=/usr/local/cuda --with-cuda-arch=sm_35 --download-txpetscgpu --with-thrust=yes --with-thrust-dir=/usr/local/cuda/include --with-umfpack=yes --download-umfpack --with-mumps=yes --with-superlu=yes --download-superlu=yes --download-mumps=yes --download-scalapack --download-parmetis --download-metis --with-cusp=yes --with-cusp-dir=/scr_ivy/dmeiser/PTSOLVE/cusp/include --CUDAFLAGS="-O3 -I/usr/local/cuda/include   --generate-code arch=compute_20,code=sm_20   --generate-code arch=compute_20,code=sm_21   --generate-code arch=compute_30,code=sm_30   --generate-code arch=compute_35,code=sm_35" --with-clanguage=C++ --CFLAGS="-pipe -fPIC" --CXXFLAGS="-pipe -fPIC" --with-c2html=0 --with-gelus=1 --with-gelus-dir=/scr_ivy/dmeiser/software/gelus
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: VecCUSPAllocateCheck() line 72 in /scr_ivy/dmeiser/petsc/src/vec/vec/impls/seq/seqcusp/veccusp.cu
[0]PETSC ERROR: VecCUSPCopyToGPU() line 96 in /scr_ivy/dmeiser/petsc/src/vec/vec/impls/seq/seqcusp/veccusp.cu
[0]PETSC ERROR: VecCUSPGetArrayReadWrite() line 1946 in /scr_ivy/dmeiser/petsc/src/vec/vec/impls/seq/seqcusp/veccusp.cu
[0]PETSC ERROR: VecAXPBYPCZ_SeqCUSP() line 1507 in /scr_ivy/dmeiser/petsc/src/vec/vec/impls/seq/seqcusp/veccusp.cu
[0]PETSC ERROR: VecAXPBYPCZ() line 726 in /scr_ivy/dmeiser/petsc/src/vec/vec/interface/rvector.c
[0]PETSC ERROR: KSPSolve_BCGS() line 120 in /scr_ivy/dmeiser/petsc/src/ksp/ksp/impls/bcgs/bcgs.c
[0]PETSC ERROR: KSPSolve() line 432 in /scr_ivy/dmeiser/petsc/src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: PCApply_BJacobi_Multiblock() line 945 in /scr_ivy/dmeiser/petsc/src/ksp/pc/impls/bjacobi/bjacobi.c
[0]PETSC ERROR: PCApply() line 440 in /scr_ivy/dmeiser/petsc/src/ksp/pc/interface/precon.c
[0]PETSC ERROR: KSP_PCApply() line 227 in /scr_ivy/dmeiser/petsc/include/petsc-private/kspimpl.h
[0]PETSC ERROR: KSPInitialResidual() line 64 in /scr_ivy/dmeiser/petsc/src/ksp/ksp/interface/itres.c
[0]PETSC ERROR: KSPSolve_GMRES() line 234 in /scr_ivy/dmeiser/petsc/src/ksp/ksp/impls/gmres/gmres.c
[0]PETSC ERROR: KSPSolve() line 432 in /scr_ivy/dmeiser/petsc/src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: main() line 209 in /scr_ivy/dmeiser/petsc/src/ksp/ksp/examples/tutorials/ex7.c
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD 
with errorcode 76.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
==27786== 
==27786== HEAP SUMMARY:
==27786==     in use at exit: 172,193,472 bytes in 211,364 blocks
==27786==   total heap usage: 350,508 allocs, 139,144 frees, 200,031,576 bytes allocated
==27786== 
==27786== LEAK SUMMARY:
==27786==    definitely lost: 954 bytes in 28 blocks
==27786==    indirectly lost: 61 bytes in 7 blocks
==27786==      possibly lost: 2,154,848 bytes in 15,843 blocks
==27786==    still reachable: 170,037,609 bytes in 195,486 blocks
==27786==         suppressed: 0 bytes in 0 blocks
==27786== Rerun with --leak-check=full to see details of leaked memory
==27786== 
==27786== For counts of detected and suppressed errors, rerun with: -v
==27786== Use --track-origins=yes to see where uninitialised values come from
==27786== ERROR SUMMARY: 82 errors from 10 contexts (suppressed: 6 from 6)

