[petsc-users] Combining MPIUNI with duplicated comms

Arne Morten Kvarving arne.morten.kvarving at sintef.no
Mon Oct 5 07:13:11 CDT 2015


hi there.

first of all, i'm not 100% sure this is supposed to work. if it is not, feel 
free to tell me just that and i'll find a way to work around it.

in our code we use MPI_Comm_dup for various reasons. this works fine, 
except when i run in serial, i.e., using the MPIUNI wrapper. if the 
PETSC_COMM_SELF communicator is duplicated, nastiness results when 
tearing down the KSP object. while this isn't really fatal as such, it 
is rather annoying.

please consider http://www.math.ntnu.no/~arnemort/bug-self.tar.bz2 which 
is a silly code i wrote to reproduce the issue. a 3x3 matrix and a 
corresponding 1-vector are constructed and the system is solved. 
then the objects are torn down.

the tarball has a cmake buildsystem. the buildsystem relies on the 
standard PETSC_DIR / PETSC_ARCH env variables. if you prefer to build by 
hand, there is no requirement on using the build system. just watch out 
for the define it adds.

the application takes one required command line parameter which should 
be 'self' or 'world'. this controls which comm is duplicated. if I run 
with self, i get

[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: Corrupt argument: http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[0]PETSC ERROR: Inner MPI_Comm does not have expected reference to outer comm
[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.6.2, Oct, 02, 2015
[0]PETSC ERROR: ./bug_comm_self on a linux-gnu-cxx-dbg named akvarium by akva Mon Oct  5 14:05:52 2015
[0]PETSC ERROR: Configure options --with-precision=double --with-scalar-type=real --with-debugging=1 --with-blas-lib=/usr/lib/libblas.a --with-lapack-lib=/usr/lib/liblapack.a --with-64-bit-indices=0 --with-clanguage=c++ --with-mpi=0 --LIBS=-ldl --with-ml=0 --with-shared-libraries=0
[0]PETSC ERROR: #1 Petsc_DelComm_Outer() line 360 in /home/akva/kode/petsc/petsc-3.6.2/src/sys/objects/pinit.c

if i run with world, no error is generated.
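
for readers without the tarball at hand, the self/world switch could be 
sketched roughly as below. this is only an illustration, not the actual 
source from the tarball: the names CommChoice and parse_comm_choice are 
made up, and the commented usage is an assumption about how the result 
would feed MPI_Comm_dup.

```c
#include <string.h>

/* hypothetical names, for illustration only */
typedef enum {
  COMM_CHOICE_INVALID = 0,
  COMM_CHOICE_SELF,
  COMM_CHOICE_WORLD
} CommChoice;

/* map the required command line parameter to a communicator choice;
   anything other than exactly "self" or "world" is rejected */
CommChoice parse_comm_choice(const char *arg)
{
  if (arg == NULL) return COMM_CHOICE_INVALID;
  if (strcmp(arg, "self") == 0) return COMM_CHOICE_SELF;
  if (strcmp(arg, "world") == 0) return COMM_CHOICE_WORLD;
  return COMM_CHOICE_INVALID;
}

/*
 * assumed use inside main(), after PetscInitialize() (not compiled here):
 *
 *   MPI_Comm base = (choice == COMM_CHOICE_SELF) ? PETSC_COMM_SELF
 *                                                : PETSC_COMM_WORLD;
 *   MPI_Comm dup;
 *   MPI_Comm_dup(base, &dup);
 *   ... create the matrix, vector and KSP on dup, solve, destroy them ...
 *   MPI_Comm_free(&dup);
 */
```

with --with-mpi=0, as in the configure options above, the MPI_Comm_dup 
call in such a program goes through the MPIUNI wrapper rather than a 
real MPI library.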

this is with the recently released 3.6.2, but i can reproduce at least 
back to 3.5.2 (haven't tried older).

i also noticed that if i add a VecView call, errors are also 
generated with world. this line is commented out in the source code. 
that might be valuable info for those who know the system.

since i made it reproducible through the test app, i skipped logs for 
now. if they or some other info is required from my side, i'll gladly 
provide it.

thanks in advance

arnem

