[petsc-users] Combining MPIUNI with duplicated comms
Arne Morten Kvarving
arne.morten.kvarving at sintef.no
Mon Oct 5 07:13:11 CDT 2015
hi there.
first of all, i'm not 100% sure this is supposed to work. if it is not, feel
free to tell me just that and i'll find a way to work around it.
in our code we use MPI_Comm_dup for various reasons. this works fine,
except when i run in serial, i.e., using the MPIUNI wrapper. if the
PETSC_COMM_SELF communicator is duplicated, nastiness results when
tearing down the KSP object. while this isn't really fatal as such, it
is rather annoying.
please consider http://www.math.ntnu.no/~arnemort/bug-self.tar.bz2 which
is a silly code i wrote to reproduce the issue. a 3x3 matrix is
constructed, a corresponding 1-vector and the system is solved.
then the objects are torn down.
the tarball has a cmake buildsystem. the buildsystem relies on the
standard PETSC_DIR / PETSC_ARCH env variables. if you prefer to build by
hand, there is no requirement on using the build system. just watch out
for the define it adds.
the application takes one required command line parameter which should
be 'self' or 'world'. this controls which comm is duplicated. if I run
with self, i get
[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: Corrupt argument: http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[0]PETSC ERROR: Inner MPI_Comm does not have expected reference to outer comm
[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.6.2, Oct, 02, 2015
[0]PETSC ERROR: ./bug_comm_self on a linux-gnu-cxx-dbg named akvarium by akva Mon Oct 5 14:05:52 2015
[0]PETSC ERROR: Configure options --with-precision=double --with-scalar-type=real --with-debugging=1 --with-blas-lib=/usr/lib/libblas.a --with-lapack-lib=/usr/lib/liblapack.a --with-64-bit-indices=0 --with-clanguage=c++ --with-mpi=0 --LIBS=-ldl --with-ml=0 --with-shared-libraries=0
[0]PETSC ERROR: #1 Petsc_DelComm_Outer() line 360 in /home/akva/kode/petsc/petsc-3.6.2/src/sys/objects/pinit.c
if i run with world, no error is generated.
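for readers without the tarball at hand, here is a rough sketch of the pattern described above: pick a communicator from the 'self' / 'world' argument, duplicate it, solve a small system on the duplicate, then tear everything down. this is not the code from the tarball. every name in it (CommChoice, parse_comm_choice, solve_on_duplicated_comm, the SKETCH_WITH_PETSC guard) is made up for illustration, and the matrix and right-hand side values are arbitrary. the PETSc part assumes a 3.6-style API (MatCreateVecs, three-argument KSPSetOperators), a single process, and that PetscInitialize has already been called.

```c
#include <string.h>

/* all names here are hypothetical; this is not the code from the tarball */
typedef enum { COMM_CHOICE_INVALID = 0, COMM_CHOICE_SELF, COMM_CHOICE_WORLD } CommChoice;

/* map the required command line argument to the communicator to duplicate */
CommChoice parse_comm_choice(const char *arg)
{
  if (arg == NULL) return COMM_CHOICE_INVALID;
  if (strcmp(arg, "self") == 0) return COMM_CHOICE_SELF;
  if (strcmp(arg, "world") == 0) return COMM_CHOICE_WORLD;
  return COMM_CHOICE_INVALID;
}

#ifdef SKETCH_WITH_PETSC /* hypothetical guard; NOT the define the build system adds */
#include <petscksp.h>

/* duplicate the chosen comm, solve a 3x3 diagonal system on the duplicate,
   then destroy the objects and free the duplicate.
   assumes PetscInitialize has been called and a single process. */
PetscErrorCode solve_on_duplicated_comm(CommChoice choice)
{
  MPI_Comm       base, comm;
  Mat            A;
  Vec            x, b;
  KSP            ksp;
  PetscInt       i;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  if (choice == COMM_CHOICE_INVALID) SETERRQ(PETSC_COMM_SELF, PETSC_ERR_ARG_WRONG, "expected 'self' or 'world'");
  base = (choice == COMM_CHOICE_SELF) ? PETSC_COMM_SELF : PETSC_COMM_WORLD;
  ierr = MPI_Comm_dup(base, &comm);CHKERRQ(ierr);

  /* 3x3 matrix with arbitrary diagonal entries */
  ierr = MatCreate(comm, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, 3, 3);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = MatSetUp(A);CHKERRQ(ierr);
  for (i = 0; i < 3; i++) {
    ierr = MatSetValue(A, i, i, 2.0, INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  /* matching solution and right-hand side vectors */
  ierr = MatCreateVecs(A, &x, &b);CHKERRQ(ierr);
  ierr = VecSet(b, 1.0);CHKERRQ(ierr);

  ierr = KSPCreate(comm, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);

  /* teardown: PETSc objects first, then the duplicated communicator */
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = MPI_Comm_free(&comm);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
#endif
```

the argument parsing compiles on its own. the PETSc part only compiles when the (hypothetical) SKETCH_WITH_PETSC guard is defined and the PETSc headers are available.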
this is with the recently released 3.6.2, but i can reproduce at least
back to 3.5.2 (haven't tried older).
i also noticed that if i add a VecView call, errors will also be
generated with world. this line is commented out in the source code.
that fact might be valuable info for those who know the system.
since i made it reproducible through the test app, i skipped logs for
now. if they or some other info is required from my side, i'll gladly
provide it.
thanks in advance
arnem