[petsc-dev] DMDAGlobalToNatural errors with Ubuntu:latest; gcc 7 & Open MPI 2.1.1
Fabian.Jakub
Fabian.Jakub at physik.uni-muenchen.de
Tue Jul 30 09:47:05 CDT 2019
Dear PETSc Team,
Our cluster recently switched to Ubuntu 18.04, which ships gcc 7.4 and
(Open MPI) 2.1.1. With this setup I get segfaults and valgrind
errors in DMDAGlobalToNatural.
This shows up in a minimal Fortran example such as the attached
petsc_ex.F90, with the following error:
==22616== Conditional jump or move depends on uninitialised value(s)
==22616== at 0x4FA5CDB: PetscTrMallocDefault (mtr.c:185)
==22616== by 0x4FA4DAC: PetscMallocA (mal.c:413)
==22616== by 0x5090E94: VecScatterSetUp_SF (vscatsf.c:652)
==22616== by 0x50A1104: VecScatterSetUp (vscatfce.c:209)
==22616== by 0x509EE3B: VecScatterCreate (vscreate.c:280)
==22616== by 0x577B48B: DMDAGlobalToNatural_Create (dagtol.c:108)
==22616== by 0x577BB6D: DMDAGlobalToNaturalBegin (dagtol.c:155)
==22616== by 0x5798446: VecView_MPI_DA (gr2.c:720)
==22616== by 0x51BC7D8: VecView (vector.c:574)
==22616== by 0x4F4ECA1: PetscObjectView (destroy.c:90)
==22616== by 0x4F4F05E: PetscObjectViewFromOptions (destroy.c:126)
and consequently wrong results in the natural vec.
I checked the Fortran example to see whether I had forgotten something, but I
also see the same error, i.e. not being valgrind clean, in a pure C PETSc test:
cd $PETSC_DIR/src/dm/examples/tests && make ex14 && mpirun --allow-run-as-root -np 2 valgrind ./ex14
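To make such a run fail loudly instead of relying on reading the log, valgrind's --error-exitcode option can help. The following is only a minimal sketch for a POSIX shell: the helper name run_and_report is hypothetical (not part of PETSc), and it assumes that mpirun passes a non-zero exit status from any rank back to the caller.

```shell
#!/bin/sh
# Hypothetical helper (not part of PETSc): run the given command and
# report whether it exited with status 0.
run_and_report() {
    if "$@"; then
        echo "clean"
    else
        echo "not clean"
    fi
}

# Intended use (assumes mpirun and valgrind are installed and ex14 is built).
# --error-exitcode=1 makes valgrind exit with 1 if it reported any error:
#   run_and_report mpirun -np 2 valgrind -q --error-exitcode=1 ./ex14
```

Note that "clean" here only means the whole command returned 0; a test that itself exits non-zero for other reasons would also be reported as "not clean".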
I then tried various Linux distributions under docker/podman to make sure that
my setup is clean, and it seems to me that this error is confined to the
particular combination of gcc 7.4 and (Open MPI) 2.1.1 from the ubuntu:latest repo.
I tried other images from dockerhub, including:
  gcc:7.4.0               <-- could install neither openmpi nor mpich through
                              apt, but works with --download-openmpi and
                              --download-mpich
  ubuntu:rolling (19.04)  <-- works
  debian:latest & :stable <-- works
  ubuntu:latest (18.04)   <-- fails in case of openmpi, but works with mpich
                              or with PETSc configure --download-openmpi or
                              --download-mpich
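When comparing images like this, it helps to record which Open MPI version each one actually provides. The "(Open MPI) 2.1.1" string quoted above matches the first line that `mpirun --version` prints for Open MPI. A small sketch for extracting the number follows; the helper name ompi_version is hypothetical, and the exact first-line format is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: extract the version number from a line such as
# "mpirun (Open MPI) 2.1.1" (assumed first line of `mpirun --version`
# for Open MPI). Prints nothing if the line does not match.
ompi_version() {
    printf '%s\n' "$1" | sed -n 's/^.*(Open MPI) \([0-9][0-9.]*\).*$/\1/p'
}

# Intended use inside a container (assumes mpirun is on the PATH):
#   ompi_version "$(mpirun --version 2>&1 | head -n 1)"
```

An empty result then indicates a different MPI implementation (for example mpich) or an unexpected output format.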
Is this error with (Open MPI) 2.1.1 a known issue? In the meantime, I
guess I'll go with a custom MPI install, but given that ubuntu:latest is
widely used, do you think there is an easy solution to the error?
I guess you are not eager to delve into this issue with old MPI versions,
but in case you find some spare time, maybe you can find the root cause
and/or a workaround.
Many thanks,
Fabian
-------------- next part --------------
include ${PETSC_DIR}/lib/petsc/conf/variables
include ${PETSC_DIR}/lib/petsc/conf/rules
run:: petsc_ex
	mpirun -np 9 valgrind ./petsc_ex -show_gVec

petsc_ex:: petsc_ex.F90
	${PETSC_FCOMPILE} -c petsc_ex.F90 -o petsc_ex.o
	${FLINKER} petsc_ex.o -o petsc_ex ${PETSC_LIB}

clean::
	rm -rf *.o petsc_ex
-------------- next part --------------
A non-text attachment was scrubbed...
Name: petsc_ex.F90
Type: text/x-fortran
Size: 1117 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20190730/9a0b2fe6/attachment.bin>
-------------- next part --------------
# Dockerfile to reproduce valgrind errors
# in the PETSc Example src/dm/examples/tests/ex14
# with the Ubuntu:latest (18.04) openmpi (2.1.1)
#
# invoked via: podman build -f Dockerfile.ubuntu_latest.test_petsc -t test_petsc_ex14_ubuntu_latest
FROM ubuntu:latest
#FROM ubuntu:rolling
RUN apt-get update && \
    apt-get install -fy cmake gfortran git libopenblas-dev libopenmpi-dev openmpi-bin python valgrind && \
    apt-get autoremove && apt-get clean

RUN cd $HOME && \
    echo "export PETSC_DIR=$HOME/petsc" >> $HOME/.profile && \
    echo "export PETSC_ARCH=debug" >> $HOME/.profile && \
    . $HOME/.profile && \
    git clone --depth=1 https://bitbucket.org/petsc/petsc -b master $PETSC_DIR && \
    cd $PETSC_DIR && \
    ./configure --with-cc=$(which mpicc) --with-fortran-bindings=0 --with-fc=0 && \
    make

RUN cd $HOME && . $HOME/.profile && \
    cd $PETSC_DIR/src/dm/examples/tests && make ex14 && mpirun --allow-run-as-root -np 2 valgrind ./ex14