<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Hello,</div>
<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
After tests and discussions with the system administrators, the problem is solved!</div>
<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
It appears that the bug does come from Intel MPI 2019 and all of its updates.</div>
<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
For reasons I do not understand, Intel MPI 2019 seems to produce strange MPI errors when inter-node communication is required on machines that use InfiniBand.</div>
<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Apparently this is a known issue, and I did find forum threads discussing it.</div>
<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
I switched to Intel MPI 2018 Update 3 and the problem went away: the code runs normally on 1024 MPI ranks.</div>
<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Thank you for your attention and your time!</div>
<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Sincerely,</div>
<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Anthony Jourdon<br>
</div>
<div>
<div id="appendonsend"></div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<hr tabindex="-1" style="display:inline-block; width:98%">
<div id="divRplyFwdMsg" dir="ltr"><font style="font-size:11pt" face="Calibri, sans-serif" color="#000000"><b>From:</b> Zhang, Junchao <jczhang@mcs.anl.gov><br>
<b>Sent:</b> Friday, January 24, 2020 16:52<br>
<b>To:</b> Anthony Jourdon <jourdon_anthony@hotmail.fr><br>
<b>Cc:</b> petsc-users@mcs.anl.gov <petsc-users@mcs.anl.gov><br>
<b>Subject:</b> Re: [petsc-users] DMDA Error</font>
<div> </div>
</div>
<div>
<div dir="ltr">Hello, Anthony
<div> I tried petsc-3.8.4 + icc/gcc + Intel MPI 2019 update 5 + optimized/debug build, and ran with 1024 ranks, but I could not reproduce the error. Maybe you can try these:</div>
<div></div>
<div> * Use the latest PETSc with your test example, and run it both with and without -vecscatter_type mpi1, to see whether either run reports useful messages.</div>
<div> * Or, use Intel MPI 2019 update 6 to see if this is an Intel MPI bug.<br>
</div>
<div><br>
</div>
<div>$ cat ex50.c<br>
</div>
<div>#include <petscdm.h><br>
#include <petscdmda.h><br>
<br>
/* Minimal reproducer: create a 3D DMDA with the same global size as the failing run */<br>
int main(int argc,char **argv)<br>
{<br>
PetscErrorCode ierr;<br>
PetscMPIInt size; /* MPI_Comm_size() expects an int, not a PetscInt */<br>
PetscInt X = 1024,Y = 128,Z=512;<br>
//PetscInt X = 512,Y = 64, Z=256;<br>
DM da;<br>
<br>
ierr = PetscInitialize(&argc,&argv,(char*)0,NULL);if (ierr) return ierr;<br>
ierr = MPI_Comm_size(PETSC_COMM_WORLD,&size);CHKERRQ(ierr);<br>
<br>
/* (2X+1) x (2Y+1) x (2Z+1) grid, box stencil, 3 dof per node, stencil width 2 */<br>
ierr = DMDACreate3d(PETSC_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_BOX,2*X+1,2*Y+1,2*Z+1,PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,3,2,NULL,NULL,NULL,&da);CHKERRQ(ierr);<br>
ierr = DMSetFromOptions(da);CHKERRQ(ierr);<br>
ierr = DMSetUp(da);CHKERRQ(ierr);<br>
<br>
ierr = PetscPrintf(PETSC_COMM_WORLD,"Running with %d MPI ranks\n",size);CHKERRQ(ierr);<br>
<br>
ierr = DMDestroy(&da);CHKERRQ(ierr);<br>
ierr = PetscFinalize();<br>
return ierr;<br>
}<br>
</div>
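<div>For a rough idea of how many grid points each rank owns along one direction in this test, the sketch below applies a plain even block split. The helper names local_points and total_points, and the choice of 16 ranks along x, are only illustrative assumptions; the process grid actually picked by PETSC_DECIDE may differ.</div>

```c
#include <assert.h>

/* Hypothetical helper (not a PETSc function): number of grid points owned by
   rank r when M points are split as evenly as possible over m ranks.
   The first (M % m) ranks receive one extra point. */
int local_points(int M, int m, int r)
{
  assert(m > 0 && r >= 0 && r < m);
  return M / m + ((M % m) > r ? 1 : 0);
}

/* Sum of the local sizes over all ranks; this should always equal M. */
int total_points(int M, int m)
{
  int r, sum = 0;
  for (r = 0; r < m; r++) sum += local_points(M, m, r);
  return sum;
}
```

<div>With M = 2*1024+1 = 2049 points and 16 ranks, local_points gives 129 points to rank 0 and 128 points to each of the other ranks.</div>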
<div><br>
</div>
<div>$ ldd ex50</div>
<div> linux-vdso.so.1 => (0x00007ffdbcd43000)<br>
</div>
<div>libpetsc.so.3.8 => /home/jczhang/petsc/linux-intel-opt/lib/libpetsc.so.3.8 (0x00002afd27e51000)<br>
libX11.so.6 => /lib64/libX11.so.6 (0x00002afd2a811000)<br>
libifport.so.5 => /blues/gpfs/home/software/spack-0.10.1/opt/spack/linux-centos7-x86_64/gcc-4.8.5/intel-parallel-studio-cluster.2019.5-zqvneipqa4u52iwlyy5kx4hbsfnspz6g/compilers_and_libraries_2019.5.281/linux/compiler/lib/intel64_lin/libifport.so.5 (0x00002afd2ab4f000)<br>
libmpicxx.so.12 => /blues/gpfs/home/software/spack-0.10.1/opt/spack/linux-centos7-x86_64/gcc-4.8.5/intel-parallel-studio-cluster.2019.5-zqvneipqa4u52iwlyy5kx4hbsfnspz6g/compilers_and_libraries_2019.5.281/linux/mpi/intel64/lib/libmpicxx.so.12 (0x00002afd2ad7d000)<br>
libdl.so.2 => /lib64/libdl.so.2 (0x00002afd2af9d000)<br>
libmpifort.so.12 => /blues/gpfs/home/software/spack-0.10.1/opt/spack/linux-centos7-x86_64/gcc-4.8.5/intel-parallel-studio-cluster.2019.5-zqvneipqa4u52iwlyy5kx4hbsfnspz6g/compilers_and_libraries_2019.5.281/linux/mpi/intel64/lib/libmpifort.so.12 (0x00002afd2b1a1000)<br>
libmpi.so.12 => /blues/gpfs/home/software/spack-0.10.1/opt/spack/linux-centos7-x86_64/gcc-4.8.5/intel-parallel-studio-cluster.2019.5-zqvneipqa4u52iwlyy5kx4hbsfnspz6g/compilers_and_libraries_2019.5.281/linux/mpi/intel64/lib/release/libmpi.so.12 (0x00002afd2b55f000)<br>
librt.so.1 => /lib64/librt.so.1 (0x00002afd2c564000)<br>
libpthread.so.0 => /lib64/libpthread.so.0 (0x00002afd2c76c000)<br>
libimf.so => /blues/gpfs/home/software/spack-0.10.1/opt/spack/linux-centos7-x86_64/gcc-4.8.5/intel-parallel-studio-cluster.2019.5-zqvneipqa4u52iwlyy5kx4hbsfnspz6g/compilers_and_libraries_2019.5.281/linux/compiler/lib/intel64_lin/libimf.so (0x00002afd2c988000)<br>
libsvml.so => /blues/gpfs/home/software/spack-0.10.1/opt/spack/linux-centos7-x86_64/gcc-4.8.5/intel-parallel-studio-cluster.2019.5-zqvneipqa4u52iwlyy5kx4hbsfnspz6g/compilers_and_libraries_2019.5.281/linux/compiler/lib/intel64_lin/libsvml.so (0x00002afd2d00d000)<br>
libirng.so => /blues/gpfs/home/software/spack-0.10.1/opt/spack/linux-centos7-x86_64/gcc-4.8.5/intel-parallel-studio-cluster.2019.5-zqvneipqa4u52iwlyy5kx4hbsfnspz6g/compilers_and_libraries_2019.5.281/linux/compiler/lib/intel64_lin/libirng.so (0x00002afd2ea99000)<br>
libm.so.6 => /lib64/libm.so.6 (0x00002afd2ee04000)<br>
libcilkrts.so.5 => /blues/gpfs/home/software/spack-0.10.1/opt/spack/linux-centos7-x86_64/gcc-4.8.5/intel-parallel-studio-cluster.2019.5-zqvneipqa4u52iwlyy5kx4hbsfnspz6g/compilers_and_libraries_2019.5.281/linux/compiler/lib/intel64_lin/libcilkrts.so.5 (0x00002afd2f106000)<br>
libstdc++.so.6 => /blues/gpfs/home/software/spack-0.10.1/opt/spack/linux-centos7-x86_64/gcc-4.8.5/intel-parallel-studio-cluster.2019.5-zqvneipqa4u52iwlyy5kx4hbsfnspz6g/clck/2019.5/lib/intel64/libstdc++.so.6 (0x00002afd2f343000)<br>
libgcc_s.so.1 => /blues/gpfs/home/software/spack-0.10.1/opt/spack/linux-centos7-x86_64/gcc-4.8.5/intel-parallel-studio-cluster.2019.5-zqvneipqa4u52iwlyy5kx4hbsfnspz6g/clck/2019.5/lib/intel64/libgcc_s.so.1 (0x00002afd2f655000)<br>
libirc.so => /blues/gpfs/home/software/spack-0.10.1/opt/spack/linux-centos7-x86_64/gcc-4.8.5/intel-parallel-studio-cluster.2019.5-zqvneipqa4u52iwlyy5kx4hbsfnspz6g/compilers_and_libraries_2019.5.281/linux/compiler/lib/intel64_lin/libirc.so (0x00002afd2f86b000)<br>
libc.so.6 => /lib64/libc.so.6 (0x00002afd2fadd000)<br>
libintlc.so.5 => /blues/gpfs/home/software/spack-0.10.1/opt/spack/linux-centos7-x86_64/gcc-4.8.5/intel-parallel-studio-cluster.2019.5-zqvneipqa4u52iwlyy5kx4hbsfnspz6g/compilers_and_libraries_2019.5.281/linux/compiler/lib/intel64_lin/libintlc.so.5 (0x00002afd2feaa000)<br>
libxcb.so.1 => /lib64/libxcb.so.1 (0x00002afd3011c000)<br>
/lib64/ld-linux-x86-64.so.2 (0x00002afd27c2d000)<br>
libfabric.so.1 => /blues/gpfs/home/software/spack-0.10.1/opt/spack/linux-centos7-x86_64/gcc-4.8.5/intel-parallel-studio-cluster.2019.5-zqvneipqa4u52iwlyy5kx4hbsfnspz6g/compilers_and_libraries_2019.5.281/linux/mpi/intel64/libfabric/lib/libfabric.so.1 (0x00002afd30344000)<br>
libXau.so.6 => /lib64/libXau.so.6 (0x00002afd3057c000)<br>
</div>
<div><br>
</div>
<div>--Junchao Zhang<br>
</div>
<div><br>
</div>
</div>
<br>
<div class="x_gmail_quote">
<div dir="ltr" class="x_gmail_attr">On Tue, Jan 21, 2020 at 2:25 AM Anthony Jourdon <<a href="mailto:jourdon_anthony@hotmail.fr">jourdon_anthony@hotmail.fr</a>> wrote:<br>
</div>
<blockquote class="x_gmail_quote" style="margin:0px 0px 0px 0.8ex; border-left:1px solid rgb(204,204,204); padding-left:1ex">
<div dir="ltr">
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
Hello,</div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
I wrote a test to try to reproduce the error.</div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
To do so, I modified the file $PETSC_DIR/src/dm/examples/tests/ex35.c.</div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
I have attached the file in case it is needed.</div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
The same error is reproduced with 1024 MPI ranks. I tested two problem sizes, (2*512+1) x (2*64+1) x (2*256+1) and (2*1024+1) x (2*128+1) x (2*512+1), and the error occurred in both cases. The first size is also the one I ran before the OS and MPI updates.</div>
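<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
One thing that can be checked by hand for these sizes is whether the number of unknowns still fits in a 32-bit integer, since PETSc uses 32-bit indices unless it is configured with 64-bit indices. The sketch below does the arithmetic; the helper names are hypothetical, and dof = 3 is an assumption made for illustration.</div>

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of PETSc): number of grid points along one
   direction for the 2*n+1 layout used in the test. */
int64_t grid_points(int64_t n)
{
  assert(n >= 0);
  return 2 * n + 1;
}

/* Hypothetical helper: total number of unknowns, computed in 64-bit
   arithmetic so the product itself cannot overflow for these sizes. */
int64_t total_dofs(int64_t X, int64_t Y, int64_t Z, int64_t dof)
{
  return grid_points(X) * grid_points(Y) * grid_points(Z) * dof;
}

/* Returns 1 if the count fits in a signed 32-bit integer, 0 otherwise. */
int fits_in_int32(int64_t count)
{
  return count >= 0 && count <= INT32_MAX;
}
```

<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
For the larger case, total_dofs(1024, 128, 512, 3) is 1619273475, which is below 2147483647, so under that assumption an overflow of the global index does not explain the error.</div>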
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
I also ran the code with -malloc_debug, but nothing more appeared.<br>
</div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
I have attached the configure command I used to build a debug version of PETSc.</div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
Thank you for your time,</div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
Sincerely,<br>
</div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
Anthony Jourdon<br>
</div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<div>
<div id="x_gmail-m_5197182641261146049appendonsend"></div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<hr style="display:inline-block; width:98%">
<div id="x_gmail-m_5197182641261146049divRplyFwdMsg" dir="ltr"><font style="font-size:11pt" face="Calibri, sans-serif" color="#000000"><b>From:</b> Zhang, Junchao <<a href="mailto:jczhang@mcs.anl.gov" target="_blank">jczhang@mcs.anl.gov</a>><br>
<b>Sent:</b> Thursday, January 16, 2020 16:49<br>
<b>To:</b> Anthony Jourdon <<a href="mailto:jourdon_anthony@hotmail.fr" target="_blank">jourdon_anthony@hotmail.fr</a>><br>
<b>Cc:</b> <a href="mailto:petsc-users@mcs.anl.gov" target="_blank">petsc-users@mcs.anl.gov</a> <<a href="mailto:petsc-users@mcs.anl.gov" target="_blank">petsc-users@mcs.anl.gov</a>><br>
<b>Subject:</b> Re: [petsc-users] DMDA Error</font>
<div> </div>
</div>
<div>
<div dir="ltr">It seems the problem is triggered by DMSetUp. You can write a small test creating the DMDA with the same size as your code, to see if you can reproduce the problem. If yes, it would be much easier for us to debug it.<br clear="all">
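<div>The reported stack stops in PetscGatherMessageLengths(), which is reached while the DMDA sets up its scatter. As far as I understand that step, each rank has to learn which ranks will send to it and how many entries, and in the real code this information is exchanged with MPI point-to-point messages. The serial sketch below only mimics that bookkeeping on a small table, with no MPI calls; the function name and the table layout are made up for illustration and are not PETSc's implementation.</div>

```c
#include <assert.h>

/* Serial sketch, no MPI involved. send_len is an nranks x nranks row-major
   table: send_len[src*nranks + dst] is the number of entries rank src wants
   to send to rank dst (0 means no message).
   Hypothetical helper: fills from_ranks[] and recv_len[] for rank me and
   returns the number of incoming messages. Both output arrays must have
   room for nranks entries. */
int incoming_messages(int nranks, const int *send_len, int me,
                      int *from_ranks, int *recv_len)
{
  int src, nrecvs = 0;
  assert(nranks > 0 && me >= 0 && me < nranks);
  for (src = 0; src < nranks; src++) {
    int len = send_len[src * nranks + me];
    if (len > 0) {
      from_ranks[nrecvs] = src; /* who sends to me */
      recv_len[nrecvs]   = len; /* how many entries */
      nrecvs++;
    }
  }
  return nrecvs;
}
```

<div>In a real run every rank only knows its own row of that table, which is why the lengths have to travel between ranks, and between nodes once the job spans more than one.</div>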
<div>
<div dir="ltr">
<div dir="ltr">--Junchao Zhang</div>
</div>
</div>
<br>
</div>
<br>
<div>
<div dir="ltr">On Thu, Jan 16, 2020 at 7:38 AM Anthony Jourdon <<a href="mailto:jourdon_anthony@hotmail.fr" target="_blank">jourdon_anthony@hotmail.fr</a>> wrote:<br>
</div>
<blockquote style="margin:0px 0px 0px 0.8ex; border-left:1px solid rgb(204,204,204); padding-left:1ex">
<div dir="ltr">
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
</div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<div>
<div>
<div>
<p style="margin-top:0px; margin-bottom:0px">Dear PETSc developers,</p>
<p style="margin-top:0px; margin-bottom:0px"><br>
</p>
<p style="margin-top:0px; margin-bottom:0px">I need assistance with an error.<br>
</p>
<p style="margin-top:0px; margin-bottom:0px"><br>
</p>
<p style="margin-top:0px; margin-bottom:0px">I run a code that uses the DMDA-related functions. I'm using petsc-3.8.4.</p>
<p style="margin-top:0px; margin-bottom:0px"><br>
</p>
<p style="margin-top:0px; margin-bottom:0px">This code used to run very well on a supercomputer running SLES11.</p>
<p style="margin-top:0px; margin-bottom:0px">PETSc was built using the Intel MPI 5.1.3.223 module and Intel MKL version 2016.0.2.181.
<br>
</p>
<p style="margin-top:0px; margin-bottom:0px">The code ran without problems on 1024 and more MPI ranks.</p>
<p style="margin-top:0px; margin-bottom:0px"><br>
</p>
<p style="margin-top:0px; margin-bottom:0px">Recently, the OS of the machine was updated to RHEL7.<br>
</p>
<p style="margin-top:0px; margin-bottom:0px">I rebuilt PETSc using the newly available versions of Intel MPI (2019U5) and MKL (2019.0.5.281); the compilers are at the same version as MKL.<br>
</p>
<p style="margin-top:0px; margin-bottom:0px">Since then, I have run the exact same code on 8, 16, 24, 48, 512 and 1024 MPI ranks.</p>
<p style="margin-top:0px; margin-bottom:0px">Below 1024 MPI ranks there is no problem, but with 1024 ranks an error related to DMDA appears. The first lines of the error stack are pasted here, and the full error stack is attached.</p>
<p style="margin-top:0px; margin-bottom:0px"><br>
</p>
<div style="margin-top:0px; margin-bottom:0px">
<div>
<div>
<p style="margin-top:0px; margin-bottom:0px">[534]PETSC ERROR: #1 PetscGatherMessageLengths() line 120 in /scratch2/dlp/appli_local/SCR/OROGEN/petsc3.8.4_MPI/petsc-3.8.4/src/sys/utils/mpimesg.c</p>
</div>
<div>
<p style="margin-top:0px; margin-bottom:0px">[534]PETSC ERROR: #2 VecScatterCreate_PtoS() line 2288 in /scratch2/dlp/appli_local/SCR/OROGEN/petsc3.8.4_MPI/petsc-3.8.4/src/vec/vec/utils/vpscat.c</p>
</div>
<div>
<p style="margin-top:0px; margin-bottom:0px">[534]PETSC ERROR: #3 VecScatterCreate() line 1462 in /scratch2/dlp/appli_local/SCR/OROGEN/petsc3.8.4_MPI/petsc-3.8.4/src/vec/vec/utils/vscat.c</p>
</div>
<div>
<p style="margin-top:0px; margin-bottom:0px">[534]PETSC ERROR: #4 DMSetUp_DA_3D() line 1042 in /scratch2/dlp/appli_local/SCR/OROGEN/petsc3.8.4_MPI/petsc-3.8.4/src/dm/impls/da/da3.c</p>
</div>
<div>
<p style="margin-top:0px; margin-bottom:0px">[534]PETSC ERROR: #5 DMSetUp_DA() line 25 in /scratch2/dlp/appli_local/SCR/OROGEN/petsc3.8.4_MPI/petsc-3.8.4/src/dm/impls/da/dareg.c</p>
</div>
<div>
<p style="margin-top:0px; margin-bottom:0px">[534]PETSC ERROR: #6 DMSetUp() line 720 in /scratch2/dlp/appli_local/SCR/OROGEN/petsc3.8.4_MPI/petsc-3.8.4/src/dm/interface/dm.c</p>
</div>
</div>
<div>
<p style="margin-top:0px; margin-bottom:0px"> </p>
<p style="margin-top:0px; margin-bottom:0px">Thank you for your time,</p>
<p style="margin-top:0px; margin-bottom:0px">Sincerely,</p>
<p style="margin-top:0px; margin-bottom:0px"><br>
</p>
<p style="margin-top:0px; margin-bottom:0px">Anthony Jourdon<br>
</p>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</body>
</html>