[petsc-users] MPI error for large number of processes and subcomms

Junchao Zhang junchao.zhang at gmail.com
Fri Apr 17 10:09:56 CDT 2020


On Thu, Apr 16, 2020 at 11:13 PM Junchao Zhang <junchao.zhang at gmail.com>
wrote:

> Randy,
>   I reproduced your error with petsc-3.12.4 and 5120 MPI ranks. I also
> found that the error went away with petsc-3.13. However, I have not
> figured out what the bug is or which commit fixed it :).
>   So on your side, it is better to use the latest petsc.
>
I want to add that even with petsc-3.12.4 the error is random. I was
only able to reproduce it once, so I cannot claim that petsc-3.13
actually fixed it (or that the bug is really in petsc).


> --Junchao Zhang
>
>
> On Thu, Apr 16, 2020 at 9:06 PM Junchao Zhang <junchao.zhang at gmail.com>
> wrote:
>
>> Randy,
>>   Up to now I have not been able to reproduce your error, even with the
>> largest run:
>> mpirun -n 5120 ./test -nsubs 320 -nx 100 -ny 100 -nz 100
>>   While I continue testing, you can try some other options. It looks like
>> you want to duplicate a vector to the subcomms. I don't think you need
>> these two lines:
>>
>> call AOApplicationToPetsc(aoParent,nis,ind1,ierr)
>> call AOApplicationToPetsc(aoSub,nis,ind2,ierr)
>>
>>  In addition, you can use simpler and more memory-efficient index sets.
>> There is a petsc example for this task, see case 3 in
>> https://gitlab.com/petsc/petsc/-/blob/master/src/vec/vscat/tests/ex9.c
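>>
>> To make the idea concrete, here is a minimal sketch of that kind of
>> approach (this is only my rough illustration, not a copy of case 3 of
>> ex9.c; the global size, the number of subcomms, and the way the world
>> comm is split below are placeholders):
>>
>> #include <petscvec.h>
>>
>> int main(int argc,char **argv)
>> {
>>   PetscErrorCode ierr;
>>   PetscMPIInt    rank,size,nsubs = 4;
>>   PetscInt       N = 1000000,ystart,yend;
>>   MPI_Comm       subcomm;
>>   Vec            x,y;
>>   IS             ix,iy;
>>   VecScatter     vscat;
>>
>>   ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;
>>   ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr);
>>   ierr = MPI_Comm_size(PETSC_COMM_WORLD,&size);CHKERRQ(ierr);
>>
>>   /* split the world into nsubs contiguous subcomms (assumes size % nsubs == 0) */
>>   ierr = MPI_Comm_split(PETSC_COMM_WORLD,rank/(size/nsubs),0,&subcomm);CHKERRQ(ierr);
>>
>>   /* x lives on the world comm; y is its copy on the subcomm (same global size) */
>>   ierr = VecCreateMPI(PETSC_COMM_WORLD,PETSC_DECIDE,N,&x);CHKERRQ(ierr);
>>   ierr = VecCreateMPI(subcomm,PETSC_DECIDE,N,&y);CHKERRQ(ierr);
>>   ierr = VecSet(x,3.14);CHKERRQ(ierr);
>>   ierr = VecGetOwnershipRange(y,&ystart,&yend);CHKERRQ(ierr);
>>
>>   /* strided index sets: each rank pulls exactly the global entries it owns in y */
>>   ierr = ISCreateStride(PETSC_COMM_SELF,yend-ystart,ystart,1,&ix);CHKERRQ(ierr);
>>   ierr = ISCreateStride(PETSC_COMM_SELF,yend-ystart,ystart,1,&iy);CHKERRQ(ierr);
>>
>>   ierr = VecScatterCreate(x,ix,y,iy,&vscat);CHKERRQ(ierr);
>>   ierr = VecScatterBegin(vscat,x,y,INSERT_VALUES,SCATTER_FORWARD);CHKERRQ(ierr);
>>   ierr = VecScatterEnd(vscat,x,y,INSERT_VALUES,SCATTER_FORWARD);CHKERRQ(ierr);
>>
>>   ierr = VecScatterDestroy(&vscat);CHKERRQ(ierr);
>>   ierr = ISDestroy(&ix);CHKERRQ(ierr);
>>   ierr = ISDestroy(&iy);CHKERRQ(ierr);
>>   ierr = VecDestroy(&x);CHKERRQ(ierr);
>>   ierr = VecDestroy(&y);CHKERRQ(ierr);
>>   ierr = MPI_Comm_free(&subcomm);CHKERRQ(ierr);
>>   ierr = PetscFinalize();
>>   return ierr;
>> }
>>
>> The stride index sets only store a start and a stride rather than an
>> explicit array of indices, which is what I mean by more memory-efficient.
>>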
>>  BTW, it is good to use petsc master so we are on the same page.
>> --Junchao Zhang
>>
>>
>> On Wed, Apr 15, 2020 at 10:28 AM Randall Mackie <rlmackie862 at gmail.com>
>> wrote:
>>
>>> Hi Junchao,
>>>
>>> So I was able to create a small test code that duplicates the issue we
>>> have been having, and it is attached to this email in a zip file.
>>> Included are the test.F90 code, the commands to reproduce the crash and a
>>> successful run, the output errors, and our petsc configuration.
>>>
>>> Our findings to date include:
>>>
>>> - The error is reproducible in a very short time with this script.
>>> - It is related to nproc*nsubs and (although to a lesser extent) to the DM
>>> grid size.
>>> - It happens regardless of MPI implementation (mpich, intel mpi 2018, 2019,
>>> openmpi) or compiler (gfortran/gcc, intel 2018).
>>> - Changing vecscatter_type to mpi1 or mpi3 has no effect; mpi1 seems to
>>> raise the limit slightly, but it still fails on the full machine set.
>>> - Nothing interesting shows up under valgrind.
>>>
>>> Our initial tests were carried out on an Azure cluster, but we also
>>> tested on our smaller cluster, and we found the following:
>>>
>>> Works:
>>> $PETSC_DIR/lib/petsc/bin/petscmpiexec -n 1280 -hostfile hostfile ./test
>>> -nsubs 80 -nx 100 -ny 100 -nz 100
>>>
>>> Crashes (this works on Azure)
>>> $PETSC_DIR/lib/petsc/bin/petscmpiexec -n 2560 -hostfile hostfile ./test
>>> -nsubs 80 -nx 100 -ny 100 -nz 100
>>>
>>> So it looks like it may also be related to the physical number of nodes.
>>>
>>> In any case, even with 2560 processes on 192 cores the memory does not
>>> go above 3.5 Gbytes, so you don’t need a huge cluster to test.
>>>
>>> Thanks,
>>>
>>> Randy M.
>>>
>>>
>>>
>>> On Apr 14, 2020, at 12:23 PM, Junchao Zhang <junchao.zhang at gmail.com>
>>> wrote:
>>>
>>> There is an MPI_Allreduce in PetscGatherNumberOfMessages, which is why I
>>> suspected it might be the problem. Even if users configure petsc with 64-bit
>>> indices, we use PetscMPIInt in MPI calls, so that is not a problem.
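>>>
>>> To make that concrete, here is a small sketch of the general pattern (the
>>> function and variable names are made up for illustration; this is not the
>>> actual PetscGatherNumberOfMessages code): a 64-bit PetscInt count is
>>> narrowed to a PetscMPIInt, with an overflow check, before it reaches MPI:
>>>
>>> #include <petscsys.h>
>>>
>>> /* hypothetical helper: send a buffer of PetscInt whose length is a PetscInt */
>>> PetscErrorCode SendCounts(PetscInt *buf,PetscInt count,PetscMPIInt dest,PetscMPIInt tag,MPI_Comm comm)
>>> {
>>>   PetscErrorCode ierr;
>>>   PetscMPIInt    n;  /* plain C int, which is what MPI expects for counts */
>>>
>>>   PetscFunctionBegin;
>>>   ierr = PetscMPIIntCast(count,&n);CHKERRQ(ierr);   /* errors out if count overflows an int */
>>>   ierr = MPI_Send(buf,n,MPIU_INT,dest,tag,comm);CHKERRQ(ierr);
>>>   PetscFunctionReturn(0);
>>> }
>>>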
>>> Try -vecscatter_type mpi1 to switch back to the original VecScatter
>>> implementation. If the problem still remains, could you provide a test
>>> example for me to debug?
>>>
>>> --Junchao Zhang
>>>
>>>
>>> On Tue, Apr 14, 2020 at 12:13 PM Randall Mackie <rlmackie862 at gmail.com>
>>> wrote:
>>>
>>>> Hi Junchao,
>>>>
>>>> We have tried your two suggestions but the problem remains.
>>>> And the problem seems to be in the MPI_Isend at line 117 of
>>>> PetscGatherMessageLengths, not in MPI_Allreduce.
>>>>
>>>> We have now tried Intel MPI, MPICH, and OpenMPI, so we now think the
>>>> problem must lie elsewhere and not in MPI.
>>>>
>>>> Given that this is a 64-bit indices build of PETSc, is there some
>>>> possible incompatibility between the PETSc and MPI calls?
>>>>
>>>> We are open to any other suggestions to try; other than running
>>>> valgrind on thousands of processes, we seem to have run out of ideas.
>>>>
>>>> Thanks, Randy M.
>>>>
>>>> On Apr 13, 2020, at 8:54 AM, Junchao Zhang <junchao.zhang at gmail.com>
>>>> wrote:
>>>>
>>>>
>>>> --Junchao Zhang
>>>>
>>>>
>>>> On Mon, Apr 13, 2020 at 10:53 AM Junchao Zhang <junchao.zhang at gmail.com>
>>>> wrote:
>>>>
>>>>> Randy,
>>>>>    Someone reported a similar problem before. It turned out to be an Intel
>>>>> MPI MPI_Allreduce bug. A workaround is to set the environment variable
>>>>> I_MPI_ADJUST_ALLREDUCE=1.arr
>>>>>
>>>>  Correct:  I_MPI_ADJUST_ALLREDUCE=1
>>>>
>>>>>    But you mentioned mpich also had the error, so maybe the problem is not
>>>>> the same. Still, let's try the workaround first. If it doesn't work, add
>>>>> another petsc option, -build_twosided allreduce, which is a workaround for
>>>>> Intel MPI_Ibarrier bugs we have run into.
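>>>>>    For example, on an Intel MPI system the two workarounds together would
>>>>> look something like the following (the mpirun line is only a placeholder
>>>>> for your actual launch command):
>>>>>
>>>>> export I_MPI_ADJUST_ALLREDUCE=1
>>>>> mpirun -n 5120 ./your_code -build_twosided allreduce
>>>>>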
>>>>>    Thanks.
>>>>> --Junchao Zhang
>>>>>
>>>>>
>>>>> On Mon, Apr 13, 2020 at 10:38 AM Randall Mackie <rlmackie862 at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Dear PETSc users,
>>>>>>
>>>>>> We are trying to understand an issue that has come up in running our
>>>>>> code on a large cloud cluster with a large number of processes and subcomms.
>>>>>> This is code that we use daily on multiple clusters without problems,
>>>>>> and that runs valgrind clean for small test problems.
>>>>>>
>>>>>> The run generates the following messages, but it doesn’t crash; it just
>>>>>> seems to hang, with all processes continuing to show activity:
>>>>>>
>>>>>> [492]PETSC ERROR: #1 PetscGatherMessageLengths() line 117 in
>>>>>> /mnt/home/cgg/PETSc/petsc-3.12.4/src/sys/utils/mpimesg.c
>>>>>> [492]PETSC ERROR: #2 VecScatterSetUp_SF() line 658 in
>>>>>> /mnt/home/cgg/PETSc/petsc-3.12.4/src/vec/vscat/impls/sf/vscatsf.c
>>>>>> [492]PETSC ERROR: #3 VecScatterSetUp() line 209 in
>>>>>> /mnt/home/cgg/PETSc/petsc-3.12.4/src/vec/vscat/interface/vscatfce.c
>>>>>> [492]PETSC ERROR: #4 VecScatterCreate() line 282 in
>>>>>> /mnt/home/cgg/PETSc/petsc-3.12.4/src/vec/vscat/interface/vscreate.c
>>>>>>
>>>>>>
>>>>>> Looking at line 117 in PetscGatherMessageLengths, we find that the
>>>>>> offending statement is the MPI_Isend:
>>>>>>
>>>>>>
>>>>>>   /* Post the Isends with the message length-info */
>>>>>>   for (i=0,j=0; i<size; ++i) {
>>>>>>     if (ilengths[i]) {
>>>>>>       ierr = MPI_Isend((void*)(ilengths+i),1,MPI_INT,i,tag,comm,s_waits+j);CHKERRQ(ierr);
>>>>>>       j++;
>>>>>>     }
>>>>>>   }
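>>>>>>
>>>>>> For context, the overall idea of that routine is roughly the following; this
>>>>>> is a simplified, self-contained sketch we put together to reason about it,
>>>>>> not PETSc's actual code (step 1 tells every rank how many messages to
>>>>>> expect, step 2 exchanges the lengths with nonblocking sends and receives):
>>>>>>
>>>>>> #include <mpi.h>
>>>>>> #include <stdlib.h>
>>>>>>
>>>>>> int main(int argc,char **argv)
>>>>>> {
>>>>>>   int         size,rank,i,nrecvs,nsends = 0;
>>>>>>   int         *ilengths,*iflags,*counts,*rlengths;
>>>>>>   MPI_Request *sreq,*rreq;
>>>>>>
>>>>>>   MPI_Init(&argc,&argv);
>>>>>>   MPI_Comm_size(MPI_COMM_WORLD,&size);
>>>>>>   MPI_Comm_rank(MPI_COMM_WORLD,&rank);
>>>>>>
>>>>>>   ilengths = calloc(size,sizeof(int));    /* entries this rank sends to each rank */
>>>>>>   iflags   = calloc(size,sizeof(int));
>>>>>>   counts   = calloc(size,sizeof(int));
>>>>>>   for (i=0; i<size; i++) ilengths[i] = 1; /* pretend: one entry to every rank */
>>>>>>
>>>>>>   /* step 1: how many messages will each rank receive? */
>>>>>>   for (i=0; i<size; i++) iflags[i] = ilengths[i] ? 1 : 0;
>>>>>>   MPI_Allreduce(iflags,counts,size,MPI_INT,MPI_SUM,MPI_COMM_WORLD);
>>>>>>   nrecvs = counts[rank];
>>>>>>
>>>>>>   /* step 2: exchange the actual lengths with nonblocking point-to-point */
>>>>>>   rlengths = malloc(nrecvs*sizeof(int));
>>>>>>   rreq     = malloc(nrecvs*sizeof(MPI_Request));
>>>>>>   sreq     = malloc(size*sizeof(MPI_Request));
>>>>>>   for (i=0; i<nrecvs; i++) MPI_Irecv(rlengths+i,1,MPI_INT,MPI_ANY_SOURCE,0,MPI_COMM_WORLD,rreq+i);
>>>>>>   for (i=0; i<size; i++) {
>>>>>>     if (ilengths[i]) MPI_Isend(ilengths+i,1,MPI_INT,i,0,MPI_COMM_WORLD,sreq+nsends++);
>>>>>>   }
>>>>>>   MPI_Waitall(nrecvs,rreq,MPI_STATUSES_IGNORE);
>>>>>>   MPI_Waitall(nsends,sreq,MPI_STATUSES_IGNORE);
>>>>>>
>>>>>>   free(ilengths); free(iflags); free(counts);
>>>>>>   free(rlengths); free(rreq); free(sreq);
>>>>>>   MPI_Finalize();
>>>>>>   return 0;
>>>>>> }
>>>>>>
>>>>>> With many processes and subcomms there are a lot of these small sends and
>>>>>> receives in flight at once, which may be part of why the number of
>>>>>> processes and subcomms seems to matter here.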
>>>>>>
>>>>>> We have tried this with Intel MPI 2018, 2019, and mpich, all giving
>>>>>> the same problem.
>>>>>>
>>>>>> We suspect there is some limit being set on this cloud cluster on the
>>>>>> number of file connections or something, but we don’t know.
>>>>>>
>>>>>> Anyone have any ideas? We are sort of grasping at straws at this
>>>>>> point.
>>>>>>
>>>>>> Thanks, Randy M.
>>>>>>
>>>>>
>>>>
>>>