[petsc-users] A bad commit affects MOOSE

Satish Balay balay at mcs.anl.gov
Tue Apr 3 11:35:04 CDT 2018


On Tue, 3 Apr 2018, Satish Balay wrote:

> On Tue, 3 Apr 2018, Derek Gaston wrote:
> 
> > One thing I want to be clear about here is that we're not trying to
> > solve this particular problem (where we're creating 1000 instances of
> > Hypre to precondition each variable independently)... this particular
> > problem is just a test (one that's been in our test suite for a long
> > time) to stress test some of this capability.
> > 
> > We really do have needs for thousands (tens of thousands) of simultaneous
> > solves (each with their own Hypre instances).  That's not what this
> > particular problem is doing - but it is representative of a class of our
> > problems we need to solve.
> > 
> > Which does bring up a point: I have previously run ~50,000 separate
> > PETSc solves without issue.  Is that because I was working with
> > MVAPICH on a cluster?  Does it just have a higher limit?
> 
> Don't know - but that's easy to find out with a simple test code:
> 
> >>>>>>
> $ cat comm_dup_test.c
> #include <mpi.h>
> #include <stdio.h>
> 
> int main(int argc, char** argv) {
>     MPI_Comm newcomm;
>     int i, err;
>     MPI_Init(NULL, NULL);
>     for (i=0; i<100000; i++) {
>       err = MPI_Comm_dup(MPI_COMM_WORLD, &newcomm);
>       if (err) {
>           printf("%5d - fail\n",i);fflush(stdout);
>           break;
>       } else {
>           printf("%5d - success\n",i);fflush(stdout);
>       }
>     }
>     MPI_Finalize();
> }
> <<<<<<<
> 
> OpenMPI fails after '65531' and MPICH after '2044'. MVAPICH is derived
> from MPICH - but it's possible they have a different limit than MPICH.

BTW: the above is with openmpi-2.1.2 and mpich-3.3b1.

With mvapich2-1.9.5 I also get an error after '2044' comm dups.

Satish

