[petsc-users] A bad commit affects MOOSE
Satish Balay
balay at mcs.anl.gov
Tue Apr 3 11:35:04 CDT 2018
On Tue, 3 Apr 2018, Satish Balay wrote:
> On Tue, 3 Apr 2018, Derek Gaston wrote:
>
> > One thing I want to be clear about here is that we're not trying to
> > solve this particular problem (where we're creating 1000 instances of
> > Hypre to precondition each variable independently)... this particular
> > problem is just a test (one that has been in our test suite for a long
> > time) to stress test some of this capability.
> >
> > We really do have needs for thousands (tens of thousands) of simultaneous
> > solves (each with their own Hypre instances). That's not what this
> > particular problem is doing - but it is representative of a class of our
> > problems we need to solve.
> >
> > Which does bring up a point: I have previously run ~50,000 separate
> > PETSc solves without issue. Is it because I was working with MVAPICH
> > on a cluster? Does it just have a higher limit?
>
> Don't know - but that's easy to find out with a simple test code..
>
> >>>>>>
> $ cat comm_dup_test.c
> #include <mpi.h>
> #include <stdio.h>
>
> int main(int argc, char** argv) {
>   MPI_Comm newcomm;
>   int i, err;
>   MPI_Init(NULL, NULL);
>   for (i = 0; i < 100000; i++) {
>     err = MPI_Comm_dup(MPI_COMM_WORLD, &newcomm);
>     if (err) {
>       printf("%5d - fail\n", i); fflush(stdout);
>       break;
>     } else {
>       printf("%5d - success\n", i); fflush(stdout);
>     }
>   }
>   MPI_Finalize();
> }
> <<<<<<<
>
> OpenMPI fails after '65531' and MPICH after '2044'. MVAPICH is derived
> from MPICH - but it's possible it has a different limit than MPICH.
BTW: the above is with openmpi-2.1.2 and mpich-3.3b1.
With mvapich2-1.9.5 I also get an error after '2044' comm dups.
Satish