[petsc-users] Irritating behavior of MUMPS with PETSc

Satish Balay balay at mcs.anl.gov
Wed Jun 25 10:17:55 CDT 2014


I suggest running the non-MUMPS case with -log_summary [to confirm that
'-np 6' is actually used in both cases].
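
For example, something along these lines (gmres/bjacobi here are just an
untested placeholder; any solver setup that already works for you is fine):

  mpirun ./ex2 -ksp_type gmres -pc_type bjacobi -log_summary -m 100 -n 100

The -log_summary output reports the number of processes the run actually
used, so you can check that both cases really see 6.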

Secondly, you can try a 'release' version of Open MPI or MPICH and see
if that works. [I don't see any mention of openmpi-1.9a on the website.]
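
If rebuilding PETSc is an option, one way to pick up a released MPI is to
let configure download and build one itself, e.g. (a sketch based on your
configure line, untested on your cluster):

  ./configure --prefix=/opt/Petsc/3.4.4.extended --with-debugging=no \
    --download-mpich --download-mumps --download-scalapack \
    --download-parmetis --download-metis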

Also, you can try -log_trace to see where it's hanging [or figure out how
to run the code in a debugger on this cluster]. But that might not help in
figuring out the solution to the hang.
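
For example (your original run line with tracing added; untested here):

  mpirun ./ex2 -pc_type lu -pc_factor_mat_solver_package mumps \
    -ksp_type preonly -m 100 -n 100 -log_trace

If X forwarding is not available on the compute nodes, the standard PETSc
options -start_in_debugger noxterm -debugger_nodes 0 are one way to attach
a debugger on rank 0.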

Satish

On Wed, 25 Jun 2014, Matthew Knepley wrote:

> On Wed, Jun 25, 2014 at 7:09 AM, Gunnar Jansen <jansen.gunnar at gmail.com>
> wrote:
> 
> > You are right about the queuing system. The job is submitted with a PBS
> > script specifying the number of nodes/processors. On the cluster, PETSc is
> > configured in a module environment which sets the appropriate flags for
> > compilers, rules, etc.
> >
> > The exact same job script on the exact same nodes with a standard Krylov
> > method does not give any trouble but executes nicely on all processors (and
> > also gives the correct result).
> >
> > Therefore my suspicion is a missing flag in the MUMPS interface. Is this
> > perhaps rather a topic for the MUMPS dev team?
> >
> 
> I doubt this. The whole point of MPI is to shield code from these details.
> 
> Can you first try this system with SuperLU_dist?
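> 
> Something along these lines should do it (untested; SuperLU_dist is not in
> your current configure, so it would also need e.g. --download-superlu_dist):
> 
>   mpirun ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist \
>     -ksp_type preonly -m 100 -n 100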

> 
>   Thanks,
> 
>      Matt
> 
> 
> > Best, Gunnar
> >
> >
> >
> > 2014-06-25 15:52 GMT+02:00 Dave May <dave.mayhem23 at gmail.com>:
> >
> > This sounds weird.
> >>
> >> The launch line you provided doesn't include any information regarding
> >> how many processors to use (nodes / cores per node). I presume you are
> >> using a queuing system. My guess is that there could be an issue with
> >> either (i) your job script, (ii) the configuration of the job scheduler
> >> on the machine, or (iii) the MPI installation on the machine.
> >>
> >> Have you been able to successfully run other petsc (or any mpi) codes
> >> with the same launch options (2 nodes, 3 procs per node)?
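> >>
> >> For reference, a 2-node / 3-per-node request in PBS looks roughly like
> >> this sketch (directives vary between clusters, so treat it only as an
> >> illustration of what I mean by launch options):
> >>
> >>   #PBS -l nodes=2:ppn=3
> >>   cd $PBS_O_WORKDIR
> >>   mpirun -np 6 ./ex2 ...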
> >>
> >> Cheers.
> >>   Dave
> >>
> >>
> >>
> >>
> >> On 25 June 2014 15:44, Gunnar Jansen <jansen.gunnar at gmail.com> wrote:
> >>
> >>> Hi,
> >>>
> >>> I am trying to solve a problem in parallel with MUMPS as the direct
> >>> solver. As long as I run the program on only 1 node with 6 processors,
> >>> everything works fine! But using 2 nodes with 3 processors each gets
> >>> MUMPS stuck in the factorization.
> >>>
> >>> For the purpose of testing, I run ex2.c at a resolution of 100x100
> >>> (which is of course way too small for a direct solver in parallel).
> >>>
> >>> The code is run with:
> >>> mpirun ./ex2 -on_error_abort -pc_type lu -pc_factor_mat_solver_package
> >>> mumps -ksp_type preonly -log_summary -options_left -m 100 -n 100
> >>> -mat_mumps_icntl_4 3
> >>>
> >>> The petsc-configuration I used is:
> >>> --prefix=/opt/Petsc/3.4.4.extended --with-mpi=yes
> >>> --with-mpi-dir=/opt/Openmpi/1.9a/ --with-debugging=no --download-mumps
> >>>  --download-scalapack --download-parmetis --download-metis
> >>>
> >>> Is this common behavior? Or is there an error in the petsc configuration
> >>> I am using here?
> >>>
> >>> Best,
> >>> Gunnar
> >>>
> >>
> >>
> >
> 
> 
> 


