[MPICH] core dumps MPICH & Linux

Rajeev Thakur thakur at mcs.anl.gov
Thu Oct 26 12:17:57 CDT 2006


I don't know if this makes any difference, but I use the csh/tcsh form:
limit coredumpsize unlimited

Rajeev
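
A possible workaround, sketched here under assumptions not confirmed in
this thread: the MPI processes may be started by a launcher daemon that
never reads .profile or .bashrc, so the limit seen in an ssh login shell
need not be the limit the slaves actually run with. A small wrapper can
raise the soft core limit immediately before the real program starts.
The name run_with_core and the file name corewrap.sh are hypothetical;
neither is part of MPICH.

```shell
#!/bin/sh
# run_with_core: hypothetical helper, not part of MPICH.
# Raises the soft core-file limit as far as the hard limit allows,
# then runs the given command with that limit in effect.
run_with_core() {
    rwc_hard=$(ulimit -H -c)        # hard limit: "unlimited" or a block count
    if ! ulimit -S -c "$rwc_hard" 2>/dev/null; then
        echo "run_with_core: could not raise the core limit" >&2
    fi
    "$@"                            # run the real program, e.g. the MPI slave
}

# When used as a script, run the command given on the command line.
if [ "$#" -gt 0 ]; then
    run_with_core "$@"
fi
```

Saved as corewrap.sh and made executable, it could be used as
  mpiexec -n 4 ./corewrap.sh ./slave
where ./slave stands for the real executable. If the hard limit on a
node is 0, the wrapper cannot help and the limit has to be raised where
the launcher itself is started. The core file, if one is written, is
placed relative to the working directory of the crashing process, which
need not be the directory reported by "ssh node-whatever pwd".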

> -----Original Message-----
> From: owner-mpich-discuss at mcs.anl.gov 
> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Wolfram Brenig
> Sent: Thursday, October 26, 2006 11:11 AM
> Cc: mpich-discuss at mcs.anl.gov
> Subject: Re: [MPICH] core dumps MPICH & Linux
> 
> Let me be more precise.
> 
> I have no problem in running code on the
> heterogeneous system. (I can also reduce
> the MPI-ring to just a homogeneous section
> of the cluster ... to be sure.)
> 
> What I want to do is get core dumps
> from the slaves of a master/slave type of
> MPI code, for debugging purposes in a particular
> case.
> 
> Now, when I run the slaves as standalone
> processes I can get core dumps from them.
> But when I run them as MPI processes they
> do not produce any core dump files.
> 
> I have set:
> 
> $> ulimit -c unlimited
> 
> in the .profile and .bashrc and when I do:
> 
> $> ssh node-whatever ulimit -c
> 
> I get:
> 
> unlimited
> 
> for any node-whatever of the cluster.
> I checked for the core files in the directory which I
> get when I do:
> 
> $> ssh node-whatever pwd
> 
> but I also searched over the whole home
> file system ... there is no core file.
> 
> Any suggestions as to what I might be missing?
> 
> 
> Wolfram
> 
> 
> Darius Buntinas wrote:
> > Note that MPICH2 does not (yet) run on heterogeneous clusters.  If you're
> > getting crashes, this may be why.
> > 
> > Try running
> >   ulimit -c
> > using mpiexec (as if it were an mpi program).  That will show you what the
> > limit is actually set at on each node.
> > 
> > -d
> > 
> > 
> > On Thu, 26 Oct 2006, Wolfram Brenig wrote:
> > 
> >> I'm trying to force core dumps on
> >> a heterogeneous Linux cluster running
> >> mpich2version: 1.0.2 and SuSE Linux
> >> versions 9.2 and 10.0.
> >>
> >> I have set "ulimit -c unlimited" on all
> >> nodes.
> >>
> >> When I run non-parallel code I can get
> >> core dumps. But no parallel program will
> >> core dump.
> >>
> >> Any help, or hint where to get info
> >> would be most appreciated.
> >>
> >> From searching the WWW I got the
> >> impression that Linux may not be able
> >> to do core dumps with MPI. Is this so?
> >>
> >> Wolfram
> >>
> >>
> >>
> 
> 



