[MPICH] unmanaged disconnection from mpd ring
Matthew Chambers
matthew.chambers at vanderbilt.edu
Wed May 16 10:15:09 CDT 2007
It won't help prevent the problem, but it will determine whether it is
reproducible. If the problem isn't reproducible, I honestly have no idea
what happened.
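
A minimal sketch of how the burn-in suggested in the quoted message below
could be scripted, so that a failure is recorded rather than missed. It
assumes mpdringtest is on the PATH of a node in a running mpd ring and takes
the loop count as its only argument. The helper name burn_in, the round
count, and the timeout are hypothetical choices for illustration, not
anything prescribed by MPICH.

```python
import subprocess


def burn_in(cmd, rounds, timeout=600.0):
    """Run `cmd` `rounds` times and return a list of (round, reason) failures.

    An empty list means every round exited with status 0 within `timeout`
    seconds.
    """
    failures = []
    for i in range(rounds):
        try:
            result = subprocess.run(
                cmd, capture_output=True, text=True, timeout=timeout
            )
        except subprocess.TimeoutExpired:
            # The ring test hung, e.g. because a node dropped out of the ring.
            failures.append((i, "timeout"))
            continue
        except OSError as exc:
            # The command could not be started at all (not installed, etc.).
            failures.append((i, f"could not start: {exc}"))
            continue
        if result.returncode != 0:
            failures.append((i, f"exit code {result.returncode}"))
    return failures
```

For example, burn_in(["mpdringtest", "100000"], rounds=10) would run ten
long ring tests back to back. If the returned list is non-empty, the problem
is reproducible and the recorded round and reason give a starting point; if
it stays empty, the burn-in has not reproduced it.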
> -----Original Message-----
> From: jliang at arb.ca.gov [mailto:jliang at arb.ca.gov]
> Sent: Wednesday, May 16, 2007 10:09 AM
> To: Matthew Chambers
> Cc: mpich-discuss at mcs.anl.gov
> Subject: Re: RE: [MPICH] unmanaged disconnection from mpd ring
>
> So you believe the problem was caused by the disconnected nodes and/or a
> busy network. Could you explain how the burn-in process will help prevent
> the problem from happening in the future? Thanks.
>
> With best regards,
> Paul
>
> ----- Original Message -----
> From: Matthew Chambers <matthew.chambers at vanderbilt.edu>
> Date: Tuesday, May 15, 2007 1:07 pm
> Subject: RE: [MPICH] unmanaged disconnection from mpd ring
>
> > I suggest doing an mpdringtest for an absurdly large number of loops
> > as a kind of burn-in for your nodes and/or network.