[mpich-discuss] multiple mpich2 jobs on a single node

Rajeev Thakur thakur at mcs.anl.gov
Thu May 1 08:36:54 CDT 2008


Can you try running them with a single mpdboot?

Rajeev 
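
For illustration, the single-mpdboot suggestion above might look like the
sketch below. It assumes both jobs are launched from one script by the same
user, and that the ring is shut down with mpdallexit only after both have
finished; the original post does not say how the ring was being torn down, so
that part is a guess. RUN and run_jobs are hypothetical names used only for
this example and are not part of MPICH2.

```shell
#!/bin/sh
# Sketch only: start one mpd ring and share it between both jobs.
# RUN is a hypothetical dry-run switch (not an MPICH2 feature): when unset,
# the commands are only printed; invoke with RUN= (empty) to execute them.
RUN=${RUN-echo}

run_jobs() {
    $RUN mpdboot                    # start the mpd ring once
    $RUN mpirun -np 4 ./a.out &     # first job, in the background
    $RUN mpirun -np 4 ./a.out2 &    # second job, on the same ring
    wait                            # block until both jobs have finished
    $RUN mpdallexit                 # assumed teardown, after both jobs are done
}

run_jobs
```

The point of the arrangement is that neither job owns the ring, so one job
finishing does not take the mpd processes away from the other.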

> -----Original Message-----
> From: owner-mpich-discuss at mcs.anl.gov 
> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Bryan Putnam
> Sent: Thursday, May 01, 2008 7:52 AM
> To: mpich-discuss at mcs.anl.gov
> Subject: [mpich-discuss] multiple mpich2 jobs on a single node
> 
> Hi, perhaps this is an FAQ, but I can't seem to find it.
> 
> I've noticed that if I have two mpich2 jobs running on, for example,
> the same 8-processor node, submitted as
> 
> mpdboot
> mpirun -np 4 ./a.out
> 
> 
> mpdboot
> mpirun -np 4 ./a.out2
> 
> 
> then when one of the jobs completes, it kills all the mpd processes
> along with it, and the remaining job dies with the message
> 
> job aborted; reason = mpd disappeared
> 
> 
> Is there an easy fix for this?
> 
> Thanks,
> Bryan
> 
> 
> --
> Bryan Putnam
> Rosen Center for Advanced Computing, Purdue University
> Young Hall (Rm. 519)
> 302 Wood Street
> West Lafayette, IN 47907-2108
> Ph 765-496-8225 Fax 765-494-0566
> bfp at purdue.edu
> http://www.purdue.edu/itap