[mpich-discuss] spawning new processes with a hostfile
Jeff Hammond
jhammond at alcf.anl.gov
Mon Apr 16 09:14:11 CDT 2012
It seems that you are running on an Intel SCC. Is this true?
Which version of RCKMPI are you running? It seems there are a few
MPICH-derived implementations of MPI for SCC.
Some people at Argonne have SCC access (I am one of them), but those of
us who do are not necessarily the people qualified to debug the MPI
process manager on SCC. I am certainly not qualified to do this.
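
In the meantime, one thing that might be worth trying (untested on SCC, so
treat it as a sketch): the MPI standard reserves a "host" info key for
MPI_Comm_spawn, while "hostfile" is not one of the reserved keys, and an
implementation may silently ignore keys it does not recognize. The helper
below reads the host names out of a file such as /shared/mpihosts so that
each one could be passed as a "host" value. The function name read_hostfile
and the two size limits are made up for illustration, and whether the RCKMPI
process manager honours "host" is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_HOSTS    16   /* hypothetical upper bound on hostfile entries */
#define MAX_HOST_LEN 64   /* hypothetical upper bound on a host name, incl. NUL */

/*
 * Read one host name per line from `path` into `hosts`, skipping blank
 * lines and surrounding whitespace. Lines are assumed to be shorter than
 * 256 characters. Returns the number of hosts read, or -1 if the file
 * cannot be opened.
 */
int read_hostfile(const char *path, char hosts[][MAX_HOST_LEN], int max_hosts)
{
    assert(path != NULL && hosts != NULL);

    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    char line[256];
    int count = 0;
    while (count < max_hosts && fgets(line, sizeof line, fp) != NULL) {
        /* Skip leading whitespace. */
        char *start = line;
        while (*start != '\0' && isspace((unsigned char)*start))
            start++;

        /* The host name ends at the first whitespace character. */
        size_t n = strcspn(start, " \t\r\n");
        if (n == 0)
            continue;               /* blank line */
        if (n >= MAX_HOST_LEN)
            n = MAX_HOST_LEN - 1;   /* truncate over-long names */

        memcpy(hosts[count], start, n);
        hosts[count][n] = '\0';
        count++;
    }

    fclose(fp);
    return count;
}
```

Each hosts[i] could then go into its own info object with
MPI_Info_set(info_i, "host", hosts[i]) and be handed to
MPI_Comm_spawn_multiple, with one command entry per host.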
Jeff
On Mon, Apr 16, 2012 at 8:14 AM, Umit <umitcanyilmaz at gmail.com> wrote:
> Hello all,
>
>
> There are some spawn commands in my program. Now I want to specify the nodes
> on which my newly spawned processes run. I am trying to use a hostfile for
> this, but I couldn't get it to work. New processes are still spawned on the
> next available nodes.
>
> I have included my code and my console output below.
>
> My hostfile:
>
> root@rck00:~> cat /shared/mpihosts
> rck03
> rck04
> rck05
>
> Can somebody help me? What is the problem? Can this be a bug?
>
>
> Here is my code and output of my program:
>
> #include "mpi.h"
> #include <stdio.h>
> #include <stdlib.h>
>
> #define NUM_SPAWNS 3
>
> int main( int argc, char *argv[] )
> {
>     int errcodes[NUM_SPAWNS];
>     MPI_Comm parentcomm, intercomm;
>     int len;
>     char name[MPI_MAX_PROCESSOR_NAME];
>     int rank;
>
>     MPI_Init( &argc, &argv );
>     MPI_Comm_get_parent( &parentcomm );
>     MPI_Comm_rank(MPI_COMM_WORLD,&rank);
>
>     if (parentcomm == MPI_COMM_NULL)
>     {
>         MPI_Info info;
>         MPI_Info_create( &info );
>         MPI_Info_set(info, "hostfile", "/shared/mpihosts");
>
>         MPI_Comm_spawn( "/shared/spawn/./spawn", MPI_ARGV_NULL,
>                         NUM_SPAWNS, info, 0, MPI_COMM_WORLD, &intercomm, errcodes );
>
>         MPI_Get_processor_name(name, &len);
>         printf("I am parent process %d on %s. \n", rank, name);
>     }
>     else
>     {
>         MPI_Get_processor_name(name, &len);
>         printf("I am a spawned process %d on %s.\n", rank, name);
>     }
>     fflush(stdout);
>     MPI_Finalize();
>     return 0;
> }
>
>
>
> output of my program:
>
> root@rck00:~> mpirun -np 1 /shared/spawn/./spawn
> I am parent process 0 on rck00.
> I am a spawned process 0 on rck01.
> I am a spawned process 1 on rck02.
> I am a spawned process 2 on rck03.
>
> Thanks in advance,
>
>
>
>
> _______________________________________________
> mpich-discuss mailing list mpich-discuss at mcs.anl.gov
> To manage subscription options or unsubscribe:
> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>
--
Jeff Hammond
Argonne Leadership Computing Facility
University of Chicago Computation Institute
jhammond at alcf.anl.gov / (630) 252-5381
http://www.linkedin.com/in/jeffhammond
https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond (in-progress)
https://wiki.alcf.anl.gov/old/index.php/User:Jhammond (deprecated)
https://wiki-old.alcf.anl.gov/index.php/User:Jhammond (deprecated)