[mpich-discuss] Unable to run simple mpi problem

Jayesh Krishna jayesh at mcs.anl.gov
Tue Dec 15 15:11:39 CST 2009


Hi,
 Which version of MPICH2 are you using? (Please use the latest stable version, 1.2.1, available at http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads.)
 Do you get the error when running the MPI program on the local machine with 3 processes? Do you still get the error if you remove the "-localonly" option?

Regards,
Jayesh

----- Original Message -----
From: "dave waite" <waitedm at gmail.com>
To: mpich-discuss at mcs.anl.gov
Sent: Tuesday, December 15, 2009 2:40:02 PM GMT -06:00 US/Canada Central
Subject: [mpich-discuss] Unable to run simple mpi problem





We are running MPICH2 applications on many Windows platforms. In a few installations, the job dies while initializing MPI. To examine this further, we ran a simple hello-MPI program:



// mpi2.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"

int master;
int n_workers;
MPI_Comm world, workers;
MPI_Group world_group, worker_group;
#define BSIZE MPI_MAX_PROCESSOR_NAME

char chrNames[MPI_MAX_PROCESSOR_NAME*64];

int _tmain(int argc, char *argv[])
{
    int nprocs=1;

    world = MPI_COMM_WORLD;
    int iVal=0;
    int rank, size, len;
    char name[MPI_MAX_PROCESSOR_NAME];
    MPI_Status reqstat;
    char *p;
    int iNodeCnt=1;

    SYSTEM_INFO info;
    GetSystemInfo(&info);

    int i;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    MPI_Get_processor_name(name, &len);

    if (rank==0)
    {
        // server commands
        chrNames[0]=0;
        strcat(chrNames, "||");
        strcat(chrNames, name);
        strcat(chrNames, "||");

        for (i=1; i<size; i++)
        {
            MPI_Recv(name, BSIZE, MPI_CHAR, i, 999, MPI_COMM_WORLD, &reqstat);
            p=strstr(chrNames, name);
            if (p==NULL)
            {
                strcat(chrNames, name);
                strcat(chrNames, "||");
                iNodeCnt++;
            }

            //printf("Hello MPI!\n");
            printf("Hello from Rank %d of %d on %s\n", i, size, name);
        }
        printf("\nNodes:%d\n", iNodeCnt);
        printf("Names:%s\n", chrNames);
    }
    else
    {
        // client commands
        MPI_Send(name, BSIZE, MPI_CHAR, 0, 999, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}



We saw the same failure. Here is our output:



C:\MPI>mpiexec2 -localonly -n 3 hellompi
unable to read the cmd header on the pmi context, Error = -1
.
[01:4792]......ERROR:result command received but the wait_list is empty.
[01:4792]....ERROR:unable to handle the command: "cmd=result src=1 dest=1 tag=7 cmd_tag=2 cmd_orig=dbput ctx_key=1 result=DBS_SUCCESS "
[01:4792]...ERROR:sock_op_close returned while unknown context is in state: SMPD_IDLE
mpiexec aborting job...
SuspendThread failed with error 5 for process 0:3AB7E6A8-6169-4544-8282-D4D35207F564:'hellompi'
unable to suspend process.
received suspend command for a pmi context that doesn't exist: unmatched id = 1
unable to read the cmd header on the pmi context, Error = -1
.
Error posting readv, An existing connection was forcibly closed by the remote host.(10054)
received kill command for a pmi context that doesn't exist: unmatched id = 1
unable to read the cmd header on the pmi context, Error = -1
.
Error posting readv, An existing connection was forcibly closed by the remote host.(10054)

job aborted:
rank: node: exit code[: error message]
0: usbospc126.americas.munters.com: 123: process 0 exited without calling finalize
1: usbospc126.americas.munters.com: 123: process 1 exited without calling finalize
2: usbospc126.americas.munters.com: 123
Fatal error in MPI_Finalize: Invalid communicator, error stack:
MPI_Finalize(307): MPI_Finalize failed
MPI_Finalize(198):
MPID_Finalize(92):
PMPI_Barrier(476): MPI_Barrier(comm=0x44000002) failed
PMPI_Barrier(396): Invalid communicator
[0] unable to post a write of the abort command.



This was run on a dual-core machine running Windows XP SP2. What do these error messages tell us, and what is the best way to proceed in debugging this kind of issue?



Thanks, 



Dave Waite 


_______________________________________________
mpich-discuss mailing list
mpich-discuss at mcs.anl.gov
https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss

