[mpich-discuss] mpich2 error

daniel shawul dshawul at yahoo.com
Wed Feb 8 13:14:41 CST 2012

Dear Jayesh
I just realized that what I wanted was an array job submission with qsub -t 1-68, which
passes a different task ID to each execution of the script. I didn't really need to write any code.
I will let you know if I have any problems with mpich2.
Thanks again


From: Jayesh Krishna <jayesh at mcs.anl.gov>
To: daniel shawul <dshawul at yahoo.com>
Cc: mpich-discuss at mcs.anl.gov
Sent: Wednesday, February 8, 2012 11:35 AM
Subject: Re: [mpich-discuss] mpich2 error

FWIW, I ran your test code (68 jobs) such that each job takes 20s and did not get any errors.
Let us know if you need any help.


----- Original Message -----
From: "daniel shawul" <dshawul at yahoo.com>
To: "Jayesh Krishna" <jayesh at mcs.anl.gov>, mpich-discuss at mcs.anl.gov
Sent: Wednesday, February 8, 2012 8:40:21 AM
Subject: Re: [mpich-discuss] mpich2 error

Dear Jayesh
I have confirmed that it runs correctly on a Linux machine with Open MPI.
So it is probably a problem with my setup of mpich2 on the Windows machine.
Thank you for your help

From: Jayesh Krishna <jayesh at mcs.anl.gov> 
To: daniel shawul <dshawul at yahoo.com>; mpich-discuss at mcs.anl.gov 
Sent: Tuesday, February 7, 2012 1:52 PM 
Subject: Re: [mpich-discuss] mpich2 error 

Your dummy code worked for me (with NTOTAL = 68 and dummy work functions). Next time, I would recommend sending us a working test program along with the skeleton; it saves us a lot of time. Please debug your code further to make sure that there are no bugs in it (I would recommend looking into the work functions and the "command").

(PS: Also make sure you use the latest stable release of MPICH2) 

----- Original Message ----- 
From: "daniel shawul" < dshawul at yahoo.com > 
To: mpich-discuss at mcs.anl.gov 
Sent: Tuesday, February 7, 2012 9:18:38 AM 
Subject: [mpich-discuss] mpich2 error 

Hello,
I am trying to schedule the tasks in a batch file using a small MPI C program as a scheduler.
Processor 0 is the scheduler: it sends jobs to the other processors, checks when a job has finished, and sends
the idle processor back to work. Other than that, it doesn't do any real work.
With mpich2 the program works, but I sometimes get the error below when a job takes a long time to finish.
It looks to me like it could be related to a timeout. Thank you for any suggestions.


E:\Alltests\solver\Projects\Release>mpiexec -n 2 test commands.bat 68 
Process [Process [Worker 1 started problem 0 
0/2] on cee-3624-ab52 : pid 118980 
1/2] on cee-3624-ab52 : pid 120092 
10 File(s) copied 

1 file(s) copied. 
[01:97888]..ERROR:Error while connecting to host, No connection could be made because the target machine actively refused it. (10061)
Fatal error in MPI_Init: Other MPI error, error stack: 
MPID_Init(107).......: channel initialization failed 
MPID_Init(371).......: PMI_Init returned -1 

The code is shown below:


int main(int argc, char* argv[]) {
    int myid, nprocs, namelen, master;
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    MPI_Request request;
    MPI_Status status;
    int NTOTAL;
    int job;

    /* command and number of times to execute it */
    command = argv[1];
    NTOTAL = atoi(argv[2]);

    /* Initialize MPI environment */
    int res = MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
    MPI_Get_processor_name(processor_name, &namelen);
    cerr << "Process [" << myid << "/" << nprocs << "] on "
         << processor_name << " : pid " << PID << endl;
    master = 0;

    if (myid == master) {
        /* Master sends slaves to work here */
        int r, sent, njobs;
        sent = 0;
        njobs = 0;
        while (njobs < NTOTAL && sent < nprocs) {
            /* ... */
        }
        while (sent) {
            /* Non-blocking receive, so that housekeeping
               can be done in the meantime */
            /* ... */
            int flag = 0;
            MPI_Test(&request, &flag, &status);
            double t1, t2;
            t1 = MPI_Wtime();
            while (!flag) {
                t2 = MPI_Wtime();
                if (t2 - t1 >= update) {
                    cout << "Progress " << njobs << "/" << NTOTAL << " completed." << endl;
                    t1 = t2;
                }
                MPI_Test(&request, &flag, &status);
            }
            /* We got an idle processor now */
            if (njobs < NTOTAL) {
                /* ... */
            } else {
                /* ... */
            }
        }
        cout << "Work finished" << endl;
    } else {
        /* Slave processors pick up jobs here */
        while (true) {
            /* ... */
            if (status.MPI_TAG == 0) {
                /* ... */
            } else {
                /* ... */
            }
        }
    }

    /* ... */
    return 0;
}

