<html><body><div style="color:#000; background-color:#fff; font-family:times new roman, new york, times, serif;font-size:12pt"><div><span>Dear Jayesh,</span></div><div><span>I have confirmed that it runs correctly on a Linux machine with Open MPI.</span></div><div><span>So the problem is probably with my MPICH2 setup on the Windows machine.</span></div><div><span>Thank you for your help.</span></div><div><span><br></span></div><div> </div><div><br></div> <div style="font-size: 12pt; font-family: 'times new roman', 'new york', times, serif; "> <div style="font-size: 12pt; font-family: 'times new roman', 'new york', times, serif; "> <div dir="ltr"> <font size="2" face="Arial"> <hr size="1"> <b><span style="font-weight:bold;">From:</span></b> Jayesh Krishna &lt;jayesh@mcs.anl.gov&gt;<br> <b><span style="font-weight: bold;">To:</span></b> daniel shawul &lt;dshawul@yahoo.com&gt;; mpich-discuss@mcs.anl.gov <br> <b><span style="font-weight: bold;">Sent:</span></b> Tuesday, February 7, 2012 1:52 PM<br> <b><span style="font-weight: bold;">Subject:</span></b> Re: [mpich-discuss] mpich2 error<br> </font> </div> <br>
Hi,<br> Your dummy code worked for me (with NTOTAL = 68 and dummy work functions). Next time, I would recommend sending us working test code along with the skeleton; it saves us a lot of time. Please debug your code further to make sure that it has no bugs (I would recommend looking into the work functions and the "command").<br><br>(PS: Also make sure you use the latest stable release of MPICH2.)<br>Regards,<br>Jayesh<br><br>----- Original Message -----<br>From: "daniel shawul" &lt;<a ymailto="mailto:dshawul@yahoo.com" href="mailto:dshawul@yahoo.com">dshawul@yahoo.com</a>&gt;<br>To: <a ymailto="mailto:mpich-discuss@mcs.anl.gov" href="mailto:mpich-discuss@mcs.anl.gov">mpich-discuss@mcs.anl.gov</a><br>Sent: Tuesday, February 7, 2012 9:18:38 AM<br>Subject: [mpich-discuss] mpich2 error<br><br>Hello,<br>I am trying to schedule tasks in a batch file using a small MPI C program as a scheduler. <br>Processor 0 is the scheduler, sends
jobs to others, checks when a job is finished, and sends <br>the idle processor back to work. Other than that, it does no real work. <br>With MPICH2 the program works, but I sometimes get an error when a job takes a long time to finish. <br>It looks like it could be related to a timeout. The error is shown below. Thank you for any suggestions. <br><br>[quote] <br><br>E:\Alltests\solver\Projects\Release>mpiexec -n 2 test commands.bat 68 <br>Process [Process [Worker 1 started problem 0 <br>0/2] on cee-3624-ab52 : pid 118980 <br>1/2] on cee-3624-ab52 : pid 120092 <br>mytest\controls.txt <br>mytest\controlsp.txt <br>10 File(s) copied <br><br>1 file(s) copied. <br>[01:97888]..ERROR:Error while connecting to host, No connection could be made because the target machine actively refuse <br>d it. (10061) <br>Fatal error in MPI_Init: Other MPI error, error stack: <br>MPIR_Init_thread(388): <br>MPID_Init(107).......: channel
initialization failed <br>MPID_Init(371).......: PMI_Init returned -1 <br>[/quote] <br><br>And the code is shown below: <br><br>[code] <br><br>int main(int argc, char* argv[] ) { <br>int myid,nprocs,namelen,master; <br>char processor_name[MPI_MAX_PROCESSOR_NAME]; <br>MPI_Request request; <br>MPI_Status status; <br>int NTOTAL; <br>int job; <br><br>/* command and number of times to execute it */ <br>command = argv[1]; <br>NTOTAL = atoi(argv[2]); <br><br>/* <br>* Initialize MPI environment <br>*/ <br>int res = MPI_Init(&amp;argc,&amp;argv); <br>MPI_Comm_size(MPI_COMM_WORLD,&amp;nprocs); <br>MPI_Comm_rank(MPI_COMM_WORLD,&amp;myid); <br>MPI_Get_processor_name(processor_name, &amp;namelen); <br>cerr &lt;&lt; "Process [" &lt;&lt; myid &lt;&lt; "/" &lt;&lt; nprocs &lt;&lt; "] on " <br>&lt;&lt; processor_name &lt;&lt; " : pid " &lt;&lt; PID &lt;&lt; endl; <br>cerr.flush(); <br>master = 0; <br>nprocs--; /* count workers only, excluding the master */ <br><br>/* <br>* master <br>*/ <br>if(myid
== master) { <br>int r,sent,njobs; <br>/* <br>* Master sends slaves to work here <br>*/ <br>sent = 0; <br>njobs = 0; <br>while(njobs &lt; NTOTAL &amp;&amp; sent &lt; nprocs) { <br>sent++; <br>njobs++; <br>MPI_Send(&amp;njobs,1,MPI_INT,sent,njobs,MPI_COMM_WORLD); <br>} <br>while(sent) { <br>/* <br>* Non-blocking receive, so that housekeeping <br>* can be done in the meantime <br>*/ <br>int flag = 0; <br>MPI_Irecv(&amp;r,1,MPI_INT,MPI_ANY_SOURCE,MPI_ANY_TAG,MPI_COMM_WORLD,&amp;request); <br>MPI_Test(&amp;request, &amp;flag, &amp;status); <br>double t1,t2; <br>t1 = MPI_Wtime(); <br>while (!flag) { <br>SLEEP(1000); <br>t2 = MPI_Wtime(); <br>if(t2 - t1 >= update) { <br>cout &lt;&lt; "Progress " &lt;&lt; njobs &lt;&lt; "/" &lt;&lt; NTOTAL &lt;&lt; " completed." &lt;&lt; endl; <br>workProgress(); <br>t1 = t2; <br>} <br>MPI_Test(&amp;request, &amp;flag, &amp;status); <br>} <br>/* We have an idle processor now */ <br>if(njobs &lt; NTOTAL) { <br>njobs++;
<br>MPI_Send(&amp;njobs,1,MPI_INT,r,njobs,MPI_COMM_WORLD); <br>} else { <br>MPI_Send(MPI_BOTTOM,0,MPI_INT,r,0,MPI_COMM_WORLD); <br>sent--; <br>} <br>} <br>cout &lt;&lt; "Work finished" &lt;&lt; endl; <br>workProgress(); <br>} <br>/* <br>* Slave processors pick up jobs here <br>*/ <br>else { <br>while(true) { <br>MPI_Recv(&amp;job,1,MPI_INT,master,MPI_ANY_TAG,MPI_COMM_WORLD,&amp;status); <br>if(status.MPI_TAG == 0) { <br>break; <br>} else { <br>work(myid,job); <br>MPI_Send(&amp;myid,1,MPI_INT,master,status.MPI_TAG,MPI_COMM_WORLD); <br>} <br>} <br>} <br><br>MPI_Finalize(); <br><br>return 0; <br>} <br><br><br>[/code] <br><br><br>_______________________________________________<br>mpich-discuss mailing list <a ymailto="mailto:mpich-discuss@mcs.anl.gov" href="mailto:mpich-discuss@mcs.anl.gov">mpich-discuss@mcs.anl.gov</a><br>To manage subscription options or unsubscribe:<br><a href="https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss"
target="_blank">https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss</a><br><br><br> </div> </div> </div></body></html>