[mpich-discuss] program in system() executes before prior one finishes

John Joseph John.Joseph at utsa.edu
Fri Dec 3 07:03:08 CST 2010


Hello,

I have a problem I've been struggling with for months, and I'm thinking that folks who deal with parallel processing are the ones most likely to be able to solve it.

I have this simple code that I run in R on various Windows XP and Windows 7 machines:

system("swat_Edit.exe",wait=FALSE,invisible=FALSE)
system("swat2009.exe")

(Both swat_Edit.exe and swat2009.exe are compiled Fortran programs.  I have access to the swat2009.exe source code, but not to the swat_Edit.exe source code.)

In the first of the above two lines, swat_Edit.exe opens hundreds of ASCII files having data descriptive of water basin characteristics, edits them, and then closes them.  In the second line, swat2009.exe reads those files to calculate streamflow at the water basin outlet.   These two code lines are actually part of a loop in which the set of ASCII files is revised and the corresponding streamflow is calculated, and the loop typically cycles about 10,000 times.  In each new cycle the streamflow calculated from the previous cycle is needed to direct swat_Edit.exe in revising the ASCII files so that swat2009.exe can calculate the streamflow for the new cycle.

Now, the problem is this: swat2009.exe sometimes begins executing BEFORE swat_Edit.exe finishes revising and closing all the ASCII files.  As a partial remedy, I insert a Sys.sleep() line like this:

system("swat_Edit.exe",wait=FALSE,invisible=FALSE)
Sys.sleep(7)
system("swat2009.exe")

But this doesn't always work.  It might work for, say, 7,553 cycles, but then on the 7,554th, swat2009.exe starts before swat_Edit.exe is quite finished, and then the whole set of simulations, which takes a day or two, must start all over again.  Also, Sys.sleep() adds time to the whole process.  The higher the argument in Sys.sleep(), the less likely the crash, but the greater the slowdown - and the risk of swat2009.exe starting too soon is never completely eliminated, even for Sys.sleep() times that far exceed the average time it takes swat_Edit.exe to complete its task.
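
To make concrete what "wait for swat_Edit.exe itself rather than for a fixed number of seconds" could look like, here is a minimal sketch in R.  It is only an illustration under stated assumptions, not a tested fix: the helper names wait_until_done() and process_is_running() are hypothetical, the poll interval and timeout are arbitrary, and it assumes the Windows tasklist command is available (it may be missing on some XP editions).  It also assumes swat_Edit.exe does all its file work in its own process; if it hands work to a child process and exits early, polling for its image name would not help.

```r
# Hypothetical helper: block until `is_running()` returns FALSE.
# Returns TRUE if the process finished, FALSE if the timeout expired first.
wait_until_done <- function(is_running, poll_seconds = 0.5, timeout_seconds = 600) {
  deadline <- Sys.time() + timeout_seconds
  while (is_running()) {
    if (Sys.time() > deadline) {
      return(FALSE)  # timed out; the caller decides whether to abort the run
    }
    Sys.sleep(poll_seconds)
  }
  TRUE
}

# Assumed check (Windows only): ask `tasklist` whether any process with this
# image name is still listed. When nothing matches, tasklist prints an INFO
# message that does not contain the image name.
process_is_running <- function(image_name) {
  cmd <- sprintf('tasklist /FI "IMAGENAME eq %s" /NH', image_name)
  out <- system(cmd, intern = TRUE)
  any(grepl(tolower(image_name), tolower(out), fixed = TRUE))
}

# Possible use inside one cycle of the loop (left commented out because it
# needs Windows and the two executables):
# system("swat_Edit.exe", wait = FALSE, invisible = FALSE)
# finished <- wait_until_done(function() process_is_running("swat_Edit.exe"))
# if (!finished) stop("swat_Edit.exe did not exit within the timeout")
# system("swat2009.exe")
```

The predicate is passed in as a function so that the waiting logic can be tried out with a stand-in check before it is trusted on a multi-day run.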

The common sense reader might say, "Why not change the 'wait=FALSE' argument in the first line to 'wait=TRUE'?"  Well, if I do that the command won't execute.  I can change 'invisible=FALSE' to 'invisible=TRUE', but that doesn't seem to make a difference.

The problem is worst on my Core-i7, but also happens on an old single-processor computer I have.

MPICH and Rmpi are installed on my computers, and I'm hoping that I can somehow use them to positively prevent swat2009.exe from starting before swat_Edit.exe finishes.  

I'm a newbie as far as MPI goes.  Any ideas?  Your help would be much appreciated.

John Joseph