<br>I am struggling to run an MPI job on our Red Hat MRG cluster, which has Condor and MPICH2 (1.4.1) with the Hydra process manager installed. If someone has used Hydra on Condor, please let me know how to run a job successfully.<br><br>
Thanks<br>Savita<br><br><br><br><div class="gmail_quote">On Tue, Dec 27, 2011 at 9:30 PM, Pavan Balaji <span dir="ltr"><<a href="mailto:balaji@mcs.anl.gov" target="_blank">balaji@mcs.anl.gov</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Savita,<br>
<br>
[please keep mpich-discuss cc'ed]<br>
<br>
These files should be written to the same location from which your mpiexec was launched. If you want to write them to a different location, you can provide the full path to the files.<br>
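<br>
For example (a sketch only; the /home/savita/results directory below is a placeholder, not a path from this thread), the full-path form would look like:<br>
<br>
mpiexec -n 2 -machinefile machinefile -outfile-pattern /home/savita/results/ho.out -errfile-pattern /home/savita/results/he.err test2.pl<br>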
<br>
-- Pavan<br>
<br>
On 12/27/2011 03:28 PM, Shrivastava, Savita wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hi Pavan, Thanks for your response.<br>
<br>
After adding -outfile-pattern ho.out -errfile-pattern he.err to the<br>
mpiexec command, the output and error files were still not written to the<br>
directory. I also tried adding -wdir. Another point I want to mention is<br>
that the output file created by my script is not written to the output<br>
directory I specified in my script.<br>
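<br>
For reference, the mpiexec arguments I am using look roughly like this (a sketch; the -wdir path below is only an illustration):<br>
<br>
arguments = -n 2 -machinefile machinefile -wdir /data/savita -outfile-pattern ho.out -errfile-pattern he.err test2.pl<br>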
<br>
I suspect that some additional parameters or settings may be needed to<br>
run mpiexec successfully on Condor.<br>
<br>
Please advise.<br>
<br>
Thanks<br>
Savita<div><div><br>
<br>
<br>
<br>
-----Original Message-----<br>
From: Pavan Balaji [mailto:<a href="mailto:balaji@mcs.anl.gov" target="_blank">balaji@mcs.anl.gov</a>]<br>
Sent: Tuesday, December 27, 2011 12:01 AM<br>
To: <a href="mailto:mpich-discuss@mcs.anl.gov" target="_blank">mpich-discuss@mcs.anl.gov</a><br>
Cc: Shrivastava, Savita<br>
Subject: Re: [mpich-discuss] Hydra process manager on Condor<br>
<br>
<br>
Hydra does not understand Condor's parameters. But you can emulate the<br>
behavior you want by setting these options for mpiexec:<br>
<br>
-outfile-pattern ho.out -errfile-pattern he.err<br>
<br>
You can also do fancier things, like:<br>
<br>
-outfile-pattern ho.%r.out -errfile-pattern he.%r.err<br>
<br>
which writes a separate file for each rank.<br>
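<br>
A full command line would look something like this (a sketch only, reusing the file names from this thread); with -n 2 it should produce ho.0.out, ho.1.out, he.0.err and he.1.err:<br>
<br>
mpiexec -n 2 -machinefile machinefile -outfile-pattern ho.%r.out -errfile-pattern he.%r.err test2.pl<br>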
<br>
See mpiexec -outfile-pattern -help for more information on other<br>
patterns.<br>
<br>
-- Pavan<br>
<br>
On 12/19/2011 12:27 PM, Shrivastava, Savita wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hi,<br>
<br>
We have installed MPICH2 with the Hydra process manager on our Condor<br>
cluster. When I submit an MPI job to Condor, the job is transferred to<br>
the execute node and runs properly (as I saw in the Condor log on the<br>
execute node where the job was executing), but it does not write the<br>
standard output from the script to the Condor output file specified in<br>
the job description file. Please guide me on how to run an MPI job<br>
successfully on Condor using the Hydra process manager.<br>
<br>
My job description is below. The Perl script writes "Hello" to standard<br>
output.<br>
</blockquote></div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div>
<br>
universe = parallel<br>
<br>
executable = /usr/lib64/mpich2/bin/mpiexec<br>
<br>
arguments = -n 2 -machinefile machinefile test2.pl<br>
<br>
getenv=true<br>
<br>
machine_count = 1<br>
<br>
should_transfer_files = yes<br>
<br>
when_to_transfer_output = on_exit<br>
<br>
transfer_input_files = test2.pl<br>
<br>
output = ho.out<br>
<br>
error = he.err<br>
<br>
log = hl.log<br>
<br></div></div>
Requirements = Memory >= 1024 && Cpus >= 2<div><br>
<br>
request_cpus = 2<br>
<br>
request_memory = 1024<br>
<br>
queue<br>
<br>
Thanks<br>
<br>
Savita<br>
<br>
<br>
<br>
_______________________________________________<br>
mpich-discuss mailing list <a href="mailto:mpich-discuss@mcs.anl.gov" target="_blank">mpich-discuss@mcs.anl.gov</a><br>
To manage subscription options or unsubscribe:<br>
<a href="https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss" target="_blank">https://lists.mcs.anl.gov/<u></u>mailman/listinfo/mpich-discuss</a><br>
</div></blockquote>
<br>
</blockquote><div><div>
<br>
-- <br>
Pavan Balaji<br>
<a href="http://www.mcs.anl.gov/%7Ebalaji" target="_blank">http://www.mcs.anl.gov/~balaji</a><br>
_______________________________________________<br>
mpich-discuss mailing list <a href="mailto:mpich-discuss@mcs.anl.gov" target="_blank">mpich-discuss@mcs.anl.gov</a><br>
To manage subscription options or unsubscribe:<br>
<a href="https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss" target="_blank">https://lists.mcs.anl.gov/<u></u>mailman/listinfo/mpich-discuss</a><br>
</div></div></blockquote></div><br>