I am now able to run MPICH2 both outside and inside Condor. Under Condor, I needed to add -wdir to the mpiexec command line, because it was not finding Condor's scratch directory.<br><br>Thanks for following up.<br>
Savita<br><br><div class="gmail_quote">On Fri, Mar 16, 2012 at 4:32 PM, Pavan Balaji <span dir="ltr"><<a href="mailto:balaji@mcs.anl.gov" target="_blank">balaji@mcs.anl.gov</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>
Are you able to use MPICH2 outside the condor environment? It's better to make sure you are able to do that first.<br>
<br>
-- Pavan<div><br>
<br>
On 01/24/2012 03:01 PM, Savita Shrivastava wrote:<br>
</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>
<br>
I am struggling to run an MPI job on our Red Hat MRG system, which has Condor and<br>
MPICH2 (1.4.1) with the Hydra process manager installed. If anyone has<br>
used Hydra on Condor, please let me know how to run a job successfully.<br>
<br>
Thanks<br>
Savita<br>
<br>
<br>
<br>
On Tue, Dec 27, 2011 at 9:30 PM, Pavan Balaji <<a href="mailto:balaji@mcs.anl.gov" target="_blank">balaji@mcs.anl.gov</a>> wrote:<br></div><div><div>
<br>
Savita,<br>
<br>
[please keep mpich-discuss cc'ed]<br>
<br>
These files should be written to the same location that your mpiexec<br>
was launched from. If you want to write it to a different location,<br>
you can provide the full path to the files.<br>
<br>
-- Pavan<br>
<br>
On 12/27/2011 03:28 PM, Shrivastava, Savita wrote:<br>
<br>
Hi Pavan, Thanks for your response.<br>
<br>
After adding -outfile-pattern ho.out -errfile-pattern he.err to the<br>
mpiexec command, the output and error files were still not written to the<br>
directory. I also tried adding -wdir. In addition, the output file created<br>
by my script is not written to the output directory I specified in the script.<br>
<br>
I suspect some additional parameters or settings are needed to run<br>
mpiexec successfully on Condor.<br>
<br>
Please advise.<br>
<br>
Thanks<br>
Savita<br>
<br>
<br>
<br>
<br>
-----Original Message-----<br>
From: Pavan Balaji [mailto:<a href="mailto:balaji@mcs.anl.gov" target="_blank">balaji@mcs.anl.gov</a>]<br>
Sent: Tuesday, December 27, 2011 12:01 AM<br></div></div><div><div>
To: <a href="mailto:mpich-discuss@mcs.anl.gov" target="_blank">mpich-discuss@mcs.anl.gov</a><br>
Cc: Shrivastava, Savita<br>
Subject: Re: [mpich-discuss] Hydra process manager on Condor<br>
<br>
<br>
Hydra does not understand Condor's parameters. But you can emulate the<br>
behavior you want by setting these options for mpiexec:<br>
<br>
-outfile-pattern ho.out -errfile-pattern he.err<br>
<br>
You can also do more fancy things like:<br>
<br>
-outfile-pattern ho.%r.out -errfile-pattern he.%r.err<br>
<br>
which uses different files for each rank.<br>
<br>
See mpiexec -outfile-pattern -help for more information on other<br>
patterns.<br>
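<br>
As a minimal sketch of the per-rank option above: the helper below only assembles and prints an mpiexec command line, it does not launch anything. The function name build_cmd is hypothetical; the flags and file names are the ones quoted in this message, and %r is left untouched for Hydra to expand.<br>

```shell
# Hypothetical helper: print (do not run) an mpiexec command line that
# asks Hydra for one stdout file and one stderr file per rank.
# %r is passed through literally; Hydra, not the shell, expands it.
build_cmd() {
    nprocs=$1
    prog=$2
    printf '%s\n' "mpiexec -n $nprocs -outfile-pattern ho.%r.out -errfile-pattern he.%r.err $prog"
}

build_cmd 2 test2.pl
```

Printing the command first makes it easy to check the quoting before handing the same arguments to a job description.<br>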
<br>
-- Pavan<br>
<br>
On 12/19/2011 12:27 PM, Shrivastava, Savita wrote:<br>
<br>
Hi,<br>
<br>
We have installed MPICH2 with the Hydra process manager on our Condor<br>
cluster. When I submit an MPI job to Condor, the job is transferred to<br>
the execute node and runs properly (I checked the Condor log on the<br>
execute node where the job was running), but the script's standard<br>
output is not written to the Condor output file named in the job<br>
description file. Please advise how to run an MPI job<br>
successfully on Condor using the Hydra process manager.<br>
<br>
My job description is below. The Perl script prints "Hello" to standard output.<br>
<br>
<br>
universe = parallel<br>
<br>
executable = /usr/lib64/mpich2/bin/mpiexec<br>
<br>
arguments = -n 2 -machinefile machinefile test2.pl<br></div></div>
<div><br>
<br>
getenv=true<br>
<br>
machine_count = 1<br>
<br>
should_transfer_files = yes<br>
<br>
when_to_transfer_output = on_exit<br>
<br></div>
transfer_input_files = test2.pl<div><br>
<br>
output = ho.out<br>
<br>
error = he.err<br>
<br>
log = hl.log<br>
<br>
Requirements = Memory >= 1024 && Cpus >= 2<br>
<br>
<br>
request_cpus = 2<br>
<br>
request_memory = 1024<br>
<br>
queue<br>
<br>
Thanks<br>
<br>
Savita<br>
<br>
<br>
<br></div>
_________________________________________________<br>
mpich-discuss mailing list <a href="mailto:mpich-discuss@mcs.anl.gov" target="_blank">mpich-discuss@mcs.anl.gov</a><br>
To manage subscription options or unsubscribe:<br>
<a href="https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss" target="_blank">https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss</a><div><br>
<br>
<br>
<br>
--<br>
Pavan Balaji<br></div>
<a href="http://www.mcs.anl.gov/%7Ebalaji" target="_blank">http://www.mcs.anl.gov/~balaji</a><br>
<div><br>
<br>
<br>
<br>
<br>
</div></blockquote><div><div>
<br>
-- <br>
Pavan Balaji<br>
<a href="http://www.mcs.anl.gov/%7Ebalaji" target="_blank">http://www.mcs.anl.gov/~balaji</a><br>
</div></div></blockquote></div><br>
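<br>
For readers who hit the same problem, the fix described at the top of this thread (adding -wdir so that mpiexec uses Condor's scratch directory) might look roughly like the wrapper sketched here. The thread does not show the actual command that was used, so this is an assumption: the wrapper, the function names, and the RUN_MPI switch are hypothetical, and it relies on Condor exporting _CONDOR_SCRATCH_DIR on the execute node. The mpiexec path and arguments are taken from the job description quoted above.<br>

```shell
#!/bin/sh
# Hypothetical wrapper: not the command actually used in this thread.
MPIEXEC=/usr/lib64/mpich2/bin/mpiexec   # path from the quoted job description

# Assumption: Condor exports _CONDOR_SCRATCH_DIR on the execute node.
# Fall back to the current directory when running outside Condor.
mpi_workdir() {
    printf '%s\n' "${_CONDOR_SCRATCH_DIR:-$PWD}"
}

# Print the command by default; run it only when RUN_MPI=1 is set.
launch() {
    wdir=$(mpi_workdir)
    if [ "${RUN_MPI:-0}" = "1" ]; then
        exec "$MPIEXEC" -n 2 -wdir "$wdir" -machinefile machinefile test2.pl
    else
        printf '%s\n' "$MPIEXEC -n 2 -wdir $wdir -machinefile machinefile test2.pl"
    fi
}

launch
```

A wrapper is used in this sketch because the scratch directory is only known on the execute node, so it is resolved at run time rather than written into the job description.<br>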