[mpich-discuss] Hydra process manager on Condor

Savita Shrivastava savita.shrivastava at gmail.com
Tue Jan 24 15:01:03 CST 2012


I am struggling to run an MPI job on our Red Hat MRG installation, which has
Condor and MPICH2 (1.4.1) with the Hydra process manager installed. If
someone has used Hydra on Condor, please let me know how to run a job
successfully.

Thanks
Savita



On Tue, Dec 27, 2011 at 9:30 PM, Pavan Balaji <balaji at mcs.anl.gov> wrote:

> Savita,
>
> [please keep mpich-discuss cc'ed]
>
> These files should be written to the same location that your mpiexec was
> launched from.  If you want to write it to a different location, you can
> provide the full path to the files.
>
>  -- Pavan
>
> On 12/27/2011 03:28 PM, Shrivastava, Savita wrote:
>
>> Hi Pavan, Thanks for your response.
>>
>> After adding -outfile-pattern ho.out -errfile-pattern he.err to the
>> mpiexec command, the output and error files were still not written to
>> the directory. I also tried adding -wdir. Another point I want to
>> mention is that the output file created by my script is not written to
>> the output directory specified in my script.
>>
>> I suspect that some parameters or settings may be needed to run
>> mpiexec successfully on Condor.
>>
>> Please advise.
>>
>> Thanks
>> Savita
>>
>>
>>
>>
>> -----Original Message-----
>> From: Pavan Balaji [mailto:balaji at mcs.anl.gov]
>> Sent: Tuesday, December 27, 2011 12:01 AM
>> To: mpich-discuss at mcs.anl.gov
>> Cc: Shrivastava, Savita
>> Subject: Re: [mpich-discuss] Hydra process manager on Condor
>>
>>
>> Hydra does not understand Condor's parameters.  But you can emulate the
>> behavior you want by setting these options for mpiexec:
>>
>>   -outfile-pattern ho.out -errfile-pattern he.err
>>
>> You can also do more fancy things like:
>>
>>   -outfile-pattern ho.%r.out -errfile-pattern he.%r.err
>>
>> which uses different files for each rank.
>>
>> See mpiexec -outfile-pattern -help for more information on other
>> patterns.
>>
>>   -- Pavan
>>
>> On 12/19/2011 12:27 PM, Shrivastava, Savita wrote:
>>
>>> Hi,
>>>
>>> We have installed MPICH2 with Hydra process manager on our Condor
>>> cluster. When I submit an MPI job to Condor, the job is transferred to
>>> the execute node and runs properly (I checked the Condor log on the
>>> execute node where the job was running), but the script's standard
>>> output is not written to the Condor output file named in the job
>>> description file. Please advise how to run an MPI job successfully
>>> on Condor using the Hydra process manager.
>>>
>>> My job description is below. The perl script writes "Hello" to
>>> standard output.
>>>
>>> universe = parallel
>>>
>>> executable = /usr/lib64/mpich2/bin/mpiexec
>>>
>>> arguments = -n 2 -machinefile machinefile test2.pl
>>>
>>> getenv=true
>>>
>>> machine_count = 1
>>>
>>> should_transfer_files = yes
>>>
>>> when_to_transfer_output = on_exit
>>>
>>> transfer_input_files = test2.pl
>>>
>>> output = ho.out
>>>
>>> error = he.err
>>>
>>> log = hl.log
>>>
>>> Requirements = Memory >= 1024 && Cpus >= 2
>>>
>>>
>>> request_cpus = 2
>>>
>>> request_memory = 1024
>>>
>>> queue
>>>
>>> Thanks
>>>
>>> Savita
>>>
>>>
>>>
>>> _______________________________________________
>>> mpich-discuss mailing list     mpich-discuss at mcs.anl.gov
>>> To manage subscription options or unsubscribe:
>>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>>>
>>
>>
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji
>
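
The advice quoted above (use -outfile-pattern and -errfile-pattern, and give
full paths if the files should land somewhere other than the directory
mpiexec was launched from) can be sketched in a few lines of Python. This is
only an illustration under stated assumptions: build_mpiexec_args and
expected_files are hypothetical helper names, /tmp/mpi_out is a placeholder
directory, and the %r substitution shown here merely mimics the "different
files for each rank" behaviour described in the thread; it is not Hydra's own
implementation. The mpiexec options, file names and test2.pl are taken from
the messages above.

```python
import os
import shlex


def build_mpiexec_args(nprocs, program, out_dir, machinefile="machinefile"):
    """Return an mpiexec argument list whose output patterns are full paths.

    out_dir is a hypothetical, caller-chosen directory; the option names
    come from the thread above.
    """
    out_dir = os.path.abspath(out_dir)
    return [
        "mpiexec",
        "-n", str(nprocs),
        "-machinefile", machinefile,
        "-outfile-pattern", os.path.join(out_dir, "ho.%r.out"),
        "-errfile-pattern", os.path.join(out_dir, "he.%r.err"),
        program,
    ]


def expected_files(pattern, nprocs):
    """Illustrate per-rank file names by replacing %r with each rank number."""
    return [pattern.replace("%r", str(rank)) for rank in range(nprocs)]


if __name__ == "__main__":
    args = build_mpiexec_args(2, "test2.pl", "/tmp/mpi_out")
    # Print the command instead of running it, so the sketch works
    # on a machine without MPICH2 installed.
    print(" ".join(shlex.quote(a) for a in args))
    out_pattern = args[args.index("-outfile-pattern") + 1]
    print(expected_files(out_pattern, 2))
```

Printing the command first makes it easy to compare against the arguments
line in the Condor job description before submitting; whether the resulting
files are transferred back by Condor is a separate question that this sketch
does not address.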