[mpich-discuss] Hydra process manager on Condor

Savita Shrivastava savita.shrivastava at gmail.com
Mon Mar 19 12:40:16 CDT 2012


I am now able to run MPICH2 both outside and inside Condor. For Condor, I
needed to add -wdir to the mpiexec command; without it, Condor's scratch
directory was not being found.
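
For anyone hitting the same problem, here is a minimal sketch of one way to
pass the scratch directory to -wdir. It is an assumption, not the exact
command used in this thread: it presumes a small wrapper script is used as
the Condor executable, and that -wdir should point at the directory Condor
exports as _CONDOR_SCRATCH_DIR in the job environment. The function name
mpiexec_cmd is hypothetical; the mpiexec path, -n 2, machinefile, and
test2.pl come from the original job description.

```shell
#!/bin/sh
# Hypothetical wrapper script; not the exact command used in this thread.

# Print the mpiexec command line with -wdir set to Condor's scratch
# directory. Assumption: _CONDOR_SCRATCH_DIR (exported by Condor to the
# job) is the directory -wdir should point at. Outside Condor, fall back
# to the current directory.
mpiexec_cmd() {
    wdir="${_CONDOR_SCRATCH_DIR:-$PWD}"
    echo "/usr/lib64/mpich2/bin/mpiexec -n 2 -wdir $wdir -machinefile machinefile test2.pl"
}

# Dry run: show the command that would be launched.
mpiexec_cmd
```

The script only prints the command, so it is safe to try on a submit node
first. Once the printed line looks right, the last line can be replaced
with the command itself.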

Thanks for following up.
Savita

On Fri, Mar 16, 2012 at 4:32 PM, Pavan Balaji <balaji at mcs.anl.gov> wrote:

>
> Are you able to use MPICH2 outside the condor environment?  It's better to
> make sure you are able to do that first.
>
>  -- Pavan
>
>
> On 01/24/2012 03:01 PM, Savita Shrivastava wrote:
>
>>
>> I am struggling to run an MPI job on our Red Hat MRG, which has Condor and
>> MPICH2 (1.4.1) with the Hydra process manager installed. If someone has
>> used Hydra on Condor, please let me know how to run a job successfully.
>>
>> Thanks
>> Savita
>>
>>
>>
>> On Tue, Dec 27, 2011 at 9:30 PM, Pavan Balaji <balaji at mcs.anl.gov> wrote:
>>
>>    Savita,
>>
>>    [please keep mpich-discuss cc'ed]
>>
>>    These files should be written to the same location that your mpiexec
>>    was launched from.  If you want to write it to a different location,
>>    you can provide the full path to the files.
>>
>>      -- Pavan
>>
>>    On 12/27/2011 03:28 PM, Shrivastava, Savita wrote:
>>
>>        Hi Pavan, Thanks for your response.
>>
>>        After adding -outfile-pattern ho.out -errfile-pattern he.err to the
>>        mpiexec command, the output and error files were still not written
>>        to the directory. I also tried adding -wdir. Another point worth
>>        mentioning is that the output file created by my script is not
>>        written to the output directory I specified in the script.
>>
>>        I suspect some additional parameters or settings may be needed to
>>        run mpiexec successfully on Condor.
>>
>>        Please advise.
>>
>>        Thanks
>>        Savita
>>
>>
>>
>>
>>        -----Original Message-----
>>        From: Pavan Balaji [mailto:balaji at mcs.anl.gov]
>>        Sent: Tuesday, December 27, 2011 12:01 AM
>>        To: mpich-discuss at mcs.anl.gov
>>        Cc: Shrivastava, Savita
>>        Subject: Re: [mpich-discuss] Hydra process manager on Condor
>>
>>
>>        Hydra does not understand Condor's parameters.  But you can emulate
>>        the behavior you want by setting these options for mpiexec:
>>
>>           -outfile-pattern ho.out -errfile-pattern he.err
>>
>>        You can also do more fancy things like:
>>
>>           -outfile-pattern ho.%r.out -errfile-pattern he.%r.err
>>
>>        which uses different files for each rank.
>>
>>        See mpiexec -outfile-pattern -help for more information on other
>>        patterns.
>>
>>           -- Pavan
>>
>>        On 12/19/2011 12:27 PM, Shrivastava, Savita wrote:
>>
>>            Hi,
>>
>>            We have installed MPICH2 with the Hydra process manager on our
>>            Condor cluster. When I submit an MPI job to Condor, the job is
>>            transferred to the execute node and runs properly (I checked
>>            the Condor log on the execute node where the job was running),
>>            but the script's standard output is not written to the Condor
>>            output file named in the job description file. Please advise
>>            how to run an MPI job successfully on Condor using the Hydra
>>            process manager.
>>
>>            My job description is below. The Perl script prints "Hello" to
>>            standard output.
>>
>>
>>
>>            universe = parallel
>>            executable = /usr/lib64/mpich2/bin/mpiexec
>>            arguments = -n 2 -machinefile machinefile test2.pl
>>            getenv = true
>>            machine_count = 1
>>            should_transfer_files = yes
>>            when_to_transfer_output = on_exit
>>            transfer_input_files = test2.pl
>>            output = ho.out
>>            error = he.err
>>            log = hl.log
>>            Requirements = Memory >= 1024 && Cpus >= 2
>>            request_cpus = 2
>>            request_memory = 1024
>>            queue
>>
>>            Thanks
>>
>>            Savita
>>
>>
>>
>>            _______________________________________________
>>            mpich-discuss mailing list mpich-discuss at mcs.anl.gov
>>            To manage subscription options or unsubscribe:
>>            https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>>
>>
>>
>>    --
>>    Pavan Balaji
>>    http://www.mcs.anl.gov/~balaji
>>
>>
>>
>>
>>
>>
>
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji
>