<br><br><div class="gmail_quote">On Wed, Jul 28, 2010 at 3:12 AM, Nicolas Rosner <span dir="ltr"><<a href="mailto:nrosner@gmail.com">nrosner@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
Hi Ivan and all,<br>
<br>
We use MPICH2 (in user space) on a cluster that runs Torque/PBS (as<br>
provided by root).<br>
<br>
I never really managed to properly "integrate" the two (I'm not sure<br>
there's even a standard way to do that -- e.g. even if you were to use<br>
MPI2 spawn et al for dynamic proc mgmt, I suppose you'd still be<br>
trapped within the MPD-supplied MPI world, no?).<br>
<br>
But, frankly, so far I've had no real need for such a thing. So what I<br>
do is this: my job desc files (the .pbs text file, or whatever you'll<br>
qsub) contain<br>
<br>
1) a pipeline similar to the one Camilo described<br>
<br>
2) commands that ensure no old forgotten mpd processes remain out<br>
there (it's a !@$ when your whole job dies after days waiting because<br>
a ring failed to boot!)<br>
<br>
3) commands that ensure a new clean mpd ring gets booted properly<br>
w/the right args according to what we parsed in 1), etc.<br>
<br>
4) # put your favorite mpiexec here<br>
<br>
5) mpdallexit.<br>
<br>
That seems to work quite well, at least for my needs.<br>
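A sketch of such a .pbs jobspec, combining the five steps above (job name, node counts and paths are assumptions; mpdboot/mpdcleanup/mpdallexit per the MPICH2 mpd tool set — adjust to your site):<br>
<br>

```shell
#!/bin/sh
#PBS -N myjob
#PBS -l nodes=4:ppn=2

# 1) Build an mpd hosts file from the Torque node file
#    (one unique hostname per line; $PBS_NODEFILE repeats hosts per slot).
HOSTS="$HOME/mpd.hosts.$PBS_JOBID"
sort -u "$PBS_NODEFILE" > "$HOSTS"
NHOSTS=$(wc -l < "$HOSTS")
NPROCS=$(wc -l < "$PBS_NODEFILE")

# 2) Make sure no stale mpd daemons survive from earlier jobs.
mpdallexit 2>/dev/null
mpdcleanup -f "$HOSTS" 2>/dev/null

# 3) Boot a fresh ring across the allocated hosts; bail out if it fails.
mpdboot -n "$NHOSTS" -f "$HOSTS" || exit 1

# 4) Put your favorite mpiexec here.
mpiexec -n "$NPROCS" ./application

# 5) Tear the ring down and clean up.
mpdallexit
rm -f "$HOSTS"
```

<br>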
<br>
Cheers,<br>
N.<br>
<br>
<br>
PS: Hydra works like a charm on our 3-PC testing "minicluster" at the<br>
office (I really enjoy forgetting about the mpd ring drill<br>
altogether!), but I couldn't get it to stop choking on some DNS quirk<br>
of the real cluster (where, alas, I have no root), so I'm still using mpd<br>
there. If you're interested in some wrapper scripts (just hacks, but<br>
they do the job), do let me know.<br>
<br>
<br></blockquote><div><br></div><div>I've now moved from mpd to Hydra, and it has been working fine. It's still in the testing phase, but if everything goes well I think it's a good solution, since it's powerful and you don't have to mess with mpd's ring. Thanks a lot for your help.</div>
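<div>For the record, the Hydra-based jobspec I'm testing boils down to something like this (job name, node counts and the application name are placeholders):</div>
<div><br></div>

```shell
#!/bin/sh
#PBS -N myjob
#PBS -l nodes=4:ppn=2

# Hydra picks up the PBS allocation itself via -rmk pbs;
# there is no ring to boot or tear down.
mpiexec.hydra -rmk pbs ./application
```

<div><br></div>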
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div><div></div><div class="h5"><br>
<br>
<br>
<br>
<br>
On Mon, Jul 26, 2010 at 11:44 AM, Ivan Pulido <<a href="mailto:mefistofeles87@gmail.com">mefistofeles87@gmail.com</a>> wrote:<br>
><br>
><br>
> On Fri, Jul 23, 2010 at 6:24 PM, Pavan Balaji <<a href="mailto:balaji@mcs.anl.gov">balaji@mcs.anl.gov</a>> wrote:<br>
>><br>
>> Ivan,<br>
>><br>
>> Can you try using the Hydra process manager?<br>
>><br>
>> % mpiexec.hydra -rmk pbs ./application<br>
>><br>
><br>
> This didn't work; I'm not sure whether it has to do with the way I've set up my<br>
> cluster. When I try running that command specifying 20 nodes (-n 20), all the<br>
> processes are run on a single machine, and the PBS server doesn't know about<br>
> the application running (qstat doesn't show anything). Any ideas about<br>
> this are very welcome.<br>
><br>
> Thanks.<br>
><br>
>><br>
>> -- Pavan<br>
>><br>
>> On 07/23/2010 05:15 PM, Ivan Pulido wrote:<br>
>>><br>
>>> Hello, I'm trying to configure the Torque resource manager and MPICH2 (with<br>
>>> MPD), but I'm having some issues.<br>
>>><br>
>>> The MPICH2 user's guide says there's a way to convert the Torque node<br>
>>> file into one MPD can read, but this is outdated: the node-file syntax<br>
>>> Torque uses nowadays is not the one described in the guide, so I can't<br>
>>> follow it to use Torque with MPICH2. On the other hand, I tried OSC<br>
>>> mpiexec <a href="http://www.osc.edu/~djohnson/mpiexec/" target="_blank">http://www.osc.edu/~djohnson/mpiexec/</a> with no good results, since<br>
>>> it looks for a libpbs.a that is not part of Torque's default install (that<br>
>>> issue is one for Torque's mailing list).<br>
>>><br>
>>> So what I'm getting at is that the approaches the user's guide recommends<br>
>>> for using MPICH2 with Torque don't work with the newest versions of the<br>
>>> software involved. I'd like to know whether there's a way to use MPICH2<br>
>>> with Torque that actually works with current versions. I'd really<br>
>>> appreciate help with this, since we urgently need MPI on our cluster.<br>
>>><br>
>>> Thanks.<br>
>>><br>
>>> --<br>
>>> Ivan Pulido<br>
>>> Estudiante de Física<br>
>>> Universidad Nacional de Colombia<br>
>>><br>
>>><br>
>>> ------------------------------------------------------------------------<br>
>>><br>
>>> _______________________________________________<br>
>>> mpich-discuss mailing list<br>
>>> <a href="mailto:mpich-discuss@mcs.anl.gov">mpich-discuss@mcs.anl.gov</a><br>
>>> <a href="https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss" target="_blank">https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss</a><br>
>><br>
>> --<br>
>> Pavan Balaji<br>
>> <a href="http://www.mcs.anl.gov/~balaji" target="_blank">http://www.mcs.anl.gov/~balaji</a><br>
>> _______________________________________________<br>
>> mpich-discuss mailing list<br>
>> <a href="mailto:mpich-discuss@mcs.anl.gov">mpich-discuss@mcs.anl.gov</a><br>
>> <a href="https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss" target="_blank">https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss</a><br>
><br>
><br>
><br>
> --<br>
> Ivan Pulido<br>
> Estudiante de Física<br>
> Universidad Nacional de Colombia<br>
><br>
> _______________________________________________<br>
> mpich-discuss mailing list<br>
> <a href="mailto:mpich-discuss@mcs.anl.gov">mpich-discuss@mcs.anl.gov</a><br>
> <a href="https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss" target="_blank">https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss</a><br>
><br>
><br>
_______________________________________________<br>
mpich-discuss mailing list<br>
<a href="mailto:mpich-discuss@mcs.anl.gov">mpich-discuss@mcs.anl.gov</a><br>
<a href="https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss" target="_blank">https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss</a><br>
</div></div></blockquote></div><br><br clear="all"><br>-- <br>Ivan Pulido<br>Estudiante de Física<br>Universidad Nacional de Colombia<br>