[Nek5000-users] How to run Nek5000 in cluster based on mpich?

nek5000-users at lists.mcs.anl.gov nek5000-users at lists.mcs.anl.gov
Thu Jun 5 04:10:12 CDT 2014


Hi All,

My apologies, I actually never use the "rm -f SESSION.NAME" in my scripts.  I think
it was put there just so you wouldn't have too many junk files.  I suggest commenting
it out or removing it.

Paul

________________________________
From: nek5000-users-bounces at lists.mcs.anl.gov [nek5000-users-bounces at lists.mcs.anl.gov] on behalf of nek5000-users at lists.mcs.anl.gov [nek5000-users at lists.mcs.anl.gov]
Sent: Thursday, June 05, 2014 12:41 AM
To: Nek5000
Subject: Re: [Nek5000-users] How to run Nek5000 in cluster based on mpich?

I see, maybe the problem is specific to my machine too, my god. Thank you for your help.

Xianbei





At 2014-06-05 01:35:31, nek5000-users at lists.mcs.anl.gov wrote:
I’m just launching the code as explained in the ‘quick start’ chapter of the user manual. On my computer I had a problem with the sleep time specified in the nek script. That problem is specific to my machine, but your description reminded me of it. I don’t know whether it is really related, though. Good luck!




Best regards,
Emmanuel



On 5 Jun 2014, at 14:58, nek5000-users at lists.mcs.anl.gov wrote:

Hi, Emmanuel:
     I tried your method, but even after changing the sleep to 10 seconds I still get the same error. Could you explain in more detail how you managed to run on different nodes? I mean, how do you set up and run the job, and what is different from the usual single-machine case?
Best regards
Xianbei





At 2014-06-05 12:57:27, nek5000-users at lists.mcs.anl.gov wrote:
Hi all,

I just would like to bring my experience because I had a similar problem.

In my case it appears that the mpiexec command takes some time to start, and it has not finished starting by the time the “sleep 2” instruction ends. So the command rm -f SESSION.NAME is executed before mpiexec has fully started. Consequently, nek can’t find the session file and looks for nek.rea, which doesn’t exist.
In my case the problem appears when running on more than 4 CPUs.

I just added some more seconds to the sleeping time and everything works fine now. "Sleep 5" is reasonable in my case to be able to launch nek on 24 cpus.
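
For what it is worth, the race described above can also be avoided without tuning the sleep time. The following is only a minimal sketch, not the shipped nekbmpi script: the function name run_nek_bg and the NEK_LAUNCHER variable are made up for illustration, and the two-line SESSION.NAME layout (case name, then working directory) is an assumption about what the stock scripts write. The idea is to put the solver and the cleanup in the same background subshell, so the session file is removed only after mpiexec has exited.

```shell
#!/bin/sh
# Sketch of a background launcher in the spirit of nekbmpi (NOT the shipped script).
# Usage: run_nek_bg <case> <nprocs>
# NEK_LAUNCHER is a hypothetical override so the sketch can be tried without MPI.
run_nek_bg() {
    case_name=$1
    nprocs=$2
    launcher=${NEK_LAUNCHER:-mpiexec}

    # Assumed session file layout: case name, then working directory.
    echo "$case_name" > SESSION.NAME
    echo "$(pwd)/" >> SESSION.NAME

    # Run the solver and the cleanup in one background subshell, so that
    # SESSION.NAME is removed only after the launcher has exited,
    # however long the startup takes.
    (
        $launcher -n "$nprocs" ./nek5000 > "$case_name.log.$nprocs" 2>&1
        rm -f SESSION.NAME
    ) &
}
```

Calling run_nek_bg xxx 24 returns immediately, as nekbmpi does, but the cleanup no longer depends on a fixed delay. Simply dropping the rm line has the same effect if leftover SESSION.NAME files are not a concern.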

Hope that helps!


Best regards,
Emmanuel



On 5 Jun 2014, at 14:11, nek5000-users at lists.mcs.anl.gov wrote:

Hi Paul,
    I changed this line:
    mpiexec -np $2 ./nek5000 > $1.log.$2 &
    to
    mpiexec -n $2 ./nek5000 > $1.log.$2 &
    in nekbmpi, and then renamed the script to 'runnek' on each node.
    Then I typed ./runnek xxx 48
    However, the error still exists.
Xianbei




At 2014-06-04 11:20:52, nek5000-users at lists.mcs.anl.gov wrote:

Hi Xianbei,

Have a look at the nekbmpi script and modify to match the job submission
procedure on your cluster.
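
As an illustration only, one way such a modification might look on a plain MPICH cluster without a batch system is to give mpiexec a host file. This is a sketch under assumptions: the helper write_hostfile, the file name hosts.txt, and the node names are all made up, and it presumes MPICH's Hydra mpiexec (which accepts -f <hostfile> with one "host:slots" entry per line) and a case directory that is visible under the same path on every node.

```shell
#!/bin/sh
# Hypothetical helper: write an MPICH (Hydra) host file,
# one "host:slots" line per node.
# Usage: write_hostfile <file> <slots-per-node> <node>...
write_hostfile() {
    out=$1
    slots=$2
    shift 2
    : > "$out"                      # create or truncate the host file
    for node in "$@"; do
        echo "$node:$slots" >> "$out"
    done
}

# Example with made-up node names; replace with the cluster's real hostnames.
write_hostfile hosts.txt 12 node01 node02 node03 node04

# The launch line in the script would then become something like:
#   mpiexec -f hosts.txt -n "$2" ./nek5000 > "$1.log.$2" &
```

With four nodes at 12 slots each, a call such as ./runnek xxx 48 would then be spread over all the listed machines rather than started on the local one only.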

Paul

________________________________
From: nek5000-users-bounces at lists.mcs.anl.gov [nek5000-users-bounces at lists.mcs.anl.gov] on behalf of nek5000-users at lists.mcs.anl.gov [nek5000-users at lists.mcs.anl.gov]
Sent: Wednesday, June 04, 2014 9:34 AM
To: Nek5000
Subject: [Nek5000-users] How to run Nek5000 in cluster based on mpich?

Hi all,
   I have just built a small cluster of computers based on MPICH. I have tested the cluster, and every node can be called without problems. Now I want to run my simulations on it, but I don't know how to run Nek there. As far as I know, on a single computer one can run:
  nekmpi XXX X
or
  nekbmpi XXX X
  but this does not work on the cluster. I believe most of you have tried this, and I would appreciate any advice.
Best regards
Xianbei


_______________________________________________
Nek5000-users mailing list
Nek5000-users at lists.mcs.anl.gov
https://lists.mcs.anl.gov/mailman/listinfo/nek5000-users


