[Nek5000-users] How to run Nek5000 in cluster based on mpich?

nek5000-users at lists.mcs.anl.gov nek5000-users at lists.mcs.anl.gov
Wed Jun 4 23:57:27 CDT 2014


Hi all,

I would just like to share my experience, because I had a similar problem.

In my case it appears that the mpiexec command takes some time to start up, and it has not finished doing so by the time the "sleep 2" instruction ends. So the command rm -f SESSION.NAME is executed before mpiexec has finished launching nek. Consequently, nek can't find the session file and looks for nek.rea instead, which doesn't exist.
In my case the problem only appears when running on more than 4 CPUs.

I simply increased the sleep time by a few seconds and everything works fine now. In my case "sleep 5" is enough to launch nek on 24 CPUs.
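
A fixed sleep is still a guess, so here is a minimal sketch of an alternative: poll the log file until nek has written something that shows it is past start-up, and only then remove SESSION.NAME. The helper name wait_for_pattern and the placeholder pattern are my own assumptions and not part of the Nek5000 scripts; you would have to pick a line from your own log that is printed after the session file has been read.

```shell
#!/bin/bash
# Sketch only: wait_for_pattern is a hypothetical helper, not part of Nek5000.
# Usage: wait_for_pattern FILE PATTERN TIMEOUT_SECONDS
# Returns 0 as soon as PATTERN appears in FILE, 1 if the timeout expires first.
wait_for_pattern() {
    local file=$1 pattern=$2 timeout=$3
    local elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # The file may not exist yet while mpiexec is still starting up.
        if [ -f "$file" ] && grep -q -e "$pattern" "$file"; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}

# Possible use in the run script, in place of the fixed "sleep 2"
# (PATTERN_FROM_YOUR_LOG is a placeholder, choose it from your own log):
#   mpiexec -np $2 ./nek5000 > $1.log.$2 &
#   wait_for_pattern "$1.log.$2" "PATTERN_FROM_YOUR_LOG" 60
#   rm -f SESSION.NAME
```

If the pattern never shows up, the function gives up after the timeout, so the script behaves no worse than one with a long fixed sleep.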

Hope that helps!


Best regards,
Emmanuel



On 5 Jun 2014, at 14:11, nek5000-users at lists.mcs.anl.gov wrote:

Hi Paul,
    I changed this line:
    mpiexec -np $2 ./nek5000 > $1.log.$2 &
    to
    mpiexec -n $2 ./nek5000 > $1.log.$2 &
    in nekbmpi, and then renamed the script to 'runnek' on each node.
    Then I typed ./runnek xxx 48
    However, the error still exists.
Xianbei




At 2014-06-04 11:20:52, nek5000-users at lists.mcs.anl.gov wrote:

Hi Xianbei,

Have a look at the nekbmpi script and modify to match the job submission
procedure on your cluster.
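
As an illustration only, on a plain MPICH cluster without a batch scheduler the change could amount to giving mpiexec a host file. The sketch below assumes MPICH's Hydra mpiexec, which reads a "host:slots" list through its -f option; the helper name write_hostfile, the file name hosts.txt, the node names, and the slot count are all placeholders, and the case directory is assumed to be visible at the same path on every node.

```shell
#!/bin/bash
# Sketch only: host names, slot counts, and file names are placeholders.
# Usage: write_hostfile FILE SLOTS HOST...
# Writes one "host:slots" line per host, the format accepted by
# MPICH's Hydra mpiexec via "-f FILE".
write_hostfile() {
    local file=$1 slots=$2
    shift 2
    : > "$file"                     # create or truncate the host file
    local host
    for host in "$@"; do
        printf '%s:%s\n' "$host" "$slots" >> "$file"
    done
}

# Example: 4 hypothetical nodes with 12 processes each, 48 in total.
write_hostfile hosts.txt 12 node01 node02 node03 node04

# The launch line in the run script would then look something like:
#   mpiexec -f hosts.txt -n $2 ./nek5000 > $1.log.$2 &
```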

Paul

________________________________
From: nek5000-users-bounces at lists.mcs.anl.gov [nek5000-users-bounces at lists.mcs.anl.gov] on behalf of nek5000-users at lists.mcs.anl.gov [nek5000-users at lists.mcs.anl.gov]
Sent: Wednesday, June 04, 2014 9:34 AM
To: Nek5000
Subject: [Nek5000-users] How to run Nek5000 in cluster based on mpich?

Hi all,
   I have just built a small cluster of computers based on mpich. I have tested the cluster, and every node can be used without problems. Now I want to run my simulation on it, but I don't know how to run Nek there. As far as I know, on a single computer one can do this:
  nekmpi XXX X
or
  nekbmpi XXX X
  but this does not work on the cluster. I believe most of you have tried this, and I would appreciate it if anyone could give me some advice.
Best regards
Xianbei


_______________________________________________
Nek5000-users mailing list
Nek5000-users at lists.mcs.anl.gov<mailto:Nek5000-users at lists.mcs.anl.gov>
https://lists.mcs.anl.gov/mailman/listinfo/nek5000-users
