[Nek5000-users] Submitting Job Nek Enabled Moab

nek5000-users at lists.mcs.anl.gov nek5000-users at lists.mcs.anl.gov
Tue Oct 28 07:12:51 CDT 2014


Hi All,


I am currently experiencing issues when trying to run jobs on my HPC system.


I have installed MOAB (with hdf5 and zoltan) as recommended on the Nek5000 webpages - https://nek5000.mcs.anl.gov/index.php/Building_and_Using_Nek_/_MOAB.


When I do this, the simulation runs if I acquire resources from the system and then launch the job using nekmpi [casename] [np]. However, if I then submit this same simulation to the queue using:


#!/bin/bash
#PBS -l nodes=1:ppn=12,pvmem=1900mb,walltime=01:00:00
#PBS -V

cd $PBS_O_WORKDIR
nekmpi pipe 8 > log
exit 0


the job fails with:


mv: cannot stat `pipe.log.8': No such file or directory
mv: cannot stat `pipe.sch': No such file or directory
node6.8452Error re-mmapping shared memory: Cannot allocate memory (err=9)
node6.8452Error re-mmapping shared memory: Cannot allocate memory
[node6:08452] Open MPI detected an unexpected PSM error in opening an endpoint: Error re-mmapping shared memory: Cannot allocate memory
--------------------------------------------------------------------------
mpiexec has exited due to process rank 7 with PID 8452 on
node node6 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpiexec (as reported here).
--------------------------------------------------------------------------
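
One thing that may be worth checking here, offered as a possibility rather than a confirmed diagnosis: the job requests pvmem=1900mb, which in Torque/PBS is normally a per-process virtual memory limit, and the PSM message reports a failed memory mapping. A minimal sketch for comparing the limits seen inside the batch job with those of the session where nekmpi works is shown below. The function name report_limits is hypothetical, and the assumption that the limits differ between the two environments is only a guess.

```shell
#!/bin/bash
# Hypothetical helper: print the memory-related limits of the current shell.
# Call it once in the session where the run works and once inside the
# PBS job script (before nekmpi), then compare the two outputs.
report_limits() {
    echo "host: $(uname -n)"
    echo "virtual memory (kB): $(ulimit -v)"
    echo "locked memory (kB): $(ulimit -l)"
}

# Example use inside the job script:
#   report_limits > limits.$PBS_JOBID.txt
```

If the batch job reports a much smaller virtual memory limit than the working session, resubmitting without pvmem (or with a larger value) would be a cheap experiment.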


I therefore suspected a problem with MOAB and reinstalled it according to the MOAB webpages - http://trac.mcs.anl.gov/projects/ITAPS/wiki/BuildingAndUsingMOAB.


When I submit the job now I see a different error:

mv: cannot stat `pipe.log.8': No such file or directory
mv: cannot stat `pipe.sch': No such file or directory
./nek5000: error while loading shared libraries: libsz.so.2: cannot open shared object file: No such file or directory
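
A message of this form means the dynamic loader could not find libsz.so.2 at run time on the compute node. That file name typically belongs to the SZIP compression library, which HDF5 can be built against. As a sketch for narrowing this down, the helper below lists every library that ldd reports as unresolved. The function name missing_libs is hypothetical, and the assumption that the library directory is simply absent from LD_LIBRARY_PATH in the batch environment has not been verified.

```shell
#!/bin/bash
# Hypothetical helper: read `ldd` output on stdin and print the names of
# shared libraries that the dynamic loader reports as "not found".
missing_libs() {
    awk '/not found/ { print $1 }'
}

# Example use inside the job script, before calling nekmpi
# (assumes ./nek5000 is in the working directory):
#   ldd ./nek5000 | missing_libs
#   echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
```

Running the same two commands in the session where the case works and in the batch job should show whether the library search path differs between them.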


I suspect the above error is linked to either netcdf or hdf5, but I cannot see how or why, as both appear to have installed correctly (as far as I can tell).


Thank you for any help,


Friedrich
