[MOAB-dev] Parallel Test Failure

Rajeev Jain jain at mcs.anl.gov
Mon Oct 27 12:30:14 CDT 2014


Did all your tests pass?
Are you building hdf5 with mpi support?
This could be due to a serial hdf5 build. I see you have:
FAIL: parallel_hdf5_test                 
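
A quick way to check the hdf5 build is to look at its configuration summary. The snippet below is only a sketch: it assumes the summary text (for example the output of `h5cc -showconfig`, or the `libhdf5.settings` file) contains a line of the form `Parallel HDF5: yes` or `Parallel HDF5: no`. The function name `hdf5_is_parallel` is made up for illustration.

```python
import re


def hdf5_is_parallel(settings_text):
    """Report whether an HDF5 configuration summary mentions parallel support.

    Returns True or False when a "Parallel HDF5:" line is found,
    and None when the summary has no such line (assumed format).
    """
    match = re.search(
        r"^\s*Parallel HDF5:\s*(\S+)",
        settings_text,
        re.MULTILINE | re.IGNORECASE,
    )
    if match is None:
        return None
    # Accept the spellings used by different build systems (assumption).
    return match.group(1).lower() in {"yes", "on", "true"}


# Hypothetical excerpt of a configuration summary from a serial build.
serial_summary = """
Features:
---------
                  Parallel HDF5: no
"""
print(hdf5_is_parallel(serial_summary))
```

If this reports False for the hdf5 that MOAB was configured against, a failing parallel_hdf5_test would not be surprising, although the other two failures may have a different cause.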

 

Rajeev Jain 
630-252-3176 / 630-252-5986 (fax)
jain at mcs.anl.gov 


On Monday, October 27, 2014 12:24 PM, "Grindeanu, Iulian R." <iulian at mcs.anl.gov> wrote:
 


 
Hi Friedrich,

What are you trying to do? 
Rajeev and Vijay have experience running / building nek5000 with moab.

Iulian


________________________________
 
From: Grabner, Friedrich [F.M.Grabner at warwick.ac.uk]
Sent: Monday, October 27, 2014 12:17 PM
To: Grindeanu, Iulian R.
Subject: Re: Parallel Test Failure


Hi Iulian,

I have managed to install it, although I believe there is still some issue with the installation on the HPC system.

However, when I now try to run my MOAB-enabled nek5000, I can only do so if I first reserve my resources on the system and then start the job manually:

nekmpi [casename] [np]


When I submit a job to the queue I receive the following error:

-bash: BASH_FUNC_module(): line 0: syntax error near unexpected token `)'
-bash: BASH_FUNC_module(): line 0: `BASH_FUNC_module() () {  eval $($LMOD_CMD bash "$@");'
-bash: error importing function definition for `BASH_FUNC_module'
/home/eng/esumgy/.bashrc: line 29: module: command not found
mv: cannot stat `pipe.log.8': No such file or directory
mv: cannot stat `pipe.sch': No such file or directory
node6.1259Error re-mmapping shared memory: Cannot allocate memory (err=9)
node6.1259Error re-mmapping shared memory: Cannot allocate memory
[node6:01259] Open MPI detected an unexpected PSM error in opening an endpoint: Error re-mmapping shared memory: Cannot allocate memory
--------------------------------------------------------------------------
mpiexec has exited due to process rank 5 with PID 1259 on
node node6 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpiexec (as reported here).
--------------------------------------------------------------------------

Could this be because I haven't correctly installed moab or one of its prerequisites?

Regards Friedrich





________________________________
 
From: Grindeanu, Iulian R. <iulian at mcs.anl.gov>
Sent: 27 October 2014 16:54
To: Grabner, Friedrich
Cc: MOAB dev
Subject: RE: Parallel Test Failure 
 
Hi Friedrich,
Thanks for your report.
I could not reproduce your error, so I don't know what is going on.
Are these the only errors you see in your make check? 
Can you try a newer version of moab, or hdf5?

We have several nightly builds on our systems that resemble your configuration, for example this one:
http://gnep.mcs.anl.gov:8010/builders/moab-par

More configurations are here:

http://gnep.mcs.anl.gov:8010/waterfall?category=moab

thanks,
Iulian



________________________________
 
From: Grabner, Friedrich [F.M.Grabner at warwick.ac.uk]
Sent: Friday, October 24, 2014 8:03 AM
To: Grindeanu, Iulian R.
Subject: Re: Parallel Test Failure


Hi Iulian,

See attached!

Friedrich


________________________________
 
From: Grindeanu, Iulian R. <iulian at mcs.anl.gov>
Sent: 24 October 2014 13:14
To: Grabner, Friedrich; moab-announce at mcs.anl.gov
Subject: RE: Parallel Test Failure 
 
Can you send your config.log file?
Thanks,
Iulian



________________________________
 
From: moab-announce-bounces at mcs.anl.gov [moab-announce-bounces at mcs.anl.gov] on behalf of Grabner, Friedrich [F.M.Grabner at warwick.ac.uk]
Sent: Friday, October 24, 2014 4:22 AM
To: moab-announce at mcs.anl.gov
Subject: [MOAB-announce] Parallel Test Failure


Hi All,

I am trying to install moab-4.6.3 on my HPC system, following the instructions on this webpage:

http://trac.mcs.anl.gov/projects/ITAPS/wiki/BuildingAndUsingMOAB

I ran everything exactly as specified on my desktop and the installation succeeded. On the HPC system, however, make check reports the following errors:

PASS: pcomm_unit
FAIL: parallel_unit_tests
PASS: uber_parallel_test
FAIL: scdtest
PASS: pcomm_serial
PASS: par_spatial_locator_test
FAIL: parallel_hdf5_test
PASS: mhdf_parallel
PASS: parallel_write_test
================================
3 of 9 tests failed
See test/parallel/test-suite.log
================================
make[4]: *** [test-suite.log] Error 1
make[4]: Leaving directory `/gpfs/home/eng/esumgy/moab-4.6.3/test/parallel'
make[3]: *** [check-TESTS] Error 2
make[3]: Leaving directory `/gpfs/home/eng/esumgy/moab-4.6.3/test/parallel'
make[2]: *** [check-am] Error 2
make[2]: Leaving directory `/gpfs/home/eng/esumgy/moab-4.6.3/test/parallel'
make[1]: *** [check-recursive] Error 1
make[1]: Leaving directory `/gpfs/home/eng/esumgy/moab-4.6.3/test
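
For comparing runs between the desktop and the HPC system, the PASS/FAIL lines in the output above can be collected with a few lines of Python. This is a rough sketch that assumes each result line has the form `STATUS: test_name`, as in the output above; the function name `summarize_make_check` is made up for illustration.

```python
def summarize_make_check(log_text):
    """Group test names from make check output by PASS or FAIL status."""
    results = {"PASS": [], "FAIL": []}
    for line in log_text.splitlines():
        status, sep, name = line.strip().partition(": ")
        # Ignore lines such as "make[4]: ..." or the summary banner.
        if sep and status in results:
            results[status].append(name.strip())
    return results


# Result lines copied from the make check output above.
sample = """PASS: pcomm_unit
FAIL: parallel_unit_tests
PASS: uber_parallel_test
FAIL: scdtest
PASS: pcomm_serial
PASS: par_spatial_locator_test
FAIL: parallel_hdf5_test
PASS: mhdf_parallel
PASS: parallel_write_test
3 of 9 tests failed
make[4]: *** [test-suite.log] Error 1
"""

summary = summarize_make_check(sample)
print("failed:", ", ".join(summary["FAIL"]))
```

The per-test details for each failure are still only in test/parallel/test-suite.log.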

I have attached my test-suite.log file.

Thanks in advance, Friedrich