[Nek5000-users] Problem on cluster

nek5000-users at lists.mcs.anl.gov nek5000-users at lists.mcs.anl.gov
Mon Apr 16 20:57:04 CDT 2018


There is a NOMPIIO option.

On 16 Apr 2018, at 21:24, nek5000-users at lists.mcs.anl.gov wrote:

Stefan, 

 
Is there a way to run this without the flock option?  For a variety of reasons we don’t support it on our system, and I have been trying to run Nek but am hitting the same issue.  Are there any compile flags I can use, or perhaps a different version of MPI?  Recommendations welcome.

 
Thanks, 

   Julie

 
From: Nek5000-users <nek5000-users-bounces at lists.mcs.anl.gov> on behalf of nek5000-users at lists.mcs.anl.gov
Reply-To: nek5000-users at lists.mcs.anl.gov
Date: Monday, April 16, 2018 at 7:49 PM
To: nek5000-users at lists.mcs.anl.gov
Subject: Re: [Nek5000-users] Problem on cluster

 
Before updating, can you please check whether the following advice helps:

 
If the file system is LUSTRE, ensure that the directory is mounted with the 'flock' option




Stefan


On 16 Apr 2018, at 19:39, nek5000-users at lists.mcs.anl.gov wrote:

Please use the release tarball instead of the GitHub master! The error message suggests that your MPI installation is outdated and that some necessary MPI-IO features are missing. I think updating MPI will do the trick.

 
Stefan.

 

On 16 Apr 2018, at 17:46, nek5000-users at lists.mcs.anl.gov wrote:

Dear Nek users, I'm having a problem when I use the latest version of Nek on an HPC cluster. I can compile, but when I run my simulations they stop prematurely. The logfile of a generic case looks like the following:

 
 
===========================================================

 
 
/----------------------------------------------------------\\

|      _   __ ______ __ __  ______  ____   ____   ____     |

|     / | / // ____// //_/ / ____/ / __ \\/ __ \\/ __ \\   |

|    /  |/ // __/  / ,<   /___ \\ / / / // / / // / / /    |

|   / /|  // /___ / /| | ____/ / / /_/ // /_/ // /_/ /     |

|  /_/ |_//_____//_/ |_|/_____/  \\___/ \\___/ \\___/      |

|                                                          |

|----------------------------------------------------------|

|                                                          |

| NEK5000:  Open Source Spectral Element Solver            |

| COPYRIGHT (c) 2008-2017 UCHICAGO ARGONNE, LLC            |

| Version:  17.0-rc1                                       |

| Web:      http://nek5000.mcs.anl.gov                     |

|                                                          |

\\----------------------------------------------------------/

                                                              

 
 Number of processors:          80

 REAL    wdsize      :           8

 INTEGER wdsize      :           4

 Timer accuracy      : 0.00E+00

  
 Reading /home/jrobinson/casos/Placa_6/Placa_6.rea                                                                                           

 Reading /home/jrobinson/casos/Placa_6/Placa_6.re2                                                                                           

 mapping elements to processors

 Reading /home/jrobinson/casos/Placa_6/Placa_6.ma2                                                                                           

 RANK     0 IEG    1754    1755    1756    1757    1758    1759    1760    1774

                   1775    1776    1777    1778    1779    1780    1794    1795

                   1796    1797    1798    1799    1800    1814    1815    1816

                   1817    1818    1819    1820    1834    1835    1836    1837

                   1838    1839    1840    1853    1854    1855    1856    1857

                   1858    1859    1860    1873    1874    1875    1876    1877

                   1878    1879    1880    1893    1894    1895    1896    1897

                   1898    1899    1913    1914    1915    1916    1917    1918

                   1919    1933    1934    1935    1936    1937    1938    1939

                   1953    1954    1955    1956    1957    1958    1974    1975

                   1976    1977    1978    1994    1995    1996    1997    1998

                   1999    2000    2014    2015    2016    2017    2018    2019

                   2020    9783    9784    9785    9786    9787    9788    9789

                   9790    9791    9792    9793    9794    9795    9796    9797

                   9798    9799    9800    9801    9802    9803    9804    9805

                   9806    9807    9808    9809    9810    9811    9812    9813

                   9814    9815    9816    9817    9818    9819    9820    9821

                   9822    9823    9824    9825    9826    9827    9828    9829

                   9830    9849    9855    9856    9861    9862

  
 element load imbalance:            1         150         151

 done :: mapping   0.32155     sec

 
  
  preading mesh 

 
=============================================================

 
 
 
 
So the last line is "preading mesh".

 
This doesn't give much information, but the cluster generates a file with the following errors (included at the end of this message).

 
When I use an old version of Nek on this same cluster, I have no problem running my cases. The problem is that I need to use the latest version because I'm using the exo2nek routine for my meshes generated with Trelis (Cubit).

 
 
Any idea what I could do?

 
Thank you all.

 
Juan Pablo.

 
 
 
 
 
 
=========================================================

 
 
This requires fcntl(2) to be implemented. As of 8/25/2011 it is not. Generic MPICH Message: File locking failed in ADIOI_Set_lock(fd 4A,cmd F_SETLKW/7,type F_RDLCK/0,whence 0) with return value FFFFFFFF and errno 26.

- If the file system is NFS, you need to use NFS version 3, ensure that the lockd daemon is running on all the machines, and mount the directory with the 'noac' option (no attribute caching).

- If the file system is LUSTRE, ensure that the directory is mounted with the 'flock' option.

ADIOI_Set_lock:: Function not implemented
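
The message above reports that a read lock requested through fcntl(2) failed. The "errno 26" is presumably printed in hexadecimal (38 decimal, which is ENOSYS on Linux), consistent with "Function not implemented". A minimal sketch for checking whether a given directory accepts such locks, independently of Nek5000 and MPI, could look like the following. The function name supports_fcntl_locks and the temp-file prefix are made up for illustration, and the check only tries a non-blocking shared lock from the node it runs on, so it is a rough diagnostic and not a substitute for inspecting the mount options.

```python
import errno
import fcntl
import os
import tempfile


def supports_fcntl_locks(directory):
    """Try a shared (read) fcntl lock on a scratch file in `directory`.

    Returns (ok, message). The name and return shape are illustrative
    assumptions, not part of Nek5000 or MPICH.
    """
    # mkstemp opens the file read/write, which a shared lock requires.
    fd, path = tempfile.mkstemp(prefix="locktest_", dir=directory)
    try:
        try:
            # lockf() is implemented with fcntl(2); LOCK_SH requests a
            # read lock (F_RDLCK), LOCK_NB avoids blocking if it is held.
            fcntl.lockf(fd, fcntl.LOCK_SH | fcntl.LOCK_NB)
        except OSError as exc:
            name = errno.errorcode.get(exc.errno, "unknown")
            return False, f"lock failed: errno {exc.errno} ({name}): {exc.strerror}"
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True, "shared fcntl lock acquired and released"
    finally:
        # Always clean up the scratch file, whether or not locking worked.
        os.close(fd)
        os.unlink(path)
```

Calling supports_fcntl_locks() with the case directory (for example the directory holding the .re2 file) and printing the result should show whether locking fails there with ENOSYS, as in the MPICH message, or succeeds.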

 

_______________________________________________
Nek5000-users mailing list
Nek5000-users at lists.mcs.anl.gov
https://lists.mcs.anl.gov/mailman/listinfo/nek5000-users

