[Nek5000-users] Problem on cluster
nek5000-users at lists.mcs.anl.gov
Mon Apr 16 20:03:00 CDT 2018
Stefan,
Is there a way to run this without the flock option? For a variety of reasons we don’t support it on our system, and I have been trying to run Nek but keep hitting the same issue. Are there any compile flags that I can use, or perhaps a different version of MPI? Recommendations welcome.
Thanks,
Julie
From: Nek5000-users <nek5000-users-bounces at lists.mcs.anl.gov> on behalf of "nek5000-users at lists.mcs.anl.gov" <nek5000-users at lists.mcs.anl.gov>
Reply-To: "nek5000-users at lists.mcs.anl.gov" <nek5000-users at lists.mcs.anl.gov>
Date: Monday, April 16, 2018 at 7:49 PM
To: "nek5000-users at lists.mcs.anl.gov" <nek5000-users at lists.mcs.anl.gov>
Subject: Re: [Nek5000-users] Problem on cluster
Before updating, can you please check whether the following advice helps:
If the file system is LUSTRE, ensure that the directory is mounted with the 'flock' option
Stefan
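
One way to check the mount options Stefan mentions is to look at the Lustre entries in /proc/mounts. The sketch below is only an illustration and assumes a Linux node where Lustre mounts list `flock` among their options when it is enabled; the function name `lustre_flock_status` is made up for this example and is not part of Nek5000 or MPI.

```python
from pathlib import Path


def lustre_flock_status(mounts_text):
    """Map each Lustre mount point in mounts_text to True if 'flock' is set."""
    status = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts columns: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] != "lustre":
            continue
        options = fields[3].split(",")
        status[fields[1]] = "flock" in options
    return status


if __name__ == "__main__":
    mounts = Path("/proc/mounts")  # Linux only; skipped if the file is absent
    if mounts.exists():
        for mount_point, has_flock in lustre_flock_status(mounts.read_text()).items():
            print(mount_point, "flock" if has_flock else "flock not set")
```

If the case directory sits on a mount reported as "flock not set", that would be consistent with the locking error, although only the system administrators can change the mount options.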
On 16 Apr 2018, at 19:39, "nek5000-users at lists.mcs.anl.gov" <nek5000-users at lists.mcs.anl.gov> wrote:
Please use the release tarball instead of the GitHub master! The error message suggests that your MPI installation is outdated, so some necessary MPI-IO features are missing. I think updating MPI will do the trick.
Stefan.
On 16 Apr 2018, at 17:46, "nek5000-users at lists.mcs.anl.gov" <nek5000-users at lists.mcs.anl.gov> wrote:
Dear Nek users, I'm having a problem when I use the latest version of Nek on an HPC cluster. I can compile, but when I run my simulations they terminate early. The logfile of a generic case looks like the following:
===========================================================
/----------------------------------------------------------\\
| _ __ ______ __ __ ______ ____ ____ ____ |
| / | / // ____// //_/ / ____/ / __ \\/ __ \\/ __ \\ |
| / |/ // __/ / ,< /___ \\ / / / // / / // / / / |
| / /| // /___ / /| | ____/ / / /_/ // /_/ // /_/ / |
| /_/ |_//_____//_/ |_|/_____/ \\___/ \\___/ \\___/ |
| |
|----------------------------------------------------------|
| |
| NEK5000: Open Source Spectral Element Solver |
| COPYRIGHT (c) 2008-2017 UCHICAGO ARGONNE, LLC |
| Version: 17.0-rc1 |
| Web: http://nek5000.mcs.anl.gov |
| |
\\----------------------------------------------------------/
Number of processors: 80
REAL wdsize : 8
INTEGER wdsize : 4
Timer accuracy : 0.00E+00
Reading /home/jrobinson/casos/Placa_6/Placa_6.rea
Reading /home/jrobinson/casos/Placa_6/Placa_6.re2
mapping elements to processors
Reading /home/jrobinson/casos/Placa_6/Placa_6.ma2
RANK 0 IEG 1754 1755 1756 1757 1758 1759 1760 1774
1775 1776 1777 1778 1779 1780 1794 1795
1796 1797 1798 1799 1800 1814 1815 1816
1817 1818 1819 1820 1834 1835 1836 1837
1838 1839 1840 1853 1854 1855 1856 1857
1858 1859 1860 1873 1874 1875 1876 1877
1878 1879 1880 1893 1894 1895 1896 1897
1898 1899 1913 1914 1915 1916 1917 1918
1919 1933 1934 1935 1936 1937 1938 1939
1953 1954 1955 1956 1957 1958 1974 1975
1976 1977 1978 1994 1995 1996 1997 1998
1999 2000 2014 2015 2016 2017 2018 2019
2020 9783 9784 9785 9786 9787 9788 9789
9790 9791 9792 9793 9794 9795 9796 9797
9798 9799 9800 9801 9802 9803 9804 9805
9806 9807 9808 9809 9810 9811 9812 9813
9814 9815 9816 9817 9818 9819 9820 9821
9822 9823 9824 9825 9826 9827 9828 9829
9830 9849 9855 9856 9861 9862
element load imbalance: 1 150 151
done :: mapping 0.32155 sec
preading mesh
=============================================================
So the last line is "preading mesh".
This doesn't give too much information, but the cluster generates a file with the following errors (at the end of this text).
When I use an old version of Nek on this same cluster, I have no problem running my cases. The problem is that I need to use the latest version because I'm using the exo2nek routine for my meshes generated with Trelis (Cubit).
Any idea what I could do?
Thank you all.
Juan Pablo.
=========================================================
This requires fcntl(2) to be implemented. As of 8/25/2011 it is not. Generic MPICH Message: File locking failed in ADIOI_Set_lock(fd 4A,cmd F_SETLKW/7,type F_RDLCK/0,whence 0) with return value FFFFFFFF and errno 26.
- If the file system is NFS, you need to use NFS version 3, ensure that the lockd daemon is running on all the machines, and mount the directory with the 'noac' option (no attribute caching).
- If the file system is LUSTRE, ensure that the directory is mounted with the 'flock' option.
ADIOI_Set_lock:: Function not implemented
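
The message above reports a blocking read lock (F_SETLKW with F_RDLCK) failing with errno 26, which appears to be printed in hexadecimal: 0x26 is 38, and on Linux errno 38 is ENOSYS, "Function not implemented". A minimal sketch for checking whether a given directory accepts such a lock at all, independent of Nek5000 and MPI, could look like this; `can_take_read_lock` is a hypothetical helper name, and the probe only approximates what the MPI-IO layer does.

```python
import errno
import fcntl
import os
import tempfile


def can_take_read_lock(directory):
    """Try a blocking shared fcntl lock on a scratch file inside `directory`.

    Returns (True, None) on success, or (False, errno_name) on failure.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        try:
            # LOCK_SH without LOCK_NB requests a blocking read lock,
            # i.e. the F_SETLKW / F_RDLCK combination from the error message.
            fcntl.lockf(fd, fcntl.LOCK_SH)
            fcntl.lockf(fd, fcntl.LOCK_UN)
            return True, None
        except OSError as exc:
            return False, errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    # Probe the current directory; run this from the case directory on the cluster.
    print(can_take_read_lock(os.getcwd()))
```

A result such as `(False, 'ENOSYS')` in the case directory would point at the file system mount rather than at Nek5000 itself.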
_______________________________________________
Nek5000-users mailing list
Nek5000-users at lists.mcs.anl.gov
https://lists.mcs.anl.gov/mailman/listinfo/nek5000-users