[Nek5000-users] Running NEK on CRAY; compilation & running issues

nek5000-users at lists.mcs.anl.gov nek5000-users at lists.mcs.anl.gov
Mon Jan 30 06:57:13 CST 2012


Oliver,

Thanks for your comments.  We'll certainly take care of the
multd() and gsync() issues.

We have what I believe is an XE6 on site, and several users
are using it regularly. They may have some comments about
the flags, etc., and the issues that you are running into.
Hopefully, someone will get back to you early today.

Regards,

Paul


On Mon, 30 Jan 2012, nek5000-users at lists.mcs.anl.gov wrote:

> Dear NEKs,
>
> I'm currently trying to compile NEK with crayftn to run on an XE6, and
> I have run into some issues. I solved some of the compilation problems
> (with kind help from Cray personnel). I mention them here because I
> assume they may be part of the problem.
>
> 1) changed
> *ftn*) P="-r8 -Mpreprocess"
> to *ftn*) P="-s real64 -eZ -em"
> in makenek.inc, for double-precision reals and to invoke the
> preprocessor (Cray uses the ftn command as a wrapper)
>
> 2) changed subroutine name “gsync” to “gsync_nek” in all source files,
> as crayftn has a built-in routine of the same name that conflicts with
> Nek's gsync
>
> 3) changed calls to subroutine “multd” from
> CALL MULTD (TA1,TVX,RXM2,SXM2,TXM2,1)
> to CALL MULTD (TA1,TVX,RXM2,SXM2,TXM2,1,0)
> in “navier1.f”, as crayftn complains about a missing seventh argument,
> “iflg” (the calls come from subroutine “tmultd”, which does not seem
> to be called anywhere in the code, though)
>
> In the SIZE file I'm setting lelt=300, lelv=lelt, lp=64, lelg=300.
> With these changes I'm running the eddy_uv example via
> “aprun -n 64 -N 32 ./nek5000” on 64 processors and I'm getting the
> following output:
>
> /----------------------------------------------------------\
> | _ __ ______ __ __ ______ ____ ____ ____ |
> | / | / // ____// //_/ / ____/ / __ \ / __ \ / __ \ |
> | / |/ // __/ / ,< /___ \ / / / // / / // / / / |
> | / /| // /___ / /| | ____/ / / /_/ // /_/ // /_/ / |
> | /_/ |_//_____//_/ |_|/_____/ \____/ \____/ \____/ |
> | |
> |----------------------------------------------------------|
> | |
> | NEK5000: Open Source Spectral Element Solver |
> | COPYRIGHT (c) 2008-2010 UCHICAGO ARGONNE, LLC |
> | Version: 1.0rc1 / SVN r730 |
> | Web: http://nek5000.mcs.anl.gov |
> | |
> \----------------------------------------------------------/
>
>
> Number of processors: 64
> REAL wdsize : 8
> INTEGER wdsize : 4
>
>
> Beginning session:
> /zhome/academic/HLRS/iag/iagoschm/run_nek/eddy_example/eddy_uv.rea
>
>
> timer accuracy: 2.8610229E-07 sec
>
> read .rea file
> nelgt/nelgv/lelt: 256 256 300
> lx1 /lx2 /lx3 : 8 6 8
>
> mapping elements to processors
> 0, 2*4, 2*256 NELV
> 1, 2*4, 2*256 NELV
> 8, 2*4, 2*256 NELV
> 9, 2*4, 2*256 NELV
> 17, 2*4, 2*256 NELV
> 16, 2*4, 2*256 NELV
> 20, 2*4, 2*256 NELV
> 3*4, 2*256 NELV
> 29, 2*4, 2*256 NELV
> 21, 2*4, 2*256 NELV
> 28, 2*4, 2*256 NELV
> 7, 2*4, 2*256 NELV
> 31, 2*4, 2*256 NELV
> 25, 2*4, 2*256 NELV
> 11, 2*4, 2*256 NELV
> 12, 2*4, 2*256 NELV
> 6, 2*4, 2*256 NELV
> 30, 2*4, 2*256 NELV
> 18, 2*4, 2*256 NELV
> 19, 2*4, 2*256 NELV
> 2, 2*4, 2*256 NELV
> 5, 2*4, 2*256 NELV
> 23, 2*4, 2*256 NELV
> 27, 2*4, 2*256 NELV
> 10, 2*4, 2*256 NELV
> 22, 2*4, 2*256 NELV
> 13, 2*4, 2*256 NELV
> 14, 2*4, 2*256 NELV
> 15, 2*4, 2*256 NELV
> 3, 2*4, 2*256 NELV
> 26, 2*4, 2*256 NELV
> 24, 2*4, 2*256 NELV
> 33, 2*4, 2*256 NELV
> 41, 2*4, 2*256 NELV
> 32, 2*4, 2*256 NELV
> 40, 2*4, 2*256 NELV
> 45, 2*4, 2*256 NELV
> 44, 2*4, 2*256 NELV
> 49, 2*4, 2*256 NELV
> 34, 2*4, 2*256 NELV
> 35, 2*4, 2*256 NELV
> 48, 2*4, 2*256 NELV
> 39, 2*4, 2*256 NELV
> 38, 2*4, 2*256 NELV
> 50, 2*4, 2*256 NELV
> 51, 2*4, 2*256 NELV
> 37, 2*4, 2*256 NELV
> 36, 2*4, 2*256 NELV
> 55, 2*4, 2*256 NELV
> 54, 2*4, 2*256 NELV
> 57, 2*4, 2*256 NELV
> 58, 2*4, 2*256 NELV
> 43, 2*4, 2*256 NELV
> 42, 2*4, 2*256 NELV
> 47, 2*4, 2*256 NELV
> 53, 2*4, 2*256 NELV
> 59, 2*4, 2*256 NELV
> 46, 2*4, 2*256 NELV
> 60, 2*4, 2*256 NELV
> 61, 2*4, 2*256 NELV
> 63, 2*4, 2*256 NELV
> 62, 2*4, 2*256 NELV
> 52, 2*4, 2*256 NELV
> 56, 2*4, 2*256 NELV
> 25, 0, 2*4, 256 NELT FAIL
> 24, 0, 2*4, 256 NELT FAIL
> 21, 0, 2*4, 256 NELT FAIL
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> 20, 0, 2*4, 256 NELT FAIL
> Check that .map file and .rea file agree
> 31, 0, 2*4, 256 NELT FAIL
> 8, 0, 2*4, 256 NELT FAIL
> Check that .map file and .rea file agree
> 17, 0, 2*4, 256 NELT FAIL
> 14, 0, 2*4, 256 NELT FAIL
> Check that .map file and .rea file agree
> 15, 0, 2*4, 256 NELT FAIL
> 16, 0, 2*4, 256 NELT FAIL
> 26, 0, 2*4, 256 NELT FAIL
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> 10, 0, 2*4, 256 NELT FAIL
> 2, 0, 2*4, 256 NELT FAIL
> 30, 0, 2*4, 256 NELT FAIL
> 11, 0, 2*4, 256 NELT FAIL
> 7, 0, 2*4, 256 NELT FAIL
> 1, 0, 2*4, 256 NELT FAIL
> Check that .map file and .rea file agree
> 2*0, 2*4, 256 NELT FAIL
> 6, 0, 2*4, 256 NELT FAIL
> 4, 0, 2*4, 256 NELT FAIL
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> 13, 0, 2*4, 256 NELT FAIL
> 12, 0, 2*4, 256 NELT FAIL
> 28, 0, 2*4, 256 NELT FAIL
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> 18, 0, 2*4, 256 NELT FAIL
> 29, 0, 2*4, 256 NELT FAIL
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> 23, 0, 2*4, 256 NELT FAIL
> Check that .map file and .rea file agree
> 27, 0, 2*4, 256 NELT FAIL
> 3, 0, 2*4, 256 NELT FAIL
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> 19, 0, 2*4, 256 NELT FAIL
> Check that .map file and .rea file agree
> 5, 0, 2*4, 256 NELT FAIL
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> 22, 0, 2*4, 256 NELT FAIL
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> 2*0, 2*4, 256 NELT FB
> 1, 0, 2*4, 256 NELT FB
> 2, 0, 2*4, 256 NELT FB
> 3, 0, 2*4, 256 NELT FB
> 4, 0, 2*4, 256 NELT FB
> 5, 0, 2*4, 256 NELT FB
> 6, 0, 2*4, 256 NELT FB
> 7, 0, 2*4, 256 NELT FB
> 8, 0, 2*4, 256 NELT FB
> 9, 3*4, 256 NELT FB
> 10, 0, 2*4, 256 NELT FB
> 11, 0, 2*4, 256 NELT FB
> 12, 0, 2*4, 256 NELT FB
> 13, 0, 2*4, 256 NELT FB
> 14, 0, 2*4, 256 NELT FB
> 15, 0, 2*4, 256 NELT FB
> 16, 0, 2*4, 256 NELT FB
> 17, 0, 2*4, 256 NELT FB
> 18, 0, 2*4, 256 NELT FB
> 19, 0, 2*4, 256 NELT FB
> 20, 0, 2*4, 256 NELT FB
> 21, 0, 2*4, 256 NELT FB
> 22, 0, 2*4, 256 NELT FB
> 23, 0, 2*4, 256 NELT FB
> 24, 0, 2*4, 256 NELT FB
> 25, 0, 2*4, 256 NELT FB
> 26, 0, 2*4, 256 NELT FB
> 27, 0, 2*4, 256 NELT FB
> 28, 0, 2*4, 256 NELT FB
> 29, 0, 2*4, 256 NELT FB
> 30, 0, 2*4, 256 NELT FB
> 31, 0, 2*4, 256 NELT FB
>
> call exitt: dying ...
>
> backtrace(): obtained 1 stack frames.
> [0x662a40]
>
> total elapsed time : 7.82199E-02 sec
> total solver time incl. I/O : 0.00000E+00 sec
> time/timestep : 0.00000E+00 sec
> CPU seconds/timestep/gridpt : 0.00000E+00 sec
>
> 39, 0, 2*4, 256 NELT FAIL
> 38, 0, 2*4, 256 NELT FAIL
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> 36, 0, 2*4, 256 NELT FAIL
> 41, 0, 2*4, 256 NELT FAIL
> Check that .map file and .rea file agree
> 32, 0, 2*4, 256 NELT FAIL
> Check that .map file and .rea file agree
> 48, 0, 2*4, 256 NELT FAIL
> 34, 0, 2*4, 256 NELT FAIL
> 54, 196, 2*4, 256 NELT FAIL
> Check that .map file and .rea file agree
> 40, 0, 2*4, 256 NELT FAIL
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> 46, 0, 2*4, 256 NELT FAIL
> 47, 0, 2*4, 256 NELT FAIL
> 33, 0, 2*4, 256 NELT FAIL
> 37, 0, 2*4, 256 NELT FAIL
> 35, 0, 2*4, 256 NELT FAIL
> 43, 0, 2*4, 256 NELT FAIL
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> 45, 0, 2*4, 256 NELT FAIL
> 42, 0, 2*4, 256 NELT FAIL
> 44, 0, 2*4, 256 NELT FAIL
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> Check that .map file and .rea file agree
> 32, 0, 2*4, 256 NELT FB
> 33, 0, 2*4, 256 NELT FB
> 34, 0, 2*4, 256 NELT FB
> 35, 0, 2*4, 256 NELT FB
> 36, 0, 2*4, 256 NELT FB
> 37, 0, 2*4, 256 NELT FB
> 38, 0, 2*4, 256 NELT FB
> 39, 0, 2*4, 256 NELT FB
> 40, 0, 2*4, 256 NELT FB
> 41, 0, 2*4, 256 NELT FB
> 42, 0, 2*4, 256 NELT FB
> 43, 0, 2*4, 256 NELT FB
> 44, 0, 2*4, 256 NELT FB
> 45, 0, 2*4, 256 NELT FB
> 46, 0, 2*4, 256 NELT FB
> 47, 0, 2*4, 256 NELT FB
> 48, 0, 2*4, 256 NELT FB
> 49, 3*4, 256 NELT FB
> 50, 3*4, 256 NELT FB
> 51, 3*4, 256 NELT FB
> 52, 3*4, 256 NELT FB
> 53, 3*4, 256 NELT FB
> 54, 196, 2*4, 256 NELT FB
> 55, 3*4, 256 NELT FB
> 56, 3*4, 256 NELT FB
> 57, 3*4, 256 NELT FB
> 58, 3*4, 256 NELT FB
> 59, 3*4, 256 NELT FB
> 60, 3*4, 256 NELT FB
> 61, 3*4, 256 NELT FB
> 62, 3*4, 256 NELT FB
> 63, 3*4, 256 NELT FB
> Application 419539 resources: utime ~8s, stime ~13s
>
> Does the code map in parallel here? I'm not sure what is going on,
> because the .map and .rea files do in fact agree. The error seems to
> occur when processor #25 is mapped for a second time. The same thing
> happens with my own cases, which previously ran on a similar machine
> with Nek compiled with PGI.
>
> Any help would be greatly appreciated.
>
> Oliver
>
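
Change 1 in the quoted message can be pictured as a small shell case statement, in the style of the makenek.inc fragment it quotes. This is only a sketch: the function name ftn_precision_flags and its cray/pgi argument are hypothetical and do not appear in makenek.inc. The two flag strings are the ones given in the message.

```shell
# Hypothetical helper (not part of makenek.inc): choose the flags for
# double-precision reals and preprocessing when the compiler is invoked
# through the Cray "ftn" wrapper. The argument names the compiler that
# sits behind the wrapper.
ftn_precision_flags() {
  case "$1" in
    cray) echo "-s real64 -eZ -em" ;;  # crayftn flags from the message
    pgi)  echo "-r8 -Mpreprocess" ;;   # flags used for *ftn* before the change
    *)    echo "unsupported compiler: $1" >&2
          return 1 ;;
  esac
}

P=$(ftn_precision_flags cray)
echo "P=\"$P\""
```

Keeping both flag sets side by side makes it clear that the *ftn* pattern alone cannot tell which compiler the wrapper will call.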

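
Change 2, the rename of gsync to gsync_nek, amounts to a whole-word substitution over the source files. A minimal sketch, assuming GNU sed (for \b and the I flag); the function name rename_gsync is made up for illustration, and the commented loop assumes the sources are *.f files in the current directory.

```shell
# Hypothetical filter: rename the whole word "gsync" to "gsync_nek".
# \b leaves names such as "gsync_nek" or "mygsync" untouched, and the
# I flag also matches GSYNC, since Fortran is case-insensitive.
# Assumes GNU sed.
rename_gsync() {
  sed 's/\bgsync\b/gsync_nek/gI'
}

echo "      call gsync()" | rename_gsync

# To rewrite files in place (assumed layout: *.f in the current directory):
#   for f in *.f; do rename_gsync < "$f" > "$f.tmp" && mv "$f.tmp" "$f"; done
```

The substitution also touches comments and strings and lengthens lines, so fixed-form sources should be checked against the 72-column limit afterwards.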

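Entries such as 2*4 and 3*4 in the quoted log are Fortran list-directed output, which may write r consecutive equal values as r*c. Expanding them makes the lines easier to compare. A minimal awk sketch follows; the function name expand_repeats is hypothetical, and nothing is assumed about what the individual columns mean.

```shell
# Hypothetical filter: expand Fortran list-directed repeat counts,
# e.g. "3*4" becomes "4, 4, 4". Text without an r*c token is unchanged.
expand_repeats() {
  awk '{
    line = $0
    while (match(line, /[0-9]+\*[0-9]+/)) {
      tok = substr(line, RSTART, RLENGTH)
      star = index(tok, "*")
      count = substr(tok, 1, star - 1) + 0   # force a numeric repeat count
      value = substr(tok, star + 1)
      rep = value
      for (i = 2; i <= count; i++) rep = rep ", " value
      line = substr(line, 1, RSTART - 1) rep substr(line, RSTART + RLENGTH)
    }
    print line
  }'
}

printf '%s\n' '9, 3*4, 256 NELT FB' '8, 0, 2*4, 256 NELT FB' | expand_repeats
```

With the repeats expanded, the line beginning with 9 reads "9, 4, 4, 4, 256 NELT FB" and the line beginning with 8 reads "8, 0, 4, 4, 256 NELT FB", so the two differ only in the second value.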