[Nek5000-users] Running NEK on CRAY; compilation & running issues

nek5000-users at lists.mcs.anl.gov nek5000-users at lists.mcs.anl.gov
Mon Jan 30 13:43:50 CST 2012


Hi Oliver,

I have been using Cray XE6/XT6 systems successfully with the PGI compilers, specified in makenek with

# Fortran compiler
F77="ftn"

# C compiler
CC="cc"

and the submission script I use is below.

Best.
Aleks


rm -f *batch*
echo $1        >  SESSION.NAME
echo `pwd`'/' >>  SESSION.NAME
touch $1.rea
rm -f ioinfo
mv -f $1.log.$2 $1.log1.$2
mv -f $1.his $1.his1
mv -f $1.sch $1.sch1
rm -f logfile
echo '' > $1.log.$2
echo   "#!/bin/bash"                         >  $1.batch
echo   "#PBS -l mppwidth="$2                 >> $1.batch
echo   "#PBS -l walltime="$3":"$4":00"       >> $1.batch
echo   "#PBS -j oe"                          >> $1.batch
echo   cd `pwd`                              >> $1.batch
echo   aprun -n $2 ./nek5000 ">>" $1.log.$2  >> $1.batch
echo   "exit 0;"                             >> $1.batch
qsub -q batch $1.batch
sleep 3
ln $1.log.$2 logfile
##
## usage: neke case cores hours minutes
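
For anyone adapting the script above, the batch-file part can also be written as a function with a here-document instead of a chain of echo redirections. This is only a sketch under the same assumptions as the script (PBS with mppwidth, aprun, an executable called ./nek5000); the function name write_batch is hypothetical and not part of Nek5000.

```shell
#!/bin/bash
# Hypothetical helper: print a PBS batch script for a Nek5000 run to stdout.
# Arguments: case name, core count, walltime hours, walltime minutes.
write_batch() {
  local case_name=$1 cores=$2 hours=$3 minutes=$4
  # $(pwd) is expanded when the batch file is generated, as in the
  # original script, so the job changes into the submission directory.
  cat <<EOF
#!/bin/bash
#PBS -l mppwidth=${cores}
#PBS -l walltime=${hours}:${minutes}:00
#PBS -j oe
cd $(pwd)
aprun -n ${cores} ./nek5000 >> ${case_name}.log.${cores}
exit 0;
EOF
}
```

It would be used as "write_batch eddy_uv 64 1 30 > eddy_uv.batch", followed by the same qsub call as in the script.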






----- Original Message -----
From: nek5000-users at lists.mcs.anl.gov
To: nek5000-users at lists.mcs.anl.gov
Sent: Monday, January 30, 2012 12:02:23 PM
Subject: Re: [Nek5000-users] Running NEK on CRAY; compilation & running issues

Hi Oliver,

Can you try the PGI compiler instead of the Cray one? I just want to
check whether this is a compiler-specific issue.

Cheers,
Stefan

On 1/30/12, nek5000-users at lists.mcs.anl.gov
<nek5000-users at lists.mcs.anl.gov> wrote:
>
> Oliver,
>
> Thanks for your comments.  We'll certainly take care of the
> multd() and gsync() issues.
>
> We have what I believe is an xe6 on site and several users
> are using it regularly.   They may have some comments about
> the flags, etc. and the issues that you are running into.
> Hopefully, someone will get back to you early today.
>
> Regards,
>
> Paul
>
>
> On Mon, 30 Jan 2012, nek5000-users at lists.mcs.anl.gov wrote:
>
>> Dear NEKs,
>>
>> I'm currently trying to compile NEK with crayftn to run on an XE6, and I'm
>> running into some issues. I solved some compilation problems (with the kind
>> help of Cray personnel). I'll mention them here, as I assume they may be
>> part of the problem.
>>
>> 1) in makenek.inc, changed
>> *ftn*) P="-r8 -Mpreprocess"
>> to *ftn*) P="-s real64 -eZ -em"
>> to get double-precision reals and to invoke the preprocessor
>> (Cray uses the ftn command as a wrapper)
>>
>> 2) changed subroutine name “gsync” to “gsync_nek” in all source files, as
>> crayftn has a built-in routine of the same name that conflicts with
>> Nek's gsync
>>
>> 3) changed calls to subroutine “multd” from
>> CALL MULTD (TA1,TVX,RXM2,SXM2,TXM2,1)
>> to CALL MULTD (TA1,TVX,RXM2,SXM2,TXM2,1,0)
>> in “navier1.f”, as crayftn complains about a missing seventh argument
>> “iflg” (the calls come from subroutine “tmultd”, which does not seem to be
>> called in the code, though)
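
Change 1) could also be made conditional instead of overwriting the PGI flags, so that one makenek.inc serves both programming environments. A minimal sketch, assuming the Cray programming-environment modules export a variable PE_ENV with values such as CRAY or PGI; that variable and the function name ftn_real8_flags are assumptions, not something stated in this thread. The flag strings are the ones quoted in 1).

```shell
# Hypothetical helper: choose the double-precision/preprocessor flags for
# the Cray "ftn" wrapper based on the active programming environment.
# Argument: environment name (assumed to come from $PE_ENV on Cray systems).
ftn_real8_flags() {
  case "$1" in
    CRAY) echo "-s real64 -eZ -em" ;;   # crayftn flags from change 1)
    *)    echo "-r8 -Mpreprocess"  ;;   # original makenek.inc setting (PGI)
  esac
}

# Fall back to the PGI flags when PE_ENV is not set.
P=$(ftn_real8_flags "${PE_ENV:-PGI}")
```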
>>
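
A rename like the one in 2) touches many files, so it is usually scripted. A sketch of such a filter, assuming GNU sed (for the \b word-boundary extension); the function name rename_gsync is hypothetical. The word boundaries leave identifiers that merely contain "gsync" untouched, and both spellings are handled because Fortran is case-insensitive. Comments and strings that mention gsync are renamed as well.

```shell
# Hypothetical filter: rename the identifier gsync to gsync_nek on stdin.
# Assumes GNU sed; \b matches a word boundary, so names such as "mygsync"
# or an already renamed "gsync_nek" are left alone.
rename_gsync() {
  sed -e 's/\bgsync\b/gsync_nek/g' -e 's/\bGSYNC\b/GSYNC_NEK/g'
}
```

Applied per file, for example "rename_gsync < navier1.f > navier1.f.new", it leaves the original in place so the result can be diffed before it is swapped in.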
>> In the SIZE file I'm setting lelt=300, lelv=lelt, lp = 64, lelg = 300.
>> With these changes I'm running the eddy_uv example via “aprun -n 64 -N
>> 32 ./nek5000” on 64 processors and I'm getting the following output:
>>
>> /----------------------------------------------------------\\
>> | _ __ ______ __ __ ______ ____ ____ ____ |
>> | / | / // ____// //_/ / ____/ / __ \\ / __ \\ / __ \\ |
>> | / |/ // __/ / ,< /___ \\ / / / // / / // / / / |
>> | / /| // /___ / /| | ____/ / / /_/ // /_/ // /_/ / |
>> | /_/ |_//_____//_/ |_|/_____/ \\____/ \\____/ \\____/ |
>> | |
>> |----------------------------------------------------------|
>> | |
>> | NEK5000: Open Source Spectral Element Solver |
>> | COPYRIGHT (c) 2008-2010 UCHICAGO ARGONNE, LLC |
>> | Version: 1.0rc1 / SVN r730 |
>> | Web: http://nek5000.mcs.anl.gov |
>> | |
>> \\----------------------------------------------------------/
>>
>>
>> Number of processors: 64
>> REAL wdsize : 8
>> INTEGER wdsize : 4
>>
>>
>> Beginning session:
>> /zhome/academic/HLRS/iag/iagoschm/run_nek/eddy_example/eddy_uv.rea
>>
>>
>> timer accuracy: 2.8610229E-07 sec
>>
>> read .rea file
>> nelgt/nelgv/lelt: 256 256 300
>> lx1 /lx2 /lx3 : 8 6 8
>>
>> mapping elements to processors
>> 0, 2*4, 2*256 NELV
>> 1, 2*4, 2*256 NELV
>> 8, 2*4, 2*256 NELV
>> 9, 2*4, 2*256 NELV
>> 17, 2*4, 2*256 NELV
>> 16, 2*4, 2*256 NELV
>> 20, 2*4, 2*256 NELV
>> 3*4, 2*256 NELV
>> 29, 2*4, 2*256 NELV
>> 21, 2*4, 2*256 NELV
>> 28, 2*4, 2*256 NELV
>> 7, 2*4, 2*256 NELV
>> 31, 2*4, 2*256 NELV
>> 25, 2*4, 2*256 NELV
>> 11, 2*4, 2*256 NELV
>> 12, 2*4, 2*256 NELV
>> 6, 2*4, 2*256 NELV
>> 30, 2*4, 2*256 NELV
>> 18, 2*4, 2*256 NELV
>> 19, 2*4, 2*256 NELV
>> 2, 2*4, 2*256 NELV
>> 5, 2*4, 2*256 NELV
>> 23, 2*4, 2*256 NELV
>> 27, 2*4, 2*256 NELV
>> 10, 2*4, 2*256 NELV
>> 22, 2*4, 2*256 NELV
>> 13, 2*4, 2*256 NELV
>> 14, 2*4, 2*256 NELV
>> 15, 2*4, 2*256 NELV
>> 3, 2*4, 2*256 NELV
>> 26, 2*4, 2*256 NELV
>> 24, 2*4, 2*256 NELV
>> 33, 2*4, 2*256 NELV
>> 41, 2*4, 2*256 NELV
>> 32, 2*4, 2*256 NELV
>> 40, 2*4, 2*256 NELV
>> 45, 2*4, 2*256 NELV
>> 44, 2*4, 2*256 NELV
>> 49, 2*4, 2*256 NELV
>> 34, 2*4, 2*256 NELV
>> 35, 2*4, 2*256 NELV
>> 48, 2*4, 2*256 NELV
>> 39, 2*4, 2*256 NELV
>> 38, 2*4, 2*256 NELV
>> 50, 2*4, 2*256 NELV
>> 51, 2*4, 2*256 NELV
>> 37, 2*4, 2*256 NELV
>> 36, 2*4, 2*256 NELV
>> 55, 2*4, 2*256 NELV
>> 54, 2*4, 2*256 NELV
>> 57, 2*4, 2*256 NELV
>> 58, 2*4, 2*256 NELV
>> 43, 2*4, 2*256 NELV
>> 42, 2*4, 2*256 NELV
>> 47, 2*4, 2*256 NELV
>> 53, 2*4, 2*256 NELV
>> 59, 2*4, 2*256 NELV
>> 46, 2*4, 2*256 NELV
>> 60, 2*4, 2*256 NELV
>> 61, 2*4, 2*256 NELV
>> 63, 2*4, 2*256 NELV
>> 62, 2*4, 2*256 NELV
>> 52, 2*4, 2*256 NELV
>> 56, 2*4, 2*256 NELV
>> 25, 0, 2*4, 256 NELT FAIL
>> 24, 0, 2*4, 256 NELT FAIL
>> 21, 0, 2*4, 256 NELT FAIL
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> 20, 0, 2*4, 256 NELT FAIL
>> Check that .map file and .rea file agree
>> 31, 0, 2*4, 256 NELT FAIL
>> 8, 0, 2*4, 256 NELT FAIL
>> Check that .map file and .rea file agree
>> 17, 0, 2*4, 256 NELT FAIL
>> 14, 0, 2*4, 256 NELT FAIL
>> Check that .map file and .rea file agree
>> 15, 0, 2*4, 256 NELT FAIL
>> 16, 0, 2*4, 256 NELT FAIL
>> 26, 0, 2*4, 256 NELT FAIL
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> 10, 0, 2*4, 256 NELT FAIL
>> 2, 0, 2*4, 256 NELT FAIL
>> 30, 0, 2*4, 256 NELT FAIL
>> 11, 0, 2*4, 256 NELT FAIL
>> 7, 0, 2*4, 256 NELT FAIL
>> 1, 0, 2*4, 256 NELT FAIL
>> Check that .map file and .rea file agree
>> 2*0, 2*4, 256 NELT FAIL
>> 6, 0, 2*4, 256 NELT FAIL
>> 4, 0, 2*4, 256 NELT FAIL
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> 13, 0, 2*4, 256 NELT FAIL
>> 12, 0, 2*4, 256 NELT FAIL
>> 28, 0, 2*4, 256 NELT FAIL
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> 18, 0, 2*4, 256 NELT FAIL
>> 29, 0, 2*4, 256 NELT FAIL
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> 23, 0, 2*4, 256 NELT FAIL
>> Check that .map file and .rea file agree
>> 27, 0, 2*4, 256 NELT FAIL
>> 3, 0, 2*4, 256 NELT FAIL
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> 19, 0, 2*4, 256 NELT FAIL
>> Check that .map file and .rea file agree
>> 5, 0, 2*4, 256 NELT FAIL
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> 22, 0, 2*4, 256 NELT FAIL
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> 2*0, 2*4, 256 NELT FB
>> 1, 0, 2*4, 256 NELT FB
>> 2, 0, 2*4, 256 NELT FB
>> 3, 0, 2*4, 256 NELT FB
>> 4, 0, 2*4, 256 NELT FB
>> 5, 0, 2*4, 256 NELT FB
>> 6, 0, 2*4, 256 NELT FB
>> 7, 0, 2*4, 256 NELT FB
>> 8, 0, 2*4, 256 NELT FB
>> 9, 3*4, 256 NELT FB
>> 10, 0, 2*4, 256 NELT FB
>> 11, 0, 2*4, 256 NELT FB
>> 12, 0, 2*4, 256 NELT FB
>> 13, 0, 2*4, 256 NELT FB
>> 14, 0, 2*4, 256 NELT FB
>> 15, 0, 2*4, 256 NELT FB
>> 16, 0, 2*4, 256 NELT FB
>> 17, 0, 2*4, 256 NELT FB
>> 18, 0, 2*4, 256 NELT FB
>> 19, 0, 2*4, 256 NELT FB
>> 20, 0, 2*4, 256 NELT FB
>> 21, 0, 2*4, 256 NELT FB
>> 22, 0, 2*4, 256 NELT FB
>> 23, 0, 2*4, 256 NELT FB
>> 24, 0, 2*4, 256 NELT FB
>> 25, 0, 2*4, 256 NELT FB
>> 26, 0, 2*4, 256 NELT FB
>> 27, 0, 2*4, 256 NELT FB
>> 28, 0, 2*4, 256 NELT FB
>> 29, 0, 2*4, 256 NELT FB
>> 30, 0, 2*4, 256 NELT FB
>> 31, 0, 2*4, 256 NELT FB
>>
>> call exitt: dying ...
>>
>> backtrace(): obtained 1 stack frames.
>> [0x662a40]
>>
>> total elapsed time : 7.82199E-02 sec
>> total solver time incl. I/O : 0.00000E+00 sec
>> time/timestep : 0.00000E+00 sec
>> CPU seconds/timestep/gridpt : 0.00000E+00 sec
>>
>> 39, 0, 2*4, 256 NELT FAIL
>> 38, 0, 2*4, 256 NELT FAIL
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> 36, 0, 2*4, 256 NELT FAIL
>> 41, 0, 2*4, 256 NELT FAIL
>> Check that .map file and .rea file agree
>> 32, 0, 2*4, 256 NELT FAIL
>> Check that .map file and .rea file agree
>> 48, 0, 2*4, 256 NELT FAIL
>> 34, 0, 2*4, 256 NELT FAIL
>> 54, 196, 2*4, 256 NELT FAIL
>> Check that .map file and .rea file agree
>> 40, 0, 2*4, 256 NELT FAIL
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> 46, 0, 2*4, 256 NELT FAIL
>> 47, 0, 2*4, 256 NELT FAIL
>> 33, 0, 2*4, 256 NELT FAIL
>> 37, 0, 2*4, 256 NELT FAIL
>> 35, 0, 2*4, 256 NELT FAIL
>> 43, 0, 2*4, 256 NELT FAIL
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> 45, 0, 2*4, 256 NELT FAIL
>> 42, 0, 2*4, 256 NELT FAIL
>> 44, 0, 2*4, 256 NELT FAIL
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> Check that .map file and .rea file agree
>> 32, 0, 2*4, 256 NELT FB
>> 33, 0, 2*4, 256 NELT FB
>> 34, 0, 2*4, 256 NELT FB
>> 35, 0, 2*4, 256 NELT FB
>> 36, 0, 2*4, 256 NELT FB
>> 37, 0, 2*4, 256 NELT FB
>> 38, 0, 2*4, 256 NELT FB
>> 39, 0, 2*4, 256 NELT FB
>> 40, 0, 2*4, 256 NELT FB
>> 41, 0, 2*4, 256 NELT FB
>> 42, 0, 2*4, 256 NELT FB
>> 43, 0, 2*4, 256 NELT FB
>> 44, 0, 2*4, 256 NELT FB
>> 45, 0, 2*4, 256 NELT FB
>> 46, 0, 2*4, 256 NELT FB
>> 47, 0, 2*4, 256 NELT FB
>> 48, 0, 2*4, 256 NELT FB
>> 49, 3*4, 256 NELT FB
>> 50, 3*4, 256 NELT FB
>> 51, 3*4, 256 NELT FB
>> 52, 3*4, 256 NELT FB
>> 53, 3*4, 256 NELT FB
>> 54, 196, 2*4, 256 NELT FB
>> 55, 3*4, 256 NELT FB
>> 56, 3*4, 256 NELT FB
>> 57, 3*4, 256 NELT FB
>> 58, 3*4, 256 NELT FB
>> 59, 3*4, 256 NELT FB
>> 60, 3*4, 256 NELT FB
>> 61, 3*4, 256 NELT FB
>> 62, 3*4, 256 NELT FB
>> 63, 3*4, 256 NELT FB
>> Application 419539 resources: utime ~8s, stime ~13s
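
With 64 ranks writing at once, output like the above is hard to read. A small sketch for summarising such a log, assuming the message text shown above ("NELT FAIL" for the per-rank check); the function name count_nelt_fail is hypothetical.

```shell
# Hypothetical helper: count the lines reporting a failed per-rank element
# check ("NELT FAIL") in a Nek5000 log read from stdin.
count_nelt_fail() {
  grep -c 'NELT FAIL'
}
```

For example, "count_nelt_fail < eddy_uv.log.64" prints the number of such lines in that log.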
>>
>> Does the code map in parallel here? I'm not sure what's happening,
>> because the .map and .rea files do in fact agree. The error seems to
>> occur when processor #25 is mapped for a second time. The same thing
>> happens with my own cases, which previously ran on a similar machine
>> with Nek compiled with PGI.
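
As a quick sanity check on the SIZE settings quoted above: 256 elements on 64 ranks is 4 elements per rank, far below lelt=300, and 4 is also the value that appears in the NELV lines (this is a reading of the log, not something confirmed in the thread). The arithmetic as a sketch, with a hypothetical function name elems_per_rank:

```shell
# Hypothetical helper: elements per rank, rounded up (ceiling division).
# Arguments: total element count (nelgt), number of ranks.
elems_per_rank() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# Values quoted above: 256 elements, 64 ranks, lelt=300.
if [ "$(elems_per_rank 256 64)" -le 300 ]; then
  echo "fits within lelt"
fi
```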
>>
>> Any help would be greatly appreciated.
>>
>> Oliver
>>
>> _______________________________________________
>> Nek5000-users mailing list
>> Nek5000-users at lists.mcs.anl.gov
>> https://lists.mcs.anl.gov/mailman/listinfo/nek5000-users
>>