[Nek5000-users] Kernel Panic(s)

nek5000-users at lists.mcs.anl.gov nek5000-users at lists.mcs.anl.gov
Fri Jun 3 06:54:54 CDT 2011


Hi again Nekies,

Aleks, I am quite sure Stefan is right. Nek used to run fine; it's
just that the larger the cases get, the more frequently the panics
occur. So my bet is that something is wrong with the InfiniBand kernel
module or elsewhere in the interconnect. Stefan, my first
approach was to make net.core.optmem_max three times larger.
Anyhow, I'll trace it and get back to you. Thank you all for your
interest.
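
For reference, a minimal sketch of tripling net.core.optmem_max, assuming a Linux host that exposes the setting under /proc/sys. The helper name triple and the fallback value 20480 are illustrative assumptions, not values taken from this cluster. The script only prints the command instead of applying it, since changing the value needs root:

```shell
#!/bin/sh
# Sketch: compute a tripled net.core.optmem_max and print the command to apply it.
# Nothing is modified here; applying the new value requires root.

# triple VALUE -> prints VALUE * 3 (hypothetical helper for illustration)
triple() {
    echo $(( $1 * 3 ))
}

# Read the current value from procfs; fall back to 20480
# (an assumed placeholder) if the file is not readable.
current=$(cat /proc/sys/net/core/optmem_max 2>/dev/null || echo 20480)
new=$(triple "$current")

echo "current net.core.optmem_max = $current"
echo "to apply: sysctl -w net.core.optmem_max=$new"
```

A value set with sysctl -w does not survive a reboot, so it is easy to revert if it turns out not to help.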

~nab

Quoting nek5000-users at lists.mcs.anl.gov:

> Hi Nab,
>
> Please post the kernel panic/oops call trace. I guess this is not a
> Nek related problem. Something is wrong with your system.
>
> -Stefan
>
> On 6/2/11, nek5000-users at lists.mcs.anl.gov
> <nek5000-users at lists.mcs.anl.gov> wrote:
>> Hi Nab,
>>
>> I have been running Nek5000 on a local Linux cluster with the kernel
>> 2.6.17.14, compiler pgf77 and following submission script below.
>>
>> Have you managed to compile and run the code sequentially (IFMPI="false"
>> in makenek) for one of the examples (nek5_svn/examples) with the script
>> nek or nekb in nek5_svn/trunk/tools/scripts?
>> Then compiled w/ MPI and submitted on one processor with the script below?
>>
>> Best,
>> Aleks
>>
>>
>> ###
>> set np=32                # procs
>> set m=dnodes/16nod_32cpu # machine node file
>> set d=. ##$home          # run directory
>> set e=$d/nek5000         # executable
>> #
>> echo $1        >  SESSION.NAME
>> echo `pwd`'/' >>  SESSION.NAME
>> touch $1.rea
>> rm -f ioinfo
>> mv -f $1.his $1.his1
>> mv -f $1.sch $1.sch1
>> #
>> rm -f             $1.log2.$np
>> cp -f $1.log1.$np $1.log2.$np
>> rm -f             $1.log1.$np
>> cp -f $1.log.$np  $1.log1.$np
>> nohup mpirun -np $np -machinefile $m -s $e > $1.log.$np &
>> sleep 2
>> rm -f logfile
>> ln $1.log.$np logfile
>>
>> On Thu, 2 Jun 2011, nek5000-users at lists.mcs.anl.gov wrote:
>>
>>> Paul and Stefan, thanks for your fast replies. Here's the info you asked
>>> for :)
>>>
>>> release:
>>> Red Hat Enterprise Linux AS release 4 (Nahant Update 4)
>>>
>>> kernel:
>>> 2.6.9-42.9hp
>>>
>>> compiler:
>>> pgf90 7.1-1 64-bit target on x86-64 Linux -tp k8-64e
>>>
>>>
>>> launch command:
>>> mpirun -machinefile machines -np $2 ./nek5000 >logfile &
>>>
>>> Best,
>>> nab
>>>
>>> _______________________________________________
>>> Nek5000-users mailing list
>>> Nek5000-users at lists.mcs.anl.gov
>>> https://lists.mcs.anl.gov/mailman/listinfo/nek5000-users