[mpich-discuss] 32bit and 64bit system issues

Samir Khanal skhanal at bgsu.edu
Tue Mar 3 17:23:48 CST 2009


And...
When it hangs for a while, I do qdel xxx

and it prints:

compute-0-5.local
Directory is /home/skhanal
This job is running on following Processors
/opt/openmpi/bin/mpiexec
mpiexec: resolve_exe: using absolute path "./lRing".
node  0: name compute-0-5, cpu avail 4
node  1: name compute-0-4, cpu avail 4
node  2: name compute-0-3, cpu avail 4
node  3: name compute-0-2, cpu avail 4
node  4: name compute-0-1, cpu avail 4
node  5: name compute-0-0, cpu avail 4
mpiexec: process_start_event: evt 2 task 0 on compute-0-5.
mpiexec: read_p4_master_port: waiting for port from master.
mpiexec: read_p4_master_port: got port 57476.
mpiexec: process_start_event: evt 4 task 1 on compute-0-4.
mpiexec: process_start_event: evt 5 task 2 on compute-0-3.
mpiexec: process_start_event: evt 6 task 3 on compute-0-2.
mpiexec: All 4 tasks (spawn 0) started.
mpiexec: wait_tasks: waiting for compute-0-5 compute-0-4 and 2 others.
mpiexec: killall: caught signal 15 (Terminated).
mpiexec: kill_tasks: killing all tasks.
mpiexec: wait_tasks: waiting for compute-0-5 compute-0-4 and 2 others.
p0_14302:  p4_error: interrupt SIGx: 15
bm_list_14303:  p4_error: interrupt SIGx: 15
p0_14302: (1212.847656) net_send: could not write to fd=4, errno = 32
mpiexec: killall: caught signal 15 (Terminated).


Is this signal 15 caught before or after the qdel?
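
For reference, the two numbers in the output above can be decoded with a short Python sketch. It is not part of the original job, and it assumes the log came from a Linux node (as on a Rocks cluster), where these numeric values apply.

```python
import errno
import signal

# Number taken from "killall: caught signal 15" and "interrupt SIGx: 15".
sig = signal.Signals(15)

# Number taken from "net_send: could not write to fd=4, errno = 32".
err_name = errno.errorcode[32]

print(f"signal 15 is {sig.name}; errno 32 is {err_name}")
```

SIGTERM is the ordinary termination request, and qdel typically has the batch system deliver it to the job, so the "caught signal 15" lines are consistent with being a result of the qdel rather than something that happened before it. The EPIPE from net_send would then be a write to a peer that had already gone away. None of this says why the job hung in the first place.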

Any hints?
Samir




________________________________________
From: mpich-discuss-bounces at mcs.anl.gov [mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Samir Khanal [skhanal at bgsu.edu]
Sent: Tuesday, March 03, 2009 6:14 PM
To: mpich-discuss at mcs.anl.gov
Subject: [mpich-discuss] 32bit and 64bit system issues

Hi all

I have a code that has some problems when run on a 64-bit system.

It runs very well on a 32-bit system (single core) with mpich-1.2.7 / PBS,

but the same code, compiled for and run with mpich-1.2.7 / PBS on a 64-bit system, just hangs (it goes on for a while). This one is a quad-core system.

#PBS -n nodes=4:ppn=4
I ran
mpiexec -n 4 -pernode ./a.out
to make it look like a single-core system (correct me if I am wrong...).
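
The placement intended by -n 4 -pernode (one process on each node) can be sketched as follows. This is only an illustration, not how mpiexec is implemented: it assumes a PBS-style node list in which each host appears once per allocated core, the host names are borrowed from the log output, and the helper name one_per_node is hypothetical.

```python
def one_per_node(node_file_lines):
    """Return each host once, in first-seen order."""
    seen = set()
    hosts = []
    for line in node_file_lines:
        host = line.strip()
        if host and host not in seen:
            seen.add(host)
            hosts.append(host)
    return hosts


# Hypothetical node list for 4 nodes with 4 cores each: every host repeated 4 times.
node_names = ("compute-0-5", "compute-0-4", "compute-0-3", "compute-0-2")
node_file = [name for name in node_names for _ in range(4)]

# -n 4 asks for four processes; one per node means four distinct hosts.
placement = one_per_node(node_file)[:4]
print(placement)
```

This matches what the log reports: tasks 0 to 3 started on four different nodes, one each.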

Both the 32-bit and 64-bit systems are Rocks 5.1 clusters.

Is there any drastic change in moving from a 32-bit to a 64-bit system, and from a single-core to a quad-core system?
Is there anything to be careful about?
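
One difference commonly worth checking when moving C code between 32-bit and 64-bit Linux is the size of long and of pointers: int usually stays at 4 bytes, while long and pointers are 8 bytes on 64-bit Linux. The sketch below only reports the sizes on the machine it runs on. Whether this has anything to do with the hang is an assumption, not something the output shows.

```python
import ctypes
import struct

# Pointer width of the running interpreter, in bits (32 or 64 on common systems).
pointer_bits = struct.calcsize("P") * 8

# Sizes, in bytes, of the C types as the platform's C compiler defines them.
sizes = {
    "int": ctypes.sizeof(ctypes.c_int),
    "long": ctypes.sizeof(ctypes.c_long),
    "void*": ctypes.sizeof(ctypes.c_void_p),
}

print(pointer_bits, sizes)
```

Running it on a node of each cluster shows whether long and pointer sizes differ between the two. If they do, places where a long buffer is sent as MPI_INT, or a pointer is stored in an int, are worth a look.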

Thanks
Samir



