[mpich-discuss] Using g95 to compile mpich2 problem

Jayesh Krishna jayesh at mcs.anl.gov
Mon Aug 13 10:14:15 CDT 2012


Hi,
 Did you build MPICH2 (make clean; configure/make/make install) after updating the Cygwin version?

Regards,
Jayesh
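
For reference, the clean rebuild asked about here might look like the sketch below. It reuses the configure options and source directory quoted later in this thread (F77=g95 FC=g95, /cygdrive/c/cygwin/mpich2-1.4). The wrapper name rebuild_mpich2 and the DRY_RUN switch are assumptions made for illustration and are not part of MPICH2.

```shell
# Hypothetical wrapper (not part of MPICH2) around the rebuild steps named in
# this thread. With DRY_RUN=1, the default, it only prints the commands.
rebuild_mpich2() {
    src_dir=$1
    run_step() {
        if [ "${DRY_RUN:-1}" = 1 ]; then
            printf '%s\n' "$*"      # show the command instead of running it
        else
            "$@"                    # run the command as given
        fi
    }
    run_step cd "$src_dir" &&
    run_step make clean &&
    run_step ./configure F77=g95 FC=g95 &&
    run_step make &&
    run_step make install
}

# Print the steps for the source tree used in this thread.
rebuild_mpich2 /cygdrive/c/cygwin/mpich2-1.4
```

Setting DRY_RUN=0 before the call would execute the steps instead of printing them.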

----- Original Message -----
From: "謝奇哲" <991546107 at stu.nkmu.edu.tw>
To: "Jayesh Krishna" <jayesh at mcs.anl.gov>
Sent: Sunday, August 5, 2012 4:17:01 AM
Subject: Re: Using g95 to compile mpich2 problem

Hi, 
Thank you, Jayesh. 
I checked my Cygwin information; it is CYGWIN_NT-5.1 980622-05 1.7.7(0.230/5/3) 2010-08-31 09:58 i686 Cygwin.

So I installed the latest version on another computer.
The Cygwin information there is CYGWIN_NT-5.1 PC03 1.7.16(0.262/5/3) 2012-07-20 22:55 i686 Cygwin.
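
The two version strings above differ in the release field (1.7.7 versus 1.7.16). A minimal sketch of how that field could be extracted and compared numerically follows; the helper names cygwin_version and version_at_least are hypothetical, and using 1.7.16 as the reference is only an illustration based on the machine where cpi ran, not a documented MPICH2 requirement.

```shell
# Hypothetical helpers; neither is part of Cygwin or MPICH2.

# Print the dotted release number (e.g. 1.7.16) from a `uname -a` style line.
cygwin_version() {
    # The release field looks like 1.7.16(0.262/5/3); keep the part before "(".
    printf '%s\n' "$1" | sed -n 's/.* \([0-9][0-9.]*\)(.*/\1/p'
}

# Succeed when dotted version $1 is greater than or equal to $2.
# Fields are compared as numbers, so 1.7.7 sorts before 1.7.16.
version_at_least() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        len = (n > m) ? n : m
        for (i = 1; i <= len; i++) {
            xi = (i <= n) ? x[i] + 0 : 0
            yi = (i <= m) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

old=$(cygwin_version 'CYGWIN_NT-5.1 980622-05 1.7.7(0.230/5/3) 2010-08-31 09:58 i686 Cygwin')
if version_at_least "$old" 1.7.16; then
    echo "Cygwin $old is at least 1.7.16"
else
    echo "Cygwin $old is older than 1.7.16"
fi
```

On a live system the input line could come from uname -a instead of a pasted string.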

Now I can run cpi with mpiexec, including with -n <number> equal to or greater than 4.

daizy@PC03 /cygdrive/c/cygwin/mpich2-1.4
$ mpiexec -n 4 ./examples/cpi 
Process 0 of 4 is on PC03 
pi is approximately 3.1415926544231239, Error is 0.0000000008333307 
wall clock time = 0.000000 
Process 1 of 4 is on PC03 
Process 2 of 4 is on PC03 
Process 3 of 4 is on PC03 

daizy@PC03 /cygdrive/c/cygwin/mpich2-1.4
$ mpiexec -n 5 ./examples/cpi 
Process 0 of 5 is on PC03 
pi is approximately 3.1415926544231230, Error is 0.0000000008333298 
wall clock time = 0.000000 
Process 1 of 5 is on PC03 
Process 2 of 5 is on PC03 
Process 3 of 5 is on PC03 
Process 4 of 5 is on PC03 

But when I run make testing in the mpich2-1.4/test directory, I still get error messages (details are saved in testing.txt).
Processing directory coll 
Looking in ./coll/testlist 
Unexpected output in allred: Fatal error in MPI_Init_thread: Other MPI error, error stack: 
Unexpected output in allred: MPIR_Init_thread(388).................: 
Unexpected output in allred: MPID_Init(139)........................: channel initialization failed 
Unexpected output in allred: MPIDI_CH3_Init(38)....................: 
Unexpected output in allred: MPID_nem_init(196)....................: 
Unexpected output in allred: MPIDI_CH3I_Seg_commit(366)............: 
Unexpected output in allred: MPIU_SHMW_Hnd_deserialize(324)........: 
Unexpected output in allred: MPIU_SHMW_Seg_open(863)...............: 
Unexpected output in allred: MPIU_SHMW_Seg_create_attach_templ(637): open failed - Device or resource busy 
Program allred exited without No Errors 

Processing directory info 
Looking in ./info/testlist 
Processing directory init 
Looking in ./init/testlist 
Processing directory pt2pt 
Looking in ./pt2pt/testlist 
Unexpected output in sendflood: Fatal error in MPI_Init: Other MPI error, error stack: 
Unexpected output in sendflood: MPIR_Init_thread(388).................: 
Unexpected output in sendflood: MPID_Init(139)........................: channel initialization failed 
Unexpected output in sendflood: MPIDI_CH3_Init(38)....................: 
Unexpected output in sendflood: MPID_nem_init(196)....................: 
Unexpected output in sendflood: MPIDI_CH3I_Seg_commit(369)............: 
Unexpected output in sendflood: MPIU_SHMW_Seg_attach(925).............: 
Unexpected output in sendflood: MPIU_SHMW_Seg_create_attach_templ(637): open failed - Permission denied 
Unexpected output in sendflood: [proxy:0:0@PC03] HYDU_sock_read (./utils/sock/sock.c:272): read error (Software caused connection abort)
Unexpected output in sendflood: [proxy:0:0@PC03] pmi_cb (./pm/pmiserv/pmip_cb.c:228): unable to read PMI command
Unexpected output in sendflood: [proxy:0:0@PC03] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
Unexpected output in sendflood: [proxy:0:0@PC03] main (./pm/pmiserv/pmip.c:226): demux engine error waiting for event

Are these error messages related to running on a single-node system?
Can you give me some advice on how to solve the problem?

Best regards, 
Chi-Che Hsieh 
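
The make testing output in this message repeats one long error stack per test, which makes it hard to see which tests failed and why. As a rough sketch, assuming only the "Unexpected output in <test>: ... open failed - <reason>" line shape seen above, the saved log could be condensed to one line per test and reason. The helper name summarize_failures is hypothetical and is not part of the MPICH2 test suite.

```shell
# Hypothetical helper (not part of the MPICH2 test suite): print one
# "<test>: <reason>" line for each distinct shared-memory open failure.
summarize_failures() {
    awk '
        /^Unexpected output in [^ ]+: .*open failed - / {
            name = $4
            sub(/:$/, "", name)                  # "allred:" -> "allred"
            reason = $0
            sub(/.*open failed - /, "", reason)  # keep the text after the dash
            sub(/[ \t\r]+$/, "", reason)         # drop trailing blanks and CR
            key = name ": " reason
            if (!(key in seen)) { seen[key] = 1; print key }
        }
    ' "$@"
}
```

It reads the files named on its command line, or standard input when none are given, for example: summarize_failures testing.txt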

2012/8/3 Jayesh Krishna < jayesh at mcs.anl.gov > 


Hi, 
This looks to me more like a Cygwin issue. What version of Cygwin do you have on your system? Have you installed the latest version?


Regards, 
Jayesh 

----- Original Message ----- 
From: "謝奇哲" < 991546107 at stu.nkmu.edu.tw > 
To: "Jayesh Krishna" < jayesh at mcs.anl.gov >
Sent: Friday, August 3, 2012 5:48:44 AM
Subject: Re: Using g95 to compile mpich2 problem 

Hi, 
Thanks a lot, Jayesh. 
I can run /examples/cpi.exe using mpiexec.
But when I use mpiexec -n <number> ./examples/cpi and n is equal to or greater than 4, it produces error messages.

User@980622-05 /cygdrive/c/cygwin/mpich2-1.4
$ mpiexec -n 1 ./examples/cpi 
Process 0 of 1 is on 980622-05 
pi is approximately 3.1415926544231341, Error is 0.0000000008333410 
wall clock time = 0.000000 

User@980622-05 /cygdrive/c/cygwin/mpich2-1.4
$ mpiexec -n 2 ./examples/cpi 
Process 0 of 2 is on 980622-05 
pi is approximately 3.1415926544231318, Error is 0.0000000008333387 
wall clock time = 0.000000 
Process 1 of 2 is on 980622-05 

User@980622-05 /cygdrive/c/cygwin/mpich2-1.4
$ mpiexec -n 4 ./examples/cpi 
Fatal error in MPI_Init: Other MPI error, error stack: 
MPIR_Init_thread(388).................: 
MPID_Init(139)........................: channel initialization failed 
MPIDI_CH3_Init(38)....................: 
MPID_nem_init(196)....................: 
MPIDI_CH3I_Seg_commit(366)............: 
MPIU_SHMW_Hnd_deserialize(324)........: 
MPIU_SHMW_Seg_open(863)...............: 
MPIU_SHMW_Seg_create_attach_templ(637): open failed - Device or resource busy 

Is there an installation error somewhere?
Can you give me some advice on how to solve the problem?

Best regards, 
Chi-Che Hsieh 


2012/8/2 Jayesh Krishna < jayesh at mcs.anl.gov > 


Hi, 
Can you run any MPI programs (for example, /examples/cpi.c)?


Regards, 
Jayesh 

----- Original Message ----- 
From: "謝奇哲" < 991546107 at stu.nkmu.edu.tw > 
To: "Jayesh Krishna" < jayesh at mcs.anl.gov >
Sent: Thursday, August 2, 2012 7:43:27 AM
Subject: Re: Using g95 to compile mpich2 problem 

Hi, 
Thank you, Jayesh.
I had already tried using the default process manager (hydra).

I used the following commands:
User@980622-05 /cygdrive/c/cygwin/mpich2-1.4
$ ./configure F77=g95 FC=g95 2>&1 | tee c.txt
User@980622-05 /cygdrive/c/cygwin/mpich2-1.4
$ make 2>&1 | tee m.txt
User@980622-05 /cygdrive/c/cygwin/mpich2-1.4
$ make install 2>&1 | tee mi.txt
User@980622-05 /cygdrive/c/cygwin/mpich2-1.4
$ export MPICH2_HOME=/cygdrive/c/cygwin/mpich2-1.4
User@980622-05 /cygdrive/c/cygwin/mpich2-1.4
$ export PATH=/cygdrive/c/cygwin/usr/local/bin:$PATH
User@980622-05 /cygdrive/c/cygwin/mpich2-1.4
$ export LD_LIBRARY_PATH=/cygdrive/c/cygwin/usr/local/lib:$LD_LIBRARY_PATH


MPICH2 was installed to the default location, /usr/local/.
I could run mpich2-1.4/examples/cpi.
But when I ran make testing in the mpich2-1.4/test directory, I got error messages (details are saved in testing.txt).

Looking in ./coll/testlist 
Unexpected output in allred: Fatal error in MPI_Init_thread: Other MPI error, error stack: 
Unexpected output in allred: MPIR_Init_thread(388).................: 
Unexpected output in allred: MPID_Init(139)........................: channel initialization failed 
Unexpected output in allred: MPIDI_CH3_Init(38)....................: 
Unexpected output in allred: MPID_nem_init(196)....................: 
Unexpected output in allred: MPIDI_CH3I_Seg_commit(369)............: 
Unexpected output in allred: MPIU_SHMW_Seg_attach(925).............: 
Unexpected output in allred: MPIU_SHMW_Seg_create_attach_templ(637): open failed - No such file or directory 

Unexpected output in allred5: Fatal error in MPI_Init_thread: Other MPI error, error stack: 
Unexpected output in allred5: MPIR_Init_thread(388).................: 
Unexpected output in allred5: MPID_Init(139)........................: channel initialization failed 
Unexpected output in allred5: MPIDI_CH3_Init(38)....................: 
Unexpected output in allred5: MPID_nem_init(196)....................: 
Unexpected output in allred5: MPIDI_CH3I_Seg_commit(366)............: 
Unexpected output in allred5: MPIU_SHMW_Hnd_deserialize(324)........: 
Unexpected output in allred5: MPIU_SHMW_Seg_open(863)...............: 
Unexpected output in allred5: MPIU_SHMW_Seg_create_attach_templ(637): open failed - Device or resource busy 
Unexpected output in allred5: [proxy:0:0@980622-05] HYDU_sock_read (./utils/sock/sock.c:272): read error (Software caused connection abort)
Unexpected output in allred5: [proxy:0:0@980622-05] pmi_cb (./pm/pmiserv/pmip_cb.c:228): unable to read PMI command
Unexpected output in allred5: [proxy:0:0@980622-05] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
Unexpected output in allred5: [proxy:0:0@980622-05] main (./pm/pmiserv/pmip.c:226): demux engine error waiting for event

Can you give me some advice on how to solve the problem?
Thank you for your help.
Best regards, 
Chi-Che Hsieh 


2012/8/1 Jayesh Krishna < jayesh at mcs.anl.gov > 


Hi, 
Does the default process manager (hydra) work for you? Hydra is the recommended process manager for MPICH2 (on Unix environments). 
Also see if you can compile and run <MPICH2-INSTALL-DIR>/examples/cpi.c.

(PS: SMPD is not regularly tested in the Cygwin environment.)
Regards, 
Jayesh 



----- Original Message ----- 
From: "謝奇哲" < 991546107 at stu.nkmu.edu.tw > 
To: "Jayesh Krishna" < jayesh at mcs.anl.gov >, mpich-discuss at mcs.anl.gov 
Cc: "謝奇哲" < 991546107 at stu.nkmu.edu.tw > 
Sent: Wednesday, August 1, 2012 4:04:45 AM 
Subject: Using g95 to compile mpich2 problem 

Hi, 

I'm sorry if I'm asking too many questions.

My application needs to call mpi_init, so I installed MPICH2-1.4.
I'm using Windows XP, Cygwin, g95, and a single-node system.
Which process manager option, smpd or gforker, is suitable for me?

If smpd is the suitable choice:
I tried smpd before, and I could use it to run the cpi example.
But when I run make testing in the mpich2-1.4/test directory, I get different problems on two computers.
1) 
Looking in ./spawn/testlist 
Unexpected output in taskmaster: Fatal error in MPI_Init: Invalid port, error stack: 
Unexpected output in taskmaster: MPIR_Init_thread(388).............: 

Unexpected output in taskmaster: MPID_Init(264)....................: spawned process group was unable to connect back to the parent on port <tag=0 description=user001 port=2634 ifname=10.1.10.108 > 
Unexpected output in taskmaster: MPID_Comm_connect(191)............: 
Unexpected output in taskmaster: MPIDI_Comm_connect(546)...........: Named port tag=0 description=user001 port=2634 ifname=10.1.10.108 does not exist 
Unexpected output in taskmaster: MPIDI_Comm_connect(401)...........: 
Unexpected output in taskmaster: MPIC_Sendrecv(192)................: 
Unexpected output in taskmaster: MPIC_Wait(540)....................: 
Unexpected output in taskmaster: MPIDI_CH3I_Progress(402)..........: 
Unexpected output in taskmaster: MPID_nem_mpich2_blocking_recv(905): 
Unexpected output in taskmaster: MPID_nem_tcp_connpoll(1838).......: 
Unexpected output in taskmaster: state_c_tmpvcsent_handler(1264)...: Failure during connection protocol 
Unexpected output in taskmaster: recv_cmd_pkt(739).................: read from socket failed - Connection reset by peer 
Unexpected output in taskmaster: 
Unexpected output in taskmaster: job aborted: 
Unexpected output in taskmaster: rank: node: exit code[: error message] 
Unexpected output in taskmaster: 0: user001: 1: process 0 exited without calling finalize 
Unexpected output in taskmaster: 
Unexpected output in taskmaster: mpiexec terminated job due to 600 second timeout. 
Program taskmaster exited without No Errors 
Unexpected output in disconnect_reconnect: Fatal error in MPI_Comm_connect: Invalid port, error stack: 
Unexpected output in disconnect_reconnect: MPI_Comm_connect(127)..: MPI_Comm_connect(port="", MPI_INFO_NULL, root=0, MPI_COMM_WORLD, newcomm=0x22ac48) failed 
Unexpected output in disconnect_reconnect: MPID_Comm_connect(191).: 
Unexpected output in disconnect_reconnect: MPIDI_Comm_connect(412): Named port does not exist 
Unexpected output in disconnect_reconnect: Fatal error in MPI_Comm_connect: Invalid port, error stack: 
Unexpected output in disconnect_reconnect: MPI_Comm_connect(127).............: MPI_Comm_connect(port="tag=0 description=user001 port=3378 ifname=10.1.10.108 ", MPI_INFO_NULL, root=0, MPI_COMM_WORLD, newcomm=0x22ac48) failed 
Unexpected output in disconnect_reconnect: MPID_Comm_connect(191)............: 
Unexpected output in disconnect_reconnect: MPIDI_Comm_connect(546)...........: Named port tag=0 description=user001 port=3378 ifname=10.1.10.108 does not exist 
Unexpected output in disconnect_reconnect: MPIDI_Comm_connect(401)...........: 
Unexpected output in disconnect_reconnect: MPIC_Sendrecv(192)................: 
Unexpected output in disconnect_reconnect: MPIC_Wait(540)....................: 
Unexpected output in disconnect_reconnect: MPIDI_CH3I_Progress(402)..........: 
Unexpected output in disconnect_reconnect: MPID_nem_mpich2_blocking_recv(905): 
Unexpected output in disconnect_reconnect: MPID_nem_tcp_connpoll(1838).......: 
Unexpected output in disconnect_reconnect: state_c_tmpvcsent_handler(1264)...: Failure during connection protocol 
Unexpected output in disconnect_reconnect: recv_cmd_pkt(739).................: read from socket failed - Connection reset by peer 
Unexpected output in disconnect_reconnect: Fatal error in MPI_Comm_connect: Invalid port, error stack: 
Unexpected output in disconnect_reconnect: MPI_Comm_connect(127)..: MPI_Comm_connect(port="", MPI_INFO_NULL, root=0, MPI_COMM_WORLD, newcomm=0x22ac48) failed 
Unexpected output in disconnect_reconnect: MPID_Comm_connect(191).: 
Unexpected output in disconnect_reconnect: MPIDI_Comm_connect(412): Named port does not exist 
Unexpected output in disconnect_reconnect: 
Unexpected output in disconnect_reconnect: job aborted: 
Unexpected output in disconnect_reconnect: rank: node: exit code[: error message] 
Unexpected output in disconnect_reconnect: 0: user001: 1: process 0 exited without calling finalize 
Unexpected output in disconnect_reconnect: 1: user001: 1: process 1 exited without calling finalize 
Unexpected output in disconnect_reconnect: 2: user001: 1: process 2 exited without calling finalize 
Unexpected output in disconnect_reconnect: 
Unexpected output in disconnect_reconnect: mpiexec terminated job due to 180 second timeout. 
Program disconnect_reconnect exited without No Errors 
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
2) 
Looking in ./attr/testlist 
Unexpected output in attric: Fatal error in MPI_Init_thread: Other MPI error, error stack: 
Unexpected output in attric: MPIR_Init_thread(388).................: 
Unexpected output in attric: MPID_Init(139)........................: channel initialization failed 
Unexpected output in attric: MPIDI_CH3_Init(38)....................: 
Unexpected output in attric: MPID_nem_init(196)....................: 
Unexpected output in attric: MPIDI_CH3I_Seg_commit(366)............: 
Unexpected output in attric: MPIU_SHMW_Hnd_deserialize(324)........: 
Unexpected output in attric: MPIU_SHMW_Seg_open(863)...............: 
Unexpected output in attric: MPIU_SHMW_Seg_create_attach_templ(637): open failed - Device or resource busy 
Unexpected output in attric: Fatal error in MPI_Init_thread: Other MPI error, error stack: 
Unexpected output in attric: MPIR_Init_thread(388).................: 
Unexpected output in attric: MPID_Init(139)........................: channel initialization failed 
Unexpected output in attric: MPIDI_CH3_Init(38)....................: 
Unexpected output in attric: MPID_nem_init(196)....................: 
Unexpected output in attric: MPIDI_CH3I_Seg_commit(369)............: 
Unexpected output in attric: MPIU_SHMW_Seg_attach(925).............: 
Unexpected output in attric: MPIU_SHMW_Seg_create_attach_templ(637): open failed - No such file or directory 
Unexpected output in attric: op_read error on left context: Error = -1 
Unexpected output in attric: 
Unexpected output in attric: unable to read the cmd header on the left context, Error = -1 
Unexpected output in attric: . 
Unexpected output in attric: 
Unexpected output in attric: mpiexec terminated job due to 180 second timeout. 
Program attric exited without No Errors 
The two computers have the same hardware, and I used the same installation method on both.
Can you give me some advice on how to solve the problem?

Best regards, 
Chi-Che Hsieh 




