[mpich-discuss] Fatal error in MPI_Barrier

Antonio José Gallardo Díaz ajcampa at hotmail.com
Mon Feb 2 11:17:39 CST 2009


Well, thanks for your answer. Actually, "Wireless" is just the hostname of my PC, and the other PC is named "Wireless2". I use the same user, "mpi", on both PCs.

I will try the mpdcheck utility and then report back.

Thanks for everything.

Greetings from Spain.

From: thakur at mcs.anl.gov
To: mpich-discuss at mcs.anl.gov
Date: Mon, 2 Feb 2009 10:55:03 -0600
Subject: Re: [mpich-discuss] Fatal error in MPI_Barrier

Are you really trying to use the wireless network? Looks like 
that's what is getting used.
 
You can use the mpdcheck utility to diagnose 
network configuration problems. See Appendix A.2 of the installation 
guide.
 
Rajeev
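
A small sketch related to the network diagnosis suggested above. It is not a replacement for mpdcheck and does not reproduce its checks. It only tests one thing such a diagnosis might surface: whether a hostname resolves to a loopback address (127.x.x.x), which other machines could not connect to. That this is the cause of the error in this thread is an assumption, not something the thread confirms. The function names `is_loopback` and `check_host` are hypothetical; the hostnames are the ones mentioned in the thread.

```python
import ipaddress
import socket


def is_loopback(address: str) -> bool:
    """Return True if the IP address string is a loopback address (127.0.0.0/8 or ::1)."""
    return ipaddress.ip_address(address).is_loopback


def check_host(name: str) -> str:
    """Resolve a hostname and report whether the result is a loopback address."""
    try:
        address = socket.gethostbyname(name)
    except socket.gaierror as exc:
        return f"{name}: cannot be resolved ({exc})"
    if is_loopback(address):
        return f"{name}: resolves to loopback address {address}; other hosts cannot connect to it"
    return f"{name}: resolves to {address}"


if __name__ == "__main__":
    # Hostnames taken from this thread; run the script on both machines.
    for host in ("Wireless", "Wireless2"):
        print(check_host(host))
```

Running it on each machine and comparing the output shows whether both PCs see each other under a routable address.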


  
  
From: mpich-discuss-bounces at mcs.anl.gov [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Antonio José Gallardo Díaz
Sent: Monday, February 02, 2009 9:49 AM
To: mpich-discuss at mcs.anl.gov
Subject: [mpich-discuss] Fatal error in MPI_Barrier


Hello, I get this error when I run my jobs that use MPI.


Fatal error in MPI_Barrier: Other MPI error, error stack:
MPI_Barrier(406).............................: MPI_Barrier(MPI_COMM_WORLD) failed
MPIR_Barrier(77).............................:
MPIC_Sendrecv(123)...........................:
MPIC_Wait(270)...............................:
MPIDI_CH3i_Progress_wait(215)................: an error occurred while handling an event returned by MPIDU_Sock_Wait()
MPIDI_CH3I_Progress_handle_sock_event(640)...:
MPIDI_CH3_Sockconn_handle_connopen_event(887): unable to find the process group structure with id <��oz�>
[cli_1]: aborting job:
Fatal error in MPI_Barrier: Other MPI error, error stack:
MPI_Barrier(406).............................: MPI_Barrier(MPI_COMM_WORLD) failed
MPIR_Barrier(77).............................:
MPIC_Sendrecv(123)...........................:
MPIC_Wait(270)...............................:
MPIDI_CH3i_Progress_wait(215)................: an error occurred while handling an event returned by MPIDU_Sock_Wait()
MPIDI_CH3I_Progress_handle_sock_event(640)...:
MPIDI_CH3_Sockconn_handle_connopen_event(887): unable to find the process group structure with id <��oz�>
rank 1 in job 15  wireless_43226   caused collective abort of all ranks
  exit status of rank 1: killed by signal 9

I have two PCs running Linux (Kubuntu 8.10), and I built a cluster from these machines. When I run, for example, the command "mpiexec -l -n 2 hostname", everything looks fine, but when I try to send or receive anything I get the same error. I don't know why. Please, I need a hand. Thanks for everything.


  