[mpich-discuss] MPI_Barrier() failed

Pavan Balaji balaji at mcs.anl.gov
Tue Jun 28 21:25:14 CDT 2011


Which MPICH2 version?

On 06/28/2011 09:22 PM, 王凯 wrote:
> Hi! Pavan Balaji
> Thanks for your quick reply!
> Here is the information on the error:
> --Fatal error in PMPI_BARRIER:Other MPI error, error stack
> --MPI_Barrier(MPI_COMM_WORLD) failed
> --Failure during collective
> --connect failed -- the semaphore timeout period has expired.
>
> The running times of the two processes differ by several minutes.
> Could that be what caused the connect to fail?
> --
>
> Email: yogikai at 163.com
> Address: Room 921, Automation Building,
> No. 95 Zhongguancun East Road,
> Haidian District, Beijing 100190, China
> Cell Phone: 15210370340
> The State Key Laboratory for Intelligent Control and Management of
> Complex Systems
> Institute of Automation, Chinese Academy of Sciences
>
>
>
> At 2011-06-29 10:15:46, "Pavan Balaji" <balaji at mcs.anl.gov> wrote:
>
>>
>>There's no time limit in barrier. The problem might be something else.
>>Which version of MPICH2 are you using? Also can you send some more
>>information on the error?
>>
>>    -- Pavan
>>
>>On 06/28/2011 09:11 PM, 王凯 wrote:
>>> Hi!
>>> I have two computers in my MPI program, and I use the function
>>> mpi_barrier(MPI_COMM_WORLD) to synchronize the two processes on the two
>>> computers.
>>> One of the processes runs more slowly than the other because the two
>>> computers have different GPUs. I got "fatal error in
>>> PMPI_Barrier" and "connect failed--the semaphore timeout period has
>>> expired".
>>> Is there a time limit in mpi_barrier()?
>>
>>--
>>Pavan Balaji
>>http://www.mcs.anl.gov/~balaji
>
>
>

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji

