[mpich-discuss] Socket error on Quad-core Windows XP

Jayesh Krishna jayesh at mcs.anl.gov
Tue Apr 15 09:02:29 CDT 2008


 Hi,
  If you are seeing any error (even occasionally :) ) during MPI_Finalize(),
it is a bug in MPICH2. Can you send us your MPI program? How often do you
see the errors?

Regards,
Jayesh
-----Original Message-----
From: Gib Bogle [mailto:g.bogle at auckland.ac.nz] 
Sent: Monday, April 14, 2008 6:29 PM
To: Jayesh Krishna
Subject: Re: [mpich-discuss] Socket error on Quad-core Windows XP

Thanks. (Why didn't I do this before??)  Almost no errors now.  Very
occasionally I see "unable to read the cmd header on the left context,
socket connection closed" on MPI_FINALIZE().  No more 10093 errors.

Cheers
Gib

Jayesh Krishna wrote:
> Hi,
>  Run your MPI program as "mpiexec -n 3 simple.exe" (or, to use all
> 4 cores/procs, "mpiexec -n 4 simple.exe").
> 
> Regards,
> Jayesh
> -----Original Message-----
> From: Gib Bogle [mailto:g.bogle at auckland.ac.nz]
> Sent: Monday, April 14, 2008 2:44 PM
> To: Jayesh Krishna
> Subject: RE: [mpich-discuss] Socket error on Quad-core Windows XP
> 
> If you can tell me of another way to access the 4 cores on my machine, 
> I'll try it.
> 
> Gib
> 
> Quoting Jayesh Krishna <jayesh at mcs.anl.gov>:
> 
>>  Hi,
>>   Is there any reason you want to use the "-localonly" option?
>>
>> (PS: The "closesocket()" errors that you see are due to a subtle bug 
>> in the smpd state machine where a socket is closed twice. This should 
>> not affect your program. We will fix this bug in the next release.
>> These messages show up since you are using the "-localonly" option.)
>>
>> Regards,
>> Jayesh
>> -----Original Message-----
>> From: owner-mpich-discuss at mcs.anl.gov 
>> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Gib Bogle
>> Sent: Sunday, April 13, 2008 8:13 PM
>> To: mpich-discuss at mcs.anl.gov
>> Subject: Re: [mpich-discuss] Socket error on Quad-core Windows XP
>>
>> I am running on an Intel Core 2 Quad CPU Q6600 PC, under Windows XP.
>> I simply downloaded and installed the latest binary distribution for
>> Windows, mpich2-1.0.7-win32-ia32.msi.  There was no configuration that
>> I noticed.  The process manager is smpd, which is started automatically.
>> The invocation of my program is:
>>
>> mpiexec -localonly 3 simple.exe
>>
>> A simple version of the code follows.  I found a couple of interesting
>> things.  There are many more 10093 errors with 3 processors than with 4.
>> Uncommenting the deallocate statement in the main program seems to
>> eliminate 10093 errors, but I still get the occasional 10058 error.
>> It seems that MPICH2 gets upset if memory allocated after MPI_INIT() is
>> not deallocated before MPI_FINALIZE().  Note that errors occur
>> intermittently - not every run.
>>
>> Cheers
>> Gib
>>
>> Code:
>>
>> ! FILE: simple.f90
>> ! This exhibits socket errors
>>
>> module mpitest
>>
>> use mpi
>> IMPLICIT NONE
>>
>> integer, parameter :: NDATA = 100
>> integer, parameter :: NX = 50, NY = NX, NZ = NX
>>
>> type occupancy_type
>>      integer :: cdata(NDATA)
>> end type
>>
>> type(occupancy_type), allocatable :: occupancy(:,:,:)
>> integer :: me, my_cell_type
>>
>> contains
>>
>> !---------------------------------------------------------------------------
>> !---------------------------------------------------------------------------
>> subroutine mpi_initialisation
>> integer :: size, ierr, status(MPI_STATUS_SIZE)
>>
>> CALL MPI_INIT(ierr)
>> CALL MPI_COMM_RANK( MPI_COMM_WORLD, me, ierr )
>> CALL MPI_COMM_SIZE( MPI_COMM_WORLD, size, ierr )
>> end subroutine
>>
>> !---------------------------------------------------------------------------
>> !---------------------------------------------------------------------------
>> subroutine array_initialisation
>> integer :: x,y,z,k
>>
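>> ! Allocate the NX x NY x NZ (50^3) array of derived types; note that this
>> ! happens after MPI_INIT and, as posted, is never freed before MPI_FINALIZE.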
>> allocate(occupancy(NX,NY,NZ))
>>
>> k = 0
>> do x = 1,NX
>>     do y = 1,NY
>>         do z = 1,NZ
>>             k = k+1
>>             occupancy(x,y,z)%cdata = k
>>         enddo
>>     enddo
>> enddo
>> end subroutine
>>
>> end module
>>
>> !---------------------------------------------------------------------------
>> !---------------------------------------------------------------------------
>> PROGRAM simple
>>
>> use mpitest
>>
>> integer :: ierr
>>
>> call mpi_initialisation
>>
>> call array_initialisation
>>
>> call MPI_BARRIER ( MPI_COMM_WORLD, ierr )
>>
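>> ! Uncommenting the next line (i.e. freeing the array before MPI_FINALIZE)
>> ! appears to eliminate the 10093 errors described above.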
>> !deallocate(occupancy)
>>
>> write(*,*) 'MPI_FINALIZE: ',me
>>
>> CALL MPI_FINALIZE(ierr)
>>
>> END
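>>
>> For reference, here is how the end of the main program looks with the
>> workaround applied (a minimal sketch; the only intended change is that
>> the array is freed before finalizing, and the allocated() guard is just
>> defensive):
>>
>> ! Workaround sketch: release module data before MPI_FINALIZE
>> if (allocated(occupancy)) deallocate(occupancy)
>>
>> write(*,*) 'MPI_FINALIZE: ',me
>>
>> CALL MPI_FINALIZE(ierr)
>>
>> END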
>>
>>
>> Pavan Balaji wrote:
>>> Do you have a very simple (as simple as possible) program that 
>>> demonstrates this? Also, can you give some more information about 
>>> your installation --
>>>
>>> 1. Which version of MPICH2 are you using?
>>>
>>> 2. What configuration options were passed to MPICH2 during configuration?
>>>
>>> 3. What process manager are you using?
>>>
>>> 4. What command line did you use to launch the process manager?
>>>
>>> 5. What command line did you use to launch the program?
>>>
>>> 6. Any other information we should probably know about your cluster,
>>> e.g., what OS, is there a firewall between the nodes, etc.
>>>
>>>  -- Pavan
>>>
>>> On 04/09/2008 09:49 PM, Gib Bogle wrote:
>>>> My mpich-2 program seems to run correctly, but when it tries to 
>>>> execute MPI_Finalize() it gives a range of error messages, all 
>>>> apparently related to closing the socket connections.  Typical 
>>>> messages are:
>>>>
>>>> unable to read the cmd header on the pmi context, socket connection 
>>>> closed
>>>>
>>>> shutdown failed, sock ####, error 10093
>>>>
>>>> closesocket failed, sock ####, error 10093
>>>>
>>>> So far I haven't seen any bad consequences from these errors, but 
>>>> they are disconcerting.  Should I care?  Is there something I can do?
>>>>
>>>> Gib
>>>>
>>
>>
> 
> 
> 
> 
> 




