[mpich-discuss] Socket error on Quad-core Windows XP

Jayesh Krishna jayesh at mcs.anl.gov
Mon Apr 14 15:18:59 CDT 2008


Hi,
 Run your MPI program as "mpiexec -n 3 simple.exe" (or, to use all 4
cores, "mpiexec -n 4 simple.exe").

Regards,
Jayesh
-----Original Message-----
From: Gib Bogle [mailto:g.bogle at auckland.ac.nz] 
Sent: Monday, April 14, 2008 2:44 PM
To: Jayesh Krishna
Subject: RE: [mpich-discuss] Socket error on Quad-core Windows XP

If you can tell me of another way to access the 4 cores on my machine, I'll
try it.

Gib

Quoting Jayesh Krishna <jayesh at mcs.anl.gov>:

>  Hi,
>   Is there any reason you want to use the "-localonly" option?
>
> (PS: The "closesocket()" errors that you see are due to a subtle bug 
> in the smpd state machine where a socket is closed twice. This should 
> not affect your program. We will fix this bug in the next release. 
> These messages show up since you are using the "-localonly" option.)
>
> Regards,
> Jayesh
> -----Original Message-----
> From: owner-mpich-discuss at mcs.anl.gov
> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Gib Bogle
> Sent: Sunday, April 13, 2008 8:13 PM
> To: mpich-discuss at mcs.anl.gov
> Subject: Re: [mpich-discuss] Socket error on Quad-core Windows XP
>
> I am running on an Intel Core 2 Quad CPU Q6600 PC, under Windows XP.  
> I simply downloaded and installed the latest binary distribution for 
> Windows, mpich2-1.0.7-win32-ia32.msi.  There was no configuration that I
> noticed.
> The process manager is smpd, which is started automatically.
> The invocation of my program is:
>
> mpiexec -localonly 3 simple.exe
>
> A simple version of the code follows.  I found a couple of interesting 
> things.  There are many more 10093 errors with 3 processes than with 4.
> Uncommenting the deallocate statement in the main program seems to 
> eliminate
> 10093 errors, but I still get the occasional 10058 error.  It seems 
> that
> MPICH2 gets upset if memory allocated after MPI_INIT() is not 
> deallocated before MPI_FINALIZE().  Note that errors occur 
> intermittently - not every run.
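>
> For reference, here is a minimal sketch of the variant that eliminates the
> 10093 errors for me: it is just the program below with the deallocate made
> active (the name simple_fixed is mine, and I can only guess that this is
> what MPICH2 actually wants):
>
> ! Same as the listing below, but the allocation made after MPI_INIT()
> ! is freed before MPI_FINALIZE().
> PROGRAM simple_fixed
> use mpitest
> integer :: ierr
> call mpi_initialisation
> call array_initialisation
> call MPI_BARRIER ( MPI_COMM_WORLD, ierr )
> deallocate(occupancy)        ! free before shutting MPI down
> write(*,*) 'MPI_FINALIZE: ',me
> CALL MPI_FINALIZE(ierr)
> END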
>
> Cheers
> Gib
>
> Code:
>
> ! FILE: simple.f90
> ! This exhibits socket errors
>
> module mpitest
>
> use mpi
> IMPLICIT NONE
>
> integer, parameter :: NDATA = 100
> integer, parameter :: NX = 50, NY = NX, NZ = NX
>
> type occupancy_type
>      integer :: cdata(NDATA)
> end type
>
> type(occupancy_type), allocatable :: occupancy(:,:,:)
> integer :: me, my_cell_type
>
> contains
>
> !-----------------------------------------------------------------------------
> !-----------------------------------------------------------------------------
> subroutine mpi_initialisation
> integer :: size, ierr, status(MPI_STATUS_SIZE)
>
> CALL MPI_INIT(ierr)
> CALL MPI_COMM_RANK( MPI_COMM_WORLD, me, ierr )
> CALL MPI_COMM_SIZE( MPI_COMM_WORLD, size, ierr )
> end subroutine
>
> !-----------------------------------------------------------------------------
> !-----------------------------------------------------------------------------
> subroutine array_initialisation
> integer :: x,y,z,k
>
> allocate(occupancy(NX,NY,NZ))
>
> k = 0
> do x = 1,NX
>     do y = 1,NY
>         do z = 1,NZ
>             k = k+1
>             occupancy(x,y,z)%cdata = k
>         enddo
>     enddo
> enddo
> end subroutine
>
> end module
>
> !-----------------------------------------------------------------------------
> !-----------------------------------------------------------------------------
> PROGRAM simple
>
> use mpitest
>
> integer :: ierr
>
> call mpi_initialisation
>
> call array_initialisation
>
> call MPI_BARRIER ( MPI_COMM_WORLD, ierr )
>
> !deallocate(occupancy)
>
> write(*,*) 'MPI_FINALIZE: ',me
>
> CALL MPI_FINALIZE(ierr)
>
> END
>
>
> Pavan Balaji wrote:
>>
>> Do you have a very simple (as simple as possible) program that 
>> demonstrates this? Also, can you give some more information about 
>> your installation --
>>
>> 1. Which version of MPICH2 are you using?
>>
>> 2. What configuration options were passed to MPICH2 during
>> configuration?
>>
>> 3. What process manager are you using?
>>
>> 4. What command line did you use to launch the process manager?
>>
>> 5. What command line did you use to launch the program?
>>
>> 6. Any other information we should probably know about your
>> cluster, e.g., what OS, is there a firewall between the nodes, etc.?
>>
>>  -- Pavan
>>
>> On 04/09/2008 09:49 PM, Gib Bogle wrote:
>>> My mpich-2 program seems to run correctly, but when it tries to 
>>> execute MPI_Finalize() it gives a range of error messages, all 
>>> apparently related to closing the socket connections.  Typical 
>>> messages are:
>>>
>>> unable to read the cmd header on the pmi context, socket connection 
>>> closed
>>>
>>> shutdown failed, sock ####, error 10093
>>>
>>> closesocket failed, sock ####, error 10093
>>>
>>> So far I haven't seen any bad consequences from these errors, but 
>>> they are disconcerting.  Should I care?  Is there something I can do?
>>>
>>> Gib
>>>
>>
>
>
>







