[mpich-discuss] DuplicateHandle on easy_create

Calin Iaru calin at dolphinics.com
Fri Mar 28 11:51:31 CDT 2008


Hi Jayesh,

    the application runs on 2 nodes and it is launched by mpiexec/smpd. I looked at some handle inheritance articles and it appears that if an application wants to allow a handle to be inherited, it has to call DuplicateHandle. If DuplicateHandle is not called, then I would expect the handle not to be inherited. Why would you want the handle to not be inherited? Is it because of security concerns?

Best regards,
    Calin


From: Jayesh Krishna 
Sent: Friday, March 28, 2008 5:20 PM
To: 'Calin Iaru' 
Cc: mpich-discuss at mcs.anl.gov 
Subject: RE: [mpich-discuss] DuplicateHandle on easy_create


Hi,
 Hmm... it looks to me like you are running your MPI app as a singleton client (without using mpiexec to launch the MPI app).
 There is a lot more work to be done on the singleton client implementation (and I think you might have found a subtle bug :) where PMI_Init() gets called multiple times). Let us know if you are running your MPI app as a singleton client and we can discuss a fix for your problem.

Regards,
Jayesh



--------------------------------------------------------------------------------
From: Calin Iaru [mailto:calin at dolphinics.com] 
Sent: Friday, March 28, 2008 10:33 AM
To: Jayesh Krishna; 'Calin Iaru'
Cc: mpich-discuss at mcs.anl.gov
Subject: Re: [mpich-discuss] DuplicateHandle on easy_create


I don't have the exact error message, but you can see an exception being reported in the stack backtrace:

0:000> k
ChildEBP RetAddr  
0012f6f0 7ffe0304 ntdll!KiRaiseUserExceptionDispatcher+0x37
0012f6f4 77f426cb SharedUserData!SystemCallStub+0x4
0012f6f8 71b21e1f ntdll!NtDeviceIoControlFile+0xc
0012f748 71b21f29 MSWSOCK!SockGetTdiHandles+0x5f
0012f758 71b22071 MSWSOCK!SockNotifyHelperDll+0x25
0012f7c8 71b30791 MSWSOCK!SockCloseSocket+0x1ec
0012f7d8 71bf27c6 MSWSOCK!DeleteSockets+0x31
0012f804 71b309b3 WS2HELP!WahEnumerateHandleContexts+0x92
0012f828 71c0d89d MSWSOCK!WSPCleanup+0x73
0012f834 71c09c1a WS2_32!DPROVIDER::WSPCleanup+0x1d
0012f86c 71c02c22 WS2_32!CleanupProtocolProviders+0x23
0012f884 71c06572 WS2_32!NSCATALOG::EnumerateCatalogItems+0x22
0012f8a0 71c081b4 WS2_32!DPROCESS::~DPROCESS+0x6f
0012f8b8 10172fe3 WS2_32!WSACleanup+0x40
0012f8c8 100d0e2a nmpi!MPIDU_Sock_finalize+0x53 
0012fd50 100cf01a nmpi!iPMI_Finalize+0x15a 
0012fd5c 100c9afe nmpi!PMI_Finalize+0x1a 
0012fd6c 100a14bd nmpi!MPIDI_CH3_Finalize+0x7e 
0012fdec 1003bac6 nmpi!MPID_Finalize+0x2dd 
0012fdfc 00401e0f nmpi!PMPI_Finalize+0x96 

0:000> kb
ChildEBP RetAddr  Args to Child              
0012f6f0 7ffe0304 77f426cb 71b21e1f 00000328 ntdll!KiRaiseUserExceptionDispatcher+0x37
0012f6f4 77f426cb 71b21e1f 00000328 00000360 SharedUserData!SystemCallStub+0x4
0012f6f8 71b21e1f 00000328 00000360 00000000 ntdll!NtDeviceIoControlFile+0xc

and here is what !htrace shows for handle 0x328:

0:000> !htrace 0x328
--------------------------------------
Handle = 0x00000328 - *** BAD REFERENCE ***
Thread ID = 0x00000c54, Process ID = 0x00000c50

0x71b21f29: MSWSOCK!SockNotifyHelperDll+0x00000025
0x71b30791: MSWSOCK!DeleteSockets+0x00000031
0x71b309b3: MSWSOCK!WSPCleanup+0x00000073
0x71c0d89d: WS2_32!DPROVIDER::WSPCleanup+0x0000001d
0x71c02c22: WS2_32!NSCATALOG::EnumerateCatalogItems+0x00000022
--------------------------------------
Handle = 0x00000328 - CLOSE
Thread ID = 0x00000c54, Process ID = 0x00000c50

0x10174f88: nmpi!easy_create+0x00000228
0x101747cc: nmpi!MPIDU_Sock_post_connect+0x0000011c
0x100d04b1: nmpi!uPMI_ConnectToHost+0x00000041
0x100cfc8a: nmpi!iPMI_Init+0x0000049a
0x100cefd8: nmpi!PMI_Init+0x00000578
0x100ca614: nmpi!MPIDI_CH3_Init_process_group+0x00000034
0x100caa6c: nmpi!MPIDI_CH3_Init+0x0000003c
0x100a3972: nmpi!MPID_Init+0x000001b2
0x10056bff: nmpi!MPIR_Init_thread+0x0000025f
0x10056883: nmpi!PMPI_Init+0x00000053
0x0040132a: imb!main+0x0000002a
--------------------------------------
Handle = 0x00000328 - OPEN
Thread ID = 0x00000c54, Process ID = 0x00000c50

0x71b22e43: MSWSOCK!SockSocket+0x000003b0
0x71b23008: MSWSOCK!WSPSocket+0x00000126
0x71c02e5d: WS2_32!WSASocketW+0x000000ce
0x71c12447: WS2_32!WSASocketA+0x00000057
0x10174d95: nmpi!easy_create+0x00000035
0x101747cc: nmpi!MPIDU_Sock_post_connect+0x0000011c
0x100d04b1: nmpi!uPMI_ConnectToHost+0x00000041
0x100cfc8a: nmpi!iPMI_Init+0x0000049a

Finally, the call before the exception, in easy_create:
0:000> ub 0x10174f88
nmpi!easy_create+0x20e 
10174f6e 8b55dc          mov     edx,dword ptr [ebp-24h]
10174f71 52              push    edx
10174f72 8bfc            mov     edi,esp
10174f74 ff15989a2810    call    dword ptr [nmpi!_imp__GetCurrentProcess (10289a98)]
10174f7a 3bfc            cmp     edi,esp
10174f7c e80f0dfaff      call    nmpi!_RTC_CheckEsp (10115c90)
10174f81 50              push    eax
10174f82 ff15949a2810    call    dword ptr [nmpi!_imp__DuplicateHandle]

Best regards,
    Calin


From: Jayesh Krishna 
Sent: Friday, March 28, 2008 4:17 PM
To: 'Calin Iaru' 
Cc: mpich-discuss at mcs.anl.gov 
Subject: RE: [mpich-discuss] DuplicateHandle on easy_create


Hi,
 If you are referring to the use of DuplicateHandle() in sock.c (src/mpid/common/sock/iocp/sock.c : easy_create()), it is there to prevent child processes from inheriting the socket. (If you want to duplicate the handle for use by another process, as you mentioned, you should use WSADuplicateSocket().)
 Are you getting any messages from Application Verifier?
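
A minimal sketch of the inheritance point above, not the MPICH code: assuming the goal is only to keep child processes from inheriting the socket, the inherit flag can be cleared on the existing handle with SetHandleInformation(), so the handle value that Winsock created stays open (the !htrace output in this thread shows handle 0x328 being closed inside easy_create and then referenced again during WSACleanup). The helper names open_tcp_socket, mark_noninheritable and is_inheritable are hypothetical, and the non-Windows branch is there only so the sketch builds elsewhere. Whether this behaves well with every layered service provider is not something this thread establishes.

```c
/* Sketch: make a socket non-inheritable in place, without DuplicateHandle().
 * All helper names here are hypothetical; this is not MPICH code. */
#ifdef _WIN32
#include <winsock2.h>
#include <windows.h>
typedef SOCKET sock_handle_t;
#define SOCK_INVALID INVALID_SOCKET
#else
/* Non-Windows branch: present only so the sketch compiles elsewhere. */
#include <fcntl.h>
#include <sys/socket.h>
typedef int sock_handle_t;
#define SOCK_INVALID (-1)
#endif

/* Create a TCP socket; returns SOCK_INVALID on failure. */
sock_handle_t open_tcp_socket(void)
{
#ifdef _WIN32
    WSADATA wsa;
    if (WSAStartup(MAKEWORD(2, 2), &wsa) != 0)
        return SOCK_INVALID;
    return WSASocket(AF_INET, SOCK_STREAM, 0, NULL, 0, WSA_FLAG_OVERLAPPED);
#else
    return socket(AF_INET, SOCK_STREAM, 0);
#endif
}

/* Clear the inherit flag on the existing handle.
 * Returns 0 on success, -1 on failure. */
int mark_noninheritable(sock_handle_t s)
{
#ifdef _WIN32
    /* The handle is neither closed nor replaced, so Winsock's own
     * bookkeeping for it remains valid. */
    return SetHandleInformation((HANDLE)s, HANDLE_FLAG_INHERIT, 0) ? 0 : -1;
#else
    int flags = fcntl(s, F_GETFD);
    if (flags == -1)
        return -1;
    return fcntl(s, F_SETFD, flags | FD_CLOEXEC) == -1 ? -1 : 0;
#endif
}

/* Returns 1 if the handle is inheritable, 0 if not, -1 on error. */
int is_inheritable(sock_handle_t s)
{
#ifdef _WIN32
    DWORD flags = 0;
    if (!GetHandleInformation((HANDLE)s, &flags))
        return -1;
    return (flags & HANDLE_FLAG_INHERIT) ? 1 : 0;
#else
    int flags = fcntl(s, F_GETFD);
    if (flags == -1)
        return -1;
    return (flags & FD_CLOEXEC) ? 0 : 1;
#endif
}
```

On Windows this links against ws2_32. Typical use would be to call mark_noninheritable() right after the socket is created and to treat a -1 return as a creation failure. Newer Windows versions also accept WSA_FLAG_NO_HANDLE_INHERIT in WSASocket(), which avoids the short window between creating the socket and clearing the flag.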

Regards,
Jayesh


--------------------------------------------------------------------------------

From: owner-mpich-discuss at mcs.anl.gov [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Calin Iaru
Sent: Friday, March 28, 2008 9:35 AM
To: mpich-discuss at mcs.anl.gov
Subject: [mpich-discuss] DuplicateHandle on easy_create


A socket object should not be duplicated using DuplicateHandle. For reference, see:
http://msdn2.microsoft.com/en-us/library/ms724251(vs.85).aspx

The problem is that when an application runs under Application Verifier, an exception may be reported, and tracing its cause takes some expertise (use !htrace from WinDbg). Without Application Verifier, no exception is raised.