[mpich-discuss] \Device\Afd number of handle
Eugenio.Chiavaccini at cst.com
Wed Nov 4 11:15:03 CST 2009
I'm dealing with some deadlocks related to MPI, possibly caused by a high number of open resources/handles.
The deadlock I'm observing is not strictly related to MPI communication but to the use of sockets in my program. It happens, at least in my experience so far, on a cluster of machines running Windows Server 2008, but not, for instance, on Windows XP.
I'm still digging into the issue, but one hypothesis is that the number of file handles simultaneously open in the process launched by mpiexec is too high and may be reaching some threshold limit.
In detail, Process Explorer (Sysinternals.com) shows that the number of "\Device\Afd" file handles associated with the process is already extremely high at the beginning of code execution. This is certainly due to the handle inheritance option used in the CreateProcess call made by mpiexec and its associated daemon smpd. In fact, those handles come directly from smpd.
Now I wonder whether this could be the reason why I get stuck when I try to create a socket and communicate data.
As I said, so far the problem has occurred only on Windows Server 2008. The number of "\Device\Afd" handles is also high on XP (on the order of 100), but on Windows Server it quickly exceeds 200-300.
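For reference, here is a minimal sketch of the handle-inheritance mechanism described above. It is written in Python only for brevity and is not MPICH or smpd code; the helper names make_private_socket and describe are hypothetical. It shows how a socket's inheritance flag can be inspected and cleared so that child processes do not receive a copy of the handle. On Windows, Python implements this through the same per-handle inheritance flag that CreateProcess honors. Whether inherited handles are actually the cause of the hang is still an unverified hypothesis.

```python
import socket


def make_private_socket():
    """Create a TCP socket whose handle is not passed on to child processes.

    Hypothetical helper for illustration only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Python 3.4+ already creates sockets non-inheritable (PEP 446);
    # the explicit call just makes the intent visible.
    sock.set_inheritable(False)
    return sock


def describe(sock):
    """Return a short label for the socket's current inheritance flag."""
    return "inheritable" if sock.get_inheritable() else "not inheritable"


if __name__ == "__main__":
    s = make_private_socket()
    print("private socket:", describe(s))

    # A launcher that wants its children to share the socket would opt in,
    # which is the situation suspected for the handles coming from smpd.
    s.set_inheritable(True)
    print("after opt-in:", describe(s))
    s.close()
```

In a C or C++ program on Windows, the corresponding step would be clearing HANDLE_FLAG_INHERIT on the handle with SetHandleInformation, or launching children with bInheritHandles set to FALSE in CreateProcess.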
Is anyone aware of possible strategies, solutions, or ideas worth checking?
Suggestions are very welcome.
If you need more details about my installation, do not hesitate to ask.
Thanks a lot