[mpich-discuss] IMB fails with tcp/ip 512 processes.

Guillaume Mercier mercierg at mcs.anl.gov
Wed Jun 3 07:52:56 CDT 2009


Devesh,

I don't remember exactly if ssm stands for "scalable shared memory" or 
"sockets + shared memory".
As for Nemesis, it supports shared memory and sockets (through TCP), but 
also Myrinet, etc.
It's the default communication channel in MPICH2 1.1.
I'm not sure about the current ssm support in MPICH2 so maybe you should 
give Nemesis a try.

Guillaume
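
For anyone wanting to try this, here is a minimal sketch of configuring an
MPICH2 build with the Nemesis channel and its TCP network module. It assumes
the device string is spelled ch3:nemesis:tcp and that the script is run from
the top of an unpacked MPICH2 source tree; the PREFIX install location is a
hypothetical placeholder. Outside a source tree it only prints the command it
would run.

```shell
#!/bin/sh
# Sketch only: PREFIX is a hypothetical install location, and the device
# string is an assumption about how the option is spelled.
PREFIX="${PREFIX:-$HOME/mpich2-nemesis}"
DEVICE="ch3:nemesis:tcp"   # ch3 device, Nemesis channel, TCP network module

CONFIGURE_CMD="./configure --prefix=$PREFIX --with-device=$DEVICE"

if [ -x ./configure ]; then
    # Inside an MPICH2 source tree: configure, build and install.
    $CONFIGURE_CMD && make && make install
else
    # Not inside a source tree: just show what would be run.
    echo "$CONFIGURE_CMD"
fi
```

Since the TCP module is the default network module in Nemesis, selecting
ch3:nemesis without the :tcp suffix should give the same configuration.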

Devesh Sharma wrote:
> Thanks, Guillaume.
> Kindly clarify the difference between the Nemesis and ssm devices. I am
> sorry if the question sounds silly.
>
> On Wed, Jun 3, 2009 at 6:09 PM, Guillaume Mercier <mercierg at mcs.anl.gov> wrote:
>   
>> Hello,
>>
>> The MPICH2/Nemesis TCP module uses TCP as its protocol, not UDP.
>> You can configure it with the option --with-device=ch3:nemesis:tcp
>> The TCP module is the default network module in Nemesis.
>>
>> Regards,
>> Guillaume
>>
>>
>> Devesh Sharma wrote:
>>     
>>> Thanks, sir.
>>>
>>> I will try this out.
>>> Does the new TCP/IP MPI stack still use UD sockets for data transfer? If
>>> yes, then what is the difficulty in using TCP sockets?
>>>
>>> On Wed, Jun 3, 2009 at 5:18 PM, Dhabaleswar Panda
>>> <panda at cse.ohio-state.edu> wrote:
>>>
>>>       
>>>> If you are interested in the TCP/IP interface, you should use the latest
>>>> MPICH2 stack (not MVAPICH2 stack). FYI, the TCP/IP interface support in
>>>> MVAPICH2-1.2 is similar to that in MPICH2 1.0.7. We released MVAPICH2 1.4
>>>> yesterday and it has the TCP/IP support of MPICH2 1.0.8p1. The latest
>>>> version of the MPICH2 stack is the 1.1 series. You should use this version
>>>> to get the best performance and stability.
>>>>
>>>> Hope this helps.
>>>>
>>>> Thanks,
>>>>
>>>> DK
>>>>
>>>> On Wed, 3 Jun 2009, Devesh Sharma wrote:
>>>>
>>>>
>>>>         
>>>>> Hello list
>>>>>
>>>>> I am trying to run IMB using the ssm ADI of MVAPICH2-1.2 on 32
>>>>> quad-socket, quad-core machines, but it fails with a segfault when I
>>>>> run with 512 processes.
>>>>> Could somebody kindly help me figure out where the problem is?
>>>>>
>>>>> -Devesh
>>>>>
>>>>>


