[mpich-discuss] MPICH2 1.0.6p1 & Windows HPC Server 2008 (badperformance)

Seifer Lin seiferlin at gmail.com
Mon Dec 15 17:43:11 CST 2008


Hi:
On both my XP and HPC Server 2008 clusters, TcpAckFrequency is not present
by default.
But XP does not have this performance problem.

For HPC Server 2008 (each node in the cluster uses the same value of
TcpAckFrequency):
1. No TcpAckFrequency present (the default case) -> bad performance
2. TcpAckFrequency=1 -> good performance
3. TcpAckFrequency=2 -> bad performance
4. TcpAckFrequency=10 -> bad performance
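
For reference, here is a minimal C++ sketch of how the value for each case above could be set on a node. It assumes the registry path quoted later in this thread, a REG_DWORD value, and administrator rights. The helper names (tcp_interface_subkey, reg_add_command, set_tcp_ack_frequency) are made up for illustration, and {NIC-id} remains a placeholder for the interface GUID of the adapter in use.

```cpp
#include <string>

// Registry subkey under HKEY_LOCAL_MACHINE, as quoted in this thread.
// nic_id is the interface GUID including braces (placeholder, not a real id).
std::string tcp_interface_subkey(const std::string& nic_id) {
    return "SYSTEM\\CurrentControlSet\\Services\\Tcpip\\Parameters\\Interfaces\\" + nic_id;
}

// Builds the equivalent reg.exe command line for an elevated prompt.
std::string reg_add_command(const std::string& nic_id, unsigned frequency) {
    return "reg add \"HKLM\\" + tcp_interface_subkey(nic_id) +
           "\" /v TcpAckFrequency /t REG_DWORD /d " + std::to_string(frequency) + " /f";
}

#ifdef _WIN32
#include <windows.h>
// Writes TcpAckFrequency directly through the Win32 registry API.
// Assumes the process runs elevated and is linked against advapi32.
bool set_tcp_ack_frequency(const std::string& nic_id, DWORD frequency) {
    HKEY key = nullptr;
    const std::string subkey = tcp_interface_subkey(nic_id);
    if (RegOpenKeyExA(HKEY_LOCAL_MACHINE, subkey.c_str(), 0, KEY_SET_VALUE, &key) != ERROR_SUCCESS)
        return false;
    const LONG rc = RegSetValueExA(key, "TcpAckFrequency", 0, REG_DWORD,
                                   reinterpret_cast<const BYTE*>(&frequency),
                                   sizeof(frequency));
    RegCloseKey(key);
    return rc == ERROR_SUCCESS;
}
#endif
```

Calling reg_add_command("{NIC-id}", 1) with the real interface GUID gives a command line that can be pasted on each node, so every node ends up with the same value.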

According to some sites, TcpAckFrequency=1 prevents small packets from
being merged into a large packet before transmission.

In my tests, using just MPI_Barrier(MPI_COMM_WORLD), I can "feel" that
the call returns faster with TcpAckFrequency=1 than with
TcpAckFrequency=2.
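
To put a number on that impression, one could time repeated barrier calls. Below is a minimal sketch, not what was actually run in these tests: LatencyStats and measure_latency are hypothetical names, and the timer is plain std::chrono so that the helper does not depend on MPI itself.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>

// Per-call wall-clock latency summary, in seconds.
struct LatencyStats {
    double min_s;
    double mean_s;
    double max_s;
};

// Calls op() `repeats` times and records how long each call takes.
template <typename Op>
LatencyStats measure_latency(Op op, std::size_t repeats) {
    using clock = std::chrono::steady_clock;
    LatencyStats stats{0.0, 0.0, 0.0};
    double total = 0.0;
    for (std::size_t i = 0; i < repeats; ++i) {
        const auto start = clock::now();
        op();
        const double elapsed =
            std::chrono::duration<double>(clock::now() - start).count();
        stats.min_s = (i == 0) ? elapsed : std::min(stats.min_s, elapsed);
        stats.max_s = std::max(stats.max_s, elapsed);
        total += elapsed;
    }
    if (repeats > 0) stats.mean_s = total / static_cast<double>(repeats);
    return stats;
}

// Intended use between MPI_Init and MPI_Finalize (requires "mpi.h"):
//   LatencyStats s = measure_latency([] { MPI_Barrier(MPI_COMM_WORLD); }, 100);
//   printf("barrier min %.6f mean %.6f max %.6f s\n", s.min_s, s.mean_s, s.max_s);
```

Running this on each rank once per TcpAckFrequency setting would give figures that can be compared directly across the four cases.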

regards,

Seifer

2008/12/16 Jayesh Krishna <jayesh at mcs.anl.gov>

>  Hi,
>  Great to know it's working for you now. What was the value of
> TcpAckFrequency on your machine when you saw the bad performance? Did you
> change the default value before you ran your test cases?
>
> Regards,
> Jayesh
>
>  ------------------------------
>  *From:* mpich-discuss-bounces at mcs.anl.gov [mailto:
> mpich-discuss-bounces at mcs.anl.gov] *On Behalf Of *Seifer Lin
> *Sent:* Monday, December 15, 2008 2:05 AM
> *To:* mpich-discuss at mcs.anl.gov
>  *Subject:* Re: [mpich-discuss] MPICH2 1.0.6p1 & Windows HPC Server
> 2008(badperformance
>
>   Hi Everyone:
>
> Finally, the problem is resolved by adding a registry value:
>
> HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\{NIC-id}\TcpAckFrequency
>
> Setting TcpAckFrequency to 1 resolves the problem.
>
> regards,
> Seifer Lin
>
> 2008/12/10 Hisham Adel <hosham2004 at yahoo.com>
>
>>
>> Why are you using MPICH2?
>>
>> The cluster component of Windows Server 2008 (the HPC component) includes
>> MSMPI, which is like MPICH2 but modified by Microsoft.
>> You can use it just like MPICH2: the functions are the same and nothing
>> else changes.
>>
>> I have tried it on a cluster of 4 nodes and it works.
>>
>>
>>
>> Hisham Adel Hassan Mohamed
>> Research Assistant
>> Bioinformatics Group
>> Nile University
>> Smart Village - km28
>> Cairo - Alexandria Desert Road
>> Giza , EGYPT
>> Mobile : +20106459663
>> Email: Hisham.mohamed at nileu.edu.eg
>>
>>
>> --- On *Mon, 12/8/08, Jayesh Krishna <jayesh at mcs.anl.gov>* wrote:
>>
>> From: Jayesh Krishna <jayesh at mcs.anl.gov>
>> Subject: Re: [mpich-discuss] MPICH2 1.0.6p1 & Windows HPC Server 2008
>> (badperformance)
>> To: "'Seifer Lin'" <seiferlin at gmail.com>
>> Cc: mpich-discuss at mcs.anl.gov
>> Date: Monday, December 8, 2008, 5:10 PM
>>
>>  Hi,
>>  Can you try out the latest stable version (1.0.8) of MPICH2 (
>> http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads
>> )?
>>
>> Regards,
>> Jayesh
>>
>>  ------------------------------
>> *From:* mpich-discuss-bounces at mcs.anl.gov [mailto:
>> mpich-discuss-bounces at mcs.anl.gov] *On Behalf Of *Seifer Lin
>> *Sent:* Monday, December 08, 2008 12:47 AM
>> *To:* mpich-discuss at mcs.anl.gov
>> *Subject:* [mpich-discuss] MPICH2 1.0.6p1 & Windows HPC Server 2008
>> (badperformance)
>>
>>  Hi everyone:
>>
>> I have a cluster of 4 nodes, all with Windows HPC Server 2008 installed.
>> All 4 nodes are in the same workgroup. I use MPICH2 1.0.6p1 from
>> Argonne Lab.
>> Then:
>> 1. The firewall is turned off on all 4 nodes.
>> 2. UAC (User Account Control) is turned off on all 4 nodes.
>> 3. I start smpd.exe (1.0.6p1 x64) on all 4 nodes.
>>
>> And I run a very simple MPI program (test_mpich2.exe)
>>
>> #include "mpi.h"
>> #include <cstdio>  // std::printf, std::fflush
>> int main(int argc, char **argv)
>> {
>>     int cpuid, ncpu;
>>     MPI_Init(&argc, &argv);
>>     MPI_Comm_size(MPI_COMM_WORLD, &ncpu);   // total number of processes
>>     MPI_Comm_rank(MPI_COMM_WORLD, &cpuid);  // rank of this process
>>     std::printf("NCPU:%d, CPUID:%d\n", ncpu, cpuid);
>>     std::fflush(stdout);
>>     // Flush before and after the barrier so the delay shows in the output.
>>     std::printf("start barrier\n"); std::fflush(stdout);
>>     MPI_Barrier(MPI_COMM_WORLD);
>>     std::printf("end barrier\n"); std::fflush(stdout);
>>     MPI_Finalize();
>>     return 0;
>> }
>>
>> The command is
>> mpiexec -hosts 2 192.168.1.1 192.168.1.2 \\192.168.1.1\shared\test_mpich2.exe
>>
>> And the MPI_Barrier(...) call takes 10 seconds to return!
>>
>> When the same code runs on a Windows XP cluster, MPI_Barrier(...)
>> returns at once.
>>
>>
>> Does anyone know how to solve this problem on Windows HPC Server 2008?
>> (Windows Vista has the same problem, too.)
>>
>> regards,
>>
>> Seifer Lin