<div>Hi everyone,</div>
<div> </div>
<div>I have a cluster of 4 nodes, all running Windows HPC Server 2008.</div>
<div>All 4 nodes are in the same workgroup, and I use MPICH2 1.0.6p1 from Argonne National Laboratory.</div>
<div>The setup on each node is:</div>
<div>1. The firewall is turned off on all 4 nodes.</div>
<div>2. UAC (User Account Control) is turned off on all 4 nodes.</div>
<div>3. smpd.exe (1.0.6p1 x64) is started on all 4 nodes.</div>
<div> </div>
<div>I then run a very simple MPI program (test_mpich2.exe):</div>
<div> </div>
<div>#include "mpi.h"<br>#include &lt;cstdio&gt;</div>
<div>int main(int argc, char **argv)<br>{<br> int cpuid, ncpu;<br> MPI_Init(&argc, &argv);<br> MPI_Comm_size(MPI_COMM_WORLD, &ncpu);<br> MPI_Comm_rank(MPI_COMM_WORLD, &cpuid);</div>
<div> printf("NCPU:%d, CPUID:%d\n", ncpu, cpuid);<br> fflush(stdout);</div>
<div> printf("start barrier\n"); fflush(stdout);<br> MPI_Barrier(MPI_COMM_WORLD);<br> printf("end barrier\n"); fflush(stdout);</div>
<div> MPI_Finalize();</div>
<div> return 0;</div>
<div>}</div>
<div> </div>
<div>The command is:</div>
<div>mpiexec -hosts 2 192.168.1.1 192.168.1.2 \\192.168.1.1\shared\test_mpich2.exe</div>
<div> </div>
<div>The MPI_Barrier(...) call takes 10 seconds to return!</div>
<div> </div>
<div>When the same code runs on a Windows XP cluster, MPI_Barrier(...) returns immediately.</div>
<div> </div>
<div> </div>
<div>Does anyone know how to solve this problem on Windows HPC Server 2008? (Windows Vista has the same problem.)</div>
<div> </div>
<div>Regards,</div>
<div> </div>
<div>Seifer Lin</div>