[mpich-discuss] MPICH2 1.0.6p1 & Windows HPC Server 2008 (bad performance)

Jayesh Krishna jayesh at mcs.anl.gov
Tue Dec 9 10:54:24 CST 2008


Hi,
 Can you send us the code for LaplaceSolver.exe (or a test program that
runs slowly)? How long do you have to wait for the program to complete
execution? Are the machines connected using Ethernet?
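
 If it helps to put numbers on "slow", a small timing helper along these
lines could wrap MPI_Barrier or the solver's communication step. This is only
a sketch: the names CallTimes and time_calls and the WITH_MPI build macro are
made up for illustration, and it uses std::chrono rather than MPI_Wtime so
that the helper itself does not depend on MPI.

```cpp
// Sketch only. CallTimes, time_calls and WITH_MPI are hypothetical names.
// Build with -DWITH_MPI to get the small MPI driver at the bottom.
#ifdef WITH_MPI
#include "mpi.h"   // kept first, as in the original test program
#endif
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdio>
#include <vector>

// Wall-clock seconds per call, summarized over all repeats.
struct CallTimes {
    double min_s;
    double max_s;
    double mean_s;
};

// Call op() `repeats` times and time each call separately.
template <typename Op>
CallTimes time_calls(Op op, int repeats) {
    assert(repeats > 0);
    std::vector<double> samples;
    samples.reserve(repeats);
    for (int i = 0; i < repeats; ++i) {
        auto t0 = std::chrono::steady_clock::now();
        op();
        auto t1 = std::chrono::steady_clock::now();
        samples.push_back(std::chrono::duration<double>(t1 - t0).count());
    }
    double sum = 0.0;
    for (double s : samples) sum += s;
    CallTimes r;
    r.min_s = *std::min_element(samples.begin(), samples.end());
    r.max_s = *std::max_element(samples.begin(), samples.end());
    r.mean_s = sum / samples.size();
    return r;
}

#ifdef WITH_MPI
int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    // Time 10 consecutive barriers on every rank.
    CallTimes t = time_calls([] { MPI_Barrier(MPI_COMM_WORLD); }, 10);
    printf("rank %d: barrier min %.6f s, max %.6f s, mean %.6f s\n",
           rank, t.min_s, t.max_s, t.mean_s);
    fflush(stdout);
    MPI_Finalize();
    return 0;
}
#endif
```

 Running it once with both processes on one host and once across two hosts
would show whether the delay is paid on every call or only on the first one.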
 
Regards,
Jayesh

  _____  

From: Seifer Lin [mailto:seiferlin at gmail.com] 
Sent: Monday, December 08, 2008 9:11 PM
To: Jayesh Krishna
Cc: mpich-discuss at mcs.anl.gov
Subject: Re: [mpich-discuss] MPICH2 1.0.6p1 & Windows HPC Server 2008
(bad performance)


Hi:
 
I have now put all 4 nodes in the same domain (originally they were in
the same workgroup).
With that change, MPI_Barrier(...) returns at once with both 1.0.6p1 and
1.0.8.
But another simple MPI program (a Laplace equation solver) still runs very
SLOWLY with both 1.0.6p1 and 1.0.8!
 
I have noticed that a command like
mpiexec -hosts 2 192.168.1.1 192.168.1.2 \\192.168.1.1\shared\LaplaceSolver.exe
runs very SLOWLY (the processes are located on different machines).
 
Another command, like
mpiexec -hosts 2 192.168.1.1 192.168.1.1 \\192.168.1.1\shared\LaplaceSolver.exe
runs at normal speed (both processes are located on the SAME machine).
 
 
I think this may be due to the strict network transfer policies of Windows
HPC Server 2008 (and also Vista).
Do you have any solution for this? Thank you!
 
regards,
 
Seifer Lin


2008/12/8 Jayesh Krishna <jayesh at mcs.anl.gov>


Hi,
 Can you try out the latest stable version (1.0.8) of MPICH2
(http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads)?
 
Regards,
Jayesh

  _____  

From: mpich-discuss-bounces at mcs.anl.gov
[mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Seifer Lin
Sent: Monday, December 08, 2008 12:47 AM
To: mpich-discuss at mcs.anl.gov
Subject: [mpich-discuss] MPICH2 1.0.6p1 & Windows HPC Server 2008
(bad performance)


Hi everyone:
 
I have a cluster of 4 nodes, all with Windows HPC Server 2008 installed.
I have put all 4 nodes in the same workgroup. I use MPICH2 1.0.6p1 from
Argonne Lab.
In addition:
1. The firewall is turned off on all 4 nodes
2. UAC (User Account Control) is turned off on all 4 nodes
3. smpd.exe (1.0.6p1 x64) is started on all 4 nodes
 
Then I run a very simple MPI program (test_mpich2.exe):
 
#include "mpi.h"
#include <cstdio>   // printf, fflush
int main(int argc, char **argv)
{
    int cpuid, ncpu;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &ncpu);   // total number of processes
    MPI_Comm_rank(MPI_COMM_WORLD, &cpuid);  // rank of this process
    printf("NCPU:%d, CPUID:%d\n", ncpu, cpuid);
    fflush(stdout);
    printf("start barrier\n"); fflush(stdout);
    MPI_Barrier(MPI_COMM_WORLD);            // the call that is slow to return
    printf("end barrier\n"); fflush(stdout);
    MPI_Finalize();
    return 0;
}
 
The command is
mpiexec -hosts 2 192.168.1.1 192.168.1.2 \\192.168.1.1\shared\test_mpich2.exe
 
And the MPI_Barrier(...) call takes 10 seconds to return!
 
If the same code is run on a Windows XP cluster, MPI_Barrier(...)
returns at once!
 
 
Does anyone know how to solve this problem on Windows HPC Server 2008?
(Windows Vista has the same problem, too.)
 
regards,
 
Seifer Lin

