[mpich-discuss] MPICH v1.2.7p1 and SMP clusters

Marcus Vinicius Brandão Soares mvbsoares at gmail.com
Tue Jan 13 11:06:55 CST 2009


Hello Gustavo and all,

You described a setup with two machines, each with two processors. If I
model it as a simple graph, we have two vertices and two unidirectional
edges between them.

Each machine has two processors, each of them dual core, so there are 8
cores in total. But let's think about the graph model again: now we have two
vertices (the machines), each with two child vertices (the processors); each
of those has two more child vertices (the cores), and the graph ends there.
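
To make the graph model concrete, here is a minimal sketch in Python. It only
encodes the 2 x 2 x 2 hierarchy described above; the names build_cores and
shared_level are hypothetical, and the level labels are my own assumption
about how to read the tree. It says nothing about which path MPICH really
uses between two processes.

```python
from itertools import product

# Assumed topology from this thread: 2 machines x 2 processors x 2 cores.
MACHINES, PROCESSORS, CORES = 2, 2, 2


def build_cores():
    """Return every leaf of the tree as a (machine, processor, core) tuple."""
    return list(product(range(MACHINES), range(PROCESSORS), range(CORES)))


def shared_level(a, b):
    """Name the lowest level of the tree that two cores have in common."""
    if a == b:
        return "same core"
    if a[:2] == b[:2]:
        return "same processor"
    if a[0] == b[0]:
        return "same machine"
    return "network"


if __name__ == "__main__":
    cores = build_cores()
    print(len(cores), "cores in the model")
    # Compare the first core against every other leaf of the tree.
    for other in cores[1:]:
        print(cores[0], other, shared_level(cores[0], other))
```

Two processes placed on one machine meet at the "same processor" or "same
machine" level, while processes on different machines meet only at the
"network" level, which is why the structure of each level matters.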

Do you know how the communication lines between the processor cores are
structured?

2009/1/13 Gustavo Miranda Teixeira <magusbr at gmail.com>

> Hello everyone!
>
> I've been experiencing some issues when using MPICH v1.2.7p1 on an SMP
> cluster and thought maybe someone here could help me.
>
> I have a small cluster of two dual-processor machines connected by gigabit
> Ethernet. Each processor is dual core, which adds up to 8 cores. When I run
> an application with 4 processes spread across both machines (2 processes on
> one machine and 2 on the other), I get significantly better performance
> than when I run the same application with 4 processes on only one machine.
> Isn't that a bit curious? I know some people who have also noticed this,
> but no one can explain to me why it happens. Googling it didn't help
> either. I originally thought it was a problem with my kind of application
> (a heart simulator that uses PETSc to solve some differential equations),
> but some simple experiments showed that a simple MPI_Send inside a huge
> loop causes the same issue. Measuring cache hits and misses showed it's not
> a memory contention problem. I also know that in-node communication in
> MPICH uses the loopback interface, but as far as I know a message that uses
> the loopback interface simply takes a shortcut to the input queue instead
> of being sent to the device, so there is no reason for the message to take
> longer to reach the other processes. So I have no idea why it takes longer
> to use MPICH within a single machine. Has anyone else noticed this too? Is
> there a logical explanation for it?
>
> Thanks,
> Gustavo Miranda Teixeira
>



-- 
Marcus Vinicius
--
"Given enough collaborators,
any problem can be solved"
Eric S. Raymond
The Cathedral and the Bazaar

"The past is only a resource for the present"
Clave de Clau

"No one is so poor that they cannot give a hug; and
no one is so rich that they do not need a hug."
Anonymous