general question on speed using quad core Xeons

Randall Mackie rlmackie862 at gmail.com
Tue Apr 15 19:19:14 CDT 2008


I'm running my PETSc code on a cluster of quad-core Xeons connected
by InfiniBand. I hadn't worried much about performance, because
everything seemed to be working quite well, but today I compared
wall-clock times for the same problem on different combinations
of CPUs.

I find that my PETSc code scales well until I start to use
multiple cores per CPU.

For example, the run time doesn't improve when going from 1 core/CPU
to 4 cores/CPU. I find this very strange, especially since top and
Ganglia show all 4 cores on each node running at 100% almost all of
the time. I would have thought that if the cores were going all out,
I would see much better scaling.
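
To make the comparison concrete, here is a minimal sketch of how speedup
and parallel efficiency could be computed from wall-clock times. The
function name scaling_table and the timings in the example are
hypothetical placeholders, not measurements from this cluster.

```python
def scaling_table(timings):
    """Return (procs, speedup, efficiency) rows.

    timings: dict mapping MPI process count -> wall-clock seconds.
    The smallest process count is taken as the baseline.
    """
    base_procs = min(timings)
    base_time = timings[base_procs]
    rows = []
    for procs in sorted(timings):
        speedup = base_time / timings[procs]
        # Efficiency of 1.0 means perfect scaling relative to the baseline.
        efficiency = speedup * base_procs / procs
        rows.append((procs, speedup, efficiency))
    return rows


if __name__ == "__main__":
    # Placeholder timings (seconds), for illustration only.
    example = {8: 100.0, 16: 52.0, 32: 50.0}
    for procs, speedup, efficiency in scaling_table(example):
        print(f"{procs:4d} procs  speedup {speedup:5.2f}  "
              f"efficiency {efficiency:5.2f}")
```

An efficiency that falls sharply when only the number of cores per CPU
increases, with the node count held fixed, is the pattern described above.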

We are using mvapich-0.9.9 over InfiniBand, so I don't know whether
this is a cluster/Xeon issue or something else.

Does anybody have experience with this?

Thanks, Randy M.



