[petsc-users] Performance of superlu_dist
Ormiston, Scott J.
SJ_Ormiston at UManitoba.ca
Fri Apr 1 12:36:05 CDT 2011
Gaetan Kenway wrote:
> I have seen the same thing with SuperLU_dist as Scott Ormiston has. I've
> been using it to solve a (small-ish) 3D solid finite-element structural
> system with rarely more than ~30,000 dof. Basically, if you use more
> than 2 cores, SuperLU_dist tanks and the factorization time goes through
> the roof exponentially. However, if you solve the same system with
> Spooles, it's orders of magnitude faster. I'm not overly concerned with
> speed, since I only do this factorization once in my code, and as such I
> don't have precise timing results. With 22,000 dof on a dual-socket
> Xeon X5500 series machine (8 cores per node), Spooles shows a speed-up
> going from 1 to 8 procs, and I could go up to about 32 procs before it
> takes longer than the single-processor case.
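For reference, the SuperLU_dist/Spooles switch Gaetan describes is just a
run-time choice in PETSc, so the same binary can be timed with either
package. A minimal sketch of the kind of KSP setup involved (not taken from
either of our codes; it assumes the KSP/PC interface of the PETSc releases
of this era, and assembly, error checking, and cleanup are elided) would be:

#include <petscksp.h>

/* Direct solve whose factorization package is chosen at run time via
   -pc_factor_mat_solver_package superlu_dist (or spooles, mumps, ...). */
int main(int argc, char **argv)
{
  Mat A;      /* assembled stiffness matrix */
  Vec b, x;   /* load vector and solution */
  KSP ksp;
  PC  pc;

  PetscInitialize(&argc, &argv, NULL, NULL);

  /* ... create and assemble A, b, x here ... */

  KSPCreate(PETSC_COMM_WORLD, &ksp);
  /* four-argument form used by the PETSc releases of this era */
  KSPSetOperators(ksp, A, A, DIFFERENT_NONZERO_PATTERN);
  KSPSetType(ksp, KSPPREONLY);   /* apply the factored preconditioner once */
  KSPGetPC(ksp, &pc);
  PCSetType(pc, PCLU);
  KSPSetFromOptions(ksp);        /* picks up -pc_factor_mat_solver_package etc. */

  KSPSolve(ksp, b, x);           /* the factorization happens on the first solve */

  /* ... cleanup elided ... */
  PetscFinalize();
  return 0;
}

Running the same executable with -pc_factor_mat_solver_package spooles
repeats the comparison without recompiling.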
Following the suggestion of Desire Nuentsa Wakam (who pointed me to the
FAQ), I have had better performance from superlu_dist using
mpiexec --cpus-per-proc 4 --bind-to-core -np 3 executable_name \
-pc_type lu -pc_factor_mat_solver_package superlu_dist
on a server that has 4 quad-core CPUs and 64 GB of RAM. I assume different
option settings will be needed for other arrangements of cores and
interconnects.
I have not yet done enough tests to characterize the speed-up.
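For anyone repeating the comparison, appending PETSc's profiling options to
the same command, e.g.

mpiexec --cpus-per-proc 4 --bind-to-core -np 3 executable_name \
    -pc_type lu -pc_factor_mat_solver_package superlu_dist \
    -ksp_view -log_summary

should confirm which solver package was actually used (-ksp_view) and break
out the symbolic and numeric factorization times from the rest of the solve
in the -log_summary event table.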
Thank you for your pointer to Spooles.
Scott Ormiston