[mpich-discuss] Large memory allocations in MPI applications under Linux

Sudarshan Raghunathan rdarshan at gmail.com
Wed Apr 15 12:09:49 CDT 2009


Dear all,
My question does not pertain to MPICH per se, but I was curious to
know if I've run into a previously-solved problem.

I am running on an AMD Opteron SMP machine with 8 cores in total and 20
GB of physical memory running Linux kernel 2.6.18. My MPI application
(a hugely simplified version attached) is such that all processes
together need to allocate slightly more than 20GB at the same time
(i.e., when running P MPI processes, each rank allocates slightly more
than 20/P GB). When running with P=1, I get a NULL pointer from malloc
when allocating more than the amount of physical memory and can
gracefully terminate my application. However, when running with P > 1,
I see the OS swapping very heavily and the machine becomes totally
unresponsive for a long time (for larger Ps, the only way to get the
machine back into a responsive state is to reboot it). Clearly, this
is a major annoyance for me and for other users of the machine.
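In case the attachment does not come through, the allocation pattern
looks roughly like the following. This is a simplified sketch, not the
exact attached test_alloc.cc; the 21 GB total and the page-touching
loop are only illustrative:

    #include <mpi.h>
    #include <cstdio>
    #include <cstdlib>

    int main(int argc, char** argv)
    {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        // All P ranks together ask for slightly more than the 20 GB of
        // physical memory, i.e. each rank allocates a bit over 20/P GB.
        const size_t total_bytes = 21ULL * 1024 * 1024 * 1024;
        const size_t per_rank    = total_bytes / size;

        char* buf = static_cast<char*>(std::malloc(per_rank));
        if (buf == NULL) {
            std::fprintf(stderr, "rank %d: malloc of %zu bytes failed\n",
                         rank, per_rank);
            MPI_Abort(MPI_COMM_WORLD, 1);
        }

        // Touch every page so the memory is actually backed by the OS
        // (with Linux overcommit, malloc alone may appear to succeed).
        for (size_t i = 0; i < per_rank; i += 4096)
            buf[i] = 1;

        std::free(buf);
        MPI_Finalize();
        return 0;
    }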

I am wondering if there is any way to restructure/rewrite my
application (or tweak settings for malloc), so that irrespective of
how many processes I'm running with, malloc returns a NULL pointer on
at least a subset of the ranks as soon as the total physical memory is
exhausted. The "obvious" solution is to look at /proc/meminfo to see
the physical amount of available memory and allocate only if
sufficient memory is available, but this seems highly suboptimal and
fragile (a sketch of what I mean is included below). Has anyone in the
MPICH community run into this problem before, and if so, are there best
practices for how one should deal with such memory allocations?
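For concreteness, the /proc/meminfo approach I have in mind is roughly
the following sketch. The MemFree parsing and the guarded_malloc helper
are just my own illustration, and the check is inherently racy, since
another rank (or another user's job) can consume the memory between the
check and the malloc:

    #include <cstdio>
    #include <cstdlib>

    // Return the free physical memory in bytes as reported by the
    // MemFree line of /proc/meminfo, or 0 if it cannot be determined.
    static size_t free_physical_bytes()
    {
        std::FILE* f = std::fopen("/proc/meminfo", "r");
        if (!f) return 0;
        char line[256];
        size_t kb = 0;
        while (std::fgets(line, sizeof line, f)) {
            if (std::sscanf(line, "MemFree: %zu kB", &kb) == 1)
                break;
        }
        std::fclose(f);
        return kb * 1024;
    }

    // Allocate only if enough physical memory appears to be free;
    // otherwise return NULL so the caller can terminate gracefully.
    static void* guarded_malloc(size_t nbytes)
    {
        if (nbytes > free_physical_bytes())
            return NULL;
        return std::malloc(nbytes);
    }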

Thank you very much in advance.

Sudarshan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test_alloc.cc
Type: text/x-c++src
Size: 1143 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/mpich-discuss/attachments/20090415/4b957f1e/attachment.cc>
