[mpich-discuss] Large memory allocations in MPI applications under Linux

Sudarshan Raghunathan rdarshan at gmail.com
Wed Apr 15 14:40:16 CDT 2009


Hello Dave,
Yes, unfortunately I really do need to touch all the memory. It just
happens that for one particular problem size the allocations come
close to the available system memory; this is not true in general
(smaller problems work just fine). I wish I could predict the required
memory a priori, so that I would not even call malloc when I know it
is going to fail, but that's not feasible in my case (or very, very
difficult and unreliable). As I mentioned, I did consider peeking at
/proc/meminfo to see how much physical memory is available, but that
seems a bit clunky and unportable.
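
For reference, the /proc/meminfo peek mentioned above could look
roughly like the minimal C sketch below. It is Linux-specific, which is
exactly the portability concern. The helper names meminfo_parse_kib and
proc_meminfo_kib are hypothetical, and the 8 KiB buffer is an assumption
about the size of the file. Note that MemFree tends to understate what
is really obtainable, since it does not count reclaimable page cache.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: search meminfo-style text for a line of the form
 * "<key>: <number> kB" and return the number (in KiB), or -1 if the key
 * is absent or the value cannot be parsed. */
long meminfo_parse_kib(const char *text, const char *key)
{
    size_t keylen;
    const char *line;

    assert(text != NULL && key != NULL);
    keylen = strlen(key);

    for (line = text; line != NULL && *line != '\0'; ) {
        if (strncmp(line, key, keylen) == 0 && line[keylen] == ':') {
            long value;
            /* %ld skips the padding spaces after the colon. */
            return (sscanf(line + keylen + 1, "%ld", &value) == 1) ? value : -1;
        }
        /* Advance to the start of the next line, if there is one. */
        line = strchr(line, '\n');
        if (line != NULL)
            line++;
    }
    return -1;
}

/* Hypothetical helper, Linux only: read /proc/meminfo and look up one
 * field. Assumes the whole file fits in 8 KiB. Returns -1 on failure. */
long proc_meminfo_kib(const char *key)
{
    char buf[8192];
    size_t n;
    FILE *fp = fopen("/proc/meminfo", "r");

    if (fp == NULL)
        return -1;
    n = fread(buf, 1, sizeof buf - 1, fp);
    fclose(fp);
    buf[n] = '\0';
    return meminfo_parse_kib(buf, key);
}
```

A caller could compare proc_meminfo_kib("MemFree") against the size of
a planned allocation and refuse to proceed when the request is clearly
too large, rather than waiting for malloc to fail.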

My perhaps unrealistic hope was that the OS would protect applications
that require too much memory and allow them to fail rather than
over-committing and then hanging the system.

I was indeed concerned about MPI failing internally because of malloc
returning NULL, but at least for MPICH, I noticed that most of the
routines (at least the ones that I care about :-)) check for this and
return an error code that can be trapped by the calling application.

Thank you very much,
Sudarshan

2009/4/15 Dave Goodell <goodell at mcs.anl.gov>:
> Are you actually touching all of this memory?  If so, you just shouldn't be
> trying to allocate more memory than is physically available in general if
> you care about the performance of your application.  In most scenarios
> swapping will absolutely kill the performance of your application.  Is there
> a hard requirement that you process >20GiB of data in-core on a 20GiB-sized
> machine or are you just trying to squeeze every last drop of
> performance/precision/problem out of the system?
>
> Alternatively if you use the rlimit trick described earlier and you malloc
> until it returns NULL then you will probably cause problems for any
> libraries that you use, including the MPI library.  Many libraries assume
> that at least small to moderate amounts of memory are available via malloc
> and will bail out if they are unable to allocate that memory.  This is
> definitely true of MPICH2 and is also the case for anything that uses
> certain libc functions such as strdup or mergesort.
>
> In either case, you should self-impose limits for your memory usage rather
> than relying on the operating system to impose limits on your memory usage.
> When you run all the way up against OS resource limits bad things usually
> start to happen depending on the exact resource in question.  It also
> usually leads to portability problems down the road when you try to move
> your software to a new platform.
>
> -Dave
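
The advice above to self-impose limits rather than rely on the operating
system could be sketched in C as a thin accounting wrapper around
malloc. This is only an illustration under stated assumptions: the type
mem_budget and the functions budget_malloc and budget_free are
hypothetical names, the caller is assumed to remember the size of each
block, and only allocations routed through the wrapper are counted (so
memory used by libraries such as MPI is not).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical self-imposed memory budget, in bytes.
 * Invariant: used <= limit. */
typedef struct {
    size_t limit;  /* cap chosen by the application */
    size_t used;   /* bytes currently handed out via budget_malloc */
} mem_budget;

/* Allocate n bytes only if doing so stays within the budget.
 * Returns NULL when the budget would be exceeded or malloc fails. */
void *budget_malloc(mem_budget *b, size_t n)
{
    void *p;

    assert(b != NULL);
    if (n > b->limit - b->used)   /* cannot underflow: used <= limit */
        return NULL;
    p = malloc(n);
    if (p != NULL)
        b->used += n;
    return p;
}

/* Release a block obtained from budget_malloc; n must be the size that
 * was originally requested for p. */
void budget_free(mem_budget *b, void *p, size_t n)
{
    assert(b != NULL);
    if (p == NULL)
        return;
    assert(n <= b->used);
    free(p);
    b->used -= n;
}
```

With a limit set somewhat below physical memory, the application sees a
clean NULL from budget_malloc at a point of its own choosing, while
malloc itself still has headroom for the small allocations that
libraries expect to succeed.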

