Unchecked memory allocation and potential performance problem

William Gropp gropp at mcs.anl.gov
Tue Dec 5 15:35:41 CST 2006


Hmm.

What's really needed is some feedback to the user that the redef is a
problem.  The buffer size choice is probably also worth several factors
of two in performance at least (and we should have an answer for that
anyway; if the default buffer size got 1/2 of the achievable
performance, that might be a good starting point).

So, I'd say if there was some good way to guide the user to better  
usage scenarios, automating the process of picking good strategies  
for poor choices is less important.  But, as the FLASH explanation  
shows, seemingly innocuous changes can have large, and undiagnosed,  
performance consequences.

For example, if there was a way to ask the close call to output a
list of the performance-stealing operations that the user committed,
that would have been a huge help.
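
A rough sketch of what such a report could look like is below.  It is
only an illustration: perf_log, perf_log_note and perf_log_report are
hypothetical names, not part of parallel-netcdf, and the idea is
simply that the library notes each expensive operation as it happens
and prints the list when the file is closed.

```c
#include <stdio.h>

#define PERF_LOG_MAX 16   /* arbitrary cap chosen for this sketch */

/* Hypothetical per-file record of costly operations (not a real
 * parallel-netcdf structure). */
typedef struct {
    const char *what[PERF_LOG_MAX];      /* short description of the operation */
    long long bytes_moved[PERF_LOG_MAX]; /* data shuffled because of it */
    int count;                           /* number of entries recorded */
} perf_log;

/* Record one costly operation; entries beyond the cap are dropped. */
static void perf_log_note(perf_log *log, const char *what, long long bytes)
{
    if (log->count < PERF_LOG_MAX) {
        log->what[log->count] = what;
        log->bytes_moved[log->count] = bytes;
        log->count++;
    }
}

/* Print every recorded operation; returns how many were reported.
 * A close routine could call this when the user asks for diagnostics. */
static int perf_log_report(const perf_log *log, FILE *out)
{
    for (int i = 0; i < log->count; i++)
        fprintf(out, "perf warning: %s (moved %lld bytes)\n",
                log->what[i], log->bytes_moved[i]);
    return log->count;
}
```

In this sketch the redef path would call perf_log_note() whenever it
has to move data, and close would call perf_log_report(&log, stderr)
if the user requested it.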

Bill

On Dec 5, 2006, at 3:06 PM, Robert Latham wrote:

> On Tue, Dec 05, 2006 at 08:57:26AM -0600, William Gropp wrote:
>> In mpincio.c, there's this code
>>
>>   const int bufsize = 4096;
>>   ...
>>   void *buf = malloc(bufsize);
>
>> buf isn't checked for null before it is used.
>
> I've fixed this in CVS.
>
>> Also, the 4096 is used
>> as the buffer size in MPI_File_read_at and similar calls (this seems
>> rather small).  Should this buffer size be negotiated? (I'm wondering
>> if this might be a source of the slowdown that we're seeing, as this
>> code can be invoked during a close).
>
> I covered most of this in my reply to Katie's message.  I don't know
> what a good value would be for this:  no matter how big we make it, if
> the caller mucks with the header and forces us to shuffle all the bits
> in the datafile, it's going to be slow.  Additionally, both we and
> serial netcdf strongly encourage everyone to do all their defining
> once, so is it worth constructing a lot of negotiation infrastructure
> for a hopefully infrequently used code path?
>
> I could pick a larger yet still safe value for the bufsize.  Maybe 1
> MB?
>
> ==rob
>
> -- 
> Rob Latham
> Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
> Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
>
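
For reference, the two points quoted above (check the malloc result,
and prefer a larger but still safe buffer) can be combined in one
small helper.  This is a minimal sketch under assumptions, not the fix
that went into CVS: alloc_copy_buffer is a hypothetical name, and the
1 MB preferred / 4096 byte minimum sizes are just the values floated
in this thread.

```c
#include <stdlib.h>

/* Hypothetical helper: try to allocate `preferred` bytes and, if that
 * fails, keep halving the request until it would drop below `minimum`.
 * On success the actual size is stored in *got.  On failure NULL is
 * returned and *got is set to 0, so the caller must check the result
 * before using the buffer. */
static void *alloc_copy_buffer(size_t preferred, size_t minimum, size_t *got)
{
    size_t size = preferred;

    if (minimum == 0)
        minimum = 1;            /* guarantees the loop terminates */

    while (size >= minimum) {
        void *buf = malloc(size);
        if (buf != NULL) {
            *got = size;
            return buf;
        }
        size /= 2;              /* retry with a smaller request */
    }

    *got = 0;
    return NULL;
}
```

A caller would request, say, alloc_copy_buffer(1 << 20, 4096, &bufsize),
return an error if the result is NULL, and otherwise pass bufsize to
the subsequent read and write calls.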



