Hints on improving performance with WRF and Pnetcdf

Craig Tierney Craig.Tierney at noaa.gov
Mon Sep 6 08:07:56 CDT 2010


On 9/6/10 4:55 AM, Gerry Creager wrote:
> Craig Tierney wrote:
>> On 9/4/10 8:25 PM, Gerry Creager wrote:
>>> Rob Latham wrote:
>>>> On Thu, Sep 02, 2010 at 06:23:42PM -0600, Craig Tierney wrote:
>>>>> I did try setting the hints myself by changing the code, and
>>>>> performance
>>>>> still stinks (or is no faster). I was just looking for a way to not
>>>>> have to modify WRF, or more importantly have every user modify WRF.
>>>>
>>>> What's going slowly?
>>>> If wrf is slowly writing record variables, you might want to try
>>>> disabling collective I/O or carefully selecting the intermediate
>>>> buffer to be as big as one record.
>>>>
>>>> That's the first place I'd look for bad performance.
>>>
>>> Ah, but I'm seeing the same thing on Ranger (UTexas). I'm likely going
>>> to have to modify the WRF pnetcdf code to identify a sufficiently large
>>> stripe count (Lustre file system) to see any sort of real improvement.
>>>
>>> More to the point, I see worse performance than with normal Lustre and
>>> regular netcdf. AND, there's no way to set MPI-IO-HINTS in the SGE as
>>> configured on Ranger. We've tried and their systems folk concur, so it's
>>> not just me saying it.
>>>
>>
>> What do you mean you can't? How would you set it in another batch system?
>
> Pretty much that. In SGE as installed at TACC, it doesn't pass anything.
> That's not to say it won't work with SGE, but not with SGE as installed
> at TACC.
>

Still not clear.  What can you pass to make this work?  What doesn't SGE
pass?  Are you saying there is an environment variable which can be
used to pass hints to the application but TACC doesn't support it?  Why
can't you use -v, or put it in your batch script and tell mpirun to pass
the variable, or put it on the mpirun command line?

>>> I will look at setting the hints file up but I don't think that's going
>>> to give me the equivalent of 64 stripe counts, which looks like the
>>> sweet spot for the domain I'm testing on.
>>>
>>
>> So what hints are you passing, and is the key then to increase the number
>> of stripes for the directory?
>
> The key is stripe-count. BUT only for the wrfout files. I've tried
> changing the stripe-count on the directory, and that did improve
> performance transiently... until they killed my job and rebooted Ranger
> because the rsl.* files were ALSO being written with stripe-count=64,
> which had crashed their Lustre file system. Unintended Consequences has
> not been repealed.
>

Is stripe-count a hint, or are you just setting it with lfs setstripe -c
<stripe-count>?  Why is it only for the wrfout files?  Does it not help
the wrfrst files?

What I would do to get around this, if I knew what files were going to be
created, is create a separate subdirectory, change the stripe-count
on that directory, then create links in the run directory that point to
the files to be created in that directory.  When WRF tries to create the
wrfout files, they get written to the directory that has a different
stripe-count.
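
A rough sketch of that workaround, in Python purely for illustration.
The function name, the "striped" subdirectory name, and the default of
64 stripes are assumptions, not anything WRF or TACC provides.  It also
assumes the writer creates its output without an exclusive-create flag,
so that a dangling symlink is followed on create.

```python
import shutil
import subprocess
from pathlib import Path


def prepare_striped_outputs(run_dir, names, stripe_count=64,
                            subdir="striped", set_stripe=True):
    """Create run_dir/subdir, optionally stripe it, and link each name into it.

    All names and defaults here are hypothetical, chosen for illustration.
    """
    run_dir = Path(run_dir)
    target_dir = run_dir / subdir
    target_dir.mkdir(parents=True, exist_ok=True)

    # Only try to set striping when asked to and when the lfs tool is present.
    if set_stripe and shutil.which("lfs"):
        subprocess.run(
            ["lfs", "setstripe", "-c", str(stripe_count), str(target_dir)],
            check=True,
        )

    links = []
    for name in names:
        link = run_dir / name
        if link.is_symlink():
            link.unlink()      # replace a stale link from an earlier run
        elif link.exists():
            continue           # never touch a real file that is already there
        # Relative link; it dangles until the model creates the file.
        link.symlink_to(Path(subdir) / name)
        links.append(link)
    return links
```

Called with the list of expected wrfout file names before the run starts,
this leaves the rsl.* files in the run directory with the default
striping, and only the linked files land in the restriped subdirectory.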


>>> Craig, once I have time to get back on to this, I think we can convince
>>> NCAR to add this as a bug release. I also anticipate the tweak will be
>>> on the order of 4-5 lines.
>>>
>>
>> I already wrote code so that if you set the variable WRF_MPIIO_HINTS,
>> and list all the hints you want to set (comma delimited), then the
>> code in external/io_pnetcdf/wrf_IO.F90 will set the hints for you. When
>> I see that any of this actually helps I will send the patch in for
>> future use.
>
> Care to share?
>
> Thanks, gerry

I will post tomorrow.
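
In the meantime, the parsing side of the idea looks roughly like the
sketch below.  This is not the patch itself (that lives in the Fortran
under external/io_pnetcdf); it is a Python illustration, and the
key=value form of each comma-delimited entry is an assumption.  Each
resulting pair would then be handed to MPI_Info_set before the file is
created.

```python
import os


def parse_mpiio_hints(raw):
    """Split 'key=value,key=value' into a dict, skipping malformed entries.

    The key=value syntax is an assumption made for this illustration.
    """
    hints = {}
    for item in raw.split(","):
        key, sep, value = item.partition("=")
        key, value = key.strip(), value.strip()
        if sep and key and value:
            hints[key] = value
    return hints


def hints_from_env(var="WRF_MPIIO_HINTS"):
    """Read the hint list from the environment; an unset variable gives {}."""
    return parse_mpiio_hints(os.environ.get(var, ""))
```

For example, setting WRF_MPIIO_HINTS to
"striping_factor=64,romio_cb_write=disable" would yield two hints, and
leaving it unset would yield none, so users who don't care never have to
touch anything.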

Craig



