File locking failed in ADIOI_Set_lock

Wei-keng Liao wkliao at ece.northwestern.edu
Mon Sep 27 18:34:09 CDT 2010


Hi, John,

In ROMIO's Lustre driver, there are only two scenarios in which ADIOI_Set_lock is called.
1. when data sieving is enabled. ROMIO enables it by default. The two hints I posted earlier can disable it (see the sketch below).
2. when I/O atomicity is enabled. It is disabled by default and pnetcdf never enables it, so this possibility is out of the question.
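
For illustration, below is a minimal C sketch of disabling data sieving through an MPI Info object. I am assuming the two hints in question are ROMIO's standard romio_ds_read and romio_ds_write, and the file name here is only hypothetical.

    #include <mpi.h>
    #include <pnetcdf.h>

    int main(int argc, char **argv) {
        int ncid;
        MPI_Info info;

        MPI_Init(&argc, &argv);

        /* Assumption: romio_ds_read/romio_ds_write are the two data-sieving
         * hints; setting them to "disable" turns off the first of the two
         * code paths that reach ADIOI_Set_lock. */
        MPI_Info_create(&info);
        MPI_Info_set(info, "romio_ds_read",  "disable");
        MPI_Info_set(info, "romio_ds_write", "disable");

        /* Pass the info object at create/open time (hypothetical file name). */
        ncmpi_create(MPI_COMM_WORLD, "testfile.nc", NC_CLOBBER, info, &ncid);

        ncmpi_close(ncid);
        MPI_Info_free(&info);
        MPI_Finalize();
        return 0;
    }

The same info object can also be passed directly to MPI_File_open if you are calling MPI-IO without pnetcdf.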

Can you check what hints are enabled in your case?

I just added two programs in C and Fortran to the pnetcdf SVN that print the MPI Info object.
You can cut and paste from them. Please let us know what info you are using.

http://trac.mcs.anl.gov/projects/parallel-netcdf/browser/trunk/examples/get_info_c.c
http://trac.mcs.anl.gov/projects/parallel-netcdf/browser/trunk/examples/get_info_f.F90
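
In case it helps, here is a minimal sketch (not the SVN examples themselves) of querying the hints in effect on an open file with MPI_File_get_info; the file name is hypothetical.

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv) {
        MPI_File fh;
        MPI_Info info_used;
        int i, nkeys, flag;
        char key[MPI_MAX_INFO_KEY + 1], value[MPI_MAX_INFO_VAL + 1];

        MPI_Init(&argc, &argv);

        /* Open a (hypothetical) file and ask ROMIO which hints it is using. */
        MPI_File_open(MPI_COMM_WORLD, "testfile.nc", MPI_MODE_RDONLY,
                      MPI_INFO_NULL, &fh);
        MPI_File_get_info(fh, &info_used);

        MPI_Info_get_nkeys(info_used, &nkeys);
        for (i = 0; i < nkeys; i++) {
            MPI_Info_get_nthkey(info_used, i, key);
            MPI_Info_get(info_used, key, MPI_MAX_INFO_VAL, value, &flag);
            if (flag) printf("hint %s = %s\n", key, value);
        }

        MPI_Info_free(&info_used);
        MPI_File_close(&fh);
        MPI_Finalize();
        return 0;
    }

Running this on the same file system you use with WRF/PIO should show whether the data-sieving hints are actually set to disable.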


Wei-keng

On Sep 27, 2010, at 3:26 PM, John Michalakes wrote:

> It's not dumping a trace (the code is calling MPI_ABORT at this point and so it's dying cleanly). I'll see what I can get for you.  -John
> 
> On 9/27/2010 2:19 PM, Rob Ross wrote:
>> Ugh. Any chance you could get a stack dump from where this is happening? -- Rob
>> 
>> On Sep 27, 2010, at 3:17 PM, John Michalakes wrote:
>> 
>>> Thanks all for your responses and your suggestions.  I have re-engineered the code to use only collective I/O but I am still seeing the error I first wrote about:
>>> 
>>> File locking failed in ADIOI_Set_lock(fd 16,cmd F_SETLKW/7,type F_WRLCK/1,whence 0) with return value FFFFFFFF and errno 26.
>>> If the file system is NFS, you need to use NFS version 3, ensure that the lockd daemon is running on all the machines, and mount the directory with the 'noac' option (no attribute caching).
>>> ADIOI_Set_lock:: Function not implemented
>>> ADIOI_Set_lock:offset 65492, length 1980
>>> 
>>> So this probably isn't an effect from using the independent API after all (drat!).  I will try some of the other suggestions now -- the first of which will be to upgrade to the latest pNetCDF on this machine.  I'm not sure I'll be able to switch over to MPICH2/romio, but I'll look into that as well.
>>> 
>>> Thanks,
>>> 
>>> John
>>> 
>>> 
>>> On 9/24/2010 9:53 AM, Wei-keng Liao wrote:
>>>> Hi, John,
>>>> 
>>>> Rob is right, turning off data sieving just avoids the error messages and may
>>>> significantly slow down performance (if your I/O is non-contiguous and
>>>> uses only non-collective calls).
>>>> 
>>>> On Lustre, you need collective I/O to get high I/O bandwidths. Even if your
>>>> write request from each process is already large and contiguous, non-collective
>>>> write will still give you poor results. This is not the case on other file
>>>> systems, e.g. GPFS.
>>>> 
>>>> Is there an option to turn on collective I/O in WRF and PIO?
>>>> 
>>>> If non-collective I/O is the only option (due to the irregular data
>>>> distribution), then non-blocking I/O is another solution. In
>>>> pnetcdf 1.2.0, non-blocking I/O can aggregate multiple
>>>> non-collective requests into a single collective one. However, this
>>>> approach requires changes to the pnetcdf calls in the I/O library
>>>> used by WRF and PIO. The changes should be very simple, though.
>>>> In general, I would suggest Lustre users seek every opportunity to use
>>>> collective I/O.
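>>>> 
>>>> As a rough sketch (variable layout and names here are hypothetical), the
>>>> non-blocking approach in pnetcdf 1.2.0 would look something like this:
>>>> each process posts its independent writes with ncmpi_iput_vara_double and
>>>> then all processes flush them together with the collective ncmpi_wait_all.
>>>> 
>>>>     #include <mpi.h>
>>>>     #include <pnetcdf.h>
>>>> 
>>>>     /* Post several independent writes without blocking, then flush them
>>>>      * in one collective ncmpi_wait_all call, which lets pnetcdf aggregate
>>>>      * them into a single collective request. */
>>>>     void write_pieces(int ncid, int varid, int npieces,
>>>>                       MPI_Offset starts[][1], MPI_Offset counts[][1],
>>>>                       double *bufs[])
>>>>     {
>>>>         int i, reqs[npieces], statuses[npieces];
>>>> 
>>>>         for (i = 0; i < npieces; i++)
>>>>             ncmpi_iput_vara_double(ncid, varid, starts[i], counts[i],
>>>>                                    bufs[i], &reqs[i]);
>>>> 
>>>>         /* Collective: every process must reach this call. */
>>>>         ncmpi_wait_all(ncid, npieces, reqs, statuses);
>>>>     }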
>>>> 
>>>> Wei-keng
>>>> 
>>>> On Sep 24, 2010, at 9:44 AM, Rob Latham wrote:
>>>> 
>>>>> On Fri, Sep 24, 2010 at 07:49:06AM -0600, Mark Taylor wrote:
>>>>>> Hi John,
>>>>>> 
>>>>>> I've had a very similar issue a while ago on several older Lustre
>>>>>> filesystems at Sandia, and I can confirm that setting those hints did
>>>>>> allow the code to run
>>>>> If you turn off data sieving then there will be no more lock calls.
>>>>> Depending on how your application partitions the arrays, that could be
>>>>> fine, or it could result in a billion 8-byte operations.
>>>>> 
>>>>>> (but I could never get pnetcdf to be any faster
>>>>>> than netcdf).
>>>>> Unsurprising, honestly.  If you are dealing with Lustre, then you must
>>>>> both use an updated ROMIO and use collective I/O.
>>>>> 
>>>>> Here is the current list of MPI-IO implementations that work well with
>>>>> Lustre:
>>>>> 
>>>>> - Cray MPT 3.2 or newer
>>>>> - MPICH2-1.3.0a1 or newer
>>>>> - and that's it.
>>>>> 
>>>>> I think the OpenMPI community is working on a re-sync with MPICH2
>>>>> romio.  I also think we can stitch together a patch against OpenMPI if
>>>>> you really need the improved lustre driver.   I'm not really in
>>>>> patch-generating mode right now, but maybe once I'm back in the office
>>>>> I can see how tricky it will be.
>>>>> 
>>>>>> This was with CAM, with pnetcdf being called by PIO, and
>>>>>> PIO has a compiler option to turn this on, -DPIO_LUSTRE_HINTS.
>>>>>> 
>>>>>> However, on Sandia's redsky (more-or-less identical to RedMesa), I just
>>>>>> tried these hints and I am also getting those same error messages you
>>>>>> are seeing. So please let me know if you get this resolved.
>>>>> I can't think of any other code paths that use locking, unless your
>>>>> system for some reason presents itself as nfs.
>>>>> 
>>>>> That's why Rajeev suggested prefixing the file name with "lustre:".
>>>>> Unfortunately, that won't help: it has only been since March of this year
>>>>> that (with community support) the Lustre driver in MPICH2 passed all the
>>>>> ROMIO tests, and now we need to get that into OpenMPI.
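>>>>> 
>>>>> For reference, that prefix mechanism just means handing ROMIO a file name
>>>>> that names the file-system driver explicitly, e.g. (hypothetical path):
>>>>> 
>>>>>     ncmpi_create(MPI_COMM_WORLD, "lustre:/scratch/testfile.nc",
>>>>>                  NC_CLOBBER, MPI_INFO_NULL, &ncid);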
>>>>> 
>>>>> ==rob
>> 
>> 
> 
> -- 
> John Michalakes
> National Renewable Energy Laboratory
> 1617 Cole Blvd.
> Golden, Colorado 80401
> Phone: 303-275-4297
> Fax: 303-275-4091
> John.Michalakes at nrel.gov
> 
> 


