File locking failed in ADIOI_Set_lock
Rob Ross
rross at mcs.anl.gov
Mon Sep 27 15:19:40 CDT 2010
Ugh. Any chance you could get a stack dump from where this is
happening? -- Rob
On Sep 27, 2010, at 3:17 PM, John Michalakes wrote:
> Thanks all for your responses and your suggestions. I have
> re-engineered the code to use only collective I/O but I am still seeing
> the error I first wrote about:
>
> File locking failed in ADIOI_Set_lock(fd 16,cmd F_SETLKW/7,type
> F_WRLCK/1,whence 0) with return value FFFFFFFF and errno 26.
> If the file system is NFS, you need to use NFS version 3, ensure
> that the lockd daemon is running on all the machines, and mount the
> directory with the 'noac' option (no attribute caching).
> ADIOI_Set_lock:: Function not implemented
> ADIOI_Set_lock:offset 65492, length 1980
>
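A note on reading the message above: the return value FFFFFFFF suggests the numeric fields are printed in hexadecimal, so "errno 26" would be 38 decimal rather than 26. On Linux/x86, 38 is ENOSYS, whose text is "Function not implemented", which agrees with the strerror line in the log. The sketch below is only an illustration of that decoding; the helper names are made up, and the hexadecimal assumption is inferred from the message, not confirmed from the ROMIO source.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: interpret a numeric field copied from the
 * ADIOI_Set_lock message as hexadecimal (an assumption, see above). */
static long decode_hex_field(const char *text)
{
    return strtol(text, NULL, 16);
}

/* Hypothetical helper: print the decimal value of an errno field and
 * the message the local C library associates with it. */
static void explain_errno_field(const char *text)
{
    long code = decode_hex_field(text);
    printf("errno field \"%s\" = %ld decimal: %s\n",
           text, code, strerror((int)code));
}
```

Calling `explain_errno_field("26")` on the machine that produced the log shows which error the number maps to there; errno numbering is platform-specific, so the mapping should be checked locally.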
> So this probably isn't an effect from using the independent API
> after all (drat!). I will try some of the other suggestions now --
> first of which will be to upgrade to the latest pNetCDF on this
> machine. I'm not sure I'll be able to switch over to MPICH2/romio
> but I'll look into that as well.
>
> Thanks,
>
> John
>
>
> On 9/24/2010 9:53 AM, Wei-keng Liao wrote:
>> Hi, John,
>>
>> Rob is right: turning off data sieving just avoids the error messages
>> and may significantly slow down performance (if your I/O is
>> non-contiguous and uses only non-collective calls).
>>
>> On Lustre, you need collective I/O to get high I/O bandwidth. Even if
>> the write request from each process is already large and contiguous,
>> non-collective writes will still give you poor results. This is not the
>> case on other file systems, e.g. GPFS.
>>
>> Is there an option to turn on collective I/O in WRF and PIO?
>>
>> If non-collective I/O is the only option (due to the irregular data
>> distribution), then non-blocking I/O is another solution. In
>> pnetcdf 1.2.0, non-blocking I/O can aggregate multiple
>> non-collective requests into a single collective one. However, this
>> approach requires changes to the pnetcdf calls in the I/O library
>> used by WRF and PIO. The changes should be very simple, though.
>> In general, I would suggest that Lustre users take every opportunity
>> to use collective I/O.
>>
>> Wei-keng
>>
>> On Sep 24, 2010, at 9:44 AM, Rob Latham wrote:
>>
>>> On Fri, Sep 24, 2010 at 07:49:06AM -0600, Mark Taylor wrote:
>>>> Hi John,
>>>>
>>>> I had a very similar issue a while ago on several older Lustre
>>>> filesystems at Sandia, and I can confirm that setting those hints did
>>>> allow the code to run
>>> If you turn off data sieving then there will be no more lock calls.
>>> Depending on how your application partitions the arrays, that could be
>>> fine, or it could result in a billion 8-byte operations.
>>>
>>>> (but I could never get pnetcdf to be any faster
>>>> than netcdf).
>>> Unsurprising, honestly. If you are dealing with Lustre, then you must
>>> both use an updated ROMIO and use collective I/O.
>>>
>>> Here is the current list of MPI-IO implementations that work well with
>>> Lustre:
>>>
>>> - Cray MPT 3.2 or newer
>>> - MPICH2-1.3.0a1 or newer
>>> - and that's it.
>>>
>>> I think the OpenMPI community is working on a re-sync with MPICH2
>>> ROMIO. I also think we can stitch together a patch against OpenMPI if
>>> you really need the improved Lustre driver. I'm not really in
>>> patch-generating mode right now, but maybe once I'm back in the office
>>> I can see how tricky it will be.
>>>
>>>> This was with CAM, with pnetcdf being called by PIO, and
>>>> PIO has a compiler option to turn this on, -DPIO_LUSTRE_HINTS.
>>>>
>>>> However, on Sandia's redsky (more-or-less identical to RedMesa), I
>>>> just tried these hints and I am also getting those same error
>>>> messages you are seeing. So please let me know if you get this
>>>> resolved.
>>> I can't think of any other code paths that use locking, unless your
>>> system for some reason presents itself as NFS.
>>>
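One way to check, outside of MPI, whether a given directory supports the kind of byte-range lock the failing call requests is a small fcntl probe. The following is a minimal sketch under stated assumptions: `probe_write_lock` is a hypothetical helper, not part of ROMIO or pnetcdf, and it uses the non-blocking F_SETLK rather than the F_SETLKW seen in the error so that the probe cannot hang.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: try to take, then release, a write lock on the
 * byte range [offset, offset + length) of the file at `path`.
 * Returns 0 on success, or the errno value from the failing call. */
static int probe_write_lock(const char *path, off_t offset, off_t length)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return errno;

    struct flock lock;
    memset(&lock, 0, sizeof lock);
    lock.l_type = F_WRLCK;     /* same lock type as in the error message */
    lock.l_whence = SEEK_SET;  /* "whence 0" in the error message */
    lock.l_start = offset;
    lock.l_len = length;

    int err = 0;
    if (fcntl(fd, F_SETLK, &lock) == -1) {
        err = errno;           /* e.g. ENOSYS if locking is unsupported */
    } else {
        lock.l_type = F_UNLCK; /* release the lock before closing */
        fcntl(fd, F_SETLK, &lock);
    }
    close(fd);
    return err;
}
```

Running it against a scratch file in the same directory as the output file, for example `probe_write_lock("scratch.tmp", 65492, 1980)` with the offset and length from the log, and passing a nonzero result to `strerror` shows whether the file system itself rejects the lock request.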
>>> That's why Rajeev suggested prefixing with "lustre:". Unfortunately,
>>> that won't help: it has only been since March of this year that (with
>>> community support) the Lustre driver in MPICH2 passed all the ROMIO
>>> tests, and now we need to get that into OpenMPI.
>>>
>>> ==rob