File locking failed in ADIOI_Set_lock
John Michalakes
john at michalakes.us
Mon Sep 27 15:26:41 CDT 2010
It's not dumping a trace (the code is calling MPI_ABORT at this point
and so it's dying cleanly). I'll see what I can get for you. -John
On 9/27/2010 2:19 PM, Rob Ross wrote:
> Ugh. Any chance you could get a stack dump from where this is
> happening? -- Rob
>
> On Sep 27, 2010, at 3:17 PM, John Michalakes wrote:
>
>> Thanks, all, for your responses and suggestions. I have
>> re-engineered the code to use only collective I/O, but I am still
>> seeing the error I first wrote about:
>>
>> File locking failed in ADIOI_Set_lock(fd 16,cmd F_SETLKW/7,type
>> F_WRLCK/1,whence 0) with return value FFFFFFFF and errno 26.
>> If the file system is NFS, you need to use NFS version 3, ensure that
>> the lockd daemon is running on all the machines, and mount the
>> directory with the 'noac' option (no attribute caching).
>> ADIOI_Set_lock:: Function not implemented
>> ADIOI_Set_lock:offset 65492, length 1980
>>
>> So this probably isn't an effect of using the independent API after
>> all (drat!). I will try some of the other suggestions now -- the
>> first of which will be to upgrade to the latest pNetCDF on this
>> machine. I'm not sure I'll be able to switch over to MPICH2/ROMIO,
>> but I'll look into that as well.
>>
>> Thanks,
>>
>> John
>>
>>
>> On 9/24/2010 9:53 AM, Wei-keng Liao wrote:
>>> Hi, John,
>>>
>>> Rob is right: turning off data sieving just avoids the error
>>> messages, and it may significantly slow down performance (if your
>>> I/O is non-contiguous and uses only non-collective calls).
>>>
>>> On Lustre, you need collective I/O to get high I/O bandwidth. Even
>>> if the write request from each process is already large and
>>> contiguous, non-collective writes will still give you poor results.
>>> This is not the case on other file systems, e.g. GPFS.
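>>>
>>> For illustration, the two flavors in pnetcdf look like this (a
>>> minimal sketch, not taken from WRF or PIO; ncid, varid, start,
>>> count, and buf are hypothetical names from the usual
>>> ncmpi_create/ncmpi_def_var setup):
>>>
>>>   /* independent (non-collective) write: each process acts alone,
>>>    * and ROMIO may fall back on data sieving plus file locks */
>>>   ncmpi_begin_indep_data(ncid);
>>>   ncmpi_put_vara_float(ncid, varid, start, count, buf);
>>>   ncmpi_end_indep_data(ncid);
>>>
>>>   /* collective write: every process calls it together, letting
>>>    * MPI-IO aggregate the requests into large contiguous accesses */
>>>   ncmpi_put_vara_float_all(ncid, varid, start, count, buf);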
>>>
>>> Is there an option to turn on collective I/O in WRF and PIO?
>>>
>>> If non-collective I/O is the only option (due to the irregular
>>> data distribution), then non-blocking I/O is another solution. In
>>> pnetcdf 1.2.0, non-blocking I/O can aggregate multiple
>>> non-collective requests into a single collective one. This approach
>>> does require changes to the pnetcdf calls in the I/O library used
>>> by WRF and PIO, but the changes should be very simple.
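>>>
>>> A rough sketch of that change (names are hypothetical; assumes
>>> pnetcdf 1.2.0 or newer):
>>>
>>>   int req[2], st[2];
>>>   /* post the writes without blocking; nothing hits the file yet */
>>>   ncmpi_iput_vara_float(ncid, varid1, start1, count1, buf1, &req[0]);
>>>   ncmpi_iput_vara_float(ncid, varid2, start2, count2, buf2, &req[1]);
>>>   /* flush all pending requests as one collective operation */
>>>   ncmpi_wait_all(ncid, 2, req, st);
>>>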
>>> In general, I would suggest that Lustre users seek every
>>> opportunity to use collective I/O.
>>>
>>> Wei-keng
>>>
>>> On Sep 24, 2010, at 9:44 AM, Rob Latham wrote:
>>>
>>>> On Fri, Sep 24, 2010 at 07:49:06AM -0600, Mark Taylor wrote:
>>>>> Hi John,
>>>>>
>>>>> I had a very similar issue a while ago on several older Lustre
>>>>> filesystems at Sandia, and I can confirm that setting those hints
>>>>> did allow the code to run
>>>> If you turn off data sieving, then there will be no more lock
>>>> calls. Depending on how your application partitions the arrays,
>>>> that could be fine, or it could result in a billion 8-byte
>>>> operations.
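>>>>
>>>> Turning it off is just a pair of ROMIO hints on the info object
>>>> passed at open time -- something like this sketch (the file name
>>>> is made up; pnetcdf forwards the info to MPI_File_open):
>>>>
>>>>   MPI_Info info;
>>>>   MPI_Info_create(&info);
>>>>   /* disable data sieving, the source of the fcntl() lock calls */
>>>>   MPI_Info_set(info, "romio_ds_read",  "disable");
>>>>   MPI_Info_set(info, "romio_ds_write", "disable");
>>>>   ncmpi_create(MPI_COMM_WORLD, "output.nc", NC_CLOBBER, info, &ncid);
>>>>   MPI_Info_free(&info);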
>>>>
>>>>> (but I could never get pnetcdf to be any faster
>>>>> than netcdf).
>>>> Unsurprising, honestly. If you are dealing with Lustre, then you must
>>>> both use an updated ROMIO and use collective I/O.
>>>>
>>>> Here is the current list of MPI-IO implementations that work well with
>>>> Lustre:
>>>>
>>>> - Cray MPT 3.2 or newer
>>>> - MPICH2-1.3.0a1 or newer
>>>> - and that's it.
>>>>
>>>> I think the OpenMPI community is working on a re-sync with MPICH2
>>>> ROMIO. I also think we can stitch together a patch against OpenMPI
>>>> if you really need the improved Lustre driver. I'm not really in
>>>> patch-generating mode right now, but maybe once I'm back in the
>>>> office I can see how tricky it will be.
>>>>
>>>>> This was with CAM, with pnetcdf being called by PIO, and
>>>>> PIO has a compiler option to turn this on, -DPIO_LUSTRE_HINTS.
>>>>>
>>>>> However, on Sandia's redsky (more-or-less identical to RedMesa), I
>>>>> just
>>>>> tried these hints and I am also getting those same error messages you
>>>>> are seeing. So please let me know if you get this resolved.
>>>> I can't think of any other code paths that use locking, unless
>>>> your system for some reason presents itself as NFS.
>>>>
>>>> That's why Rajeev suggested prefixing the file name with
>>>> "lustre:". Unfortunately, that won't help: it has only been since
>>>> March of this year that (with community support) the Lustre driver
>>>> in MPICH2 passed all the ROMIO tests, and now we need to get that
>>>> into OpenMPI.
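>>>>
>>>> (For reference, the prefix trick is just prepending "lustre:" to
>>>> the file name so ROMIO selects its Lustre driver explicitly; the
>>>> path below is hypothetical:
>>>>
>>>>   ncmpi_create(MPI_COMM_WORLD, "lustre:/scratch/run/out.nc",
>>>>                NC_CLOBBER, info, &ncid);
>>>>
>>>> but with the old driver underneath, it doesn't avoid the locking.)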
>>>>
>>>> ==rob
>
>
--
John Michalakes
National Renewable Energy Laboratory
1617 Cole Blvd.
Golden, Colorado 80401
Phone: 303-275-4297
Fax: 303-275-4091
John.Michalakes at nrel.gov