[mpich-discuss] ROMIO: Patch to run on Lustre
Pascal Deveze
Pascal.Deveze at bull.net
Tue Sep 21 08:57:47 CDT 2010
Hi Rob,
Rob Latham wrote:
> On Fri, Sep 17, 2010 at 01:17:22PM +0200, Pascal Deveze wrote:
>
>
>> After initializing lum->lmm_stripe_count to a "correct value", this
>> problem disappears.
>> I think this is a Lustre bug, but I propose integrating this patch:
>>
>
> thanks, pascal!
>
>
>> --- src/mpi/romio/adio/ad_lustre/ad_lustre_open.c 2010-09-17 12:50:58.000000000 +0200
>> +++ src/mpi/romio/adio/ad_lustre/ad_lustre_open.c.OLD 2010-05-25 20:59:13.000000000 +0200
>> @@ -59,9 +59,6 @@
>> MAX_LOV_UUID_COUNT * sizeof(struct lov_user_ost_data);
>> lum = (struct lov_user_md *)ADIOI_Malloc(lumlen);
>> lum->lmm_magic = LOV_USER_MAGIC;
>> - /* Initialize lum->lmm_stripe_count with a value, else ioctl() returns an error */
>> - /* This value must be greater than or equal to the existing lmm_stripe_count (bug in Lustre?) */
>> - lum->lmm_stripe_count = -1;
>> err = ioctl(fd->fd_sys, LL_IOC_LOV_GETSTRIPE, (void *)lum);
>> if (!err) {
>> value = (char *) ADIOI_Malloc((MPI_MAX_INFO_VAL+1)*sizeof(char));
>>
>
> What if, instead of explicitly initializing elements of the struct
> lov_user_md, we called ADIOI_Calloc(1, lumlen) to set
> everything in the struct to zero? Then if that struct changes in
> lustre-2.0 or lustre-5.0 or whatever, we'll still be covered. Or
> would zero also give that error about the value being too large?
>
I had not tested the value 0. It turns out that 0 is accepted.
So you are right, we can call ADIOI_Calloc(1, lumlen).
Here is the new patch (only one line changes):
--- src/mpi/romio/adio/ad_lustre/ad_lustre_open.c 2010-09-21 15:50:07.000000000 +0200
+++ src/mpi/romio/adio/ad_lustre/ad_lustre_open.c.OLD 2010-05-25 20:59:13.000000000 +0200
@@ -57,7 +57,7 @@
* then a list of 'lmm_objects' representing stripe */
lumlen = sizeof(struct lov_user_md) +
MAX_LOV_UUID_COUNT * sizeof(struct lov_user_ost_data);
- lum = (struct lov_user_md *)ADIOI_Calloc(1, lumlen);
+ lum = (struct lov_user_md *)ADIOI_Malloc(lumlen);
lum->lmm_magic = LOV_USER_MAGIC;
err = ioctl(fd->fd_sys, LL_IOC_LOV_GETSTRIPE, (void *)lum);
if (!err) {
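
To make the idea concrete outside of ROMIO, here is a minimal sketch of the zero-initialize-then-query pattern. Note that the diff above was taken from the new file to the .OLD copy, so the '-' line (ADIOI_Calloc) is the proposed code. Everything named demo_* or DEMO_* is a hypothetical stand-in: the real struct lov_user_md, LOV_USER_MAGIC and MAX_LOV_UUID_COUNT come from the Lustre headers, and demo_getstripe() only imitates the behaviour reported in this thread (a nonzero stripe count smaller than the existing one is rejected, 0 is accepted). It is not the actual Lustre logic.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins: the real struct lov_user_md, LOV_USER_MAGIC and
 * MAX_LOV_UUID_COUNT come from the Lustre headers and differ from these. */
#define DEMO_MAGIC       0x1234ABCDu
#define DEMO_MAX_OBJECTS 160

struct demo_ost_data {
    uint64_t object_id;
    uint32_t ost_index;
};

struct demo_user_md {
    uint32_t magic;
    uint16_t stripe_count;
    uint16_t stripe_offset;
    struct demo_ost_data objects[];   /* per-stripe entries follow the header */
};

/* Allocate the header plus the object list zero-filled, then set only the
 * magic. Any field added to the struct later also starts out as zero. */
struct demo_user_md *demo_alloc_lum(void)
{
    size_t lumlen = sizeof(struct demo_user_md)
                  + DEMO_MAX_OBJECTS * sizeof(struct demo_ost_data);
    struct demo_user_md *lum = calloc(1, lumlen);

    if (lum != NULL)
        lum->magic = DEMO_MAGIC;
    return lum;
}

/* Mock of the stripe query (an assumption, NOT Lustre code): it rejects a
 * nonzero stripe_count smaller than the existing one, accepts 0, and on
 * success fills in the existing count. Returns 0 on success, -1 on error. */
int demo_getstripe(struct demo_user_md *lum, uint16_t existing_count)
{
    assert(lum != NULL);
    if (lum->magic != DEMO_MAGIC)
        return -1;
    if (lum->stripe_count != 0 && lum->stripe_count < existing_count)
        return -1;
    lum->stripe_count = existing_count;
    return 0;
}
```

With a plain malloc() the stripe_count field would hold whatever happened to be in memory, so a query like the mock above could succeed or fail from one run to the next. Zero-filling the whole buffer removes that dependence without naming individual fields.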
Pascal