[MPICH] a potential bug
Rajeev Thakur
thakur at mcs.anl.gov
Mon Oct 29 13:58:10 CDT 2007
Wei-keng,
Thanks for pointing this out. I have fixed it as you suggested in
the last paragraph. Attached is the new ad_aggregate.c. Can you test it out?
Thanks,
Rajeev
> -----Original Message-----
> From: owner-mpich-discuss at mcs.anl.gov
> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Wei-keng Liao
> Sent: Friday, October 26, 2007 10:51 PM
> To: mpich-discuss at mcs.anl.gov
> Subject: [MPICH] a potential bug
>
>
> In file mpich2-1.0.6/src/mpi/romio/adio/common/ad_aggregate.c,
> function ADIOI_Calc_others_req(), lines 403 to 439, two arrays of
> MPI_Request are first allocated with ADIOI_Malloc(), used in
> MPI_Isend()/MPI_Irecv(), and then in the two following
> MPI_Waitall()s.
>
> However, the two arrays are not initialized to MPI_REQUEST_NULL, yet
> all elements of the arrays are used in the MPI_Waitall() calls in
> lines 438 and 439. This is dangerous since ADIOI_Malloc() does not
> initialize the buffer it returns, so unused elements hold arbitrary
> values rather than MPI_REQUEST_NULL, which
> mpich2-1.0.6/src/include/mpi.h.in defines as:
> #define MPI_REQUEST_NULL ((MPI_Request)0x2c000000)
>
> I am running MPICH2-1.0.2 on a Cray and had a run fail with a message
> indicating a location around this area. I can see that this part has
> not been changed since 1.0.2. Please confirm this potential bug.
>
> Either initializing to MPI_REQUEST_NULL or using ADIOI_Calloc() can
> fix the problem.
>
> Another clever solution is to combine the two arrays into one; use
> variable j as a counter for both loops (i.e. remove line 421, so that
> j is not reset to 0); and use one MPI_Waitall() with j as the number
> of requests and the combined request array.
>
> Wei-keng
>
>
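The combined-array suggestion in the last quoted paragraph can be sketched roughly as follows. This is only a simplified model of the pattern, not the code in ad_aggregate.c: Request, REQUEST_NULL, post_request(), wait_all() and exchange() are hypothetical stand-ins (for MPI_Request, MPI_REQUEST_NULL, MPI_Irecv()/MPI_Isend() and MPI_Waitall()) so that the sketch compiles without an MPI library, and it assumes one request per peer with a non-zero count.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-ins for the MPI types (assumption, not the real MPI API).
   REQUEST_NULL reuses the value quoted above for MPI_REQUEST_NULL. */
typedef unsigned int Request;
#define REQUEST_NULL ((Request)0x2c000000)

/* Hypothetical stand-in for MPI_Irecv()/MPI_Isend(): returns a handle
   whose top four bits mark it as a live request. */
static Request post_request(int peer)
{
    return (Request)(0x10000000u + (unsigned int)peer);
}

/* Hypothetical stand-in for MPI_Waitall(): returns 0 if each of the
   first `count` entries is either null or a live handle, -1 otherwise. */
static int wait_all(int count, Request *reqs)
{
    for (int i = 0; i < count; i++) {
        if (reqs[i] == REQUEST_NULL)
            continue;                      /* null handles are skipped */
        if ((reqs[i] & 0xf0000000u) != 0x10000000u)
            return -1;                     /* not a handle we posted */
        reqs[i] = REQUEST_NULL;            /* completed request */
    }
    return 0;
}

/* Posts one receive per peer with recv_count[i] != 0 and one send per
   peer with send_count[i] != 0, all into a single request array.
   Returns the number of requests waited on, or -1 on failure. */
int exchange(int nprocs, const int *recv_count, const int *send_count)
{
    assert(nprocs >= 0);

    /* Room for every possible receive and send; +1 avoids malloc(0). */
    Request *requests = malloc((2 * (size_t)nprocs + 1) * sizeof *requests);
    if (requests == NULL)
        return -1;

    int j = 0;
    for (int i = 0; i < nprocs; i++)
        if (recv_count[i])
            requests[j++] = post_request(i);

    /* j is deliberately NOT reset to 0 between the two loops. */
    for (int i = 0; i < nprocs; i++)
        if (send_count[i])
            requests[j++] = post_request(i);

    /* Wait only on the j entries that were actually written. */
    int rc = wait_all(j, requests);
    free(requests);
    return rc == 0 ? j : -1;
}
```

Because j counts exactly the entries that were written, no unused slot reaches the wait call, so the sketch does not depend on how the buffer was filled at allocation time. For example, with four peers, two non-zero receive counts and two non-zero send counts, exchange() waits on four requests.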
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ad_aggregate.c
Type: application/octet-stream
Size: 15092 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/mpich-discuss/attachments/20071029/8cc299a4/attachment.obj>