[MPICH] a potential bug

Rajeev Thakur thakur at mcs.anl.gov
Mon Oct 29 13:58:10 CDT 2007


Wei-keng,
         Thanks for pointing this out. I have fixed it as you suggested in
the last paragraph. Attached is the new ad_aggregate.c. Can you test it out?

Thanks,
Rajeev
  

> -----Original Message-----
> From: owner-mpich-discuss at mcs.anl.gov 
> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Wei-keng Liao
> Sent: Friday, October 26, 2007 10:51 PM
> To: mpich-discuss at mcs.anl.gov
> Subject: [MPICH] a potential bug
> 
> 
> In file mpich2-1.0.6/src/mpi/romio/adio/common/ad_aggregate.c, function
> ADIOI_Calc_others_req(), lines 403 to 439, two arrays of MPI_Request are
> first allocated with ADIOI_Malloc(), used in MPI_Isend()/MPI_Irecv(), and
> then in the two following MPI_Waitall()s.
> 
> However, the two arrays are not initialized to MPI_REQUEST_NULL, yet all
> elements of both arrays are passed to the MPI_Waitall() calls on lines 438
> and 439. This is dangerous, since ADIOI_Malloc() does not guarantee that
> the buffer contents match MPI_REQUEST_NULL, which is defined as
>    #define MPI_REQUEST_NULL   ((MPI_Request)0x2c000000)
> in mpich2-1.0.6/src/include/mpi.h.in.
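> To sketch the hazard with made-up names (nprocs, count[], buf[], tag,
> comm, and statuses are placeholders here, not the actual ROMIO
> variables):
> 
>     MPI_Request *reqs = (MPI_Request *)
>         ADIOI_Malloc(nprocs * sizeof(MPI_Request));
>     /* contents of reqs[] are indeterminate after the malloc */
> 
>     for (i = 0; i < nprocs; i++) {
>         if (count[i] > 0)   /* a request is posted only for some i */
>             MPI_Irecv(buf[i], count[i], MPI_BYTE, i, tag, comm, &reqs[i]);
>     }
> 
>     /* waits on every slot, including those never filled by MPI_Irecv(),
>        which still hold garbage rather than MPI_REQUEST_NULL */
>     MPI_Waitall(nprocs, reqs, statuses);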
> 
> I am running MPICH2 1.0.2 on a Cray and had a run fail with a message
> pointing to this area of the code. I can see this part has not been
> changed since 1.0.2. Please confirm this potential bug.
> 
> Either initializing the arrays to MPI_REQUEST_NULL or using ADIOI_Calloc()
> can fix the problem.
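> A minimal sketch of the first suggestion, with the same made-up names as
> above:
> 
>     reqs = (MPI_Request *) ADIOI_Malloc(nprocs * sizeof(MPI_Request));
>     for (i = 0; i < nprocs; i++)
>         reqs[i] = MPI_REQUEST_NULL;   /* unfilled slots are now legal
>                                          arguments to MPI_Waitall() */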
> 
> Another, cleaner solution is to combine the two arrays into one, use the
> variable j as the counter for both loops (i.e., remove line 421 so that j
> is not reset to 0), and call a single MPI_Waitall() with j as the number
> of requests and the combined request array.
> 
> Wei-keng
> 
> 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ad_aggregate.c
Type: application/octet-stream
Size: 15092 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/mpich-discuss/attachments/20071029/8cc299a4/attachment.obj>

