[mpich-discuss] IMB 3.1 with TOL 0 crashes on Allreduce

Calin Iaru calin at dolphinics.com
Sat May 31 05:27:14 CDT 2008


In the second case, where a temporary variable is used, the compiler 
will generate an FLD instruction. From the Intel Basic Architecture manual:
"
The FLD (load floating point) instruction pushes a floating-point
operand from memory onto the top of the x87 FPU data-register stack.
If the operand is in single-precision or double-precision
floating-point format, it is automatically converted to double
extended-precision floating-point format. This instruction can also
be used to push the value in a selected x87 FPU data register onto
the top of the register stack.
"
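
To illustrate why the width at which intermediate values are held can
matter, here is a minimal, portable sketch. It is not the IMB code and
I am not claiming it is the cause of the failure; the function names
sum_rounded_each_step and sum_extended are made up for this example,
and it assumes that long double is wider than double on the target
(true for x87 extended precision, not true for every compiler).

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Hypothetical helper: accumulate at double width. The volatile store
   forces every partial sum to be rounded to double before the next add,
   similar to a value being written back to memory each iteration. */
double sum_rounded_each_step(const double *v, size_t n)
{
    volatile double acc = 0.0;
    size_t i;

    assert(v != NULL || n == 0);
    for (i = 0; i < n; i++)
        acc = acc + v[i];
    return acc;
}

/* Hypothetical helper: accumulate in long double and round to double
   only once at the end, similar to a sum kept on the x87 register
   stack in double extended-precision format. */
double sum_extended(const double *v, size_t n)
{
    long double acc = 0.0L;
    size_t i;

    assert(v != NULL || n == 0);
    for (i = 0; i < n; i++)
        acc += v[i];
    return (double)acc;
}
```

With the input { 1.0, DBL_EPSILON / 2, DBL_EPSILON / 2 }, the first
function returns exactly 1.0, because each half-ulp addend is rounded
away. Where long double has more significand bits than double, the
second returns 1.0 + DBL_EPSILON, so a bit-exact comparison (tolerance
0) between the two sums fails even though both are "correct".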


>
> I can do all of the above for a third time, or:
>
> Inside IMB_chk_dadd, the following assignment takes place:
>
>   for(rank = rank0; rank<= rank1; rank++)
>   {
>       for(i=0; i<Locsize/asize; i++)
>           ((assign_type*)AUX)[i] += BUF_VALUE(rank,buf_pos/asize+i);
>   }
>
> When this code is compiled, the test reports a data corruption.
>
> If I modify the sequence as:
>   for(rank = rank0; rank<= rank1; rank++)
>   {
>       for(i=0; i<Locsize/asize; i++) {
>           assign_type x = BUF_VALUE(rank,buf_pos/asize+i);
>           ((assign_type*)AUX)[i] += x;
>       }
>   }
> then the code runs successfully.
>
> Unless I made a mistake somewhere (which can still happen, however
> careful I try to be), this looks like a compiler issue. I plan to
> compile with /FAsc and continue investigating.
>
>



