[mpich-discuss] IMB 3.1 with TOL 0 crashes on Allreduce
Calin Iaru
calin at dolphinics.com
Sat May 31 05:27:14 CDT 2008
In the second case, where a temporary variable is used, the compiler
generates an FLD instruction. From the Intel Software Developer's
Manual, Volume 1 (Basic Architecture):
"
The FLD (load floating point) instruction pushes a floating-point
operand from memory onto the top of the x87 FPU data-register stack.
If the operand is in single-precision or double-precision
floating-point format, it is automatically converted to double
extended-precision floating-point format. This instruction can also
be used to push the value in a selected x87 FPU data register onto
the top of the register stack.
"
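
Whether intermediate results actually stay in the wider x87 format
depends on the compiler and its flags. As a rough check, assuming a
compiler that provides the C99 FLT_EVAL_METHOD macro in <float.h>
(older compilers may not), something like the following reports the
evaluation mode. The function name describe_eval_method is made up for
this illustration.

```c
#include <float.h>

/* Describe how the compiler evaluates floating-point expressions,
   based on the C99 FLT_EVAL_METHOD macro. Hypothetical helper, not
   part of IMB or MPICH. */
static const char *describe_eval_method(void)
{
#ifdef FLT_EVAL_METHOD
    switch (FLT_EVAL_METHOD) {
    case 0:  return "operations use the precision of their own type";
    case 1:  return "float and double operations use double precision";
    case 2:  return "all operations use long double (common for x87 code)";
    default: return "implementation-defined or indeterminable";
    }
#else
    return "FLT_EVAL_METHOD is not defined by this compiler";
#endif
}
```

A value of 2 would be consistent with operands being widened to double
extended precision on load, as the manual describes for FLD.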
>
> I can do all of the above for a third time, or:
>
> Inside IMB_chk_dadd, the following assignment takes place:
>
>     for(rank = rank0; rank<= rank1; rank++)
>     {
>         for(i=0; i<Locsize/asize; i++)
>             ((assign_type*)AUX)[i] += BUF_VALUE(rank,buf_pos/asize+i);
>     }
>
> When this code is compiled, the test reports data corruption.
>
> If I modify the sequence as:
>     for(rank = rank0; rank<= rank1; rank++)
>     {
>         for(i=0; i<Locsize/asize; i++) {
>             assign_type x = BUF_VALUE(rank,buf_pos/asize+i);
>             ((assign_type*)AUX)[i] += x;
>         }
>     }
> then the code runs successfully.
>
> Provided that this is not something I did wrong (which could still
> happen, however careful I was), this looks like a compiler issue.
> I plan to compile with /FAsc and continue investigating.
>
>
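
To compare the two variants outside IMB, a minimal sketch could look
like the following. Two assumptions here are not taken from the IMB
source: assign_type is float, and buf_value is a hypothetical stand-in
for the BUF_VALUE macro that yields a double. Under these assumptions
the two loops are not equivalent in general: the direct form adds in at
least double precision and rounds once on the store, while the
temporary form rounds the operand to float before adding.

```c
typedef float assign_type;   /* assumption: the real type may differ */

/* Hypothetical stand-in for BUF_VALUE; the real macro may differ. */
static double buf_value(int rank, int i)
{
    return 0.1 * (rank + 1) + (double)i;
}

/* First variant: accumulate the expression directly. */
static void sum_direct(assign_type *aux, int n, int rank0, int rank1)
{
    for (int rank = rank0; rank <= rank1; rank++)
        for (int i = 0; i < n; i++)
            aux[i] += buf_value(rank, i);
}

/* Second variant: round the operand to assign_type first. */
static void sum_via_temp(assign_type *aux, int n, int rank0, int rank1)
{
    for (int rank = rank0; rank <= rank1; rank++) {
        for (int i = 0; i < n; i++) {
            assign_type x = buf_value(rank, i);
            aux[i] += x;
        }
    }
}

/* Largest absolute element-wise difference between two arrays. */
static double max_abs_diff(const assign_type *a, const assign_type *b, int n)
{
    double worst = 0.0;
    for (int i = 0; i < n; i++) {
        double d = (double)a[i] - (double)b[i];
        if (d < 0.0)
            d = -d;
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

Calling both functions on zero-initialized arrays and passing the
results to max_abs_diff shows whether they differ for a given compiler
and flag set. Any difference should be on the order of a few float
rounding steps, which matters only if the check compares with a
tolerance of 0.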