On Sun, Mar 18, 2012 at 11:06 AM, Jed Brown <span dir="ltr"><<a href="mailto:jedbrown@mcs.anl.gov">jedbrown@mcs.anl.gov</a>></span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

We currently have Vec{Norm,Dot,TDot,MDot,MTDot}{Begin,End}() allowing the reductions to be aggregated, but the reduction itself is only triggered lazily and always uses MPI_Allreduce(). MPICH2 has implemented MPI_Iallreduce (named MPIX_Iallreduce() until MPI-3 is finalized). I suggest adding PetscCommSplitReductionBegin(MPI_Comm), which would start any currently queued split reductions. A latency-tolerant algorithm could then be written as follows:<div>


<br></div><div>VecNormBegin(...,&nrm);</div><div>VecDotBegin(...,&dot);</div><div>PetscCommSplitReductionBegin(comm);</div><div>MatMult(...); // Or residual evaluation, etc.</div><div>VecNormEnd(...,&nrm);</div>


<div>VecDotEnd(...,&dot);</div><div><br></div><div><br></div><div>PetscCommSplitReductionBegin() would start the split reduction, leaving behind an MPI_Request that would be waited on by the first XXEnd().</div><div>

<br>

</div><div>If MPIX_Iallreduce() is not available, the current semantics would be used. If you don't call PetscCommSplitReductionBegin(), the present semantics would also be used.</div><div><br></div><div>Is this a good API?</div>

</blockquote><div><br></div><div>This sounds fine to me. I think the acid test is to rewrite Bill's new CG in it.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div>I'd also like to propose a design change in which the PetscSplitReduction is placed in _p_Vec when it is obtained from the MPI_Comm. The MPI_Comm would continue to hold a reference until PetscCommSplitReductionBegin(), at which point it drops the reference, so the only remaining references are those held by the Vecs participating in the split reduction. That would enable a new split reduction to be started by other objects using the same communicator before the results of the previous split reduction have been collected by all participants.</div>


</blockquote></div><br>I do not understand this. What call creates the PetscSplitReduction struct? Is it the Begins? So after Begin, it is present in the Vec and the Comm. Then when you call PetscCommSplitReductionBegin, it leaves the comm, and resides only in the Vec along with a Request. Then the End destroys it? So it lives in the Comm for aggregation purposes? That sounds fine to me.<div><br></div><div>   Matt<br clear="all"><div><br></div>-- <br>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>

-- Norbert Wiener<br>

</div>
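<div><br></div><div>The lifecycle asked about above can be sketched in plain C as a single-process toy model, assuming the answers are "yes": the first Begin creates the reduction on the Comm, each Begin also stores a reference in the Vec, PetscCommSplitReductionBegin() makes the Comm drop its reference, and the last End destroys the object. Every name below (ToySplitReduction, ToyComm, ToyVec, ToyVecReduceBegin, and so on) is hypothetical and not part of PETSc. No MPI is involved, so the "reduced" value is simply the queued local value.</div>

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for illustration only; none of these are PETSc types or functions. */
enum { TOY_MAX_QUEUED = 8 };

typedef struct {
  int    refct;                /* references held by the comm and by participating vecs */
  int    nqueued;              /* number of contributions queued by the Begins */
  double vals[TOY_MAX_QUEUED]; /* queued local values, read back as results */
  int    started;              /* set once the reduction has been posted */
  int    completed;            /* set once the "request" has been waited on */
} ToySplitReduction;

typedef struct { ToySplitReduction *sr; } ToyComm;
typedef struct { ToyComm *comm; ToySplitReduction *sr; int slot; } ToyVec;

static int toy_live = 0; /* number of ToySplitReduction objects currently allocated */

/* Drop one reference; free the object when the last reference goes away. */
static void ToyDeref(ToySplitReduction **sr)
{
  if (*sr && --(*sr)->refct == 0) { free(*sr); toy_live--; }
  *sr = NULL;
}

/* Models XXBegin(): get (or create) the reduction from the comm, queue a
   value, and keep a second reference in the vec. */
static void ToyVecReduceBegin(ToyVec *v, double local)
{
  ToyComm *c = v->comm;
  if (!c->sr) {
    c->sr = calloc(1, sizeof(*c->sr));
    assert(c->sr);
    c->sr->refct = 1; /* the comm's reference */
    toy_live++;
  }
  assert(!v->sr && c->sr->nqueued < TOY_MAX_QUEUED);
  v->sr = c->sr;
  v->sr->refct++;     /* the vec's reference */
  v->slot = v->sr->nqueued++;
  v->sr->vals[v->slot] = local;
}

/* Models PetscCommSplitReductionBegin(): post the reduction, then the comm
   drops its reference so later Begins start a fresh reduction. */
static void ToyCommSplitReductionBegin(ToyComm *c)
{
  if (!c->sr) return;
  c->sr->started = 1; /* a real version would post a nonblocking allreduce here */
  ToyDeref(&c->sr);
}

/* Models XXEnd(): start lazily if nobody did, let the first End "wait",
   read this vec's slot, and drop the vec's reference. */
static double ToyVecReduceEnd(ToyVec *v)
{
  ToySplitReduction *sr = v->sr;
  double result;
  assert(sr);
  if (!sr->started) ToyCommSplitReductionBegin(v->comm); /* lazy fallback */
  if (!sr->completed) sr->completed = 1; /* a real version would wait on the request here */
  result = sr->vals[v->slot];
  ToyDeref(&v->sr);
  return result;
}

/* Two overlapping reductions on one comm; returns the number still allocated. */
int ToyScenario(void)
{
  ToyComm comm = {NULL};
  ToyVec  x = {&comm, NULL, 0}, y = {&comm, NULL, 0}, z = {&comm, NULL, 0};
  ToyVecReduceBegin(&x, 3.0);
  ToyVecReduceBegin(&y, 4.0);
  ToyCommSplitReductionBegin(&comm); /* comm lets go; x and y keep it alive */
  ToyVecReduceBegin(&z, 5.0);        /* new reduction before x and y have collected */
  (void)ToyVecReduceEnd(&x);
  (void)ToyVecReduceEnd(&y);         /* last reference: first reduction is freed */
  (void)ToyVecReduceEnd(&z);         /* never explicitly started: lazy path */
  return toy_live;
}
```

<div>Calling ToyScenario() from a main() walks through the case the design change is meant to allow: z begins a second reduction on the same comm while x and y still hold the first one. In this model the Comm's reference exists only so that successive Begins can find the same object for aggregation, which matches the reading given in the reply.</div>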