[petsc-dev] What is this? "Optimize VecNorm_MPI. Use BLASdot_ instead of BLASnrm2_"

Jack Poulson jack.poulson at gmail.com
Tue Jan 3 16:44:23 CST 2012


It is possible that the BLAS dot could be faster than the BLAS nrm2, but I
am skeptical. The reason it is possible at all is that dnrm2 does extra
work: its result on a vector u is more stable than the square root of the
inner product of u with itself via ddot, because it scales the intermediate
values of the sum of squares to make the computation more accurate:
http://www.netlib.org/blas/dnrm2.f

Thus, if you don't care about accuracy, then it is _possible_ that ddot
would be faster, but I doubt it, and it is likely a bad idea to give up
some stability.

Jack

On Tue, Jan 3, 2012 at 4:33 PM, Jed Brown <jedbrown at mcs.anl.gov> wrote:

> http://petsc.cs.iit.edu/petsc/petsc-dev/rev/a8a483b98169
>
> This baffles me. I can think of no good reason for this, which gives me
> the impression that we are optimizing for an implementation quirk. If you
> have evidence that the performance of BLAS dot() is better than nrm2()
> across platforms and implementations, then we are witnessing a major
> implementation failure and people need to be shamed.
>
> Aliasing is also *explicitly disallowed* by Fortran, so the result of
>
> BLASdot_(&bn,xx,&one,xx,&one);
>
> is not defined.
>