[petsc-dev] What is this? "Optimize VecNorm_MPI. Use BLASdot_ instead of BLASnrm2_"
Jed Brown
jedbrown at mcs.anl.gov
Tue Jan 3 18:13:25 CST 2012
On Tue, Jan 3, 2012 at 18:09, Barry Smith <bsmith at mcs.anl.gov> wrote:
> It is not just the reference BLAS. It is the -lblas that comes on many
> Linux systems by default (that are not much more than compiled versions of
> the reference blas).
>
> Now you can say that you don't care about that situation, and those blas
> are stupid but it is a common situation and saying that is stupid doesn't
> help all those users who spend way too much time on norm.
>
Okay.
>
>
> > If the concern is just this routine and just on x86-64, I would be
> > inclined to write a simple vectorized implementation (probably using SSE
> > intrinsics) that still includes the stability stuff.
> >
> I don't think the stability stuff is needed for how norm() is used in
> PETSc (if it is important how come it is not important for the dot products
> also?). It is just there for pathological matrices the LINPACK guys knew
> about; I consider it just a fetish that got the LINPACK guys excited.
>
Well, the "trick" can't be done simply for dot product because it is not
monotone.
>
>
> > Whatever the case, I'm not a fan of replacing nrm2() with dot().
>
> Why not? If the dot is highly optimized it may be faster than your own
> hand coded blas thing.
>
1. It is not conforming code because Fortran disallows aliasing.
2. Who is going to test what is faster?