[petsc-dev] Hybrid MPI/OpenMP reflections
Barry Smith
bsmith at mcs.anl.gov
Fri Aug 9 17:09:25 CDT 2013
On Aug 9, 2013, at 4:42 PM, Jed Brown <jedbrown at mcs.anl.gov> wrote:
> Barry Smith <bsmith at mcs.anl.gov> writes:
>> Generally there should be some non zeros in each row, we could
>> probably just use 2.0*a->nz - m
>
> When using cprow, this should always be exact, right?
I believe so.
>
> If we are not using cprow because there weren't enough empty rows, but
> there are still some empty rows, it won't be exact, but it's likely not
> far off for typical matrices. It would be a shame, however, that we
> could now compute a negative number of flops. Should the check for
> cprow store the number of zero rows?
We don't actually check for compressed rows by default, presumably because it takes a little time to scan through the rows (they are checked for the off-diagonal part of Mat_MPIXXAIJ()). But note that MatAssemblyEnd_SeqXXAIJ() always has the loop:
  /* reset ilen and imax for each row */
  for (i=0; i<mbs; i++) {
    ailen[i] = imax[i] = ai[i+1] - ai[i];
  }
so we could capture the count of empty rows at that point. We could then always use that value to decide when to compress rows and throw away the optional checking for compressed rows, and also use it to get an accurate flop count for MatMult_SeqXXAIJ().
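
A minimal sketch of that idea, under stated assumptions: the helper names reset_row_lengths() and matmult_flops() and the nonzero_rows counter are hypothetical, not existing PETSc symbols, and plain int/size_t stand in for PetscInt. It counts the rows that hold at least one nonzero inside the same loop that resets ilen and imax, then charges 2k-1 flops to each row with k > 0 nonzeros and nothing to an empty row, so the total is 2*nz minus the number of nonempty rows and cannot go negative.

```c
#include <stddef.h>

/* Hypothetical helper (not a PETSc function): reset the per-row lengths
 * from the CSR row-pointer array ai (length m+1), as the assembly loop
 * does, and return how many rows contain at least one nonzero. */
static size_t reset_row_lengths(const int *ai, int *ailen, int *imax, size_t m)
{
  size_t nonzero_rows = 0;
  for (size_t i = 0; i < m; i++) {
    ailen[i] = imax[i] = ai[i + 1] - ai[i];
    if (ailen[i] > 0) nonzero_rows++; /* empty rows are not counted */
  }
  return nonzero_rows;
}

/* Hypothetical helper: flop count for y = A*x. A row with k > 0 nonzeros
 * costs k multiplies and k-1 adds; an empty row costs nothing. */
static double matmult_flops(size_t nz, size_t nonzero_rows)
{
  return 2.0 * (double)nz - (double)nonzero_rows;
}
```

For a 4-row matrix with row pointers {0, 2, 2, 2, 3}, two rows are nonempty and nz is 3, so this gives 4 flops, whereas 2.0*nz - m would give 2.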
I can make the change to master if this is the right approach.
Barry