[petsc-users] scaling problem
Mark F. Adams
mark.adams at columbia.edu
Sat May 26 18:34:19 CDT 2012
On May 26, 2012, at 6:32 PM, Jed Brown wrote:
> That load imbalance often comes from whatever came *before* the reduction.
>
>
Yes, I assumed that was understood. (On the NUMA question, there is a small check sketched at the end of this message.)
> On May 26, 2012 5:25 PM, "Mark F. Adams" <mark.adams at columbia.edu> wrote:
> Just a few comments on the perf data. I see a lot of load imbalance (~5x) on reduction operations like VecMDot and VecNorm (time ratio, max/min), but also a lot of imbalance on simple non-collective vector ops like VecAXPY (~3x). So if the work is well load balanced, I would suspect it's a NUMA issue.
>
> Mark
>
> On May 26, 2012, at 6:13 PM, Aron Roland wrote:
>
>> Dear All,
>>
>> I have some questions about a recent implementation using PETSc to solve a large linear system arising from a 4D problem on hybrid unstructured meshes.
>>
>> We have implemented all the mappings, and the solution is fine, as is the number of iterations. The results are robust with respect to the number of CPUs used, but we have a scaling issue.
>>
>> The system is an Intel cluster of the latest generation with InfiniBand.
>>
>> We have attached the summary ... with hopefully a lot of information.
>>
>> Any comments, suggestions, or ideas are very welcome.
>>
>> We have been reading the threads dealing with multi-core and the bus-limitation issue, so we are aware of this.
>>
>> I am now thinking about a hybrid OpenMP/MPI approach, but I am not quite happy with the bus-limitation explanation, since most systems are multicore.
>>
>> I hope the limitation is not the sparse matrix mapping that we are using ...
>>
>> Thanks in advance ...
>>
>> Cheers
>>
>> Aron
>>
>> <benchmark.txt>
>
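PS, on the NUMA question: below is a minimal sketch, not something from your log, that times a fixed number of VecAXPYs on identically sized local vectors and reports the min/max time over the ranks (the local vector length, repeat count, and the 3.14 coefficient are arbitrary choices of mine). Since the per-rank work is identical, a large max/min ratio here cannot be load imbalance; it points at memory bandwidth, NUMA placement, or core binding instead.

/* Sketch: time identically sized VecAXPYs on every rank; any spread in the
   per-rank times is not load imbalance, so a large max/min ratio points at
   memory bandwidth / NUMA / core binding. Sizes below are arbitrary. */
#include <petscvec.h>

int main(int argc, char **argv)
{
  Vec            x, y;
  PetscErrorCode ierr;
  PetscInt       i, nlocal = 1000000, nreps = 100; /* arbitrary test sizes */
  double         t0, dt, dtmin, dtmax;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  /* Same local length on every rank, so the work is perfectly balanced. */
  ierr = VecCreateMPI(PETSC_COMM_WORLD, nlocal, PETSC_DECIDE, &x);CHKERRQ(ierr);
  ierr = VecDuplicate(x, &y);CHKERRQ(ierr);
  ierr = VecSet(x, 1.0);CHKERRQ(ierr);
  ierr = VecSet(y, 2.0);CHKERRQ(ierr);

  ierr = MPI_Barrier(PETSC_COMM_WORLD);CHKERRQ(ierr);
  t0 = MPI_Wtime();
  for (i = 0; i < nreps; i++) {
    ierr = VecAXPY(y, 3.14, x);CHKERRQ(ierr);
  }
  dt = MPI_Wtime() - t0;

  /* Compare the slowest and fastest rank. */
  ierr = MPI_Allreduce(&dt, &dtmin, 1, MPI_DOUBLE, MPI_MIN, PETSC_COMM_WORLD);CHKERRQ(ierr);
  ierr = MPI_Allreduce(&dt, &dtmax, 1, MPI_DOUBLE, MPI_MAX, PETSC_COMM_WORLD);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "VecAXPY time: min %g  max %g  ratio %g\n",
                     dtmin, dtmax, dtmax / dtmin);CHKERRQ(ierr);

  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&y);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

If the ratio is far from 1 when this is run with the same mpirun line and core counts as your benchmark, I would look at how processes and memory are being bound to sockets on the nodes; the exact binding options depend on your MPI.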