[petsc-dev] profiling question
Leo van Kampenhout
lvankampenhout at gmail.com
Tue Sep 21 03:42:18 CDT 2010
Oops, wrong list. Never mind this.
2010/9/21 Leo van Kampenhout <lvankampenhout at gmail.com>
>
> Dear all,
>
> in order to calculate speedup (Sp = T1/Tp) I need an accurate measurement
> of T1, the time to solve on 1 processor. I will be using the parallel
> algorithm for that, but there seems to be a hiccup.
>
> On the cluster I am currently working on, each node consists of 12 PEs
> with shared memory. If I reserve just 1 PE for my job, the other
> 11 PEs are given to other users, which puts a varying load on the
> memory system and results in inaccurate timings. The solve times I get
> range between 1 and 5 minutes. For me, this is not very scientific.
>
>
> The second idea was to reserve all 12 PEs on the node and let just 1 PE run
> the job. However, this way the single PE gets all the memory bandwidth
> and has no waiting time, which gives a very fast T1. When I
> calculate speedup from that baseline, the algorithm appears not to scale
> very well.
>
> Another idea would be to spawn 12 identical jobs on 12 PEs and take the
> average runtime. Unfortunately, there is only one PETSC_COMM_WORLD, so I
> think this is impossible to do from within one program (MPI_COMM_WORLD).
>
> Do you fellow PETSc users have any ideas on the subject? It would be much
> appreciated.
>
> regards,
>
> Leo van Kampenhout
>
>
>
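
For illustration only, here is a minimal Python sketch of the arithmetic behind the third idea in the quoted message: average the runtimes of 12 identical one-process runs to get T1, then compute Sp = T1/Tp. The function names (baseline_time, speedup, efficiency) and all timing values are hypothetical and are not measurements from the cluster described above. The efficiency Ep = Sp/p is an extra quantity that the message does not mention.

```python
from statistics import mean


def baseline_time(samples):
    """Return T1 as the mean of repeated one-process runtimes (seconds)."""
    if not samples:
        raise ValueError("need at least one timing sample")
    return mean(samples)


def speedup(t1, tp):
    """Sp = T1 / Tp, as defined in the quoted message."""
    if tp <= 0:
        raise ValueError("Tp must be positive")
    return t1 / tp


def efficiency(t1, tp, p):
    """Ep = Sp / p (assumption: not part of the original question)."""
    return speedup(t1, tp) / p


if __name__ == "__main__":
    # Hypothetical runtimes of 12 identical serial jobs on one full node.
    t1_samples = [118.0, 122.0, 119.0, 121.0, 120.0, 120.0,
                  117.0, 123.0, 116.0, 124.0, 120.0, 120.0]
    t1 = baseline_time(t1_samples)  # mean of the 12 samples

    # Hypothetical parallel runtime on p = 12 PEs.
    p, tp = 12, 15.0
    print(f"T1 = {t1:.1f} s")
    print(f"S_{p} = {speedup(t1, tp):.2f}, E_{p} = {efficiency(t1, tp, p):.2f}")
```

Averaging over runs that load the whole node means T1 is measured under the same memory-bandwidth contention that the 12-PE parallel run experiences, which is the point of that third idea.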