[sieve-dev] a few questions on memory usage of isieve code
Matthew Knepley
knepley at gmail.com
Wed Dec 10 12:04:38 CST 2008
On Tue, Dec 2, 2008 at 8:18 PM, Shi Jin <jinzishuai at gmail.com> wrote:
> Hi Matt,
>
> Thank you for your fix. I am able to build petsc-dev now without any
> problem.
>
> First, I want to let you know that I am now able to run very large
> simulations using isieve. Previously, we had a serious limitation on the
> problem size: in order to distribute a large 2nd-order finite element
> mesh, we needed to find a large shared-memory machine to do it. I was able
> to improve the situation a little by storing the distributed sieve data
> structures in files and loading them in parallel on distributed-memory
> clusters. However, that did not eliminate the need for a large shared-memory
> machine at the very early stage. Recently, I eliminated this need, based on
> the fact that so far our simulations are done in cylinders (round or
> rectangular): there is no need for different meshing along the axis
> direction, and, to make particle tracking easier, our domain
> interfaces are simple plane surfaces perpendicular to the axis. So basically,
> along the axial direction I simply have a repeated mesh with a shift. I was
> able to reconstruct the unstructured mesh on each process from the sieve
> data generated for a two-process distributed mesh, since the slave nodes have
> essentially the same topology. Now I can introduce as many processes
> as we want, making the whole domain much longer. This is not a
> general solution, but it suits our needs perfectly, at least for the time
> being. If you are interested in my implementation, I am happy to share it.
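
A minimal sketch of the slab-replication idea described above, with
illustrative names (Slab, buildLocalSlab are not from the actual code):
each rank copies a template slab, offsets its node numbers, and shifts the
axial coordinate, so no single machine ever holds the global mesh. Handling
of the shared interface nodes between neighboring slabs (the usual
overlap/ghosting) is omitted here.

#include <cstddef>
#include <vector>

struct Slab {
    std::vector<int>    cells;        // connectivity: nodesPerCell ints per cell
    std::vector<double> coords;       // x, y, z for each node
    int                 nodesPerSlab; // nodes owned by one slab
};

// Build the rank-local piece from the template slab: node numbers are
// offset so each rank occupies its own index range, and the z coordinate
// is shifted by rank * slabLength along the cylinder axis.
Slab buildLocalSlab(const Slab& tmpl, int rank, double slabLength) {
    Slab local = tmpl;
    for (std::size_t i = 0; i < local.cells.size(); ++i)
        local.cells[i] += rank * tmpl.nodesPerSlab;
    for (std::size_t i = 2; i < local.coords.size(); i += 3)
        local.coords[i] += rank * slabLength;   // z is every third entry
    return local;
}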
>
> That said, I still face a bit of a memory issue, since I want to have as
> many elements per process as possible. I did a detailed profiling of the
> memory usage for a serial run (the parallel version is identical on each
> process) with 259,200 elements (379,093 second-order nodes). The breakdown
> looks like the following:
> sieve mesh:                290 MB
> discretization:             25 MB
> uvwp:                      270 MB
> global order-p:             90 MB
> global order-vel:          132 MB
> caching:                    25 MB
> USG->CFD: dumping mesh:     72 MB
> matrix/vector:             557 MB
> ----------------------------------
> Total:                    1461 MB
> The matrix/vector part is already as good as it can ever get, but I think
> there is room for improvement in the other parts, which I go through below.
>
> 1. About the sieve mesh: there is not much to be done, since you have
> already optimized it. But I think it would be nice if we had the choice of
> not including faces in the data structure. In fact, among all points in the
> data structure, face points take about 50% (see the rough count after
> Matt's reply below). Since my code does not need faces, this would save a
> significant amount of memory. I also think letting users choose the level
> of interpolation is a good feature for a general library. However, this is
> not critical.
This would entail looking at all the algorithms and deciding which depend on
having a fully interpolated mesh. I am not sure
right now.
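
As a rough sanity check of the ~50% figure above: each tet contributes 4
faces shared between 2 cells, and Euler's relation V - E + F - C ~ 0 fixes
the edge count, which puts faces at just under half of all interpolated
points. A back-of-the-envelope count (the 5.5 cells-per-vertex ratio is an
assumed typical value, not a measurement from this mesh):

#include <cstdio>

int main() {
    const double V = 1.0;        // normalize the vertex count
    const double C = 5.5 * V;    // typical tets-per-vertex ratio (assumed)
    const double F = 2.0 * C;    // 4 faces per tet, interior faces shared by 2
    const double E = V + F - C;  // Euler: V - E + F - C ~ 0 for large meshes
    const double total = V + E + F + C;
    std::printf("faces / all sieve points = %.0f%%\n", 100.0 * F / total);
    return 0;  // prints about 46%
}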
>
> 2. About the global orders: I think I mentioned this before and was told
> that right now the global orders are stored in the old fashion and are thus
> not optimized. It should be possible to use contiguous memory just as isieve
> does, which would surely save a lot of memory. I guess this is just a matter
> of time, right?
Yes, I am working on it. It is actually all coded, but I think I have a bug
in the parallelism, so I have not made it the default yet.
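
For reference, the storage difference can be pictured like this
(illustrative types only, not the actual Sieve classes): the old orders
keep a tree-based map with a heap-allocated node per mesh point, while an
isieve-style order assumes the points form one contiguous interval and
stores two flat arrays.

#include <map>
#include <utility>
#include <vector>

// Old style: point -> (ndof, offset) in a red-black tree, which costs
// tens of bytes of pointer/allocator overhead per mesh point.
struct OldGlobalOrder {
    std::map<int, std::pair<int, int> > atlas;
};

// isieve style: points are the interval [pStart, pEnd), so two flat
// int arrays suffice (~8 bytes per point, and cache-friendly).
struct IntervalGlobalOrder {
    int pStart, pEnd;
    std::vector<int> ndof;    // ndof[p - pStart]
    std::vector<int> offset;  // offset[p - pStart]
};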
>
> 3. Finally, the fields: I am still using the idea of fibration. The u, v,
> w, and p fields are obtained as fibrations of a single field s. It has
> worked very well, but I see that it takes a lot of memory to create, almost
> as much as the mesh itself. I remember you told me there is a new way:
> create one mesh for each field, and the meshes can share the same sieve
> data. And I think you have optimized it as ifield, right? But I have not
> tried it yet. This is the real pressing question I am asking here: how do
> the memory usages of the two ways of building fields compare? If a lot of
> memory can be saved, then I am definitely going to switch to the new
> method. And I would love to have more guidance on how to implement it.
ISections are much, much more efficient than Sections. Have you tried just
switching to them?
Matt
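
To picture the layout being discussed in point 3 (again with illustrative
types, not the actual ISection/IMesh API): each field gets its own
interval-style section, and all of them reference one shared topology, so
the sieve's memory is paid once rather than once per fibrated field.

#include <map>
#include <string>
#include <vector>

struct ISieve;  // the shared, optimized topology (opaque here)

// Same interval layout as the order sketch above, plus the dof values.
struct IntervalSection {
    int pStart, pEnd;
    std::vector<int>    ndof, offset;
    std::vector<double> values;  // contiguous storage for all dofs
};

// One topology, one lightweight section per field: u, v, w, p each carry
// only their own atlas and values, while the sieve itself is stored once.
struct FieldSet {
    ISieve*                                topology;  // shared, not owned
    std::map<std::string, IntervalSection> fields;    // "u", "v", "w", "p"
};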
>
> Again, thank you very much for your help.
> --
> Sincerely,
> Shi Jin, Ph.D.
> http://www.ualberta.ca/~sjin1/
>
--
What most experimenters take for granted before they begin their experiments
is infinitely more interesting than any results to which their experiments
lead.
-- Norbert Wiener