[petsc-users] Réf. : Re: Avoiding malloc overhead for unstructured finite element meshes
Thomas DE-SOZA
thomas.de-soza at edf.fr
Thu Jun 28 12:29:03 CDT 2012
0: [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 346647 X 346647; storage space: 9900 unneeded,26861247 used
0: [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 1608
This number of mallocs is the real problem, you have not preallocated
correctly for the "diagonal" block of the matrix on rank 0. Fix
preallocation and it will be fast. Everything below is fine.
Since 9900 coefficients were unneeded, we had initially thought that enough
room had been preallocated.
From what you're telling us, we understand that we may have given an
overall size large enough to contain the diagonal block, but whose
row-by-row nnz is not correct, hence the mallocs. Is that correct?
Or does it mean that in the preallocation we have to account for the
values that come from the stash of another process, even when they are
added to preexisting entries on this process?
Thomas
jedbrown at mcs.anl.gov
Sent by: petsc-users-bounces at mcs.anl.gov
28/06/2012 17:36
Please reply to petsc-users
To: petsc-users at mcs.anl.gov
cc: nicolas.sellenet at edf.fr (bcc: Thomas DE-SOZA/A/EDF/FR)
Subject: Re: [petsc-users] Avoiding malloc overhead for unstructured finite element meshes
From petsc_info.log:
0: [0] MatStashScatterBegin_Private(): No of messages: 0
0: [0] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs.
1: [1] MatAssemblyBegin_MPIAIJ(): Stash has 645696 entries, uses 6 mallocs.
This means that rank 1 generates a bunch of entries in rows owned by rank
0, but not vice-versa. The number of entries is somewhat high, but not
unreasonable.
1: [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 334296 X 334296; storage space: 0 unneeded,25888950 used
1: [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
1: [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 81
1: [1] Mat_CheckInode(): Found 111432 nodes of 334296. Limit used: 5. Using Inode routines
Rank 1 preallocated correctly, no problem here.
0: [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 346647 X 346647; storage space: 9900 unneeded,26861247 used
0: [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 1608
This number of mallocs is the real problem, you have not preallocated
correctly for the "diagonal" block of the matrix on rank 0. Fix
preallocation and it will be fast. Everything below is fine.
0: [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 81
0: [0] Mat_CheckInode(): Found 115549 nodes of 346647. Limit used: 5. Using Inode routines
0: [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
0: [0] MatSetUpMultiply_MPIAIJ(): Using block index set to define scatter
1: [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
1: [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
0: [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
0: [0] VecScatterCreateCommon_PtoS(): Using blocksize 3 scatter
0: [0] VecScatterCreate(): Special case: blocked indices to stride
0: [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 346647 X 12234; storage space: 0 unneeded,308736 used
0: [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
0: [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 51
1: [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 334296 X 12210; storage space: 0 unneeded,308736 used
1: [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
1: [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 45
On Thu, Jun 28, 2012 at 5:59 AM, Thomas DE-SOZA <thomas.de-soza at edf.fr> wrote:
Dear PETSc Users,
We're experiencing performance issues after switching to fully
distributed meshes in our in-house code and would like your opinion on
the matter.
In the current version of our structural mechanics FEA software
(http://www.code-aster.org/), every MPI process has knowledge of the
whole matrix and can therefore pass it to PETSc without any
communication. In a nutshell, the stash is empty after MatSetValues and
no mallocs occur during assembly.
We're now building a distributed version of the software with each
process reading its own subdomain in order to save memory. The mesh was
partitioned with Metis, and as a first approach we built a simple
partition of the degrees of freedom based on the resulting subdomains.
This eliminates the need for an Application Ordering but yields a
decomposition that is unbalanced in terms of rows. Taking an example
with 2 MPI processes: process 0 will have more unknowns than process 1
and will receive entries lying on the interface, whereas process 1 will
have all its entries locally.
PETSc manual states that "It is fine to generate some entries on the
"wrong" process. Often this can lead to cleaner, simpler, less buggy
codes. One should never make code overly complicated in order to generate
all values locally. Rather, one should organize the code in such a way
that most values are generated locally."
Judging from the performance we obtain on a simple cube with two
processes, it seems we have generated too many entries on the wrong
process. Indeed, our distributed code runs slower than the current one.
However, the stash does not seem to contain that much (650,000 out of a
total of 50,000,000 nnz). We have attached the output obtained with
"-info" as well as the "-log_summary" profiling. Most of the time is
spent in the assembly, and a lot of mallocs occur.
What's your advice on this? Is working with ghost cells the only option?
We were wondering if we could preallocate the stash, for example, to
decrease the number of mallocs.
Regards,
Thomas
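On the question of preallocating the stash: PETSc does expose MatStashSetInitialSize(mat, size, bsize) and an equivalent runtime option, so the stash can be sized up front instead of growing by mallocs during assembly (worth verifying against the documentation of the PETSc version in use). A sketch of the options-file form, with an illustrative size based on the 645696 stash entries reported in the -info output:

```
# Illustrative runtime options (the size is a guess, not a tuned value):
# pre-size the matrix stash so off-process entries do not trigger mallocs
-matstash_initial_size 650000
-info
```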
This message and any attachments (the 'Message') are intended solely for
the addressees. The information contained in this Message is confidential.
Any use of information contained in this Message not in accordance with
its purpose, and any dissemination or disclosure, whether whole or
partial, is prohibited without formal approval.
If you are not the addressee, you may not copy, forward, disclose or use
any part of it. If you have received this message in error, please delete
it and all copies from your system and notify the sender immediately by
return message.
E-mail communication cannot be guaranteed to be timely, secure,
error-free or virus-free.