[petsc-users] Réf. : Re: Avoiding malloc overhead for unstructured finite element meshes
Thomas DE-SOZA
thomas.de-soza at edf.fr
Thu Jun 28 12:29:03 CDT 2012
0: [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 346647 X 346647; storage space: 9900 unneeded,26861247 used
0: [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 1608
This number of mallocs is the real problem, you have not preallocated
correctly for the "diagonal" block of the matrix on rank 0. Fix
preallocation and it will be fast. Everything below is fine.
Since 9900 coefficients were unneeded, we had initially thought that enough
room had been preallocated.
From what you're telling us, we understand that we may have given an
overall size large enough to contain the diagonal block, but whose
row-by-row nnz is not correct, hence the mallocs. Is that correct?
Or does it mean that in the preallocation we have to account for the
values that come from the stash of another process, even when they are
added to preexisting entries on this process?
Thomas
jedbrown at mcs.anl.gov
Sent by: petsc-users-bounces at mcs.anl.gov
28/06/2012 17:36
Please reply to petsc-users
To: petsc-users at mcs.anl.gov
cc: nicolas.sellenet at edf.fr (bcc: Thomas DE-SOZA/A/EDF/FR)
Subject: Re: [petsc-users] Avoiding malloc overhead for unstructured finite element meshes
From petsc_info.log:
0: [0] MatStashScatterBegin_Private(): No of messages: 0
0: [0] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs.
1: [1] MatAssemblyBegin_MPIAIJ(): Stash has 645696 entries, uses 6 mallocs.
This means that rank 1 generates a bunch of entries in rows owned by rank
0, but not vice-versa. The number of entries is somewhat high, but not
unreasonable.
1: [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 334296 X 334296; storage space: 0 unneeded,25888950 used
1: [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
1: [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 81
1: [1] Mat_CheckInode(): Found 111432 nodes of 334296. Limit used: 5. Using Inode routines
Rank 1 preallocated correctly, no problem here.
0: [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 346647 X 346647; storage space: 9900 unneeded,26861247 used
0: [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 1608
This number of mallocs is the real problem, you have not preallocated
correctly for the "diagonal" block of the matrix on rank 0. Fix
preallocation and it will be fast. Everything below is fine.
0: [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 81
0: [0] Mat_CheckInode(): Found 115549 nodes of 346647. Limit used: 5. Using Inode routines
0: [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
0: [0] MatSetUpMultiply_MPIAIJ(): Using block index set to define scatter
1: [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
1: [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
0: [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
0: [0] VecScatterCreateCommon_PtoS(): Using blocksize 3 scatter
0: [0] VecScatterCreate(): Special case: blocked indices to stride
0: [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 346647 X 12234; storage space: 0 unneeded,308736 used
0: [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
0: [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 51
1: [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 334296 X 12210; storage space: 0 unneeded,308736 used
1: [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
1: [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 45
On Thu, Jun 28, 2012 at 5:59 AM, Thomas DE-SOZA <thomas.de-soza at edf.fr> wrote:
Dear PETSc Users,
We're experiencing performance issues after switching to fully
distributed meshes in our in-house code and would like your opinion on
the matter.
In the current version of our structural mechanics FEA software
(http://www.code-aster.org/), every MPI process has knowledge of the
whole matrix and can therefore pass it to PETSc without any
communication. In a nutshell, the stash is empty after MatSetValues and
no mallocs occur during assembly.
We're now building a distributed version of the software with each
process reading its own subdomain in order to save memory. The mesh was
partitioned with Metis, and as a first approach we built a simple
partition of the degrees of freedom based on the resulting subdomains.
This eliminates the need for an Application Ordering but yields a
decomposition that is unbalanced in terms of rows. Taking an example
with 2 MPI processes: process 0 will have more unknowns than process 1
and will receive entries lying on the interface, whereas process 1 will
have all its entries locally.
PETSc manual states that "It is fine to generate some entries on the
"wrong" process. Often this can lead to cleaner, simpler, less buggy
codes. One should never make code overly complicated in order to generate
all values locally. Rather, one should organize the code in such a way
that most values are generated locally."
Judging from the performance we obtain on a simple cube with two
processes, it seems we have generated too many entries on the wrong
process. Indeed, our distributed code runs slower than the current one.
However, the stash does not seem to contain that much (650,000 out of a
total of 50,000,000 nnz). We have attached the output obtained with
"-info" as well as the "-log_summary" profiling. Most of the time is
spent in the assembly, and a lot of mallocs occur.
What's your advice on this? Is working with ghost cells the only option?
We were wondering if we could preallocate the stash, for example, to
decrease the number of mallocs.
Regards,
Thomas
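On the question of preallocating the stash: PETSc does expose MatStashSetInitialSize(mat, size, bsize) and an equivalent runtime option, so the stash can be sized up front instead of growing by mallocs during assembly (worth verifying against the documentation of the PETSc version in use). A sketch of the options-file form, with an illustrative size based on the 645696 stash entries reported in the -info output:

```
# Illustrative runtime options (the size is a guess, not a tuned value):
# pre-size the matrix stash so off-process entries do not trigger mallocs
-matstash_initial_size 650000
-info
```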
This message and any attachments (the 'Message') are intended solely for
the addressees. The information contained in this Message is confidential.
Any use of information contained in this Message not in accordance with
its purpose, and any dissemination or disclosure, whether whole or
partial, is prohibited without formal approval.
If you are not the addressee, you may not copy, forward, disclose or use
any part of it. If you have received this message in error, please delete
it and all copies from your system and notify the sender immediately by
return message.
E-mail communication cannot be guaranteed to be timely, secure,
error-free or virus-free.