<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body style="overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><div><br></div><span style="text-align: justify; font-size: 14px;">"MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 73239"</span><div><div style="text-align: justify;"><span style="font-size: 14px;"><br></span></div><div style="text-align: justify;"><span style="font-size: 14px;">The preallocation is VERY wrong. This number should be zero; the mallocs during MatSetValues() are why the computation is so slow. </span></div><div style="text-align: justify;"><span style="font-size: 14px;"><br></span></div><div style="text-align: justify;"><span style="font-size: 14px;"><br></span></div><div><br><blockquote type="cite"><div>On Dec 12, 2022, at 10:20 PM, 김성익 <ksi2443@gmail.com> wrote:</div><br class="Apple-interchange-newline"><div><div dir="ltr"><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515"">Following your comments, <br>I checked by using '-info'.</p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515"">As you suspected, most elements are being computed on the wrong MPI rank.<br>Also, there are a lot of stashed entries.<br></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US"><br></span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">Should I partition the domain at the problem-definition stage?<br>Or is proper preallocation sufficient?<br></span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span 
lang="EN-US"><br></span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate():
Duplicating a communicator 139687279637472 94370404729840 max tags = 2147483647</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate():
Duplicating a communicator 139620736898016 94891084133376 max tags = 2147483647</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <mat> MatSetUp(): Warning not
preallocating matrix storage</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate():
Duplicating a communicator 139620736897504 94891083133744 max tags = 2147483647</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate():
Duplicating a communicator 139687279636960 94370403730224 max tags = 2147483647</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139620736897504 94891083133744</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139687279636960 94370403730224</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139620736898016 94891084133376</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139687279637472 94370404729840</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139620736898016 94891084133376</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139687279637472 94370404729840</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139620736898016 94891084133376</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139687279637472 94370404729840</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139687279637472 94370404729840</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139620736898016 94891084133376</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139620736898016 94891084133376</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139687279637472 94370404729840</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US"> TIME0 : 0.000000</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US"> TIME0 : 0.000000</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <vec> VecAssemblyBegin_MPI_BTS():
Stash has 661 entries, uses 8 mallocs.</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <vec> VecAssemblyBegin_MPI_BTS():
Block-Stash has 0 entries, uses 0 mallocs.</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <vec> VecAssemblyBegin_MPI_BTS():
Stash has 661 entries, uses 5 mallocs.</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <vec> VecAssemblyBegin_MPI_BTS():
Block-Stash has 0 entries, uses 0 mallocs.</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <mat> MatAssemblyBegin_MPIAIJ():
Stash has 460416 entries, uses 5 mallocs.</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <mat> MatAssemblyBegin_MPIAIJ():
Stash has 461184 entries, uses 5 mallocs.</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <mat> MatAssemblyEnd_SeqAIJ():
Matrix size: 13892 X 13892; storage space: 180684 unneeded,987406 used</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <mat> MatAssemblyEnd_SeqAIJ():
Number of mallocs during MatSetValues() is 73242</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <mat> MatAssemblyEnd_SeqAIJ():
Maximum nonzeros in any row is 81</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <mat> MatCheckCompressedRow():
Found the ratio (num_zerorows 0)/(num_localrows 13892) < 0.6. Do not use
CompressedRow routines.</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <mat> MatSeqAIJCheckInode():
Found 4631 nodes of 13892. Limit used: 5. Using Inode routines</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <mat> MatAssemblyEnd_SeqAIJ():
Matrix size: 13891 X 13891; storage space: 180715 unneeded,987325 used</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <mat> MatAssemblyEnd_SeqAIJ():
Number of mallocs during MatSetValues() is 73239</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <mat> MatAssemblyEnd_SeqAIJ():
Maximum nonzeros in any row is 81</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <mat> MatCheckCompressedRow():
Found the ratio (num_zerorows 0)/(num_localrows 13891) < 0.6. Do not use
CompressedRow routines.</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <mat> MatSeqAIJCheckInode():
Found 4631 nodes of 13891. Limit used: 5. Using Inode routines</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139687279636960 94370403730224</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139687279636960 94370403730224</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139620736897504 94891083133744</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139620736897504 94891083133744</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <mat> MatAssemblyEnd_SeqAIJ():
Matrix size: 13892 X 1390; storage space: 72491 unneeded,34049 used</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <mat> MatAssemblyEnd_SeqAIJ():
Number of mallocs during MatSetValues() is 2472</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <mat> MatAssemblyEnd_SeqAIJ():
Maximum nonzeros in any row is 40</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <mat> MatCheckCompressedRow():
Found the ratio (num_zerorows 12501)/(num_localrows 13892) > 0.6. Use
CompressedRow routines.</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">Assemble Time : 174.079366sec</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139687279636960 94370403730224</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139687279636960 94370403730224</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <mat> MatAssemblyEnd_SeqAIJ():
Matrix size: 13891 X 1391; storage space: 72441 unneeded,34049 used</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <mat> MatAssemblyEnd_SeqAIJ():
Number of mallocs during MatSetValues() is 2469</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <mat> MatAssemblyEnd_SeqAIJ():
Maximum nonzeros in any row is 41</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <mat> MatCheckCompressedRow():
Found the ratio (num_zerorows 12501)/(num_localrows 13891) > 0.6. Use
CompressedRow routines.</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">Assemble Time : 174.141234sec</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139620736897504 94891083133744</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139620736897504 94891083133744</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <vec> VecAssemblyBegin_MPI_BTS():
Stash has 13891 entries, uses 8 mallocs.</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <vec> VecAssemblyBegin_MPI_BTS():
Block-Stash has 0 entries, uses 0 mallocs.</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <mat> MatAssemblyEnd_SeqAIJ():
Matrix size: 13891 X 13891; storage space: 0 unneeded,987325 used</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <mat> MatAssemblyEnd_SeqAIJ():
Number of mallocs during MatSetValues() is 0</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <mat> MatAssemblyEnd_SeqAIJ():
Maximum nonzeros in any row is 81</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <mat> MatCheckCompressedRow():
Found the ratio (num_zerorows 0)/(num_localrows 13891) < 0.6. Do not use
CompressedRow routines.</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <pc> PCSetUp(): Setting up PC for
first time</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139687279636960 94370403730224</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <pc> PCSetUp(): Leaving PC with
identical preconditioner since operator is unchanged</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139620736897504 94891083133744</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139620736897504 94891083133744</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139620736897504 94891083133744</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139687279636960 94370403730224</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139687279636960 94370403730224</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139687279636960 94370403730224</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139687279636960 94370403730224</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <pc> PCSetUp(): Leaving PC with
identical preconditioner since operator is unchanged</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139620736897504 94891083133744</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139620736897504 94891083133744</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <pc> PCSetUp(): Leaving PC with
identical preconditioner since operator is unchanged</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">Solving Time : 5.085394sec</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <ksp> KSPConvergedDefault():
Linear solver has converged. Residual norm 1.258030470407e-17 is less than
relative tolerance 1.000000000000e-05 times initial right hand side norm
2.579617304779e-03 at iteration 1</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">Solving Time : 5.089733sec</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139687279636960 94370403730224</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139687279636960 94370403730224</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139620736897504 94891083133744</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139620736897504 94891083133744</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <vec> VecAssemblyBegin_MPI_BTS():
Stash has 661 entries, uses 5 mallocs.</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <vec> VecAssemblyBegin_MPI_BTS():
Block-Stash has 0 entries, uses 0 mallocs.</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <mat> MatAssemblyBegin_MPIAIJ():
Stash has 460416 entries, uses 0 mallocs.</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <mat> MatAssemblyBegin_MPIAIJ():
Stash has 461184 entries, uses 0 mallocs.</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">Assemble Time : 5.242508sec</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139620736897504 94891083133744</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139620736897504 94891083133744</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">Assemble Time : 5.240863sec</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139687279636960 94370403730224</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139687279636960 94370403730224</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <vec> VecAssemblyBegin_MPI_BTS():
Stash has 13891 entries, uses 0 mallocs.</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <vec> VecAssemblyBegin_MPI_BTS():
Block-Stash has 0 entries, uses 0 mallocs.</span></p><div style="margin: 0cm 0cm 8pt; text-align: justify; line-height: 107%; font-size: 10pt; font-family: "맑은 고딕";"><span lang="EN-US"> </span><br class="webkit-block-placeholder"></div><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">
TIME : 1.000000, TIME_STEP :
1.000000, ITER : 2, RESIDUAL : 2.761615e-03</span></p><div style="margin: 0cm 0cm 8pt; text-align: justify; line-height: 107%; font-size: 10pt; font-family: "맑은 고딕";"><span lang="EN-US"> </span><br class="webkit-block-placeholder"></div><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">
TIME : 1.000000, TIME_STEP :
1.000000, ITER : 2, RESIDUAL : 2.761615e-03</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <pc> PCSetUp(): Setting up PC
with same nonzero pattern</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139620736897504 94891083133744</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139687279636960 94370403730224</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139620736897504 94891083133744</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139620736897504 94891083133744</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139687279636960 94370403730224</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139687279636960 94370403730224</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <pc> PCSetUp(): Leaving PC with
identical preconditioner since operator is unchanged</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <pc> PCSetUp(): Leaving PC with
identical preconditioner since operator is unchanged</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <ksp> KSPConvergedDefault():
Linear solver has converged. Residual norm 1.539725065974e-19 is less than
relative tolerance 1.000000000000e-05 times initial right hand side norm
8.015104666105e-06 at iteration 1</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">Solving Time : 4.662785sec</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139687279636960 94370403730224</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139687279636960 94370403730224</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">Solving Time : 4.664515sec</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139620736897504 94891083133744</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139620736897504 94891083133744</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <vec> VecAssemblyBegin_MPI_BTS():
Stash has 661 entries, uses 5 mallocs.</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <vec> VecAssemblyBegin_MPI_BTS():
Block-Stash has 0 entries, uses 0 mallocs.</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <mat> MatAssemblyBegin_MPIAIJ():
Stash has 461184 entries, uses 0 mallocs.</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <mat> MatAssemblyBegin_MPIAIJ():
Stash has 460416 entries, uses 0 mallocs.</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">Assemble Time : 5.238257sec</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139620736897504 94891083133744</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139620736897504 94891083133744</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">Assemble Time : 5.236535sec</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139687279636960 94370403730224</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscCommDuplicate(): Using
internal PETSc communicator 139687279636960 94370403730224</span></p><div style="margin: 0cm 0cm 8pt; text-align: justify; line-height: 107%; font-size: 10pt; font-family: "맑은 고딕";"><span lang="EN-US"> </span><br class="webkit-block-placeholder"></div><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">
TIME : 1.000000, TIME_STEP :
1.000000, ITER : 3, RESIDUAL : 3.705062e-08</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US"> TIME0 : 1.000000</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <vec> VecAssemblyBegin_MPI_BTS():
Stash has 13891 entries, uses 0 mallocs.</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <vec> VecAssemblyBegin_MPI_BTS():
Block-Stash has 0 entries, uses 0 mallocs.</span></p><div style="margin: 0cm 0cm 8pt; text-align: justify; line-height: 107%; font-size: 10pt; font-family: "맑은 고딕";"><span lang="EN-US"> </span><br class="webkit-block-placeholder"></div><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">
TIME : 1.000000, TIME_STEP :
1.000000, ITER : 3, RESIDUAL : 3.705062e-08</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US"> TIME0 : 1.000000</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[1] <sys> PetscFinalize():
PetscFinalize() called</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <vec> VecAssemblyBegin_MPI_BTS():
Stash has 661 entries, uses 5 mallocs.</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <vec> VecAssemblyBegin_MPI_BTS():
Block-Stash has 0 entries, uses 0 mallocs.</span></p><p class="MsoNormal" style="margin:0cm 0cm 8pt;text-align:justify;line-height:107%;font-size:10pt;font-family:"\00b9d1\00c740 \00ace0\00b515""><span lang="EN-US">[0] <sys> PetscFinalize():
PetscFinalize() called</span></p></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Dec 13, 2022 at 12:50 AM, Barry Smith <<a href="mailto:bsmith@petsc.dev">bsmith@petsc.dev</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
The problem is possibly that most elements are being computed on the "wrong" MPI rank, so almost all the matrix entries have to be "stashed" when computed and then sent off to the owning MPI rank. Please send ALL the output of a parallel run with -info so we can see how much communication is done in the matrix assembly.<br>
<br>
Barry<br>
<br>
<br>
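To make the "wrong rank" point concrete, here is a minimal sketch in plain C (no PETSc calls) of how one might estimate the number of stashed entries for a single element. It assumes rows are split into contiguous, nearly equal blocks per rank, and that every element-matrix entry whose row is not locally owned gets stashed. The helper names owned_start, owner_of_row and count_stashed are hypothetical and are not part of the PETSc API.<br>

```c
/* First global row owned by `rank` when N rows are split into
   contiguous, nearly equal blocks over `size` ranks (assumed layout). */
long owned_start(long N, int size, int rank)
{
    long base = N / size;
    long rem  = N % size;   /* the first `rem` ranks get one extra row */
    return rank * base + (rank < rem ? rank : rem);
}

/* Rank that owns global row `row` under the layout above. */
int owner_of_row(long row, long N, int size)
{
    for (int r = 0; r < size; r++) {
        if (row < owned_start(N, size, r + 1)) {
            return r;
        }
    }
    return size - 1;
}

/* Number of entries of one ndof x ndof element matrix that would have to
   be stashed if the element is computed on `my_rank`: every entry whose
   row belongs to another rank counts, i.e. ndof entries per such row. */
long count_stashed(const long *dofs, int ndof, int my_rank, long N, int size)
{
    long stashed = 0;
    for (int i = 0; i < ndof; i++) {
        if (owner_of_row(dofs[i], N, size) != my_rank) {
            stashed += ndof;
        }
    }
    return stashed;
}
```

For example, with 10 rows on 2 ranks, an element touching rows 3 to 6 and computed on rank 0 would stash 8 of its 16 entries, while an element touching only rows 0 to 3 would stash none. Summing this over all elements assigned to a rank gives a rough estimate to compare against the "Stash has ... entries" lines in the -info output.<br>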
> On Dec 12, 2022, at 6:16 AM, 김성익 <<a href="mailto:ksi2443@gmail.com" target="_blank">ksi2443@gmail.com</a>> wrote:<br>
> <br>
> Hello,<br>
> <br>
> <br>
> I am looking for keywords or examples for parallelizing the matrix assembly process.<br>
> <br>
> My current setup is as follows.<br>
> - Finite element analysis code for structural mechanics.<br>
> - Problem size : 3D solid hexa elements (number of elements : 125,000), number of degrees of freedom : 397,953<br>
> - Matrix type : seqaij, preallocated with MatSeqAIJSetPreallocation<br>
> - Matrix assembly time using 1 core : 120 sec<br>
>    for (int i=0; i<125000; i++) {<br>
>      /* element matrix calculation */<br>
>    }<br>
>    MatAssemblyBegin(...);<br>
>    MatAssemblyEnd(...);<br>
> - Matrix assembly time using 8 cores : 70,234 sec<br>
>   PetscInt start, end;<br>
>   VecGetOwnershipRange(element_vec, &start, &end);<br>
>   for (PetscInt i=start; i<end; i++) {<br>
>     /* element matrix calculation */<br>
>   }<br>
>   MatAssemblyBegin(...);<br>
>   MatAssemblyEnd(...);<br>
> <br>
> <br>
> As you can see, the parallel case takes much longer than the sequential case.<br>
> How can I speed it up?<br>
> Could you point me to keywords or examples for parallel matrix assembly in finite element analysis?<br>
> <br>
> Thanks,<br>
> Hyung Kim<br>
> <br>
<br>
</blockquote></div>
</div></blockquote></div><br></div></body></html>