<div dir="ltr"><div><br></div><div>Regarding the following example: <a href="https://gitlab.com/petsc/petsc/-/blob/main/src/mat/tests/ex230.c" target="_blank">https://gitlab.com/petsc/petsc/-/blob/main/src/mat/tests/ex230.c</a></div><div><br></div><div>I have some questions about MatPreallocator.</div><div><br></div><div>1. In a parallel run, should all MPI ranks perform the same preallocation procedure?</div><div><br></div><div>2. In ex230.c, the difference between ex1 and ex2 is the use of blocks.<br>Is the intent to show that setting values by block is more efficient than plain MatSetValues?</div><div><br></div><div>Thanks, </div><div>Hyung Kim</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Dec 13, 2022 at 1:43 AM, Junchao Zhang &lt;<a href="mailto:junchao.zhang@gmail.com">junchao.zhang@gmail.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>Since you run with multiple ranks, you should use matrix type mpiaij and MatMPIAIJSetPreallocation. 
If preallocation is difficult to estimate, you can use MatPreallocator; see the example at <a href="https://gitlab.com/petsc/petsc/-/blob/main/src/mat/tests/ex230.c" target="_blank">https://gitlab.com/petsc/petsc/-/blob/main/src/mat/tests/ex230.c</a></div><div><br></div><div><div><div dir="ltr"><div dir="ltr">--Junchao Zhang</div></div></div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Dec 12, 2022 at 5:16 AM 김성익 &lt;<a href="mailto:ksi2443@gmail.com" target="_blank">ksi2443@gmail.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hello,<div><br></div><div><br></div><div>I am looking for keywords or examples for parallelizing the matrix assembly process.</div><div><br></div><div>My current setup is as follows.</div><div>- Finite element analysis code for structural mechanics.</div><div>- Problem size: 3D solid hexahedral elements (number of elements: 125,000), number of degrees of freedom: 397,953</div><div>- Matrix type: seqaij, preallocated with MatSeqAIJSetPreallocation</div><div>- Matrix assembly time using 1 core: 120 sec<br>   for (int i=0; i&lt;125000; i++) {</div><div>    ~~ element matrix calculation}</div><div>   MatAssemblyBegin</div><div>   MatAssemblyEnd</div><div>- Matrix assembly time using 8 cores: 70,234 sec</div><div>  int start, end;</div><div>  VecGetOwnershipRange( element_vec, &amp;start, &amp;end);</div><div>  for (int i=start; i&lt;end; i++){</div><div>   ~~ element matrix calculation</div><div>   MatAssemblyBegin</div><div>   MatAssemblyEnd</div><div><br></div><div><br></div><div>As you can see, the parallel case takes far more time than the sequential case.</div><div>How can I speed this up?</div><div>Could you point me to keywords or examples for parallelizing matrix assembly in finite element analysis?</div><div><br></div><div>Thanks,</div><div>Hyung Kim</div><div><br></div></div>

</blockquote></div>

</blockquote></div>
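<div><br></div><div>As a rough illustration of the preallocation advice above: MatMPIAIJSetPreallocation takes, for each locally owned row, a count of nonzeros in the diagonal block (columns owned by the same rank) and in the off-diagonal block (all other columns). The sketch below only computes those two arrays in plain C and calls no PETSc routines. It assumes a hypothetical 1D tridiagonal coupling rather than the 3D hexahedral mesh from the question, and the function name count_nnz is made up for this example.</div>

```c
#include <assert.h>

/* Hypothetical helper: for each row in the owned range [rstart, rend) of an
 * n x n tridiagonal matrix, count the nonzeros whose column lies inside the
 * owned range (d_nnz) and outside it (o_nnz). The two output arrays have
 * length rend - rstart and are indexed by local row number. */
static void count_nnz(int n, int rstart, int rend, int *d_nnz, int *o_nnz)
{
    assert(0 <= rstart && rstart <= rend && rend <= n);

    for (int row = rstart; row < rend; row++) {
        int d = 0, o = 0;

        /* A tridiagonal row couples to columns row-1, row, row+1. */
        for (int col = row - 1; col <= row + 1; col++) {
            if (col < 0 || col >= n) {
                continue; /* outside the matrix */
            }
            if (col >= rstart && col < rend) {
                d++; /* column owned by this rank: diagonal block */
            } else {
                o++; /* column owned by another rank: off-diagonal block */
            }
        }
        d_nnz[row - rstart] = d;
        o_nnz[row - rstart] = o;
    }
}
```

<div>For example, with n = 8 split over two ranks, the rank owning rows [0, 4) gets d_nnz = {2, 3, 3, 2} and o_nnz = {0, 0, 0, 1}. In a real finite element code the inner loop would instead walk the element connectivity of each owned row, and the resulting arrays would be passed as the d_nnz and o_nnz arguments of MatMPIAIJSetPreallocation.</div>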