<div dir="ltr">Barry's suggestion for testing got garbled in the gitlab issue posting. Here it is, I think:<div><br></div><div>07:53 main *= ~/Codes/petsc$ make test s=ksp_ksp_tutorials-ex1_mpi_linear_solver_server_1<br>/usr/local/bin/gmake --no-print-directory -f /Users/markadams/Codes/petsc/gmakefile.test PETSC_ARCH=arch-macosx-gnu-O PETSC_DIR=/Users/markadams/Codes/petsc test<br>Using MAKEFLAGS: --no-print-directory -- PETSC_DIR=/Users/markadams/Codes/petsc PETSC_ARCH=arch-macosx-gnu-O s=ksp_ksp_tutorials-ex1_mpi_linear_solver_server_1<br>Application at path ( /Users/markadams/Codes/petsc/arch-macosx-gnu-O/bin/mpiexec.hydra ) removed from firewall<br>Application at path ( /Users/markadams/Codes/petsc/arch-macosx-gnu-O/bin/mpiexec.hydra ) added to firewall<br>Incoming connection to the application is blocked<br> TEST arch-macosx-gnu-O/tests/counts/ksp_ksp_tutorials-ex1_mpi_linear_solver_server_1.counts<br> ok ksp_ksp_tutorials-ex1_mpi_linear_solver_server_1<br> ok diff-ksp_ksp_tutorials-ex1_mpi_linear_solver_server_1<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Sep 4, 2024 at 3:59 PM Lin_Yuxiang <<a href="mailto:linyx199071@gmail.com">linyx199071@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">To whom it may concern:</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif"> </p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">I recently tried to use the 64 indices PETSc to replace the
legacy code's solver using MPI linear solver server. However, it gives me error
when I use more than 8 cores, saying</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif"> </p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">Get NNZ</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">MatsetPreallocation</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">MatsetValue</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">MatSetValue Time per kernel: 43.1147 s</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">Matassembly</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">VecsetValue</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">pestc_solve</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif"><font color="#ff0000">Read -1, expected 1951397280, errno = 14</font></p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif"> </p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">When I tried the -start_in_debugger, the error seems from
MPI_Scatter:</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif"> </p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">Rank0:</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">#3 0x00001555512e4de5
in mca_pml_ob1_recv () from
/usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_pml_ob1.so</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">#4 0x0000155553e01e60
in PMPI_Scatterv () from /lib/x86_64-linux-gnu/libmpi.so.40</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">#5 0x0000155554b13eab
in PCMPISetMat (pc=pc@entry=0x0) at
/auto/research/rdfs/home/lyuxiang/petsc-3.20.4/src/ksp/pc/impls/mpi/pcmpi.c:230</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">#6 0x0000155554b17403
in PCMPIServerBegin () at
/auto/research/rdfs/home/lyuxiang/petsc-3.20.4/src/ksp/pc/impls/mpi/pcmpi.c:464</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">#7 0x00001555540b9aa4
in PetscInitialize_Common (prog=0x7fffffffe27b "geosimtrs_mpiserver",
file=file@entry=0x0, </p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">
help=help@entry=0x55555555a1e0 <help> "Solves a linear system
in parallel with KSP.\nInput parameters include:\n -view_exact_sol : write exact solution vector to
stdout\n -m <mesh_x> : number of mesh points in
x-direction\n -n <mesh"..., ftn=ftn@entry=PETSC_FALSE,
readarguments=readarguments@entry=PETSC_FALSE, len=len@entry=0)</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif"> at
/auto/research/rdfs/home/lyuxiang/petsc-3.20.4/src/sys/objects/pinit.c:1109</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">#8 0x00001555540bba82
in PetscInitialize (argc=argc@entry=0x7fffffffda8c,
args=args@entry=0x7fffffffda80, file=file@entry=0x0, </p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">
help=help@entry=0x55555555a1e0 <help> "Solves a linear system
in parallel with KSP.\nInput parameters include:\n -view_exact_sol : write exact solution vector to
stdout\n -m <mesh_x> : number of mesh points in
x-direction\n -n <mesh"...) at
/auto/research/rdfs/home/lyuxiang/petsc-3.20.4/src/sys/objects/pinit.c:1274</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">#9 0x0000555555557673
in main (argc=<optimized out>, args=<optimized out>) at
geosimtrs_mpiserver.c:29</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif"> </p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif"> Rank1-10</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">0x0000155553e1f030 in ompi_coll_base_allgather_intra_bruck
() from /lib/x86_64-linux-gnu/libmpi.so.40</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">#4 0x0000155550f62aaa
in ompi_coll_tuned_allgather_intra_dec_fixed () from
/usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_coll_tuned.so</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">#5 0x0000155553ddb431
in PMPI_Allgather () from /lib/x86_64-linux-gnu/libmpi.so.40</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">#6 0x00001555541a2289
in PetscLayoutSetUp (map=0x555555721ed0) at
/auto/research/rdfs/home/lyuxiang/petsc-3.20.4/src/vec/is/utils/pmap.c:248</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">#7 0x000015555442e06a
in MatMPIAIJSetPreallocationCSR_MPIAIJ (B=0x55555572d850, Ii=0x15545a778010,
J=0x15545beacb60, v=0x1554cff55e60)</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif"> at
/auto/research/rdfs/home/lyuxiang/petsc-3.20.4/src/mat/impls/aij/mpi/mpiaij.c:3885</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">#8 0x00001555544284e3
in MatMPIAIJSetPreallocationCSR (B=0x55555572d850, i=0x15545a778010,
j=0x15545beacb60, v=0x1554cff55e60) at
/auto/research/rdfs/home/lyuxiang/petsc-3.20.4/src/mat/impls/aij/mpi/mpiaij.c:3998</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">#9 0x0000155554b1412f
in PCMPISetMat (pc=pc@entry=0x0) at
/auto/research/rdfs/home/lyuxiang/petsc-3.20.4/src/ksp/pc/impls/mpi/pcmpi.c:250</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">#10 0x0000155554b17403 in PCMPIServerBegin () at
/auto/research/rdfs/home/lyuxiang/petsc-3.20.4/src/ksp/pc/impls/mpi/pcmpi.c:464</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">#11 0x00001555540b9aa4 in PetscInitialize_Common
(prog=0x7fffffffe27b "geosimtrs_mpiserver", file=file@entry=0x0, </p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">
help=help@entry=0x55555555a1e0 <help> "Solves a linear system
in parallel with KSP.\nInput parameters include:\n -view_exact_sol : write exact solution vector to
stdout\n -m <mesh_x> : number of mesh points in
x-direction\n -n <mesh"..., ftn=ftn@entry=PETSC_FALSE,
readarguments=readarguments@entry=PETSC_FALSE, len=len@entry=0) at
/auto/research/rdfs/home/lyuxiang/petsc-3.20.4/src/sys/objects/pinit.c:1109</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">#12 0x00001555540bba82 in PetscInitialize
(argc=argc@entry=0x7fffffffda8c, args=args@entry=0x7fffffffda80,
file=file@entry=0x0, </p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">
help=help@entry=0x55555555a1e0 <help> "Solves a linear system
in parallel with KSP.\nInput parameters include:\n -view_exact_sol : write exact solution vector to
stdout\n -m <mesh_x> : number of mesh points in
x-direction\n -n <mesh"...) at
/auto/research/rdfs/home/lyuxiang/petsc-3.20.4/src/sys/objects/pinit.c:1274</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">#13 0x0000555555557673 in main (argc=<optimized out>,
args=<optimized out>) at geosimtrs_mpiserver.c:29</p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif"> </p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">This did not happen in 32bit indiced PETSc, running with
more than 8 cores runs smoothly using MPI linear solver server, nor did it
happen on 64 bit indiced MPI version (not with mpi_linear_solver_server), only
happens on 64 bit PETSc mpi linear solver server, I think it maybe a potential
bug?</p>
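
In case it matters: the 64-bit build was configured with --with-64-bit-indices, so PetscInt is 64-bit while MPI count arguments are still 32-bit int. A tiny, hypothetical check (not taken from my code) of whether a PetscInt length still fits in an MPI count would be:

#include <petscsys.h>

int main(int argc, char **argv)
{
  PetscInt    n = 3000000000; /* example length larger than 2^31 - 1 */
  PetscMPIInt nmpi;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  /* PetscMPIIntCast() returns an error if n does not fit in a 32-bit MPI count,
     so for this value the program aborts here with an error message */
  PetscCall(PetscMPIIntCast(n, &nmpi));
  PetscCall(PetscPrintf(PETSC_COMM_SELF, "length %" PetscInt_FMT " fits in an MPI count\n", n));
  PetscCall(PetscFinalize());
  return 0;
}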
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif"> </p>
<p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">Any advice would be greatly appreciated, the matrix and ia,
ja is too big to upload, so if anything you need to debug pls let me know</p><p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif"><br></p><ul style="box-sizing:border-box;padding-left:2rem;margin:0.4em 1.4rem 0px;color:rgb(34,40,50);font-family:-apple-system,BlinkMacSystemFont,"Segoe UI","Helvetica Neue",Arial,sans-serif,"Apple Color Emoji","Segoe UI Emoji","Segoe UI Symbol";font-size:16px"><li style="box-sizing:border-box"><p style="box-sizing:border-box;margin-bottom:0px;margin-top:0px;font-size:1em">Machine type: HPC</p></li><li style="box-sizing:border-box"><p style="box-sizing:border-box;margin-bottom:0px;margin-top:0px;font-size:1em">OS version and type: Linux houamd009 6.1.55-cggdb11-1 #1 SMP Fri Sep 29 10:09:13 UTC 2023 x86_64 GNU/Linux</p></li><li style="box-sizing:border-box"><p style="box-sizing:border-box;margin-bottom:0px;margin-top:0px;font-size:1em">PETSc version: #define PETSC_VERSION_RELEASE 1</p>#define PETSC_VERSION_MAJOR 3<br>#define PETSC_VERSION_MINOR 20<br>#define PETSC_VERSION_SUBMINOR 4<br>#define PETSC_RELEASE_DATE "Sep 28, 2023"<br>#define PETSC_VERSION_DATE "Jan 29, 2024"<br></li><li style="box-sizing:border-box"><p style="box-sizing:border-box;margin-bottom:0px;margin-top:0px;font-size:1em">MPI implementation: OpenMPI</p></li><li style="box-sizing:border-box"><p style="box-sizing:border-box;margin-bottom:0px;margin-top:0px;font-size:1em">Compiler and version: GNU</p></li></ul><p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif"><br></p><p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif"><br></p><p class="MsoNormal" style="margin:0in;font-size:11pt;font-family:Aptos,sans-serif">Yuxiang Lin</p></div>