[petsc-users] MatCreateSubMatricesMPI strange behavior

Alexis SALZMAN alexis.salzman at ec-nantes.fr
Mon Aug 25 13:00:54 CDT 2025


Thanks Mark for your attention.

The uncleaned error message, compared to my post in July, is as follows:

[0]PETSC ERROR: --------------------- Error Message 
--------------------------------------------------------------
[0]PETSC ERROR: Argument out of range
[0]PETSC ERROR: Column too large: col 4 max 3
[0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.22.2, unknown
[0]PETSC ERROR: subnb with 3 MPI process(es) and PETSC_ARCH  on 
pc-str97.ec-nantes.fr by salzman Mon Aug 25 19:11:37 2025
[0]PETSC ERROR: Configure options: PETSC_ARCH=real_fc41_Release_gcc_i4
PETSC_DIR=/home/salzman/devel/ExternalLib/build/PETSC/petsc --doCleanup=1
--with-scalar-type=real --known-level1-dcache-linesize=64 --with-cc=gcc
--CFLAGS="-fPIC " --CC_LINKER_FLAGS=-fopenmp --with-cxx=g++
--with-cxx-dialect=c++20 --CXXFLAGS="-fPIC " --CXX_LINKER_FLAGS=-fopenmp
--with-fc=gfortran --FFLAGS="-fPIC " --FC_LINKER_FLAGS=-fopenmp
--with-debugging=0 --with-fortran-bindings=0 --with-fortran-kernels=1
--with-mpi-compilers=0 --with-mpi-include=/usr/include/openmpi-x86_64
--with-mpi-lib="[/usr/lib64/openmpi/lib/libmpi.so,/usr/lib64/openmpi/lib/libmpi.so,/usr/lib64/openmpi/lib/libmpi_mpifh.so]"
--with-blas-lib="[/opt/intel/oneapi/mkl/latest/lib/libmkl_intel_lp64.so,/opt/intel/oneapi/mkl/latest/lib/libmkl_gnu_thread.so,/opt/intel/oneapi/mkl/latest/lib/libmkl_core.so]"
--with-lapack-lib="[/opt/intel/oneapi/mkl/latest/lib/libmkl_intel_lp64.so,/opt/intel/oneapi/mkl/latest/lib/libmkl_gnu_thread.so,/opt/intel/oneapi/mkl/latest/lib/libmkl_core.so]"
--with-mumps=1 --with-mumps-include=/home/salzman/local/i4_gcc/include
--with-mumps-lib="[/home/salzman/local/i4_gcc/lib/libdmumps.so,/home/salzman/local/i4_gcc/lib/libmumps_common.so,/home/salzman/local/i4_gcc/lib/libpord.so]"
--with-scalapack-lib="[/opt/intel/oneapi/mkl/latest/lib/libmkl_scalapack_lp64.so,/opt/intel/oneapi/mkl/latest/lib/libmkl_blacs_openmpi_lp64.so]"
--with-mkl_pardiso=1 --with-mkl_pardiso-include=/opt/intel/oneapi/mkl/latest/include
--with-mkl_pardiso-lib="[/opt/intel/oneapi/mkl/latest/lib/intel64/libmkl_intel_lp64.so]"
--with-hdf5=1 --with-hdf5-include=/usr/include/openmpi-x86_64
--with-hdf5-lib="[/usr/lib64/openmpi/lib/libhdf5.so]"
--with-pastix=0 --download-pastix=no --with-hwloc=1
--with-hwloc-dir=/home/salzman/local/i4_gcc --download-hwloc=no
--with-ptscotch-include=/home/salzman/local/i4_gcc/include
--with-ptscotch-lib="[/home/salzman/local/i4_gcc/lib/libptscotch.a,/home/salzman/local/i4_gcc/lib/libptscotcherr.a,/home/salzman/local/i4_gcc/lib/libptscotcherrexit.a,/home/salzman/local/i4_gcc/lib/libscotch.a,/home/salzman/local/i4_gcc/lib/libscotcherr.a,/home/salzman/local/i4_gcc/lib/libscotcherrexit.a]"
--with-hypre=1 --download-hypre=yes --with-suitesparse=1
--with-suitesparse-include=/home/salzman/local/i4_gcc/include
--with-suitesparse-lib="[/home/salzman/local/i4_gcc/lib/libsuitesparseconfig.so,/home/salzman/local/i4_gcc/lib/libumfpack.so,/home/salzman/local/i4_gcc/lib/libklu.so,/home/salzman/local/i4_gcc/lib/libcholmod.so,/home/salzman/local/i4_gcc/lib/libspqr.so,/home/salzman/local/i4_gcc/lib/libcolamd.so,/home/salzman/local/i4_gcc/lib/libccolamd.so,/home/salzman/local/i4_gcc/lib/libcamd.so,/home/salzman/local/i4_gcc/lib/libamd.so,/home/salzman/local/i4_gcc/lib/libmetis.so]"
--download-suitesparse=no --with-python-exec=python3.12 --have-numpy=1
---with-petsc4py=1 ---with-petsc4py-test-np=4 ---with-mpi4py=1
--prefix=/home/salzman/local/i4_gcc/real_arithmetic
COPTFLAGS="-O3 -g " CXXOPTFLAGS="-O3 -g " FOPTFLAGS="-O3 -g "
[0]PETSC ERROR: #1 MatSetValues_SeqAIJ() at /home/salzman/devel/PETSc/petsc/src/mat/impls/aij/seq/aij.c:426
[0]PETSC ERROR: #2 MatSetValues() at /home/salzman/devel/PETSc/petsc/src/mat/interface/matrix.c:1543
[0]PETSC ERROR: #3 MatSetSeqMats_MPIAIJ() at /home/salzman/devel/PETSc/petsc/src/mat/impls/aij/mpi/mpiov.c:2965
[0]PETSC ERROR: #4 MatCreateSubMatricesMPI_MPIXAIJ() at /home/salzman/devel/PETSc/petsc/src/mat/impls/aij/mpi/mpiov.c:3163
[0]PETSC ERROR: #5 MatCreateSubMatricesMPI_MPIAIJ() at /home/salzman/devel/PETSc/petsc/src/mat/impls/aij/mpi/mpiov.c:3196
[0]PETSC ERROR: #6 MatCreateSubMatricesMPI() at /home/salzman/devel/PETSc/petsc/src/mat/interface/matrix.c:7293
[0]PETSC ERROR: #7 main() at subnb.c:181
[0]PETSC ERROR: No PETSc Option Table entries
[0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov----------
--------------------------------------------------------------------------

This message comes from executing the attached test (compared with the July 
test, I simplified it by removing the block size from the matrix used for 
extraction). In proc_xx_output.txt you will find the output of running the 
code with the -ok option, i.e. with irow/idxr and icol/idxc identical, which 
requests a square sub-block for colour 0 distributed across the first two 
processes.

As expected, in this case we obtain the sub-block made of the terms at global 
indices 0, 3, 6, 9, distributed across processes 0 and 1 (two rows per process).

When asking for a rectangular sub-block (i.e. with no option), it crashes on 
process 0 with 'Column too large: col 4 max 3'. This puzzles me, since on that 
process I ask for 4 rows and only 2 columns.
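
To make the call pattern concrete, here is a stripped-down sketch of what I do 
(this is not subnb.c verbatim; the index values, the helper name and the 
variable names are only illustrative). A is the 12x12 MPIAIJ matrix assembled 
on PETSC_COMM_WORLD with 3 processes, and each rank of the sub-communicator 
lists, in global numbering, the rows and columns it wants to own in the 
extracted matrix:

#include <petscmat.h>

/* Sketch only: extract one rectangular sub-matrix per colour of a split
   communicator from a 12x12 MPIAIJ matrix A living on PETSC_COMM_WORLD (3 ranks). */
static PetscErrorCode ExtractRectangular(Mat A)
{
  MPI_Comm    subcomm;
  PetscMPIInt rank;
  IS          isrow, iscol;
  Mat        *submat;

  PetscFunctionBeginUser;
  PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));
  /* colour 0 = ranks 0 and 1, colour 1 = rank 2 */
  PetscCallMPI(MPI_Comm_split(PETSC_COMM_WORLD, rank < 2 ? 0 : 1, rank, &subcomm));

  {
    /* Global indices of A; the values are purely illustrative. */
    const PetscInt rows[2] = {rank == 0 ? 0 : 6, rank == 0 ? 3 : 9}; /* 2 rows per rank   */
    const PetscInt cols[1] = {rank == 0 ? 0 : 9};                    /* 1 column per rank */
    PetscCall(ISCreateGeneral(subcomm, 2, rows, PETSC_COPY_VALUES, &isrow));
    PetscCall(ISCreateGeneral(subcomm, 1, cols, PETSC_COPY_VALUES, &iscol));
  }

  /* Collective on A's communicator; the sub-matrix lives on the IS communicator. */
  PetscCall(MatCreateSubMatricesMPI(A, 1, &isrow, &iscol, MAT_INITIAL_MATRIX, &submat));
  PetscCall(MatView(submat[0], PETSC_VIEWER_STDOUT_(subcomm)));

  PetscCall(MatDestroySubMatrices(1, &submat));
  PetscCall(ISDestroy(&isrow));
  PetscCall(ISDestroy(&iscol));
  PetscCallMPI(MPI_Comm_free(&subcomm));
  PetscFunctionReturn(PETSC_SUCCESS);
}

When the row and column lists are identical (the -ok case), this kind of call 
runs fine for me; it is the rectangular variant, with fewer columns than rows 
per rank, that triggers the error above.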

Otherwise, I mentioned the dense aspect of the matrix in ex183.c because, in 
that case, every term is nonzero no matter what selection is requested. If 
there is an issue with the way the selection is coded in the user program, I 
suspect it is masked by that full graph representation. That may not be the 
case, though; I should test it.
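
For that test I would simply swap the dense matrix for a genuinely sparse one, 
along the lines of the sketch below (a plain tridiagonal matrix, just as an 
illustration; N and A are placeholder names), so that a wrongly coded 
selection hits structurally zero entries instead of being hidden:

/* Sketch: assemble a sparse (tridiagonal) test matrix on PETSC_COMM_WORLD,
   to be used as the matrix from which sub-matrices are extracted. */
Mat      A;
PetscInt i, rstart, rend, N = 12;

PetscCall(MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, N, N, 3, NULL, 2, NULL, &A));
PetscCall(MatGetOwnershipRange(A, &rstart, &rend));
for (i = rstart; i < rend; i++) {
  if (i > 0) PetscCall(MatSetValue(A, i, i - 1, -1.0, INSERT_VALUES));
  PetscCall(MatSetValue(A, i, i, 2.0, INSERT_VALUES));
  if (i < N - 1) PetscCall(MatSetValue(A, i, i + 1, -1.0, INSERT_VALUES));
}
PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));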

I'll take a look at ex23.c.

Thanks,

A.S.



Le 25/08/2025 à 17:55, Mark Adams a écrit :
> Ah, OK, never say never.
>
> MatCreateSubMatrices seems to support creating a new matrix with the 
> communicator of the IS.
> It just needs to read from the input matrix and does not use it for 
> communication, so it can do that.
>
> As far as rectangular matrices, there is no reason not to support that 
> (the row IS and column IS can be distinct).
> Can you send the whole error message?
> There may not be a test that does this, but src/mat/tests/ex23.c looks 
> like it may be a rectangular matrix output.
>
> And, it should not matter if the input matrix has a 100% full sparse 
> matrix. It is still MatAIJ.
> The semantics and API are the same for sparse or dense matrices.
>
> Thanks,
> Mark
>
> On Mon, Aug 25, 2025 at 7:31 AM Alexis SALZMAN 
> <alexis.salzman at ec-nantes.fr> wrote:
>
>     Hi,
>
>     Thanks for your answer, Mark. Perhaps MatCreateSubMatricesMPI is
>     the only PETSc function that acts on a sub-communicator — I'm not
>     sure — but it's clear that there's no ambiguity on that point. The
>     first line of the documentation for that function states that it
>     'may live on subcomms'. This is confirmed by the
>     'src/mat/tests/ex183.c' test case. I used this test case to
>     understand the function, which helped me with my code and the
>     example I provided in my initial post. Unfortunately, in this
>     example, the matrix from which the sub-matrices are extracted is
>     dense, even though it uses a sparse structure. This does not
>     clarify how to define sub-matrices when extracting from a sparse
>     distributed matrix. Since my initial post, I have discovered that
>     having more columns than rows can also result in the same error
>     message.
>
>     So, my questions boil down to:
>
>     Can MatCreateSubMatricesMPI extract rectangular matrices from a
>     square distributed sparse matrix?
>
>     If not, the fact that only square matrices can be extracted in
>     this context should perhaps be mentioned in the documentation.
>
>     If so, I would be very grateful for any assistance in defining an
>     IS pair in this context.
>
>     Regards
>
>     A.S.
>
>     Le 27/07/2025 à 00:15, Mark Adams a écrit :
>>     First, you can not mix communicators in PETSc calls in general
>>     (ever?), but this error looks like you might be asking for a row
>>     from the matrix that does not exist.
>>     You should start with a PETSc example code. Test it and modify it
>>     to suit your needs.
>>
>>     Good luck,
>>     Mark
>>
>>     On Fri, Jul 25, 2025 at 9:31 AM Alexis SALZMAN
>>     <alexis.salzman at ec-nantes.fr> wrote:
>>
>>         Hi,
>>
>>         As I am relatively new to Petsc, I may have misunderstood how
>>         to use the
>>         MatCreateSubMatricesMPI function. The attached code is tuned
>>         for three
>>         processes and extracts one matrix for each colour of a
>>         subcommunicator
>>         that has been created using the MPI_Comm_split function from
>>         an MPIAIJ
>>         matrix. The following error message appears when the code is
>>         set to its
>>         default configuration (i.e. when a rectangular matrix is
>>         extracted with
>>         more rows than columns for colour 0):
>>
>>         [0]PETSC ERROR: --------------------- Error Message
>>         --------------------------------------------------------------
>>         [0]PETSC ERROR: Argument out of range
>>         [0]PETSC ERROR: Column too large: col 4 max 3
>>         [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
>>         [0]PETSC ERROR: Petsc Release Version 3.22.2, unknown
>>
>>         ... petsc git hash 2a89477b25f compiled on a dell i9 computer
>>         with Gcc
>>         14.3, mkl 2025.2, .....
>>         [0]PETSC ERROR: #1 MatSetValues_SeqAIJ() at
>>         ...petsc/src/mat/impls/aij/seq/aij.c:426
>>         [0]PETSC ERROR: #2 MatSetValues() at
>>         ...petsc/src/mat/interface/matrix.c:1543
>>         [0]PETSC ERROR: #3 MatSetSeqMats_MPIAIJ() at
>>         .../petsc/src/mat/impls/aij/mpi/mpiov.c:2965
>>         [0]PETSC ERROR: #4 MatCreateSubMatricesMPI_MPIXAIJ() at
>>         .../petsc/src/mat/impls/aij/mpi/mpiov.c:3163
>>         [0]PETSC ERROR: #5 MatCreateSubMatricesMPI_MPIAIJ() at
>>         .../petsc/src/mat/impls/aij/mpi/mpiov.c:3196
>>         [0]PETSC ERROR: #6 MatCreateSubMatricesMPI() at
>>         .../petsc/src/mat/interface/matrix.c:7293
>>         [0]PETSC ERROR: #7 main() at sub.c:169
>>
>>         When the '-ok' option is selected, the code extracts a square
>>         matrix for
>>         colour 0, which runs smoothly in this case. Selecting the
>>         '-trans'
>>         option swaps the row and column selection indices, providing a
>>         transposed submatrix smoothly. For colour 1, which uses only
>>         one process
>>         and is therefore sequential, rectangular extraction is OK
>>         regardless of
>>         the shape.
>>
>>         Is this dependency on the shape expected? Have I missed an
>>         important
>>         tuning step somewhere?
>>
>>         Thank you in advance for any clarification.
>>
>>         Regards
>>
>>         A.S.
>>
>>         P.S.: I'm sorry, but as I'm leaving my office for the
>>         following weeks
>>         this evening, I won't be very responsive during this period.
>>
>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: subnb.c
Type: text/x-csrc
Size: 6093 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20250825/c20c8c82/attachment-0001.bin>
-------------- next part --------------
rstart 0 rend 4
Mat Object: 3 MPI processes
  type: mpiaij
  row 0:   (0, 101.)    (3, 104.)    (6, 107.)    (9, 110.)   
  row 1:   (2, 203.)    (5, 206.)    (8, 209.)    (11, 212.)   
  row 2:   (1, 302.)    (4, 305.)    (7, 308.)    (10, 311.)   
  row 3:   (0, 401.)    (3, 404.)    (6, 407.)    (9, 410.)   
  row 4:   (2, 503.)    (5, 506.)    (8, 509.)    (11, 512.)   
  row 5:   (1, 602.)    (4, 605.)    (7, 608.)    (10, 611.)   
  row 6:   (0, 701.)    (3, 704.)    (6, 707.)    (9, 710.)   
  row 7:   (2, 803.)    (5, 806.)    (8, 809.)    (11, 812.)   
  row 8:   (1, 902.)    (4, 905.)    (7, 908.)    (10, 911.)   
  row 9:   (0, 1001.)    (3, 1004.)    (6, 1007.)    (9, 1010.)   
  row 10:   (2, 1103.)    (5, 1106.)    (8, 1109.)    (11, 1112.)   
  row 11:   (1, 1202.)    (4, 1205.)    (7, 1208.)    (10, 1211.)   
idxr proc
IS Object: 2 MPI processes
  type: general
[0] Number of indices in set 2
[0] 0 0
[0] 1 3
[1] Number of indices in set 2
[1] 0 6
[1] 1 9
idxc proc
IS Object: 2 MPI processes
  type: general
[0] Number of indices in set 2
[0] 0 0
[0] 1 3
[1] Number of indices in set 2
[1] 0 6
[1] 1 9
Mat Object: 2 MPI processes
  type: mpiaij
  row 0:   (0, 101.)    (1, 104.)    (2, 107.)    (3, 110.)   
  row 1:   (0, 401.)    (1, 404.)    (2, 407.)    (3, 410.)   
  row 2:   (0, 701.)    (1, 704.)    (2, 707.)    (3, 710.)   
  row 3:   (0, 1001.)    (1, 1004.)    (2, 1007.)    (3, 1010.)   
rstart 0 rend 2
local row 0: ( 0 , 1.010000e+02) ( 1 , 1.040000e+02) ( 2 , 1.070000e+02) ( 3 , 1.100000e+02)
local row 1: ( 0 , 4.010000e+02) ( 1 , 4.040000e+02) ( 2 , 4.070000e+02) ( 3 , 4.100000e+02)
-------------- next part --------------
rstart 4 rend 8
idxr proc
idxc proc
rstart 2 rend 4
local row 2: ( 0 , 7.010000e+02) ( 1 , 7.040000e+02) ( 2 , 7.070000e+02) ( 3 , 7.100000e+02)
local row 3: ( 0 , 1.001000e+03) ( 1 , 1.004000e+03) ( 2 , 1.007000e+03) ( 3 , 1.010000e+03)
-------------- next part --------------
rstart 8 rend 12
idxr proc
IS Object: 1 MPI process
  type: general
Number of indices in set 12
0 0
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 10
11 11
idxc proc
IS Object: 1 MPI process
  type: general
Number of indices in set 4
0 0
1 1
2 8
3 9
Mat Object: 1 MPI process
  type: seqaij
row 0: (0, 101.)  (3, 110.) 
row 1: (2, 209.) 
row 2: (1, 302.) 
row 3: (0, 401.)  (3, 410.) 
row 4: (2, 509.) 
row 5: (1, 602.) 
row 6: (0, 701.)  (3, 710.) 
row 7: (2, 809.) 
row 8: (1, 902.) 
row 9: (0, 1001.)  (3, 1010.) 
row 10: (2, 1109.) 
row 11: (1, 1202.) 
rstart 0 rend 12
local row 0: ( 0 , 1.010000e+02) ( 3 , 1.100000e+02)
local row 1: ( 2 , 2.090000e+02)
local row 2: ( 1 , 3.020000e+02)
local row 3: ( 0 , 4.010000e+02) ( 3 , 4.100000e+02)
local row 4: ( 2 , 5.090000e+02)
local row 5: ( 1 , 6.020000e+02)
local row 6: ( 0 , 7.010000e+02) ( 3 , 7.100000e+02)
local row 7: ( 2 , 8.090000e+02)
local row 8: ( 1 , 9.020000e+02)
local row 9: ( 0 , 1.001000e+03) ( 3 , 1.010000e+03)
local row 10: ( 2 , 1.109000e+03)
local row 11: ( 1 , 1.202000e+03)

