[petsc-users] MatCreateSubMatricesMPI strange behavior

Alexis SALZMAN alexis.salzman at ec-nantes.fr
Tue Aug 26 05:50:10 CDT 2025


Mark, you were right and I was wrong about the dense matrix. Adding 
explicit zeros to the distributed matrix used to extract the 
sub-matrices (making it dense) in my test does not change the behaviour: 
there is still an error.

I am finding it increasingly difficult to understand the logic of the 
row and column 'IS' creation. I ran many tests before achieving the 
desired result, a rectangular sub-matrix (so extracting a rectangular or 
square sub-matrix does appear to be possible). Many other attempts, 
however, resulted in the same kind of error.

From what I observed, the test only works when the column selection 
size (size_c in the test) takes a specific value relative to the row 
selection size (size_r in the test) on proc 0 (rank 0 in both the 
communicator and the sub-communicator):

  * if size_r==2, it works whenever size_c<=2;
  * if 3<=size_r<=5, size_c==size_r is the only working case.

This occurs "regardless" of what is requested on proc 1 and of the 
actual indices in selr/selc (the settings must still be valid, though). 
In any case, this is certainly not an exhaustive analysis.

Many thanks to anyone who can explain to me the logic behind the 
construction of row and column 'IS'.
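For reference, the IS construction I am attempting follows roughly this pattern (a simplified sketch, not the attached test itself; A is the distributed MPIAIJ matrix, subcomm is the communicator obtained from MPI_Comm_split, and the index values are placeholders, not my actual selection):

```c
/* Sketch of the IS pair construction (simplified from my test).
 * Both ISs live on the sub-communicator and hold global indices
 * of the parent matrix A, as I understand the documentation. */
PetscInt size_r  = 4, size_c = 2;     /* per-rank selection sizes   */
PetscInt selr[]  = {0, 3, 6, 9};      /* global row indices (dummy) */
PetscInt selc[]  = {0, 3};            /* global col indices (dummy) */
IS       irow, icol;
Mat     *submat;

PetscCall(ISCreateGeneral(subcomm, size_r, selr, PETSC_COPY_VALUES, &irow));
PetscCall(ISCreateGeneral(subcomm, size_c, selc, PETSC_COPY_VALUES, &icol));
PetscCall(MatCreateSubMatricesMPI(A, 1, &irow, &icol,
                                  MAT_INITIAL_MATRIX, &submat));
```

With size_r != size_c on rank 0 this is the configuration that triggers the "Column too large" error for me.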

Regards

A.S.


Le 25/08/2025 à 20:00, Alexis SALZMAN a écrit :
>
> Thanks Mark for your attention.
>
> The uncleaned error message, compared to my post in July, is as follows:
>
> [0]PETSC ERROR: --------------------- Error Message 
> --------------------------------------------------------------
> [0]PETSC ERROR: Argument out of range
> [0]PETSC ERROR: Column too large: col 4 max 3
> [0]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!dWBkCu100EMuxu8ooVUnqSFN7OhzOBoNHAiwDYEQ5cJ921sU5hdFb-G24ounZFeUQgZkfWqGRX4iIHyQ-xLQElJst5RbKa2pGnk$  for trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.22.2, unknown
> [0]PETSC ERROR: subnb with 3 MPI process(es) and PETSC_ARCH  on 
> pc-str97.ec-nantes.fr by salzman Mon Aug 25 19:11:37 2025
> [0]PETSC ERROR: Configure options: PETSC_ARCH=real_fc41_Release_gcc_i4 
> PETSC_DIR=/home/salzman/devel/ExternalLib/build/PETSC/petsc 
> --doCleanup=1 --with-scalar-type=real --known-level1-dcach
> e-linesize=64 --with-cc=gcc --CFLAGS="-fPIC " 
> --CC_LINKER_FLAGS=-fopenmp --with-cxx=g++ --with-cxx-dialect=c++20 
> --CXXFLAGS="-fPIC " --CXX_LINKER_FLAGS=-fopenmp --with-fc=gfortran 
> --FFLAGS=
> "-fPIC " --FC_LINKER_FLAGS=-fopenmp --with-debugging=0 
> --with-fortran-bindings=0 --with-fortran-kernels=1 
> --with-mpi-compilers=0 --with-mpi-include=/usr/include/openmpi-x86_64 
> --with-mpi-li
> b="[/usr/lib64/openmpi/lib/libmpi.so,/usr/lib64/openmpi/lib/libmpi.so,/usr/lib64/openmpi/lib/libmpi_mpifh.so]" 
> --with-blas-lib="[/opt/intel/oneapi/mkl/latest/lib/libmkl_intel_lp64.so,/opt/i
> ntel/oneapi/mkl/latest/lib/libmkl_gnu_thread.so,/opt/intel/oneapi/mkl/latest/lib/libmkl_core.so]" 
> --with-lapack-lib="[/opt/intel/oneapi/mkl/latest/lib/libmkl_intel_lp64.so,/opt/intel/oneapi
> /mkl/latest/lib/libmkl_gnu_thread.so,/opt/intel/oneapi/mkl/latest/lib/libmkl_core.so]" 
> --with-mumps=1 --with-mumps-include=/home/salzman/local/i4_gcc/include 
> --with-mumps-lib="[/home/salzma
> n/local/i4_gcc/lib/libdmumps.so,/home/salzman/local/i4_gcc/lib/libmumps_common.so,/home/salzman/local/i4_gcc/lib/libpord.so]" 
> --with-scalapack-lib="[/opt/intel/oneapi/mkl/latest/lib/libmkl_
> scalapack_lp64.so,/opt/intel/oneapi/mkl/latest/lib/libmkl_blacs_openmpi_lp64.so]" 
> --with-mkl_pardiso=1 
> --with-mkl_pardiso-include=/opt/intel/oneapi/mkl/latest/include 
> --with-mkl_pardiso-lib
> ="[/opt/intel/oneapi/mkl/latest/lib/intel64/libmkl_intel_lp64.so]" 
> --with-hdf5=1 --with-hdf5-include=/usr/include/openmpi-x86_64 
> --with-hdf5-lib="[/usr/lib64/openmpi/lib/libhdf5.so]" --with
> -pastix=0 --download-pastix=no --with-hwloc=1 
> --with-hwloc-dir=/home/salzman/local/i4_gcc --download-hwloc=no 
> --with-ptscotch-include=/home/salzman/local/i4_gcc/include 
> --with-ptscotch-lib=
> "[/home/salzman/local/i4_gcc/lib/libptscotch.a,/home/salzman/local/i4_gcc/lib/libptscotcherr.a,/home/salzman/local/i4_gcc/lib/libptscotcherrexit.a,/home/salzman/local/i4_gcc/lib/libscotch.a
> ,/home/salzman/local/i4_gcc/lib/libscotcherr.a,/home/salzman/local/i4_gcc/lib/libscotcherrexit.a]" 
> --with-hypre=1 --download-hypre=yes --with-suitesparse=1 
> --with-suitesparse-include=/home/
> salzman/local/i4_gcc/include 
> --with-suitesparse-lib="[/home/salzman/local/i4_gcc/lib/libsuitesparseconfig.so,/home/salzman/local/i4_gcc/lib/libumfpack.so,/home/salzman/local/i4_gcc/lib/libk
> lu.so,/home/salzman/local/i4_gcc/lib/libcholmod.so,/home/salzman/local/i4_gcc/lib/libspqr.so,/home/salzman/local/i4_gcc/lib/libcolamd.so,/home/salzman/local/i4_gcc/lib/libccolamd.so,/home/s
> alzman/local/i4_gcc/lib/libcamd.so,/home/salzman/local/i4_gcc/lib/libamd.so,/home/salzman/local/i4_gcc/lib/libmetis.so]" 
> --download-suitesparse=no --with-python-exec=python3.12 --have-numpy
> =1 ---with-petsc4py=1 ---with-petsc4py-test-np=4 ---with-mpi4py=1 
> --prefix=/home/salzman/local/i4_gcc/real_arithmetic COPTFLAGS="-O3 -g 
> " CXXOPTFLAGS="-O3 -g " FOPTFLAGS="-O3 -g "
> [0]PETSC ERROR: #1 MatSetValues_SeqAIJ() at 
> /home/salzman/devel/PETSc/petsc/src/mat/impls/aij/seq/aij.c:426
> [0]PETSC ERROR: #2 MatSetValues() at 
> /home/salzman/devel/PETSc/petsc/src/mat/interface/matrix.c:1543
> [0]PETSC ERROR: #3 MatSetSeqMats_MPIAIJ() at 
> /home/salzman/devel/PETSc/petsc/src/mat/impls/aij/mpi/mpiov.c:2965
> [0]PETSC ERROR: #4 MatCreateSubMatricesMPI_MPIXAIJ() at 
> /home/salzman/devel/PETSc/petsc/src/mat/impls/aij/mpi/mpiov.c:3163
> [0]PETSC ERROR: #5 MatCreateSubMatricesMPI_MPIAIJ() at 
> /home/salzman/devel/PETSc/petsc/src/mat/impls/aij/mpi/mpiov.c:3196
> [0]PETSC ERROR: #6 MatCreateSubMatricesMPI() at 
> /home/salzman/devel/PETSc/petsc/src/mat/interface/matrix.c:7293
> [0]PETSC ERROR: #7 main() at subnb.c:181
> [0]PETSC ERROR: No PETSc Option Table entries
> [0]PETSC ERROR: ----------------End of Error Message -------send 
> entire error message to petsc-maint at mcs.anl.gov----------
> --------------------------------------------------------------------------
>
> This message comes from executing the attached test (I simplified the 
> test by removing the block size from the matrix used for extraction, 
> compared to the July test). In proc_xx_output.txt, you will find the 
> output from the code execution with the -ok option (i.e. irow/idxr and 
> icol/idxc are the same, i.e. a square sub-block for colour 0 
> distributed across the first two processes).
>
> As expected, in this case we obtain the 0,3,6,9 sub-block terms, which 
> are distributed across processes 0 and 1 (two rows per proc).
>
> When asking for a rectangular sub-block (i.e. with no option), it 
> crashes with "Column too large: col 4 max 3" on process 0. Why? I only 
> asked for 4 rows and 2 columns on this process.
>
> Otherwise, I mentioned the dense aspect of the matrix in ex183.c 
> because, in that case, no matter what selection is requested, all 
> terms are non-null. If there were an issue with the way the selection 
> is coded in the user program, I think it would be masked by the full 
> graph representation. However, this may not be the case; I should 
> test it.
>
> I'll take a look at ex23.c.
>
> Thanks,
>
> A.S.
>
>
>
> Le 25/08/2025 à 17:55, Mark Adams a écrit :
>> Ah, OK, never say never.
>>
>> MatCreateSubMatrices seems to support creating a new matrix with the 
>> communicator of the IS.
>> It just needs to read from the input matrix and does not use it for 
>> communication, so it can do that.
>>
>> As far as rectangular matrices, there is no reason not to support 
>> that (the row IS and column IS can be distinct).
>> Can you send the whole error message?
>> There may not be a test that does this, but src/mat/tests/ex23.c 
>> looks like it may be a rectangular matrix output.
>>
>> And it should not matter if the input matrix is a 100% full sparse 
>> matrix; it is still MatAIJ.
>> The semantics and API are the same for sparse or dense matrices.
>>
>> Thanks,
>> Mark
>>
>> On Mon, Aug 25, 2025 at 7:31 AM Alexis SALZMAN 
>> <alexis.salzman at ec-nantes.fr> wrote:
>>
>>     Hi,
>>
>>     Thanks for your answer, Mark. Perhaps MatCreateSubMatricesMPI is
>>     the only PETSc function that acts on a sub-communicator — I'm not
>>     sure — but it's clear that there's no ambiguity on that point.
>>     The first line of the documentation for that function states that
>>     it 'may live on subcomms'. This is confirmed by the
>>     'src/mat/tests/ex183.c' test case. I used this test case to
>>     understand the function, which helped me with my code and the
>>     example I provided in my initial post. Unfortunately, in this
>>     example, the matrix from which the sub-matrices are extracted is
>>     dense, even though it uses a sparse structure. This does not
>>     clarify how to define sub-matrices when extracting from a sparse
>>     distributed matrix. Since my initial post, I have discovered that
>>     having more columns than rows can also result in the same error
>>     message.
>>
>>     So, my questions boil down to:
>>
>>     Can MatCreateSubMatricesMPI extract rectangular matrices from a
>>     square distributed sparse matrix?
>>
>>     If not, the fact that only square matrices can be extracted in
>>     this context should perhaps be mentioned in the documentation.
>>
>>     If so, I would be very grateful for any assistance in defining an
>>     IS pair in this context.
>>
>>     Regards
>>
>>     A.S.
>>
>>     Le 27/07/2025 à 00:15, Mark Adams a écrit :
>>>     First, you can not mix communicators in PETSc calls in general
>>>     (ever?), but this error looks like you might be asking for a row
>>>     from the matrix that does not exist.
>>>     You should start with a PETSc example code. Test it and
>>>     modify it to suit your needs.
>>>
>>>     Good luck,
>>>     Mark
>>>
>>>     On Fri, Jul 25, 2025 at 9:31 AM Alexis SALZMAN
>>>     <alexis.salzman at ec-nantes.fr> wrote:
>>>
>>>         Hi,
>>>
>>>         As I am relatively new to Petsc, I may have misunderstood
>>>         how to use the
>>>         MatCreateSubMatricesMPI function. The attached code is tuned
>>>         for three
>>>         processes and extracts one matrix for each colour of a
>>>         subcommunicator
>>>         that has been created using the MPI_Comm_split function from
>>>         an MPIAIJ
>>>         matrix. The following error message appears when the code is
>>>         set to its
>>>         default configuration (i.e. when a rectangular matrix is
>>>         extracted with
>>>         more rows than columns for colour 0):
>>>
>>>         [0]PETSC ERROR: --------------------- Error Message
>>>         --------------------------------------------------------------
>>>         [0]PETSC ERROR: Argument out of range
>>>         [0]PETSC ERROR: Column too large: col 4 max 3
>>>         [0]PETSC ERROR: See
>>>         https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!ZqH097BZ0G0O3WI7RWrwIKFNpyk0czSWEqfusAeTlgEygAffwpgBUzsLw1TIoGkjZ3mYG-NRQxxFoxU4y8EyY0ofiz9I43Qwe0w$
>>>         for trouble shooting.
>>>         [0]PETSC ERROR: Petsc Release Version 3.22.2, unknown
>>>
>>>         ... petsc git hash 2a89477b25f compiled on a dell i9
>>>         computer with Gcc
>>>         14.3, mkl 2025.2, .....
>>>         [0]PETSC ERROR: #1 MatSetValues_SeqAIJ() at
>>>         ...petsc/src/mat/impls/aij/seq/aij.c:426
>>>         [0]PETSC ERROR: #2 MatSetValues() at
>>>         ...petsc/src/mat/interface/matrix.c:1543
>>>         [0]PETSC ERROR: #3 MatSetSeqMats_MPIAIJ() at
>>>         .../petsc/src/mat/impls/aij/mpi/mpiov.c:2965
>>>         [0]PETSC ERROR: #4 MatCreateSubMatricesMPI_MPIXAIJ() at
>>>         .../petsc/src/mat/impls/aij/mpi/mpiov.c:3163
>>>         [0]PETSC ERROR: #5 MatCreateSubMatricesMPI_MPIAIJ() at
>>>         .../petsc/src/mat/impls/aij/mpi/mpiov.c:3196
>>>         [0]PETSC ERROR: #6 MatCreateSubMatricesMPI() at
>>>         .../petsc/src/mat/interface/matrix.c:7293
>>>         [0]PETSC ERROR: #7 main() at sub.c:169
>>>
>>>         When the '-ok' option is selected, the code extracts a
>>>         square matrix for
>>>         colour 0, which runs smoothly in this case. Selecting the
>>>         '-trans'
>>>         option swaps the row and column selection indices, providing a
>>>         transposed submatrix smoothly. For colour 1, which uses only
>>>         one process
>>>         and is therefore sequential, rectangular extraction is OK
>>>         regardless of
>>>         the shape.
>>>
>>>         Is this dependency on the shape expected? Have I missed an
>>>         important
>>>         tuning step somewhere?
>>>
>>>         Thank you in advance for any clarification.
>>>
>>>         Regards
>>>
>>>         A.S.
>>>
>>>         P.S.: I'm sorry, but as I'm leaving my office for the
>>>         following weeks
>>>         this evening, I won't be very responsive during this period.
>>>
>>>


More information about the petsc-users mailing list