[petsc-users] MatCreateSubMatricesMPI strange behavior
Alexis SALZMAN
alexis.salzman at ec-nantes.fr
Mon Aug 25 13:00:54 CDT 2025
Thanks, Mark, for your attention.
The uncleaned error message (I had trimmed it in my July post) is as follows:
[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: Argument out of range
[0]PETSC ERROR: Column too large: col 4 max 3
[0]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!dFZ4KSyqnoKD_8HJEOBBrLiK5TUCQmZbw09Dxau1D-3pxswHNP1D3HpEP-nXcrUdppQnRXo6rVLtt26Bd50bK6i7-_w38dW0qu4$ for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.22.2, unknown
[0]PETSC ERROR: subnb with 3 MPI process(es) and PETSC_ARCH on pc-str97.ec-nantes.fr by salzman Mon Aug 25 19:11:37 2025
[0]PETSC ERROR: Configure options: PETSC_ARCH=real_fc41_Release_gcc_i4
PETSC_DIR=/home/salzman/devel/ExternalLib/build/PETSC/petsc --doCleanup=1
--with-scalar-type=real --known-level1-dcache-linesize=64 --with-cc=gcc
--CFLAGS="-fPIC " --CC_LINKER_FLAGS=-fopenmp --with-cxx=g++
--with-cxx-dialect=c++20 --CXXFLAGS="-fPIC " --CXX_LINKER_FLAGS=-fopenmp
--with-fc=gfortran --FFLAGS="-fPIC " --FC_LINKER_FLAGS=-fopenmp
--with-debugging=0 --with-fortran-bindings=0 --with-fortran-kernels=1
--with-mpi-compilers=0 --with-mpi-include=/usr/include/openmpi-x86_64
--with-mpi-lib="[/usr/lib64/openmpi/lib/libmpi.so,/usr/lib64/openmpi/lib/libmpi.so,/usr/lib64/openmpi/lib/libmpi_mpifh.so]"
--with-blas-lib="[/opt/intel/oneapi/mkl/latest/lib/libmkl_intel_lp64.so,/opt/intel/oneapi/mkl/latest/lib/libmkl_gnu_thread.so,/opt/intel/oneapi/mkl/latest/lib/libmkl_core.so]"
--with-lapack-lib="[/opt/intel/oneapi/mkl/latest/lib/libmkl_intel_lp64.so,/opt/intel/oneapi/mkl/latest/lib/libmkl_gnu_thread.so,/opt/intel/oneapi/mkl/latest/lib/libmkl_core.so]"
--with-mumps=1 --with-mumps-include=/home/salzman/local/i4_gcc/include
--with-mumps-lib="[/home/salzman/local/i4_gcc/lib/libdmumps.so,/home/salzman/local/i4_gcc/lib/libmumps_common.so,/home/salzman/local/i4_gcc/lib/libpord.so]"
--with-scalapack-lib="[/opt/intel/oneapi/mkl/latest/lib/libmkl_scalapack_lp64.so,/opt/intel/oneapi/mkl/latest/lib/libmkl_blacs_openmpi_lp64.so]"
--with-mkl_pardiso=1 --with-mkl_pardiso-include=/opt/intel/oneapi/mkl/latest/include
--with-mkl_pardiso-lib="[/opt/intel/oneapi/mkl/latest/lib/intel64/libmkl_intel_lp64.so]"
--with-hdf5=1 --with-hdf5-include=/usr/include/openmpi-x86_64
--with-hdf5-lib="[/usr/lib64/openmpi/lib/libhdf5.so]" --with-pastix=0
--download-pastix=no --with-hwloc=1 --with-hwloc-dir=/home/salzman/local/i4_gcc
--download-hwloc=no --with-ptscotch-include=/home/salzman/local/i4_gcc/include
--with-ptscotch-lib="[/home/salzman/local/i4_gcc/lib/libptscotch.a,/home/salzman/local/i4_gcc/lib/libptscotcherr.a,/home/salzman/local/i4_gcc/lib/libptscotcherrexit.a,/home/salzman/local/i4_gcc/lib/libscotch.a,/home/salzman/local/i4_gcc/lib/libscotcherr.a,/home/salzman/local/i4_gcc/lib/libscotcherrexit.a]"
--with-hypre=1 --download-hypre=yes --with-suitesparse=1
--with-suitesparse-include=/home/salzman/local/i4_gcc/include
--with-suitesparse-lib="[/home/salzman/local/i4_gcc/lib/libsuitesparseconfig.so,/home/salzman/local/i4_gcc/lib/libumfpack.so,/home/salzman/local/i4_gcc/lib/libklu.so,/home/salzman/local/i4_gcc/lib/libcholmod.so,/home/salzman/local/i4_gcc/lib/libspqr.so,/home/salzman/local/i4_gcc/lib/libcolamd.so,/home/salzman/local/i4_gcc/lib/libccolamd.so,/home/salzman/local/i4_gcc/lib/libcamd.so,/home/salzman/local/i4_gcc/lib/libamd.so,/home/salzman/local/i4_gcc/lib/libmetis.so]"
--download-suitesparse=no --with-python-exec=python3.12 --have-numpy=1
---with-petsc4py=1 ---with-petsc4py-test-np=4 ---with-mpi4py=1
--prefix=/home/salzman/local/i4_gcc/real_arithmetic COPTFLAGS="-O3 -g "
CXXOPTFLAGS="-O3 -g " FOPTFLAGS="-O3 -g "
[0]PETSC ERROR: #1 MatSetValues_SeqAIJ() at /home/salzman/devel/PETSc/petsc/src/mat/impls/aij/seq/aij.c:426
[0]PETSC ERROR: #2 MatSetValues() at /home/salzman/devel/PETSc/petsc/src/mat/interface/matrix.c:1543
[0]PETSC ERROR: #3 MatSetSeqMats_MPIAIJ() at /home/salzman/devel/PETSc/petsc/src/mat/impls/aij/mpi/mpiov.c:2965
[0]PETSC ERROR: #4 MatCreateSubMatricesMPI_MPIXAIJ() at /home/salzman/devel/PETSc/petsc/src/mat/impls/aij/mpi/mpiov.c:3163
[0]PETSC ERROR: #5 MatCreateSubMatricesMPI_MPIAIJ() at /home/salzman/devel/PETSc/petsc/src/mat/impls/aij/mpi/mpiov.c:3196
[0]PETSC ERROR: #6 MatCreateSubMatricesMPI() at /home/salzman/devel/PETSc/petsc/src/mat/interface/matrix.c:7293
[0]PETSC ERROR: #7 main() at subnb.c:181
[0]PETSC ERROR: No PETSc Option Table entries
[0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov----------
--------------------------------------------------------------------------
This message comes from executing the attached test (compared to the July
test, I simplified it by removing the block size from the matrix used for
extraction). In the attached proc_xx_output.txt files you will find the
output of running the code with the -ok option (i.e. irow/idxr and
icol/idxc are the same, giving a square sub-block for colour 0 distributed
across the first two processes).
As expected, in this case we obtain the sub-block built from rows/columns
0, 3, 6 and 9, distributed across processes 0 and 1 (two rows per process).
When asking for a rectangular sub-block (i.e. with no option), it crashes
on process 0 with "Column too large: col 4 max 3". Why col 4, max 3? On
this process I ask for 4 rows and 2 columns.
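For reference, here is a stripped-down sketch of the shape of the calls made
on the colour-0 ranks. It is not the attached subnb.c verbatim: the function
name is made up, and the index values are those of the -ok case shown in the
attached output; only the column index list differs in the default
rectangular run.

#include <petscmat.h>

/* A       : the 12x12 MPIAIJ matrix living on PETSC_COMM_WORLD        */
/* subcomm : the colour-0 communicator (world ranks 0 and 1) obtained  */
/*           from MPI_Comm_split                                       */
static PetscErrorCode ExtractColour0(Mat A, MPI_Comm subcomm)
{
  PetscMPIInt    subrank;
  IS             irow, icol;
  Mat           *submats;
  const PetscInt rows[2][2] = {{0, 3}, {6, 9}}; /* global rows requested by sub-ranks 0 and 1 */

  PetscFunctionBeginUser;
  PetscCallMPI(MPI_Comm_rank(subcomm, &subrank));
  /* each sub-communicator rank lists the global rows it wants to own locally */
  PetscCall(ISCreateGeneral(subcomm, 2, rows[subrank], PETSC_COPY_VALUES, &irow));
  /* -ok case: the column IS is identical to the row IS, giving the square 4x4 sub-block */
  PetscCall(ISCreateGeneral(subcomm, 2, rows[subrank], PETSC_COPY_VALUES, &icol));
  /* one submatrix is requested; it lives on subcomm */
  PetscCall(MatCreateSubMatricesMPI(A, 1, &irow, &icol, MAT_INITIAL_MATRIX, &submats));
  PetscCall(MatView(submats[0], PETSC_VIEWER_STDOUT_(subcomm)));
  PetscCall(MatDestroySubMatrices(1, &submats));
  PetscCall(ISDestroy(&irow));
  PetscCall(ISDestroy(&icol));
  PetscFunctionReturn(PETSC_SUCCESS);
}

With these index sets the extracted matrix matches the 4x4 block shown in
the attached output.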
Otherwise, I mentioned the dense aspect of the matrix in ex183.c because, in
that case, no matter what selection is requested, all terms are non-null. If
there is an issue with the way the selection is coded in the user program, I
think it would be masked by the full graph representation. However, this may
not be the case; I should test it.
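A quick way to test that (a hypothetical sketch, not the actual ex183.c
assembly code; here n stands for the global size of the square AIJ matrix A)
would be to skip some entries while filling the matrix so that its graph is
no longer full, and then rerun the same extraction:

PetscInt rstart, rend;
PetscCall(MatGetOwnershipRange(A, &rstart, &rend));
for (PetscInt i = rstart; i < rend; i++) {
  for (PetscInt j = 0; j < n; j++) {  /* n: global size (assumption) */
    PetscScalar v;
    if ((i + j) % 3 == 0) continue;   /* deliberately leave holes in the graph */
    v = 100.0 * (i + 1) + (j + 1);
    PetscCall(MatSetValues(A, 1, &i, 1, &j, &v, INSERT_VALUES));
  }
}
PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));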
I'll take a look at ex23.c.
Thanks,
A.S.
On 25/08/2025 at 17:55, Mark Adams wrote:
> Ah, OK, never say never.
>
> MatCreateSubMatrices seems to support creating a new matrix with the
> communicator of the IS.
> It just needs to read from the input matrix and does not use it for
> communication, so it can do that.
>
> As for rectangular matrices, there is no reason not to support that
> (the row IS and column IS can be distinct).
> Can you send the whole error message?
> There may not be a test that does this, but src/mat/tests/ex23.c looks
> like it may be a rectangular matrix output.
>
> And it should not matter if the input matrix is a 100% full sparse
> matrix. It is still MatAIJ.
> The semantics and API are the same for sparse or dense matrices.
>
> Thanks,
> Mark
>
> On Mon, Aug 25, 2025 at 7:31 AM Alexis SALZMAN
> <alexis.salzman at ec-nantes.fr> wrote:
>
> Hi,
>
> Thanks for your answer, Mark. Perhaps MatCreateSubMatricesMPI is
> the only PETSc function that acts on a sub-communicator — I'm not
> sure — but it's clear that there's no ambiguity on that point. The
> first line of the documentation for that function states that it
> 'may live on subcomms'. This is confirmed by the
> 'src/mat/tests/ex183.c' test case. I used this test case to
> understand the function, which helped me with my code and the
> example I provided in my initial post. Unfortunately, in this
> example, the matrix from which the sub-matrices are extracted is
> dense, even though it uses a sparse structure. This does not
> clarify how to define sub-matrices when extracting from a sparse
> distributed matrix. Since my initial post, I have discovered that
> having more columns than rows can also result in the same error
> message.
>
> So, my questions boil down to:
>
> Can MatCreateSubMatricesMPI extract rectangular matrices from a
> square distributed sparse matrix?
>
> If not, the fact that only square matrices can be extracted in
> this context should perhaps be mentioned in the documentation.
>
> If so, I would be very grateful for any assistance in defining an
> IS pair in this context.
>
> Regards
>
> A.S.
>
> On 27/07/2025 at 00:15, Mark Adams wrote:
>> First, you cannot mix communicators in PETSc calls in general
>> (ever?), but this error looks like you might be asking for a row
>> from the matrix that does not exist.
>> You should start with a PETSc example code. Test it and modify it
>> to suit your needs.
>>
>> Good luck,
>> Mark
>>
>> On Fri, Jul 25, 2025 at 9:31 AM Alexis SALZMAN
>> <alexis.salzman at ec-nantes.fr> wrote:
>>
>> Hi,
>>
>> As I am relatively new to PETSc, I may have misunderstood how
>> to use the MatCreateSubMatricesMPI function. The attached code
>> is tuned for three processes; from an MPIAIJ matrix, it extracts
>> one matrix for each colour of a subcommunicator created with the
>> MPI_Comm_split function. The following error message appears
>> when the code is run in its default configuration (i.e. when a
>> rectangular matrix with more rows than columns is extracted for
>> colour 0):
>>
>> [0]PETSC ERROR: --------------------- Error Message
>> --------------------------------------------------------------
>> [0]PETSC ERROR: Argument out of range
>> [0]PETSC ERROR: Column too large: col 4 max 3
>> [0]PETSC ERROR: See
>> https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!ZqH097BZ0G0O3WI7RWrwIKFNpyk0czSWEqfusAeTlgEygAffwpgBUzsLw1TIoGkjZ3mYG-NRQxxFoxU4y8EyY0ofiz9I43Qwe0w$
>> for trouble shooting.
>> [0]PETSC ERROR: Petsc Release Version 3.22.2, unknown
>>
>> ... PETSc git hash 2a89477b25f, compiled on a Dell i9 computer
>> with GCC 14.3, MKL 2025.2, ...
>> [0]PETSC ERROR: #1 MatSetValues_SeqAIJ() at
>> ...petsc/src/mat/impls/aij/seq/aij.c:426
>> [0]PETSC ERROR: #2 MatSetValues() at
>> ...petsc/src/mat/interface/matrix.c:1543
>> [0]PETSC ERROR: #3 MatSetSeqMats_MPIAIJ() at
>> .../petsc/src/mat/impls/aij/mpi/mpiov.c:2965
>> [0]PETSC ERROR: #4 MatCreateSubMatricesMPI_MPIXAIJ() at
>> .../petsc/src/mat/impls/aij/mpi/mpiov.c:3163
>> [0]PETSC ERROR: #5 MatCreateSubMatricesMPI_MPIAIJ() at
>> .../petsc/src/mat/impls/aij/mpi/mpiov.c:3196
>> [0]PETSC ERROR: #6 MatCreateSubMatricesMPI() at
>> .../petsc/src/mat/interface/matrix.c:7293
>> [0]PETSC ERROR: #7 main() at sub.c:169
>>
>> When the '-ok' option is selected, the code extracts a square
>> matrix for colour 0, and this case runs smoothly. Selecting the
>> '-trans' option swaps the row and column selection indices, and
>> the transposed submatrix is also produced without problems. For
>> colour 1, which uses only one process and is therefore
>> sequential, rectangular extraction works whatever the shape.
>>
>> Is this dependency on the shape expected? Have I missed an
>> important
>> tuning step somewhere?
>>
>> Thank you in advance for any clarification.
>>
>> Regards
>>
>> A.S.
>>
>> P.S.: I'm sorry, but as I'm leaving my office this evening for
>> the next few weeks, I won't be very responsive during that period.
>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: subnb.c
Type: text/x-csrc
Size: 6093 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20250825/c20c8c82/attachment-0001.bin>
-------------- next part --------------
rstart 0 rend 4
Mat Object: 3 MPI processes
type: mpiaij
row 0: (0, 101.) (3, 104.) (6, 107.) (9, 110.)
row 1: (2, 203.) (5, 206.) (8, 209.) (11, 212.)
row 2: (1, 302.) (4, 305.) (7, 308.) (10, 311.)
row 3: (0, 401.) (3, 404.) (6, 407.) (9, 410.)
row 4: (2, 503.) (5, 506.) (8, 509.) (11, 512.)
row 5: (1, 602.) (4, 605.) (7, 608.) (10, 611.)
row 6: (0, 701.) (3, 704.) (6, 707.) (9, 710.)
row 7: (2, 803.) (5, 806.) (8, 809.) (11, 812.)
row 8: (1, 902.) (4, 905.) (7, 908.) (10, 911.)
row 9: (0, 1001.) (3, 1004.) (6, 1007.) (9, 1010.)
row 10: (2, 1103.) (5, 1106.) (8, 1109.) (11, 1112.)
row 11: (1, 1202.) (4, 1205.) (7, 1208.) (10, 1211.)
idxr proc
IS Object: 2 MPI processes
type: general
[0] Number of indices in set 2
[0] 0 0
[0] 1 3
[1] Number of indices in set 2
[1] 0 6
[1] 1 9
idxc proc
IS Object: 2 MPI processes
type: general
[0] Number of indices in set 2
[0] 0 0
[0] 1 3
[1] Number of indices in set 2
[1] 0 6
[1] 1 9
Mat Object: 2 MPI processes
type: mpiaij
row 0: (0, 101.) (1, 104.) (2, 107.) (3, 110.)
row 1: (0, 401.) (1, 404.) (2, 407.) (3, 410.)
row 2: (0, 701.) (1, 704.) (2, 707.) (3, 710.)
row 3: (0, 1001.) (1, 1004.) (2, 1007.) (3, 1010.)
rstart 0 rend 2
local row 0: ( 0 , 1.010000e+02) ( 1 , 1.040000e+02) ( 2 , 1.070000e+02) ( 3 , 1.100000e+02)
local row 1: ( 0 , 4.010000e+02) ( 1 , 4.040000e+02) ( 2 , 4.070000e+02) ( 3 , 4.100000e+02)
-------------- next part --------------
rstart 4 rend 8
idxr proc
idxc proc
rstart 2 rend 4
local row 2: ( 0 , 7.010000e+02) ( 1 , 7.040000e+02) ( 2 , 7.070000e+02) ( 3 , 7.100000e+02)
local row 3: ( 0 , 1.001000e+03) ( 1 , 1.004000e+03) ( 2 , 1.007000e+03) ( 3 , 1.010000e+03)
-------------- next part --------------
rstart 8 rend 12
idxr proc
IS Object: 1 MPI process
type: general
Number of indices in set 12
0 0
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 10
11 11
idxc proc
IS Object: 1 MPI process
type: general
Number of indices in set 4
0 0
1 1
2 8
3 9
Mat Object: 1 MPI process
type: seqaij
row 0: (0, 101.) (3, 110.)
row 1: (2, 209.)
row 2: (1, 302.)
row 3: (0, 401.) (3, 410.)
row 4: (2, 509.)
row 5: (1, 602.)
row 6: (0, 701.) (3, 710.)
row 7: (2, 809.)
row 8: (1, 902.)
row 9: (0, 1001.) (3, 1010.)
row 10: (2, 1109.)
row 11: (1, 1202.)
rstart 0 rend 12
local row 0: ( 0 , 1.010000e+02) ( 3 , 1.100000e+02)
local row 1: ( 2 , 2.090000e+02)
local row 2: ( 1 , 3.020000e+02)
local row 3: ( 0 , 4.010000e+02) ( 3 , 4.100000e+02)
local row 4: ( 2 , 5.090000e+02)
local row 5: ( 1 , 6.020000e+02)
local row 6: ( 0 , 7.010000e+02) ( 3 , 7.100000e+02)
local row 7: ( 2 , 8.090000e+02)
local row 8: ( 1 , 9.020000e+02)
local row 9: ( 0 , 1.001000e+03) ( 3 , 1.010000e+03)
local row 10: ( 2 , 1.109000e+03)
local row 11: ( 1 , 1.202000e+03)