[petsc-users] SuperLU_Dist bug or "intentional error"

Junchao Zhang jczhang at mcs.anl.gov
Mon Mar 9 14:51:37 CDT 2020


Let me try it.  BTW, did you find the same code in master, at
https://gitlab.com/petsc/petsc/-/blob/master/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c ?
--Junchao Zhang


On Mon, Mar 9, 2020 at 2:46 PM Rochan Upadhyay <u.rochan at gmail.com> wrote:

> Hi Junchao,
> I doubt it was fixed, since a diff of
> src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c between the master branch
> and version 3.12.4 shows no changes.
> I am unable to compile the master version (configure.log attached), but I
> think you can recreate the problem by running the ex12.c program that
> I attached to my previous mail.
> Regards,
> Rochan
>
> On Mon, Mar 9, 2020 at 2:22 PM Junchao Zhang <jczhang at mcs.anl.gov> wrote:
>
>> Could you try the master branch since it seems Stefano fixed this problem
>> recently?
>> --Junchao Zhang
>>
>>
>> On Mon, Mar 9, 2020 at 2:04 PM Rochan Upadhyay <u.rochan at gmail.com>
>> wrote:
>>
>>> Dear PETSc Developers,
>>>
>>> I am having trouble using SuperLU_Dist as a direct solver for
>>> certain problems in PETSc. The problem is that, when interfacing with
>>> SuperLU_Dist, the matrix must be of type MATSEQAIJ when running MPI
>>> with one processor. PETSc has long allowed the matrix type MATMPIAIJ
>>> for all MPI runs, including MPI with a single processor, and that is still
>>> the case for all of PETSc's native solvers. This, however, has been broken
>>> for the SuperLU_Dist option. The following code snippet (in PETSc, not in
>>> SuperLU_Dist) is responsible for the restriction, and I do not know whether
>>> it is by design or by accident:
>>>
>>> In file petsc-3.12.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c,
>>> line 257 onwards:
>>>
>>>   ierr = MPI_Comm_size(PetscObjectComm((PetscObject)A),&size);CHKERRQ(ierr);
>>>
>>>   if (size == 1) {
>>>     aa = (Mat_SeqAIJ*)A->data;
>>>     rstart = 0;
>>>     nz     = aa->nz;
>>>   } else {
>>>     Mat_MPIAIJ *mat = (Mat_MPIAIJ*)A->data;
>>>     aa = (Mat_SeqAIJ*)(mat->A)->data;
>>>     bb = (Mat_SeqAIJ*)(mat->B)->data;
>>>     ai = aa->i; aj = aa->j;
>>>     bi = bb->i; bj = bb->j;
>>>
>>> The code checks the number of processors: if it is 1, it assumes the
>>> matrix data is a Mat_SeqAIJ and operates on it directly; only if the
>>> number of processors is greater than 1 does it assume the matrix is a
>>> Mat_MPIAIJ. I think this is unwieldy and lacks generality; one would like
>>> the same piece of code to run in MPI mode, with type MATMPIAIJ, for any
>>> number of processors. Also, this restriction appeared only in a recent
>>> version; the issue was not there until at least 3.9.4. So my question is:
>>> from now on (e.g. v3.12.4), should we always use matrix type MATSEQAIJ
>>> when running on 1 processor, even with MPI enabled, and MATMPIAIJ for more
>>> than 1 processor? That is, should the number of processors in use be the
>>> criterion that sets the matrix type?
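>>>
>>> For instance, a check based on the actual matrix type rather than on the
>>> communicator size (just a sketch on my part, not tested against the PETSc
>>> internals) would also handle MATMPIAIJ matrices on a single rank:
>>>
>>>   PetscBool ismpiaij;
>>>   ierr = PetscObjectTypeCompare((PetscObject)A,MATMPIAIJ,&ismpiaij);CHKERRQ(ierr);
>>>   if (!ismpiaij) {                      /* SeqAIJ storage */
>>>     aa     = (Mat_SeqAIJ*)A->data;
>>>     rstart = 0;
>>>     nz     = aa->nz;
>>>   } else {                              /* MPIAIJ storage, even on one rank */
>>>     Mat_MPIAIJ *mat = (Mat_MPIAIJ*)A->data;
>>>     aa = (Mat_SeqAIJ*)(mat->A)->data;
>>>     bb = (Mat_SeqAIJ*)(mat->B)->data;
>>>     ai = aa->i; aj = aa->j;
>>>     bi = bb->i; bj = bb->j;
>>>   }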
>>>
>>> As an illustration, I have attached a minor modification of KSP example
>>> 12 that worked with all PETSc versions until at least 3.9.4 but now
>>> throws a segmentation fault. It was compiled with MPI and run with
>>> mpiexec -n 1 ./ex12
>>> If I remove the line "ierr = MatSetType(A,MATMPIAIJ);CHKERRQ(ierr);" it
>>> runs fine.
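>>>
>>> If the intended usage is now to pick the type from the process count, I
>>> suppose the portable way in the example would be something like the
>>> following (a sketch, letting PETSc resolve the AIJ flavour from the
>>> communicator instead of hard-coding MATMPIAIJ):
>>>
>>>   ierr = MatSetType(A,MATAIJ);CHKERRQ(ierr);  /* MATSEQAIJ on 1 rank, MATMPIAIJ otherwise */
>>>
>>> or, equivalently, setting the type explicitly from the communicator size:
>>>
>>>   PetscMPIInt size;
>>>   ierr = MPI_Comm_size(PETSC_COMM_WORLD,&size);CHKERRQ(ierr);
>>>   ierr = MatSetType(A,(size == 1) ? MATSEQAIJ : MATMPIAIJ);CHKERRQ(ierr);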
>>>
>>> I hope you can clarify my confusion.
>>>
>>> Regards,
>>> Rochan
>>>