[petsc-users] SuperLU_Dist bug or "intentional error"

Junchao Zhang jczhang at mcs.anl.gov
Mon Mar 9 15:27:57 CDT 2020


I checked and could run your test correctly with petsc master.
--Junchao Zhang


On Mon, Mar 9, 2020 at 2:51 PM Junchao Zhang <jczhang at mcs.anl.gov> wrote:

> Let me try it.  BTW, did you find the same code at
> https://gitlab.com/petsc/petsc/-/blob/master/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c
> --Junchao Zhang
>
>
> On Mon, Mar 9, 2020 at 2:46 PM Rochan Upadhyay <u.rochan at gmail.com> wrote:
>
>> Hi Junchao,
>> I doubt it was fixed, since a diff of
>> src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c between the master branch
>> and version 3.12.4 shows no changes.
>> I am unable to compile the master version (configure.log attached), but I
>> think you can recreate the problem by running the ex12.c program that I
>> attached to my previous mail.
>> Regards,
>> Rochan
>>
>> On Mon, Mar 9, 2020 at 2:22 PM Junchao Zhang <jczhang at mcs.anl.gov> wrote:
>>
>>> Could you try the master branch since it seems Stefano fixed this
>>> problem recently?
>>> --Junchao Zhang
>>>
>>>
>>> On Mon, Mar 9, 2020 at 2:04 PM Rochan Upadhyay <u.rochan at gmail.com>
>>> wrote:
>>>
>>>> Dear PETSc Developers,
>>>>
>>>> I am having trouble using SuperLU_Dist as a direct solver for
>>>> certain problems in PETSc. The problem is that when interfacing with
>>>> SuperLU_Dist, the matrix must be of type MATSEQAIJ when running MPI
>>>> with one processor. PETSc has long allowed the matrix type MATMPIAIJ
>>>> for all MPI runs, including MPI with a single processor, and that is still
>>>> the case for all of PETSc's native solvers. This, however, is broken
>>>> for the SuperLU_Dist option. The following code snippet (in PETSc, not
>>>> SuperLU_Dist) is responsible for this restriction, and I do not know if it
>>>> is by design or by accident:
>>>>
>>>> In file petsc-3.12.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c,
>>>> line 257 onwards:
>>>>
>>>>   ierr = MPI_Comm_size(PetscObjectComm((PetscObject)A),&size);CHKERRQ(ierr);
>>>>
>>>>   if (size == 1) {
>>>>     aa = (Mat_SeqAIJ*)A->data;
>>>>     rstart = 0;
>>>>     nz     = aa->nz;
>>>>   } else {
>>>>     Mat_MPIAIJ *mat = (Mat_MPIAIJ*)A->data;
>>>>     aa = (Mat_SeqAIJ*)(mat->A)->data;
>>>>     bb = (Mat_SeqAIJ*)(mat->B)->data;
>>>>     ai = aa->i; aj = aa->j;
>>>>     bi = bb->i; bj = bb->j;
>>>>
>>>> The code checks the number of processors: if it is 1, it assumes the
>>>> matrix is a Mat_SeqAIJ and operates on it accordingly; only if the number
>>>> of processors is greater than 1 does it assume the matrix is of type
>>>> Mat_MPIAIJ. I think this is unwieldy and lacks generality. One would like
>>>> the same piece of code to run in MPI mode on any number of processors with
>>>> type Mat_MPIAIJ. Also, this restriction appeared only in a recent version;
>>>> the issue was not there until at least 3.9.4. So my question is: from now
>>>> (e.g. v3.12.4) on, should we always use matrix type MATSEQAIJ when running
>>>> on 1 processor, even with MPI enabled, and MATMPIAIJ on more than 1
>>>> processor? That is, should the number of processors in use be the
>>>> criterion for setting the matrix type?
>>>>
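>>>> Just to illustrate what I mean (this is only a rough sketch of mine, not
>>>> the actual PETSc code or any official fix; it assumes the existing
>>>> PetscObjectBaseTypeCompare routine is available here), one could branch on
>>>> the matrix type itself rather than on the communicator size, so that a
>>>> MATMPIAIJ matrix on a single rank still takes the Mat_MPIAIJ path:
>>>>
>>>>   PetscBool ismpiaij;
>>>>
>>>>   /* check the actual matrix type instead of the number of ranks */
>>>>   ierr = PetscObjectBaseTypeCompare((PetscObject)A,MATMPIAIJ,&ismpiaij);CHKERRQ(ierr);
>>>>   if (ismpiaij) {
>>>>     Mat_MPIAIJ *mat = (Mat_MPIAIJ*)A->data;  /* valid even when size == 1 */
>>>>     aa = (Mat_SeqAIJ*)(mat->A)->data;
>>>>     bb = (Mat_SeqAIJ*)(mat->B)->data;
>>>>     ai = aa->i; aj = aa->j;
>>>>     bi = bb->i; bj = bb->j;
>>>>   } else {                                   /* plain MATSEQAIJ */
>>>>     aa = (Mat_SeqAIJ*)A->data;
>>>>     rstart = 0;
>>>>     nz     = aa->nz;
>>>>   }
>>>>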
>>>> As an illustration, I have attached a minor modification of KSP example
>>>> 12 that worked with all PETSc versions until at least 3.9.4 but now
>>>> throws a segmentation fault. It was compiled with MPI and run with
>>>> mpiexec -n 1 ./ex12
>>>> If I remove the line "ierr = MatSetType(A,MATMPIAIJ);CHKERRQ(ierr);" it
>>>> works fine.
>>>>
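>>>> In case the attachment does not come through, here is a rough sketch of
>>>> the kind of reproducer I mean (an assumed minimal stand-in, not the
>>>> attached ex12.c itself): a small tridiagonal MATMPIAIJ matrix solved with
>>>> KSPPREONLY + PCLU using the SuperLU_Dist factorization, run with
>>>> "mpiexec -n 1":
>>>>
>>>>   #include <petscksp.h>
>>>>
>>>>   int main(int argc,char **argv)
>>>>   {
>>>>     Mat            A;
>>>>     Vec            x,b;
>>>>     KSP            ksp;
>>>>     PC             pc;
>>>>     PetscInt       i,n = 10,Istart,Iend;
>>>>     PetscErrorCode ierr;
>>>>
>>>>     ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;
>>>>     ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
>>>>     ierr = MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,n,n);CHKERRQ(ierr);
>>>>     ierr = MatSetType(A,MATMPIAIJ);CHKERRQ(ierr); /* the line that triggers the problem on 1 rank */
>>>>     ierr = MatSetUp(A);CHKERRQ(ierr);
>>>>     ierr = MatGetOwnershipRange(A,&Istart,&Iend);CHKERRQ(ierr);
>>>>     for (i=Istart; i<Iend; i++) {                 /* simple 1-D Laplacian */
>>>>       ierr = MatSetValue(A,i,i,2.0,INSERT_VALUES);CHKERRQ(ierr);
>>>>       if (i > 0)   {ierr = MatSetValue(A,i,i-1,-1.0,INSERT_VALUES);CHKERRQ(ierr);}
>>>>       if (i < n-1) {ierr = MatSetValue(A,i,i+1,-1.0,INSERT_VALUES);CHKERRQ(ierr);}
>>>>     }
>>>>     ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
>>>>     ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
>>>>     ierr = MatCreateVecs(A,&x,&b);CHKERRQ(ierr);
>>>>     ierr = VecSet(b,1.0);CHKERRQ(ierr);
>>>>     ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
>>>>     ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);
>>>>     ierr = KSPSetType(ksp,KSPPREONLY);CHKERRQ(ierr);
>>>>     ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
>>>>     ierr = PCSetType(pc,PCLU);CHKERRQ(ierr);
>>>>     ierr = PCFactorSetMatSolverType(pc,MATSOLVERSUPERLU_DIST);CHKERRQ(ierr);
>>>>     ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
>>>>     ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);     /* this is where the segfault shows up */
>>>>     ierr = VecDestroy(&x);CHKERRQ(ierr);
>>>>     ierr = VecDestroy(&b);CHKERRQ(ierr);
>>>>     ierr = MatDestroy(&A);CHKERRQ(ierr);
>>>>     ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
>>>>     ierr = PetscFinalize();
>>>>     return ierr;
>>>>   }
>>>>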
>>>> I hope you can clarify my confusion.
>>>>
>>>> Regards,
>>>> Rochan