[petsc-users] Memory Corruption Error in MatPartitioningApply

Eda Oktay eda.oktay at metu.edu.tr
Thu May 9 01:46:18 CDT 2019


I had misread the local sizes of the matrix. Without using valgrind, I was
able to fix the problem by testing with a small matrix. It turned out that
there was an indexing mistake.
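
For readers who hit the same error: in the quoted snippet below, idxx is
allocated with ss entries, but idxx[ss+1] is written for rank < mod, which is
past the end of the array and consistent with the "write past end of array"
message. The following is only a minimal sketch of one way to size and fill
the local index array so that it matches a contiguous row split; it is not
the corrected program from this thread. The helper names (local_count,
local_start, copy_local_slice) are hypothetical, and the split rule (the
first N % size ranks get one extra row) is my understanding of PETSc's
default layout, so treat it as an assumption.

```c
#include <assert.h>

/* Entries owned by `rank` when n_global entries are split over `size`
   ranks and the first (n_global % size) ranks get one extra entry.
   ASSUMPTION: this mirrors the default PETSc row layout. */
static int local_count(int n_global, int size, int rank)
{
    return n_global / size + ((n_global % size) > rank ? 1 : 0);
}

/* Global offset of the first entry owned by `rank` under the same split. */
static int local_start(int n_global, int size, int rank)
{
    int rem = n_global % size;
    return rank * (n_global / size) + (rank < rem ? rank : rem);
}

/* Copy this rank's contiguous slice of idx into idxx and return its length.
   The capacity check guards against writing past the end of idxx. */
static int copy_local_slice(const int *idx, int n_global, int size, int rank,
                            int *idxx, int idxx_capacity)
{
    int n = local_count(n_global, size, rank);
    int start = local_start(n_global, size, rank);
    assert(n <= idxx_capacity);
    for (int j = 0; j < n; j++) {
        idxx[j] = idx[start + j];
    }
    return n;
}
```

With 4253 rows on 2 processes this rule gives 2127 and 2126 local entries,
not 2127 and 2127. In a real PETSc program it is safer to ask the matrix
itself (for example with MatGetLocalSize or MatGetOwnershipRange) than to
recompute the split by hand.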

Thank you!

Eda

On Thu, May 9, 2019, 8:53 AM Smith, Barry F. <bsmith at mcs.anl.gov> wrote:

>
>   Did you ever make progress on this issue?
>
> > On Apr 22, 2019, at 8:47 AM, Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
> >
> >
> >  Are you able to run under valgrind? It is a bit better than the PETSc
> malloc to find each instance of memory corruption and the sooner you find
> it the easier it is to find the bug.
> https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> >
> >
> >
> >> On Apr 22, 2019, at 7:31 AM, Eda Oktay via petsc-users <
> petsc-users at mcs.anl.gov> wrote:
> >>
> >> Hello,
> >>
> >> I am trying to partition an odd-sized (for example 4253*4253), square
> >> permuted matrix on 2 processors with ParMETIS. The permuted matrix is
> >> obtained by permuting the matrix with an index set "is"
> >> (MatPermute(A,is,is,&PL)). I checked the index set; it is a valid
> >> permutation.
> >>
> >> When I look at the local size of the matrix, it is reported as 2127 and
> >> 2127 on the two processors, so, in order for the local sizes of the
> >> matrix and the index sets to match, I defined the index sets' sizes as
> >> 2127 and 2127.
> >>
> >> When I do that, I get a memory corruption error in the
> >> MatPartitioningApply function. The error is as follows:
> >>
> >> [0]PETSC ERROR: PetscMallocValidate: error detected at
> MatPartitioningApply_Parmetis_Private() line 141 in
> /home/edaoktay/petsc-3.10.3/src/mat/partition/impls/pmetis/pmetis.c
> >> [0]PETSC ERROR: Memory [id=0(8512)] at address 0x19e6870 is corrupted
> (probably write past end of array)
> >> [0]PETSC ERROR: Memory originally allocated in main() line 310 in
> /home/edaoktay/petsc-3.10.3/arch-linux2-c-debug/share/slepc/examples/src/eda/TEK_SAYI_SON_YENI_DENEME_TEMIZ_ENYENI_FINAL.c
> >> [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> >> [0]PETSC ERROR: Memory corruption:
> http://www.mcs.anl.gov/petsc/documentation/installation.html#valgrind
> >> [0]PETSC ERROR:
> >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
> for trouble shooting.
> >> [0]PETSC ERROR: Petsc Release Version 3.10.3, Dec, 18, 2018
> >> [0]PETSC ERROR: ./TEK_SAYI_SON_YENI_DENEME_TEMIZ_ENYENI_FINAL on a
> arch-linux2-c-debug named 13ed.wls.metu.edu.tr by edaoktay Mon Apr 22
> 14:58:52 2019
> >> [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++
> --with-fc=gfortran --with-cxx-dialect=C++11 --download-openblas
> --download-metis --download-parmetis --download-superlu_dist
> --download-slepc --download-mpich
> >> [0]PETSC ERROR: #1 PetscMallocValidate() line 146 in
> /home/edaoktay/petsc-3.10.3/src/sys/memory/mtr.c
> >> [0]PETSC ERROR: #2 MatPartitioningApply_Parmetis_Private() line 141 in
> /home/edaoktay/petsc-3.10.3/src/mat/partition/impls/pmetis/pmetis.c
> >> [0]PETSC ERROR: #3 MatPartitioningApply_Parmetis() line 215 in
> /home/edaoktay/petsc-3.10.3/src/mat/partition/impls/pmetis/pmetis.c
> >> [0]PETSC ERROR: #4 MatPartitioningApply() line 340 in
> /home/edaoktay/petsc-3.10.3/src/mat/partition/partition.c
> >> [0]PETSC ERROR: #5 main() line 374 in
> /home/edaoktay/petsc-3.10.3/arch-linux2-c-debug/share/slepc/examples/src/eda/TEK_SAYI_SON_YENI_DENEME_TEMIZ_ENYENI_FINAL.c
> >> [0]PETSC ERROR: PETSc Option Table entries:
> >> [0]PETSC ERROR: -f
> /home/edaoktay/petsc-3.10.3/share/petsc/datafiles/matrices/binary_files/airfoil1_binary
> >> [0]PETSC ERROR: -mat_partitioning_type parmetis
> >> [0]PETSC ERROR: -unweighted
> >> [0]PETSC ERROR: ----------------End of Error Message -------send entire
> error message to petsc-maint at mcs.anl.gov----------
> >>
> >>
> >> Line 310 is PetscMalloc1(ss,&idxx). The relevant part of my program is
> >> written as below:
> >>
> >>  if (mod != 0){
> >>      ss = (siz+1)/size;//(siz+size-mod)/size;
> >>  } else{
> >>      ss = siz/size;
> >>  }
> >>
> >> PetscMalloc1(ss,&idxx);                                       // LINE 310
> >>
> >>  if (rank != size-1) {
> >>    j =0;
> >>    for (i=rank*ss; i<(rank+1)*ss; i++) {
> >>      idxx[j] = idx[i];
> >>      j++;
> >>    }
> >>
> >>  } else {
> >>
> >>    j =0;
> >>    for (i=rank*ss; i<siz; i++) {
> >>      idxx[j] = idx[i];
> >>      j++;
> >>    }
> >>
> >>  }
> >>
> >>  if (mod != 0){
> >>    if (rank<mod){
> >>        idxx[ss+1] = idx[ss*size+rank+1];
> >>    }
> >>  }
> >>
> >>  /*Permute matrix L (spy(A(p1,p1))*/
> >>
> >>    if (mod != 0){
> >>        if (rank<mod){
> >>            ierr = ISCreateGeneral(PETSC_COMM_WORLD,ss+1,idxx,PETSC_COPY_VALUES,&is);CHKERRQ(ierr);
> >>        } else{
> >>            ierr = ISCreateGeneral(PETSC_COMM_WORLD,ss,idxx,PETSC_COPY_VALUES,&is);CHKERRQ(ierr);
> >>        }
> >>
> >>    }else {
> >>        ierr = ISCreateGeneral(PETSC_COMM_WORLD,ss,idxx,PETSC_COPY_VALUES,&is);CHKERRQ(ierr);
> >>    }
> >>
> >>  ierr = ISSetPermutation(is);CHKERRQ(ierr);
> >>  ierr = MatPermute(A,is,is,&PL);CHKERRQ(ierr);
> >>
> >>    /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> >>                    Create Partitioning
> >>     - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
> >>
> >>  ierr = MatConvert(PL,MATMPIADJ,MAT_INITIAL_MATRIX,&AL);CHKERRQ(ierr);
> >>  ierr = MatPartitioningCreate(MPI_COMM_WORLD,&part);CHKERRQ(ierr);
> >>  ierr = MatPartitioningSetAdjacency(part,AL);CHKERRQ(ierr);
> >>  ierr = MatPartitioningSetFromOptions(part);CHKERRQ(ierr);
> >>  ierr = MatPartitioningApply(part,&partitioning);CHKERRQ(ierr);
> >>
> >> I understand that I cannot change the local size of the matrix, since
> >> it is read from a file. But as you can see above, when I define the
> >> index sets' sizes as 2127 and 2127, memory corruption occurs. I tried
> >> several things, but in the end I always got an error either in
> >> MatPermute or here.
> >>
> >> By the way, idx runs from 0 to 4252, but the global size of the index
> >> set "is" is 4253. If I change idx to 0:4253, I think it will be
> >> incorrect, since there is actually no 4254th element.
> >>
> >> How can I solve this problem?
> >>
> >> Thank you,
> >>
> >> Eda
> >
>
>