[petsc-users] Memory Corruption Error in MatPartitioningApply

Smith, Barry F. bsmith at mcs.anl.gov
Mon Apr 22 08:47:19 CDT 2019


  Are you able to run under valgrind? It is somewhat better than the PETSc malloc at finding each instance of memory corruption, and the sooner you catch the corruption, the easier it is to find the bug. https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind



> On Apr 22, 2019, at 7:31 AM, Eda Oktay via petsc-users <petsc-users at mcs.anl.gov> wrote:
> 
> Hello,
> 
> I am trying to partition an odd-sized (for example 4253*4253), square permutation matrix on 2 processors with ParMETIS. The permutation matrix is obtained by permuting the matrix with an index set "is" (MatPermute(A,is,is,&PL)). I checked the index set; it is a valid permutation.
> 
> When I look at the local size of the matrix, it is reported as 2127 and 2127 on the two processors, so in order for the local sizes of the matrix and the index sets to match, I set the index sets' sizes to 2127 and 2127.
> 
> When I do that, I get a memory corruption error in the MatPartitioningApply function. The error is as follows:
> 
> [0]PETSC ERROR: PetscMallocValidate: error detected at MatPartitioningApply_Parmetis_Private() line 141 in /home/edaoktay/petsc-3.10.3/src/mat/partition/impls/pmetis/pmetis.c
> [0]PETSC ERROR: Memory [id=0(8512)] at address 0x19e6870 is corrupted (probably write past end of array)
> [0]PETSC ERROR: Memory originally allocated in main() line 310 in /home/edaoktay/petsc-3.10.3/arch-linux2-c-debug/share/slepc/examples/src/eda/TEK_SAYI_SON_YENI_DENEME_TEMIZ_ENYENI_FINAL.c
> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [0]PETSC ERROR: Memory corruption: http://www.mcs.anl.gov/petsc/documentation/installation.html#valgrind
> [0]PETSC ERROR:  
> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.10.3, Dec, 18, 2018 
> [0]PETSC ERROR: ./TEK_SAYI_SON_YENI_DENEME_TEMIZ_ENYENI_FINAL on a arch-linux2-c-debug named 13ed.wls.metu.edu.tr by edaoktay Mon Apr 22 14:58:52 2019
> [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-cxx-dialect=C++11 --download-openblas --download-metis --download-parmetis --download-superlu_dist --download-slepc --download-mpich
> [0]PETSC ERROR: #1 PetscMallocValidate() line 146 in /home/edaoktay/petsc-3.10.3/src/sys/memory/mtr.c
> [0]PETSC ERROR: #2 MatPartitioningApply_Parmetis_Private() line 141 in /home/edaoktay/petsc-3.10.3/src/mat/partition/impls/pmetis/pmetis.c
> [0]PETSC ERROR: #3 MatPartitioningApply_Parmetis() line 215 in /home/edaoktay/petsc-3.10.3/src/mat/partition/impls/pmetis/pmetis.c
> [0]PETSC ERROR: #4 MatPartitioningApply() line 340 in /home/edaoktay/petsc-3.10.3/src/mat/partition/partition.c
> [0]PETSC ERROR: #5 main() line 374 in /home/edaoktay/petsc-3.10.3/arch-linux2-c-debug/share/slepc/examples/src/eda/TEK_SAYI_SON_YENI_DENEME_TEMIZ_ENYENI_FINAL.c
> [0]PETSC ERROR: PETSc Option Table entries:
> [0]PETSC ERROR: -f /home/edaoktay/petsc-3.10.3/share/petsc/datafiles/matrices/binary_files/airfoil1_binary
> [0]PETSC ERROR: -mat_partitioning_type parmetis
> [0]PETSC ERROR: -unweighted
> [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov----------
> 
> 
> Line 310 is PetscMalloc1(ss,&idxx). The relevant part of my program is below:
> 
>   if (mod != 0){
>       ss = (siz+1)/size;//(siz+size-mod)/size;
>   } else{
>       ss = siz/size;
>   }
>   
> PetscMalloc1(ss,&idxx);                                       // LINE 310
> 
>   if (rank != size-1) {
>     j =0;
>     for (i=rank*ss; i<(rank+1)*ss; i++) {
>       idxx[j] = idx[i];
>       j++;
>     }
> 
>   } else {
>       
>     j =0;
>     for (i=rank*ss; i<siz; i++) {
>       idxx[j] = idx[i];
>       j++;
>     }
>       
>   }
>    
>   if (mod != 0){
>     if (rank<mod){
>         idxx[ss+1] = idx[ss*size+rank+1];
>     }
>   }
> 
>   /*Permute matrix L (spy(A(p1,p1))*/
>   
>     if (mod != 0){
>         if (rank<mod){
>             ierr = ISCreateGeneral(PETSC_COMM_WORLD,ss+1,idxx,PETSC_COPY_VALUES,&is);CHKERRQ(ierr);
>         } else{
>             ierr = ISCreateGeneral(PETSC_COMM_WORLD,ss,idxx,PETSC_COPY_VALUES,&is);CHKERRQ(ierr);
>         } 
>         
>     }else {
>         ierr = ISCreateGeneral(PETSC_COMM_WORLD,ss,idxx,PETSC_COPY_VALUES,&is);CHKERRQ(ierr);
>     }
> 
>   ierr = ISSetPermutation(is);CHKERRQ(ierr); 
>   ierr = MatPermute(A,is,is,&PL);CHKERRQ(ierr);                                
> 
>     /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>                     Create Partitioning
>      - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
>   
>   ierr = MatConvert(PL,MATMPIADJ,MAT_INITIAL_MATRIX,&AL);CHKERRQ(ierr);                    
>   ierr = MatPartitioningCreate(MPI_COMM_WORLD,&part);CHKERRQ(ierr);
>   ierr = MatPartitioningSetAdjacency(part,AL);CHKERRQ(ierr);                             
>   ierr = MatPartitioningSetFromOptions(part);CHKERRQ(ierr);
>   ierr = MatPartitioningApply(part,&partitioning);CHKERRQ(ierr);        
> 
> I understand that I cannot change the local size of the matrix since it is read from a file. But as you can see above, when I set the index sets' sizes to 2127 and 2127, memory corruption occurs. I tried several things, but in the end I got an error either in MatPermute or here.
> 
> By the way, idx runs from 0 to 4252, but the global size of "is" is 4253. If I change idx to 0:4253, I think it will be incorrect since there is actually no 4254th element.
> 
> How can I solve this problem?
> 
> Thank you,
> 
> Eda
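
One thing that stands out in the quoted code: idxx is allocated with ss entries, so its valid indices are 0 to ss-1, and the write to idxx[ss+1] lands two slots past the end of the buffer. That would be consistent with the "write past end of array" report. Also, 2127 + 2127 = 4254, so the two local sizes cannot both be 2127 for a matrix with 4253 rows. Below is a minimal sketch in plain C of one way to size and fill the local slice. It assumes the matrix uses PETSc's default row split, where the first N % size ranks each get one extra row. The helper names (local_count, local_start, copy_local_slice) are hypothetical, and plain int stands in for PetscInt.

```c
#include <assert.h>
#include <stdlib.h>

/* Number of entries owned by `rank` when N entries are split over `size`
   ranks. ASSUMPTION: the first N % size ranks get one extra entry, which
   is what PETSc's default layout is taken to do here. */
static int local_count(int N, int size, int rank)
{
  assert(size > 0 && rank >= 0 && rank < size);
  return N / size + ((N % size) > rank ? 1 : 0);
}

/* Global index of the first entry owned by `rank` under the same split. */
static int local_start(int N, int size, int rank)
{
  int base = N / size, extra = N % size;
  return rank * base + (rank < extra ? rank : extra);
}

/* Copy this rank's slice of idx[0..N-1] into a buffer of exactly
   local_count() entries, so every write stays in bounds.
   The caller frees the result; NULL is returned if allocation fails. */
static int *copy_local_slice(const int *idx, int N, int size, int rank,
                             int *nlocal)
{
  int n = local_count(N, size, rank);
  int start = local_start(N, size, rank);
  int *out = malloc((size_t)(n > 0 ? n : 1) * sizeof *out);
  if (!out) return NULL;
  for (int j = 0; j < n; j++) out[j] = idx[start + j];
  *nlocal = n;
  return out;
}
```

With N = 4253 and 2 ranks this gives 2127 entries on rank 0 and 2126 on rank 1, which sum to 4253. In the PETSc program the same count would be passed to both PetscMalloc1 and ISCreateGeneral. Querying the actual local size with MatGetLocalSize, rather than recomputing it, avoids relying on the split assumed above.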


