[petsc-users] On the edge of 2^31 unknowns

Eric Chamberland Eric.Chamberland at giref.ulaval.ca
Tue Jun 21 15:20:24 CDT 2016


Hi Hong,

Thanks... but I just realized that in PETSc 3.5.4 (which I used for 
this computation) the default *was* the scalable version... :/

So I prepared a patch (see attachment) to help debug, but it is 
based on 3.7.2.  We could use it to extract information for a better bug 
report.  Do you think the added PetscInfo calls are enough to resolve this issue?

What else would you add?

Thanks,

Eric

On 21/06/16 09:36 AM, hong at aspiritech.org wrote:
> Eric:
> The nonscalable implementation is robust, and faster for small to medium
> size problems, thus we set it as the default. You can switch with option
> '-matmatmult_via scalable', which requires an estimate of the nonzeros of A*B.
> The estimate was buggy and not well tested. If you encounter any problem,
> let us know.
>
> Hong
>
> On Mon, Jun 20, 2016 at 10:49 PM, Eric Chamberland
> <Eric.Chamberland at giref.ulaval.ca> wrote:
>
>
>
>     On 2016-06-20 23:37, Barry Smith wrote:
>
>             On Jun 20, 2016, at 10:32 PM, Eric Chamberland
>             <Eric.Chamberland at giref.ulaval.ca> wrote:
>
>             ok, but what about -matmatmult_via scalable?
>
>             Both should work. It just may be one is faster or slower
>         than the other depending on the problem size.
>
>     OK, digging further, I found this when running git blame on the code:
>
>     0fc8cf34 (Hong Zhang       2013-06-27 14:04:58 -0500  696) /* same
>     as MatMatMultSymbolic_MPIAIJ_MPIAIJ_nonscalable(), except using
>     LLCondensed to avoid O(BN) memory requirement */
>
>     But the commit comment says:
>     ...
>          rename MatMatMultSymbolic_MPIAIJ_MPIAIJ ->
>     MatMatMultSymbolic_MPIAIJ_MPIAIJ_nonscalable (non-default)
>
>     But it *is* the default...  since another commit:
>
>     commit 0d3441ae8a080c728abf17e90308c510e39e951b
>     Author: Hong Zhang <hzhang at mcs.anl.gov>
>     Date:   Mon Aug 24 16:40:35 2015 -0500
>
>          add MatPtAPxxx_MPIAIJ_MPIAIJ_new
>
>     which changed the behaviour programmed in 0fc8cf34.  Is that expected?
>
>     Eric
>
>
>

-------------- next part --------------
commit a965ae203520bf228c3e87d1afcbac679da3e65a
Author: Eric Chamberland <ericc at giref.ulaval.ca>
Date:   Mon Jun 20 14:24:53 2016 -0400

    DBG: Info for matmatmult

diff --git a/src/mat/impls/aij/mpi/mpimatmatmult.c b/src/mat/impls/aij/mpi/mpimatmatmult.c
index 816b735..6a81b70 100644
--- a/src/mat/impls/aij/mpi/mpimatmatmult.c
+++ b/src/mat/impls/aij/mpi/mpimatmatmult.c
@@ -212,12 +212,15 @@ PetscErrorCode MatMatMultSymbolic_MPIAIJ_MPIAIJ_nonscalable(Mat A,Mat P,PetscRea
 
   /* first, compute symbolic AP = A_loc*P = A_diag*P_loc + A_off*P_oth */
   /*-------------------------------------------------------------------*/
+  ierr = PetscInfo1(A,"MatMatMultSymns_mpi_mpi: PetscMalloc1 am+2 %D;\n",am+2);CHKERRQ(ierr);
+  printf("HELLO FROM MATMATMULTMPINS\n");
   ierr      = PetscMalloc1(am+2,&api);CHKERRQ(ierr);
   ptap->api = api;
   api[0]    = 0;
 
   /* create and initialize a linked list */
-  ierr = PetscTableCreate(pN,pN,&ta);CHKERRQ(ierr); 
+  ierr = PetscInfo1(A,"MatMatMultSymns_mpi_mpi: PetscTableCreate pN %D;\n",pN);CHKERRQ(ierr);
+  ierr = PetscTableCreate(pN,pN,&ta);CHKERRQ(ierr);
   MatRowMergeMax_SeqAIJ(p_loc,ptap->P_loc->rmap->N,ta);
   MatRowMergeMax_SeqAIJ(p_oth,ptap->P_oth->rmap->N,ta);
   ierr = PetscTableGetCount(ta,&Crmax);CHKERRQ(ierr);
@@ -226,6 +229,11 @@ PetscErrorCode MatMatMultSymbolic_MPIAIJ_MPIAIJ_nonscalable(Mat A,Mat P,PetscRea
   ierr = PetscLLCondensedCreate(Crmax,pN,&lnk,&lnkbt);CHKERRQ(ierr);
 
   /* Initial FreeSpace size is fill*(nnz(A)+nnz(P)) */
+  ierr = PetscInfo1(A,"MatMatMultSymns_mpi_mpi: fill %g;\n",fill);CHKERRQ(ierr);
+  PetscInt lnnz = PetscIntSumTruncate(adi[am],PetscIntSumTruncate(aoi[am],pi_loc[pm]));
+  ierr = PetscInfo1(A,"MatMatMultSymns_mpi_mpi: lnnz %D;\n",lnnz);CHKERRQ(ierr);
+  PetscInt lMem = PetscRealIntMultTruncate(fill,PetscIntSumTruncate(adi[am],PetscIntSumTruncate(aoi[am],pi_loc[pm])));
+  ierr = PetscInfo1(A,"MatMatMultSymns_mpi_mpi: lMem %D;\n",lMem);CHKERRQ(ierr);
   ierr = PetscFreeSpaceGet(PetscRealIntMultTruncate(fill,PetscIntSumTruncate(adi[am],PetscIntSumTruncate(aoi[am],pi_loc[pm]))),&free_space);CHKERRQ(ierr);
   current_space = free_space;
 
@@ -733,6 +741,8 @@ PetscErrorCode MatMatMultSymbolic_MPIAIJ_MPIAIJ(Mat A,Mat P,PetscReal fill,Mat *
 
   /* first, compute symbolic AP = A_loc*P = A_diag*P_loc + A_off*P_oth */
   /*-------------------------------------------------------------------*/
+  ierr = PetscInfo1(A,"MatMatMultSym_mpi_mpi: PetscMalloc1 am+2 %D;\n",am+2);CHKERRQ(ierr);
+  printf("HELLO FROM MATMATMULTMPI\n");
   ierr      = PetscMalloc1(am+2,&api);CHKERRQ(ierr);
   ptap->api = api;
   api[0]    = 0;
@@ -740,7 +750,9 @@ PetscErrorCode MatMatMultSymbolic_MPIAIJ_MPIAIJ(Mat A,Mat P,PetscReal fill,Mat *
   /* create and initialize a linked list */
   apnz_max = 6*(p_loc->rmax + (PetscInt)(1.e-2*pN)); /* expected apnz_max */
   if (apnz_max > pN) apnz_max = pN;
-  ierr = PetscTableCreate(apnz_max,pN,&ta);CHKERRQ(ierr); 
+  ierr = PetscInfo1(A,"MatMatMultSym_mpi_mpi: PetscTableCreate apnz_max %D;\n",apnz_max);CHKERRQ(ierr);
+  ierr = PetscInfo1(A,"MatMatMultSym_mpi_mpi: PetscTableCreate pN %D;\n",pN);CHKERRQ(ierr);
+  ierr = PetscTableCreate(apnz_max,pN,&ta);CHKERRQ(ierr);
 
   /* Calculate apnz_max */
   apnz_max = 0;
@@ -765,6 +777,12 @@ PetscErrorCode MatMatMultSymbolic_MPIAIJ_MPIAIJ(Mat A,Mat P,PetscReal fill,Mat *
   ierr = PetscLLCondensedCreate_Scalable(apnz_max,&lnk);CHKERRQ(ierr);
 
   /* Initial FreeSpace size is fill*(nnz(A)+nnz(P)) */
+  ierr = PetscInfo1(A,"MatMatMultSym_mpi_mpi: fill %g;\n",fill);CHKERRQ(ierr);
+  PetscInt lnnz = PetscIntSumTruncate(adi[am],PetscIntSumTruncate(aoi[am],pi_loc[pm]));
+  ierr = PetscInfo1(A,"MatMatMultSym_mpi_mpi: lnnz %D;\n",lnnz);CHKERRQ(ierr);
+  PetscInt lMem = PetscRealIntMultTruncate(fill,PetscIntSumTruncate(adi[am],PetscIntSumTruncate(aoi[am],pi_loc[pm])));
+  ierr = PetscInfo1(A,"MatMatMultSym_mpi_mpi: lMem %D;\n",lMem);CHKERRQ(ierr);
+
   ierr = PetscFreeSpaceGet(PetscRealIntMultTruncate(fill,PetscIntSumTruncate(adi[am],PetscIntSumTruncate(aoi[am],pi_loc[pm]))),&free_space);CHKERRQ(ierr);
   current_space = free_space;
   ierr = MatPreallocateInitialize(comm,am,pn,dnz,onz);CHKERRQ(ierr);
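
The patch above logs lnnz and lMem, the two quantities behind the comment
"Initial FreeSpace size is fill*(nnz(A)+nnz(P))". Below is a minimal,
standalone sketch of that estimate near the 2^31 limit. It assumes 32-bit
indices, non-negative counts, and that the *Truncate helpers saturate rather
than wrap around. The functions sum_truncate, real_int_mult_truncate and
free_space_estimate are hypothetical stand-ins written for this illustration;
they are not PETSc's implementation.

```c
#include <stdint.h>

/* Hypothetical stand-in: add two non-negative 32-bit counts, saturating at
   INT32_MAX instead of overflowing. */
int32_t sum_truncate(int32_t a, int32_t b)
{
  int64_t r = (int64_t)a + (int64_t)b; /* widen first so the sum cannot wrap */
  return (r > INT32_MAX) ? INT32_MAX : (int32_t)r;
}

/* Hypothetical stand-in: scale a non-negative count by a real factor,
   saturating at INT32_MAX. */
int32_t real_int_mult_truncate(double a, int32_t b)
{
  double r = a * (double)b;
  return (r > (double)INT32_MAX) ? INT32_MAX : (int32_t)r;
}

/* fill*(nnz(A_diag) + nnz(A_off) + nnz(P_loc)), mirroring the nesting of the
   calls in the patch. The argument names are illustrative. */
int32_t free_space_estimate(double fill, int32_t nnz_adiag,
                            int32_t nnz_aoff, int32_t nnz_ploc)
{
  int32_t lnnz = sum_truncate(nnz_adiag, sum_truncate(nnz_aoff, nnz_ploc));
  return real_int_mult_truncate(fill, lnnz);
}
```

With made-up local counts of 900000000, 700000000 and 800000000 and fill = 2.0,
the sum is 2.4e9, so both lnnz and the final estimate saturate at INT32_MAX.
Seeing that value in the -info output would suggest the local estimate has hit
the 32-bit ceiling.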

