[petsc-users] How to confirm the performance of asynchronous computations

Thu Jan 21 20:20:40 CST 2021

  ierr = VecNormBegin(u,NORM_2,&norm1);
    ierr = PetscCommSplitReductionBegin(PetscObjectComm((PetscObject)Ax)); 

How come you call this on Ax and not on u? For clarity, if nothing else, I think you should call it on u.

comb.c has 

/*
      Split phase global vector reductions with support for combining the
   communication portion of several operations. Using MPI-1.1 support only

      The idea for this and much of the initial code is contributed by
   Victor Eijkhout.

       Usage:
             VecDotBegin(Vec,Vec,PetscScalar *);
             VecNormBegin(Vec,NormType,PetscReal *);
             ....
             VecDotEnd(Vec,Vec,PetscScalar *);
             VecNormEnd(Vec,NormType,PetscReal *);

       Limitations:
         - The order of the xxxEnd() functions MUST be in the same order
           as the xxxBegin(). There is extensive error checking to try to
           insure that the user calls the routines in the correct order
*/

#include <petsc/private/vecimpl.h>    /*I   "petscvec.h"    I*/

static PetscErrorCode MPIPetsc_Iallreduce(void *sendbuf,void *recvbuf,PetscMPIInt count,MPI_Datatype datatype,MPI_Op op,MPI_Comm comm,MPI_Request *request)
{
  PETSC_UNUSED PetscErrorCode ierr;

  PetscFunctionBegin;
#if defined(PETSC_HAVE_MPI_IALLREDUCE)
  ierr = MPI_Iallreduce(sendbuf,recvbuf,count,datatype,op,comm,request);CHKERRMPI(ierr);
#elif defined(PETSC_HAVE_MPIX_IALLREDUCE)
  ierr = MPIX_Iallreduce(sendbuf,recvbuf,count,datatype,op,comm,request);CHKERRQ(ierr);
#else
  ierr = MPIU_Allreduce(sendbuf,recvbuf,count,datatype,op,comm);CHKERRQ(ierr);
  *request = MPI_REQUEST_NULL;
#endif
  PetscFunctionReturn(0);
}

So first check if $PETSC_DIR/include/petscconf.h has 

PETSC_HAVE_MPI_IALLREDUCE

if it does not then the standard MPI reduce is called. 

If this is set then any improvement depends on the implementation of iallreduce inside the MPI you are using. 

Barry

> On Jan 21, 2021, at 6:52 AM, Viet H.Q.H. <hqhviet at tohoku.ac.jp> wrote:
> 
> 
> Hello Petsc developers and supporters,
> 
> I would like to confirm the performance of asynchronous computations of inner product computation overlapping with matrix-vector multiplication computation by the below code.
> 
> 
>  PetscLogDouble tt1,tt2;
>     KSP ksp;
>     //ierr = VecSet(c,one);
>     ierr = VecSet(c,one);
>     ierr = VecSet(u,one);
>     ierr = VecSet(b,one);
> 
>     ierr = KSPCreate(PETSC_COMM_WORLD,&ksp); CHKERRQ(ierr);
>     ierr = KSP_MatMult(ksp,A,x,Ax); CHKERRQ(ierr);
> 
> 
>     ierr = PetscTime(&tt1);CHKERRQ(ierr);
>     ierr = VecNormBegin(u,NORM_2,&norm1);
>     ierr = PetscCommSplitReductionBegin(PetscObjectComm((PetscObject)Ax)); 
>     ierr = KSP_MatMult(ksp,A,c,Ac); 
>     ierr = VecNormEnd(u,NORM_2,&norm1);
>     ierr = PetscTime(&tt2);CHKERRQ(ierr);
> 
>     ierr = PetscPrintf(PETSC_COMM_WORLD, "The time used for the asynchronous calculation: %f\n",tt2-tt1); CHKERRQ(ierr);
>     ierr = PetscPrintf(PETSC_COMM_WORLD,"+ |u| =  %g\n",(double) norm1); CHKERRQ(ierr);
> 
> 
>     ierr = PetscTime(&tt1);CHKERRQ(ierr);
>     ierr = VecNorm(b,NORM_2,&norm2); CHKERRQ(ierr);
>     ierr = KSP_MatMult(ksp,A,c,Ac); 
>     ierr = PetscTime(&tt2);CHKERRQ(ierr);
> 
>     ierr = PetscPrintf(PETSC_COMM_WORLD, "The time used for the synchronous calculation: %f\n",tt2-tt1); CHKERRQ(ierr);
>     ierr = PetscPrintf(PETSC_COMM_WORLD,"+ |b| =  %g\n",(double) norm2); CHKERRQ(ierr);
> 
> 
> On a cluster with two or four nodes, the asynchronous computation is always much slower than synchronous computation.
> 
> The time used for the asynchronous calculation: 0.000203
> + |u| =  100.
> The time used for the synchronous calculation: 0.000006
> + |b| =  100.
> 
> Are there any necessary settings on MPI or Petsc to gain performance of asynchronous computation?
> 
> Thank you very much for anything you can provide.
> Sincerely,
> Viet.
> 
> 
> 
> 
> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20210121/c28f15b0/attachment.html>