Hi Krys,

> On Dec 20, 2018, at 10:59 AM, Krzysztof Kamieniecki via petsc-users <petsc-users@mcs.anl.gov> wrote:
>
> That example seems to have critical sections around certain Vec calls, and it looks like my problem occurs in VecDotBegin/VecDotEnd which is called by TAO/BLMVM.

The quasi-Newton matrix objects in BLMVM use asynchronous dot products in their matrix-free forward and inverse product formulations. This is a relatively recent performance optimization. If avoiding this split-phase communication would solve the problem, and you don’t need other recent PETSc features, you could revert to 3.9 and use the old version of BLMVM, which uses straight VecDot operations instead.
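
To make the ordering constraint concrete, here is a rough illustration (not the actual BLMVM source) of the split-phase pattern next to the plain blocking VecDot it replaced. The split reduction context is attached to the communicator, so the Begin/End pairs issued on that communicator have to be completed in the same order they were started, which is exactly what the error message checks.

/* Illustration only, not the BLMVM source: split-phase dot products
   versus plain blocking VecDot. */
#include <petscvec.h>

/* Split-phase: queue both local dots, then finish both reductions,
   in the same order they were begun. */
static PetscErrorCode dots_split_phase(Vec x,Vec y,Vec z,PetscScalar *xy,PetscScalar *xz)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = VecDotBegin(x,y,xy);CHKERRQ(ierr);
  ierr = VecDotBegin(x,z,xz);CHKERRQ(ierr);
  ierr = VecDotEnd(x,y,xy);CHKERRQ(ierr);   /* must pair with the first Begin */
  ierr = VecDotEnd(x,z,xz);CHKERRQ(ierr);   /* then the second */
  PetscFunctionReturn(0);
}

/* Blocking alternative: each dot completes before the next starts. */
static PetscErrorCode dots_blocking(Vec x,Vec y,Vec z,PetscScalar *xy,PetscScalar *xz)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = VecDot(x,y,xy);CHKERRQ(ierr);
  ierr = VecDot(x,z,xz);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

If two threads end up sharing the single reduction context attached to PETSC_COMM_SELF (as you suspect below), their Begin/End calls can interleave, which would be consistent with the "different order ... than VecxxxBegin()" error you are seeing.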

Unfortunately I don’t know enough about multithreading to say definitively whether that will actually solve the problem. Other members of the community can probably give a more complete answer on that.
<br class="">
<blockquote type="cite" class="">
<div class="">
<div dir="ltr" class="">
<div dir="ltr" class="">
<div dir="ltr" class="">
<div dir="ltr" class="">
<div dir="ltr" class=""><br class="">
</div>
<div dir="ltr" class="">I assume  PetscSplitReductionGet is pulling the PetscSplitReduction for PETSC_COMM_SELF which is shared across the whole process?</div>
<div dir="ltr" class=""><br class="">
</div>
<div dir="ltr" class="">I tried PetscCommDuplicate/PetscCommDestroy but that does not seem to help.<br class="">
</div>
<div dir="ltr" class="">
<div class=""><br class="">
</div>
<div class="">
<div class="">PetscErrorCode  VecDotBegin(Vec x,Vec y,PetscScalar *result)</div>
<div class="">{</div>
<div class="">  PetscErrorCode      ierr;</div>
<div class="">  PetscSplitReduction *sr;</div>
<div class="">  MPI_Comm            comm;</div>
<div class=""><br class="">
</div>
<div class="">  PetscFunctionBegin;</div>
<div class="">  PetscValidHeaderSpecific(x,VEC_CLASSID,1);</div>
<div class="">  PetscValidHeaderSpecific(y,VEC_CLASSID,1);</div>
<div class="">  ierr = PetscObjectGetComm((PetscObject)x,&comm);CHKERRQ(ierr);</div>
<div class="">  ierr = PetscSplitReductionGet(comm,&sr);CHKERRQ(ierr);</div>
<div class="">  if (sr->state != STATE_BEGIN) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_ORDER,"Called before all VecxxxEnd() called");</div>
<div class="">  if (sr->numopsbegin >= sr->maxops) {</div>
<div class="">    ierr = PetscSplitReductionExtend(sr);CHKERRQ(ierr);</div>
<div class="">  }</div>
<div class="">  sr->reducetype[sr->numopsbegin] = PETSC_SR_REDUCE_SUM;</div>
<div class="">  sr->invecs[sr->numopsbegin]     = (void*)x;</div>
<div class="">  if (!x->ops->dot_local) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_SUP,"Vector does not suppport local dots");</div>
<div class="">  ierr = PetscLogEventBegin(VEC_ReduceArithmetic,0,0,0,0);CHKERRQ(ierr);</div>
<div class="">  ierr = (*x->ops->dot_local)(x,y,sr->lvalues+sr->numopsbegin++);CHKERRQ(ierr);</div>
<div class="">  ierr = PetscLogEventEnd(VEC_ReduceArithmetic,0,0,0,0);CHKERRQ(ierr);</div>
<div class="">  PetscFunctionReturn(0);</div>
<div class="">}</div>
</div>
<div class=""><br class="">
</div>
<div class=""><br class="">
</div>
<div class=""><br class="">
</div>
</div>
</div>
</div>
</div>
</div>
<br class="">
<div class="gmail_quote">
<div dir="ltr" class="">On Thu, Dec 20, 2018 at 11:26 AM Smith, Barry F. <<a href="mailto:bsmith@mcs.anl.gov" class="">bsmith@mcs.anl.gov</a>> wrote:<br class="">
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br class="">
   The code src/ksp/ksp/examples/tutorials/ex61f.F90 demonstrates working with multiple threads each managing their own collection of PETSc objects. Hope this helps.<br class="">
<br class="">
    Barry<br class="">
<br class="">
<br class="">
> On Dec 20, 2018, at 9:28 AM, Krzysztof Kamieniecki via petsc-users <<a href="mailto:petsc-users@mcs.anl.gov" target="_blank" class="">petsc-users@mcs.anl.gov</a>> wrote:<br class="">
> <br class="">
> Hello All,<br class="">
> <br class="">
> I have an embarrassingly parallel problem that I would like to use TAO on, is there some way to do this with threads as opposed to multiple processes?<br class="">
> <br class="">
>  I compiled PETSc with the following flags<br class="">
> ./configure \<br class="">
> --prefix=${DEP_INSTALL_DIR} \<br class="">
> --with-threadsafety --with-log=0 --download-concurrencykit \<br class="">
> --with-openblas=1 \<br class="">
> --with-openblas-dir=${DEP_INSTALL_DIR} \<br class="">
> --with-mpi=0 \<br class="">
> --with-shared=0 \<br class="">
> --with-debugging=0 COPTFLAGS='-O3' CXXOPTFLAGS='-O3' FOPTFLAGS='-O3' <br class="">
> <br class="">
> When I run TAO in multiple threads I get the error "Called VecxxxEnd() in a different order or with a different vector than VecxxxBegin()"<br class="">
> <br class="">
> Thanks,<br class="">
> Krys<br class="">
> <br class="">
<br class="">
</blockquote>
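
For reference, the shape of the pattern Barry points at in ex61f.F90 (each thread creating, using, and destroying only its own objects on PETSC_COMM_SELF, with PetscInitialize/PetscFinalize called once from the main thread) would look roughly like this in C. This is only a sketch of that pattern under those assumptions, not a translation of the example:

/* Rough sketch: one set of PETSc objects per thread, plain blocking VecDot.
   Assumes PETSc was built with --with-threadsafety. */
#include <petscvec.h>
#include <pthread.h>

static void *worker(void *arg)
{
  Vec            x,y;
  PetscScalar    dot;
  PetscInt       n = 100;
  PetscErrorCode ierr;

  (void)arg;
  /* Each thread owns its own sequential vectors. */
  ierr = VecCreateSeq(PETSC_COMM_SELF,n,&x);CHKERRABORT(PETSC_COMM_SELF,ierr);
  ierr = VecDuplicate(x,&y);CHKERRABORT(PETSC_COMM_SELF,ierr);
  ierr = VecSet(x,1.0);CHKERRABORT(PETSC_COMM_SELF,ierr);
  ierr = VecSet(y,2.0);CHKERRABORT(PETSC_COMM_SELF,ierr);
  ierr = VecDot(x,y,&dot);CHKERRABORT(PETSC_COMM_SELF,ierr);  /* blocking, no shared split reduction */
  (void)dot;  /* would feed this thread's own solve */
  ierr = VecDestroy(&x);CHKERRABORT(PETSC_COMM_SELF,ierr);
  ierr = VecDestroy(&y);CHKERRABORT(PETSC_COMM_SELF,ierr);
  return NULL;
}

int main(int argc,char **argv)
{
  pthread_t      threads[4];
  PetscErrorCode ierr;

  /* Initialize and finalize PETSc once, from the main thread only. */
  ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return (int)ierr;
  for (int i = 0; i < 4; i++) pthread_create(&threads[i],NULL,worker,NULL);
  for (int i = 0; i < 4; i++) pthread_join(threads[i],NULL);
  ierr = PetscFinalize();
  return (int)ierr;
}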

—
Alp
<br class="">
</body>
</html>