<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><br><div><div>On Oct 20, 2012, at 5:02 PM, Jed Brown wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite">On Sat, Oct 20, 2012 at 4:54 PM, Shri <span dir="ltr"><<a href="mailto:abhyshr@mcs.anl.gov" target="_blank">abhyshr@mcs.anl.gov</a>></span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; position: static; z-index: auto; ">
<div class="im">On Oct 20, 2012, at 3:58 PM, Jed Brown wrote:<br>
<br>
> Worse, validity of the cache changes whether it has a barrier (in debug mode) or not.<br>
<br>
</div>I don't understand the point you are making here.</blockquote></div><br><div>There is a barrier in PetscCommDuplicate, but we cache the thread comm because it involves several lookups and had a nontrivial performance effect. Thus</div>
<div><br></div><div>Rank 0</div><div>PetscCommGetThreadComm(WORLD)</div><div>PetscCommGetThreadComm(SELF) // tcomm cache miss, okay because comm is sequential</div><div>PetscCommGetThreadComm(WORLD) // tcomm cache miss, triggers an MPI_Barrier(WORLD)</div>
<div><br></div><div>Rank 1</div><div><div>PetscCommGetThreadComm(WORLD)</div><div>PetscCommGetThreadComm(WORLD) // tcomm cache HIT, no matching barrier</div></div><div><br></div><div><br></div><div>I'm going to commit the patch below unless there is an objection.</div></blockquote><div><br></div>Seems fine to me.</div><div><br><blockquote type="cite">
<div><br></div><div><div># HG changeset patch</div><div># User Jed Brown <<a href="mailto:jed@59A2.org">jed@59A2.org</a>></div><div># Date 1350767578 18000</div><div># Node ID 376bb5351f20762cfd2309cbede0c349a2b34a6e</div><div># Parent 538a44f2a56cf3a419067269fb5f2e2ce7c2a70d</div>
<div>Attach ThreadComm to outer comm instead of inner comm</div><div><br></div><div>diff --git a/src/sys/threadcomm/interface/threadcomm.c b/src/sys/threadcomm/interface/threadcomm.c</div><div>--- a/src/sys/threadcomm/interface/threadcomm.c</div>
<div>+++ b/src/sys/threadcomm/interface/threadcomm.c</div><div>@@ -77,18 +77,15 @@</div><div> PetscErrorCode ierr;</div><div> PetscMPIInt flg;</div><div> void* ptr;</div><div>- MPI_Comm icomm;</div>
<div> </div><div> PetscFunctionBegin;</div><div> if (comm == comm_cached) {</div><div> *tcommp = tcomm_cached;</div><div> PetscFunctionReturn(0);</div><div> }</div><div>- ierr = PetscCommDuplicate(comm,&icomm,PETSC_NULL);CHKERRQ(ierr);</div>
<div>- ierr = MPI_Attr_get(icomm,Petsc_ThreadComm_keyval,&ptr,&flg);CHKERRQ(ierr);</div><div>+ ierr = MPI_Attr_get(comm,Petsc_ThreadComm_keyval,&ptr,&flg);CHKERRQ(ierr);</div><div> if (!flg) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_ARG_CORRUPT,"MPI_Comm does not have a thread communicator");</div>
<div> *tcommp = (PetscThreadComm)ptr;</div><div>- ierr = PetscCommDestroy(&icomm);CHKERRQ(ierr);</div><div> comm_cached = comm;</div><div> tcomm_cached = *tcommp;</div><div> PetscFunctionReturn(0);</div><div>
@@ -1154,20 +1151,14 @@</div><div> PetscErrorCode PetscThreadCommDetach(MPI_Comm comm)</div><div> {</div><div> PetscErrorCode ierr;</div><div>- MPI_Comm icomm;</div><div> PetscMPIInt flg;</div><div> void *ptr;</div>
<div> </div><div> PetscFunctionBegin;</div><div>- ierr = PetscCommDuplicate(comm,&icomm,PETSC_NULL);CHKERRQ(ierr);</div><div>- ierr = MPI_Attr_get(icomm,Petsc_ThreadComm_keyval,&ptr,&flg);CHKERRQ(ierr);</div>
<div>+ ierr = MPI_Attr_get(comm,Petsc_ThreadComm_keyval,&ptr,&flg);CHKERRQ(ierr);</div><div> if (flg) {</div><div>- ierr = MPI_Attr_delete(icomm,Petsc_ThreadComm_keyval);CHKERRQ(ierr);</div><div>+ ierr = MPI_Attr_delete(comm,Petsc_ThreadComm_keyval);CHKERRQ(ierr);</div>
<div> }</div><div>- ierr = PetscCommDestroy(&icomm);CHKERRQ(ierr);</div><div>- /* Release reference from PetscThreadCommAttach */</div><div>- ierr = PetscCommDestroy(&comm);CHKERRQ(ierr);</div><div>-</div><div>
PetscFunctionReturn(0);</div><div> }</div><div> </div><div>@@ -1180,16 +1171,14 @@</div><div> PetscErrorCode PetscThreadCommAttach(MPI_Comm comm,PetscThreadComm tcomm)</div><div> {</div><div> PetscErrorCode ierr;</div>
<div>- MPI_Comm icomm;</div><div> PetscMPIInt flg;</div><div> void *ptr;</div><div> </div><div> PetscFunctionBegin;</div><div>- ierr = PetscCommDuplicate(comm,&icomm,PETSC_NULL);CHKERRQ(ierr); /* This extra reference is released in PetscThreadCommDetach */</div>
<div>- ierr = MPI_Attr_get(icomm,Petsc_ThreadComm_keyval,&ptr,&flg);CHKERRQ(ierr);</div><div>+ ierr = MPI_Attr_get(comm,Petsc_ThreadComm_keyval,&ptr,&flg);CHKERRQ(ierr);</div><div> if (!flg) {</div><div>
tcomm->refct++;</div><div>- ierr = MPI_Attr_put(icomm,Petsc_ThreadComm_keyval,tcomm);CHKERRQ(ierr);</div><div>+ ierr = MPI_Attr_put(comm,Petsc_ThreadComm_keyval,tcomm);CHKERRQ(ierr);</div><div> }</div><div>
PetscFunctionReturn(0);</div><div> }</div></div><div><br></div>
</blockquote></div><br></body></html>