On Wed, Feb 15, 2012 at 8:53 PM, Sylvain Barbot <span dir="ltr"><<a href="mailto:sylbar.vainbot@gmail.com">sylbar.vainbot@gmail.com</a>></span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hi Matt,<br>
<br>
You're right, I'm not worried about the interior points, that why I'm<br>
looking for updating the ghost points only - the interior points<br>
unaffected by the process. Jed seems to indicate that it's not a<br>
bottle neck.<br>
<br>
Jed,<br>
<br>
Would you recommend a particular step to improve cache performance?<br>
Can you be a bit more specific?<br></blockquote><div><br></div><div>R-B has crappy cache performance since you alternately pull data but ignore</div><div>half of it. Why not just use the GMG in PETSc? Its very flexible, scalable,</div>
<div>and efficient.</div><div><br></div><div> Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Sylvain<br>
<br>
2012/2/15 Matthew Knepley <<a href="mailto:knepley@gmail.com">knepley@gmail.com</a>>:<br>
> On Wed, Feb 15, 2012 at 5:25 PM, Sylvain Barbot <<a href="mailto:sylbar.vainbot@gmail.com">sylbar.vainbot@gmail.com</a>><br>
<div class="im">> wrote:<br>
>><br>
>> Hi All,<br>
>><br>
>> I'm implementing an elasticity multi-grid solver with Petsc with<br>
>> matrix shells. I am using the Red-Black Gauss-Siedel smoother. In the<br>
>> middle of the Red-Black passes, I need to update the ghost points<br>
>> between processors. To do that effectively, I'd like to update only<br>
>> the ghost points between the different processors, instead of the<br>
>> whole array. The goal is to do the most possible operations in place<br>
><br>
><br>
</div>> What does that mean? Updating ghost dofs means taking the value of that<br>
> dof on the process that owns it, and copying it to all the processes which<br>
> hold<br>
> that as a ghost dof. This operation has no meaning for interior points.<br>
><br>
> Matt<br>
<div class="im">><br>
>><br>
>> in local array lx, and having to update global vector x only once. The<br>
>> working matrix multiply smoothing operation is shown below. I'm using<br>
>> Petsc 3.1. I read about<br>
>><br>
>> <a href="http://www.mcs.anl.gov/petsc/petsc-3.1/docs/manualpages/Vec/VecGhostUpdateBegin.html" target="_blank">http://www.mcs.anl.gov/petsc/petsc-3.1/docs/manualpages/Vec/VecGhostUpdateBegin.html</a>,<br>
>> but I'm not entirely clear whether this does what I want or not.<br>
>><br>
>><br>
</div>>> SUBROUTINE matrixsmooth(A,b,omega,flag,shift,its,level,x,ierr)<br>
>> USE types<br>
>> USE heap<br>
>> USE petscsys<br>
>> USE petscda<br>
>> USE petscis<br>
>> USE petscvec<br>
>> USE petscmat<br>
>> USE petscpc<br>
>> USE petscksp<br>
>><br>
>> IMPLICIT NONE<br>
>><br>
>> Mat, INTENT(IN) :: A<br>
>> Vec, INTENT(IN) :: b<br>
>> Vec, INTENT(INOUT) :: x<br>
>> PetscReal, INTENT(IN) :: omega,shift<br>
>> MatSORType, INTENT(IN) :: flag<br>
>> PetscInt, INTENT(IN) :: its, level ! iterations, multi-grid level<br>
>> PetscErrorCode, INTENT(OUT) :: ierr<br>
>><br>
>> Vec :: lx,lb<br>
>> PetscInt :: istart,iend<br>
>> PetscScalar, POINTER :: px(:) ! pointer to solution array<br>
>> PetscScalar, POINTER :: pb(:) ! pointer to body-force array<br>
>> INTEGER :: rank,isize,i,k<br>
>> INTEGER :: sw,off,p<br>
>> INTEGER :: i2,i3,i2i,i3i, &<br>
>> i000,i0p0,i00p, &<br>
>> i0m0,i00m<br>
>> TYPE(DALOCALINFOF90) :: info<br>
>><br>
>> CALL MPI_COMM_RANK(PETSC_COMM_WORLD,rank,ierr)<br>
>> CALL MPI_COMM_SIZE(PETSC_COMM_WORLD,isize,ierr)<br>
>><br>
>> CALL VecGetOwnershipRange(x,istart,iend,ierr)<br>
>><br>
>> ! allocate memory for local vector with ghost points<br>
>> CALL DACreateLocalVector(c%daul(1+level),lx,ierr)<br>
>> CALL VecDuplicate(lx,lb,ierr)<br>
>><br>
>> ! retrieve forcing term b with ghost points<br>
>> CALL DAGlobalToLocalBegin(c%daul(1+level),b,INSERT_VALUES,lb,ierr)<br>
>> CALL DAGlobalToLocalEnd(c%daul(1+level),b,INSERT_VALUES,lb,ierr)<br>
>><br>
>> ! obtain a pointer to local data<br>
>> CALL VecGetArrayF90(lx,px,ierr);<br>
>> CALL VecGetArrayF90(lb,pb,ierr);<br>
>><br>
>> ! geometry info about local vector and padding<br>
>> CALL DAGetLocalInfoF90(c%daul(1+level),info,ierr);<br>
>><br>
>> ! retrieve stencil width (ghost-point padding)<br>
>> CALL DAGetInfo(c%daul(1+level),PETSC_NULL_INTEGER,PETSC_NULL_INTEGER, &<br>
>><br>
>><br>
>> PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,<br>
>> &<br>
>> PETSC_NULL_INTEGER,PETSC_NULL_INTEGER, &<br>
>> sw,PETSC_NULL_INTEGER, &<br>
>> PETSC_NULL_INTEGER,ierr);<br>
>><br>
>> ! offset due to node topology<br>
>> off=MOD(info%xm+info%ym,2)<br>
>><br>
>> ! fast smoothing (relaxation) its=number of iteration<br>
>> DO k=1,its<br>
>> ! Red-Black Gauss-Siedel scheme<br>
>> DO p=0,1<br>
>> ! retrieve initial guess x with ghost points<br>
>> CALL<br>
>> DAGlobalToLocalBegin(c%daul(1+level),x,INSERT_VALUES,lx,ierr)<br>
>> CALL DAGlobalToLocalEnd(c%daul(1+level),x,INSERT_VALUES,lx,ierr)<br>
>> ! smoothing (relaxation)<br>
>> DO i=istart+p+off,iend-1,info%dof*2<br>
>> i3=(i-istart)/(info%xm*info%dof)<br>
>> i2=(i-istart-i3*info%xm*info%dof)/info%dof<br>
>> i3=i3+i3i ! i3 in ( 0 .. sx3-1 )<br>
>> i2=i2+i2i ! i2 in ( 0 .. sx2-1 )<br>
>><br>
>> i000=1+((sw+i3-i3i+0)*(info%xm+2*sw)+sw+i2-i2i+0)*info%dof<br>
>><br>
>> i0p0=1+((sw+i3-i3i+0)*(info%xm+2*sw)+sw+i2-i2i+1)*info%dof<br>
>> i00p=1+((sw+i3-i3i+1)*(info%xm+2*sw)+sw+i2-i2i+0)*info%dof<br>
>> i0m0=1+((sw+i3-i3i+0)*(info%xm+2*sw)+sw+i2-i2i-1)*info%dof<br>
>> i00m=1+((sw+i3-i3i-1)*(info%xm+2*sw)+sw+i2-i2i+0)*info%dof<br>
>><br>
>> px(i000+IU1)=px(i000+IU1)+ &<br>
>> (pb(i000+IU1)-((+(px(i0p0+IU1)-px(i000+IU1)<br>
>> )-(px(i000+IU1)-px(i0m0+IU1) )) &<br>
>><br>
>> +(+(px(i00p+IU1)-px(i000+IU1) )-(px(i000+IU1)-px(i00m+IU1) )))) /<br>
>> (-4._8)<br>
>> END DO<br>
>><br>
>> ! publish new values of x for its global vector<br>
>> CALL DALocalToGlobal(c%daul(1+level),lx,INSERT_VALUES,x,ierr)<br>
>><br>
>> END DO<br>
>> END DO<br>
>><br>
>> ! dispose of the local vectors<br>
>> CALL VecRestoreArrayF90(lx,px,ierr)<br>
>> CALL VecRestoreArrayF90(lb,pb,ierr)<br>
>> CALL DARestoreLocalVector(c%daul(1+level),lx,ierr)<br>
>> CALL DARestoreLocalVector(c%daul(1+level),lb,ierr)<br>
>><br>
>> END SUBROUTINE matrixsmooth<br>
>><br>
>><br>
>><br>
>><br>
>><br>
>> Could this be changed to something like:<br>
>><br>
>><br>
>><br>
>> ... INITIALIZE...<br>
>><br>
>> CALL DAGlobalToLocalBegin(c%daul(1+c%level),x,INSERT_VALUES,lx,ierr)<br>
>> CALL DAGlobalToLocalEnd(c%daul(1+c%level),x,INSERT_VALUES,lx,ierr)<br>
>><br>
>> ! fast smoothing (relaxation) its=number of iteration<br>
>> DO k=1,its<br>
>> ! Red-Black Gauss-Siedel scheme<br>
>> DO p=0,1<br>
>> ! smoothing (relaxation)<br>
>> DO i=istart+p+off,iend-1,info%dof*2<br>
>> ...STENCIL OPERATION...<br>
>> END DO<br>
>><br>
>> ...UPDATE GHOST POINTS...<br>
>><br>
>> END DO<br>
>> END DO<br>
>><br>
>> ! publish new values of x for its global vector<br>
>> CALL DALocalToGlobal(c%daul(1+level),lx,INSERT_VALUES,x,ierr)<br>
>><br>
>> ...CLEAN MEMORY...<br>
>><br>
>><br>
>><br>
>><br>
>><br>
>> Any recommendation?<br>
>><br>
>> Best wishes,<br>
>> Sylvain Barbot<br>
><br>
><br>
><br>
<span class="HOEnZb"><font color="#888888">><br>
> --<br>
> What most experimenters take for granted before they begin their experiments<br>
> is infinitely more interesting than any results to which their experiments<br>
> lead.<br>
> -- Norbert Wiener<br>
</font></span></blockquote></div><br><br clear="all"><div><br></div>-- <br>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>
-- Norbert Wiener<br>