Hi Jed<div><br></div><div>That commit fixes the mg->galerkin issue. </div><div><br></div><div>I'm still having issues with the projection/restriction matrices. Are the projection/restriction matrices supposed to be serial matrices operating on the unknowns on each processor even when running in parallel? </div>
<div><br></div><div>Thanks,</div><div><br></div><div>Gaetan<br><br><div class="gmail_quote">On Sat, Sep 1, 2012 at 6:33 PM, Barry Smith <span dir="ltr"><<a href="mailto:bsmith@mcs.anl.gov" target="_blank">bsmith@mcs.anl.gov</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im"><br>
On Sep 1, 2012, at 5:16 PM, Gaetan Kenway <<a href="mailto:kenway@utias.utoronto.ca">kenway@utias.utoronto.ca</a>> wrote:<br>
<br>
> Hi Jed<br>
><br>
> After probing why the following code snippit isn't run:<br>
><br>
> if (mg->galerkin == 1) {<br>
> Mat B;<br>
> /* currently only handle case where mat and pmat are the same on coarser levels */<br>
</div><div class="im">> dB = B;<br>
> }<br>
> }<br>
><br>
> It is because after I use PCMGSetGalerkin(), mg->galerkin has a value of -1 not 1. From what I can tell the value of PETSC_TRUE is architecture/implementation dependant. I can't see where/how the value of mg->galerkin obtains a value of 1 and runs the code above.<br>
><br>
> As for the other issue, after converting my baij matrix to aij, running MatRART() is ok.<br>
><br>
> So we just have to figure out why calling<br>
><br>
><br>
> call PCMGSetGalerkin(pc, PETSC_TRUE, PETScIerr)<br>
> call EChk(PETScIerr,__FILE__,__LINE__)<br>
><br>
> sets mg->galerkin to a value of -1 and the code above is expecting a value of 1.<br>
<br>
</div> This is a mistake with our Fortran interface. Some Fortran compilers (for what crazy reason I don't know) use -1 as bool for true. Normally this doesn't matter because we have if (boolvalue) { checks and so negative one and one work the same way;<br>
<br>
But the code mg->galerkin = (PetscInt)use; in PCMGSetGalekin() fails because we then check the numerical value of the result, instead of just if it is zero or nonzero.<br>
<br>
I can change this conversion to mg->galerkin = use ? 1 : 0 so the the Fortran use of -1 gets converted to one and this should resolve the problem.<br>
<br>
I have pushed this fix into the petsc-3.3 and petsc-dev repositories. If you are using 3.3 I've attached a new src/ksp/pc/impls/mg/mg.c that you can drop into that directory and run make in that directory to update your library. If you are using petsc-dev then do an hg pull -u and rerun make in that directory.[see attached file: mg.c]<br>
<br>
Thanks for determining the cause of this rather nasty bug.<br>
<span class="HOEnZb"><font color="#888888"><br>
Barry<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
<br>
><br>
> Gaetan<br>
><br>
><br>
><br>
><br>
> On Sat, Sep 1, 2012 at 5:34 PM, Jed Brown <<a href="mailto:jedbrown@mcs.anl.gov">jedbrown@mcs.anl.gov</a>> wrote:<br>
> On Sat, Sep 1, 2012 at 4:20 PM, Gaetan Kenway <<a href="mailto:kenway@utias.utoronto.ca">kenway@utias.utoronto.ca</a>> wrote:<br>
> Sorry about the last email...I hit sent it too early.<br>
><br>
> The full calling sequence is:<br>
><br>
><br>
> call KSPCreate(SUMB_COMM_WORLD, ksp, PETScIerr)<br>
> call EChk(PETScIerr,__FILE__,__LINE__)<br>
><br>
> useAD = .False.<br>
> useTranspose = .True.<br>
> usePC = .True.<br>
> call setupStateResidualMatrix(drdwpret,useAD,usePC,useTranspose)<br>
><br>
> ! dRdwT has already been created<br>
> call KSPSetOperators(ksp, dRdWT, dRdwpreT, &<br>
> DIFFERENT_NONZERO_PATTERN, PETScIerr)<br>
> call EChk(PETScIerr,__FILE__,__LINE__)<br>
><br>
> call KSPSetFromOptions(ksp, PETScIerr)<br>
> call EChk(PETScIerr,__FILE__,__LINE__)<br>
><br>
> call KSPSetType(ksp, KSPFGMRES, PETScIerr)<br>
> call EChk(PETScIerr,__FILE__,__LINE__)<br>
><br>
> call KSPGMRESSetRestart(ksp, adjRestart, PETScIerr)<br>
> call EChk(PETScIerr,__FILE__,__LINE__)<br>
><br>
> call KSPGMRESSetCGSRefinementType(ksp, &<br>
> KSP_GMRES_CGS_REFINE_IFNEEDED, PETScIerr)<br>
> call EChk(PETScIerr,__FILE__,__LINE__)<br>
><br>
> call kspgetpc(ksp, pc, PETScIerr)<br>
> call EChk(PETScIerr,__FILE__,__LINE__)<br>
><br>
> call pcsettype(pc, PCMG, PETScIerr)<br>
> call EChk(PETScIerr,__FILE__,__LINE__)<br>
><br>
> call PetscOptionsSetValue('-pc_mg_levels','2',PETScIerr)<br>
> call EChk(PETScIerr,__FILE__,__LINE__)<br>
><br>
> ! Set command line options<br>
> call PCSetFromOptions(pc, PETScIerr)<br>
> call EChk(PETScIerr,__FILE__,__LINE__)<br>
><br>
> call PCMGSetGalerkin(pc, PETSC_TRUE, PETScIerr)<br>
> call EChk(PETScIerr,__FILE__,__LINE__)<br>
><br>
> ! Create the restriction operator between finest and one below:<br>
> call createRestrictOperator(RL1, 1)<br>
><br>
> call PCMGSetRestriction(pc, 1, RL1, PETScierr)<br>
> call EChk(PETScIerr,__FILE__,__LINE__)<br>
><br>
> call PCSetup(pc, PETScierr)<br>
> call EChk(PETScIerr,__FILE__,__LINE__)<br>
><br>
> The error occurs in PCSetup() which calls PCSetup_MG() and the issue is in KSPGetVecs() on line 644 in mg.c. Just to clarify, this isn't a memory problem/segfault.<br>
><br>
> The output below is a SEGV.<br>
><br>
> PETSc returns control cleanly and I my code exits when the error is detected. Therefore a backtrace or valgrind doesn't yield any additional information. ( I Ran both).<br>
><br>
> You can still set a breakpoint in PetscError (-start_in_debugger / -on_error_attach_debugger does this automatically). Then you can examine the stack.<br>
><br>
> The terminal output is:<br>
><br>
> [0]PETSC ERROR: ------------------------------------------------------------------------<br>
> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range<br>
> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger<br>
> [0]PETSC ERROR: or see <a href="http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC" target="_blank">http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC</a> ERROR: or try <a href="http://valgrind.org" target="_blank">http://valgrind.org</a> on GNU/linux and Apple Mac OS X to find memory corruption errors<br>
> [0]PETSC ERROR: likely location of problem given in stack below<br>
> [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------<br>
> =================================================================<br>
> PETSc or MPI Error. Error Code 73. Detected on Proc 0<br>
> Error at line: 130 in file: setupPETScKsp.F90<br>
> =================================================================<br>
> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,<br>
> [0]PETSC ERROR: INSTEAD the line number of the start of the function<br>
> [0]PETSC ERROR: is given.<br>
> [0]PETSC ERROR: [0] MatGetVecs line 8142 /nfs/mica/home/kenway/Downloads/petsc-3.3/src/mat/interface/matrix.c<br>
> [0]PETSC ERROR: [0] KSPGetVecs line 774 /nfs/mica/home/kenway/Downloads/petsc-3.3/src/ksp/ksp/interface/iterativ.c<br>
> [0]PETSC ERROR: [0] PCSetUp_MG line 508 /nfs/mica/home/kenway/Downloads/petsc-3.3/src/ksp/pc/impls/mg/mg.c<br>
> [0]PETSC ERROR: [0] PCSetUp line 810 /nfs/mica/home/kenway/Downloads/petsc-3.3/src/ksp/pc/interface/precon.c<br>
><br>
> I had a look at ex42.c but it uses a DM which I am not doing.<br>
><br>
> It uses the DM for convenience to build the operators, but it does *not* tell the KSP about the DM. That is, it does not call KSPSetDM() or PCSetDM(). So the code in mg.c does not know about the DM.<br>
><br>
> From the example there is the following:<br>
><br>
> ierr = PCMGSetLevels(pc,nlevels,PETSC_NULL);CHKERRQ(ierr);<br>
> ierr = PCMGSetType(pc,PC_MG_MULTIPLICATIVE);CHKERRQ(ierr);<br>
> ierr = PCMGSetGalerkin(pc,PETSC_TRUE);CHKERRQ(ierr);<br>
><br>
> for (k=1; k<nlevels; k++){<br>
> ierr = DMCreateInterpolation(da_list[k-1],da_list[k],&R,PETSC_NULL);CHKERRQ(ierr);<br>
> ierr = PCMGSetInterpolation(pc,k,R);CHKERRQ(ierr);<br>
> ierr = MatDestroy(&R);CHKERRQ(ierr);<br>
> }<br>
><br>
> Is there anything special about "R" in this case?<br>
><br>
> Nothing special, it is just an MAIJ matrix that performs the interpolation.<br>
><br>
> As I mentioned before, I know what the issue is, the coarse grid operators are not set at all. From what I can tell, they should be set in the routine PCSetup_MG in mg.c, but the code is never run unless pc->dm is True.<br>
><br>
> What code is not run unless pc->dm?<br>
><br>
> I guess I'll just create the operators myself. Even if I modify the code to make it enter the if statement at line 577<br>
><br>
> You called PCMGSetGalerkin() so the code below should run, constructing the coarse levels. You could try setting a breakpoint in there to see why it's not running.<br>
><br>
><br>
> if (mg->galerkin == 1) {<br>
> Mat B;<br>
> /* currently only handle case where mat and pmat are the same on coarser levels */<br>
> ierr = KSPGetOperators(mglevels[n-1]->smoothd,&dA,&dB,&uflag);CHKERRQ(ierr);<br>
> if (!pc->setupcalled) {<br>
> for (i=n-2; i>-1; i--) {<br>
> ierr = MatPtAP(dB,mglevels[i+1]->interpolate,MAT_INITIAL_MATRIX,1.0,&B);CHKERRQ(ierr);<br>
> ierr = KSPSetOperators(mglevels[i]->smoothd,B,B,uflag);CHKERRQ(ierr);<br>
> if (i != n-2) {ierr = PetscObjectDereference((PetscObject)dB);CHKERRQ(ierr);}<br>
> dB = B;<br>
> }<br>
> if (n > 1) {ierr = PetscObjectDereference((PetscObject)dB);CHKERRQ(ierr);}<br>
> } else {<br>
> for (i=n-2; i>-1; i--) {<br>
> ierr = KSPGetOperators(mglevels[i]->smoothd,PETSC_NULL,&B,PETSC_NULL);CHKERRQ(ierr);<br>
> ierr = MatPtAP(dB,mglevels[i+1]->interpolate,MAT_REUSE_MATRIX,1.0,&B);CHKERRQ(ierr);<br>
> ierr = KSPSetOperators(mglevels[i]->smoothd,B,B,uflag);CHKERRQ(ierr);<br>
> dB = B;<br>
> }<br>
> }<br>
><br>
> which seems like where it should be done....the MatPTAP will not run.<br>
><br>
> What do you mean "will not run"?<br>
><br>
> I'm using a mpibaij matrix for 'dRdwPreT' and my restriction matrix 'RL1' is a mpiaij matrix. When I try to run MatRART myself with the code:<br>
><br>
> call MatRART(dRdwT, RL1, MAT_INITIAL_MATRIX, 1.0, newMat, PETScIERR)<br>
> call EChk(PETScIerr,__FILE__,__LINE__)<br>
><br>
> I receive error code 56: /* no support for requested operation */ (this is a clean exit from PETSc)<br>
><br>
> MatRARt (or MatPtAP, for that matter) is not implemented for BAIJ with an AIJ projection. The AIJ projection will lose the block structure anyway, so just use AIJ for both.<br>
><br>
><br>
> In conclusion, I have two issues: 1) The coarse grid operators are not being set automatically in mg.c. 2) Using MatRART() with a mpibaij matrix and a mpiaij matrix is unsupported.<br>
><br>
> Is there a PCMG example that doesn't use a DM to generate the restriction/prolongation/matrix information I can have a look at?<br>
><br>
><br>
</div></div></blockquote></div><br></div>