[petsc-users] Big discrepancy between machines

Barry Smith bsmith at mcs.anl.gov
Thu Dec 17 21:53:22 CST 2015


  Except for the "flexible" Krylov methods like FGMRES, you cannot put a Krylov method inside a Krylov method: (non-flexible) Krylov methods require the preconditioner to be a linear operator, and a Krylov method is not a linear operator.
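To see concretely why an inner Krylov solve breaks plain GMRES, here is a toy sketch (plain Python, hypothetical 2x2 system, not PETSc): a single steepest-descent step for Ax = b returns alpha(b)*b, where the step size alpha depends on b, so the "preconditioner" M is not linear: M(u+v) != M(u) + M(v).

```python
def matvec(A, v):
    # dense matrix-vector product on plain lists
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def krylov_step(A, b):
    """One steepest-descent step for A x = b, starting from x0 = 0."""
    alpha = dot(b, b) / dot(b, matvec(A, b))   # alpha depends on b -> nonlinear in b
    return [alpha * bi for bi in b]

A = [[4.0, 1.0], [1.0, 3.0]]        # hypothetical SPD matrix
u, v = [1.0, 0.0], [0.0, 1.0]

lhs = krylov_step(A, [ui + vi for ui, vi in zip(u, v)])               # M(u + v) = [2/9, 2/9]
rhs = [a + b for a, b in zip(krylov_step(A, u), krylov_step(A, v))]   # M(u) + M(v) = [1/4, 1/3]
```

FGMRES tolerates a changing (nonlinear) preconditioner because it stores the preconditioned vectors explicitly, instead of assuming the preconditioner is one fixed linear operator.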

  
> On Dec 17, 2015, at 9:41 PM, Timothée Nicolas <timothee.nicolas at gmail.com> wrote:
> 
> Ah ! That was the reason ! 
> 
> I do have actually four GMRES inside the outer main KSP GMRES. I do a lot of shenanigans in my preconditioning. Namely, I take the input vector x, and extract the different degrees of freedom, putting them inside 4 vectors (2 vectors with 3 dof, and 2 vectors of scalars). Then I run a GMRES on each of these vectors (preconditioned with multigrid for at least one of them). When this is all done I combine all the vectors again (there are some matrix multiplications involved, with the different vectors coupled) to put them in a vector with the same format (8 dof) as the initial input vector, and the preconditioning is done.
> 
> So, simply adding -ksp_type fgmres seems to fix my problem, I just checked. You have just saved me several days/weeks of scratching my head, if not more !
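For reference, the fix amounts to a single runtime option; adding -ksp_view prints the full nested solver configuration, so the fgmres-outside / gmres-inside arrangement can be verified (the executable name is a placeholder):

```
./my_code -ksp_type fgmres -ksp_view
```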
> 
> I had no clue standard GMRES was picky about preconditioners. So now, after telling you what I do in the preconditioner, you confirm indeed that simply using FGMRES is OK right ? 
> 
> Thx !
> 
> Best 
> 
> Timothee
> 
> 
> 
> 2015-12-18 11:51 GMT+09:00 Barry Smith <bsmith at mcs.anl.gov>:
> 
>    There is something inconsistent with the preconditioners or the matrix vector product. Do you have a gmres outside a gmres? That definitely won't work and would produce this problem. If you have a gmres somewhere inside the preconditioner then you need to have a fgmres outside of that (and not a gmres). Run with -ksp_view and send the output showing what your solver is.
> 
>   Barry
> 
> > On Dec 17, 2015, at 8:16 PM, Timothée Nicolas <timothee.nicolas at gmail.com> wrote:
> >
> > OK, I can build the residuals with this routine. It allowed me to see that the residuals computed by PETSc (with KSPGetResidualNorm), which do converge rapidly, are totally inconsistent with what I compute from
> >
> >   call KSPBuildSolution(ksp,PETSC_NULL_OBJECT,x,ierr)  ! current GMRES iterate
> >   call VecDuplicate(x,y,ierr)
> >   call KSPGetRHS(ksp,b,ierr)
> >   call KSPGetOperators(ksp,A,PETSC_NULL_OBJECT,ierr)
> >   call MatMult(A,x,y,ierr)                             ! y = A*x
> >   call VecAXPY(y,-one,b,ierr)                          ! y = A*x - b
> >   call VecNorm(y,NORM_2,norm,ierr)                     ! true residual norm
> >   call PrintReal('      KSP Residual norm',norm,itwelve,ierr)
> >   call VecDestroy(y,ierr)
> >
> > Now, the residuals computed with this method are consistent with the next residual computed by SNES. In other words, I get the following output (see below; the first KSP residual is the internal PETSc one, the second is mine). As you can see, the one I compute is consistent (up to a few digits) with what goes into the updated SNES solution. How is that possible? I tried to see what KSPGetResidualNorm does, but I could only find the instruction
> >
> >   *rnorm = ksp->rnorm
> >
> > and then I don't know where to look next...
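One built-in way to see both quantities side by side, without hand-rolling the residual computation, is PETSc's true-residual monitor, which prints the residual norm that the KSP itself tracks together with the unpreconditioned ||b - Ax|| at each iteration (executable name is a placeholder):

```
./my_code -ksp_monitor_true_residual
```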
> >
> > Here's what the output looks like
> >
> > 0 SNES Function norm 4.867111713544e-03
> >
> >     0 KSP Residual norm 4.958714442097e-03
> >       KSP Residual norm 4.867111713544E-03
> >
> >     1 KSP Residual norm 3.549907385578e-04
> >       KSP Residual norm 1.651154147130E-03
> >
> >     2 KSP Residual norm 3.522862963778e-05
> >       KSP Residual norm 1.557509645650E-03
> >
> >     3 KSP Residual norm 3.541384239147e-06
> >       KSP Residual norm 1.561088378958E-03
> >
> >     4 KSP Residual norm 3.641326695942e-07
> >       KSP Residual norm 1.560783284631E-03
> >
> >     5 KSP Residual norm 3.850512634373e-08
> >       KSP Residual norm 1.560806669239E-03
> >
> >   1 SNES Function norm 1.560806759605e-03
> >       etc.. etc...
> >
> > On the cluster, I don't find exactly the behavior above, the two norms agree rather well up to a few digits, at least at the beginning, but they start to be very different at the end of the iterations (up to 2 orders of magnitude, which also gets me quite worried).
> >
> > Thx
> >
> > Timothee
> >
> >
> >
> > 2015-12-18 1:45 GMT+09:00 Barry Smith <bsmith at mcs.anl.gov>:
> >
> > > On Dec 17, 2015, at 7:47 AM, Timothée Nicolas <timothee.nicolas at gmail.com> wrote:
> > >
> > > It works very smoothly for the SNES :-), but KSPGetSolution keeps returning a zero vector... KSPGetResidualNorm gives me the norm alright, but I would like to actually see the vectors. Is KSPGetSolution the wrong routine ?
> >
> >   If you are using GMRES, which is the default, it actually DOES NOT have a representation of the solution at each step. Yes, that seems odd, but it only computes the solution vector at a restart or when the iteration ends. This is why KSPGetSolution doesn't provide anything useful. You can use KSPBuildSolution() to have it construct the "current" solution whenever you need it.
> >
> >   Barry
> >
> >
> >
> > >
> > > Thx
> > >
> > > Timothée
> > >
> > > 2015-12-17 19:19 GMT+09:00 Dave May <dave.mayhem23 at gmail.com>:
> > >
> > >
> > > On 17 December 2015 at 11:00, Timothée Nicolas <timothee.nicolas at gmail.com> wrote:
> > > Hi,
> > >
> > > So, valgrind is OK (at least on the local machine. Actually on the cluster helios, it produces strange results even for the simplest petsc program PetscInitialize followed by PetscFinalize, I will try to figure this out with their technical team), and I have also tried with exactly the same versions (3.6.0) and it does not change the behavior.
> > >
> > > So now I would like to know how to get a grip on what comes in and out of the SNES and the KSP internal to the SNES. That is, I would like to inspect manually the vector which enters the SNES in the first place (it should be zero, I believe), what is being fed to the KSP, the vector which comes out of it, etc., if possible at each iteration of the SNES. I want to actually see these vectors, and compute their norms by hand. The trouble is, it is really hard to understand why the Newton residuals are not reduced when the KSP converges so nicely. This does not make any sense to me, so I want to know what happens to the vectors. But in the SNES list of routines, I did not find the tools that would allow me to do that (and messing around with the C code is too hard for me; it would take me weeks). Does someone have a hint?
> > >
> > > The only sane way to do this is to write a custom monitor for your SNES object.
> > >
> > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESMonitorSet.html
> > >
> > > Inside your monitor, you have access to the SNES, and everything it defines, e.g. the current solution, non-linear residual, KSP etc. See these pages
> > >
> > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESGetSolution.html
> > >
> > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESGetFunction.html
> > >
> > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESGetKSP.html
> > >
> > > Then you can pull apart the residual and compute specific norms (or plot the residual).
> > >
> > > Hopefully you can access everything you need to perform your analysis.
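A minimal sketch of such a monitor in Fortran follows. All names here are hypothetical, and the exact Fortran include/binding details vary between PETSc versions, so treat this as a template rather than working code:

```fortran
      subroutine MyMonitor(snes,its,fnorm,dummy,ierr)
      implicit none
#include <petsc/finclude/petsc.h>
      SNES snes
      PetscInt its, dummy
      PetscReal fnorm, xnorm
      PetscErrorCode ierr
      Vec x

      ! Grab the current Newton iterate and print a norm computed by hand,
      ! to compare against the norms PETSc reports.
      call SNESGetSolution(snes,x,ierr)
      call VecNorm(x,NORM_2,xnorm,ierr)
      write(*,*) 'SNES it', its, ' ||F|| =', fnorm, ' ||x|| =', xnorm
      end subroutine

      ! Registered once, before SNESSolve:
      ! call SNESMonitorSet(snes,MyMonitor,PETSC_NULL_OBJECT,
      !                     PETSC_NULL_FUNCTION,ierr)
```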
> > >
> > > Cheers,
> > >   Dave
> > >
> > >
> > > Thx
> > >
> > > Timothee
> > >
> > >
> > >
> > >
> > > 2015-12-15 14:20 GMT+09:00 Matthew Knepley <knepley at gmail.com>:
> > > On Mon, Dec 14, 2015 at 11:06 PM, Timothée Nicolas <timothee.nicolas at gmail.com> wrote:
> > > There is a difference in valgrind indeed between the two. It seems to be clean on my desktop Mac OS X but not on the cluster. I'll try to see what's causing this. I still don't understand well what's causing memory leaks in the case where all PETSc objects are freed correctly (as can be checked with -log_summary).
> > >
> > > Also, I have tried running either
> > >
> > > valgrind ./my_code -option1 -option2...
> > >
> > > or
> > >
> > > valgrind mpiexec -n 1 ./my_code -option1 -option2...
> > >
> > > Note here you would need --trace-children=yes for valgrind.
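Concretely, the two invocations would look like this (options are placeholders from the messages above); --trace-children=yes makes valgrind follow mpiexec into the child solver processes:

```
# valgrind wraps mpiexec, so it must follow the children
valgrind --trace-children=yes mpiexec -n 1 ./my_code -option1 -option2

# alternatively, attach valgrind to each rank directly
mpiexec -n 1 valgrind ./my_code -option1 -option2
```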
> > >
> > >   Matt
> > >
> > > It seems the second is the correct way to proceed right ? This gives very different behaviour for valgrind.
> > >
> > > Timothee
> > >
> > >
> > >
> > > 2015-12-14 17:38 GMT+09:00 Timothée Nicolas <timothee.nicolas at gmail.com>:
> > > OK, I'll try that, thx
> > >
> > > 2015-12-14 17:38 GMT+09:00 Dave May <dave.mayhem23 at gmail.com>:
> > > You have the configure line, so it should be relatively straightforward to configure / build petsc in your home directory.
> > >
> > >
> > > On 14 December 2015 at 09:34, Timothée Nicolas <timothee.nicolas at gmail.com> wrote:
> > > OK, The problem is that I don't think I can change this easily as far as the cluster is concerned. I obtain access to petsc by loading the petsc module, and even if I have a few choices, I don't see any debug builds...
> > >
> > > 2015-12-14 17:26 GMT+09:00 Dave May <dave.mayhem23 at gmail.com>:
> > >
> > >
> > > On Monday, 14 December 2015, Timothée Nicolas <timothee.nicolas at gmail.com> wrote:
> > > Hum, OK. I use FORTRAN by the way. Is your comment still valid ?
> > >
> > > No. Fortran compilers typically initialize variables to zero.
> > > In this case, I would run a debug build on your OSX machine through valgrind and make sure it is clean.
> > >
> > > Other obvious thing to check what happens if use exactly the same petsc builds on both machines. I see 3.6.1 and 3.6.0 are being used.
> > >
> > > For all this type of checking, I would definitely use debug builds on both machines. Your cluster build is using the highest level of optimization...
> > >
> > >
> > >
> > > I'll check anyway, but I thought I had been careful about this sort of things.
> > >
> > > Also, I thought the problem on Mac OS X may have been due to the fact I used the version with debugging on, so I rerun configure with --with-debugging=no, which did not change anything.
> > >
> > > Thx
> > >
> > > Timothee
> > >
> > >
> > > 2015-12-14 17:04 GMT+09:00 Dave May <dave.mayhem23 at gmail.com>:
> > > One suggestion is you have some uninitialized variables in your pcshell. Despite your arch being called "debug", your configure options indicate you have turned debugging off.
> > >
> > > The C standard doesn't prescribe how uninitialized variables should be treated - the behavior is labelled as undefined. As a result, different compilers on different archs with the same optimization flags can and will treat uninitialized variables differently. I find OS X C compilers tend to set them to zero.
> > >
> > > I suggest compiling a debug build on both machines and trying your test again. Also, consider running the debug builds through valgrind.
> > >
> > > Thanks,
> > >   Dave
> > >
> > > On Monday, 14 December 2015, Timothée Nicolas <timothee.nicolas at gmail.com> wrote:
> > > Hi,
> > >
> > > I have noticed I have a VERY big difference in behaviour between two machines in my problem, solved with SNES. I can't explain it, because I have tested my operators, which give the same result. I also checked that the vectors fed to the SNES are the same. The problem happens only with my shell preconditioner. When I don't use it, and simply solve using -snes_mf, I don't see any more than the usual 3-4 changing digits at the end of the residuals. However, when I use my pcshell, the results are completely different between the two machines.
> > >
> > > I have attached output_SuperComputer.txt and output_DesktopComputer.txt, which correspond to the output from the exact same code and options (and of course same input data file !). More precisely
> > >
> > > output_SuperComputer.txt : output on a supercomputer called Helios, sorry I don't know the exact specs.
> > > In this case, the SNES norms are reduced successively:
> > > 0 SNES Function norm 4.867111712420e-03
> > > 1 SNES Function norm 5.632325929998e-08
> > > 2 SNES Function norm 7.427800084502e-15
> > >
> > > output_DesktopComputer.txt : output on a Mac OS X Yosemite 3.4 GHz Intel Core i5, 16 GB 1600 MHz DDR3 machine (the same happens on another laptop with Mac OS X Mavericks).
> > > In this case, I obtain the following SNES norms,
> > > 0 SNES Function norm 4.867111713544e-03
> > > 1 SNES Function norm 1.560094052222e-03
> > > 2 SNES Function norm 1.552118650943e-03
> > > 3 SNES Function norm 1.552106297094e-03
> > > 4 SNES Function norm 1.552106277949e-03
> > > which I can't explain, because otherwise the KSP residual (with the same operator, which I checked) behave well.
> > >
> > > As you can see, the first time the preconditioner is applied (the DB_, DP_, Drho_ and PS_ solves), the two outputs coincide (except for the last few digits, up to 9 actually, which is more than I would expect), and everything starts to diverge at the first print of the main KSP residual norms (the KSP stemming from the SNES).
> > >
> > > Do you have an idea what may cause such a strange behaviour ?
> > >
> > > Best
> > >
> > > Timothee
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > --
> > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> > > -- Norbert Wiener
> > >
> > >
> > >
> >
> >
> 
> 



More information about the petsc-users mailing list