From doss0032 at umn.edu Mon May 1 15:13:31 2017 From: doss0032 at umn.edu (Scott Dossa) Date: Mon, 1 May 2017 15:13:31 -0500 Subject: [petsc-users] Call KSP routine before each timestep Message-ID: Hi All, I'm looking to pass a vector between a KSP and TS routine. The KSP routine must be called before each timestep, and the solution vector is needed for the TS routine. Normally, TSSolve() runs over all timesteps, but in my case, I'd like to be able to add a routine before each timestep. Can someone direct me to an example script or briefly explain a case which shows how to control time stepping such that one could achieve something along the lines of: while (step < maxsteps+1){ KSPSolve(ksp, v, p); /* solves for Vec p and passes this info onto TS */ TSSolve(ts, u); /* only iterate for 1 timestep */ } The function TSSetPreStep() seemed promising, but it can only take TS as arguments which may not be sufficient to pass a global vector. Thank you in advance. Scott Dossa -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 1 15:24:20 2017 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 1 May 2017 15:24:20 -0500 Subject: [petsc-users] Call KSP routine before each timestep In-Reply-To: References: Message-ID: On Mon, May 1, 2017 at 3:13 PM, Scott Dossa wrote: > Hi All, > > I'm looking to pass a vector between a KSP and TS routine. The KSP routine > must be called before each timestep, and the solution vector is needed for > the TS routine. Normally, TSSolve() runs over all timesteps, but in my > case, I'd like to be able to add a routine before each timestep. > > Can someone direct me to an example script or briefly explain a case which > shows how to control time stepping such that one could achieve something > along the lines of: > > while (step < maxsteps+1){ > KSPSolve(ksp, v, p); /* solves for Vec p and passes this info onto > TS */ > TSSolve(ts, u); /* only iterate for 1 timestep */ > } > > The function TSSetPreStep() seemed promising, but it can only take TS as > arguments which may not be sufficient to pass a global vector. > Yes, this is the correct thing. You can a) Just attach a Vec to the TS using PetscObjectCompose(), but that is ugly so you can b) Make a context structure, and stick it in the TS using http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSSetApplicationContext.html That is also where the KSP should go. Thanks, Matt > Thank you in advance. > Scott Dossa > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon May 1 15:32:22 2017 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 1 May 2017 15:32:22 -0500 Subject: [petsc-users] Call KSP routine before each timestep In-Reply-To: References: Message-ID: Scott - Are you doing some kind of pressure projection method? PETSc-developers - should this functionality be directly added to TS since it comes up fairly often? Barry > On May 1, 2017, at 3:24 PM, Matthew Knepley wrote: > > On Mon, May 1, 2017 at 3:13 PM, Scott Dossa wrote: > Hi All, > > I'm looking to pass a vector between a KSP and TS routine. The KSP routine must be called before each timestep, and the solution vector is needed for the TS routine. 
Normally, TSSolve() runs over all timesteps, but in my case, I'd like to be able to add a routine before each timestep. > > Can someone direct me to an example script or briefly explain a case which shows how to control time stepping such that one could achieve something along the lines of: > > while (step < maxsteps+1){ > KSPSolve(ksp, v, p); /* solves for Vec p and passes this info onto TS */ > TSSolve(ts, u); /* only iterate for 1 timestep */ > } > > The function TSSetPreStep() seemed promising, but it can only take TS as arguments which may not be sufficient to pass a global vector. > > Yes, this is the correct thing. You can > > a) Just attach a Vec to the TS using PetscObjectCompose(), but that is ugly so you can > > b) Make a context structure, and stick it in the TS using > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSSetApplicationContext.html > > That is also where the KSP should go. > > Thanks, > > Matt > > Thank you in advance. > Scott Dossa > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From doss0032 at umn.edu Mon May 1 16:27:45 2017 From: doss0032 at umn.edu (Scott Dossa) Date: Mon, 1 May 2017 16:27:45 -0500 Subject: [petsc-users] Call KSP routine before each timestep In-Reply-To: References: Message-ID: Hi All, Matt: Thank you! Using the application context is a good approach to pass the vector information. Can you also direct me to which command allows TSSolve to be only called for one timestep / start at the correct timestep? When TSSolve() is called, it always resets to timestep 0. Barry: Yes, this is a pressure projection method where one needs the pressure field at each timestep to solve for the velocity field. I will likely have more follow up questions as I quick write this up. Thank you both for your input. -Scott Dossa On Mon, May 1, 2017 at 3:32 PM, Barry Smith wrote: > > Scott - Are you doing some kind of pressure projection method? > > PETSc-developers - should this functionality be directly added to TS > since it comes up fairly often? > > Barry > > > > > On May 1, 2017, at 3:24 PM, Matthew Knepley wrote: > > > > On Mon, May 1, 2017 at 3:13 PM, Scott Dossa wrote: > > Hi All, > > > > I'm looking to pass a vector between a KSP and TS routine. The KSP > routine must be called before each timestep, and the solution vector is > needed for the TS routine. Normally, TSSolve() runs over all timesteps, but > in my case, I'd like to be able to add a routine before each timestep. > > > > Can someone direct me to an example script or briefly explain a case > which shows how to control time stepping such that one could achieve > something along the lines of: > > > > while (step < maxsteps+1){ > > KSPSolve(ksp, v, p); /* solves for Vec p and passes this info > onto TS */ > > TSSolve(ts, u); /* only iterate for 1 timestep */ > > } > > > > The function TSSetPreStep() seemed promising, but it can only take TS as > arguments which may not be sufficient to pass a global vector. > > > > Yes, this is the correct thing. You can > > > > a) Just attach a Vec to the TS using PetscObjectCompose(), but that is > ugly so you can > > > > b) Make a context structure, and stick it in the TS using > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/ > TSSetApplicationContext.html > > > > That is also where the KSP should go. > > > > Thanks, > > > > Matt > > > > Thank you in advance. 
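
A rough sketch of option (b) quoted above: stash the KSP and the vectors in an application context and run the extra solve from a pre-step hook. The names AppCtx and PreStep (and the member names) are illustrative, not PETSc names, and error checking is abbreviated.

#include <petscts.h>

typedef struct {
  KSP ksp;   /* solver for the auxiliary (e.g. pressure) system */
  Vec v, p;  /* its right-hand side and solution */
} AppCtx;

/* called by TS at the beginning of every time step */
PetscErrorCode PreStep(TS ts)
{
  AppCtx         *user;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = TSGetApplicationContext(ts, &user);CHKERRQ(ierr);
  /* solve for p; the TS residual routines can read it from the same context */
  ierr = KSPSolve(user->ksp, user->v, user->p);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

/* in main(), after creating ts, ksp, v, p: */
AppCtx user = {ksp, v, p};
ierr = TSSetApplicationContext(ts, &user);CHKERRQ(ierr);
ierr = TSSetPreStep(ts, PreStep);CHKERRQ(ierr);
ierr = TSSolve(ts, u);CHKERRQ(ierr);

With this arrangement TSSolve() is called once for the whole run; there is no need to drive the stepping loop by hand or to call TSStep() directly.
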
> > Scott Dossa > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 1 16:42:07 2017 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 1 May 2017 16:42:07 -0500 Subject: [petsc-users] Call KSP routine before each timestep In-Reply-To: References: Message-ID: On Mon, May 1, 2017 at 4:27 PM, Scott Dossa wrote: > Hi All, > > Matt: > Thank you! Using the application context is a good approach to pass the > vector information. Can you also direct me to which command allows TSSolve > to be only called for one timestep / start at the correct timestep? When > TSSolve() is called, it always resets to timestep 0. > You should not need that since PreStep will be called at the beginning of each step, but just in case http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSStep.html although using this is tricky so I do not recommend it. > Barry: > Yes, this is a pressure projection method where one needs the pressure > field at each timestep to solve for the velocity field. > If it was me, I would not do it this way, but its somewhat a matter of taste. It makes more sense to me to formulate the whole system as a DAE, meaning time derivatives on some things (v) and not others (p). Then use a DAE timestepper and your fluid solver can be formulated as pressure projection using PCFIELDSPLIT. This way, if you want to use another kind of fluid solver, you can, whereas now you are stuck with the alternation of projection of momentum update. Thanks, Matt > I will likely have more follow up questions as I quick write this up. > Thank you both for your input. > -Scott Dossa > > On Mon, May 1, 2017 at 3:32 PM, Barry Smith wrote: > >> >> Scott - Are you doing some kind of pressure projection method? >> >> PETSc-developers - should this functionality be directly added to TS >> since it comes up fairly often? >> >> Barry >> >> >> >> > On May 1, 2017, at 3:24 PM, Matthew Knepley wrote: >> > >> > On Mon, May 1, 2017 at 3:13 PM, Scott Dossa wrote: >> > Hi All, >> > >> > I'm looking to pass a vector between a KSP and TS routine. The KSP >> routine must be called before each timestep, and the solution vector is >> needed for the TS routine. Normally, TSSolve() runs over all timesteps, but >> in my case, I'd like to be able to add a routine before each timestep. >> > >> > Can someone direct me to an example script or briefly explain a case >> which shows how to control time stepping such that one could achieve >> something along the lines of: >> > >> > while (step < maxsteps+1){ >> > KSPSolve(ksp, v, p); /* solves for Vec p and passes this info >> onto TS */ >> > TSSolve(ts, u); /* only iterate for 1 timestep */ >> > } >> > >> > The function TSSetPreStep() seemed promising, but it can only take TS >> as arguments which may not be sufficient to pass a global vector. >> > >> > Yes, this is the correct thing. You can >> > >> > a) Just attach a Vec to the TS using PetscObjectCompose(), but that >> is ugly so you can >> > >> > b) Make a context structure, and stick it in the TS using >> > >> > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages >> /TS/TSSetApplicationContext.html >> > >> > That is also where the KSP should go. >> > >> > Thanks, >> > >> > Matt >> > >> > Thank you in advance. 
>> > Scott Dossa >> > >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Mon May 1 19:14:56 2017 From: mfadams at lbl.gov (Mark Adams) Date: Mon, 1 May 2017 20:14:56 -0400 Subject: [petsc-users] SNES error Message-ID: I get this SNES failure and I don't understand what the problem is. The rtol is 1.e-6 and the first iteration reduces the residual by 9 orders of magnitude. Yet, TS is not satisfied. What is going on here? mpiexec -n 1 ./vml -v_coord_cylinder -x_dm_refine 2 -v_dm_refine 2 -snes_rtol 1.e-6 -snes_stol 1.e-6 -ts_type cn -snes_fd -pc_type lu -ksp_type preonly -x_petscspace_order 1 -x_petscspace_poly_tensor -v_petscspace_order 1 -v_petscspace_poly_tensor -ts_dt .1 -ts_max_steps 10 -ts_final_time 1e10 -verbose 3 -num_species 1 -snes_monitor -masses 1,2,4 -thermal_temps 30,30,30 -domainv_lo -2,-2 -domainv_hi 2,2 -domainx_lo -12,-12 -domainx_hi 12,12 -E 0,0 -blobx_radius 2 -x_dm_view hdf5:x.h5 -x_vec_view hdf5:x.h5::append -v_dm_view hdf5:v.h5 -v_vec_view hdf5:v.h5::append -x_pre_dm_view hdf5:prex.h5 -x_pre_vec_view hdf5:prex.h5::append .... 0 SNES Function norm 4.097052680599e+00 1 SNES Function norm 1.213148652908e-09 [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: [0]PETSC ERROR: TSStep has failed due to DIVERGED_NONLINEAR_SOLVE, increase -ts_max_snes_failures or make negative to attempt recovery [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Development GIT revision: v3.7.6-3659-g699918129c GIT Date: 2017-04-26 08:18:35 -0400 [0]PETSC ERROR: ./vml on a arch-macosx-gnu-O named MarksMac-5.local by markadams Mon May 1 19:21:32 2017 [0]PETSC ERROR: Configure options --with-cc=clang --with-cc++=clang++ COPTFLAGS="-O3 -g -mavx2" CXXOPTFLAGS="-O3 -g -mavx2" FOPTFLAGS="-O3 -g -mavx2" --download-mpich=1 --download-parmetis=1 --download-metis=1 --download-hypre=1 --download-ml=1 --download-triangle=1 --download-ctetgen=1 --download-p4est=1 --with-x=0 --download-superlu_dist --download-superlu --download-ctetgen --with-debugging=0 --download-hdf5=1 PETSC_ARCH=arch-macosx-gnu-O --download-chaco --with-viewfromoptions=1 -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 1 19:51:46 2017 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 1 May 2017 19:51:46 -0500 Subject: [petsc-users] SNES error In-Reply-To: References: Message-ID: Run with -snes_converged_reason. Matt On Mon, May 1, 2017 at 7:14 PM, Mark Adams wrote: > I get this SNES failure and I don't understand what the problem is. The > rtol is 1.e-6 and the first iteration reduces the residual by 9 orders of > magnitude. Yet, TS is not satisfied. What is going on here? 
> > mpiexec -n 1 ./vml -v_coord_cylinder -x_dm_refine 2 -v_dm_refine 2 > -snes_rtol 1.e-6 -snes_stol 1.e-6 -ts_type cn -snes_fd -pc_type lu > -ksp_type preonly -x_petscspace_order 1 -x_petscspace_poly_tensor > -v_petscspace_order 1 -v_petscspace_poly_tensor -ts_dt .1 -ts_max_steps 10 > -ts_final_time 1e10 -verbose 3 -num_species 1 -snes_monitor -masses 1,2,4 > -thermal_temps 30,30,30 -domainv_lo -2,-2 -domainv_hi 2,2 -domainx_lo > -12,-12 -domainx_hi 12,12 -E 0,0 -blobx_radius 2 -x_dm_view hdf5:x.h5 > -x_vec_view hdf5:x.h5::append -v_dm_view hdf5:v.h5 -v_vec_view > hdf5:v.h5::append -x_pre_dm_view hdf5:prex.h5 -x_pre_vec_view > hdf5:prex.h5::append > .... > > 0 SNES Function norm 4.097052680599e+00 > 1 SNES Function norm 1.213148652908e-09 > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: > [0]PETSC ERROR: TSStep has failed due to DIVERGED_NONLINEAR_SOLVE, > increase -ts_max_snes_failures or make negative to attempt recovery > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [0]PETSC ERROR: Petsc Development GIT revision: v3.7.6-3659-g699918129c > GIT Date: 2017-04-26 08:18:35 -0400 > [0]PETSC ERROR: ./vml on a arch-macosx-gnu-O named MarksMac-5.local by > markadams Mon May 1 19:21:32 2017 > [0]PETSC ERROR: Configure options --with-cc=clang --with-cc++=clang++ > COPTFLAGS="-O3 -g -mavx2" CXXOPTFLAGS="-O3 -g -mavx2" FOPTFLAGS="-O3 -g > -mavx2" --download-mpich=1 --download-parmetis=1 --download-metis=1 > --download-hypre=1 --download-ml=1 --download-triangle=1 > --download-ctetgen=1 --download-p4est=1 --with-x=0 --download-superlu_dist > --download-superlu --download-ctetgen --with-debugging=0 --download-hdf5=1 > PETSC_ARCH=arch-macosx-gnu-O --download-chaco --with-viewfromoptions=1 > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon May 1 21:25:24 2017 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 1 May 2017 21:25:24 -0500 Subject: [petsc-users] SNES error In-Reply-To: References: Message-ID: <677760BF-5666-4C9D-A064-B495ACD80889@mcs.anl.gov> and -snes_linesearch_monitor -ts_adapt_monitor > On May 1, 2017, at 7:51 PM, Matthew Knepley wrote: > > Run with -snes_converged_reason. > > Matt > > On Mon, May 1, 2017 at 7:14 PM, Mark Adams wrote: > I get this SNES failure and I don't understand what the problem is. The rtol is 1.e-6 and the first iteration reduces the residual by 9 orders of magnitude. Yet, TS is not satisfied. What is going on here? > > mpiexec -n 1 ./vml -v_coord_cylinder -x_dm_refine 2 -v_dm_refine 2 -snes_rtol 1.e-6 -snes_stol 1.e-6 -ts_type cn -snes_fd -pc_type lu -ksp_type preonly -x_petscspace_order 1 -x_petscspace_poly_tensor -v_petscspace_order 1 -v_petscspace_poly_tensor -ts_dt .1 -ts_max_steps 10 -ts_final_time 1e10 -verbose 3 -num_species 1 -snes_monitor -masses 1,2,4 -thermal_temps 30,30,30 -domainv_lo -2,-2 -domainv_hi 2,2 -domainx_lo -12,-12 -domainx_hi 12,12 -E 0,0 -blobx_radius 2 -x_dm_view hdf5:x.h5 -x_vec_view hdf5:x.h5::append -v_dm_view hdf5:v.h5 -v_vec_view hdf5:v.h5::append -x_pre_dm_view hdf5:prex.h5 -x_pre_vec_view hdf5:prex.h5::append > .... 
> > 0 SNES Function norm 4.097052680599e+00 > 1 SNES Function norm 1.213148652908e-09 > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: > [0]PETSC ERROR: TSStep has failed due to DIVERGED_NONLINEAR_SOLVE, increase -ts_max_snes_failures or make negative to attempt recovery > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Development GIT revision: v3.7.6-3659-g699918129c GIT Date: 2017-04-26 08:18:35 -0400 > [0]PETSC ERROR: ./vml on a arch-macosx-gnu-O named MarksMac-5.local by markadams Mon May 1 19:21:32 2017 > [0]PETSC ERROR: Configure options --with-cc=clang --with-cc++=clang++ COPTFLAGS="-O3 -g -mavx2" CXXOPTFLAGS="-O3 -g -mavx2" FOPTFLAGS="-O3 -g -mavx2" --download-mpich=1 --download-parmetis=1 --download-metis=1 --download-hypre=1 --download-ml=1 --download-triangle=1 --download-ctetgen=1 --download-p4est=1 --with-x=0 --download-superlu_dist --download-superlu --download-ctetgen --with-debugging=0 --download-hdf5=1 PETSC_ARCH=arch-macosx-gnu-O --download-chaco --with-viewfromoptions=1 > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From emconsta at mcs.anl.gov Mon May 1 22:06:27 2017 From: emconsta at mcs.anl.gov (Emil Constantinescu) Date: Mon, 1 May 2017 22:06:27 -0500 Subject: [petsc-users] Call KSP routine before each timestep In-Reply-To: References: Message-ID: On 5/1/17 4:42 PM, Matthew Knepley wrote: > On Mon, May 1, 2017 at 4:27 PM, Scott Dossa > wrote: > > Hi All, > > Matt: > Thank you! Using the application context is a good approach to pass > the vector information. Can you also direct me to which command > allows TSSolve to be only called for one timestep / start at the > correct timestep? When TSSolve() is called, it always resets to > timestep 0. > > > You should not need that since PreStep will be called at the beginning > of each step, but just in case > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSStep.html > > although using this is tricky so I do not recommend it. If it's a projection you may need to set the PostStep and (or) PostStage if using multistage methods (http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSSetPostStage.html#TSSetPostStage); otherwise the last step may not be div free. > Barry: > Yes, this is a pressure projection method where one needs the > pressure field at each timestep to solve for the velocity field. > > > If it was me, I would not do it this way, but its somewhat a matter of > taste. It makes more sense to me to formulate the whole > system as a DAE, meaning time derivatives on some things (v) and not > others (p). Then use a DAE timestepper and your > fluid solver can be formulated as pressure projection using > PCFIELDSPLIT. This way, if you want to use another kind of fluid > solver, you can, whereas now you are stuck with the alternation of > projection of momentum update. Yes, formulating it as a DAE is desirable; however, if you project it separately you have access to significantly more time steppers. Emil > Thanks, > > Matt > > I will likely have more follow up questions as I quick write this > up. Thank you both for your input. > -Scott Dossa > > On Mon, May 1, 2017 at 3:32 PM, Barry Smith > wrote: > > > Scott - Are you doing some kind of pressure projection method? 
> > PETSc-developers - should this functionality be directly > added to TS since it comes up fairly often? > > Barry > > > > > On May 1, 2017, at 3:24 PM, Matthew Knepley > > wrote: > > > > On Mon, May 1, 2017 at 3:13 PM, Scott Dossa > wrote: > > Hi All, > > > > I'm looking to pass a vector between a KSP and TS routine. > The KSP routine must be called before each timestep, and the > solution vector is needed for the TS routine. Normally, > TSSolve() runs over all timesteps, but in my case, I'd like to > be able to add a routine before each timestep. > > > > Can someone direct me to an example script or briefly explain > a case which shows how to control time stepping such that one > could achieve something along the lines of: > > > > while (step < maxsteps+1){ > > KSPSolve(ksp, v, p); /* solves for Vec p and passes > this info onto TS */ > > TSSolve(ts, u); /* only iterate for 1 timestep */ > > } > > > > The function TSSetPreStep() seemed promising, but it can only > take TS as arguments which may not be sufficient to pass a > global vector. > > > > Yes, this is the correct thing. You can > > > > a) Just attach a Vec to the TS using PetscObjectCompose(), > but that is ugly so you can > > > > b) Make a context structure, and stick it in the TS using > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSSetApplicationContext.html > > > > > That is also where the KSP should go. > > > > Thanks, > > > > Matt > > > > Thank you in advance. > > Scott Dossa > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin > their experiments is infinitely more interesting than any > results to which their experiments lead. > > -- Norbert Wiener > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. 
> -- Norbert Wiener From mfadams at lbl.gov Tue May 2 10:10:18 2017 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 2 May 2017 11:10:18 -0400 Subject: [petsc-users] SNES error In-Reply-To: <677760BF-5666-4C9D-A064-B495ACD80889@mcs.anl.gov> References: <677760BF-5666-4C9D-A064-B495ACD80889@mcs.anl.gov> Message-ID: /Users/markadams/Codes/petsc/arch-macosx-gnu-O/bin/mpiexec -n 1 ./vml -v_coord_cylinder -x_dm_refine 2 -v_dm_refine 2 -snes_rtol 1.e-6 -snes_stol 1.e-6 -ts_type cn -snes_fd -pc_type lu -ksp_type preonly -x_petscspace_order 1 -x_petscspace_poly_tensor -v_petscspace_order 1 -v_petscspace_poly_tensor -ts_dt .1 -ts_max_steps 10 -ts_final_time 1e10 -verbose 3 -num_species 1 -snes_monitor -masses 1,2,4 -thermal_temps 30,30,30 -domainv_lo -2,-2 -domainv_hi 2,2 -domainx_lo -12,-12 -domainx_hi 12,12 -E 0,0 -blobx_radius 2 -x_dm_view hdf5:x.h5 -x_vec_view hdf5:x.h5::append -v_dm_view hdf5:v.h5 -v_vec_view hdf5:v.h5::append -x_pre_dm_view hdf5:prex.h5 -x_pre_vec_view hdf5:prex.h5::append -snes_converged_reason -snes_linesearch_monitor -ts_adapt_monitor main call SetupXDiscretization main call SetInitialConditionDomain VMLViewX DMGetOutputSequenceNumber=-1, cmd_str=-x_pre_vec_view 0) species 0: charge density= -2.3940791757186e+00, z-momentum= 5.9851979392559e-01, energy= 3.2314073646197e-01, thermal-flux= 2.4419137539877e-01 0) Normalized: charge density= -2.3940791757186e+00, z momentum= 5.9851979392559e-01, energy= 3.2314073646197e-01, thermal flux= 2.4419137539877e-01, local: 64 X cells, 81 X vertices VMLViewX DMGetOutputSequenceNumber=0, cmd_str=(null) VMLViewV DMGetOutputSequenceNumber=-1 0 SNES Function norm 4.097052680599e+00 1 SNES Function norm 1.213148652908e-09 Nonlinear solve did not converge due to DIVERGED_FUNCTION_COUNT iterations 1 TSAdapt none step 0 stage rejected t=0 + 1.000e-01, nonlinear solve failures 1 greater than current TS allowed [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: [0]PETSC ERROR: TSStep has failed due to DIVERGED_NONLINEAR_SOLVE, increase -ts_max_snes_failures or make negative to attempt recovery [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Development GIT revision: v3.7.6-3659-g699918129c GIT Date: 2017-04-26 08:18:35 -0400 [0]PETSC ERROR: ./vml on a arch-macosx-gnu-O named MarksMac-5.local by markadams Tue May 2 11:04:02 2017 [0]PETSC ERROR: Configure options --with-cc=clang --with-cc++=clang++ COPTFLAGS="-O3 -g -mavx2" CXXOPTFLAGS="-O3 -g -mavx2" FOPTFLAGS="-O3 -g -mavx2" --download-mpich=1 --download-parmetis=1 --download-metis=1 --download-hypre=1 --download-ml=1 --download-triangle=1 --download-ctetgen=1 --download-p4est=1 --with-x=0 --download-superlu_dist --download-superlu --download-ctetgen --with-debugging=0 --download-hdf5=1 PETSC_ARCH=arch-macosx-gnu-O --download-chaco --with-viewfromoptions=1 On Mon, May 1, 2017 at 10:25 PM, Barry Smith wrote: > > and > > -snes_linesearch_monitor > -ts_adapt_monitor > > > > On May 1, 2017, at 7:51 PM, Matthew Knepley wrote: > > > > Run with -snes_converged_reason. > > > > Matt > > > > On Mon, May 1, 2017 at 7:14 PM, Mark Adams wrote: > > I get this SNES failure and I don't understand what the problem is. The > rtol is 1.e-6 and the first iteration reduces the residual by 9 orders of > magnitude. Yet, TS is not satisfied. What is going on here? 
> > > > mpiexec -n 1 ./vml -v_coord_cylinder -x_dm_refine 2 -v_dm_refine 2 > -snes_rtol 1.e-6 -snes_stol 1.e-6 -ts_type cn -snes_fd -pc_type lu > -ksp_type preonly -x_petscspace_order 1 -x_petscspace_poly_tensor > -v_petscspace_order 1 -v_petscspace_poly_tensor -ts_dt .1 -ts_max_steps 10 > -ts_final_time 1e10 -verbose 3 -num_species 1 -snes_monitor -masses 1,2,4 > -thermal_temps 30,30,30 -domainv_lo -2,-2 -domainv_hi 2,2 -domainx_lo > -12,-12 -domainx_hi 12,12 -E 0,0 -blobx_radius 2 -x_dm_view hdf5:x.h5 > -x_vec_view hdf5:x.h5::append -v_dm_view hdf5:v.h5 -v_vec_view > hdf5:v.h5::append -x_pre_dm_view hdf5:prex.h5 -x_pre_vec_view > hdf5:prex.h5::append > > .... > > > > 0 SNES Function norm 4.097052680599e+00 > > 1 SNES Function norm 1.213148652908e-09 > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: > > [0]PETSC ERROR: TSStep has failed due to DIVERGED_NONLINEAR_SOLVE, > increase -ts_max_snes_failures or make negative to attempt recovery > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Development GIT revision: v3.7.6-3659-g699918129c > GIT Date: 2017-04-26 08:18:35 -0400 > > [0]PETSC ERROR: ./vml on a arch-macosx-gnu-O named MarksMac-5.local by > markadams Mon May 1 19:21:32 2017 > > [0]PETSC ERROR: Configure options --with-cc=clang --with-cc++=clang++ > COPTFLAGS="-O3 -g -mavx2" CXXOPTFLAGS="-O3 -g -mavx2" FOPTFLAGS="-O3 -g > -mavx2" --download-mpich=1 --download-parmetis=1 --download-metis=1 > --download-hypre=1 --download-ml=1 --download-triangle=1 > --download-ctetgen=1 --download-p4est=1 --with-x=0 --download-superlu_dist > --download-superlu --download-ctetgen --with-debugging=0 --download-hdf5=1 > PETSC_ARCH=arch-macosx-gnu-O --download-chaco --with-viewfromoptions=1 > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Tue May 2 10:18:53 2017 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 2 May 2017 10:18:53 -0500 Subject: [petsc-users] SNES error In-Reply-To: References: <677760BF-5666-4C9D-A064-B495ACD80889@mcs.anl.gov> Message-ID: On Tue, May 2, 2017 at 10:10 AM, Mark Adams wrote: > /Users/markadams/Codes/petsc/arch-macosx-gnu-O/bin/mpiexec -n 1 ./vml > -v_coord_cylinder -x_dm_refine 2 -v_dm_refine 2 -snes_rtol 1.e-6 -snes_stol > 1.e-6 -ts_type cn -snes_fd -pc_type lu -ksp_type preonly > -x_petscspace_order 1 -x_petscspace_poly_tensor -v_petscspace_order 1 > -v_petscspace_poly_tensor -ts_dt .1 -ts_max_steps 10 -ts_final_time 1e10 > -verbose 3 -num_species 1 -snes_monitor -masses 1,2,4 -thermal_temps > 30,30,30 -domainv_lo -2,-2 -domainv_hi 2,2 -domainx_lo -12,-12 -domainx_hi > 12,12 -E 0,0 -blobx_radius 2 -x_dm_view hdf5:x.h5 -x_vec_view > hdf5:x.h5::append -v_dm_view hdf5:v.h5 -v_vec_view hdf5:v.h5::append > -x_pre_dm_view hdf5:prex.h5 -x_pre_vec_view hdf5:prex.h5::append > -snes_converged_reason -snes_linesearch_monitor -ts_adapt_monitor > main call SetupXDiscretization > main call SetInitialConditionDomain > VMLViewX DMGetOutputSequenceNumber=-1, > cmd_str=-x_pre_vec_view > 0) species 0: charge density= -2.3940791757186e+00, z-momentum= > 5.9851979392559e-01, energy= 3.2314073646197e-01, thermal-flux= > 2.4419137539877e-01 > 0) Normalized: charge density= -2.3940791757186e+00, z momentum= > 5.9851979392559e-01, energy= 3.2314073646197e-01, thermal flux= > 2.4419137539877e-01, local: 64 X cells, 81 X vertices > VMLViewX DMGetOutputSequenceNumber=0, cmd_str=(null) > VMLViewV DMGetOutputSequenceNumber=-1 > 0 SNES Function norm 4.097052680599e+00 > 1 SNES Function norm 1.213148652908e-09 > Nonlinear solve did not converge due to DIVERGED_FUNCTION_COUNT > iterations 1 > Neat! Mark, I think this has to do with you calling SNESEvaluateFunc() inside another one. We limit the number of function evaluations to 10,000 by default, mostly to corral line searches. I think you hit this, and thus need to up the count. Thanks, Matt > TSAdapt none step 0 stage rejected t=0 + 1.000e-01, > nonlinear solve failures 1 greater than current TS allowed > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: > [0]PETSC ERROR: TSStep has failed due to DIVERGED_NONLINEAR_SOLVE, > increase -ts_max_snes_failures or make negative to attempt recovery > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [0]PETSC ERROR: Petsc Development GIT revision: v3.7.6-3659-g699918129c > GIT Date: 2017-04-26 08:18:35 -0400 > [0]PETSC ERROR: ./vml on a arch-macosx-gnu-O named MarksMac-5.local by > markadams Tue May 2 11:04:02 2017 > [0]PETSC ERROR: Configure options --with-cc=clang --with-cc++=clang++ > COPTFLAGS="-O3 -g -mavx2" CXXOPTFLAGS="-O3 -g -mavx2" FOPTFLAGS="-O3 -g > -mavx2" --download-mpich=1 --download-parmetis=1 --download-metis=1 > --download-hypre=1 --download-ml=1 --download-triangle=1 > --download-ctetgen=1 --download-p4est=1 --with-x=0 --download-superlu_dist > --download-superlu --download-ctetgen --with-debugging=0 --download-hdf5=1 > PETSC_ARCH=arch-macosx-gnu-O --download-chaco --with-viewfromoptions=1 > > > On Mon, May 1, 2017 at 10:25 PM, Barry Smith wrote: > >> >> and >> >> -snes_linesearch_monitor >> -ts_adapt_monitor >> >> >> > On May 1, 2017, at 7:51 PM, Matthew Knepley wrote: >> > >> > Run with -snes_converged_reason. 
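
If the failure really is the function-evaluation cap Matt describes, one way to raise it from the code is the following sketch (100000 is an arbitrary larger cap; the usual ierr/CHKERRQ boilerplate is assumed):

SNES snes;
ierr = TSGetSNES(ts, &snes);CHKERRQ(ierr);
/* keep the existing tolerances, raise only the maximum number of function evaluations */
ierr = SNESSetTolerances(snes, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT, 100000);CHKERRQ(ierr);

The same thing is available on the command line as -snes_max_funcs 100000, and the error message above already points at -ts_max_snes_failures for letting TS retry a step instead of aborting.
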
>> > >> > Matt >> > >> > On Mon, May 1, 2017 at 7:14 PM, Mark Adams wrote: >> > I get this SNES failure and I don't understand what the problem is. The >> rtol is 1.e-6 and the first iteration reduces the residual by 9 orders of >> magnitude. Yet, TS is not satisfied. What is going on here? >> > >> > mpiexec -n 1 ./vml -v_coord_cylinder -x_dm_refine 2 -v_dm_refine 2 >> -snes_rtol 1.e-6 -snes_stol 1.e-6 -ts_type cn -snes_fd -pc_type lu >> -ksp_type preonly -x_petscspace_order 1 -x_petscspace_poly_tensor >> -v_petscspace_order 1 -v_petscspace_poly_tensor -ts_dt .1 -ts_max_steps 10 >> -ts_final_time 1e10 -verbose 3 -num_species 1 -snes_monitor -masses 1,2,4 >> -thermal_temps 30,30,30 -domainv_lo -2,-2 -domainv_hi 2,2 -domainx_lo >> -12,-12 -domainx_hi 12,12 -E 0,0 -blobx_radius 2 -x_dm_view hdf5:x.h5 >> -x_vec_view hdf5:x.h5::append -v_dm_view hdf5:v.h5 -v_vec_view >> hdf5:v.h5::append -x_pre_dm_view hdf5:prex.h5 -x_pre_vec_view >> hdf5:prex.h5::append >> > .... >> > >> > 0 SNES Function norm 4.097052680599e+00 >> > 1 SNES Function norm 1.213148652908e-09 >> > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > [0]PETSC ERROR: >> > [0]PETSC ERROR: TSStep has failed due to DIVERGED_NONLINEAR_SOLVE, >> increase -ts_max_snes_failures or make negative to attempt recovery >> > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for trouble shooting. >> > [0]PETSC ERROR: Petsc Development GIT revision: >> v3.7.6-3659-g699918129c GIT Date: 2017-04-26 08:18:35 -0400 >> > [0]PETSC ERROR: ./vml on a arch-macosx-gnu-O named MarksMac-5.local by >> markadams Mon May 1 19:21:32 2017 >> > [0]PETSC ERROR: Configure options --with-cc=clang --with-cc++=clang++ >> COPTFLAGS="-O3 -g -mavx2" CXXOPTFLAGS="-O3 -g -mavx2" FOPTFLAGS="-O3 -g >> -mavx2" --download-mpich=1 --download-parmetis=1 --download-metis=1 >> --download-hypre=1 --download-ml=1 --download-triangle=1 >> --download-ctetgen=1 --download-p4est=1 --with-x=0 --download-superlu_dist >> --download-superlu --download-ctetgen --with-debugging=0 --download-hdf5=1 >> PETSC_ARCH=arch-macosx-gnu-O --download-chaco --with-viewfromoptions=1 >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hgbk2008 at gmail.com Wed May 3 02:29:13 2017 From: hgbk2008 at gmail.com (Hoang Giang Bui) Date: Wed, 3 May 2017 09:29:13 +0200 Subject: [petsc-users] strange convergence In-Reply-To: <87wpa3wd5j.fsf@jedbrown.org> References: <7891536D-91FE-4BFF-8DAD-CE7AB85A4E57@mcs.anl.gov> <425BBB58-9721-49F3-8C86-940F08E925F7@mcs.anl.gov> <42EB791A-40C2-439F-A5F7-5F8C15CECA6F@mcs.anl.gov> <82193784-B4C4-47D7-80EA-25F549C9091B@mcs.anl.gov> <87wpa3wd5j.fsf@jedbrown.org> Message-ID: Dear Jed If I understood you correctly you suggest to avoid penalty by using the Lagrange multiplier for the mortar constraint? In this case it leads to the use of discrete Lagrange multiplier space. Do you or anyone already have experience using discrete Lagrange multiplier space with Petsc? 
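
For reference, one common starting point for the saddle-point system that a discrete Lagrange multiplier produces (zero block on the multiplier diagonal) is PCFIELDSPLIT in Schur-complement mode rather than a penalty. A sketch of the options, with splits 0 and 1 standing for the displacement and multiplier blocks:

-pc_type fieldsplit -pc_fieldsplit_type schur
-pc_fieldsplit_schur_fact_type full
-pc_fieldsplit_schur_precondition selfp
-fieldsplit_0_pc_type hypre
-fieldsplit_1_ksp_type gmres -fieldsplit_1_pc_type jacobi

How well selfp approximates the true Schur complement depends on the mortar coupling, so this is a starting point, not a recipe.
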
There is also similar question on stackexchange https://scicomp.stackexchange.com/questions/25113/preconditioners-and-discrete-lagrange-multipliers Giang On Sat, Apr 29, 2017 at 3:34 PM, Jed Brown wrote: > Hoang Giang Bui writes: > > > Hi Barry > > > > The first block is from a standard solid mechanics discretization based > on > > balance of momentum equation. There is some material involved but in > > principal it's well-posed elasticity equation with positive definite > > tangent operator. The "gluing business" uses the mortar method to keep > the > > continuity of displacement. Instead of using Lagrange multiplier to treat > > the constraint I used penalty method to penalize the energy. The > > discretization form of mortar is quite simple > > > > \int_{\Gamma_1} { rho * (\delta u_1 - \delta u_2) * (u_1 - u_2) dA } > > > > rho is penalty parameter. In the simulation I initially set it low (~E) > to > > preserve the conditioning of the system. > > There are two things that can go wrong here with AMG: > > * The penalty term can mess up the strength of connection heuristics > such that you get poor choice of C-points (classical AMG like > BoomerAMG) or poor choice of aggregates (smoothed aggregation). > > * The penalty term can prevent Jacobi smoothing from being effective; in > this case, it can lead to poor coarse basis functions (higher energy > than they should be) and poor smoothing in an MG cycle. You can fix > the poor smoothing in the MG cycle by using a stronger smoother, like > ASM with some overlap. > > I'm generally not a fan of penalty methods due to the irritating > tradeoffs and often poor solver performance. > > > In the figure below, the colorful blocks are u_1 and the base is u_2. > Both > > u_1 and u_2 use isoparametric quadratic approximation. > > > > ? > > Snapshot.png > > drive_web> > > ??? > > > > Giang > > > > On Fri, Apr 28, 2017 at 6:21 PM, Barry Smith wrote: > > > >> > >> Ok, so boomerAMG algebraic multigrid is not good for the first block. > >> You mentioned the first block has two things glued together? AMG is > >> fantastic for certain problems but doesn't work for everything. > >> > >> Tell us more about the first block, what PDE it comes from, what > >> discretization, and what the "gluing business" is and maybe we'll have > >> suggestions for how to precondition it. > >> > >> Barry > >> > >> > On Apr 28, 2017, at 3:56 AM, Hoang Giang Bui > wrote: > >> > > >> > It's in fact quite good > >> > > >> > Residual norms for fieldsplit_u_ solve. > >> > 0 KSP Residual norm 4.014715925568e+00 > >> > 1 KSP Residual norm 2.160497019264e-10 > >> > Residual norms for fieldsplit_wp_ solve. > >> > 0 KSP Residual norm 0.000000000000e+00 > >> > 0 KSP preconditioned resid norm 4.014715925568e+00 true resid norm > >> 9.006493082896e+06 ||r(i)||/||b|| 1.000000000000e+00 > >> > Residual norms for fieldsplit_u_ solve. > >> > 0 KSP Residual norm 9.999999999416e-01 > >> > 1 KSP Residual norm 7.118380416383e-11 > >> > Residual norms for fieldsplit_wp_ solve. > >> > 0 KSP Residual norm 0.000000000000e+00 > >> > 1 KSP preconditioned resid norm 1.701150951035e-10 true resid norm > >> 5.494262251846e-04 ||r(i)||/||b|| 6.100334726599e-11 > >> > Linear solve converged due to CONVERGED_ATOL iterations 1 > >> > > >> > Giang > >> > > >> > On Thu, Apr 27, 2017 at 5:25 PM, Barry Smith > wrote: > >> > > >> > Run again using LU on both blocks to see what happens. 
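
Concretely, that LU-on-both-blocks check amounts to something like the following options (split names u and wp as used elsewhere in this thread):

-pc_type fieldsplit
-fieldsplit_u_ksp_type preonly -fieldsplit_u_pc_type lu
-fieldsplit_wp_ksp_type preonly -fieldsplit_wp_pc_type lu
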
> >> > > >> > > >> > > On Apr 27, 2017, at 2:14 AM, Hoang Giang Bui > >> wrote: > >> > > > >> > > I have changed the way to tie the nonconforming mesh. It seems the > >> matrix now is better > >> > > > >> > > with -pc_type lu the output is > >> > > 0 KSP preconditioned resid norm 3.308678584240e-01 true resid norm > >> 9.006493082896e+06 ||r(i)||/||b|| 1.000000000000e+00 > >> > > 1 KSP preconditioned resid norm 2.004313395301e-12 true resid norm > >> 2.549872332830e-05 ||r(i)||/||b|| 2.831148938173e-12 > >> > > Linear solve converged due to CONVERGED_ATOL iterations 1 > >> > > > >> > > > >> > > with -pc_type fieldsplit -fieldsplit_u_pc_type hypre > >> -fieldsplit_wp_pc_type lu the convergence is slow > >> > > 0 KSP preconditioned resid norm 1.116302362553e-01 true resid norm > >> 9.006493083520e+06 ||r(i)||/||b|| 1.000000000000e+00 > >> > > 1 KSP preconditioned resid norm 2.582134825666e-02 true resid norm > >> 9.268347719866e+06 ||r(i)||/||b|| 1.029073984060e+00 > >> > > ... > >> > > 824 KSP preconditioned resid norm 1.018542387738e-09 true resid norm > >> 2.906608839310e+02 ||r(i)||/||b|| 3.227237074804e-05 > >> > > 825 KSP preconditioned resid norm 9.743727947637e-10 true resid norm > >> 2.820369993061e+02 ||r(i)||/||b|| 3.131485215062e-05 > >> > > Linear solve converged due to CONVERGED_ATOL iterations 825 > >> > > > >> > > checking with additional -fieldsplit_u_ksp_type richardson > >> -fieldsplit_u_ksp_monitor -fieldsplit_u_ksp_max_it 1 > >> -fieldsplit_wp_ksp_type richardson -fieldsplit_wp_ksp_monitor > >> -fieldsplit_wp_ksp_max_it 1 gives > >> > > > >> > > 0 KSP preconditioned resid norm 1.116302362553e-01 true resid norm > >> 9.006493083520e+06 ||r(i)||/||b|| 1.000000000000e+00 > >> > > Residual norms for fieldsplit_u_ solve. > >> > > 0 KSP Residual norm 5.803507549280e-01 > >> > > 1 KSP Residual norm 2.069538175950e-01 > >> > > Residual norms for fieldsplit_wp_ solve. > >> > > 0 KSP Residual norm 0.000000000000e+00 > >> > > 1 KSP preconditioned resid norm 2.582134825666e-02 true resid norm > >> 9.268347719866e+06 ||r(i)||/||b|| 1.029073984060e+00 > >> > > Residual norms for fieldsplit_u_ solve. > >> > > 0 KSP Residual norm 7.831796195225e-01 > >> > > 1 KSP Residual norm 1.734608520110e-01 > >> > > Residual norms for fieldsplit_wp_ solve. > >> > > 0 KSP Residual norm 0.000000000000e+00 > >> > > .... > >> > > 823 KSP preconditioned resid norm 1.065070135605e-09 true resid norm > >> 3.081881356833e+02 ||r(i)||/||b|| 3.421843916665e-05 > >> > > Residual norms for fieldsplit_u_ solve. > >> > > 0 KSP Residual norm 6.113806394327e-01 > >> > > 1 KSP Residual norm 1.535465290944e-01 > >> > > Residual norms for fieldsplit_wp_ solve. > >> > > 0 KSP Residual norm 0.000000000000e+00 > >> > > 824 KSP preconditioned resid norm 1.018542387746e-09 true resid norm > >> 2.906608839353e+02 ||r(i)||/||b|| 3.227237074851e-05 > >> > > Residual norms for fieldsplit_u_ solve. > >> > > 0 KSP Residual norm 6.123437055586e-01 > >> > > 1 KSP Residual norm 1.524661826133e-01 > >> > > Residual norms for fieldsplit_wp_ solve. > >> > > 0 KSP Residual norm 0.000000000000e+00 > >> > > 825 KSP preconditioned resid norm 9.743727947718e-10 true resid norm > >> 2.820369990571e+02 ||r(i)||/||b|| 3.131485212298e-05 > >> > > Linear solve converged due to CONVERGED_ATOL iterations 825 > >> > > > >> > > > >> > > The residual for wp block is zero since in this first step the rhs > is > >> zero. As can see in the output, the multigrid does not perform well to > >> reduce the residual in the sub-solve. 
Is my observation right? what can > be > >> done to improve this? > >> > > > >> > > > >> > > Giang > >> > > > >> > > On Tue, Apr 25, 2017 at 12:17 AM, Barry Smith > >> wrote: > >> > > > >> > > This can happen in the matrix is singular or nearly singular or > if > >> the factorization generates small pivots, which can occur for even > >> nonsingular problems if the matrix is poorly scaled or just plain nasty. > >> > > > >> > > > >> > > > On Apr 24, 2017, at 5:10 PM, Hoang Giang Bui > >> wrote: > >> > > > > >> > > > It took a while, here I send you the output > >> > > > > >> > > > 0 KSP preconditioned resid norm 3.129073545457e+05 true resid > norm > >> 9.015150492169e+06 ||r(i)||/||b|| 1.000000000000e+00 > >> > > > 1 KSP preconditioned resid norm 7.442444222843e-01 true resid > norm > >> 1.003356247696e+02 ||r(i)||/||b|| 1.112966720375e-05 > >> > > > 2 KSP preconditioned resid norm 3.267453132529e-07 true resid > norm > >> 3.216722968300e+01 ||r(i)||/||b|| 3.568130084011e-06 > >> > > > 3 KSP preconditioned resid norm 1.155046883816e-11 true resid > norm > >> 3.234460376820e+01 ||r(i)||/||b|| 3.587805194854e-06 > >> > > > Linear solve converged due to CONVERGED_ATOL iterations 3 > >> > > > KSP Object: 4 MPI processes > >> > > > type: gmres > >> > > > GMRES: restart=1000, using Modified Gram-Schmidt > >> Orthogonalization > >> > > > GMRES: happy breakdown tolerance 1e-30 > >> > > > maximum iterations=1000, initial guess is zero > >> > > > tolerances: relative=1e-20, absolute=1e-09, divergence=10000 > >> > > > left preconditioning > >> > > > using PRECONDITIONED norm type for convergence test > >> > > > PC Object: 4 MPI processes > >> > > > type: lu > >> > > > LU: out-of-place factorization > >> > > > tolerance for zero pivot 2.22045e-14 > >> > > > matrix ordering: natural > >> > > > factor fill ratio given 0, needed 0 > >> > > > Factored matrix follows: > >> > > > Mat Object: 4 MPI processes > >> > > > type: mpiaij > >> > > > rows=973051, cols=973051 > >> > > > package used to perform factorization: pastix > >> > > > Error : 3.24786e-14 > >> > > > total: nonzeros=0, allocated nonzeros=0 > >> > > > total number of mallocs used during MatSetValues calls > =0 > >> > > > PaStiX run parameters: > >> > > > Matrix type : Unsymmetric > >> > > > Level of printing (0,1,2): 0 > >> > > > Number of refinements iterations : 3 > >> > > > Error : 3.24786e-14 > >> > > > linear system matrix = precond matrix: > >> > > > Mat Object: 4 MPI processes > >> > > > type: mpiaij > >> > > > rows=973051, cols=973051 > >> > > > Error : 3.24786e-14 > >> > > > total: nonzeros=9.90037e+07, allocated nonzeros=9.90037e+07 > >> > > > total number of mallocs used during MatSetValues calls =0 > >> > > > using I-node (on process 0) routines: found 78749 nodes, > limit > >> used is 5 > >> > > > Error : 3.24786e-14 > >> > > > > >> > > > It doesn't do as you said. Something is not right here. I will > look > >> in depth. > >> > > > > >> > > > Giang > >> > > > > >> > > > On Mon, Apr 24, 2017 at 8:21 PM, Barry Smith > >> wrote: > >> > > > > >> > > > > On Apr 24, 2017, at 12:47 PM, Hoang Giang Bui < > hgbk2008 at gmail.com> > >> wrote: > >> > > > > > >> > > > > Good catch. I get this for the very first step, maybe at that > time > >> the rhs_w is zero. > >> > > > > >> > > > With the multiplicative composition the right hand side of the > >> second solve is the initial right hand side of the second solve minus > >> A_10*x where x is the solution to the first sub solve and A_10 is the > lower > >> left block of the outer matrix. 
So unless both the initial right hand > side > >> has a zero for the second block and A_10 is identically zero the right > hand > >> side for the second sub solve should not be zero. Is A_10 == 0? > >> > > > > >> > > > > >> > > > > In the later step, it shows 2 step convergence > >> > > > > > >> > > > > Residual norms for fieldsplit_u_ solve. > >> > > > > 0 KSP Residual norm 3.165886479830e+04 > >> > > > > 1 KSP Residual norm 2.905922877684e-01 > >> > > > > Residual norms for fieldsplit_wp_ solve. > >> > > > > 0 KSP Residual norm 2.397669419027e-01 > >> > > > > 1 KSP Residual norm 0.000000000000e+00 > >> > > > > 0 KSP preconditioned resid norm 3.165886479920e+04 true resid > >> norm 7.963616922323e+05 ||r(i)||/||b|| 1.000000000000e+00 > >> > > > > Residual norms for fieldsplit_u_ solve. > >> > > > > 0 KSP Residual norm 9.999891813771e-01 > >> > > > > 1 KSP Residual norm 1.512000395579e-05 > >> > > > > Residual norms for fieldsplit_wp_ solve. > >> > > > > 0 KSP Residual norm 8.192702188243e-06 > >> > > > > 1 KSP Residual norm 0.000000000000e+00 > >> > > > > 1 KSP preconditioned resid norm 5.252183822848e-02 true resid > >> norm 7.135927677844e+04 ||r(i)||/||b|| 8.960661653427e-02 > >> > > > > >> > > > The outer residual norms are still wonky, the preconditioned > >> residual norm goes from 3.165886479920e+04 to 5.252183822848e-02 which > is a > >> huge drop but the 7.963616922323e+05 drops very much less > >> 7.135927677844e+04. This is not normal. > >> > > > > >> > > > What if you just use -pc_type lu for the entire system (no > >> fieldsplit), does the true residual drop to almost zero in the first > >> iteration (as it should?). Send the output. > >> > > > > >> > > > > >> > > > > >> > > > > Residual norms for fieldsplit_u_ solve. > >> > > > > 0 KSP Residual norm 6.946213936597e-01 > >> > > > > 1 KSP Residual norm 1.195514007343e-05 > >> > > > > Residual norms for fieldsplit_wp_ solve. > >> > > > > 0 KSP Residual norm 1.025694497535e+00 > >> > > > > 1 KSP Residual norm 0.000000000000e+00 > >> > > > > 2 KSP preconditioned resid norm 8.785709535405e-03 true resid > >> norm 1.419341799277e+04 ||r(i)||/||b|| 1.782282866091e-02 > >> > > > > Residual norms for fieldsplit_u_ solve. > >> > > > > 0 KSP Residual norm 7.255149996405e-01 > >> > > > > 1 KSP Residual norm 6.583512434218e-06 > >> > > > > Residual norms for fieldsplit_wp_ solve. > >> > > > > 0 KSP Residual norm 1.015229700337e+00 > >> > > > > 1 KSP Residual norm 0.000000000000e+00 > >> > > > > 3 KSP preconditioned resid norm 7.110407712709e-04 true resid > >> norm 5.284940654154e+02 ||r(i)||/||b|| 6.636357205153e-04 > >> > > > > Residual norms for fieldsplit_u_ solve. > >> > > > > 0 KSP Residual norm 3.512243341400e-01 > >> > > > > 1 KSP Residual norm 2.032490351200e-06 > >> > > > > Residual norms for fieldsplit_wp_ solve. > >> > > > > 0 KSP Residual norm 1.282327290982e+00 > >> > > > > 1 KSP Residual norm 0.000000000000e+00 > >> > > > > 4 KSP preconditioned resid norm 3.482036620521e-05 true resid > >> norm 4.291231924307e+01 ||r(i)||/||b|| 5.388546393133e-05 > >> > > > > Residual norms for fieldsplit_u_ solve. > >> > > > > 0 KSP Residual norm 3.423609338053e-01 > >> > > > > 1 KSP Residual norm 4.213703301972e-07 > >> > > > > Residual norms for fieldsplit_wp_ solve. 
> >> > > > > 0 KSP Residual norm 1.157384757538e+00 > >> > > > > 1 KSP Residual norm 0.000000000000e+00 > >> > > > > 5 KSP preconditioned resid norm 1.203470314534e-06 true resid > >> norm 4.544956156267e+00 ||r(i)||/||b|| 5.707150658550e-06 > >> > > > > Residual norms for fieldsplit_u_ solve. > >> > > > > 0 KSP Residual norm 3.838596289995e-01 > >> > > > > 1 KSP Residual norm 9.927864176103e-08 > >> > > > > Residual norms for fieldsplit_wp_ solve. > >> > > > > 0 KSP Residual norm 1.066298905618e+00 > >> > > > > 1 KSP Residual norm 0.000000000000e+00 > >> > > > > 6 KSP preconditioned resid norm 3.331619244266e-08 true resid > >> norm 2.821511729024e+00 ||r(i)||/||b|| 3.543002829675e-06 > >> > > > > Residual norms for fieldsplit_u_ solve. > >> > > > > 0 KSP Residual norm 4.624964188094e-01 > >> > > > > 1 KSP Residual norm 6.418229775372e-08 > >> > > > > Residual norms for fieldsplit_wp_ solve. > >> > > > > 0 KSP Residual norm 9.800784311614e-01 > >> > > > > 1 KSP Residual norm 0.000000000000e+00 > >> > > > > 7 KSP preconditioned resid norm 8.788046233297e-10 true resid > >> norm 2.849209671705e+00 ||r(i)||/||b|| 3.577783436215e-06 > >> > > > > Linear solve converged due to CONVERGED_ATOL iterations 7 > >> > > > > > >> > > > > The outer operator is an explicit matrix. > >> > > > > > >> > > > > Giang > >> > > > > > >> > > > > On Mon, Apr 24, 2017 at 7:32 PM, Barry Smith < > bsmith at mcs.anl.gov> > >> wrote: > >> > > > > > >> > > > > > On Apr 24, 2017, at 3:16 AM, Hoang Giang Bui < > hgbk2008 at gmail.com> > >> wrote: > >> > > > > > > >> > > > > > Thanks Barry, trying with -fieldsplit_u_type lu gives better > >> convergence. I still used 4 procs though, probably with 1 proc it should > >> also be the same. > >> > > > > > > >> > > > > > The u block used a Nitsche-type operator to connect two > >> non-matching domains. I don't think it will leave some rigid body motion > >> leads to not sufficient constraints. Maybe you have other idea? > >> > > > > > > >> > > > > > Residual norms for fieldsplit_u_ solve. > >> > > > > > 0 KSP Residual norm 3.129067184300e+05 > >> > > > > > 1 KSP Residual norm 5.906261468196e-01 > >> > > > > > Residual norms for fieldsplit_wp_ solve. > >> > > > > > 0 KSP Residual norm 0.000000000000e+00 > >> > > > > > >> > > > > ^^^^ something is wrong here. The sub solve should not be > >> starting with a 0 residual (this means the right hand side for this sub > >> solve is zero which it should not be). > >> > > > > > >> > > > > > FieldSplit with MULTIPLICATIVE composition: total splits = 2 > >> > > > > > >> > > > > > >> > > > > How are you providing the outer operator? As an explicit > matrix > >> or with some shell matrix? > >> > > > > > >> > > > > > >> > > > > > >> > > > > > 0 KSP preconditioned resid norm 3.129067184300e+05 true > resid > >> norm 9.015150492169e+06 ||r(i)||/||b|| 1.000000000000e+00 > >> > > > > > Residual norms for fieldsplit_u_ solve. > >> > > > > > 0 KSP Residual norm 9.999955993437e-01 > >> > > > > > 1 KSP Residual norm 4.019774691831e-06 > >> > > > > > Residual norms for fieldsplit_wp_ solve. > >> > > > > > 0 KSP Residual norm 0.000000000000e+00 > >> > > > > > 1 KSP preconditioned resid norm 5.003913641475e-01 true > resid > >> norm 4.692996324114e+01 ||r(i)||/||b|| 5.205677185522e-06 > >> > > > > > Residual norms for fieldsplit_u_ solve. > >> > > > > > 0 KSP Residual norm 1.000012180204e+00 > >> > > > > > 1 KSP Residual norm 1.017367950422e-05 > >> > > > > > Residual norms for fieldsplit_wp_ solve. 
> >> > > > > > 0 KSP Residual norm 0.000000000000e+00 > >> > > > > > 2 KSP preconditioned resid norm 2.330910333756e-07 true > resid > >> norm 3.474855463983e+01 ||r(i)||/||b|| 3.854461960453e-06 > >> > > > > > Residual norms for fieldsplit_u_ solve. > >> > > > > > 0 KSP Residual norm 1.000004200085e+00 > >> > > > > > 1 KSP Residual norm 6.231613102458e-06 > >> > > > > > Residual norms for fieldsplit_wp_ solve. > >> > > > > > 0 KSP Residual norm 0.000000000000e+00 > >> > > > > > 3 KSP preconditioned resid norm 8.671259838389e-11 true > resid > >> norm 3.545103468011e+01 ||r(i)||/||b|| 3.932384125024e-06 > >> > > > > > Linear solve converged due to CONVERGED_ATOL iterations 3 > >> > > > > > KSP Object: 4 MPI processes > >> > > > > > type: gmres > >> > > > > > GMRES: restart=1000, using Modified Gram-Schmidt > >> Orthogonalization > >> > > > > > GMRES: happy breakdown tolerance 1e-30 > >> > > > > > maximum iterations=1000, initial guess is zero > >> > > > > > tolerances: relative=1e-20, absolute=1e-09, > divergence=10000 > >> > > > > > left preconditioning > >> > > > > > using PRECONDITIONED norm type for convergence test > >> > > > > > PC Object: 4 MPI processes > >> > > > > > type: fieldsplit > >> > > > > > FieldSplit with MULTIPLICATIVE composition: total splits > = 2 > >> > > > > > Solver info for each split is in the following KSP > objects: > >> > > > > > Split number 0 Defined by IS > >> > > > > > KSP Object: (fieldsplit_u_) 4 MPI processes > >> > > > > > type: richardson > >> > > > > > Richardson: damping factor=1 > >> > > > > > maximum iterations=1, initial guess is zero > >> > > > > > tolerances: relative=1e-05, absolute=1e-50, > >> divergence=10000 > >> > > > > > left preconditioning > >> > > > > > using PRECONDITIONED norm type for convergence test > >> > > > > > PC Object: (fieldsplit_u_) 4 MPI processes > >> > > > > > type: lu > >> > > > > > LU: out-of-place factorization > >> > > > > > tolerance for zero pivot 2.22045e-14 > >> > > > > > matrix ordering: natural > >> > > > > > factor fill ratio given 0, needed 0 > >> > > > > > Factored matrix follows: > >> > > > > > Mat Object: 4 MPI processes > >> > > > > > type: mpiaij > >> > > > > > rows=938910, cols=938910 > >> > > > > > package used to perform factorization: pastix > >> > > > > > total: nonzeros=0, allocated nonzeros=0 > >> > > > > > Error : 3.36878e-14 > >> > > > > > total number of mallocs used during MatSetValues > calls > >> =0 > >> > > > > > PaStiX run parameters: > >> > > > > > Matrix type : > Unsymmetric > >> > > > > > Level of printing (0,1,2): 0 > >> > > > > > Number of refinements iterations : 3 > >> > > > > > Error : 3.36878e-14 > >> > > > > > linear system matrix = precond matrix: > >> > > > > > Mat Object: (fieldsplit_u_) 4 MPI processes > >> > > > > > type: mpiaij > >> > > > > > rows=938910, cols=938910, bs=3 > >> > > > > > Error : 3.36878e-14 > >> > > > > > Error : 3.36878e-14 > >> > > > > > total: nonzeros=8.60906e+07, allocated > >> nonzeros=8.60906e+07 > >> > > > > > total number of mallocs used during MatSetValues > calls =0 > >> > > > > > using I-node (on process 0) routines: found 78749 > >> nodes, limit used is 5 > >> > > > > > Split number 1 Defined by IS > >> > > > > > KSP Object: (fieldsplit_wp_) 4 MPI processes > >> > > > > > type: richardson > >> > > > > > Richardson: damping factor=1 > >> > > > > > maximum iterations=1, initial guess is zero > >> > > > > > tolerances: relative=1e-05, absolute=1e-50, > >> divergence=10000 > >> > > > > > left preconditioning > >> > > > > > using PRECONDITIONED 
norm type for convergence test > >> > > > > > PC Object: (fieldsplit_wp_) 4 MPI processes > >> > > > > > type: lu > >> > > > > > LU: out-of-place factorization > >> > > > > > tolerance for zero pivot 2.22045e-14 > >> > > > > > matrix ordering: natural > >> > > > > > factor fill ratio given 0, needed 0 > >> > > > > > Factored matrix follows: > >> > > > > > Mat Object: 4 MPI processes > >> > > > > > type: mpiaij > >> > > > > > rows=34141, cols=34141 > >> > > > > > package used to perform factorization: pastix > >> > > > > > Error : -nan > >> > > > > > Error : -nan > >> > > > > > Error : -nan > >> > > > > > total: nonzeros=0, allocated nonzeros=0 > >> > > > > > total number of mallocs used during MatSetValues > >> calls =0 > >> > > > > > PaStiX run parameters: > >> > > > > > Matrix type : Symmetric > >> > > > > > Level of printing (0,1,2): 0 > >> > > > > > Number of refinements iterations : 0 > >> > > > > > Error : -nan > >> > > > > > linear system matrix = precond matrix: > >> > > > > > Mat Object: (fieldsplit_wp_) 4 MPI processes > >> > > > > > type: mpiaij > >> > > > > > rows=34141, cols=34141 > >> > > > > > total: nonzeros=485655, allocated nonzeros=485655 > >> > > > > > total number of mallocs used during MatSetValues > calls =0 > >> > > > > > not using I-node (on process 0) routines > >> > > > > > linear system matrix = precond matrix: > >> > > > > > Mat Object: 4 MPI processes > >> > > > > > type: mpiaij > >> > > > > > rows=973051, cols=973051 > >> > > > > > total: nonzeros=9.90037e+07, allocated > nonzeros=9.90037e+07 > >> > > > > > total number of mallocs used during MatSetValues calls =0 > >> > > > > > using I-node (on process 0) routines: found 78749 nodes, > >> limit used is 5 > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > Giang > >> > > > > > > >> > > > > > On Sun, Apr 23, 2017 at 10:19 PM, Barry Smith < > >> bsmith at mcs.anl.gov> wrote: > >> > > > > > > >> > > > > > > On Apr 23, 2017, at 2:42 PM, Hoang Giang Bui < > >> hgbk2008 at gmail.com> wrote: > >> > > > > > > > >> > > > > > > Dear Matt/Barry > >> > > > > > > > >> > > > > > > With your options, it results in > >> > > > > > > > >> > > > > > > 0 KSP preconditioned resid norm 1.106709687386e+31 true > >> resid norm 9.015150491938e+06 ||r(i)||/||b|| 1.000000000000e+00 > >> > > > > > > Residual norms for fieldsplit_u_ solve. > >> > > > > > > 0 KSP Residual norm 2.407308987203e+36 > >> > > > > > > 1 KSP Residual norm 5.797185652683e+72 > >> > > > > > > >> > > > > > It looks like Matt is right, hypre is seemly producing useless > >> garbage. > >> > > > > > > >> > > > > > First how do things run on one process. If you have similar > >> problems then debug on one process (debugging any kind of problem is > always > >> far easy on one process). > >> > > > > > > >> > > > > > First run with -fieldsplit_u_type lu (instead of using hypre) > to > >> see if that works or also produces something bad. > >> > > > > > > >> > > > > > What is the operator and the boundary conditions for u? It > could > >> be singular. > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > Residual norms for fieldsplit_wp_ solve. > >> > > > > > > 0 KSP Residual norm 0.000000000000e+00 > >> > > > > > > ... > >> > > > > > > 999 KSP preconditioned resid norm 2.920157329174e+12 true > >> resid norm 9.015683504616e+06 ||r(i)||/||b|| 1.000059124102e+00 > >> > > > > > > Residual norms for fieldsplit_u_ solve. 
> >> > > > > > > 0 KSP Residual norm 1.533726746719e+36 > >> > > > > > > 1 KSP Residual norm 3.692757392261e+72 > >> > > > > > > Residual norms for fieldsplit_wp_ solve. > >> > > > > > > 0 KSP Residual norm 0.000000000000e+00 > >> > > > > > > > >> > > > > > > Do you suggest that the pastix solver for the "wp" block > >> encounters small pivot? In addition, seem like the "u" block is also > >> singular. > >> > > > > > > > >> > > > > > > Giang > >> > > > > > > > >> > > > > > > On Sun, Apr 23, 2017 at 7:39 PM, Barry Smith < > >> bsmith at mcs.anl.gov> wrote: > >> > > > > > > > >> > > > > > > Huge preconditioned norms but normal unpreconditioned > norms > >> almost always come from a very small pivot in an LU or ILU > factorization. > >> > > > > > > > >> > > > > > > The first thing to do is monitor the two sub solves. Run > >> with the additional options -fieldsplit_u_ksp_type richardson > >> -fieldsplit_u_ksp_monitor -fieldsplit_u_ksp_max_it 1 > >> -fieldsplit_wp_ksp_type richardson -fieldsplit_wp_ksp_monitor > >> -fieldsplit_wp_ksp_max_it 1 > >> > > > > > > > >> > > > > > > > On Apr 23, 2017, at 12:22 PM, Hoang Giang Bui < > >> hgbk2008 at gmail.com> wrote: > >> > > > > > > > > >> > > > > > > > Hello > >> > > > > > > > > >> > > > > > > > I encountered a strange convergence behavior that I have > >> trouble to understand > >> > > > > > > > > >> > > > > > > > KSPSetFromOptions completed > >> > > > > > > > 0 KSP preconditioned resid norm 1.106709687386e+31 true > >> resid norm 9.015150491938e+06 ||r(i)||/||b|| 1.000000000000e+00 > >> > > > > > > > 1 KSP preconditioned resid norm 2.933141742664e+29 true > >> resid norm 9.015152282123e+06 ||r(i)||/||b|| 1.000000198575e+00 > >> > > > > > > > 2 KSP preconditioned resid norm 9.686409637174e+16 true > >> resid norm 9.015354521944e+06 ||r(i)||/||b|| 1.000022631902e+00 > >> > > > > > > > 3 KSP preconditioned resid norm 4.219243615809e+15 true > >> resid norm 9.017157702420e+06 ||r(i)||/||b|| 1.000222648583e+00 > >> > > > > > > > ..... 
> >> > > > > > > > 999 KSP preconditioned resid norm 3.043754298076e+12 true > >> resid norm 9.015425041089e+06 ||r(i)||/||b|| 1.000030454195e+00 > >> > > > > > > > 1000 KSP preconditioned resid norm 3.043000287819e+12 true > >> resid norm 9.015424313455e+06 ||r(i)||/||b|| 1.000030373483e+00 > >> > > > > > > > Linear solve did not converge due to DIVERGED_ITS > iterations > >> 1000 > >> > > > > > > > KSP Object: 4 MPI processes > >> > > > > > > > type: gmres > >> > > > > > > > GMRES: restart=1000, using Modified Gram-Schmidt > >> Orthogonalization > >> > > > > > > > GMRES: happy breakdown tolerance 1e-30 > >> > > > > > > > maximum iterations=1000, initial guess is zero > >> > > > > > > > tolerances: relative=1e-20, absolute=1e-09, > >> divergence=10000 > >> > > > > > > > left preconditioning > >> > > > > > > > using PRECONDITIONED norm type for convergence test > >> > > > > > > > PC Object: 4 MPI processes > >> > > > > > > > type: fieldsplit > >> > > > > > > > FieldSplit with MULTIPLICATIVE composition: total > splits > >> = 2 > >> > > > > > > > Solver info for each split is in the following KSP > >> objects: > >> > > > > > > > Split number 0 Defined by IS > >> > > > > > > > KSP Object: (fieldsplit_u_) 4 MPI processes > >> > > > > > > > type: preonly > >> > > > > > > > maximum iterations=10000, initial guess is zero > >> > > > > > > > tolerances: relative=1e-05, absolute=1e-50, > >> divergence=10000 > >> > > > > > > > left preconditioning > >> > > > > > > > using NONE norm type for convergence test > >> > > > > > > > PC Object: (fieldsplit_u_) 4 MPI processes > >> > > > > > > > type: hypre > >> > > > > > > > HYPRE BoomerAMG preconditioning > >> > > > > > > > HYPRE BoomerAMG: Cycle type V > >> > > > > > > > HYPRE BoomerAMG: Maximum number of levels 25 > >> > > > > > > > HYPRE BoomerAMG: Maximum number of iterations PER > >> hypre call 1 > >> > > > > > > > HYPRE BoomerAMG: Convergence tolerance PER hypre > >> call 0 > >> > > > > > > > HYPRE BoomerAMG: Threshold for strong coupling 0.6 > >> > > > > > > > HYPRE BoomerAMG: Interpolation truncation factor 0 > >> > > > > > > > HYPRE BoomerAMG: Interpolation: max elements per > row > >> 0 > >> > > > > > > > HYPRE BoomerAMG: Number of levels of aggressive > >> coarsening 0 > >> > > > > > > > HYPRE BoomerAMG: Number of paths for aggressive > >> coarsening 1 > >> > > > > > > > HYPRE BoomerAMG: Maximum row sums 0.9 > >> > > > > > > > HYPRE BoomerAMG: Sweeps down 1 > >> > > > > > > > HYPRE BoomerAMG: Sweeps up 1 > >> > > > > > > > HYPRE BoomerAMG: Sweeps on coarse 1 > >> > > > > > > > HYPRE BoomerAMG: Relax down > >> symmetric-SOR/Jacobi > >> > > > > > > > HYPRE BoomerAMG: Relax up > >> symmetric-SOR/Jacobi > >> > > > > > > > HYPRE BoomerAMG: Relax on coarse > >> Gaussian-elimination > >> > > > > > > > HYPRE BoomerAMG: Relax weight (all) 1 > >> > > > > > > > HYPRE BoomerAMG: Outer relax weight (all) 1 > >> > > > > > > > HYPRE BoomerAMG: Using CF-relaxation > >> > > > > > > > HYPRE BoomerAMG: Measure type local > >> > > > > > > > HYPRE BoomerAMG: Coarsen type PMIS > >> > > > > > > > HYPRE BoomerAMG: Interpolation type classical > >> > > > > > > > linear system matrix = precond matrix: > >> > > > > > > > Mat Object: (fieldsplit_u_) 4 MPI > processes > >> > > > > > > > type: mpiaij > >> > > > > > > > rows=938910, cols=938910, bs=3 > >> > > > > > > > total: nonzeros=8.60906e+07, allocated > >> nonzeros=8.60906e+07 > >> > > > > > > > total number of mallocs used during MatSetValues > >> calls =0 > >> > > > > > > > using I-node (on process 0) routines: found 
> 78749 > >> nodes, limit used is 5 > >> > > > > > > > Split number 1 Defined by IS > >> > > > > > > > KSP Object: (fieldsplit_wp_) 4 MPI processes > >> > > > > > > > type: preonly > >> > > > > > > > maximum iterations=10000, initial guess is zero > >> > > > > > > > tolerances: relative=1e-05, absolute=1e-50, > >> divergence=10000 > >> > > > > > > > left preconditioning > >> > > > > > > > using NONE norm type for convergence test > >> > > > > > > > PC Object: (fieldsplit_wp_) 4 MPI processes > >> > > > > > > > type: lu > >> > > > > > > > LU: out-of-place factorization > >> > > > > > > > tolerance for zero pivot 2.22045e-14 > >> > > > > > > > matrix ordering: natural > >> > > > > > > > factor fill ratio given 0, needed 0 > >> > > > > > > > Factored matrix follows: > >> > > > > > > > Mat Object: 4 MPI processes > >> > > > > > > > type: mpiaij > >> > > > > > > > rows=34141, cols=34141 > >> > > > > > > > package used to perform factorization: > pastix > >> > > > > > > > Error : -nan > >> > > > > > > > Error : -nan > >> > > > > > > > total: nonzeros=0, allocated nonzeros=0 > >> > > > > > > > Error : -nan > >> > > > > > > > total number of mallocs used during MatSetValues > calls =0 > >> > > > > > > > PaStiX run parameters: > >> > > > > > > > Matrix type : > >> Symmetric > >> > > > > > > > Level of printing (0,1,2): 0 > >> > > > > > > > Number of refinements iterations : 0 > >> > > > > > > > Error : -nan > >> > > > > > > > linear system matrix = precond matrix: > >> > > > > > > > Mat Object: (fieldsplit_wp_) 4 MPI > processes > >> > > > > > > > type: mpiaij > >> > > > > > > > rows=34141, cols=34141 > >> > > > > > > > total: nonzeros=485655, allocated nonzeros=485655 > >> > > > > > > > total number of mallocs used during MatSetValues > >> calls =0 > >> > > > > > > > not using I-node (on process 0) routines > >> > > > > > > > linear system matrix = precond matrix: > >> > > > > > > > Mat Object: 4 MPI processes > >> > > > > > > > type: mpiaij > >> > > > > > > > rows=973051, cols=973051 > >> > > > > > > > total: nonzeros=9.90037e+07, allocated > >> nonzeros=9.90037e+07 > >> > > > > > > > total number of mallocs used during MatSetValues > calls =0 > >> > > > > > > > using I-node (on process 0) routines: found 78749 > >> nodes, limit used is 5 > >> > > > > > > > > >> > > > > > > > The pattern of convergence gives a hint that this system > is > >> somehow bad/singular. But I don't know why the preconditioned error > goes up > >> too high. Anyone has an idea? > >> > > > > > > > > >> > > > > > > > Best regards > >> > > > > > > > Giang Bui > >> > > > > > > > > >> > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > > >> > > > > > >> > > > > > >> > > > > >> > > > > >> > > > >> > > > >> > > >> > > >> > >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Wed May 3 02:45:04 2017 From: dave.mayhem23 at gmail.com (Dave May) Date: Wed, 03 May 2017 07:45:04 +0000 Subject: [petsc-users] strange convergence In-Reply-To: References: <7891536D-91FE-4BFF-8DAD-CE7AB85A4E57@mcs.anl.gov> <425BBB58-9721-49F3-8C86-940F08E925F7@mcs.anl.gov> <42EB791A-40C2-439F-A5F7-5F8C15CECA6F@mcs.anl.gov> <82193784-B4C4-47D7-80EA-25F549C9091B@mcs.anl.gov> <87wpa3wd5j.fsf@jedbrown.org> Message-ID: On Wed, 3 May 2017 at 09:29, Hoang Giang Bui wrote: > Dear Jed > > If I understood you correctly you suggest to avoid penalty by using the > Lagrange multiplier for the mortar constraint? In this case it leads to the > use of discrete Lagrange multiplier space. 
Do you or anyone already have > experience using discrete Lagrange multiplier space with Petsc? > Yes - this is similar to solving incompressible Stokes in which the pressure is a Lagrange multiplier enforcing the div(v)=0 constraint. Robust preconditioners for this problem are constructed using PCFIELDSPLIT. Thanks, Dave > There is also similar question on stackexchange > > https://scicomp.stackexchange.com/questions/25113/preconditioners-and-discrete-lagrange-multipliers > > Giang > > On Sat, Apr 29, 2017 at 3:34 PM, Jed Brown wrote: > >> Hoang Giang Bui writes: >> >> > Hi Barry >> > >> > The first block is from a standard solid mechanics discretization based >> on >> > balance of momentum equation. There is some material involved but in >> > principal it's well-posed elasticity equation with positive definite >> > tangent operator. The "gluing business" uses the mortar method to keep >> the >> > continuity of displacement. Instead of using Lagrange multiplier to >> treat >> > the constraint I used penalty method to penalize the energy. The >> > discretization form of mortar is quite simple >> > >> > \int_{\Gamma_1} { rho * (\delta u_1 - \delta u_2) * (u_1 - u_2) dA } >> > >> > rho is penalty parameter. In the simulation I initially set it low (~E) >> to >> > preserve the conditioning of the system. >> >> There are two things that can go wrong here with AMG: >> >> * The penalty term can mess up the strength of connection heuristics >> such that you get poor choice of C-points (classical AMG like >> BoomerAMG) or poor choice of aggregates (smoothed aggregation). >> >> * The penalty term can prevent Jacobi smoothing from being effective; in >> this case, it can lead to poor coarse basis functions (higher energy >> than they should be) and poor smoothing in an MG cycle. You can fix >> the poor smoothing in the MG cycle by using a stronger smoother, like >> ASM with some overlap. >> >> I'm generally not a fan of penalty methods due to the irritating >> tradeoffs and often poor solver performance. >> >> > In the figure below, the colorful blocks are u_1 and the base is u_2. >> Both >> > u_1 and u_2 use isoparametric quadratic approximation. >> > >> > ? >> > Snapshot.png >> > < >> https://drive.google.com/file/d/0Bw8Hmu0-YGQXc2hKQ1BhQ1I4OEU/view?usp=drive_web >> > >> > ??? >> > >> > Giang >> > >> > On Fri, Apr 28, 2017 at 6:21 PM, Barry Smith >> wrote: >> > >> >> >> >> Ok, so boomerAMG algebraic multigrid is not good for the first block. >> >> You mentioned the first block has two things glued together? AMG is >> >> fantastic for certain problems but doesn't work for everything. >> >> >> >> Tell us more about the first block, what PDE it comes from, what >> >> discretization, and what the "gluing business" is and maybe we'll have >> >> suggestions for how to precondition it. >> >> >> >> Barry >> >> >> >> > On Apr 28, 2017, at 3:56 AM, Hoang Giang Bui >> wrote: >> >> > >> >> > It's in fact quite good >> >> > >> >> > Residual norms for fieldsplit_u_ solve. >> >> > 0 KSP Residual norm 4.014715925568e+00 >> >> > 1 KSP Residual norm 2.160497019264e-10 >> >> > Residual norms for fieldsplit_wp_ solve. >> >> > 0 KSP Residual norm 0.000000000000e+00 >> >> > 0 KSP preconditioned resid norm 4.014715925568e+00 true resid norm >> >> 9.006493082896e+06 ||r(i)||/||b|| 1.000000000000e+00 >> >> > Residual norms for fieldsplit_u_ solve. >> >> > 0 KSP Residual norm 9.999999999416e-01 >> >> > 1 KSP Residual norm 7.118380416383e-11 >> >> > Residual norms for fieldsplit_wp_ solve. 
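A minimal sketch of the Schur-complement PCFIELDSPLIT setup Dave describes above for a saddle-point system in which the second block is a discrete Lagrange multiplier. The split names "u" and "lm" and the sub-preconditioner choices are placeholders that depend on how the application registers its splits; they are not options taken from this thread:

  -ksp_type fgmres
  -pc_type fieldsplit
  -pc_fieldsplit_type schur
  -pc_fieldsplit_schur_fact_type full
  -pc_fieldsplit_schur_precondition selfp
  -fieldsplit_u_ksp_type preonly
  -fieldsplit_u_pc_type gamg
  -fieldsplit_lm_ksp_type gmres
  -fieldsplit_lm_ksp_rtol 1e-8
  -fieldsplit_lm_pc_type jacobi

With -pc_fieldsplit_type schur the multiplier block is handled through an approximate Schur complement, which is the standard PCFIELDSPLIT approach for Stokes-like problems; "selfp" builds the Schur preconditioner from A11 - A10 inv(diag(A00)) A01. FGMRES is used as the outer Krylov method because the inner iterations make the preconditioner change from one application to the next.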
>> >> > 0 KSP Residual norm 0.000000000000e+00 >> >> > 1 KSP preconditioned resid norm 1.701150951035e-10 true resid norm >> >> 5.494262251846e-04 ||r(i)||/||b|| 6.100334726599e-11 >> >> > Linear solve converged due to CONVERGED_ATOL iterations 1 >> >> > >> >> > Giang >> >> > >> >> > On Thu, Apr 27, 2017 at 5:25 PM, Barry Smith >> wrote: >> >> > >> >> > Run again using LU on both blocks to see what happens. >> >> > >> >> > >> >> > > On Apr 27, 2017, at 2:14 AM, Hoang Giang Bui >> >> wrote: >> >> > > >> >> > > I have changed the way to tie the nonconforming mesh. It seems the >> >> matrix now is better >> >> > > >> >> > > with -pc_type lu the output is >> >> > > 0 KSP preconditioned resid norm 3.308678584240e-01 true resid >> norm >> >> 9.006493082896e+06 ||r(i)||/||b|| 1.000000000000e+00 >> >> > > 1 KSP preconditioned resid norm 2.004313395301e-12 true resid >> norm >> >> 2.549872332830e-05 ||r(i)||/||b|| 2.831148938173e-12 >> >> > > Linear solve converged due to CONVERGED_ATOL iterations 1 >> >> > > >> >> > > >> >> > > with -pc_type fieldsplit -fieldsplit_u_pc_type hypre >> >> -fieldsplit_wp_pc_type lu the convergence is slow >> >> > > 0 KSP preconditioned resid norm 1.116302362553e-01 true resid >> norm >> >> 9.006493083520e+06 ||r(i)||/||b|| 1.000000000000e+00 >> >> > > 1 KSP preconditioned resid norm 2.582134825666e-02 true resid >> norm >> >> 9.268347719866e+06 ||r(i)||/||b|| 1.029073984060e+00 >> >> > > ... >> >> > > 824 KSP preconditioned resid norm 1.018542387738e-09 true resid >> norm >> >> 2.906608839310e+02 ||r(i)||/||b|| 3.227237074804e-05 >> >> > > 825 KSP preconditioned resid norm 9.743727947637e-10 true resid >> norm >> >> 2.820369993061e+02 ||r(i)||/||b|| 3.131485215062e-05 >> >> > > Linear solve converged due to CONVERGED_ATOL iterations 825 >> >> > > >> >> > > checking with additional -fieldsplit_u_ksp_type richardson >> >> -fieldsplit_u_ksp_monitor -fieldsplit_u_ksp_max_it 1 >> >> -fieldsplit_wp_ksp_type richardson -fieldsplit_wp_ksp_monitor >> >> -fieldsplit_wp_ksp_max_it 1 gives >> >> > > >> >> > > 0 KSP preconditioned resid norm 1.116302362553e-01 true resid >> norm >> >> 9.006493083520e+06 ||r(i)||/||b|| 1.000000000000e+00 >> >> > > Residual norms for fieldsplit_u_ solve. >> >> > > 0 KSP Residual norm 5.803507549280e-01 >> >> > > 1 KSP Residual norm 2.069538175950e-01 >> >> > > Residual norms for fieldsplit_wp_ solve. >> >> > > 0 KSP Residual norm 0.000000000000e+00 >> >> > > 1 KSP preconditioned resid norm 2.582134825666e-02 true resid >> norm >> >> 9.268347719866e+06 ||r(i)||/||b|| 1.029073984060e+00 >> >> > > Residual norms for fieldsplit_u_ solve. >> >> > > 0 KSP Residual norm 7.831796195225e-01 >> >> > > 1 KSP Residual norm 1.734608520110e-01 >> >> > > Residual norms for fieldsplit_wp_ solve. >> >> > > 0 KSP Residual norm 0.000000000000e+00 >> >> > > .... >> >> > > 823 KSP preconditioned resid norm 1.065070135605e-09 true resid >> norm >> >> 3.081881356833e+02 ||r(i)||/||b|| 3.421843916665e-05 >> >> > > Residual norms for fieldsplit_u_ solve. >> >> > > 0 KSP Residual norm 6.113806394327e-01 >> >> > > 1 KSP Residual norm 1.535465290944e-01 >> >> > > Residual norms for fieldsplit_wp_ solve. >> >> > > 0 KSP Residual norm 0.000000000000e+00 >> >> > > 824 KSP preconditioned resid norm 1.018542387746e-09 true resid >> norm >> >> 2.906608839353e+02 ||r(i)||/||b|| 3.227237074851e-05 >> >> > > Residual norms for fieldsplit_u_ solve. 
>> >> > > 0 KSP Residual norm 6.123437055586e-01 >> >> > > 1 KSP Residual norm 1.524661826133e-01 >> >> > > Residual norms for fieldsplit_wp_ solve. >> >> > > 0 KSP Residual norm 0.000000000000e+00 >> >> > > 825 KSP preconditioned resid norm 9.743727947718e-10 true resid >> norm >> >> 2.820369990571e+02 ||r(i)||/||b|| 3.131485212298e-05 >> >> > > Linear solve converged due to CONVERGED_ATOL iterations 825 >> >> > > >> >> > > >> >> > > The residual for wp block is zero since in this first step the rhs >> is >> >> zero. As can see in the output, the multigrid does not perform well to >> >> reduce the residual in the sub-solve. Is my observation right? what >> can be >> >> done to improve this? >> >> > > >> >> > > >> >> > > Giang >> >> > > >> >> > > On Tue, Apr 25, 2017 at 12:17 AM, Barry Smith >> >> wrote: >> >> > > >> >> > > This can happen in the matrix is singular or nearly singular or >> if >> >> the factorization generates small pivots, which can occur for even >> >> nonsingular problems if the matrix is poorly scaled or just plain >> nasty. >> >> > > >> >> > > >> >> > > > On Apr 24, 2017, at 5:10 PM, Hoang Giang Bui > > >> >> wrote: >> >> > > > >> >> > > > It took a while, here I send you the output >> >> > > > >> >> > > > 0 KSP preconditioned resid norm 3.129073545457e+05 true resid >> norm >> >> 9.015150492169e+06 ||r(i)||/||b|| 1.000000000000e+00 >> >> > > > 1 KSP preconditioned resid norm 7.442444222843e-01 true resid >> norm >> >> 1.003356247696e+02 ||r(i)||/||b|| 1.112966720375e-05 >> >> > > > 2 KSP preconditioned resid norm 3.267453132529e-07 true resid >> norm >> >> 3.216722968300e+01 ||r(i)||/||b|| 3.568130084011e-06 >> >> > > > 3 KSP preconditioned resid norm 1.155046883816e-11 true resid >> norm >> >> 3.234460376820e+01 ||r(i)||/||b|| 3.587805194854e-06 >> >> > > > Linear solve converged due to CONVERGED_ATOL iterations 3 >> >> > > > KSP Object: 4 MPI processes >> >> > > > type: gmres >> >> > > > GMRES: restart=1000, using Modified Gram-Schmidt >> >> Orthogonalization >> >> > > > GMRES: happy breakdown tolerance 1e-30 >> >> > > > maximum iterations=1000, initial guess is zero >> >> > > > tolerances: relative=1e-20, absolute=1e-09, divergence=10000 >> >> > > > left preconditioning >> >> > > > using PRECONDITIONED norm type for convergence test >> >> > > > PC Object: 4 MPI processes >> >> > > > type: lu >> >> > > > LU: out-of-place factorization >> >> > > > tolerance for zero pivot 2.22045e-14 >> >> > > > matrix ordering: natural >> >> > > > factor fill ratio given 0, needed 0 >> >> > > > Factored matrix follows: >> >> > > > Mat Object: 4 MPI processes >> >> > > > type: mpiaij >> >> > > > rows=973051, cols=973051 >> >> > > > package used to perform factorization: pastix >> >> > > > Error : 3.24786e-14 >> >> > > > total: nonzeros=0, allocated nonzeros=0 >> >> > > > total number of mallocs used during MatSetValues calls >> =0 >> >> > > > PaStiX run parameters: >> >> > > > Matrix type : Unsymmetric >> >> > > > Level of printing (0,1,2): 0 >> >> > > > Number of refinements iterations : 3 >> >> > > > Error : 3.24786e-14 >> >> > > > linear system matrix = precond matrix: >> >> > > > Mat Object: 4 MPI processes >> >> > > > type: mpiaij >> >> > > > rows=973051, cols=973051 >> >> > > > Error : 3.24786e-14 >> >> > > > total: nonzeros=9.90037e+07, allocated nonzeros=9.90037e+07 >> >> > > > total number of mallocs used during MatSetValues calls =0 >> >> > > > using I-node (on process 0) routines: found 78749 nodes, >> limit >> >> used is 5 >> >> > > > Error : 3.24786e-14 >> >> > > > >> >> 
> > > It doesn't do as you said. Something is not right here. I will >> look >> >> in depth. >> >> > > > >> >> > > > Giang >> >> > > > >> >> > > > On Mon, Apr 24, 2017 at 8:21 PM, Barry Smith > > >> >> wrote: >> >> > > > >> >> > > > > On Apr 24, 2017, at 12:47 PM, Hoang Giang Bui < >> hgbk2008 at gmail.com> >> >> wrote: >> >> > > > > >> >> > > > > Good catch. I get this for the very first step, maybe at that >> time >> >> the rhs_w is zero. >> >> > > > >> >> > > > With the multiplicative composition the right hand side of >> the >> >> second solve is the initial right hand side of the second solve minus >> >> A_10*x where x is the solution to the first sub solve and A_10 is the >> lower >> >> left block of the outer matrix. So unless both the initial right hand >> side >> >> has a zero for the second block and A_10 is identically zero the right >> hand >> >> side for the second sub solve should not be zero. Is A_10 == 0? >> >> > > > >> >> > > > >> >> > > > > In the later step, it shows 2 step convergence >> >> > > > > >> >> > > > > Residual norms for fieldsplit_u_ solve. >> >> > > > > 0 KSP Residual norm 3.165886479830e+04 >> >> > > > > 1 KSP Residual norm 2.905922877684e-01 >> >> > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > 0 KSP Residual norm 2.397669419027e-01 >> >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> >> > > > > 0 KSP preconditioned resid norm 3.165886479920e+04 true resid >> >> norm 7.963616922323e+05 ||r(i)||/||b|| 1.000000000000e+00 >> >> > > > > Residual norms for fieldsplit_u_ solve. >> >> > > > > 0 KSP Residual norm 9.999891813771e-01 >> >> > > > > 1 KSP Residual norm 1.512000395579e-05 >> >> > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > 0 KSP Residual norm 8.192702188243e-06 >> >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> >> > > > > 1 KSP preconditioned resid norm 5.252183822848e-02 true resid >> >> norm 7.135927677844e+04 ||r(i)||/||b|| 8.960661653427e-02 >> >> > > > >> >> > > > The outer residual norms are still wonky, the preconditioned >> >> residual norm goes from 3.165886479920e+04 to 5.252183822848e-02 which >> is a >> >> huge drop but the 7.963616922323e+05 drops very much less >> >> 7.135927677844e+04. This is not normal. >> >> > > > >> >> > > > What if you just use -pc_type lu for the entire system (no >> >> fieldsplit), does the true residual drop to almost zero in the first >> >> iteration (as it should?). Send the output. >> >> > > > >> >> > > > >> >> > > > >> >> > > > > Residual norms for fieldsplit_u_ solve. >> >> > > > > 0 KSP Residual norm 6.946213936597e-01 >> >> > > > > 1 KSP Residual norm 1.195514007343e-05 >> >> > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > 0 KSP Residual norm 1.025694497535e+00 >> >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> >> > > > > 2 KSP preconditioned resid norm 8.785709535405e-03 true resid >> >> norm 1.419341799277e+04 ||r(i)||/||b|| 1.782282866091e-02 >> >> > > > > Residual norms for fieldsplit_u_ solve. >> >> > > > > 0 KSP Residual norm 7.255149996405e-01 >> >> > > > > 1 KSP Residual norm 6.583512434218e-06 >> >> > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > 0 KSP Residual norm 1.015229700337e+00 >> >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> >> > > > > 3 KSP preconditioned resid norm 7.110407712709e-04 true resid >> >> norm 5.284940654154e+02 ||r(i)||/||b|| 6.636357205153e-04 >> >> > > > > Residual norms for fieldsplit_u_ solve. 
>> >> > > > > 0 KSP Residual norm 3.512243341400e-01 >> >> > > > > 1 KSP Residual norm 2.032490351200e-06 >> >> > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > 0 KSP Residual norm 1.282327290982e+00 >> >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> >> > > > > 4 KSP preconditioned resid norm 3.482036620521e-05 true resid >> >> norm 4.291231924307e+01 ||r(i)||/||b|| 5.388546393133e-05 >> >> > > > > Residual norms for fieldsplit_u_ solve. >> >> > > > > 0 KSP Residual norm 3.423609338053e-01 >> >> > > > > 1 KSP Residual norm 4.213703301972e-07 >> >> > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > 0 KSP Residual norm 1.157384757538e+00 >> >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> >> > > > > 5 KSP preconditioned resid norm 1.203470314534e-06 true resid >> >> norm 4.544956156267e+00 ||r(i)||/||b|| 5.707150658550e-06 >> >> > > > > Residual norms for fieldsplit_u_ solve. >> >> > > > > 0 KSP Residual norm 3.838596289995e-01 >> >> > > > > 1 KSP Residual norm 9.927864176103e-08 >> >> > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > 0 KSP Residual norm 1.066298905618e+00 >> >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> >> > > > > 6 KSP preconditioned resid norm 3.331619244266e-08 true resid >> >> norm 2.821511729024e+00 ||r(i)||/||b|| 3.543002829675e-06 >> >> > > > > Residual norms for fieldsplit_u_ solve. >> >> > > > > 0 KSP Residual norm 4.624964188094e-01 >> >> > > > > 1 KSP Residual norm 6.418229775372e-08 >> >> > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > 0 KSP Residual norm 9.800784311614e-01 >> >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> >> > > > > 7 KSP preconditioned resid norm 8.788046233297e-10 true resid >> >> norm 2.849209671705e+00 ||r(i)||/||b|| 3.577783436215e-06 >> >> > > > > Linear solve converged due to CONVERGED_ATOL iterations 7 >> >> > > > > >> >> > > > > The outer operator is an explicit matrix. >> >> > > > > >> >> > > > > Giang >> >> > > > > >> >> > > > > On Mon, Apr 24, 2017 at 7:32 PM, Barry Smith < >> bsmith at mcs.anl.gov> >> >> wrote: >> >> > > > > >> >> > > > > > On Apr 24, 2017, at 3:16 AM, Hoang Giang Bui < >> hgbk2008 at gmail.com> >> >> wrote: >> >> > > > > > >> >> > > > > > Thanks Barry, trying with -fieldsplit_u_type lu gives better >> >> convergence. I still used 4 procs though, probably with 1 proc it >> should >> >> also be the same. >> >> > > > > > >> >> > > > > > The u block used a Nitsche-type operator to connect two >> >> non-matching domains. I don't think it will leave some rigid body >> motion >> >> leads to not sufficient constraints. Maybe you have other idea? >> >> > > > > > >> >> > > > > > Residual norms for fieldsplit_u_ solve. >> >> > > > > > 0 KSP Residual norm 3.129067184300e+05 >> >> > > > > > 1 KSP Residual norm 5.906261468196e-01 >> >> > > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > > 0 KSP Residual norm 0.000000000000e+00 >> >> > > > > >> >> > > > > ^^^^ something is wrong here. The sub solve should not be >> >> starting with a 0 residual (this means the right hand side for this sub >> >> solve is zero which it should not be). >> >> > > > > >> >> > > > > > FieldSplit with MULTIPLICATIVE composition: total splits = 2 >> >> > > > > >> >> > > > > >> >> > > > > How are you providing the outer operator? As an explicit >> matrix >> >> or with some shell matrix? 
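Regarding Barry's question about how the outer operator is provided: when the operator is an assembled (explicit) matrix, splits that -ksp_view reports as "Defined by IS" are typically registered along the following lines. This is only an illustrative sketch, with A, b, x, isu and iswp assumed to be the application's assembled matrix, vectors and index sets; it is not code from this thread:

  #include <petscksp.h>
  /* Fragment assumed to sit inside the application's solve routine, after
     PetscInitialize(). Assumed to exist already: Mat A (the assembled MPIAIJ
     outer operator), Vec b, x, and index sets isu, iswp holding the global
     dof numbers of the u and wp blocks. Error checking omitted for brevity. */
  KSP ksp;
  PC  pc;
  KSPCreate(PETSC_COMM_WORLD,&ksp);
  KSPSetOperators(ksp,A,A);
  KSPGetPC(ksp,&pc);
  PCSetType(pc,PCFIELDSPLIT);
  PCFieldSplitSetIS(pc,"u",isu);    /* creates the -fieldsplit_u_  option prefix */
  PCFieldSplitSetIS(pc,"wp",iswp);  /* creates the -fieldsplit_wp_ option prefix */
  KSPSetFromOptions(ksp);           /* picks up -pc_fieldsplit_type and sub-solver options */
  KSPSolve(ksp,b,x);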
>> >> > > > > >> >> > > > > >> >> > > > > >> >> > > > > > 0 KSP preconditioned resid norm 3.129067184300e+05 true >> resid >> >> norm 9.015150492169e+06 ||r(i)||/||b|| 1.000000000000e+00 >> >> > > > > > Residual norms for fieldsplit_u_ solve. >> >> > > > > > 0 KSP Residual norm 9.999955993437e-01 >> >> > > > > > 1 KSP Residual norm 4.019774691831e-06 >> >> > > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > > 0 KSP Residual norm 0.000000000000e+00 >> >> > > > > > 1 KSP preconditioned resid norm 5.003913641475e-01 true >> resid >> >> norm 4.692996324114e+01 ||r(i)||/||b|| 5.205677185522e-06 >> >> > > > > > Residual norms for fieldsplit_u_ solve. >> >> > > > > > 0 KSP Residual norm 1.000012180204e+00 >> >> > > > > > 1 KSP Residual norm 1.017367950422e-05 >> >> > > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > > 0 KSP Residual norm 0.000000000000e+00 >> >> > > > > > 2 KSP preconditioned resid norm 2.330910333756e-07 true >> resid >> >> norm 3.474855463983e+01 ||r(i)||/||b|| 3.854461960453e-06 >> >> > > > > > Residual norms for fieldsplit_u_ solve. >> >> > > > > > 0 KSP Residual norm 1.000004200085e+00 >> >> > > > > > 1 KSP Residual norm 6.231613102458e-06 >> >> > > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > > 0 KSP Residual norm 0.000000000000e+00 >> >> > > > > > 3 KSP preconditioned resid norm 8.671259838389e-11 true >> resid >> >> norm 3.545103468011e+01 ||r(i)||/||b|| 3.932384125024e-06 >> >> > > > > > Linear solve converged due to CONVERGED_ATOL iterations 3 >> >> > > > > > KSP Object: 4 MPI processes >> >> > > > > > type: gmres >> >> > > > > > GMRES: restart=1000, using Modified Gram-Schmidt >> >> Orthogonalization >> >> > > > > > GMRES: happy breakdown tolerance 1e-30 >> >> > > > > > maximum iterations=1000, initial guess is zero >> >> > > > > > tolerances: relative=1e-20, absolute=1e-09, >> divergence=10000 >> >> > > > > > left preconditioning >> >> > > > > > using PRECONDITIONED norm type for convergence test >> >> > > > > > PC Object: 4 MPI processes >> >> > > > > > type: fieldsplit >> >> > > > > > FieldSplit with MULTIPLICATIVE composition: total splits >> = 2 >> >> > > > > > Solver info for each split is in the following KSP >> objects: >> >> > > > > > Split number 0 Defined by IS >> >> > > > > > KSP Object: (fieldsplit_u_) 4 MPI processes >> >> > > > > > type: richardson >> >> > > > > > Richardson: damping factor=1 >> >> > > > > > maximum iterations=1, initial guess is zero >> >> > > > > > tolerances: relative=1e-05, absolute=1e-50, >> >> divergence=10000 >> >> > > > > > left preconditioning >> >> > > > > > using PRECONDITIONED norm type for convergence test >> >> > > > > > PC Object: (fieldsplit_u_) 4 MPI processes >> >> > > > > > type: lu >> >> > > > > > LU: out-of-place factorization >> >> > > > > > tolerance for zero pivot 2.22045e-14 >> >> > > > > > matrix ordering: natural >> >> > > > > > factor fill ratio given 0, needed 0 >> >> > > > > > Factored matrix follows: >> >> > > > > > Mat Object: 4 MPI processes >> >> > > > > > type: mpiaij >> >> > > > > > rows=938910, cols=938910 >> >> > > > > > package used to perform factorization: pastix >> >> > > > > > total: nonzeros=0, allocated nonzeros=0 >> >> > > > > > Error : 3.36878e-14 >> >> > > > > > total number of mallocs used during MatSetValues >> calls >> >> =0 >> >> > > > > > PaStiX run parameters: >> >> > > > > > Matrix type : >> Unsymmetric >> >> > > > > > Level of printing (0,1,2): 0 >> >> > > > > > Number of refinements iterations : 3 >> >> > > > > > Error : 3.36878e-14 >> >> > 
> > > > linear system matrix = precond matrix: >> >> > > > > > Mat Object: (fieldsplit_u_) 4 MPI processes >> >> > > > > > type: mpiaij >> >> > > > > > rows=938910, cols=938910, bs=3 >> >> > > > > > Error : 3.36878e-14 >> >> > > > > > Error : 3.36878e-14 >> >> > > > > > total: nonzeros=8.60906e+07, allocated >> >> nonzeros=8.60906e+07 >> >> > > > > > total number of mallocs used during MatSetValues >> calls =0 >> >> > > > > > using I-node (on process 0) routines: found 78749 >> >> nodes, limit used is 5 >> >> > > > > > Split number 1 Defined by IS >> >> > > > > > KSP Object: (fieldsplit_wp_) 4 MPI processes >> >> > > > > > type: richardson >> >> > > > > > Richardson: damping factor=1 >> >> > > > > > maximum iterations=1, initial guess is zero >> >> > > > > > tolerances: relative=1e-05, absolute=1e-50, >> >> divergence=10000 >> >> > > > > > left preconditioning >> >> > > > > > using PRECONDITIONED norm type for convergence test >> >> > > > > > PC Object: (fieldsplit_wp_) 4 MPI processes >> >> > > > > > type: lu >> >> > > > > > LU: out-of-place factorization >> >> > > > > > tolerance for zero pivot 2.22045e-14 >> >> > > > > > matrix ordering: natural >> >> > > > > > factor fill ratio given 0, needed 0 >> >> > > > > > Factored matrix follows: >> >> > > > > > Mat Object: 4 MPI processes >> >> > > > > > type: mpiaij >> >> > > > > > rows=34141, cols=34141 >> >> > > > > > package used to perform factorization: pastix >> >> > > > > > Error : -nan >> >> > > > > > Error : -nan >> >> > > > > > Error : -nan >> >> > > > > > total: nonzeros=0, allocated nonzeros=0 >> >> > > > > > total number of mallocs used during >> MatSetValues >> >> calls =0 >> >> > > > > > PaStiX run parameters: >> >> > > > > > Matrix type : >> Symmetric >> >> > > > > > Level of printing (0,1,2): 0 >> >> > > > > > Number of refinements iterations : 0 >> >> > > > > > Error : -nan >> >> > > > > > linear system matrix = precond matrix: >> >> > > > > > Mat Object: (fieldsplit_wp_) 4 MPI processes >> >> > > > > > type: mpiaij >> >> > > > > > rows=34141, cols=34141 >> >> > > > > > total: nonzeros=485655, allocated nonzeros=485655 >> >> > > > > > total number of mallocs used during MatSetValues >> calls =0 >> >> > > > > > not using I-node (on process 0) routines >> >> > > > > > linear system matrix = precond matrix: >> >> > > > > > Mat Object: 4 MPI processes >> >> > > > > > type: mpiaij >> >> > > > > > rows=973051, cols=973051 >> >> > > > > > total: nonzeros=9.90037e+07, allocated >> nonzeros=9.90037e+07 >> >> > > > > > total number of mallocs used during MatSetValues calls =0 >> >> > > > > > using I-node (on process 0) routines: found 78749 >> nodes, >> >> limit used is 5 >> >> > > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > > Giang >> >> > > > > > >> >> > > > > > On Sun, Apr 23, 2017 at 10:19 PM, Barry Smith < >> >> bsmith at mcs.anl.gov> wrote: >> >> > > > > > >> >> > > > > > > On Apr 23, 2017, at 2:42 PM, Hoang Giang Bui < >> >> hgbk2008 at gmail.com> wrote: >> >> > > > > > > >> >> > > > > > > Dear Matt/Barry >> >> > > > > > > >> >> > > > > > > With your options, it results in >> >> > > > > > > >> >> > > > > > > 0 KSP preconditioned resid norm 1.106709687386e+31 true >> >> resid norm 9.015150491938e+06 ||r(i)||/||b|| 1.000000000000e+00 >> >> > > > > > > Residual norms for fieldsplit_u_ solve. >> >> > > > > > > 0 KSP Residual norm 2.407308987203e+36 >> >> > > > > > > 1 KSP Residual norm 5.797185652683e+72 >> >> > > > > > >> >> > > > > > It looks like Matt is right, hypre is seemly producing >> useless >> >> garbage. 
>> >> > > > > > >> >> > > > > > First how do things run on one process. If you have similar >> >> problems then debug on one process (debugging any kind of problem is >> always >> >> far easy on one process). >> >> > > > > > >> >> > > > > > First run with -fieldsplit_u_type lu (instead of using >> hypre) to >> >> see if that works or also produces something bad. >> >> > > > > > >> >> > > > > > What is the operator and the boundary conditions for u? It >> could >> >> be singular. >> >> > > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > > > 0 KSP Residual norm 0.000000000000e+00 >> >> > > > > > > ... >> >> > > > > > > 999 KSP preconditioned resid norm 2.920157329174e+12 true >> >> resid norm 9.015683504616e+06 ||r(i)||/||b|| 1.000059124102e+00 >> >> > > > > > > Residual norms for fieldsplit_u_ solve. >> >> > > > > > > 0 KSP Residual norm 1.533726746719e+36 >> >> > > > > > > 1 KSP Residual norm 3.692757392261e+72 >> >> > > > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > > > 0 KSP Residual norm 0.000000000000e+00 >> >> > > > > > > >> >> > > > > > > Do you suggest that the pastix solver for the "wp" block >> >> encounters small pivot? In addition, seem like the "u" block is also >> >> singular. >> >> > > > > > > >> >> > > > > > > Giang >> >> > > > > > > >> >> > > > > > > On Sun, Apr 23, 2017 at 7:39 PM, Barry Smith < >> >> bsmith at mcs.anl.gov> wrote: >> >> > > > > > > >> >> > > > > > > Huge preconditioned norms but normal unpreconditioned >> norms >> >> almost always come from a very small pivot in an LU or ILU >> factorization. >> >> > > > > > > >> >> > > > > > > The first thing to do is monitor the two sub solves. Run >> >> with the additional options -fieldsplit_u_ksp_type richardson >> >> -fieldsplit_u_ksp_monitor -fieldsplit_u_ksp_max_it 1 >> >> -fieldsplit_wp_ksp_type richardson -fieldsplit_wp_ksp_monitor >> >> -fieldsplit_wp_ksp_max_it 1 >> >> > > > > > > >> >> > > > > > > > On Apr 23, 2017, at 12:22 PM, Hoang Giang Bui < >> >> hgbk2008 at gmail.com> wrote: >> >> > > > > > > > >> >> > > > > > > > Hello >> >> > > > > > > > >> >> > > > > > > > I encountered a strange convergence behavior that I have >> >> trouble to understand >> >> > > > > > > > >> >> > > > > > > > KSPSetFromOptions completed >> >> > > > > > > > 0 KSP preconditioned resid norm 1.106709687386e+31 true >> >> resid norm 9.015150491938e+06 ||r(i)||/||b|| 1.000000000000e+00 >> >> > > > > > > > 1 KSP preconditioned resid norm 2.933141742664e+29 true >> >> resid norm 9.015152282123e+06 ||r(i)||/||b|| 1.000000198575e+00 >> >> > > > > > > > 2 KSP preconditioned resid norm 9.686409637174e+16 true >> >> resid norm 9.015354521944e+06 ||r(i)||/||b|| 1.000022631902e+00 >> >> > > > > > > > 3 KSP preconditioned resid norm 4.219243615809e+15 true >> >> resid norm 9.017157702420e+06 ||r(i)||/||b|| 1.000222648583e+00 >> >> > > > > > > > ..... 
>> >> > > > > > > > 999 KSP preconditioned resid norm 3.043754298076e+12 true >> >> resid norm 9.015425041089e+06 ||r(i)||/||b|| 1.000030454195e+00 >> >> > > > > > > > 1000 KSP preconditioned resid norm 3.043000287819e+12 >> true >> >> resid norm 9.015424313455e+06 ||r(i)||/||b|| 1.000030373483e+00 >> >> > > > > > > > Linear solve did not converge due to DIVERGED_ITS >> iterations >> >> 1000 >> >> > > > > > > > KSP Object: 4 MPI processes >> >> > > > > > > > type: gmres >> >> > > > > > > > GMRES: restart=1000, using Modified Gram-Schmidt >> >> Orthogonalization >> >> > > > > > > > GMRES: happy breakdown tolerance 1e-30 >> >> > > > > > > > maximum iterations=1000, initial guess is zero >> >> > > > > > > > tolerances: relative=1e-20, absolute=1e-09, >> >> divergence=10000 >> >> > > > > > > > left preconditioning >> >> > > > > > > > using PRECONDITIONED norm type for convergence test >> >> > > > > > > > PC Object: 4 MPI processes >> >> > > > > > > > type: fieldsplit >> >> > > > > > > > FieldSplit with MULTIPLICATIVE composition: total >> splits >> >> = 2 >> >> > > > > > > > Solver info for each split is in the following KSP >> >> objects: >> >> > > > > > > > Split number 0 Defined by IS >> >> > > > > > > > KSP Object: (fieldsplit_u_) 4 MPI processes >> >> > > > > > > > type: preonly >> >> > > > > > > > maximum iterations=10000, initial guess is zero >> >> > > > > > > > tolerances: relative=1e-05, absolute=1e-50, >> >> divergence=10000 >> >> > > > > > > > left preconditioning >> >> > > > > > > > using NONE norm type for convergence test >> >> > > > > > > > PC Object: (fieldsplit_u_) 4 MPI processes >> >> > > > > > > > type: hypre >> >> > > > > > > > HYPRE BoomerAMG preconditioning >> >> > > > > > > > HYPRE BoomerAMG: Cycle type V >> >> > > > > > > > HYPRE BoomerAMG: Maximum number of levels 25 >> >> > > > > > > > HYPRE BoomerAMG: Maximum number of iterations PER >> >> hypre call 1 >> >> > > > > > > > HYPRE BoomerAMG: Convergence tolerance PER hypre >> >> call 0 >> >> > > > > > > > HYPRE BoomerAMG: Threshold for strong coupling >> 0.6 >> >> > > > > > > > HYPRE BoomerAMG: Interpolation truncation factor >> 0 >> >> > > > > > > > HYPRE BoomerAMG: Interpolation: max elements per >> row >> >> 0 >> >> > > > > > > > HYPRE BoomerAMG: Number of levels of aggressive >> >> coarsening 0 >> >> > > > > > > > HYPRE BoomerAMG: Number of paths for aggressive >> >> coarsening 1 >> >> > > > > > > > HYPRE BoomerAMG: Maximum row sums 0.9 >> >> > > > > > > > HYPRE BoomerAMG: Sweeps down 1 >> >> > > > > > > > HYPRE BoomerAMG: Sweeps up 1 >> >> > > > > > > > HYPRE BoomerAMG: Sweeps on coarse 1 >> >> > > > > > > > HYPRE BoomerAMG: Relax down >> >> symmetric-SOR/Jacobi >> >> > > > > > > > HYPRE BoomerAMG: Relax up >> >> symmetric-SOR/Jacobi >> >> > > > > > > > HYPRE BoomerAMG: Relax on coarse >> >> Gaussian-elimination >> >> > > > > > > > HYPRE BoomerAMG: Relax weight (all) 1 >> >> > > > > > > > HYPRE BoomerAMG: Outer relax weight (all) 1 >> >> > > > > > > > HYPRE BoomerAMG: Using CF-relaxation >> >> > > > > > > > HYPRE BoomerAMG: Measure type local >> >> > > > > > > > HYPRE BoomerAMG: Coarsen type PMIS >> >> > > > > > > > HYPRE BoomerAMG: Interpolation type classical >> >> > > > > > > > linear system matrix = precond matrix: >> >> > > > > > > > Mat Object: (fieldsplit_u_) 4 MPI >> processes >> >> > > > > > > > type: mpiaij >> >> > > > > > > > rows=938910, cols=938910, bs=3 >> >> > > > > > > > total: nonzeros=8.60906e+07, allocated >> >> nonzeros=8.60906e+07 >> >> > > > > > > > total number of mallocs used during 
MatSetValues >> >> calls =0 >> >> > > > > > > > using I-node (on process 0) routines: found >> 78749 >> >> nodes, limit used is 5 >> >> > > > > > > > Split number 1 Defined by IS >> >> > > > > > > > KSP Object: (fieldsplit_wp_) 4 MPI processes >> >> > > > > > > > type: preonly >> >> > > > > > > > maximum iterations=10000, initial guess is zero >> >> > > > > > > > tolerances: relative=1e-05, absolute=1e-50, >> >> divergence=10000 >> >> > > > > > > > left preconditioning >> >> > > > > > > > using NONE norm type for convergence test >> >> > > > > > > > PC Object: (fieldsplit_wp_) 4 MPI processes >> >> > > > > > > > type: lu >> >> > > > > > > > LU: out-of-place factorization >> >> > > > > > > > tolerance for zero pivot 2.22045e-14 >> >> > > > > > > > matrix ordering: natural >> >> > > > > > > > factor fill ratio given 0, needed 0 >> >> > > > > > > > Factored matrix follows: >> >> > > > > > > > Mat Object: 4 MPI processes >> >> > > > > > > > type: mpiaij >> >> > > > > > > > rows=34141, cols=34141 >> >> > > > > > > > package used to perform factorization: >> pastix >> >> > > > > > > > Error : -nan >> >> > > > > > > > Error : -nan >> >> > > > > > > > total: nonzeros=0, allocated nonzeros=0 >> >> > > > > > > > Error : -nan >> >> > > > > > > > total number of mallocs used during MatSetValues >> calls =0 >> >> > > > > > > > PaStiX run parameters: >> >> > > > > > > > Matrix type : >> >> Symmetric >> >> > > > > > > > Level of printing (0,1,2): 0 >> >> > > > > > > > Number of refinements iterations : 0 >> >> > > > > > > > Error : -nan >> >> > > > > > > > linear system matrix = precond matrix: >> >> > > > > > > > Mat Object: (fieldsplit_wp_) 4 MPI >> processes >> >> > > > > > > > type: mpiaij >> >> > > > > > > > rows=34141, cols=34141 >> >> > > > > > > > total: nonzeros=485655, allocated nonzeros=485655 >> >> > > > > > > > total number of mallocs used during MatSetValues >> >> calls =0 >> >> > > > > > > > not using I-node (on process 0) routines >> >> > > > > > > > linear system matrix = precond matrix: >> >> > > > > > > > Mat Object: 4 MPI processes >> >> > > > > > > > type: mpiaij >> >> > > > > > > > rows=973051, cols=973051 >> >> > > > > > > > total: nonzeros=9.90037e+07, allocated >> >> nonzeros=9.90037e+07 >> >> > > > > > > > total number of mallocs used during MatSetValues >> calls =0 >> >> > > > > > > > using I-node (on process 0) routines: found 78749 >> >> nodes, limit used is 5 >> >> > > > > > > > >> >> > > > > > > > The pattern of convergence gives a hint that this system >> is >> >> somehow bad/singular. But I don't know why the preconditioned error >> goes up >> >> too high. Anyone has an idea? >> >> > > > > > > > >> >> > > > > > > > Best regards >> >> > > > > > > > Giang Bui >> >> > > > > > > > >> >> > > > > > > >> >> > > > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> >> > > > > >> >> > > > >> >> > > > >> >> > > >> >> > > >> >> > >> >> > >> >> >> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Lukasz.Kaczmarczyk at glasgow.ac.uk Wed May 3 02:53:46 2017 From: Lukasz.Kaczmarczyk at glasgow.ac.uk (Lukasz Kaczmarczyk) Date: Wed, 3 May 2017 07:53:46 +0000 Subject: [petsc-users] strange convergence In-Reply-To: References: <7891536D-91FE-4BFF-8DAD-CE7AB85A4E57@mcs.anl.gov> <425BBB58-9721-49F3-8C86-940F08E925F7@mcs.anl.gov> <42EB791A-40C2-439F-A5F7-5F8C15CECA6F@mcs.anl.gov> <82193784-B4C4-47D7-80EA-25F549C9091B@mcs.anl.gov> <87wpa3wd5j.fsf@jedbrown.org> Message-ID: <80368283-C55F-49AD-B986-83AD0CD72338@glasgow.ac.uk> On 3 May 2017, at 08:29, Hoang Giang Bui > wrote: Dear Jed If I understood you correctly you suggest to avoid penalty by using the Lagrange multiplier for the mortar constraint? In this case it leads to the use of discrete Lagrange multiplier space. Do you or anyone already have experience using discrete Lagrange multiplier space with Petsc? There is also similar question on stackexchange https://scicomp.stackexchange.com/questions/25113/preconditioners-and-discrete-lagrange-multipliers Hello, FIELDSPLIT solver can help with this. We apply this for slightly different problem, but with Lagrange multipliers, see this http://mofem.eng.gla.ac.uk/mofem/html/cell__forces_8cpp.html we working as well on mortar contact, but at development stage we use LU https://doi.org/10.5281/zenodo.439739 Hope that will be somehow helpful, Lukasz Giang On Sat, Apr 29, 2017 at 3:34 PM, Jed Brown > wrote: Hoang Giang Bui > writes: > Hi Barry > > The first block is from a standard solid mechanics discretization based on > balance of momentum equation. There is some material involved but in > principal it's well-posed elasticity equation with positive definite > tangent operator. The "gluing business" uses the mortar method to keep the > continuity of displacement. Instead of using Lagrange multiplier to treat > the constraint I used penalty method to penalize the energy. The > discretization form of mortar is quite simple > > \int_{\Gamma_1} { rho * (\delta u_1 - \delta u_2) * (u_1 - u_2) dA } > > rho is penalty parameter. In the simulation I initially set it low (~E) to > preserve the conditioning of the system. There are two things that can go wrong here with AMG: * The penalty term can mess up the strength of connection heuristics such that you get poor choice of C-points (classical AMG like BoomerAMG) or poor choice of aggregates (smoothed aggregation). * The penalty term can prevent Jacobi smoothing from being effective; in this case, it can lead to poor coarse basis functions (higher energy than they should be) and poor smoothing in an MG cycle. You can fix the poor smoothing in the MG cycle by using a stronger smoother, like ASM with some overlap. I'm generally not a fan of penalty methods due to the irritating tradeoffs and often poor solver performance. > In the figure below, the colorful blocks are u_1 and the base is u_2. Both > u_1 and u_2 use isoparametric quadratic approximation. > > ? > Snapshot.png > > ??? > > Giang > > On Fri, Apr 28, 2017 at 6:21 PM, Barry Smith > wrote: > >> >> Ok, so boomerAMG algebraic multigrid is not good for the first block. >> You mentioned the first block has two things glued together? AMG is >> fantastic for certain problems but doesn't work for everything. >> >> Tell us more about the first block, what PDE it comes from, what >> discretization, and what the "gluing business" is and maybe we'll have >> suggestions for how to precondition it. 
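One concrete way to try the stronger (ASM with overlap) smoother Jed suggests above for the u block is to switch that split from hypre to PETSc's native GAMG, whose level smoothers are ordinary KSP/PC objects selectable from the options database. The options below are only a sketch of that experiment, not settings used anywhere in this thread:

  -fieldsplit_u_pc_type gamg
  -fieldsplit_u_mg_levels_ksp_type richardson
  -fieldsplit_u_mg_levels_ksp_max_it 1
  -fieldsplit_u_mg_levels_pc_type asm
  -fieldsplit_u_mg_levels_pc_asm_overlap 1
  -fieldsplit_u_mg_levels_sub_pc_type ilu

If the penalty rows are also polluting the strength-of-connection heuristics, the coarsening threshold (-fieldsplit_u_pc_gamg_threshold) is another knob that can be experimented with.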
>> >> Barry >> >> > On Apr 28, 2017, at 3:56 AM, Hoang Giang Bui > wrote: >> > >> > It's in fact quite good >> > >> > Residual norms for fieldsplit_u_ solve. >> > 0 KSP Residual norm 4.014715925568e+00 >> > 1 KSP Residual norm 2.160497019264e-10 >> > Residual norms for fieldsplit_wp_ solve. >> > 0 KSP Residual norm 0.000000000000e+00 >> > 0 KSP preconditioned resid norm 4.014715925568e+00 true resid norm >> 9.006493082896e+06 ||r(i)||/||b|| 1.000000000000e+00 >> > Residual norms for fieldsplit_u_ solve. >> > 0 KSP Residual norm 9.999999999416e-01 >> > 1 KSP Residual norm 7.118380416383e-11 >> > Residual norms for fieldsplit_wp_ solve. >> > 0 KSP Residual norm 0.000000000000e+00 >> > 1 KSP preconditioned resid norm 1.701150951035e-10 true resid norm >> 5.494262251846e-04 ||r(i)||/||b|| 6.100334726599e-11 >> > Linear solve converged due to CONVERGED_ATOL iterations 1 >> > >> > Giang >> > >> > On Thu, Apr 27, 2017 at 5:25 PM, Barry Smith > wrote: >> > >> > Run again using LU on both blocks to see what happens. >> > >> > >> > > On Apr 27, 2017, at 2:14 AM, Hoang Giang Bui > >> wrote: >> > > >> > > I have changed the way to tie the nonconforming mesh. It seems the >> matrix now is better >> > > >> > > with -pc_type lu the output is >> > > 0 KSP preconditioned resid norm 3.308678584240e-01 true resid norm >> 9.006493082896e+06 ||r(i)||/||b|| 1.000000000000e+00 >> > > 1 KSP preconditioned resid norm 2.004313395301e-12 true resid norm >> 2.549872332830e-05 ||r(i)||/||b|| 2.831148938173e-12 >> > > Linear solve converged due to CONVERGED_ATOL iterations 1 >> > > >> > > >> > > with -pc_type fieldsplit -fieldsplit_u_pc_type hypre >> -fieldsplit_wp_pc_type lu the convergence is slow >> > > 0 KSP preconditioned resid norm 1.116302362553e-01 true resid norm >> 9.006493083520e+06 ||r(i)||/||b|| 1.000000000000e+00 >> > > 1 KSP preconditioned resid norm 2.582134825666e-02 true resid norm >> 9.268347719866e+06 ||r(i)||/||b|| 1.029073984060e+00 >> > > ... >> > > 824 KSP preconditioned resid norm 1.018542387738e-09 true resid norm >> 2.906608839310e+02 ||r(i)||/||b|| 3.227237074804e-05 >> > > 825 KSP preconditioned resid norm 9.743727947637e-10 true resid norm >> 2.820369993061e+02 ||r(i)||/||b|| 3.131485215062e-05 >> > > Linear solve converged due to CONVERGED_ATOL iterations 825 >> > > >> > > checking with additional -fieldsplit_u_ksp_type richardson >> -fieldsplit_u_ksp_monitor -fieldsplit_u_ksp_max_it 1 >> -fieldsplit_wp_ksp_type richardson -fieldsplit_wp_ksp_monitor >> -fieldsplit_wp_ksp_max_it 1 gives >> > > >> > > 0 KSP preconditioned resid norm 1.116302362553e-01 true resid norm >> 9.006493083520e+06 ||r(i)||/||b|| 1.000000000000e+00 >> > > Residual norms for fieldsplit_u_ solve. >> > > 0 KSP Residual norm 5.803507549280e-01 >> > > 1 KSP Residual norm 2.069538175950e-01 >> > > Residual norms for fieldsplit_wp_ solve. >> > > 0 KSP Residual norm 0.000000000000e+00 >> > > 1 KSP preconditioned resid norm 2.582134825666e-02 true resid norm >> 9.268347719866e+06 ||r(i)||/||b|| 1.029073984060e+00 >> > > Residual norms for fieldsplit_u_ solve. >> > > 0 KSP Residual norm 7.831796195225e-01 >> > > 1 KSP Residual norm 1.734608520110e-01 >> > > Residual norms for fieldsplit_wp_ solve. >> > > 0 KSP Residual norm 0.000000000000e+00 >> > > .... >> > > 823 KSP preconditioned resid norm 1.065070135605e-09 true resid norm >> 3.081881356833e+02 ||r(i)||/||b|| 3.421843916665e-05 >> > > Residual norms for fieldsplit_u_ solve. 
>> > > 0 KSP Residual norm 6.113806394327e-01 >> > > 1 KSP Residual norm 1.535465290944e-01 >> > > Residual norms for fieldsplit_wp_ solve. >> > > 0 KSP Residual norm 0.000000000000e+00 >> > > 824 KSP preconditioned resid norm 1.018542387746e-09 true resid norm >> 2.906608839353e+02 ||r(i)||/||b|| 3.227237074851e-05 >> > > Residual norms for fieldsplit_u_ solve. >> > > 0 KSP Residual norm 6.123437055586e-01 >> > > 1 KSP Residual norm 1.524661826133e-01 >> > > Residual norms for fieldsplit_wp_ solve. >> > > 0 KSP Residual norm 0.000000000000e+00 >> > > 825 KSP preconditioned resid norm 9.743727947718e-10 true resid norm >> 2.820369990571e+02 ||r(i)||/||b|| 3.131485212298e-05 >> > > Linear solve converged due to CONVERGED_ATOL iterations 825 >> > > >> > > >> > > The residual for wp block is zero since in this first step the rhs is >> zero. As can see in the output, the multigrid does not perform well to >> reduce the residual in the sub-solve. Is my observation right? what can be >> done to improve this? >> > > >> > > >> > > Giang >> > > >> > > On Tue, Apr 25, 2017 at 12:17 AM, Barry Smith > >> wrote: >> > > >> > > This can happen in the matrix is singular or nearly singular or if >> the factorization generates small pivots, which can occur for even >> nonsingular problems if the matrix is poorly scaled or just plain nasty. >> > > >> > > >> > > > On Apr 24, 2017, at 5:10 PM, Hoang Giang Bui > >> wrote: >> > > > >> > > > It took a while, here I send you the output >> > > > >> > > > 0 KSP preconditioned resid norm 3.129073545457e+05 true resid norm >> 9.015150492169e+06 ||r(i)||/||b|| 1.000000000000e+00 >> > > > 1 KSP preconditioned resid norm 7.442444222843e-01 true resid norm >> 1.003356247696e+02 ||r(i)||/||b|| 1.112966720375e-05 >> > > > 2 KSP preconditioned resid norm 3.267453132529e-07 true resid norm >> 3.216722968300e+01 ||r(i)||/||b|| 3.568130084011e-06 >> > > > 3 KSP preconditioned resid norm 1.155046883816e-11 true resid norm >> 3.234460376820e+01 ||r(i)||/||b|| 3.587805194854e-06 >> > > > Linear solve converged due to CONVERGED_ATOL iterations 3 >> > > > KSP Object: 4 MPI processes >> > > > type: gmres >> > > > GMRES: restart=1000, using Modified Gram-Schmidt >> Orthogonalization >> > > > GMRES: happy breakdown tolerance 1e-30 >> > > > maximum iterations=1000, initial guess is zero >> > > > tolerances: relative=1e-20, absolute=1e-09, divergence=10000 >> > > > left preconditioning >> > > > using PRECONDITIONED norm type for convergence test >> > > > PC Object: 4 MPI processes >> > > > type: lu >> > > > LU: out-of-place factorization >> > > > tolerance for zero pivot 2.22045e-14 >> > > > matrix ordering: natural >> > > > factor fill ratio given 0, needed 0 >> > > > Factored matrix follows: >> > > > Mat Object: 4 MPI processes >> > > > type: mpiaij >> > > > rows=973051, cols=973051 >> > > > package used to perform factorization: pastix >> > > > Error : 3.24786e-14 >> > > > total: nonzeros=0, allocated nonzeros=0 >> > > > total number of mallocs used during MatSetValues calls =0 >> > > > PaStiX run parameters: >> > > > Matrix type : Unsymmetric >> > > > Level of printing (0,1,2): 0 >> > > > Number of refinements iterations : 3 >> > > > Error : 3.24786e-14 >> > > > linear system matrix = precond matrix: >> > > > Mat Object: 4 MPI processes >> > > > type: mpiaij >> > > > rows=973051, cols=973051 >> > > > Error : 3.24786e-14 >> > > > total: nonzeros=9.90037e+07, allocated nonzeros=9.90037e+07 >> > > > total number of mallocs used during MatSetValues calls =0 >> > > > using I-node (on 
process 0) routines: found 78749 nodes, limit >> used is 5 >> > > > Error : 3.24786e-14 >> > > > >> > > > It doesn't do as you said. Something is not right here. I will look >> in depth. >> > > > >> > > > Giang >> > > > >> > > > On Mon, Apr 24, 2017 at 8:21 PM, Barry Smith > >> wrote: >> > > > >> > > > > On Apr 24, 2017, at 12:47 PM, Hoang Giang Bui > >> wrote: >> > > > > >> > > > > Good catch. I get this for the very first step, maybe at that time >> the rhs_w is zero. >> > > > >> > > > With the multiplicative composition the right hand side of the >> second solve is the initial right hand side of the second solve minus >> A_10*x where x is the solution to the first sub solve and A_10 is the lower >> left block of the outer matrix. So unless both the initial right hand side >> has a zero for the second block and A_10 is identically zero the right hand >> side for the second sub solve should not be zero. Is A_10 == 0? >> > > > >> > > > >> > > > > In the later step, it shows 2 step convergence >> > > > > >> > > > > Residual norms for fieldsplit_u_ solve. >> > > > > 0 KSP Residual norm 3.165886479830e+04 >> > > > > 1 KSP Residual norm 2.905922877684e-01 >> > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > 0 KSP Residual norm 2.397669419027e-01 >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> > > > > 0 KSP preconditioned resid norm 3.165886479920e+04 true resid >> norm 7.963616922323e+05 ||r(i)||/||b|| 1.000000000000e+00 >> > > > > Residual norms for fieldsplit_u_ solve. >> > > > > 0 KSP Residual norm 9.999891813771e-01 >> > > > > 1 KSP Residual norm 1.512000395579e-05 >> > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > 0 KSP Residual norm 8.192702188243e-06 >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> > > > > 1 KSP preconditioned resid norm 5.252183822848e-02 true resid >> norm 7.135927677844e+04 ||r(i)||/||b|| 8.960661653427e-02 >> > > > >> > > > The outer residual norms are still wonky, the preconditioned >> residual norm goes from 3.165886479920e+04 to 5.252183822848e-02 which is a >> huge drop but the 7.963616922323e+05 drops very much less >> 7.135927677844e+04. This is not normal. >> > > > >> > > > What if you just use -pc_type lu for the entire system (no >> fieldsplit), does the true residual drop to almost zero in the first >> iteration (as it should?). Send the output. >> > > > >> > > > >> > > > >> > > > > Residual norms for fieldsplit_u_ solve. >> > > > > 0 KSP Residual norm 6.946213936597e-01 >> > > > > 1 KSP Residual norm 1.195514007343e-05 >> > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > 0 KSP Residual norm 1.025694497535e+00 >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> > > > > 2 KSP preconditioned resid norm 8.785709535405e-03 true resid >> norm 1.419341799277e+04 ||r(i)||/||b|| 1.782282866091e-02 >> > > > > Residual norms for fieldsplit_u_ solve. >> > > > > 0 KSP Residual norm 7.255149996405e-01 >> > > > > 1 KSP Residual norm 6.583512434218e-06 >> > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > 0 KSP Residual norm 1.015229700337e+00 >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> > > > > 3 KSP preconditioned resid norm 7.110407712709e-04 true resid >> norm 5.284940654154e+02 ||r(i)||/||b|| 6.636357205153e-04 >> > > > > Residual norms for fieldsplit_u_ solve. >> > > > > 0 KSP Residual norm 3.512243341400e-01 >> > > > > 1 KSP Residual norm 2.032490351200e-06 >> > > > > Residual norms for fieldsplit_wp_ solve. 
>> > > > > 0 KSP Residual norm 1.282327290982e+00 >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> > > > > 4 KSP preconditioned resid norm 3.482036620521e-05 true resid >> norm 4.291231924307e+01 ||r(i)||/||b|| 5.388546393133e-05 >> > > > > Residual norms for fieldsplit_u_ solve. >> > > > > 0 KSP Residual norm 3.423609338053e-01 >> > > > > 1 KSP Residual norm 4.213703301972e-07 >> > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > 0 KSP Residual norm 1.157384757538e+00 >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> > > > > 5 KSP preconditioned resid norm 1.203470314534e-06 true resid >> norm 4.544956156267e+00 ||r(i)||/||b|| 5.707150658550e-06 >> > > > > Residual norms for fieldsplit_u_ solve. >> > > > > 0 KSP Residual norm 3.838596289995e-01 >> > > > > 1 KSP Residual norm 9.927864176103e-08 >> > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > 0 KSP Residual norm 1.066298905618e+00 >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> > > > > 6 KSP preconditioned resid norm 3.331619244266e-08 true resid >> norm 2.821511729024e+00 ||r(i)||/||b|| 3.543002829675e-06 >> > > > > Residual norms for fieldsplit_u_ solve. >> > > > > 0 KSP Residual norm 4.624964188094e-01 >> > > > > 1 KSP Residual norm 6.418229775372e-08 >> > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > 0 KSP Residual norm 9.800784311614e-01 >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> > > > > 7 KSP preconditioned resid norm 8.788046233297e-10 true resid >> norm 2.849209671705e+00 ||r(i)||/||b|| 3.577783436215e-06 >> > > > > Linear solve converged due to CONVERGED_ATOL iterations 7 >> > > > > >> > > > > The outer operator is an explicit matrix. >> > > > > >> > > > > Giang >> > > > > >> > > > > On Mon, Apr 24, 2017 at 7:32 PM, Barry Smith > >> wrote: >> > > > > >> > > > > > On Apr 24, 2017, at 3:16 AM, Hoang Giang Bui > >> wrote: >> > > > > > >> > > > > > Thanks Barry, trying with -fieldsplit_u_type lu gives better >> convergence. I still used 4 procs though, probably with 1 proc it should >> also be the same. >> > > > > > >> > > > > > The u block used a Nitsche-type operator to connect two >> non-matching domains. I don't think it will leave some rigid body motion >> leads to not sufficient constraints. Maybe you have other idea? >> > > > > > >> > > > > > Residual norms for fieldsplit_u_ solve. >> > > > > > 0 KSP Residual norm 3.129067184300e+05 >> > > > > > 1 KSP Residual norm 5.906261468196e-01 >> > > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > > 0 KSP Residual norm 0.000000000000e+00 >> > > > > >> > > > > ^^^^ something is wrong here. The sub solve should not be >> starting with a 0 residual (this means the right hand side for this sub >> solve is zero which it should not be). >> > > > > >> > > > > > FieldSplit with MULTIPLICATIVE composition: total splits = 2 >> > > > > >> > > > > >> > > > > How are you providing the outer operator? As an explicit matrix >> or with some shell matrix? >> > > > > >> > > > > >> > > > > >> > > > > > 0 KSP preconditioned resid norm 3.129067184300e+05 true resid >> norm 9.015150492169e+06 ||r(i)||/||b|| 1.000000000000e+00 >> > > > > > Residual norms for fieldsplit_u_ solve. >> > > > > > 0 KSP Residual norm 9.999955993437e-01 >> > > > > > 1 KSP Residual norm 4.019774691831e-06 >> > > > > > Residual norms for fieldsplit_wp_ solve. 
>> > > > > > 0 KSP Residual norm 0.000000000000e+00 >> > > > > > 1 KSP preconditioned resid norm 5.003913641475e-01 true resid >> norm 4.692996324114e+01 ||r(i)||/||b|| 5.205677185522e-06 >> > > > > > Residual norms for fieldsplit_u_ solve. >> > > > > > 0 KSP Residual norm 1.000012180204e+00 >> > > > > > 1 KSP Residual norm 1.017367950422e-05 >> > > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > > 0 KSP Residual norm 0.000000000000e+00 >> > > > > > 2 KSP preconditioned resid norm 2.330910333756e-07 true resid >> norm 3.474855463983e+01 ||r(i)||/||b|| 3.854461960453e-06 >> > > > > > Residual norms for fieldsplit_u_ solve. >> > > > > > 0 KSP Residual norm 1.000004200085e+00 >> > > > > > 1 KSP Residual norm 6.231613102458e-06 >> > > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > > 0 KSP Residual norm 0.000000000000e+00 >> > > > > > 3 KSP preconditioned resid norm 8.671259838389e-11 true resid >> norm 3.545103468011e+01 ||r(i)||/||b|| 3.932384125024e-06 >> > > > > > Linear solve converged due to CONVERGED_ATOL iterations 3 >> > > > > > KSP Object: 4 MPI processes >> > > > > > type: gmres >> > > > > > GMRES: restart=1000, using Modified Gram-Schmidt >> Orthogonalization >> > > > > > GMRES: happy breakdown tolerance 1e-30 >> > > > > > maximum iterations=1000, initial guess is zero >> > > > > > tolerances: relative=1e-20, absolute=1e-09, divergence=10000 >> > > > > > left preconditioning >> > > > > > using PRECONDITIONED norm type for convergence test >> > > > > > PC Object: 4 MPI processes >> > > > > > type: fieldsplit >> > > > > > FieldSplit with MULTIPLICATIVE composition: total splits = 2 >> > > > > > Solver info for each split is in the following KSP objects: >> > > > > > Split number 0 Defined by IS >> > > > > > KSP Object: (fieldsplit_u_) 4 MPI processes >> > > > > > type: richardson >> > > > > > Richardson: damping factor=1 >> > > > > > maximum iterations=1, initial guess is zero >> > > > > > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000 >> > > > > > left preconditioning >> > > > > > using PRECONDITIONED norm type for convergence test >> > > > > > PC Object: (fieldsplit_u_) 4 MPI processes >> > > > > > type: lu >> > > > > > LU: out-of-place factorization >> > > > > > tolerance for zero pivot 2.22045e-14 >> > > > > > matrix ordering: natural >> > > > > > factor fill ratio given 0, needed 0 >> > > > > > Factored matrix follows: >> > > > > > Mat Object: 4 MPI processes >> > > > > > type: mpiaij >> > > > > > rows=938910, cols=938910 >> > > > > > package used to perform factorization: pastix >> > > > > > total: nonzeros=0, allocated nonzeros=0 >> > > > > > Error : 3.36878e-14 >> > > > > > total number of mallocs used during MatSetValues calls >> =0 >> > > > > > PaStiX run parameters: >> > > > > > Matrix type : Unsymmetric >> > > > > > Level of printing (0,1,2): 0 >> > > > > > Number of refinements iterations : 3 >> > > > > > Error : 3.36878e-14 >> > > > > > linear system matrix = precond matrix: >> > > > > > Mat Object: (fieldsplit_u_) 4 MPI processes >> > > > > > type: mpiaij >> > > > > > rows=938910, cols=938910, bs=3 >> > > > > > Error : 3.36878e-14 >> > > > > > Error : 3.36878e-14 >> > > > > > total: nonzeros=8.60906e+07, allocated >> nonzeros=8.60906e+07 >> > > > > > total number of mallocs used during MatSetValues calls =0 >> > > > > > using I-node (on process 0) routines: found 78749 >> nodes, limit used is 5 >> > > > > > Split number 1 Defined by IS >> > > > > > KSP Object: (fieldsplit_wp_) 4 MPI processes >> > > > > > type: richardson >> > > > > 
> Richardson: damping factor=1 >> > > > > > maximum iterations=1, initial guess is zero >> > > > > > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000 >> > > > > > left preconditioning >> > > > > > using PRECONDITIONED norm type for convergence test >> > > > > > PC Object: (fieldsplit_wp_) 4 MPI processes >> > > > > > type: lu >> > > > > > LU: out-of-place factorization >> > > > > > tolerance for zero pivot 2.22045e-14 >> > > > > > matrix ordering: natural >> > > > > > factor fill ratio given 0, needed 0 >> > > > > > Factored matrix follows: >> > > > > > Mat Object: 4 MPI processes >> > > > > > type: mpiaij >> > > > > > rows=34141, cols=34141 >> > > > > > package used to perform factorization: pastix >> > > > > > Error : -nan >> > > > > > Error : -nan >> > > > > > Error : -nan >> > > > > > total: nonzeros=0, allocated nonzeros=0 >> > > > > > total number of mallocs used during MatSetValues >> calls =0 >> > > > > > PaStiX run parameters: >> > > > > > Matrix type : Symmetric >> > > > > > Level of printing (0,1,2): 0 >> > > > > > Number of refinements iterations : 0 >> > > > > > Error : -nan >> > > > > > linear system matrix = precond matrix: >> > > > > > Mat Object: (fieldsplit_wp_) 4 MPI processes >> > > > > > type: mpiaij >> > > > > > rows=34141, cols=34141 >> > > > > > total: nonzeros=485655, allocated nonzeros=485655 >> > > > > > total number of mallocs used during MatSetValues calls =0 >> > > > > > not using I-node (on process 0) routines >> > > > > > linear system matrix = precond matrix: >> > > > > > Mat Object: 4 MPI processes >> > > > > > type: mpiaij >> > > > > > rows=973051, cols=973051 >> > > > > > total: nonzeros=9.90037e+07, allocated nonzeros=9.90037e+07 >> > > > > > total number of mallocs used during MatSetValues calls =0 >> > > > > > using I-node (on process 0) routines: found 78749 nodes, >> limit used is 5 >> > > > > > >> > > > > > >> > > > > > >> > > > > > Giang >> > > > > > >> > > > > > On Sun, Apr 23, 2017 at 10:19 PM, Barry Smith < >> bsmith at mcs.anl.gov> wrote: >> > > > > > >> > > > > > > On Apr 23, 2017, at 2:42 PM, Hoang Giang Bui < >> hgbk2008 at gmail.com> wrote: >> > > > > > > >> > > > > > > Dear Matt/Barry >> > > > > > > >> > > > > > > With your options, it results in >> > > > > > > >> > > > > > > 0 KSP preconditioned resid norm 1.106709687386e+31 true >> resid norm 9.015150491938e+06 ||r(i)||/||b|| 1.000000000000e+00 >> > > > > > > Residual norms for fieldsplit_u_ solve. >> > > > > > > 0 KSP Residual norm 2.407308987203e+36 >> > > > > > > 1 KSP Residual norm 5.797185652683e+72 >> > > > > > >> > > > > > It looks like Matt is right, hypre is seemly producing useless >> garbage. >> > > > > > >> > > > > > First how do things run on one process. If you have similar >> problems then debug on one process (debugging any kind of problem is always >> far easy on one process). >> > > > > > >> > > > > > First run with -fieldsplit_u_type lu (instead of using hypre) to >> see if that works or also produces something bad. >> > > > > > >> > > > > > What is the operator and the boundary conditions for u? It could >> be singular. >> > > > > > >> > > > > > >> > > > > > >> > > > > > >> > > > > > >> > > > > > >> > > > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > > > 0 KSP Residual norm 0.000000000000e+00 >> > > > > > > ... >> > > > > > > 999 KSP preconditioned resid norm 2.920157329174e+12 true >> resid norm 9.015683504616e+06 ||r(i)||/||b|| 1.000059124102e+00 >> > > > > > > Residual norms for fieldsplit_u_ solve. 
>> > > > > > > 0 KSP Residual norm 1.533726746719e+36 >> > > > > > > 1 KSP Residual norm 3.692757392261e+72 >> > > > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > > > 0 KSP Residual norm 0.000000000000e+00 >> > > > > > > >> > > > > > > Do you suggest that the pastix solver for the "wp" block >> encounters small pivot? In addition, seem like the "u" block is also >> singular. >> > > > > > > >> > > > > > > Giang >> > > > > > > >> > > > > > > On Sun, Apr 23, 2017 at 7:39 PM, Barry Smith < >> bsmith at mcs.anl.gov> wrote: >> > > > > > > >> > > > > > > Huge preconditioned norms but normal unpreconditioned norms >> almost always come from a very small pivot in an LU or ILU factorization. >> > > > > > > >> > > > > > > The first thing to do is monitor the two sub solves. Run >> with the additional options -fieldsplit_u_ksp_type richardson >> -fieldsplit_u_ksp_monitor -fieldsplit_u_ksp_max_it 1 >> -fieldsplit_wp_ksp_type richardson -fieldsplit_wp_ksp_monitor >> -fieldsplit_wp_ksp_max_it 1 >> > > > > > > >> > > > > > > > On Apr 23, 2017, at 12:22 PM, Hoang Giang Bui < >> hgbk2008 at gmail.com> wrote: >> > > > > > > > >> > > > > > > > Hello >> > > > > > > > >> > > > > > > > I encountered a strange convergence behavior that I have >> trouble to understand >> > > > > > > > >> > > > > > > > KSPSetFromOptions completed >> > > > > > > > 0 KSP preconditioned resid norm 1.106709687386e+31 true >> resid norm 9.015150491938e+06 ||r(i)||/||b|| 1.000000000000e+00 >> > > > > > > > 1 KSP preconditioned resid norm 2.933141742664e+29 true >> resid norm 9.015152282123e+06 ||r(i)||/||b|| 1.000000198575e+00 >> > > > > > > > 2 KSP preconditioned resid norm 9.686409637174e+16 true >> resid norm 9.015354521944e+06 ||r(i)||/||b|| 1.000022631902e+00 >> > > > > > > > 3 KSP preconditioned resid norm 4.219243615809e+15 true >> resid norm 9.017157702420e+06 ||r(i)||/||b|| 1.000222648583e+00 >> > > > > > > > ..... 
>> > > > > > > > 999 KSP preconditioned resid norm 3.043754298076e+12 true >> resid norm 9.015425041089e+06 ||r(i)||/||b|| 1.000030454195e+00 >> > > > > > > > 1000 KSP preconditioned resid norm 3.043000287819e+12 true >> resid norm 9.015424313455e+06 ||r(i)||/||b|| 1.000030373483e+00 >> > > > > > > > Linear solve did not converge due to DIVERGED_ITS iterations >> 1000 >> > > > > > > > KSP Object: 4 MPI processes >> > > > > > > > type: gmres >> > > > > > > > GMRES: restart=1000, using Modified Gram-Schmidt >> Orthogonalization >> > > > > > > > GMRES: happy breakdown tolerance 1e-30 >> > > > > > > > maximum iterations=1000, initial guess is zero >> > > > > > > > tolerances: relative=1e-20, absolute=1e-09, >> divergence=10000 >> > > > > > > > left preconditioning >> > > > > > > > using PRECONDITIONED norm type for convergence test >> > > > > > > > PC Object: 4 MPI processes >> > > > > > > > type: fieldsplit >> > > > > > > > FieldSplit with MULTIPLICATIVE composition: total splits >> = 2 >> > > > > > > > Solver info for each split is in the following KSP >> objects: >> > > > > > > > Split number 0 Defined by IS >> > > > > > > > KSP Object: (fieldsplit_u_) 4 MPI processes >> > > > > > > > type: preonly >> > > > > > > > maximum iterations=10000, initial guess is zero >> > > > > > > > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000 >> > > > > > > > left preconditioning >> > > > > > > > using NONE norm type for convergence test >> > > > > > > > PC Object: (fieldsplit_u_) 4 MPI processes >> > > > > > > > type: hypre >> > > > > > > > HYPRE BoomerAMG preconditioning >> > > > > > > > HYPRE BoomerAMG: Cycle type V >> > > > > > > > HYPRE BoomerAMG: Maximum number of levels 25 >> > > > > > > > HYPRE BoomerAMG: Maximum number of iterations PER >> hypre call 1 >> > > > > > > > HYPRE BoomerAMG: Convergence tolerance PER hypre >> call 0 >> > > > > > > > HYPRE BoomerAMG: Threshold for strong coupling 0.6 >> > > > > > > > HYPRE BoomerAMG: Interpolation truncation factor 0 >> > > > > > > > HYPRE BoomerAMG: Interpolation: max elements per row >> 0 >> > > > > > > > HYPRE BoomerAMG: Number of levels of aggressive >> coarsening 0 >> > > > > > > > HYPRE BoomerAMG: Number of paths for aggressive >> coarsening 1 >> > > > > > > > HYPRE BoomerAMG: Maximum row sums 0.9 >> > > > > > > > HYPRE BoomerAMG: Sweeps down 1 >> > > > > > > > HYPRE BoomerAMG: Sweeps up 1 >> > > > > > > > HYPRE BoomerAMG: Sweeps on coarse 1 >> > > > > > > > HYPRE BoomerAMG: Relax down >> symmetric-SOR/Jacobi >> > > > > > > > HYPRE BoomerAMG: Relax up >> symmetric-SOR/Jacobi >> > > > > > > > HYPRE BoomerAMG: Relax on coarse >> Gaussian-elimination >> > > > > > > > HYPRE BoomerAMG: Relax weight (all) 1 >> > > > > > > > HYPRE BoomerAMG: Outer relax weight (all) 1 >> > > > > > > > HYPRE BoomerAMG: Using CF-relaxation >> > > > > > > > HYPRE BoomerAMG: Measure type local >> > > > > > > > HYPRE BoomerAMG: Coarsen type PMIS >> > > > > > > > HYPRE BoomerAMG: Interpolation type classical >> > > > > > > > linear system matrix = precond matrix: >> > > > > > > > Mat Object: (fieldsplit_u_) 4 MPI processes >> > > > > > > > type: mpiaij >> > > > > > > > rows=938910, cols=938910, bs=3 >> > > > > > > > total: nonzeros=8.60906e+07, allocated >> nonzeros=8.60906e+07 >> > > > > > > > total number of mallocs used during MatSetValues >> calls =0 >> > > > > > > > using I-node (on process 0) routines: found 78749 >> nodes, limit used is 5 >> > > > > > > > Split number 1 Defined by IS >> > > > > > > > KSP Object: (fieldsplit_wp_) 4 MPI processes >> > > > > 
> > > type: preonly >> > > > > > > > maximum iterations=10000, initial guess is zero >> > > > > > > > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000 >> > > > > > > > left preconditioning >> > > > > > > > using NONE norm type for convergence test >> > > > > > > > PC Object: (fieldsplit_wp_) 4 MPI processes >> > > > > > > > type: lu >> > > > > > > > LU: out-of-place factorization >> > > > > > > > tolerance for zero pivot 2.22045e-14 >> > > > > > > > matrix ordering: natural >> > > > > > > > factor fill ratio given 0, needed 0 >> > > > > > > > Factored matrix follows: >> > > > > > > > Mat Object: 4 MPI processes >> > > > > > > > type: mpiaij >> > > > > > > > rows=34141, cols=34141 >> > > > > > > > package used to perform factorization: pastix >> > > > > > > > Error : -nan >> > > > > > > > Error : -nan >> > > > > > > > total: nonzeros=0, allocated nonzeros=0 >> > > > > > > > Error : -nan >> > > > > > > > total number of mallocs used during MatSetValues calls =0 >> > > > > > > > PaStiX run parameters: >> > > > > > > > Matrix type : >> Symmetric >> > > > > > > > Level of printing (0,1,2): 0 >> > > > > > > > Number of refinements iterations : 0 >> > > > > > > > Error : -nan >> > > > > > > > linear system matrix = precond matrix: >> > > > > > > > Mat Object: (fieldsplit_wp_) 4 MPI processes >> > > > > > > > type: mpiaij >> > > > > > > > rows=34141, cols=34141 >> > > > > > > > total: nonzeros=485655, allocated nonzeros=485655 >> > > > > > > > total number of mallocs used during MatSetValues >> calls =0 >> > > > > > > > not using I-node (on process 0) routines >> > > > > > > > linear system matrix = precond matrix: >> > > > > > > > Mat Object: 4 MPI processes >> > > > > > > > type: mpiaij >> > > > > > > > rows=973051, cols=973051 >> > > > > > > > total: nonzeros=9.90037e+07, allocated >> nonzeros=9.90037e+07 >> > > > > > > > total number of mallocs used during MatSetValues calls =0 >> > > > > > > > using I-node (on process 0) routines: found 78749 >> nodes, limit used is 5 >> > > > > > > > >> > > > > > > > The pattern of convergence gives a hint that this system is >> somehow bad/singular. But I don't know why the preconditioned error goes up >> too high. Anyone has an idea? >> > > > > > > > >> > > > > > > > Best regards >> > > > > > > > Giang Bui >> > > > > > > > >> > > > > > > >> > > > > > > >> > > > > > >> > > > > > >> > > > > >> > > > > >> > > > >> > > > >> > > >> > > >> > >> > >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 3 07:22:59 2017 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 3 May 2017 07:22:59 -0500 Subject: [petsc-users] strange convergence In-Reply-To: References: <7891536D-91FE-4BFF-8DAD-CE7AB85A4E57@mcs.anl.gov> <425BBB58-9721-49F3-8C86-940F08E925F7@mcs.anl.gov> <42EB791A-40C2-439F-A5F7-5F8C15CECA6F@mcs.anl.gov> <82193784-B4C4-47D7-80EA-25F549C9091B@mcs.anl.gov> <87wpa3wd5j.fsf@jedbrown.org> Message-ID: On Wed, May 3, 2017 at 2:29 AM, Hoang Giang Bui wrote: > Dear Jed > > If I understood you correctly you suggest to avoid penalty by using the > Lagrange multiplier for the mortar constraint? In this case it leads to the > use of discrete Lagrange multiplier space. > Sorry for being ignorant here, but why is the space "discrete"? It looks like you should have a continuum formulation of the mortar as well. Maybe I do not understand something fundamental. 
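(Only as a generic sketch of what such a discrete multiplier space leads to; the matrices below are illustrative, not the actual operators from this problem. A mortar constraint enforced with a Lagrange multiplier ends up as a saddle point system

    [ K   B^T ] [  u     ]   [ f ]
    [ B    0  ] [ lambda ] = [ g ]

with K the elasticity block and B the discrete mortar coupling assembled from something like \int_{\Gamma_1} { \delta lambda * (u_1 - u_2) dA }. This is the kind of block system PCFIELDSPLIT can treat in Schur mode, e.g. -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_fact_type full, provided a reasonable preconditioner for the Schur complement is available.)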
>From this (https://en.wikipedia.org/wiki/Mortar_methods) short description, it seems that mortars begin from a continuum formulation, but are then reduced to the discrete level. This is no problem if done consistently, as for instance in the FETI method where efficient preconditioners exist. Thanks, Matt > Do you or anyone already have experience using discrete Lagrange > multiplier space with Petsc? > > There is also similar question on stackexchange > https://scicomp.stackexchange.com/questions/25113/ > preconditioners-and-discrete-lagrange-multipliers > > Giang > > On Sat, Apr 29, 2017 at 3:34 PM, Jed Brown wrote: > >> Hoang Giang Bui writes: >> >> > Hi Barry >> > >> > The first block is from a standard solid mechanics discretization based >> on >> > balance of momentum equation. There is some material involved but in >> > principal it's well-posed elasticity equation with positive definite >> > tangent operator. The "gluing business" uses the mortar method to keep >> the >> > continuity of displacement. Instead of using Lagrange multiplier to >> treat >> > the constraint I used penalty method to penalize the energy. The >> > discretization form of mortar is quite simple >> > >> > \int_{\Gamma_1} { rho * (\delta u_1 - \delta u_2) * (u_1 - u_2) dA } >> > >> > rho is penalty parameter. In the simulation I initially set it low (~E) >> to >> > preserve the conditioning of the system. >> >> There are two things that can go wrong here with AMG: >> >> * The penalty term can mess up the strength of connection heuristics >> such that you get poor choice of C-points (classical AMG like >> BoomerAMG) or poor choice of aggregates (smoothed aggregation). >> >> * The penalty term can prevent Jacobi smoothing from being effective; in >> this case, it can lead to poor coarse basis functions (higher energy >> than they should be) and poor smoothing in an MG cycle. You can fix >> the poor smoothing in the MG cycle by using a stronger smoother, like >> ASM with some overlap. >> >> I'm generally not a fan of penalty methods due to the irritating >> tradeoffs and often poor solver performance. >> >> > In the figure below, the colorful blocks are u_1 and the base is u_2. >> Both >> > u_1 and u_2 use isoparametric quadratic approximation. >> > >> > ? >> > Snapshot.png >> > > U/view?usp=drive_web> >> > ??? >> > >> > Giang >> > >> > On Fri, Apr 28, 2017 at 6:21 PM, Barry Smith >> wrote: >> > >> >> >> >> Ok, so boomerAMG algebraic multigrid is not good for the first block. >> >> You mentioned the first block has two things glued together? AMG is >> >> fantastic for certain problems but doesn't work for everything. >> >> >> >> Tell us more about the first block, what PDE it comes from, what >> >> discretization, and what the "gluing business" is and maybe we'll have >> >> suggestions for how to precondition it. >> >> >> >> Barry >> >> >> >> > On Apr 28, 2017, at 3:56 AM, Hoang Giang Bui >> wrote: >> >> > >> >> > It's in fact quite good >> >> > >> >> > Residual norms for fieldsplit_u_ solve. >> >> > 0 KSP Residual norm 4.014715925568e+00 >> >> > 1 KSP Residual norm 2.160497019264e-10 >> >> > Residual norms for fieldsplit_wp_ solve. >> >> > 0 KSP Residual norm 0.000000000000e+00 >> >> > 0 KSP preconditioned resid norm 4.014715925568e+00 true resid norm >> >> 9.006493082896e+06 ||r(i)||/||b|| 1.000000000000e+00 >> >> > Residual norms for fieldsplit_u_ solve. >> >> > 0 KSP Residual norm 9.999999999416e-01 >> >> > 1 KSP Residual norm 7.118380416383e-11 >> >> > Residual norms for fieldsplit_wp_ solve. 
>> >> > 0 KSP Residual norm 0.000000000000e+00 >> >> > 1 KSP preconditioned resid norm 1.701150951035e-10 true resid norm >> >> 5.494262251846e-04 ||r(i)||/||b|| 6.100334726599e-11 >> >> > Linear solve converged due to CONVERGED_ATOL iterations 1 >> >> > >> >> > Giang >> >> > >> >> > On Thu, Apr 27, 2017 at 5:25 PM, Barry Smith >> wrote: >> >> > >> >> > Run again using LU on both blocks to see what happens. >> >> > >> >> > >> >> > > On Apr 27, 2017, at 2:14 AM, Hoang Giang Bui >> >> wrote: >> >> > > >> >> > > I have changed the way to tie the nonconforming mesh. It seems the >> >> matrix now is better >> >> > > >> >> > > with -pc_type lu the output is >> >> > > 0 KSP preconditioned resid norm 3.308678584240e-01 true resid >> norm >> >> 9.006493082896e+06 ||r(i)||/||b|| 1.000000000000e+00 >> >> > > 1 KSP preconditioned resid norm 2.004313395301e-12 true resid >> norm >> >> 2.549872332830e-05 ||r(i)||/||b|| 2.831148938173e-12 >> >> > > Linear solve converged due to CONVERGED_ATOL iterations 1 >> >> > > >> >> > > >> >> > > with -pc_type fieldsplit -fieldsplit_u_pc_type hypre >> >> -fieldsplit_wp_pc_type lu the convergence is slow >> >> > > 0 KSP preconditioned resid norm 1.116302362553e-01 true resid >> norm >> >> 9.006493083520e+06 ||r(i)||/||b|| 1.000000000000e+00 >> >> > > 1 KSP preconditioned resid norm 2.582134825666e-02 true resid >> norm >> >> 9.268347719866e+06 ||r(i)||/||b|| 1.029073984060e+00 >> >> > > ... >> >> > > 824 KSP preconditioned resid norm 1.018542387738e-09 true resid >> norm >> >> 2.906608839310e+02 ||r(i)||/||b|| 3.227237074804e-05 >> >> > > 825 KSP preconditioned resid norm 9.743727947637e-10 true resid >> norm >> >> 2.820369993061e+02 ||r(i)||/||b|| 3.131485215062e-05 >> >> > > Linear solve converged due to CONVERGED_ATOL iterations 825 >> >> > > >> >> > > checking with additional -fieldsplit_u_ksp_type richardson >> >> -fieldsplit_u_ksp_monitor -fieldsplit_u_ksp_max_it 1 >> >> -fieldsplit_wp_ksp_type richardson -fieldsplit_wp_ksp_monitor >> >> -fieldsplit_wp_ksp_max_it 1 gives >> >> > > >> >> > > 0 KSP preconditioned resid norm 1.116302362553e-01 true resid >> norm >> >> 9.006493083520e+06 ||r(i)||/||b|| 1.000000000000e+00 >> >> > > Residual norms for fieldsplit_u_ solve. >> >> > > 0 KSP Residual norm 5.803507549280e-01 >> >> > > 1 KSP Residual norm 2.069538175950e-01 >> >> > > Residual norms for fieldsplit_wp_ solve. >> >> > > 0 KSP Residual norm 0.000000000000e+00 >> >> > > 1 KSP preconditioned resid norm 2.582134825666e-02 true resid >> norm >> >> 9.268347719866e+06 ||r(i)||/||b|| 1.029073984060e+00 >> >> > > Residual norms for fieldsplit_u_ solve. >> >> > > 0 KSP Residual norm 7.831796195225e-01 >> >> > > 1 KSP Residual norm 1.734608520110e-01 >> >> > > Residual norms for fieldsplit_wp_ solve. >> >> > > 0 KSP Residual norm 0.000000000000e+00 >> >> > > .... >> >> > > 823 KSP preconditioned resid norm 1.065070135605e-09 true resid >> norm >> >> 3.081881356833e+02 ||r(i)||/||b|| 3.421843916665e-05 >> >> > > Residual norms for fieldsplit_u_ solve. >> >> > > 0 KSP Residual norm 6.113806394327e-01 >> >> > > 1 KSP Residual norm 1.535465290944e-01 >> >> > > Residual norms for fieldsplit_wp_ solve. >> >> > > 0 KSP Residual norm 0.000000000000e+00 >> >> > > 824 KSP preconditioned resid norm 1.018542387746e-09 true resid >> norm >> >> 2.906608839353e+02 ||r(i)||/||b|| 3.227237074851e-05 >> >> > > Residual norms for fieldsplit_u_ solve. 
>> >> > > 0 KSP Residual norm 6.123437055586e-01 >> >> > > 1 KSP Residual norm 1.524661826133e-01 >> >> > > Residual norms for fieldsplit_wp_ solve. >> >> > > 0 KSP Residual norm 0.000000000000e+00 >> >> > > 825 KSP preconditioned resid norm 9.743727947718e-10 true resid >> norm >> >> 2.820369990571e+02 ||r(i)||/||b|| 3.131485212298e-05 >> >> > > Linear solve converged due to CONVERGED_ATOL iterations 825 >> >> > > >> >> > > >> >> > > The residual for wp block is zero since in this first step the rhs >> is >> >> zero. As can see in the output, the multigrid does not perform well to >> >> reduce the residual in the sub-solve. Is my observation right? what >> can be >> >> done to improve this? >> >> > > >> >> > > >> >> > > Giang >> >> > > >> >> > > On Tue, Apr 25, 2017 at 12:17 AM, Barry Smith >> >> wrote: >> >> > > >> >> > > This can happen in the matrix is singular or nearly singular or >> if >> >> the factorization generates small pivots, which can occur for even >> >> nonsingular problems if the matrix is poorly scaled or just plain >> nasty. >> >> > > >> >> > > >> >> > > > On Apr 24, 2017, at 5:10 PM, Hoang Giang Bui > > >> >> wrote: >> >> > > > >> >> > > > It took a while, here I send you the output >> >> > > > >> >> > > > 0 KSP preconditioned resid norm 3.129073545457e+05 true resid >> norm >> >> 9.015150492169e+06 ||r(i)||/||b|| 1.000000000000e+00 >> >> > > > 1 KSP preconditioned resid norm 7.442444222843e-01 true resid >> norm >> >> 1.003356247696e+02 ||r(i)||/||b|| 1.112966720375e-05 >> >> > > > 2 KSP preconditioned resid norm 3.267453132529e-07 true resid >> norm >> >> 3.216722968300e+01 ||r(i)||/||b|| 3.568130084011e-06 >> >> > > > 3 KSP preconditioned resid norm 1.155046883816e-11 true resid >> norm >> >> 3.234460376820e+01 ||r(i)||/||b|| 3.587805194854e-06 >> >> > > > Linear solve converged due to CONVERGED_ATOL iterations 3 >> >> > > > KSP Object: 4 MPI processes >> >> > > > type: gmres >> >> > > > GMRES: restart=1000, using Modified Gram-Schmidt >> >> Orthogonalization >> >> > > > GMRES: happy breakdown tolerance 1e-30 >> >> > > > maximum iterations=1000, initial guess is zero >> >> > > > tolerances: relative=1e-20, absolute=1e-09, divergence=10000 >> >> > > > left preconditioning >> >> > > > using PRECONDITIONED norm type for convergence test >> >> > > > PC Object: 4 MPI processes >> >> > > > type: lu >> >> > > > LU: out-of-place factorization >> >> > > > tolerance for zero pivot 2.22045e-14 >> >> > > > matrix ordering: natural >> >> > > > factor fill ratio given 0, needed 0 >> >> > > > Factored matrix follows: >> >> > > > Mat Object: 4 MPI processes >> >> > > > type: mpiaij >> >> > > > rows=973051, cols=973051 >> >> > > > package used to perform factorization: pastix >> >> > > > Error : 3.24786e-14 >> >> > > > total: nonzeros=0, allocated nonzeros=0 >> >> > > > total number of mallocs used during MatSetValues calls >> =0 >> >> > > > PaStiX run parameters: >> >> > > > Matrix type : Unsymmetric >> >> > > > Level of printing (0,1,2): 0 >> >> > > > Number of refinements iterations : 3 >> >> > > > Error : 3.24786e-14 >> >> > > > linear system matrix = precond matrix: >> >> > > > Mat Object: 4 MPI processes >> >> > > > type: mpiaij >> >> > > > rows=973051, cols=973051 >> >> > > > Error : 3.24786e-14 >> >> > > > total: nonzeros=9.90037e+07, allocated nonzeros=9.90037e+07 >> >> > > > total number of mallocs used during MatSetValues calls =0 >> >> > > > using I-node (on process 0) routines: found 78749 nodes, >> limit >> >> used is 5 >> >> > > > Error : 3.24786e-14 >> >> > > > >> >> 
> > > It doesn't do as you said. Something is not right here. I will >> look >> >> in depth. >> >> > > > >> >> > > > Giang >> >> > > > >> >> > > > On Mon, Apr 24, 2017 at 8:21 PM, Barry Smith > > >> >> wrote: >> >> > > > >> >> > > > > On Apr 24, 2017, at 12:47 PM, Hoang Giang Bui < >> hgbk2008 at gmail.com> >> >> wrote: >> >> > > > > >> >> > > > > Good catch. I get this for the very first step, maybe at that >> time >> >> the rhs_w is zero. >> >> > > > >> >> > > > With the multiplicative composition the right hand side of >> the >> >> second solve is the initial right hand side of the second solve minus >> >> A_10*x where x is the solution to the first sub solve and A_10 is the >> lower >> >> left block of the outer matrix. So unless both the initial right hand >> side >> >> has a zero for the second block and A_10 is identically zero the right >> hand >> >> side for the second sub solve should not be zero. Is A_10 == 0? >> >> > > > >> >> > > > >> >> > > > > In the later step, it shows 2 step convergence >> >> > > > > >> >> > > > > Residual norms for fieldsplit_u_ solve. >> >> > > > > 0 KSP Residual norm 3.165886479830e+04 >> >> > > > > 1 KSP Residual norm 2.905922877684e-01 >> >> > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > 0 KSP Residual norm 2.397669419027e-01 >> >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> >> > > > > 0 KSP preconditioned resid norm 3.165886479920e+04 true resid >> >> norm 7.963616922323e+05 ||r(i)||/||b|| 1.000000000000e+00 >> >> > > > > Residual norms for fieldsplit_u_ solve. >> >> > > > > 0 KSP Residual norm 9.999891813771e-01 >> >> > > > > 1 KSP Residual norm 1.512000395579e-05 >> >> > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > 0 KSP Residual norm 8.192702188243e-06 >> >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> >> > > > > 1 KSP preconditioned resid norm 5.252183822848e-02 true resid >> >> norm 7.135927677844e+04 ||r(i)||/||b|| 8.960661653427e-02 >> >> > > > >> >> > > > The outer residual norms are still wonky, the preconditioned >> >> residual norm goes from 3.165886479920e+04 to 5.252183822848e-02 which >> is a >> >> huge drop but the 7.963616922323e+05 drops very much less >> >> 7.135927677844e+04. This is not normal. >> >> > > > >> >> > > > What if you just use -pc_type lu for the entire system (no >> >> fieldsplit), does the true residual drop to almost zero in the first >> >> iteration (as it should?). Send the output. >> >> > > > >> >> > > > >> >> > > > >> >> > > > > Residual norms for fieldsplit_u_ solve. >> >> > > > > 0 KSP Residual norm 6.946213936597e-01 >> >> > > > > 1 KSP Residual norm 1.195514007343e-05 >> >> > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > 0 KSP Residual norm 1.025694497535e+00 >> >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> >> > > > > 2 KSP preconditioned resid norm 8.785709535405e-03 true resid >> >> norm 1.419341799277e+04 ||r(i)||/||b|| 1.782282866091e-02 >> >> > > > > Residual norms for fieldsplit_u_ solve. >> >> > > > > 0 KSP Residual norm 7.255149996405e-01 >> >> > > > > 1 KSP Residual norm 6.583512434218e-06 >> >> > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > 0 KSP Residual norm 1.015229700337e+00 >> >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> >> > > > > 3 KSP preconditioned resid norm 7.110407712709e-04 true resid >> >> norm 5.284940654154e+02 ||r(i)||/||b|| 6.636357205153e-04 >> >> > > > > Residual norms for fieldsplit_u_ solve. 
>> >> > > > > 0 KSP Residual norm 3.512243341400e-01 >> >> > > > > 1 KSP Residual norm 2.032490351200e-06 >> >> > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > 0 KSP Residual norm 1.282327290982e+00 >> >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> >> > > > > 4 KSP preconditioned resid norm 3.482036620521e-05 true resid >> >> norm 4.291231924307e+01 ||r(i)||/||b|| 5.388546393133e-05 >> >> > > > > Residual norms for fieldsplit_u_ solve. >> >> > > > > 0 KSP Residual norm 3.423609338053e-01 >> >> > > > > 1 KSP Residual norm 4.213703301972e-07 >> >> > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > 0 KSP Residual norm 1.157384757538e+00 >> >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> >> > > > > 5 KSP preconditioned resid norm 1.203470314534e-06 true resid >> >> norm 4.544956156267e+00 ||r(i)||/||b|| 5.707150658550e-06 >> >> > > > > Residual norms for fieldsplit_u_ solve. >> >> > > > > 0 KSP Residual norm 3.838596289995e-01 >> >> > > > > 1 KSP Residual norm 9.927864176103e-08 >> >> > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > 0 KSP Residual norm 1.066298905618e+00 >> >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> >> > > > > 6 KSP preconditioned resid norm 3.331619244266e-08 true resid >> >> norm 2.821511729024e+00 ||r(i)||/||b|| 3.543002829675e-06 >> >> > > > > Residual norms for fieldsplit_u_ solve. >> >> > > > > 0 KSP Residual norm 4.624964188094e-01 >> >> > > > > 1 KSP Residual norm 6.418229775372e-08 >> >> > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > 0 KSP Residual norm 9.800784311614e-01 >> >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> >> > > > > 7 KSP preconditioned resid norm 8.788046233297e-10 true resid >> >> norm 2.849209671705e+00 ||r(i)||/||b|| 3.577783436215e-06 >> >> > > > > Linear solve converged due to CONVERGED_ATOL iterations 7 >> >> > > > > >> >> > > > > The outer operator is an explicit matrix. >> >> > > > > >> >> > > > > Giang >> >> > > > > >> >> > > > > On Mon, Apr 24, 2017 at 7:32 PM, Barry Smith < >> bsmith at mcs.anl.gov> >> >> wrote: >> >> > > > > >> >> > > > > > On Apr 24, 2017, at 3:16 AM, Hoang Giang Bui < >> hgbk2008 at gmail.com> >> >> wrote: >> >> > > > > > >> >> > > > > > Thanks Barry, trying with -fieldsplit_u_type lu gives better >> >> convergence. I still used 4 procs though, probably with 1 proc it >> should >> >> also be the same. >> >> > > > > > >> >> > > > > > The u block used a Nitsche-type operator to connect two >> >> non-matching domains. I don't think it will leave some rigid body >> motion >> >> leads to not sufficient constraints. Maybe you have other idea? >> >> > > > > > >> >> > > > > > Residual norms for fieldsplit_u_ solve. >> >> > > > > > 0 KSP Residual norm 3.129067184300e+05 >> >> > > > > > 1 KSP Residual norm 5.906261468196e-01 >> >> > > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > > 0 KSP Residual norm 0.000000000000e+00 >> >> > > > > >> >> > > > > ^^^^ something is wrong here. The sub solve should not be >> >> starting with a 0 residual (this means the right hand side for this sub >> >> solve is zero which it should not be). >> >> > > > > >> >> > > > > > FieldSplit with MULTIPLICATIVE composition: total splits = 2 >> >> > > > > >> >> > > > > >> >> > > > > How are you providing the outer operator? As an explicit >> matrix >> >> or with some shell matrix? 
>> >> > > > > >> >> > > > > >> >> > > > > >> >> > > > > > 0 KSP preconditioned resid norm 3.129067184300e+05 true >> resid >> >> norm 9.015150492169e+06 ||r(i)||/||b|| 1.000000000000e+00 >> >> > > > > > Residual norms for fieldsplit_u_ solve. >> >> > > > > > 0 KSP Residual norm 9.999955993437e-01 >> >> > > > > > 1 KSP Residual norm 4.019774691831e-06 >> >> > > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > > 0 KSP Residual norm 0.000000000000e+00 >> >> > > > > > 1 KSP preconditioned resid norm 5.003913641475e-01 true >> resid >> >> norm 4.692996324114e+01 ||r(i)||/||b|| 5.205677185522e-06 >> >> > > > > > Residual norms for fieldsplit_u_ solve. >> >> > > > > > 0 KSP Residual norm 1.000012180204e+00 >> >> > > > > > 1 KSP Residual norm 1.017367950422e-05 >> >> > > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > > 0 KSP Residual norm 0.000000000000e+00 >> >> > > > > > 2 KSP preconditioned resid norm 2.330910333756e-07 true >> resid >> >> norm 3.474855463983e+01 ||r(i)||/||b|| 3.854461960453e-06 >> >> > > > > > Residual norms for fieldsplit_u_ solve. >> >> > > > > > 0 KSP Residual norm 1.000004200085e+00 >> >> > > > > > 1 KSP Residual norm 6.231613102458e-06 >> >> > > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > > 0 KSP Residual norm 0.000000000000e+00 >> >> > > > > > 3 KSP preconditioned resid norm 8.671259838389e-11 true >> resid >> >> norm 3.545103468011e+01 ||r(i)||/||b|| 3.932384125024e-06 >> >> > > > > > Linear solve converged due to CONVERGED_ATOL iterations 3 >> >> > > > > > KSP Object: 4 MPI processes >> >> > > > > > type: gmres >> >> > > > > > GMRES: restart=1000, using Modified Gram-Schmidt >> >> Orthogonalization >> >> > > > > > GMRES: happy breakdown tolerance 1e-30 >> >> > > > > > maximum iterations=1000, initial guess is zero >> >> > > > > > tolerances: relative=1e-20, absolute=1e-09, >> divergence=10000 >> >> > > > > > left preconditioning >> >> > > > > > using PRECONDITIONED norm type for convergence test >> >> > > > > > PC Object: 4 MPI processes >> >> > > > > > type: fieldsplit >> >> > > > > > FieldSplit with MULTIPLICATIVE composition: total splits >> = 2 >> >> > > > > > Solver info for each split is in the following KSP >> objects: >> >> > > > > > Split number 0 Defined by IS >> >> > > > > > KSP Object: (fieldsplit_u_) 4 MPI processes >> >> > > > > > type: richardson >> >> > > > > > Richardson: damping factor=1 >> >> > > > > > maximum iterations=1, initial guess is zero >> >> > > > > > tolerances: relative=1e-05, absolute=1e-50, >> >> divergence=10000 >> >> > > > > > left preconditioning >> >> > > > > > using PRECONDITIONED norm type for convergence test >> >> > > > > > PC Object: (fieldsplit_u_) 4 MPI processes >> >> > > > > > type: lu >> >> > > > > > LU: out-of-place factorization >> >> > > > > > tolerance for zero pivot 2.22045e-14 >> >> > > > > > matrix ordering: natural >> >> > > > > > factor fill ratio given 0, needed 0 >> >> > > > > > Factored matrix follows: >> >> > > > > > Mat Object: 4 MPI processes >> >> > > > > > type: mpiaij >> >> > > > > > rows=938910, cols=938910 >> >> > > > > > package used to perform factorization: pastix >> >> > > > > > total: nonzeros=0, allocated nonzeros=0 >> >> > > > > > Error : 3.36878e-14 >> >> > > > > > total number of mallocs used during MatSetValues >> calls >> >> =0 >> >> > > > > > PaStiX run parameters: >> >> > > > > > Matrix type : >> Unsymmetric >> >> > > > > > Level of printing (0,1,2): 0 >> >> > > > > > Number of refinements iterations : 3 >> >> > > > > > Error : 3.36878e-14 >> >> > 
> > > > linear system matrix = precond matrix: >> >> > > > > > Mat Object: (fieldsplit_u_) 4 MPI processes >> >> > > > > > type: mpiaij >> >> > > > > > rows=938910, cols=938910, bs=3 >> >> > > > > > Error : 3.36878e-14 >> >> > > > > > Error : 3.36878e-14 >> >> > > > > > total: nonzeros=8.60906e+07, allocated >> >> nonzeros=8.60906e+07 >> >> > > > > > total number of mallocs used during MatSetValues >> calls =0 >> >> > > > > > using I-node (on process 0) routines: found 78749 >> >> nodes, limit used is 5 >> >> > > > > > Split number 1 Defined by IS >> >> > > > > > KSP Object: (fieldsplit_wp_) 4 MPI processes >> >> > > > > > type: richardson >> >> > > > > > Richardson: damping factor=1 >> >> > > > > > maximum iterations=1, initial guess is zero >> >> > > > > > tolerances: relative=1e-05, absolute=1e-50, >> >> divergence=10000 >> >> > > > > > left preconditioning >> >> > > > > > using PRECONDITIONED norm type for convergence test >> >> > > > > > PC Object: (fieldsplit_wp_) 4 MPI processes >> >> > > > > > type: lu >> >> > > > > > LU: out-of-place factorization >> >> > > > > > tolerance for zero pivot 2.22045e-14 >> >> > > > > > matrix ordering: natural >> >> > > > > > factor fill ratio given 0, needed 0 >> >> > > > > > Factored matrix follows: >> >> > > > > > Mat Object: 4 MPI processes >> >> > > > > > type: mpiaij >> >> > > > > > rows=34141, cols=34141 >> >> > > > > > package used to perform factorization: pastix >> >> > > > > > Error : -nan >> >> > > > > > Error : -nan >> >> > > > > > Error : -nan >> >> > > > > > total: nonzeros=0, allocated nonzeros=0 >> >> > > > > > total number of mallocs used during >> MatSetValues >> >> calls =0 >> >> > > > > > PaStiX run parameters: >> >> > > > > > Matrix type : >> Symmetric >> >> > > > > > Level of printing (0,1,2): 0 >> >> > > > > > Number of refinements iterations : 0 >> >> > > > > > Error : -nan >> >> > > > > > linear system matrix = precond matrix: >> >> > > > > > Mat Object: (fieldsplit_wp_) 4 MPI processes >> >> > > > > > type: mpiaij >> >> > > > > > rows=34141, cols=34141 >> >> > > > > > total: nonzeros=485655, allocated nonzeros=485655 >> >> > > > > > total number of mallocs used during MatSetValues >> calls =0 >> >> > > > > > not using I-node (on process 0) routines >> >> > > > > > linear system matrix = precond matrix: >> >> > > > > > Mat Object: 4 MPI processes >> >> > > > > > type: mpiaij >> >> > > > > > rows=973051, cols=973051 >> >> > > > > > total: nonzeros=9.90037e+07, allocated >> nonzeros=9.90037e+07 >> >> > > > > > total number of mallocs used during MatSetValues calls =0 >> >> > > > > > using I-node (on process 0) routines: found 78749 >> nodes, >> >> limit used is 5 >> >> > > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > > Giang >> >> > > > > > >> >> > > > > > On Sun, Apr 23, 2017 at 10:19 PM, Barry Smith < >> >> bsmith at mcs.anl.gov> wrote: >> >> > > > > > >> >> > > > > > > On Apr 23, 2017, at 2:42 PM, Hoang Giang Bui < >> >> hgbk2008 at gmail.com> wrote: >> >> > > > > > > >> >> > > > > > > Dear Matt/Barry >> >> > > > > > > >> >> > > > > > > With your options, it results in >> >> > > > > > > >> >> > > > > > > 0 KSP preconditioned resid norm 1.106709687386e+31 true >> >> resid norm 9.015150491938e+06 ||r(i)||/||b|| 1.000000000000e+00 >> >> > > > > > > Residual norms for fieldsplit_u_ solve. >> >> > > > > > > 0 KSP Residual norm 2.407308987203e+36 >> >> > > > > > > 1 KSP Residual norm 5.797185652683e+72 >> >> > > > > > >> >> > > > > > It looks like Matt is right, hypre is seemly producing >> useless >> >> garbage. 
>> >> > > > > > >> >> > > > > > First how do things run on one process. If you have similar >> >> problems then debug on one process (debugging any kind of problem is >> always >> >> far easy on one process). >> >> > > > > > >> >> > > > > > First run with -fieldsplit_u_type lu (instead of using >> hypre) to >> >> see if that works or also produces something bad. >> >> > > > > > >> >> > > > > > What is the operator and the boundary conditions for u? It >> could >> >> be singular. >> >> > > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > > > 0 KSP Residual norm 0.000000000000e+00 >> >> > > > > > > ... >> >> > > > > > > 999 KSP preconditioned resid norm 2.920157329174e+12 true >> >> resid norm 9.015683504616e+06 ||r(i)||/||b|| 1.000059124102e+00 >> >> > > > > > > Residual norms for fieldsplit_u_ solve. >> >> > > > > > > 0 KSP Residual norm 1.533726746719e+36 >> >> > > > > > > 1 KSP Residual norm 3.692757392261e+72 >> >> > > > > > > Residual norms for fieldsplit_wp_ solve. >> >> > > > > > > 0 KSP Residual norm 0.000000000000e+00 >> >> > > > > > > >> >> > > > > > > Do you suggest that the pastix solver for the "wp" block >> >> encounters small pivot? In addition, seem like the "u" block is also >> >> singular. >> >> > > > > > > >> >> > > > > > > Giang >> >> > > > > > > >> >> > > > > > > On Sun, Apr 23, 2017 at 7:39 PM, Barry Smith < >> >> bsmith at mcs.anl.gov> wrote: >> >> > > > > > > >> >> > > > > > > Huge preconditioned norms but normal unpreconditioned >> norms >> >> almost always come from a very small pivot in an LU or ILU >> factorization. >> >> > > > > > > >> >> > > > > > > The first thing to do is monitor the two sub solves. Run >> >> with the additional options -fieldsplit_u_ksp_type richardson >> >> -fieldsplit_u_ksp_monitor -fieldsplit_u_ksp_max_it 1 >> >> -fieldsplit_wp_ksp_type richardson -fieldsplit_wp_ksp_monitor >> >> -fieldsplit_wp_ksp_max_it 1 >> >> > > > > > > >> >> > > > > > > > On Apr 23, 2017, at 12:22 PM, Hoang Giang Bui < >> >> hgbk2008 at gmail.com> wrote: >> >> > > > > > > > >> >> > > > > > > > Hello >> >> > > > > > > > >> >> > > > > > > > I encountered a strange convergence behavior that I have >> >> trouble to understand >> >> > > > > > > > >> >> > > > > > > > KSPSetFromOptions completed >> >> > > > > > > > 0 KSP preconditioned resid norm 1.106709687386e+31 true >> >> resid norm 9.015150491938e+06 ||r(i)||/||b|| 1.000000000000e+00 >> >> > > > > > > > 1 KSP preconditioned resid norm 2.933141742664e+29 true >> >> resid norm 9.015152282123e+06 ||r(i)||/||b|| 1.000000198575e+00 >> >> > > > > > > > 2 KSP preconditioned resid norm 9.686409637174e+16 true >> >> resid norm 9.015354521944e+06 ||r(i)||/||b|| 1.000022631902e+00 >> >> > > > > > > > 3 KSP preconditioned resid norm 4.219243615809e+15 true >> >> resid norm 9.017157702420e+06 ||r(i)||/||b|| 1.000222648583e+00 >> >> > > > > > > > ..... 
>> >> > > > > > > > 999 KSP preconditioned resid norm 3.043754298076e+12 true >> >> resid norm 9.015425041089e+06 ||r(i)||/||b|| 1.000030454195e+00 >> >> > > > > > > > 1000 KSP preconditioned resid norm 3.043000287819e+12 >> true >> >> resid norm 9.015424313455e+06 ||r(i)||/||b|| 1.000030373483e+00 >> >> > > > > > > > Linear solve did not converge due to DIVERGED_ITS >> iterations >> >> 1000 >> >> > > > > > > > KSP Object: 4 MPI processes >> >> > > > > > > > type: gmres >> >> > > > > > > > GMRES: restart=1000, using Modified Gram-Schmidt >> >> Orthogonalization >> >> > > > > > > > GMRES: happy breakdown tolerance 1e-30 >> >> > > > > > > > maximum iterations=1000, initial guess is zero >> >> > > > > > > > tolerances: relative=1e-20, absolute=1e-09, >> >> divergence=10000 >> >> > > > > > > > left preconditioning >> >> > > > > > > > using PRECONDITIONED norm type for convergence test >> >> > > > > > > > PC Object: 4 MPI processes >> >> > > > > > > > type: fieldsplit >> >> > > > > > > > FieldSplit with MULTIPLICATIVE composition: total >> splits >> >> = 2 >> >> > > > > > > > Solver info for each split is in the following KSP >> >> objects: >> >> > > > > > > > Split number 0 Defined by IS >> >> > > > > > > > KSP Object: (fieldsplit_u_) 4 MPI processes >> >> > > > > > > > type: preonly >> >> > > > > > > > maximum iterations=10000, initial guess is zero >> >> > > > > > > > tolerances: relative=1e-05, absolute=1e-50, >> >> divergence=10000 >> >> > > > > > > > left preconditioning >> >> > > > > > > > using NONE norm type for convergence test >> >> > > > > > > > PC Object: (fieldsplit_u_) 4 MPI processes >> >> > > > > > > > type: hypre >> >> > > > > > > > HYPRE BoomerAMG preconditioning >> >> > > > > > > > HYPRE BoomerAMG: Cycle type V >> >> > > > > > > > HYPRE BoomerAMG: Maximum number of levels 25 >> >> > > > > > > > HYPRE BoomerAMG: Maximum number of iterations PER >> >> hypre call 1 >> >> > > > > > > > HYPRE BoomerAMG: Convergence tolerance PER hypre >> >> call 0 >> >> > > > > > > > HYPRE BoomerAMG: Threshold for strong coupling >> 0.6 >> >> > > > > > > > HYPRE BoomerAMG: Interpolation truncation factor >> 0 >> >> > > > > > > > HYPRE BoomerAMG: Interpolation: max elements per >> row >> >> 0 >> >> > > > > > > > HYPRE BoomerAMG: Number of levels of aggressive >> >> coarsening 0 >> >> > > > > > > > HYPRE BoomerAMG: Number of paths for aggressive >> >> coarsening 1 >> >> > > > > > > > HYPRE BoomerAMG: Maximum row sums 0.9 >> >> > > > > > > > HYPRE BoomerAMG: Sweeps down 1 >> >> > > > > > > > HYPRE BoomerAMG: Sweeps up 1 >> >> > > > > > > > HYPRE BoomerAMG: Sweeps on coarse 1 >> >> > > > > > > > HYPRE BoomerAMG: Relax down >> >> symmetric-SOR/Jacobi >> >> > > > > > > > HYPRE BoomerAMG: Relax up >> >> symmetric-SOR/Jacobi >> >> > > > > > > > HYPRE BoomerAMG: Relax on coarse >> >> Gaussian-elimination >> >> > > > > > > > HYPRE BoomerAMG: Relax weight (all) 1 >> >> > > > > > > > HYPRE BoomerAMG: Outer relax weight (all) 1 >> >> > > > > > > > HYPRE BoomerAMG: Using CF-relaxation >> >> > > > > > > > HYPRE BoomerAMG: Measure type local >> >> > > > > > > > HYPRE BoomerAMG: Coarsen type PMIS >> >> > > > > > > > HYPRE BoomerAMG: Interpolation type classical >> >> > > > > > > > linear system matrix = precond matrix: >> >> > > > > > > > Mat Object: (fieldsplit_u_) 4 MPI >> processes >> >> > > > > > > > type: mpiaij >> >> > > > > > > > rows=938910, cols=938910, bs=3 >> >> > > > > > > > total: nonzeros=8.60906e+07, allocated >> >> nonzeros=8.60906e+07 >> >> > > > > > > > total number of mallocs used during 
MatSetValues >> >> calls =0 >> >> > > > > > > > using I-node (on process 0) routines: found >> 78749 >> >> nodes, limit used is 5 >> >> > > > > > > > Split number 1 Defined by IS >> >> > > > > > > > KSP Object: (fieldsplit_wp_) 4 MPI processes >> >> > > > > > > > type: preonly >> >> > > > > > > > maximum iterations=10000, initial guess is zero >> >> > > > > > > > tolerances: relative=1e-05, absolute=1e-50, >> >> divergence=10000 >> >> > > > > > > > left preconditioning >> >> > > > > > > > using NONE norm type for convergence test >> >> > > > > > > > PC Object: (fieldsplit_wp_) 4 MPI processes >> >> > > > > > > > type: lu >> >> > > > > > > > LU: out-of-place factorization >> >> > > > > > > > tolerance for zero pivot 2.22045e-14 >> >> > > > > > > > matrix ordering: natural >> >> > > > > > > > factor fill ratio given 0, needed 0 >> >> > > > > > > > Factored matrix follows: >> >> > > > > > > > Mat Object: 4 MPI processes >> >> > > > > > > > type: mpiaij >> >> > > > > > > > rows=34141, cols=34141 >> >> > > > > > > > package used to perform factorization: >> pastix >> >> > > > > > > > Error : -nan >> >> > > > > > > > Error : -nan >> >> > > > > > > > total: nonzeros=0, allocated nonzeros=0 >> >> > > > > > > > Error : -nan >> >> > > > > > > > total number of mallocs used during MatSetValues >> calls =0 >> >> > > > > > > > PaStiX run parameters: >> >> > > > > > > > Matrix type : >> >> Symmetric >> >> > > > > > > > Level of printing (0,1,2): 0 >> >> > > > > > > > Number of refinements iterations : 0 >> >> > > > > > > > Error : -nan >> >> > > > > > > > linear system matrix = precond matrix: >> >> > > > > > > > Mat Object: (fieldsplit_wp_) 4 MPI >> processes >> >> > > > > > > > type: mpiaij >> >> > > > > > > > rows=34141, cols=34141 >> >> > > > > > > > total: nonzeros=485655, allocated nonzeros=485655 >> >> > > > > > > > total number of mallocs used during MatSetValues >> >> calls =0 >> >> > > > > > > > not using I-node (on process 0) routines >> >> > > > > > > > linear system matrix = precond matrix: >> >> > > > > > > > Mat Object: 4 MPI processes >> >> > > > > > > > type: mpiaij >> >> > > > > > > > rows=973051, cols=973051 >> >> > > > > > > > total: nonzeros=9.90037e+07, allocated >> >> nonzeros=9.90037e+07 >> >> > > > > > > > total number of mallocs used during MatSetValues >> calls =0 >> >> > > > > > > > using I-node (on process 0) routines: found 78749 >> >> nodes, limit used is 5 >> >> > > > > > > > >> >> > > > > > > > The pattern of convergence gives a hint that this system >> is >> >> somehow bad/singular. But I don't know why the preconditioned error >> goes up >> >> too high. Anyone has an idea? >> >> > > > > > > > >> >> > > > > > > > Best regards >> >> > > > > > > > Giang Bui >> >> > > > > > > > >> >> > > > > > > >> >> > > > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> >> > > > > >> >> > > > >> >> > > > >> >> > > >> >> > > >> >> > >> >> > >> >> >> >> >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 
From Lukasz.Kaczmarczyk at glasgow.ac.uk  Wed May  3 07:55:19 2017
From: Lukasz.Kaczmarczyk at glasgow.ac.uk (Lukasz Kaczmarczyk)
Date: Wed, 3 May 2017 12:55:19 +0000
Subject: [petsc-users] strange convergence
In-Reply-To: 
References: <7891536D-91FE-4BFF-8DAD-CE7AB85A4E57@mcs.anl.gov>
 <425BBB58-9721-49F3-8C86-940F08E925F7@mcs.anl.gov>
 <42EB791A-40C2-439F-A5F7-5F8C15CECA6F@mcs.anl.gov>
 <82193784-B4C4-47D7-80EA-25F549C9091B@mcs.anl.gov>
 <87wpa3wd5j.fsf@jedbrown.org>
Message-ID: 

On 3 May 2017, at 13:22, Matthew Knepley > wrote:

On Wed, May 3, 2017 at 2:29 AM, Hoang Giang Bui > wrote:
Dear Jed

If I understood you correctly you suggest to avoid penalty by using the Lagrange multiplier for the mortar constraint? In this case it leads to the use of discrete Lagrange multiplier space.

Sorry for being ignorant here, but why is the space "discrete"? It looks like you should have a continuum formulation of the mortar as well. Maybe I do not understand something fundamental.

From this (https://en.wikipedia.org/wiki/Mortar_methods) short description, it seems that mortars begin from a continuum formulation, but are then reduced to the discrete level. This is no problem if done consistently, as for instance in the FETI method where efficient preconditioners exist.

Hello,

I copied the wrong link to the mortar method; for how we implemented it, see the presentation http://doi.org/10.5281/zenodo.556996

You are right that we always start from a continuum formulation; on this we apply some discretisation, and in the end the Lagrange multiplier is expressed by a finite vector of discrete unknowns. It is better to formulate the problem first for the continuum; you have better control over what you are doing and over the stability of the solution. Of course, you can add some constraints at the discrete level, after you have discretised the problem, but implicitly you still have some continuous space for the Lagrange multipliers, associated with the shape functions you use to discretise the problem.

In our problem we try to avoid rebuilding the system of equations each time the contact area changes. We are going to construct a DM sub-problem for each body in contact, with each sub-problem solved using MG (the adjacency of those matrices is fixed in time). Everything will be put into a nested matrix with a separate block for the Lagrange multipliers (whose adjacency will change in each time step). For the Lagrange multipliers we are going to use FIELDSPLIT with a Schur complement. I need to look at the FETI method in more detail; we are still at the development stage for the contact problem, and a direct solver works for now for the small problems we have at this point.

In our code we use higher-order elements with a hierarchical basis, and for this we use a specialised MG solver. As you can see here, it works pretty well for moderate-size problems, <100M
http://mofem.eng.gla.ac.uk/mofem/html/_p_c_m_g_set_up_via_approx_orders_8cpp.html

Regards,
Lukasz

Thanks, Matt

Do you or anyone already have experience using discrete Lagrange multiplier space with Petsc?

There is also similar question on stackexchange
https://scicomp.stackexchange.com/questions/25113/preconditioners-and-discrete-lagrange-multipliers

Giang

On Sat, Apr 29, 2017 at 3:34 PM, Jed Brown > wrote:
Hoang Giang Bui > writes:

> Hi Barry
>
> The first block is from a standard solid mechanics discretization based on
> balance of momentum equation. There is some material involved but in
> principal it's well-posed elasticity equation with positive definite
> tangent operator. 
The "gluing business" uses the mortar method to keep the > continuity of displacement. Instead of using Lagrange multiplier to treat > the constraint I used penalty method to penalize the energy. The > discretization form of mortar is quite simple > > \int_{\Gamma_1} { rho * (\delta u_1 - \delta u_2) * (u_1 - u_2) dA } > > rho is penalty parameter. In the simulation I initially set it low (~E) to > preserve the conditioning of the system. There are two things that can go wrong here with AMG: * The penalty term can mess up the strength of connection heuristics such that you get poor choice of C-points (classical AMG like BoomerAMG) or poor choice of aggregates (smoothed aggregation). * The penalty term can prevent Jacobi smoothing from being effective; in this case, it can lead to poor coarse basis functions (higher energy than they should be) and poor smoothing in an MG cycle. You can fix the poor smoothing in the MG cycle by using a stronger smoother, like ASM with some overlap. I'm generally not a fan of penalty methods due to the irritating tradeoffs and often poor solver performance. > In the figure below, the colorful blocks are u_1 and the base is u_2. Both > u_1 and u_2 use isoparametric quadratic approximation. > > ? > Snapshot.png > > ??? > > Giang > > On Fri, Apr 28, 2017 at 6:21 PM, Barry Smith > wrote: > >> >> Ok, so boomerAMG algebraic multigrid is not good for the first block. >> You mentioned the first block has two things glued together? AMG is >> fantastic for certain problems but doesn't work for everything. >> >> Tell us more about the first block, what PDE it comes from, what >> discretization, and what the "gluing business" is and maybe we'll have >> suggestions for how to precondition it. >> >> Barry >> >> > On Apr 28, 2017, at 3:56 AM, Hoang Giang Bui > wrote: >> > >> > It's in fact quite good >> > >> > Residual norms for fieldsplit_u_ solve. >> > 0 KSP Residual norm 4.014715925568e+00 >> > 1 KSP Residual norm 2.160497019264e-10 >> > Residual norms for fieldsplit_wp_ solve. >> > 0 KSP Residual norm 0.000000000000e+00 >> > 0 KSP preconditioned resid norm 4.014715925568e+00 true resid norm >> 9.006493082896e+06 ||r(i)||/||b|| 1.000000000000e+00 >> > Residual norms for fieldsplit_u_ solve. >> > 0 KSP Residual norm 9.999999999416e-01 >> > 1 KSP Residual norm 7.118380416383e-11 >> > Residual norms for fieldsplit_wp_ solve. >> > 0 KSP Residual norm 0.000000000000e+00 >> > 1 KSP preconditioned resid norm 1.701150951035e-10 true resid norm >> 5.494262251846e-04 ||r(i)||/||b|| 6.100334726599e-11 >> > Linear solve converged due to CONVERGED_ATOL iterations 1 >> > >> > Giang >> > >> > On Thu, Apr 27, 2017 at 5:25 PM, Barry Smith > wrote: >> > >> > Run again using LU on both blocks to see what happens. >> > >> > >> > > On Apr 27, 2017, at 2:14 AM, Hoang Giang Bui > >> wrote: >> > > >> > > I have changed the way to tie the nonconforming mesh. 
It seems the >> matrix now is better >> > > >> > > with -pc_type lu the output is >> > > 0 KSP preconditioned resid norm 3.308678584240e-01 true resid norm >> 9.006493082896e+06 ||r(i)||/||b|| 1.000000000000e+00 >> > > 1 KSP preconditioned resid norm 2.004313395301e-12 true resid norm >> 2.549872332830e-05 ||r(i)||/||b|| 2.831148938173e-12 >> > > Linear solve converged due to CONVERGED_ATOL iterations 1 >> > > >> > > >> > > with -pc_type fieldsplit -fieldsplit_u_pc_type hypre >> -fieldsplit_wp_pc_type lu the convergence is slow >> > > 0 KSP preconditioned resid norm 1.116302362553e-01 true resid norm >> 9.006493083520e+06 ||r(i)||/||b|| 1.000000000000e+00 >> > > 1 KSP preconditioned resid norm 2.582134825666e-02 true resid norm >> 9.268347719866e+06 ||r(i)||/||b|| 1.029073984060e+00 >> > > ... >> > > 824 KSP preconditioned resid norm 1.018542387738e-09 true resid norm >> 2.906608839310e+02 ||r(i)||/||b|| 3.227237074804e-05 >> > > 825 KSP preconditioned resid norm 9.743727947637e-10 true resid norm >> 2.820369993061e+02 ||r(i)||/||b|| 3.131485215062e-05 >> > > Linear solve converged due to CONVERGED_ATOL iterations 825 >> > > >> > > checking with additional -fieldsplit_u_ksp_type richardson >> -fieldsplit_u_ksp_monitor -fieldsplit_u_ksp_max_it 1 >> -fieldsplit_wp_ksp_type richardson -fieldsplit_wp_ksp_monitor >> -fieldsplit_wp_ksp_max_it 1 gives >> > > >> > > 0 KSP preconditioned resid norm 1.116302362553e-01 true resid norm >> 9.006493083520e+06 ||r(i)||/||b|| 1.000000000000e+00 >> > > Residual norms for fieldsplit_u_ solve. >> > > 0 KSP Residual norm 5.803507549280e-01 >> > > 1 KSP Residual norm 2.069538175950e-01 >> > > Residual norms for fieldsplit_wp_ solve. >> > > 0 KSP Residual norm 0.000000000000e+00 >> > > 1 KSP preconditioned resid norm 2.582134825666e-02 true resid norm >> 9.268347719866e+06 ||r(i)||/||b|| 1.029073984060e+00 >> > > Residual norms for fieldsplit_u_ solve. >> > > 0 KSP Residual norm 7.831796195225e-01 >> > > 1 KSP Residual norm 1.734608520110e-01 >> > > Residual norms for fieldsplit_wp_ solve. >> > > 0 KSP Residual norm 0.000000000000e+00 >> > > .... >> > > 823 KSP preconditioned resid norm 1.065070135605e-09 true resid norm >> 3.081881356833e+02 ||r(i)||/||b|| 3.421843916665e-05 >> > > Residual norms for fieldsplit_u_ solve. >> > > 0 KSP Residual norm 6.113806394327e-01 >> > > 1 KSP Residual norm 1.535465290944e-01 >> > > Residual norms for fieldsplit_wp_ solve. >> > > 0 KSP Residual norm 0.000000000000e+00 >> > > 824 KSP preconditioned resid norm 1.018542387746e-09 true resid norm >> 2.906608839353e+02 ||r(i)||/||b|| 3.227237074851e-05 >> > > Residual norms for fieldsplit_u_ solve. >> > > 0 KSP Residual norm 6.123437055586e-01 >> > > 1 KSP Residual norm 1.524661826133e-01 >> > > Residual norms for fieldsplit_wp_ solve. >> > > 0 KSP Residual norm 0.000000000000e+00 >> > > 825 KSP preconditioned resid norm 9.743727947718e-10 true resid norm >> 2.820369990571e+02 ||r(i)||/||b|| 3.131485212298e-05 >> > > Linear solve converged due to CONVERGED_ATOL iterations 825 >> > > >> > > >> > > The residual for wp block is zero since in this first step the rhs is >> zero. As can see in the output, the multigrid does not perform well to >> reduce the residual in the sub-solve. Is my observation right? what can be >> done to improve this? 
>> > > >> > > >> > > Giang >> > > >> > > On Tue, Apr 25, 2017 at 12:17 AM, Barry Smith > >> wrote: >> > > >> > > This can happen in the matrix is singular or nearly singular or if >> the factorization generates small pivots, which can occur for even >> nonsingular problems if the matrix is poorly scaled or just plain nasty. >> > > >> > > >> > > > On Apr 24, 2017, at 5:10 PM, Hoang Giang Bui > >> wrote: >> > > > >> > > > It took a while, here I send you the output >> > > > >> > > > 0 KSP preconditioned resid norm 3.129073545457e+05 true resid norm >> 9.015150492169e+06 ||r(i)||/||b|| 1.000000000000e+00 >> > > > 1 KSP preconditioned resid norm 7.442444222843e-01 true resid norm >> 1.003356247696e+02 ||r(i)||/||b|| 1.112966720375e-05 >> > > > 2 KSP preconditioned resid norm 3.267453132529e-07 true resid norm >> 3.216722968300e+01 ||r(i)||/||b|| 3.568130084011e-06 >> > > > 3 KSP preconditioned resid norm 1.155046883816e-11 true resid norm >> 3.234460376820e+01 ||r(i)||/||b|| 3.587805194854e-06 >> > > > Linear solve converged due to CONVERGED_ATOL iterations 3 >> > > > KSP Object: 4 MPI processes >> > > > type: gmres >> > > > GMRES: restart=1000, using Modified Gram-Schmidt >> Orthogonalization >> > > > GMRES: happy breakdown tolerance 1e-30 >> > > > maximum iterations=1000, initial guess is zero >> > > > tolerances: relative=1e-20, absolute=1e-09, divergence=10000 >> > > > left preconditioning >> > > > using PRECONDITIONED norm type for convergence test >> > > > PC Object: 4 MPI processes >> > > > type: lu >> > > > LU: out-of-place factorization >> > > > tolerance for zero pivot 2.22045e-14 >> > > > matrix ordering: natural >> > > > factor fill ratio given 0, needed 0 >> > > > Factored matrix follows: >> > > > Mat Object: 4 MPI processes >> > > > type: mpiaij >> > > > rows=973051, cols=973051 >> > > > package used to perform factorization: pastix >> > > > Error : 3.24786e-14 >> > > > total: nonzeros=0, allocated nonzeros=0 >> > > > total number of mallocs used during MatSetValues calls =0 >> > > > PaStiX run parameters: >> > > > Matrix type : Unsymmetric >> > > > Level of printing (0,1,2): 0 >> > > > Number of refinements iterations : 3 >> > > > Error : 3.24786e-14 >> > > > linear system matrix = precond matrix: >> > > > Mat Object: 4 MPI processes >> > > > type: mpiaij >> > > > rows=973051, cols=973051 >> > > > Error : 3.24786e-14 >> > > > total: nonzeros=9.90037e+07, allocated nonzeros=9.90037e+07 >> > > > total number of mallocs used during MatSetValues calls =0 >> > > > using I-node (on process 0) routines: found 78749 nodes, limit >> used is 5 >> > > > Error : 3.24786e-14 >> > > > >> > > > It doesn't do as you said. Something is not right here. I will look >> in depth. >> > > > >> > > > Giang >> > > > >> > > > On Mon, Apr 24, 2017 at 8:21 PM, Barry Smith > >> wrote: >> > > > >> > > > > On Apr 24, 2017, at 12:47 PM, Hoang Giang Bui > >> wrote: >> > > > > >> > > > > Good catch. I get this for the very first step, maybe at that time >> the rhs_w is zero. >> > > > >> > > > With the multiplicative composition the right hand side of the >> second solve is the initial right hand side of the second solve minus >> A_10*x where x is the solution to the first sub solve and A_10 is the lower >> left block of the outer matrix. So unless both the initial right hand side >> has a zero for the second block and A_10 is identically zero the right hand >> side for the second sub solve should not be zero. Is A_10 == 0? 
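In LaTeX form, the relation just described is, with notation assumed for illustration only (b_1 the second block of the initial right-hand side, x_0 the solution of the first sub solve, A_10 the lower-left block of the outer matrix; these symbols are not PETSc names):

  % Right-hand side seen by the second sub solve in a multiplicative fieldsplit
  % (a sketch under the assumed 2x2 block notation above, not output from PETSc)
  \[
    \tilde{b}_1 = b_1 - A_{10}\, x_0 ,
  \]
  % hence \tilde{b}_1 = 0 only if b_1 = 0 and A_{10} x_0 = 0, e.g. if A_{10} \equiv 0.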
>> > > > >> > > > >> > > > > In the later step, it shows 2 step convergence >> > > > > >> > > > > Residual norms for fieldsplit_u_ solve. >> > > > > 0 KSP Residual norm 3.165886479830e+04 >> > > > > 1 KSP Residual norm 2.905922877684e-01 >> > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > 0 KSP Residual norm 2.397669419027e-01 >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> > > > > 0 KSP preconditioned resid norm 3.165886479920e+04 true resid >> norm 7.963616922323e+05 ||r(i)||/||b|| 1.000000000000e+00 >> > > > > Residual norms for fieldsplit_u_ solve. >> > > > > 0 KSP Residual norm 9.999891813771e-01 >> > > > > 1 KSP Residual norm 1.512000395579e-05 >> > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > 0 KSP Residual norm 8.192702188243e-06 >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> > > > > 1 KSP preconditioned resid norm 5.252183822848e-02 true resid >> norm 7.135927677844e+04 ||r(i)||/||b|| 8.960661653427e-02 >> > > > >> > > > The outer residual norms are still wonky, the preconditioned >> residual norm goes from 3.165886479920e+04 to 5.252183822848e-02 which is a >> huge drop but the 7.963616922323e+05 drops very much less >> 7.135927677844e+04. This is not normal. >> > > > >> > > > What if you just use -pc_type lu for the entire system (no >> fieldsplit), does the true residual drop to almost zero in the first >> iteration (as it should?). Send the output. >> > > > >> > > > >> > > > >> > > > > Residual norms for fieldsplit_u_ solve. >> > > > > 0 KSP Residual norm 6.946213936597e-01 >> > > > > 1 KSP Residual norm 1.195514007343e-05 >> > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > 0 KSP Residual norm 1.025694497535e+00 >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> > > > > 2 KSP preconditioned resid norm 8.785709535405e-03 true resid >> norm 1.419341799277e+04 ||r(i)||/||b|| 1.782282866091e-02 >> > > > > Residual norms for fieldsplit_u_ solve. >> > > > > 0 KSP Residual norm 7.255149996405e-01 >> > > > > 1 KSP Residual norm 6.583512434218e-06 >> > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > 0 KSP Residual norm 1.015229700337e+00 >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> > > > > 3 KSP preconditioned resid norm 7.110407712709e-04 true resid >> norm 5.284940654154e+02 ||r(i)||/||b|| 6.636357205153e-04 >> > > > > Residual norms for fieldsplit_u_ solve. >> > > > > 0 KSP Residual norm 3.512243341400e-01 >> > > > > 1 KSP Residual norm 2.032490351200e-06 >> > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > 0 KSP Residual norm 1.282327290982e+00 >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> > > > > 4 KSP preconditioned resid norm 3.482036620521e-05 true resid >> norm 4.291231924307e+01 ||r(i)||/||b|| 5.388546393133e-05 >> > > > > Residual norms for fieldsplit_u_ solve. >> > > > > 0 KSP Residual norm 3.423609338053e-01 >> > > > > 1 KSP Residual norm 4.213703301972e-07 >> > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > 0 KSP Residual norm 1.157384757538e+00 >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> > > > > 5 KSP preconditioned resid norm 1.203470314534e-06 true resid >> norm 4.544956156267e+00 ||r(i)||/||b|| 5.707150658550e-06 >> > > > > Residual norms for fieldsplit_u_ solve. >> > > > > 0 KSP Residual norm 3.838596289995e-01 >> > > > > 1 KSP Residual norm 9.927864176103e-08 >> > > > > Residual norms for fieldsplit_wp_ solve. 
>> > > > > 0 KSP Residual norm 1.066298905618e+00 >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> > > > > 6 KSP preconditioned resid norm 3.331619244266e-08 true resid >> norm 2.821511729024e+00 ||r(i)||/||b|| 3.543002829675e-06 >> > > > > Residual norms for fieldsplit_u_ solve. >> > > > > 0 KSP Residual norm 4.624964188094e-01 >> > > > > 1 KSP Residual norm 6.418229775372e-08 >> > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > 0 KSP Residual norm 9.800784311614e-01 >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> > > > > 7 KSP preconditioned resid norm 8.788046233297e-10 true resid >> norm 2.849209671705e+00 ||r(i)||/||b|| 3.577783436215e-06 >> > > > > Linear solve converged due to CONVERGED_ATOL iterations 7 >> > > > > >> > > > > The outer operator is an explicit matrix. >> > > > > >> > > > > Giang >> > > > > >> > > > > On Mon, Apr 24, 2017 at 7:32 PM, Barry Smith > >> wrote: >> > > > > >> > > > > > On Apr 24, 2017, at 3:16 AM, Hoang Giang Bui > >> wrote: >> > > > > > >> > > > > > Thanks Barry, trying with -fieldsplit_u_type lu gives better >> convergence. I still used 4 procs though, probably with 1 proc it should >> also be the same. >> > > > > > >> > > > > > The u block used a Nitsche-type operator to connect two >> non-matching domains. I don't think it will leave some rigid body motion >> leads to not sufficient constraints. Maybe you have other idea? >> > > > > > >> > > > > > Residual norms for fieldsplit_u_ solve. >> > > > > > 0 KSP Residual norm 3.129067184300e+05 >> > > > > > 1 KSP Residual norm 5.906261468196e-01 >> > > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > > 0 KSP Residual norm 0.000000000000e+00 >> > > > > >> > > > > ^^^^ something is wrong here. The sub solve should not be >> starting with a 0 residual (this means the right hand side for this sub >> solve is zero which it should not be). >> > > > > >> > > > > > FieldSplit with MULTIPLICATIVE composition: total splits = 2 >> > > > > >> > > > > >> > > > > How are you providing the outer operator? As an explicit matrix >> or with some shell matrix? >> > > > > >> > > > > >> > > > > >> > > > > > 0 KSP preconditioned resid norm 3.129067184300e+05 true resid >> norm 9.015150492169e+06 ||r(i)||/||b|| 1.000000000000e+00 >> > > > > > Residual norms for fieldsplit_u_ solve. >> > > > > > 0 KSP Residual norm 9.999955993437e-01 >> > > > > > 1 KSP Residual norm 4.019774691831e-06 >> > > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > > 0 KSP Residual norm 0.000000000000e+00 >> > > > > > 1 KSP preconditioned resid norm 5.003913641475e-01 true resid >> norm 4.692996324114e+01 ||r(i)||/||b|| 5.205677185522e-06 >> > > > > > Residual norms for fieldsplit_u_ solve. >> > > > > > 0 KSP Residual norm 1.000012180204e+00 >> > > > > > 1 KSP Residual norm 1.017367950422e-05 >> > > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > > 0 KSP Residual norm 0.000000000000e+00 >> > > > > > 2 KSP preconditioned resid norm 2.330910333756e-07 true resid >> norm 3.474855463983e+01 ||r(i)||/||b|| 3.854461960453e-06 >> > > > > > Residual norms for fieldsplit_u_ solve. >> > > > > > 0 KSP Residual norm 1.000004200085e+00 >> > > > > > 1 KSP Residual norm 6.231613102458e-06 >> > > > > > Residual norms for fieldsplit_wp_ solve. 
>> > > > > > 0 KSP Residual norm 0.000000000000e+00 >> > > > > > 3 KSP preconditioned resid norm 8.671259838389e-11 true resid >> norm 3.545103468011e+01 ||r(i)||/||b|| 3.932384125024e-06 >> > > > > > Linear solve converged due to CONVERGED_ATOL iterations 3 >> > > > > > KSP Object: 4 MPI processes >> > > > > > type: gmres >> > > > > > GMRES: restart=1000, using Modified Gram-Schmidt >> Orthogonalization >> > > > > > GMRES: happy breakdown tolerance 1e-30 >> > > > > > maximum iterations=1000, initial guess is zero >> > > > > > tolerances: relative=1e-20, absolute=1e-09, divergence=10000 >> > > > > > left preconditioning >> > > > > > using PRECONDITIONED norm type for convergence test >> > > > > > PC Object: 4 MPI processes >> > > > > > type: fieldsplit >> > > > > > FieldSplit with MULTIPLICATIVE composition: total splits = 2 >> > > > > > Solver info for each split is in the following KSP objects: >> > > > > > Split number 0 Defined by IS >> > > > > > KSP Object: (fieldsplit_u_) 4 MPI processes >> > > > > > type: richardson >> > > > > > Richardson: damping factor=1 >> > > > > > maximum iterations=1, initial guess is zero >> > > > > > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000 >> > > > > > left preconditioning >> > > > > > using PRECONDITIONED norm type for convergence test >> > > > > > PC Object: (fieldsplit_u_) 4 MPI processes >> > > > > > type: lu >> > > > > > LU: out-of-place factorization >> > > > > > tolerance for zero pivot 2.22045e-14 >> > > > > > matrix ordering: natural >> > > > > > factor fill ratio given 0, needed 0 >> > > > > > Factored matrix follows: >> > > > > > Mat Object: 4 MPI processes >> > > > > > type: mpiaij >> > > > > > rows=938910, cols=938910 >> > > > > > package used to perform factorization: pastix >> > > > > > total: nonzeros=0, allocated nonzeros=0 >> > > > > > Error : 3.36878e-14 >> > > > > > total number of mallocs used during MatSetValues calls >> =0 >> > > > > > PaStiX run parameters: >> > > > > > Matrix type : Unsymmetric >> > > > > > Level of printing (0,1,2): 0 >> > > > > > Number of refinements iterations : 3 >> > > > > > Error : 3.36878e-14 >> > > > > > linear system matrix = precond matrix: >> > > > > > Mat Object: (fieldsplit_u_) 4 MPI processes >> > > > > > type: mpiaij >> > > > > > rows=938910, cols=938910, bs=3 >> > > > > > Error : 3.36878e-14 >> > > > > > Error : 3.36878e-14 >> > > > > > total: nonzeros=8.60906e+07, allocated >> nonzeros=8.60906e+07 >> > > > > > total number of mallocs used during MatSetValues calls =0 >> > > > > > using I-node (on process 0) routines: found 78749 >> nodes, limit used is 5 >> > > > > > Split number 1 Defined by IS >> > > > > > KSP Object: (fieldsplit_wp_) 4 MPI processes >> > > > > > type: richardson >> > > > > > Richardson: damping factor=1 >> > > > > > maximum iterations=1, initial guess is zero >> > > > > > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000 >> > > > > > left preconditioning >> > > > > > using PRECONDITIONED norm type for convergence test >> > > > > > PC Object: (fieldsplit_wp_) 4 MPI processes >> > > > > > type: lu >> > > > > > LU: out-of-place factorization >> > > > > > tolerance for zero pivot 2.22045e-14 >> > > > > > matrix ordering: natural >> > > > > > factor fill ratio given 0, needed 0 >> > > > > > Factored matrix follows: >> > > > > > Mat Object: 4 MPI processes >> > > > > > type: mpiaij >> > > > > > rows=34141, cols=34141 >> > > > > > package used to perform factorization: pastix >> > > > > > Error : -nan >> > > > > > Error : -nan >> > > > > > Error : 
-nan >> > > > > > total: nonzeros=0, allocated nonzeros=0 >> > > > > > total number of mallocs used during MatSetValues >> calls =0 >> > > > > > PaStiX run parameters: >> > > > > > Matrix type : Symmetric >> > > > > > Level of printing (0,1,2): 0 >> > > > > > Number of refinements iterations : 0 >> > > > > > Error : -nan >> > > > > > linear system matrix = precond matrix: >> > > > > > Mat Object: (fieldsplit_wp_) 4 MPI processes >> > > > > > type: mpiaij >> > > > > > rows=34141, cols=34141 >> > > > > > total: nonzeros=485655, allocated nonzeros=485655 >> > > > > > total number of mallocs used during MatSetValues calls =0 >> > > > > > not using I-node (on process 0) routines >> > > > > > linear system matrix = precond matrix: >> > > > > > Mat Object: 4 MPI processes >> > > > > > type: mpiaij >> > > > > > rows=973051, cols=973051 >> > > > > > total: nonzeros=9.90037e+07, allocated nonzeros=9.90037e+07 >> > > > > > total number of mallocs used during MatSetValues calls =0 >> > > > > > using I-node (on process 0) routines: found 78749 nodes, >> limit used is 5 >> > > > > > >> > > > > > >> > > > > > >> > > > > > Giang >> > > > > > >> > > > > > On Sun, Apr 23, 2017 at 10:19 PM, Barry Smith < >> bsmith at mcs.anl.gov> wrote: >> > > > > > >> > > > > > > On Apr 23, 2017, at 2:42 PM, Hoang Giang Bui < >> hgbk2008 at gmail.com> wrote: >> > > > > > > >> > > > > > > Dear Matt/Barry >> > > > > > > >> > > > > > > With your options, it results in >> > > > > > > >> > > > > > > 0 KSP preconditioned resid norm 1.106709687386e+31 true >> resid norm 9.015150491938e+06 ||r(i)||/||b|| 1.000000000000e+00 >> > > > > > > Residual norms for fieldsplit_u_ solve. >> > > > > > > 0 KSP Residual norm 2.407308987203e+36 >> > > > > > > 1 KSP Residual norm 5.797185652683e+72 >> > > > > > >> > > > > > It looks like Matt is right, hypre is seemly producing useless >> garbage. >> > > > > > >> > > > > > First how do things run on one process. If you have similar >> problems then debug on one process (debugging any kind of problem is always >> far easy on one process). >> > > > > > >> > > > > > First run with -fieldsplit_u_type lu (instead of using hypre) to >> see if that works or also produces something bad. >> > > > > > >> > > > > > What is the operator and the boundary conditions for u? It could >> be singular. >> > > > > > >> > > > > > >> > > > > > >> > > > > > >> > > > > > >> > > > > > >> > > > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > > > 0 KSP Residual norm 0.000000000000e+00 >> > > > > > > ... >> > > > > > > 999 KSP preconditioned resid norm 2.920157329174e+12 true >> resid norm 9.015683504616e+06 ||r(i)||/||b|| 1.000059124102e+00 >> > > > > > > Residual norms for fieldsplit_u_ solve. >> > > > > > > 0 KSP Residual norm 1.533726746719e+36 >> > > > > > > 1 KSP Residual norm 3.692757392261e+72 >> > > > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > > > 0 KSP Residual norm 0.000000000000e+00 >> > > > > > > >> > > > > > > Do you suggest that the pastix solver for the "wp" block >> encounters small pivot? In addition, seem like the "u" block is also >> singular. >> > > > > > > >> > > > > > > Giang >> > > > > > > >> > > > > > > On Sun, Apr 23, 2017 at 7:39 PM, Barry Smith < >> bsmith at mcs.anl.gov> wrote: >> > > > > > > >> > > > > > > Huge preconditioned norms but normal unpreconditioned norms >> almost always come from a very small pivot in an LU or ILU factorization. >> > > > > > > >> > > > > > > The first thing to do is monitor the two sub solves. 
Run >> with the additional options -fieldsplit_u_ksp_type richardson >> -fieldsplit_u_ksp_monitor -fieldsplit_u_ksp_max_it 1 >> -fieldsplit_wp_ksp_type richardson -fieldsplit_wp_ksp_monitor >> -fieldsplit_wp_ksp_max_it 1 >> > > > > > > >> > > > > > > > On Apr 23, 2017, at 12:22 PM, Hoang Giang Bui < >> hgbk2008 at gmail.com> wrote: >> > > > > > > > >> > > > > > > > Hello >> > > > > > > > >> > > > > > > > I encountered a strange convergence behavior that I have >> trouble to understand >> > > > > > > > >> > > > > > > > KSPSetFromOptions completed >> > > > > > > > 0 KSP preconditioned resid norm 1.106709687386e+31 true >> resid norm 9.015150491938e+06 ||r(i)||/||b|| 1.000000000000e+00 >> > > > > > > > 1 KSP preconditioned resid norm 2.933141742664e+29 true >> resid norm 9.015152282123e+06 ||r(i)||/||b|| 1.000000198575e+00 >> > > > > > > > 2 KSP preconditioned resid norm 9.686409637174e+16 true >> resid norm 9.015354521944e+06 ||r(i)||/||b|| 1.000022631902e+00 >> > > > > > > > 3 KSP preconditioned resid norm 4.219243615809e+15 true >> resid norm 9.017157702420e+06 ||r(i)||/||b|| 1.000222648583e+00 >> > > > > > > > ..... >> > > > > > > > 999 KSP preconditioned resid norm 3.043754298076e+12 true >> resid norm 9.015425041089e+06 ||r(i)||/||b|| 1.000030454195e+00 >> > > > > > > > 1000 KSP preconditioned resid norm 3.043000287819e+12 true >> resid norm 9.015424313455e+06 ||r(i)||/||b|| 1.000030373483e+00 >> > > > > > > > Linear solve did not converge due to DIVERGED_ITS iterations >> 1000 >> > > > > > > > KSP Object: 4 MPI processes >> > > > > > > > type: gmres >> > > > > > > > GMRES: restart=1000, using Modified Gram-Schmidt >> Orthogonalization >> > > > > > > > GMRES: happy breakdown tolerance 1e-30 >> > > > > > > > maximum iterations=1000, initial guess is zero >> > > > > > > > tolerances: relative=1e-20, absolute=1e-09, >> divergence=10000 >> > > > > > > > left preconditioning >> > > > > > > > using PRECONDITIONED norm type for convergence test >> > > > > > > > PC Object: 4 MPI processes >> > > > > > > > type: fieldsplit >> > > > > > > > FieldSplit with MULTIPLICATIVE composition: total splits >> = 2 >> > > > > > > > Solver info for each split is in the following KSP >> objects: >> > > > > > > > Split number 0 Defined by IS >> > > > > > > > KSP Object: (fieldsplit_u_) 4 MPI processes >> > > > > > > > type: preonly >> > > > > > > > maximum iterations=10000, initial guess is zero >> > > > > > > > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000 >> > > > > > > > left preconditioning >> > > > > > > > using NONE norm type for convergence test >> > > > > > > > PC Object: (fieldsplit_u_) 4 MPI processes >> > > > > > > > type: hypre >> > > > > > > > HYPRE BoomerAMG preconditioning >> > > > > > > > HYPRE BoomerAMG: Cycle type V >> > > > > > > > HYPRE BoomerAMG: Maximum number of levels 25 >> > > > > > > > HYPRE BoomerAMG: Maximum number of iterations PER >> hypre call 1 >> > > > > > > > HYPRE BoomerAMG: Convergence tolerance PER hypre >> call 0 >> > > > > > > > HYPRE BoomerAMG: Threshold for strong coupling 0.6 >> > > > > > > > HYPRE BoomerAMG: Interpolation truncation factor 0 >> > > > > > > > HYPRE BoomerAMG: Interpolation: max elements per row >> 0 >> > > > > > > > HYPRE BoomerAMG: Number of levels of aggressive >> coarsening 0 >> > > > > > > > HYPRE BoomerAMG: Number of paths for aggressive >> coarsening 1 >> > > > > > > > HYPRE BoomerAMG: Maximum row sums 0.9 >> > > > > > > > HYPRE BoomerAMG: Sweeps down 1 >> > > > > > > > HYPRE BoomerAMG: Sweeps up 1 >> > > > > > > > HYPRE 
BoomerAMG: Sweeps on coarse 1 >> > > > > > > > HYPRE BoomerAMG: Relax down >> symmetric-SOR/Jacobi >> > > > > > > > HYPRE BoomerAMG: Relax up >> symmetric-SOR/Jacobi >> > > > > > > > HYPRE BoomerAMG: Relax on coarse >> Gaussian-elimination >> > > > > > > > HYPRE BoomerAMG: Relax weight (all) 1 >> > > > > > > > HYPRE BoomerAMG: Outer relax weight (all) 1 >> > > > > > > > HYPRE BoomerAMG: Using CF-relaxation >> > > > > > > > HYPRE BoomerAMG: Measure type local >> > > > > > > > HYPRE BoomerAMG: Coarsen type PMIS >> > > > > > > > HYPRE BoomerAMG: Interpolation type classical >> > > > > > > > linear system matrix = precond matrix: >> > > > > > > > Mat Object: (fieldsplit_u_) 4 MPI processes >> > > > > > > > type: mpiaij >> > > > > > > > rows=938910, cols=938910, bs=3 >> > > > > > > > total: nonzeros=8.60906e+07, allocated >> nonzeros=8.60906e+07 >> > > > > > > > total number of mallocs used during MatSetValues >> calls =0 >> > > > > > > > using I-node (on process 0) routines: found 78749 >> nodes, limit used is 5 >> > > > > > > > Split number 1 Defined by IS >> > > > > > > > KSP Object: (fieldsplit_wp_) 4 MPI processes >> > > > > > > > type: preonly >> > > > > > > > maximum iterations=10000, initial guess is zero >> > > > > > > > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000 >> > > > > > > > left preconditioning >> > > > > > > > using NONE norm type for convergence test >> > > > > > > > PC Object: (fieldsplit_wp_) 4 MPI processes >> > > > > > > > type: lu >> > > > > > > > LU: out-of-place factorization >> > > > > > > > tolerance for zero pivot 2.22045e-14 >> > > > > > > > matrix ordering: natural >> > > > > > > > factor fill ratio given 0, needed 0 >> > > > > > > > Factored matrix follows: >> > > > > > > > Mat Object: 4 MPI processes >> > > > > > > > type: mpiaij >> > > > > > > > rows=34141, cols=34141 >> > > > > > > > package used to perform factorization: pastix >> > > > > > > > Error : -nan >> > > > > > > > Error : -nan >> > > > > > > > total: nonzeros=0, allocated nonzeros=0 >> > > > > > > > Error : -nan >> > > > > > > > total number of mallocs used during MatSetValues calls =0 >> > > > > > > > PaStiX run parameters: >> > > > > > > > Matrix type : >> Symmetric >> > > > > > > > Level of printing (0,1,2): 0 >> > > > > > > > Number of refinements iterations : 0 >> > > > > > > > Error : -nan >> > > > > > > > linear system matrix = precond matrix: >> > > > > > > > Mat Object: (fieldsplit_wp_) 4 MPI processes >> > > > > > > > type: mpiaij >> > > > > > > > rows=34141, cols=34141 >> > > > > > > > total: nonzeros=485655, allocated nonzeros=485655 >> > > > > > > > total number of mallocs used during MatSetValues >> calls =0 >> > > > > > > > not using I-node (on process 0) routines >> > > > > > > > linear system matrix = precond matrix: >> > > > > > > > Mat Object: 4 MPI processes >> > > > > > > > type: mpiaij >> > > > > > > > rows=973051, cols=973051 >> > > > > > > > total: nonzeros=9.90037e+07, allocated >> nonzeros=9.90037e+07 >> > > > > > > > total number of mallocs used during MatSetValues calls =0 >> > > > > > > > using I-node (on process 0) routines: found 78749 >> nodes, limit used is 5 >> > > > > > > > >> > > > > > > > The pattern of convergence gives a hint that this system is >> somehow bad/singular. But I don't know why the preconditioned error goes up >> too high. Anyone has an idea? 
>> > > > > > > > >> > > > > > > > Best regards >> > > > > > > > Giang Bui >> > > > > > > > >> > > > > > > >> > > > > > > >> > > > > > >> > > > > > >> > > > > >> > > > > >> > > > >> > > > >> > > >> > > >> > >> > >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Wed May 3 10:01:43 2017 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 3 May 2017 11:01:43 -0400 Subject: [petsc-users] GAMG scaling Message-ID: (Hong), what is the current state of optimizing RAP for scaling? Nate, is driving 3D elasticity problems at scaling with GAMG and we are working out performance problems. They are hitting problems at ~1.5B dof problems on a basic Cray (XC30 I think). Thanks, Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Wed May 3 11:17:22 2017 From: hzhang at mcs.anl.gov (Hong) Date: Wed, 3 May 2017 11:17:22 -0500 Subject: [petsc-users] GAMG scaling In-Reply-To: References: Message-ID: Mark, Below is the copy of my email sent to you on Feb 27: I implemented scalable MatPtAP and did comparisons of three implementations using ex56.c on alcf cetus machine (this machine has small memory, 1GB/core): - nonscalable PtAP: use an array of length PN to do dense axpy - scalable PtAP: do sparse axpy without use of PN array - hypre PtAP. The results are attached. Summary: - nonscalable PtAP is 2x faster than scalable, 8x faster than hypre PtAP - scalable PtAP is 4x faster than hypre PtAP - hypre uses less memory (see job.ne399.n63.np1000.sh) Based on above observation, I set the default PtAP algorithm as 'nonscalable'. When PN > local estimated nonzero of C=PtAP, then switch default to 'scalable'. User can overwrite default. For the case of np=8000, ne=599 (see job.ne599.n500.np8000.sh), I get MatPtAP 3.6224e+01 (nonscalable for small mats, scalable for larger ones) scalable MatPtAP 4.6129e+01 hypre 1.9389e+02 This work in on petsc-master. Give it a try. If you encounter any problem, let me know. Hong On Wed, May 3, 2017 at 10:01 AM, Mark Adams wrote: > (Hong), what is the current state of optimizing RAP for scaling? > > Nate, is driving 3D elasticity problems at scaling with GAMG and we are > working out performance problems. They are hitting problems at ~1.5B dof > problems on a basic Cray (XC30 I think). > > Thanks, > Mark > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: out_ex56_cetus_short Type: application/octet-stream Size: 5377 bytes Desc: not available URL: From fande.kong at inl.gov Wed May 3 13:24:29 2017 From: fande.kong at inl.gov (Kong, Fande) Date: Wed, 3 May 2017 12:24:29 -0600 Subject: [petsc-users] log_view for the master branch Message-ID: Hi, I am using the current master branch. The log_view gives me the summary as follows, and the "WARNING" box repeats three times. Are we intending to do so? Thanks, Fande, ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ########################################################## # # # WARNING!!! # # # # This code was compiled with a debugging option, # # To get timing results run ./configure # # using --with-debugging=no, the performance will # # be generally two or three times faster. # # # ########################################################## ./ex29 on a arch-darwin-c-debug-master named FN604208 with 1 processor, by kongf Wed May 3 12:28:23 2017 Using Petsc Development GIT revision: v3.7.6-3529-g76c7fe0 GIT Date: 2017-05-03 08:46:23 -0500 Max Max/Min Avg Total Time (sec): 1.350e-02 1.00000 1.350e-02 Objects: 4.100e+01 1.00000 4.100e+01 Flop: 3.040e+02 1.00000 3.040e+02 3.040e+02 Flop/sec: 2.251e+04 1.00000 2.251e+04 2.251e+04 Memory: 1.576e+05 1.00000 1.576e+05 MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00 MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00 MPI Reductions: 0.000e+00 0.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flop and VecAXPY() for complex vectors of length N --> 8N flop Summary of Stages: ----- Time ------ ----- Flop ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.3483e-02 99.8% 3.0400e+02 100.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ ########################################################## # # # WARNING!!! # # # # This code was compiled with a debugging option, # # To get timing results run ./configure # # using --with-debugging=no, the performance will # # be generally two or three times faster. 
# # # ########################################################## Event Count Time (sec) Flop --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage KSPGMRESOrthog 1 1.0 1.3617e-04 1.0 3.50e+01 1.0 0.0e+00 0.0e+00 0.0e+00 1 12 0 0 0 1 12 0 0 0 0 KSPSetUp 1 1.0 4.1097e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 KSPSolve 1 1.0 1.4596e-03 1.0 2.85e+02 1.0 0.0e+00 0.0e+00 0.0e+00 11 94 0 0 0 11 94 0 0 0 0 VecMDot 1 1.0 1.7958e-05 1.0 1.70e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 6 0 0 0 0 6 0 0 0 1 VecNorm 2 1.0 1.9152e-05 1.0 3.40e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 11 0 0 0 0 11 0 0 0 2 VecScale 1 1.0 4.4771e-05 1.0 9.00e+00 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 0 VecCopy 1 1.0 1.2218e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 10 1.0 7.3789e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecAXPY 1 1.0 6.3397e-05 1.0 1.80e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 6 0 0 0 0 6 0 0 0 0 VecMAXPY 2 1.0 4.8989e-05 1.0 3.60e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 12 0 0 0 0 12 0 0 0 1 VecAssemblyBegin 2 1.0 7.5148e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAssemblyEnd 2 1.0 7.5093e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 2 1.0 9.5865e-05 1.0 4.30e+01 1.0 0.0e+00 0.0e+00 0.0e+00 1 14 0 0 0 1 14 0 0 0 0 MatMult 1 1.0 1.3781e-05 1.0 5.70e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 19 0 0 0 0 19 0 0 0 4 MatSolve 2 1.0 7.4019e-04 1.0 1.14e+02 1.0 0.0e+00 0.0e+00 0.0e+00 5 38 0 0 0 5 38 0 0 0 0 MatLUFactorNum 1 1.0 2.8001e-05 1.0 1.90e+01 1.0 0.0e+00 0.0e+00 0.0e+00 0 6 0 0 0 0 6 0 0 0 1 MatILUFactorSym 1 1.0 9.1556e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 MatAssemblyBegin 2 1.0 7.7938e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 2 1.0 4.5131e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 1 1.0 4.0429e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 1 1.0 1.7907e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 PCSetUp 1 1.0 5.8597e-04 1.0 1.90e+01 1.0 0.0e+00 0.0e+00 0.0e+00 4 6 0 0 0 4 6 0 0 0 0 PCApply 2 1.0 7.8497e-04 1.0 1.14e+02 1.0 0.0e+00 0.0e+00 0.0e+00 6 38 0 0 0 6 38 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Krylov Solver 1 1 18408 0. DMKSP interface 1 1 648 0. Vector 12 12 19224 0. Vector Scatter 2 2 1312 0. Matrix 2 2 7380 0. Distributed Mesh 3 3 14960 0. Index Set 7 7 5632 0. IS L to G Mapping 2 2 1368 0. Star Forest Bipartite Graph 6 6 4864 0. Discrete System 3 3 2596 0. Preconditioner 1 1 1000 0. Viewer 1 0 0 0. 
======================================================================================================================== Average time to get PetscTime(): 4.50294e-08 #PETSc Option Table entries: -log_view #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-hypre=1 --with-ssl=0 --with-debugging=yes --with-pic=1 --with-shared-libraries=1 --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --download-fblaslapack=1 --download-metis=1 --download-parmetis=1 --download-superlu_dist=1 --download-scalapack=1 --download-mumps=1 CC=mpicc CXX=mpicxx FC=mpif90 F77=mpif77 F90=mpif90 CFLAGS="-fPIC -fopenmp" CXXFLAGS="-fPIC -fopenmp" FFLAGS="-fPIC -fopenmp" FCFLAGS="-fPIC -fopenmp" F90FLAGS="-fPIC -fopenmp" F77FLAGS="-fPIC -fopenmp" PETSC_ARCH=arch-darwin-c-debug-master ----------------------------------------- Libraries compiled on Wed May 3 11:04:44 2017 on FN604208 Machine characteristics: Darwin-15.5.0-x86_64-i386-64bit Using PETSc directory: /Users/kongf/projects/petsc Using PETSc arch: arch-darwin-c-debug-master ----------------------------------------- Using C compiler: mpicc -fPIC -fopenmp -g3 ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -fPIC -fopenmp -g ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/Users/kongf/projects/petsc/arch-darwin-c-debug-master/include -I/Users/kongf/projects/petsc/include -I/Users/kongf/projects/petsc/include -I/Users/kongf/projects/petsc/arch-darwin-c-debug-master/include -I/opt/X11/include ----------------------------------------- Using C linker: mpicc Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/Users/kongf/projects/petsc/arch-darwin-c-debug-master/lib -L/Users/kongf/projects/petsc/arch-darwin-c-debug-master/lib -lpetsc -Wl,-rpath,/Users/kongf/projects/petsc/arch-darwin-c-debug-master/lib -L/Users/kongf/projects/petsc/arch-darwin-c-debug-master/lib -Wl,-rpath,/opt/X11/lib -L/opt/X11/lib -Wl,-rpath,/opt/moose/mpich/mpich-3.2/clang-opt/lib -L/opt/moose/mpich/mpich-3.2/clang-opt/lib -Wl,-rpath,/opt/moose/llvm-3.9.0/lib -L/opt/moose/llvm-3.9.0/lib -Wl,-rpath,/opt/moose/llvm-3.9.0/lib/clang/3.9.0/lib/darwin -L/opt/moose/llvm-3.9.0/lib/clang/3.9.0/lib/darwin -Wl,-rpath,/opt/moose/gcc-6.2.0/lib/gcc/x86_64-apple-darwin15.6.0/6.2.0 -L/opt/moose/gcc-6.2.0/lib/gcc/x86_64-apple-darwin15.6.0/6.2.0 -Wl,-rpath,/opt/moose/gcc-6.2.0/lib -L/opt/moose/gcc-6.2.0/lib -Wl,-rpath,/opt/moose/llvm-3.9.0/bin/../lib/clang/3.9.0/lib/darwin -L/opt/moose/llvm-3.9.0/bin/../lib/clang/3.9.0/lib/darwin -lsuperlu_dist -lHYPRE -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lflapack -lfblas -lparmetis -lmetis -lX11 -lclang_rt.osx -lmpifort -lgfortran -lgomp -lgcc_ext.10.5 -lquadmath -lm -lclang_rt.osx -lmpicxx -lc++ -lclang_rt.osx -ldl -lmpi -lpmpi -lomp -lSystem -lclang_rt.osx -ldl ----------------------------------------- ########################################################## # # # WARNING!!! # # # # This code was compiled with a debugging option, # # To get timing results run ./configure # # using --with-debugging=no, the performance will # # be generally two or three times faster. # # # ########################################################## -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Wed May 3 13:27:47 2017 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 3 May 2017 13:27:47 -0500 Subject: [petsc-users] log_view for the master branch In-Reply-To: References: Message-ID: On Wed, May 3, 2017 at 1:24 PM, Kong, Fande wrote: > Hi, > > I am using the current master branch. The log_view gives me the summary as > follows, and the "WARNING" box repeats three times. Are we intending to do > so? > Yep, Barry is Really Freaking Serious@ that you should not interpret these numbers without optimization on. Matt > Thanks, > > Fande, > > > ************************************************************ > ************************************************************ > *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r > -fCourier9' to print this document *** > ************************************************************ > ************************************************************ > > ---------------------------------------------- PETSc Performance Summary: > ---------------------------------------------- > > > > ########################################################## > # # > # WARNING!!! # > # # > # This code was compiled with a debugging option, # > # To get timing results run ./configure # > # using --with-debugging=no, the performance will # > # be generally two or three times faster. # > # # > ########################################################## > > > ./ex29 on a arch-darwin-c-debug-master named FN604208 with 1 processor, by > kongf Wed May 3 12:28:23 2017 > Using Petsc Development GIT revision: v3.7.6-3529-g76c7fe0 GIT Date: > 2017-05-03 08:46:23 -0500 > > Max Max/Min Avg Total > Time (sec): 1.350e-02 1.00000 1.350e-02 > Objects: 4.100e+01 1.00000 4.100e+01 > Flop: 3.040e+02 1.00000 3.040e+02 3.040e+02 > Flop/sec: 2.251e+04 1.00000 2.251e+04 2.251e+04 > Memory: 1.576e+05 1.00000 1.576e+05 > MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00 > MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00 > MPI Reductions: 0.000e+00 0.00000 > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N > --> 2N flop > and VecAXPY() for complex vectors of length N > --> 8N flop > > Summary of Stages: ----- Time ------ ----- Flop ----- --- Messages > --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts > %Total Avg %Total counts %Total > 0: Main Stage: 1.3483e-02 99.8% 3.0400e+02 100.0% 0.000e+00 > 0.0% 0.000e+00 0.0% 0.000e+00 0.0% > > ------------------------------------------------------------ > ------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flop: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > Avg. len: average message length (bytes) > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and > PetscLogStagePop(). 
> %T - percent time in this phase %F - percent flop in this > phase > %M - percent messages in this phase %L - percent message lengths > in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over > all processors) > ------------------------------------------------------------ > ------------------------------------------------------------ > > > ########################################################## > # # > # WARNING!!! # > # # > # This code was compiled with a debugging option, # > # To get timing results run ./configure # > # using --with-debugging=no, the performance will # > # be generally two or three times faster. # > # # > ########################################################## > > > Event Count Time (sec) > Flop --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------ > ------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > KSPGMRESOrthog 1 1.0 1.3617e-04 1.0 3.50e+01 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 12 0 0 0 1 12 0 0 0 0 > KSPSetUp 1 1.0 4.1097e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 > KSPSolve 1 1.0 1.4596e-03 1.0 2.85e+02 1.0 0.0e+00 0.0e+00 > 0.0e+00 11 94 0 0 0 11 94 0 0 0 0 > VecMDot 1 1.0 1.7958e-05 1.0 1.70e+01 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 6 0 0 0 0 6 0 0 0 1 > VecNorm 2 1.0 1.9152e-05 1.0 3.40e+01 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 11 0 0 0 0 11 0 0 0 2 > VecScale 1 1.0 4.4771e-05 1.0 9.00e+00 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 3 0 0 0 0 3 0 0 0 0 > VecCopy 1 1.0 1.2218e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 10 1.0 7.3789e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 > VecAXPY 1 1.0 6.3397e-05 1.0 1.80e+01 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 6 0 0 0 0 6 0 0 0 0 > VecMAXPY 2 1.0 4.8989e-05 1.0 3.60e+01 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 12 0 0 0 0 12 0 0 0 1 > VecAssemblyBegin 2 1.0 7.5148e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAssemblyEnd 2 1.0 7.5093e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecNormalize 2 1.0 9.5865e-05 1.0 4.30e+01 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 14 0 0 0 1 14 0 0 0 0 > MatMult 1 1.0 1.3781e-05 1.0 5.70e+01 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 19 0 0 0 0 19 0 0 0 4 > MatSolve 2 1.0 7.4019e-04 1.0 1.14e+02 1.0 0.0e+00 0.0e+00 > 0.0e+00 5 38 0 0 0 5 38 0 0 0 0 > MatLUFactorNum 1 1.0 2.8001e-05 1.0 1.90e+01 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 6 0 0 0 0 6 0 0 0 1 > MatILUFactorSym 1 1.0 9.1556e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 > MatAssemblyBegin 2 1.0 7.7938e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatAssemblyEnd 2 1.0 4.5131e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetRowIJ 1 1.0 4.0429e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetOrdering 1 1.0 1.7907e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 > PCSetUp 1 1.0 5.8597e-04 1.0 1.90e+01 1.0 0.0e+00 0.0e+00 > 0.0e+00 4 6 0 0 0 4 6 0 0 0 0 > PCApply 2 1.0 7.8497e-04 1.0 1.14e+02 1.0 0.0e+00 0.0e+00 > 0.0e+00 6 38 0 0 0 6 38 0 0 0 0 > ------------------------------------------------------------ > ------------------------------------------------------------ > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. 
> Reports information only for process 0. > > --- Event Stage 0: Main Stage > > Krylov Solver 1 1 18408 0. > DMKSP interface 1 1 648 0. > Vector 12 12 19224 0. > Vector Scatter 2 2 1312 0. > Matrix 2 2 7380 0. > Distributed Mesh 3 3 14960 0. > Index Set 7 7 5632 0. > IS L to G Mapping 2 2 1368 0. > Star Forest Bipartite Graph 6 6 4864 0. > Discrete System 3 3 2596 0. > Preconditioner 1 1 1000 0. > Viewer 1 0 0 0. > ============================================================ > ============================================================ > Average time to get PetscTime(): 4.50294e-08 > #PETSc Option Table entries: > -log_view > #End of PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 sizeof(PetscInt) 4 > Configure options: --download-hypre=1 --with-ssl=0 --with-debugging=yes > --with-pic=1 --with-shared-libraries=1 --with-cc=mpicc --with-cxx=mpicxx > --with-fc=mpif90 --download-fblaslapack=1 --download-metis=1 > --download-parmetis=1 --download-superlu_dist=1 --download-scalapack=1 > --download-mumps=1 CC=mpicc CXX=mpicxx FC=mpif90 F77=mpif77 F90=mpif90 > CFLAGS="-fPIC -fopenmp" CXXFLAGS="-fPIC -fopenmp" FFLAGS="-fPIC -fopenmp" > FCFLAGS="-fPIC -fopenmp" F90FLAGS="-fPIC -fopenmp" F77FLAGS="-fPIC > -fopenmp" PETSC_ARCH=arch-darwin-c-debug-master > ----------------------------------------- > Libraries compiled on Wed May 3 11:04:44 2017 on FN604208 > Machine characteristics: Darwin-15.5.0-x86_64-i386-64bit > Using PETSc directory: /Users/kongf/projects/petsc > Using PETSc arch: arch-darwin-c-debug-master > ----------------------------------------- > > Using C compiler: mpicc -fPIC -fopenmp -g3 ${COPTFLAGS} ${CFLAGS} > Using Fortran compiler: mpif90 -fPIC -fopenmp -g ${FOPTFLAGS} ${FFLAGS} > ----------------------------------------- > > Using include paths: -I/Users/kongf/projects/petsc/ > arch-darwin-c-debug-master/include -I/Users/kongf/projects/petsc/include > -I/Users/kongf/projects/petsc/include -I/Users/kongf/projects/petsc/ > arch-darwin-c-debug-master/include -I/opt/X11/include > ----------------------------------------- > > Using C linker: mpicc > Using Fortran linker: mpif90 > Using libraries: -Wl,-rpath,/Users/kongf/projects/petsc/arch-darwin-c-debug-master/lib > -L/Users/kongf/projects/petsc/arch-darwin-c-debug-master/lib -lpetsc > -Wl,-rpath,/Users/kongf/projects/petsc/arch-darwin-c-debug-master/lib > -L/Users/kongf/projects/petsc/arch-darwin-c-debug-master/lib > -Wl,-rpath,/opt/X11/lib -L/opt/X11/lib -Wl,-rpath,/opt/moose/mpich/mpich-3.2/clang-opt/lib > -L/opt/moose/mpich/mpich-3.2/clang-opt/lib -Wl,-rpath,/opt/moose/llvm-3.9.0/lib > -L/opt/moose/llvm-3.9.0/lib -Wl,-rpath,/opt/moose/llvm-3.9.0/lib/clang/3.9.0/lib/darwin > -L/opt/moose/llvm-3.9.0/lib/clang/3.9.0/lib/darwin > -Wl,-rpath,/opt/moose/gcc-6.2.0/lib/gcc/x86_64-apple-darwin15.6.0/6.2.0 > -L/opt/moose/gcc-6.2.0/lib/gcc/x86_64-apple-darwin15.6.0/6.2.0 > -Wl,-rpath,/opt/moose/gcc-6.2.0/lib -L/opt/moose/gcc-6.2.0/lib > -Wl,-rpath,/opt/moose/llvm-3.9.0/bin/../lib/clang/3.9.0/lib/darwin > -L/opt/moose/llvm-3.9.0/bin/../lib/clang/3.9.0/lib/darwin -lsuperlu_dist > -lHYPRE -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord > -lscalapack -lflapack -lfblas -lparmetis -lmetis -lX11 -lclang_rt.osx > -lmpifort -lgfortran -lgomp -lgcc_ext.10.5 -lquadmath -lm -lclang_rt.osx > -lmpicxx -lc++ -lclang_rt.osx -ldl -lmpi -lpmpi -lomp -lSystem > -lclang_rt.osx -ldl > 
-----------------------------------------
>
> ##########################################################
> #                                                        #
> #                       WARNING!!!                       #
> #                                                        #
> #   This code was compiled with a debugging option,      #
> #   To get timing results run ./configure                #
> #   using --with-debugging=no, the performance will      #
> #   be generally two or three times faster.              #
> #                                                        #
> ##########################################################
>

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mfadams at lbl.gov Wed May 3 14:08:03 2017
From: mfadams at lbl.gov (Mark Adams)
Date: Wed, 3 May 2017 15:08:03 -0400
Subject: [petsc-users] GAMG scaling
In-Reply-To: 
References: 
Message-ID: 

Hong, the input files do not seem to be accessible. What are the command line options? (I don't see a "rap" or "scale" in the source).

On Wed, May 3, 2017 at 12:17 PM, Hong wrote:
> Mark,
> Below is the copy of my email sent to you on Feb 27:
>
> I implemented scalable MatPtAP and did comparisons of three
> implementations using ex56.c on alcf cetus machine (this machine has
> small memory, 1GB/core):
> - nonscalable PtAP: use an array of length PN to do dense axpy
> - scalable PtAP: do sparse axpy without use of PN array
> - hypre PtAP.
>
> The results are attached. Summary:
> - nonscalable PtAP is 2x faster than scalable, 8x faster than hypre PtAP
> - scalable PtAP is 4x faster than hypre PtAP
> - hypre uses less memory (see job.ne399.n63.np1000.sh)
>
> Based on above observation, I set the default PtAP algorithm as
> 'nonscalable'.
> When PN > local estimated nonzero of C=PtAP, then switch default to
> 'scalable'.
> User can overwrite default.
>
> For the case of np=8000, ne=599 (see job.ne599.n500.np8000.sh), I get
> MatPtAP             3.6224e+01 (nonscalable for small mats, scalable for larger ones)
> scalable MatPtAP    4.6129e+01
> hypre               1.9389e+02
>
> This work in on petsc-master. Give it a try. If you encounter any problem,
> let me know.
>
> Hong
>
> On Wed, May 3, 2017 at 10:01 AM, Mark Adams wrote:
>
>> (Hong), what is the current state of optimizing RAP for scaling?
>>
>> Nate, is driving 3D elasticity problems at scaling with GAMG and we are
>> working out performance problems. They are hitting problems at ~1.5B dof
>> problems on a basic Cray (XC30 I think).
>>
>> Thanks,
>> Mark
>>
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From hgbk2008 at gmail.com Wed May 3 16:19:01 2017
From: hgbk2008 at gmail.com (Hoang Giang Bui)
Date: Wed, 3 May 2017 23:19:01 +0200
Subject: [petsc-users] strange convergence
In-Reply-To: 
References: <7891536D-91FE-4BFF-8DAD-CE7AB85A4E57@mcs.anl.gov> <425BBB58-9721-49F3-8C86-940F08E925F7@mcs.anl.gov> <42EB791A-40C2-439F-A5F7-5F8C15CECA6F@mcs.anl.gov> <82193784-B4C4-47D7-80EA-25F549C9091B@mcs.anl.gov> <87wpa3wd5j.fsf@jedbrown.org>
Message-ID: 

Hi Lukasz,

Thanks for sharing the very interesting slides. Both of you are right: the mortar method starts from a continuum argument and is then reduced to a discrete space by discretizing the Lagrange multiplier. However, the way the interpolation space is chosen has implications for the properties of the mortar matrices. For example, the dual mortar space can help to reduce the multipliers by static condensation, but it creates some numerical oscillation. In my opinion it is not stable, even though a very sound theoretical foundation has been developed.
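A minimal LaTeX sketch of the static condensation referred to here, under the usual dual-mortar assumption that a biorthogonal multiplier basis makes the slave-side mortar matrix diagonal; the symbols D, M, u_s, u_m are assumptions for illustration and are not taken from this thread:

  % Discrete mortar tying constraint (assumed notation):  D u_s - M u_m = 0.
  % With dual (biorthogonal) multipliers D is diagonal, so
  \[
    u_s = D^{-1} M\, u_m ,
  \]
  % and the multipliers can likewise be recovered locally from the slave-side
  % equilibrium rows, i.e. they need not be kept as global unknowns.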
Both the standard and the dual mortar approach have drawbacks for high-order contact, because negative shape-function values can create spurious negative nodal gaps. How do you cope with that case in your code? However, this question may be a bit off-topic. Coming back to the main question: for mesh "gluing" using the mortar method, the Schur matrix is S = [0 -D^T M^T] A^-1 [0 D M]^T, which has the form A_10 (A_00)^-1 A_01 since A_11 = 0. The magnitude of S (~E^-1) is too small compared to A_00 (which is ~E for elasticity). I think in some cases it is also rank deficient if three Lagrange multipliers are used per node (S is very ill-conditioned even though A_00 is well-conditioned). I'm skeptical here: do you really solve the mortar system with a Schur complement? Giang On Wed, May 3, 2017 at 2:55 PM, Lukasz Kaczmarczyk < Lukasz.Kaczmarczyk at glasgow.ac.uk> wrote: > > On 3 May 2017, at 13:22, Matthew Knepley wrote: > > On Wed, May 3, 2017 at 2:29 AM, Hoang Giang Bui wrote > : > >> Dear Jed >> >> If I understood you correctly you suggest to avoid penalty by using the >> Lagrange multiplier for the mortar constraint? In this case it leads to the >> use of discrete Lagrange multiplier space. >> > > Sorry for being ignorant here, but why is the space "discrete"? It looks > like you should have a continuum formulation > of the mortar as well. Maybe I do not understand something fundamental. > From this (https://en.wikipedia.org/wiki/Mortar_methods) > short description, it seems that mortars begin from a continuum > formulation, but are then reduced to the discrete level. This is no > problem if done consistently, as for instance in the FETI method where > efficient preconditioners exist. > > > Hello, > > I copied the wrong link to mortar method, how we implemented it, see > presentation http://doi.org/10.5281/zenodo.556996 > > You right that we always start from continuum formulation, on this we > apply some discretisation, at the end Lagrange multiplier is expressed by a > finite vector of discrete unknowns. It is better to formulate problem first > for the continuum; you have better control on what you are doing and > stability of the solution. > > Of course, you can add some constraints at the discreet level, after you > discretised problem, but implicitly you have some continuous space for > Lagrange multipliers, which is associated with shape functions which you > use to discretise problem. > > In our problem which we have, we try to avoid rebuilding of the system of > equations each time contact area is changing. We going to construct DM > sub-problem for each body in contact, each sub-problem going to be solved > using MG (adjacency for those matrices is fixed in time). All will go to > put in nested matrix with the separate block for Lagrange multipliers > (adjacency will change in each time step). For solving Lagrange > multipliers we going to use FIELDSPLIT using Schur complement. I need to > look more detail to FETI method, at are still at development stage for > contact problem and direct solver works, for now, small problems at that > point. > > In our code, we using higher order elements with hierarchical base, for > this we using specialise MG solver, as you can see here, it works pretty > well for moderate size problems, <100M > http://mofem.eng.gla.ac.uk/mofem/html/_p_c_m_g_set_up_via_ap > prox_orders_8cpp.html > > Regards, > Lukasz > > > > Thanks, > > Matt > > >> Do you or anyone already have experience using discrete Lagrange >> multiplier space with Petsc?
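A minimal sketch of the kind of PCFIELDSPLIT/Schur-complement setup being discussed, in runtime options (the split names u and lm are placeholders for however the index sets are registered, and selfp only builds a rough Schur preconditioner from A_10 diag(A_00)^-1 A_01):

-pc_type fieldsplit -pc_fieldsplit_type schur
-pc_fieldsplit_schur_fact_type full
-pc_fieldsplit_schur_precondition selfp
-fieldsplit_u_ksp_type preonly -fieldsplit_u_pc_type lu
-fieldsplit_lm_ksp_type gmres -fieldsplit_lm_ksp_rtol 1e-8

When the zero block is the multiplier block, -pc_fieldsplit_detect_saddle_point can also be used to let PETSc pick the splits.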
>> >> There is also similar question on stackexchange >> https://scicomp.stackexchange.com/questions/25113/preconditi >> oners-and-discrete-lagrange-multipliers >> >> Giang >> >> On Sat, Apr 29, 2017 at 3:34 PM, Jed Brown wrote: >> >>> Hoang Giang Bui writes: >>> >>> > Hi Barry >>> > >>> > The first block is from a standard solid mechanics discretization >>> based on >>> > balance of momentum equation. There is some material involved but in >>> > principal it's well-posed elasticity equation with positive definite >>> > tangent operator. The "gluing business" uses the mortar method to keep >>> the >>> > continuity of displacement. Instead of using Lagrange multiplier to >>> treat >>> > the constraint I used penalty method to penalize the energy. The >>> > discretization form of mortar is quite simple >>> > >>> > \int_{\Gamma_1} { rho * (\delta u_1 - \delta u_2) * (u_1 - u_2) dA } >>> > >>> > rho is penalty parameter. In the simulation I initially set it low >>> (~E) to >>> > preserve the conditioning of the system. >>> >>> There are two things that can go wrong here with AMG: >>> >>> * The penalty term can mess up the strength of connection heuristics >>> such that you get poor choice of C-points (classical AMG like >>> BoomerAMG) or poor choice of aggregates (smoothed aggregation). >>> >>> * The penalty term can prevent Jacobi smoothing from being effective; in >>> this case, it can lead to poor coarse basis functions (higher energy >>> than they should be) and poor smoothing in an MG cycle. You can fix >>> the poor smoothing in the MG cycle by using a stronger smoother, like >>> ASM with some overlap. >>> >>> I'm generally not a fan of penalty methods due to the irritating >>> tradeoffs and often poor solver performance. >>> >>> > In the figure below, the colorful blocks are u_1 and the base is u_2. >>> Both >>> > u_1 and u_2 use isoparametric quadratic approximation. >>> > >>> > ? >>> > Snapshot.png >>> > >> U/view?usp=drive_web> >>> > ??? >>> > >>> > Giang >>> > >>> > On Fri, Apr 28, 2017 at 6:21 PM, Barry Smith >>> wrote: >>> > >>> >> >>> >> Ok, so boomerAMG algebraic multigrid is not good for the first >>> block. >>> >> You mentioned the first block has two things glued together? AMG is >>> >> fantastic for certain problems but doesn't work for everything. >>> >> >>> >> Tell us more about the first block, what PDE it comes from, what >>> >> discretization, and what the "gluing business" is and maybe we'll have >>> >> suggestions for how to precondition it. >>> >> >>> >> Barry >>> >> >>> >> > On Apr 28, 2017, at 3:56 AM, Hoang Giang Bui >>> wrote: >>> >> > >>> >> > It's in fact quite good >>> >> > >>> >> > Residual norms for fieldsplit_u_ solve. >>> >> > 0 KSP Residual norm 4.014715925568e+00 >>> >> > 1 KSP Residual norm 2.160497019264e-10 >>> >> > Residual norms for fieldsplit_wp_ solve. >>> >> > 0 KSP Residual norm 0.000000000000e+00 >>> >> > 0 KSP preconditioned resid norm 4.014715925568e+00 true resid norm >>> >> 9.006493082896e+06 ||r(i)||/||b|| 1.000000000000e+00 >>> >> > Residual norms for fieldsplit_u_ solve. >>> >> > 0 KSP Residual norm 9.999999999416e-01 >>> >> > 1 KSP Residual norm 7.118380416383e-11 >>> >> > Residual norms for fieldsplit_wp_ solve. 
>>> >> > 0 KSP Residual norm 0.000000000000e+00 >>> >> > 1 KSP preconditioned resid norm 1.701150951035e-10 true resid norm >>> >> 5.494262251846e-04 ||r(i)||/||b|| 6.100334726599e-11 >>> >> > Linear solve converged due to CONVERGED_ATOL iterations 1 >>> >> > >>> >> > Giang >>> >> > >>> >> > On Thu, Apr 27, 2017 at 5:25 PM, Barry Smith >>> wrote: >>> >> > >>> >> > Run again using LU on both blocks to see what happens. >>> >> > >>> >> > >>> >> > > On Apr 27, 2017, at 2:14 AM, Hoang Giang Bui >>> >> wrote: >>> >> > > >>> >> > > I have changed the way to tie the nonconforming mesh. It seems the >>> >> matrix now is better >>> >> > > >>> >> > > with -pc_type lu the output is >>> >> > > 0 KSP preconditioned resid norm 3.308678584240e-01 true resid >>> norm >>> >> 9.006493082896e+06 ||r(i)||/||b|| 1.000000000000e+00 >>> >> > > 1 KSP preconditioned resid norm 2.004313395301e-12 true resid >>> norm >>> >> 2.549872332830e-05 ||r(i)||/||b|| 2.831148938173e-12 >>> >> > > Linear solve converged due to CONVERGED_ATOL iterations 1 >>> >> > > >>> >> > > >>> >> > > with -pc_type fieldsplit -fieldsplit_u_pc_type hypre >>> >> -fieldsplit_wp_pc_type lu the convergence is slow >>> >> > > 0 KSP preconditioned resid norm 1.116302362553e-01 true resid >>> norm >>> >> 9.006493083520e+06 ||r(i)||/||b|| 1.000000000000e+00 >>> >> > > 1 KSP preconditioned resid norm 2.582134825666e-02 true resid >>> norm >>> >> 9.268347719866e+06 ||r(i)||/||b|| 1.029073984060e+00 >>> >> > > ... >>> >> > > 824 KSP preconditioned resid norm 1.018542387738e-09 true resid >>> norm >>> >> 2.906608839310e+02 ||r(i)||/||b|| 3.227237074804e-05 >>> >> > > 825 KSP preconditioned resid norm 9.743727947637e-10 true resid >>> norm >>> >> 2.820369993061e+02 ||r(i)||/||b|| 3.131485215062e-05 >>> >> > > Linear solve converged due to CONVERGED_ATOL iterations 825 >>> >> > > >>> >> > > checking with additional -fieldsplit_u_ksp_type richardson >>> >> -fieldsplit_u_ksp_monitor -fieldsplit_u_ksp_max_it 1 >>> >> -fieldsplit_wp_ksp_type richardson -fieldsplit_wp_ksp_monitor >>> >> -fieldsplit_wp_ksp_max_it 1 gives >>> >> > > >>> >> > > 0 KSP preconditioned resid norm 1.116302362553e-01 true resid >>> norm >>> >> 9.006493083520e+06 ||r(i)||/||b|| 1.000000000000e+00 >>> >> > > Residual norms for fieldsplit_u_ solve. >>> >> > > 0 KSP Residual norm 5.803507549280e-01 >>> >> > > 1 KSP Residual norm 2.069538175950e-01 >>> >> > > Residual norms for fieldsplit_wp_ solve. >>> >> > > 0 KSP Residual norm 0.000000000000e+00 >>> >> > > 1 KSP preconditioned resid norm 2.582134825666e-02 true resid >>> norm >>> >> 9.268347719866e+06 ||r(i)||/||b|| 1.029073984060e+00 >>> >> > > Residual norms for fieldsplit_u_ solve. >>> >> > > 0 KSP Residual norm 7.831796195225e-01 >>> >> > > 1 KSP Residual norm 1.734608520110e-01 >>> >> > > Residual norms for fieldsplit_wp_ solve. >>> >> > > 0 KSP Residual norm 0.000000000000e+00 >>> >> > > .... >>> >> > > 823 KSP preconditioned resid norm 1.065070135605e-09 true resid >>> norm >>> >> 3.081881356833e+02 ||r(i)||/||b|| 3.421843916665e-05 >>> >> > > Residual norms for fieldsplit_u_ solve. >>> >> > > 0 KSP Residual norm 6.113806394327e-01 >>> >> > > 1 KSP Residual norm 1.535465290944e-01 >>> >> > > Residual norms for fieldsplit_wp_ solve. >>> >> > > 0 KSP Residual norm 0.000000000000e+00 >>> >> > > 824 KSP preconditioned resid norm 1.018542387746e-09 true resid >>> norm >>> >> 2.906608839353e+02 ||r(i)||/||b|| 3.227237074851e-05 >>> >> > > Residual norms for fieldsplit_u_ solve. 
>>> >> > > 0 KSP Residual norm 6.123437055586e-01 >>> >> > > 1 KSP Residual norm 1.524661826133e-01 >>> >> > > Residual norms for fieldsplit_wp_ solve. >>> >> > > 0 KSP Residual norm 0.000000000000e+00 >>> >> > > 825 KSP preconditioned resid norm 9.743727947718e-10 true resid >>> norm >>> >> 2.820369990571e+02 ||r(i)||/||b|| 3.131485212298e-05 >>> >> > > Linear solve converged due to CONVERGED_ATOL iterations 825 >>> >> > > >>> >> > > >>> >> > > The residual for wp block is zero since in this first step the >>> rhs is >>> >> zero. As can see in the output, the multigrid does not perform well to >>> >> reduce the residual in the sub-solve. Is my observation right? what >>> can be >>> >> done to improve this? >>> >> > > >>> >> > > >>> >> > > Giang >>> >> > > >>> >> > > On Tue, Apr 25, 2017 at 12:17 AM, Barry Smith >> > >>> >> wrote: >>> >> > > >>> >> > > This can happen in the matrix is singular or nearly singular >>> or if >>> >> the factorization generates small pivots, which can occur for even >>> >> nonsingular problems if the matrix is poorly scaled or just plain >>> nasty. >>> >> > > >>> >> > > >>> >> > > > On Apr 24, 2017, at 5:10 PM, Hoang Giang Bui < >>> hgbk2008 at gmail.com> >>> >> wrote: >>> >> > > > >>> >> > > > It took a while, here I send you the output >>> >> > > > >>> >> > > > 0 KSP preconditioned resid norm 3.129073545457e+05 true resid >>> norm >>> >> 9.015150492169e+06 ||r(i)||/||b|| 1.000000000000e+00 >>> >> > > > 1 KSP preconditioned resid norm 7.442444222843e-01 true resid >>> norm >>> >> 1.003356247696e+02 ||r(i)||/||b|| 1.112966720375e-05 >>> >> > > > 2 KSP preconditioned resid norm 3.267453132529e-07 true resid >>> norm >>> >> 3.216722968300e+01 ||r(i)||/||b|| 3.568130084011e-06 >>> >> > > > 3 KSP preconditioned resid norm 1.155046883816e-11 true resid >>> norm >>> >> 3.234460376820e+01 ||r(i)||/||b|| 3.587805194854e-06 >>> >> > > > Linear solve converged due to CONVERGED_ATOL iterations 3 >>> >> > > > KSP Object: 4 MPI processes >>> >> > > > type: gmres >>> >> > > > GMRES: restart=1000, using Modified Gram-Schmidt >>> >> Orthogonalization >>> >> > > > GMRES: happy breakdown tolerance 1e-30 >>> >> > > > maximum iterations=1000, initial guess is zero >>> >> > > > tolerances: relative=1e-20, absolute=1e-09, divergence=10000 >>> >> > > > left preconditioning >>> >> > > > using PRECONDITIONED norm type for convergence test >>> >> > > > PC Object: 4 MPI processes >>> >> > > > type: lu >>> >> > > > LU: out-of-place factorization >>> >> > > > tolerance for zero pivot 2.22045e-14 >>> >> > > > matrix ordering: natural >>> >> > > > factor fill ratio given 0, needed 0 >>> >> > > > Factored matrix follows: >>> >> > > > Mat Object: 4 MPI processes >>> >> > > > type: mpiaij >>> >> > > > rows=973051, cols=973051 >>> >> > > > package used to perform factorization: pastix >>> >> > > > Error : 3.24786e-14 >>> >> > > > total: nonzeros=0, allocated nonzeros=0 >>> >> > > > total number of mallocs used during MatSetValues >>> calls =0 >>> >> > > > PaStiX run parameters: >>> >> > > > Matrix type : Unsymmetric >>> >> > > > Level of printing (0,1,2): 0 >>> >> > > > Number of refinements iterations : 3 >>> >> > > > Error : 3.24786e-14 >>> >> > > > linear system matrix = precond matrix: >>> >> > > > Mat Object: 4 MPI processes >>> >> > > > type: mpiaij >>> >> > > > rows=973051, cols=973051 >>> >> > > > Error : 3.24786e-14 >>> >> > > > total: nonzeros=9.90037e+07, allocated nonzeros=9.90037e+07 >>> >> > > > total number of mallocs used during MatSetValues calls =0 >>> >> > > > using I-node (on 
process 0) routines: found 78749 nodes, >>> limit >>> >> used is 5 >>> >> > > > Error : 3.24786e-14 >>> >> > > > >>> >> > > > It doesn't do as you said. Something is not right here. I will >>> look >>> >> in depth. >>> >> > > > >>> >> > > > Giang >>> >> > > > >>> >> > > > On Mon, Apr 24, 2017 at 8:21 PM, Barry Smith < >>> bsmith at mcs.anl.gov> >>> >> wrote: >>> >> > > > >>> >> > > > > On Apr 24, 2017, at 12:47 PM, Hoang Giang Bui < >>> hgbk2008 at gmail.com> >>> >> wrote: >>> >> > > > > >>> >> > > > > Good catch. I get this for the very first step, maybe at that >>> time >>> >> the rhs_w is zero. >>> >> > > > >>> >> > > > With the multiplicative composition the right hand side of >>> the >>> >> second solve is the initial right hand side of the second solve minus >>> >> A_10*x where x is the solution to the first sub solve and A_10 is the >>> lower >>> >> left block of the outer matrix. So unless both the initial right hand >>> side >>> >> has a zero for the second block and A_10 is identically zero the >>> right hand >>> >> side for the second sub solve should not be zero. Is A_10 == 0? >>> >> > > > >>> >> > > > >>> >> > > > > In the later step, it shows 2 step convergence >>> >> > > > > >>> >> > > > > Residual norms for fieldsplit_u_ solve. >>> >> > > > > 0 KSP Residual norm 3.165886479830e+04 >>> >> > > > > 1 KSP Residual norm 2.905922877684e-01 >>> >> > > > > Residual norms for fieldsplit_wp_ solve. >>> >> > > > > 0 KSP Residual norm 2.397669419027e-01 >>> >> > > > > 1 KSP Residual norm 0.000000000000e+00 >>> >> > > > > 0 KSP preconditioned resid norm 3.165886479920e+04 true >>> resid >>> >> norm 7.963616922323e+05 ||r(i)||/||b|| 1.000000000000e+00 >>> >> > > > > Residual norms for fieldsplit_u_ solve. >>> >> > > > > 0 KSP Residual norm 9.999891813771e-01 >>> >> > > > > 1 KSP Residual norm 1.512000395579e-05 >>> >> > > > > Residual norms for fieldsplit_wp_ solve. >>> >> > > > > 0 KSP Residual norm 8.192702188243e-06 >>> >> > > > > 1 KSP Residual norm 0.000000000000e+00 >>> >> > > > > 1 KSP preconditioned resid norm 5.252183822848e-02 true >>> resid >>> >> norm 7.135927677844e+04 ||r(i)||/||b|| 8.960661653427e-02 >>> >> > > > >>> >> > > > The outer residual norms are still wonky, the preconditioned >>> >> residual norm goes from 3.165886479920e+04 to 5.252183822848e-02 >>> which is a >>> >> huge drop but the 7.963616922323e+05 drops very much less >>> >> 7.135927677844e+04. This is not normal. >>> >> > > > >>> >> > > > What if you just use -pc_type lu for the entire system (no >>> >> fieldsplit), does the true residual drop to almost zero in the first >>> >> iteration (as it should?). Send the output. >>> >> > > > >>> >> > > > >>> >> > > > >>> >> > > > > Residual norms for fieldsplit_u_ solve. >>> >> > > > > 0 KSP Residual norm 6.946213936597e-01 >>> >> > > > > 1 KSP Residual norm 1.195514007343e-05 >>> >> > > > > Residual norms for fieldsplit_wp_ solve. >>> >> > > > > 0 KSP Residual norm 1.025694497535e+00 >>> >> > > > > 1 KSP Residual norm 0.000000000000e+00 >>> >> > > > > 2 KSP preconditioned resid norm 8.785709535405e-03 true >>> resid >>> >> norm 1.419341799277e+04 ||r(i)||/||b|| 1.782282866091e-02 >>> >> > > > > Residual norms for fieldsplit_u_ solve. >>> >> > > > > 0 KSP Residual norm 7.255149996405e-01 >>> >> > > > > 1 KSP Residual norm 6.583512434218e-06 >>> >> > > > > Residual norms for fieldsplit_wp_ solve. 
>>> >> > > > > 0 KSP Residual norm 1.015229700337e+00 >>> >> > > > > 1 KSP Residual norm 0.000000000000e+00 >>> >> > > > > 3 KSP preconditioned resid norm 7.110407712709e-04 true >>> resid >>> >> norm 5.284940654154e+02 ||r(i)||/||b|| 6.636357205153e-04 >>> >> > > > > Residual norms for fieldsplit_u_ solve. >>> >> > > > > 0 KSP Residual norm 3.512243341400e-01 >>> >> > > > > 1 KSP Residual norm 2.032490351200e-06 >>> >> > > > > Residual norms for fieldsplit_wp_ solve. >>> >> > > > > 0 KSP Residual norm 1.282327290982e+00 >>> >> > > > > 1 KSP Residual norm 0.000000000000e+00 >>> >> > > > > 4 KSP preconditioned resid norm 3.482036620521e-05 true >>> resid >>> >> norm 4.291231924307e+01 ||r(i)||/||b|| 5.388546393133e-05 >>> >> > > > > Residual norms for fieldsplit_u_ solve. >>> >> > > > > 0 KSP Residual norm 3.423609338053e-01 >>> >> > > > > 1 KSP Residual norm 4.213703301972e-07 >>> >> > > > > Residual norms for fieldsplit_wp_ solve. >>> >> > > > > 0 KSP Residual norm 1.157384757538e+00 >>> >> > > > > 1 KSP Residual norm 0.000000000000e+00 >>> >> > > > > 5 KSP preconditioned resid norm 1.203470314534e-06 true >>> resid >>> >> norm 4.544956156267e+00 ||r(i)||/||b|| 5.707150658550e-06 >>> >> > > > > Residual norms for fieldsplit_u_ solve. >>> >> > > > > 0 KSP Residual norm 3.838596289995e-01 >>> >> > > > > 1 KSP Residual norm 9.927864176103e-08 >>> >> > > > > Residual norms for fieldsplit_wp_ solve. >>> >> > > > > 0 KSP Residual norm 1.066298905618e+00 >>> >> > > > > 1 KSP Residual norm 0.000000000000e+00 >>> >> > > > > 6 KSP preconditioned resid norm 3.331619244266e-08 true >>> resid >>> >> norm 2.821511729024e+00 ||r(i)||/||b|| 3.543002829675e-06 >>> >> > > > > Residual norms for fieldsplit_u_ solve. >>> >> > > > > 0 KSP Residual norm 4.624964188094e-01 >>> >> > > > > 1 KSP Residual norm 6.418229775372e-08 >>> >> > > > > Residual norms for fieldsplit_wp_ solve. >>> >> > > > > 0 KSP Residual norm 9.800784311614e-01 >>> >> > > > > 1 KSP Residual norm 0.000000000000e+00 >>> >> > > > > 7 KSP preconditioned resid norm 8.788046233297e-10 true >>> resid >>> >> norm 2.849209671705e+00 ||r(i)||/||b|| 3.577783436215e-06 >>> >> > > > > Linear solve converged due to CONVERGED_ATOL iterations 7 >>> >> > > > > >>> >> > > > > The outer operator is an explicit matrix. >>> >> > > > > >>> >> > > > > Giang >>> >> > > > > >>> >> > > > > On Mon, Apr 24, 2017 at 7:32 PM, Barry Smith < >>> bsmith at mcs.anl.gov> >>> >> wrote: >>> >> > > > > >>> >> > > > > > On Apr 24, 2017, at 3:16 AM, Hoang Giang Bui < >>> hgbk2008 at gmail.com> >>> >> wrote: >>> >> > > > > > >>> >> > > > > > Thanks Barry, trying with -fieldsplit_u_type lu gives better >>> >> convergence. I still used 4 procs though, probably with 1 proc it >>> should >>> >> also be the same. >>> >> > > > > > >>> >> > > > > > The u block used a Nitsche-type operator to connect two >>> >> non-matching domains. I don't think it will leave some rigid body >>> motion >>> >> leads to not sufficient constraints. Maybe you have other idea? >>> >> > > > > > >>> >> > > > > > Residual norms for fieldsplit_u_ solve. >>> >> > > > > > 0 KSP Residual norm 3.129067184300e+05 >>> >> > > > > > 1 KSP Residual norm 5.906261468196e-01 >>> >> > > > > > Residual norms for fieldsplit_wp_ solve. >>> >> > > > > > 0 KSP Residual norm 0.000000000000e+00 >>> >> > > > > >>> >> > > > > ^^^^ something is wrong here. The sub solve should not be >>> >> starting with a 0 residual (this means the right hand side for this >>> sub >>> >> solve is zero which it should not be). 
>>> >> > > > > >>> >> > > > > > FieldSplit with MULTIPLICATIVE composition: total splits = 2 >>> >> > > > > >>> >> > > > > >>> >> > > > > How are you providing the outer operator? As an explicit >>> matrix >>> >> or with some shell matrix? >>> >> > > > > >>> >> > > > > >>> >> > > > > >>> >> > > > > > 0 KSP preconditioned resid norm 3.129067184300e+05 true >>> resid >>> >> norm 9.015150492169e+06 ||r(i)||/||b|| 1.000000000000e+00 >>> >> > > > > > Residual norms for fieldsplit_u_ solve. >>> >> > > > > > 0 KSP Residual norm 9.999955993437e-01 >>> >> > > > > > 1 KSP Residual norm 4.019774691831e-06 >>> >> > > > > > Residual norms for fieldsplit_wp_ solve. >>> >> > > > > > 0 KSP Residual norm 0.000000000000e+00 >>> >> > > > > > 1 KSP preconditioned resid norm 5.003913641475e-01 true >>> resid >>> >> norm 4.692996324114e+01 ||r(i)||/||b|| 5.205677185522e-06 >>> >> > > > > > Residual norms for fieldsplit_u_ solve. >>> >> > > > > > 0 KSP Residual norm 1.000012180204e+00 >>> >> > > > > > 1 KSP Residual norm 1.017367950422e-05 >>> >> > > > > > Residual norms for fieldsplit_wp_ solve. >>> >> > > > > > 0 KSP Residual norm 0.000000000000e+00 >>> >> > > > > > 2 KSP preconditioned resid norm 2.330910333756e-07 true >>> resid >>> >> norm 3.474855463983e+01 ||r(i)||/||b|| 3.854461960453e-06 >>> >> > > > > > Residual norms for fieldsplit_u_ solve. >>> >> > > > > > 0 KSP Residual norm 1.000004200085e+00 >>> >> > > > > > 1 KSP Residual norm 6.231613102458e-06 >>> >> > > > > > Residual norms for fieldsplit_wp_ solve. >>> >> > > > > > 0 KSP Residual norm 0.000000000000e+00 >>> >> > > > > > 3 KSP preconditioned resid norm 8.671259838389e-11 true >>> resid >>> >> norm 3.545103468011e+01 ||r(i)||/||b|| 3.932384125024e-06 >>> >> > > > > > Linear solve converged due to CONVERGED_ATOL iterations 3 >>> >> > > > > > KSP Object: 4 MPI processes >>> >> > > > > > type: gmres >>> >> > > > > > GMRES: restart=1000, using Modified Gram-Schmidt >>> >> Orthogonalization >>> >> > > > > > GMRES: happy breakdown tolerance 1e-30 >>> >> > > > > > maximum iterations=1000, initial guess is zero >>> >> > > > > > tolerances: relative=1e-20, absolute=1e-09, >>> divergence=10000 >>> >> > > > > > left preconditioning >>> >> > > > > > using PRECONDITIONED norm type for convergence test >>> >> > > > > > PC Object: 4 MPI processes >>> >> > > > > > type: fieldsplit >>> >> > > > > > FieldSplit with MULTIPLICATIVE composition: total >>> splits = 2 >>> >> > > > > > Solver info for each split is in the following KSP >>> objects: >>> >> > > > > > Split number 0 Defined by IS >>> >> > > > > > KSP Object: (fieldsplit_u_) 4 MPI processes >>> >> > > > > > type: richardson >>> >> > > > > > Richardson: damping factor=1 >>> >> > > > > > maximum iterations=1, initial guess is zero >>> >> > > > > > tolerances: relative=1e-05, absolute=1e-50, >>> >> divergence=10000 >>> >> > > > > > left preconditioning >>> >> > > > > > using PRECONDITIONED norm type for convergence test >>> >> > > > > > PC Object: (fieldsplit_u_) 4 MPI processes >>> >> > > > > > type: lu >>> >> > > > > > LU: out-of-place factorization >>> >> > > > > > tolerance for zero pivot 2.22045e-14 >>> >> > > > > > matrix ordering: natural >>> >> > > > > > factor fill ratio given 0, needed 0 >>> >> > > > > > Factored matrix follows: >>> >> > > > > > Mat Object: 4 MPI processes >>> >> > > > > > type: mpiaij >>> >> > > > > > rows=938910, cols=938910 >>> >> > > > > > package used to perform factorization: pastix >>> >> > > > > > total: nonzeros=0, allocated nonzeros=0 >>> >> > > > > > Error : 3.36878e-14 
>>> >> > > > > > total number of mallocs used during MatSetValues >>> calls >>> >> =0 >>> >> > > > > > PaStiX run parameters: >>> >> > > > > > Matrix type : >>> Unsymmetric >>> >> > > > > > Level of printing (0,1,2): 0 >>> >> > > > > > Number of refinements iterations : 3 >>> >> > > > > > Error : 3.36878e-14 >>> >> > > > > > linear system matrix = precond matrix: >>> >> > > > > > Mat Object: (fieldsplit_u_) 4 MPI processes >>> >> > > > > > type: mpiaij >>> >> > > > > > rows=938910, cols=938910, bs=3 >>> >> > > > > > Error : 3.36878e-14 >>> >> > > > > > Error : 3.36878e-14 >>> >> > > > > > total: nonzeros=8.60906e+07, allocated >>> >> nonzeros=8.60906e+07 >>> >> > > > > > total number of mallocs used during MatSetValues >>> calls =0 >>> >> > > > > > using I-node (on process 0) routines: found 78749 >>> >> nodes, limit used is 5 >>> >> > > > > > Split number 1 Defined by IS >>> >> > > > > > KSP Object: (fieldsplit_wp_) 4 MPI processes >>> >> > > > > > type: richardson >>> >> > > > > > Richardson: damping factor=1 >>> >> > > > > > maximum iterations=1, initial guess is zero >>> >> > > > > > tolerances: relative=1e-05, absolute=1e-50, >>> >> divergence=10000 >>> >> > > > > > left preconditioning >>> >> > > > > > using PRECONDITIONED norm type for convergence test >>> >> > > > > > PC Object: (fieldsplit_wp_) 4 MPI processes >>> >> > > > > > type: lu >>> >> > > > > > LU: out-of-place factorization >>> >> > > > > > tolerance for zero pivot 2.22045e-14 >>> >> > > > > > matrix ordering: natural >>> >> > > > > > factor fill ratio given 0, needed 0 >>> >> > > > > > Factored matrix follows: >>> >> > > > > > Mat Object: 4 MPI processes >>> >> > > > > > type: mpiaij >>> >> > > > > > rows=34141, cols=34141 >>> >> > > > > > package used to perform factorization: pastix >>> >> > > > > > Error : -nan >>> >> > > > > > Error : -nan >>> >> > > > > > Error : -nan >>> >> > > > > > total: nonzeros=0, allocated nonzeros=0 >>> >> > > > > > total number of mallocs used during >>> MatSetValues >>> >> calls =0 >>> >> > > > > > PaStiX run parameters: >>> >> > > > > > Matrix type : >>> Symmetric >>> >> > > > > > Level of printing (0,1,2): 0 >>> >> > > > > > Number of refinements iterations : 0 >>> >> > > > > > Error : -nan >>> >> > > > > > linear system matrix = precond matrix: >>> >> > > > > > Mat Object: (fieldsplit_wp_) 4 MPI >>> processes >>> >> > > > > > type: mpiaij >>> >> > > > > > rows=34141, cols=34141 >>> >> > > > > > total: nonzeros=485655, allocated nonzeros=485655 >>> >> > > > > > total number of mallocs used during MatSetValues >>> calls =0 >>> >> > > > > > not using I-node (on process 0) routines >>> >> > > > > > linear system matrix = precond matrix: >>> >> > > > > > Mat Object: 4 MPI processes >>> >> > > > > > type: mpiaij >>> >> > > > > > rows=973051, cols=973051 >>> >> > > > > > total: nonzeros=9.90037e+07, allocated >>> nonzeros=9.90037e+07 >>> >> > > > > > total number of mallocs used during MatSetValues calls >>> =0 >>> >> > > > > > using I-node (on process 0) routines: found 78749 >>> nodes, >>> >> limit used is 5 >>> >> > > > > > >>> >> > > > > > >>> >> > > > > > >>> >> > > > > > Giang >>> >> > > > > > >>> >> > > > > > On Sun, Apr 23, 2017 at 10:19 PM, Barry Smith < >>> >> bsmith at mcs.anl.gov> wrote: >>> >> > > > > > >>> >> > > > > > > On Apr 23, 2017, at 2:42 PM, Hoang Giang Bui < >>> >> hgbk2008 at gmail.com> wrote: >>> >> > > > > > > >>> >> > > > > > > Dear Matt/Barry >>> >> > > > > > > >>> >> > > > > > > With your options, it results in >>> >> > > > > > > >>> >> > > > > > > 0 KSP 
preconditioned resid norm 1.106709687386e+31 true >>> >> resid norm 9.015150491938e+06 ||r(i)||/||b|| 1.000000000000e+00 >>> >> > > > > > > Residual norms for fieldsplit_u_ solve. >>> >> > > > > > > 0 KSP Residual norm 2.407308987203e+36 >>> >> > > > > > > 1 KSP Residual norm 5.797185652683e+72 >>> >> > > > > > >>> >> > > > > > It looks like Matt is right, hypre is seemly producing >>> useless >>> >> garbage. >>> >> > > > > > >>> >> > > > > > First how do things run on one process. If you have similar >>> >> problems then debug on one process (debugging any kind of problem is >>> always >>> >> far easy on one process). >>> >> > > > > > >>> >> > > > > > First run with -fieldsplit_u_type lu (instead of using >>> hypre) to >>> >> see if that works or also produces something bad. >>> >> > > > > > >>> >> > > > > > What is the operator and the boundary conditions for u? It >>> could >>> >> be singular. >>> >> > > > > > >>> >> > > > > > >>> >> > > > > > >>> >> > > > > > >>> >> > > > > > >>> >> > > > > > >>> >> > > > > > > Residual norms for fieldsplit_wp_ solve. >>> >> > > > > > > 0 KSP Residual norm 0.000000000000e+00 >>> >> > > > > > > ... >>> >> > > > > > > 999 KSP preconditioned resid norm 2.920157329174e+12 true >>> >> resid norm 9.015683504616e+06 ||r(i)||/||b|| 1.000059124102e+00 >>> >> > > > > > > Residual norms for fieldsplit_u_ solve. >>> >> > > > > > > 0 KSP Residual norm 1.533726746719e+36 >>> >> > > > > > > 1 KSP Residual norm 3.692757392261e+72 >>> >> > > > > > > Residual norms for fieldsplit_wp_ solve. >>> >> > > > > > > 0 KSP Residual norm 0.000000000000e+00 >>> >> > > > > > > >>> >> > > > > > > Do you suggest that the pastix solver for the "wp" block >>> >> encounters small pivot? In addition, seem like the "u" block is also >>> >> singular. >>> >> > > > > > > >>> >> > > > > > > Giang >>> >> > > > > > > >>> >> > > > > > > On Sun, Apr 23, 2017 at 7:39 PM, Barry Smith < >>> >> bsmith at mcs.anl.gov> wrote: >>> >> > > > > > > >>> >> > > > > > > Huge preconditioned norms but normal unpreconditioned >>> norms >>> >> almost always come from a very small pivot in an LU or ILU >>> factorization. >>> >> > > > > > > >>> >> > > > > > > The first thing to do is monitor the two sub solves. >>> Run >>> >> with the additional options -fieldsplit_u_ksp_type richardson >>> >> -fieldsplit_u_ksp_monitor -fieldsplit_u_ksp_max_it 1 >>> >> -fieldsplit_wp_ksp_type richardson -fieldsplit_wp_ksp_monitor >>> >> -fieldsplit_wp_ksp_max_it 1 >>> >> > > > > > > >>> >> > > > > > > > On Apr 23, 2017, at 12:22 PM, Hoang Giang Bui < >>> >> hgbk2008 at gmail.com> wrote: >>> >> > > > > > > > >>> >> > > > > > > > Hello >>> >> > > > > > > > >>> >> > > > > > > > I encountered a strange convergence behavior that I have >>> >> trouble to understand >>> >> > > > > > > > >>> >> > > > > > > > KSPSetFromOptions completed >>> >> > > > > > > > 0 KSP preconditioned resid norm 1.106709687386e+31 >>> true >>> >> resid norm 9.015150491938e+06 ||r(i)||/||b|| 1.000000000000e+00 >>> >> > > > > > > > 1 KSP preconditioned resid norm 2.933141742664e+29 >>> true >>> >> resid norm 9.015152282123e+06 ||r(i)||/||b|| 1.000000198575e+00 >>> >> > > > > > > > 2 KSP preconditioned resid norm 9.686409637174e+16 >>> true >>> >> resid norm 9.015354521944e+06 ||r(i)||/||b|| 1.000022631902e+00 >>> >> > > > > > > > 3 KSP preconditioned resid norm 4.219243615809e+15 >>> true >>> >> resid norm 9.017157702420e+06 ||r(i)||/||b|| 1.000222648583e+00 >>> >> > > > > > > > ..... 
>>> >> > > > > > > > 999 KSP preconditioned resid norm 3.043754298076e+12 >>> true >>> >> resid norm 9.015425041089e+06 ||r(i)||/||b|| 1.000030454195e+00 >>> >> > > > > > > > 1000 KSP preconditioned resid norm 3.043000287819e+12 >>> true >>> >> resid norm 9.015424313455e+06 ||r(i)||/||b|| 1.000030373483e+00 >>> >> > > > > > > > Linear solve did not converge due to DIVERGED_ITS >>> iterations >>> >> 1000 >>> >> > > > > > > > KSP Object: 4 MPI processes >>> >> > > > > > > > type: gmres >>> >> > > > > > > > GMRES: restart=1000, using Modified Gram-Schmidt >>> >> Orthogonalization >>> >> > > > > > > > GMRES: happy breakdown tolerance 1e-30 >>> >> > > > > > > > maximum iterations=1000, initial guess is zero >>> >> > > > > > > > tolerances: relative=1e-20, absolute=1e-09, >>> >> divergence=10000 >>> >> > > > > > > > left preconditioning >>> >> > > > > > > > using PRECONDITIONED norm type for convergence test >>> >> > > > > > > > PC Object: 4 MPI processes >>> >> > > > > > > > type: fieldsplit >>> >> > > > > > > > FieldSplit with MULTIPLICATIVE composition: total >>> splits >>> >> = 2 >>> >> > > > > > > > Solver info for each split is in the following KSP >>> >> objects: >>> >> > > > > > > > Split number 0 Defined by IS >>> >> > > > > > > > KSP Object: (fieldsplit_u_) 4 MPI processes >>> >> > > > > > > > type: preonly >>> >> > > > > > > > maximum iterations=10000, initial guess is zero >>> >> > > > > > > > tolerances: relative=1e-05, absolute=1e-50, >>> >> divergence=10000 >>> >> > > > > > > > left preconditioning >>> >> > > > > > > > using NONE norm type for convergence test >>> >> > > > > > > > PC Object: (fieldsplit_u_) 4 MPI processes >>> >> > > > > > > > type: hypre >>> >> > > > > > > > HYPRE BoomerAMG preconditioning >>> >> > > > > > > > HYPRE BoomerAMG: Cycle type V >>> >> > > > > > > > HYPRE BoomerAMG: Maximum number of levels 25 >>> >> > > > > > > > HYPRE BoomerAMG: Maximum number of iterations >>> PER >>> >> hypre call 1 >>> >> > > > > > > > HYPRE BoomerAMG: Convergence tolerance PER hypre >>> >> call 0 >>> >> > > > > > > > HYPRE BoomerAMG: Threshold for strong coupling >>> 0.6 >>> >> > > > > > > > HYPRE BoomerAMG: Interpolation truncation >>> factor 0 >>> >> > > > > > > > HYPRE BoomerAMG: Interpolation: max elements >>> per row >>> >> 0 >>> >> > > > > > > > HYPRE BoomerAMG: Number of levels of aggressive >>> >> coarsening 0 >>> >> > > > > > > > HYPRE BoomerAMG: Number of paths for aggressive >>> >> coarsening 1 >>> >> > > > > > > > HYPRE BoomerAMG: Maximum row sums 0.9 >>> >> > > > > > > > HYPRE BoomerAMG: Sweeps down 1 >>> >> > > > > > > > HYPRE BoomerAMG: Sweeps up 1 >>> >> > > > > > > > HYPRE BoomerAMG: Sweeps on coarse 1 >>> >> > > > > > > > HYPRE BoomerAMG: Relax down >>> >> symmetric-SOR/Jacobi >>> >> > > > > > > > HYPRE BoomerAMG: Relax up >>> >> symmetric-SOR/Jacobi >>> >> > > > > > > > HYPRE BoomerAMG: Relax on coarse >>> >> Gaussian-elimination >>> >> > > > > > > > HYPRE BoomerAMG: Relax weight (all) 1 >>> >> > > > > > > > HYPRE BoomerAMG: Outer relax weight (all) 1 >>> >> > > > > > > > HYPRE BoomerAMG: Using CF-relaxation >>> >> > > > > > > > HYPRE BoomerAMG: Measure type local >>> >> > > > > > > > HYPRE BoomerAMG: Coarsen type PMIS >>> >> > > > > > > > HYPRE BoomerAMG: Interpolation type classical >>> >> > > > > > > > linear system matrix = precond matrix: >>> >> > > > > > > > Mat Object: (fieldsplit_u_) 4 MPI >>> processes >>> >> > > > > > > > type: mpiaij >>> >> > > > > > > > rows=938910, cols=938910, bs=3 >>> >> > > > > > > > total: nonzeros=8.60906e+07, allocated >>> >> 
nonzeros=8.60906e+07 >>> >> > > > > > > > total number of mallocs used during MatSetValues >>> >> calls =0 >>> >> > > > > > > > using I-node (on process 0) routines: found >>> 78749 >>> >> nodes, limit used is 5 >>> >> > > > > > > > Split number 1 Defined by IS >>> >> > > > > > > > KSP Object: (fieldsplit_wp_) 4 MPI processes >>> >> > > > > > > > type: preonly >>> >> > > > > > > > maximum iterations=10000, initial guess is zero >>> >> > > > > > > > tolerances: relative=1e-05, absolute=1e-50, >>> >> divergence=10000 >>> >> > > > > > > > left preconditioning >>> >> > > > > > > > using NONE norm type for convergence test >>> >> > > > > > > > PC Object: (fieldsplit_wp_) 4 MPI processes >>> >> > > > > > > > type: lu >>> >> > > > > > > > LU: out-of-place factorization >>> >> > > > > > > > tolerance for zero pivot 2.22045e-14 >>> >> > > > > > > > matrix ordering: natural >>> >> > > > > > > > factor fill ratio given 0, needed 0 >>> >> > > > > > > > Factored matrix follows: >>> >> > > > > > > > Mat Object: 4 MPI processes >>> >> > > > > > > > type: mpiaij >>> >> > > > > > > > rows=34141, cols=34141 >>> >> > > > > > > > package used to perform factorization: >>> pastix >>> >> > > > > > > > Error : -nan >>> >> > > > > > > > Error : -nan >>> >> > > > > > > > total: nonzeros=0, allocated nonzeros=0 >>> >> > > > > > > > Error : -nan >>> >> > > > > > > > total number of mallocs used during MatSetValues >>> calls =0 >>> >> > > > > > > > PaStiX run parameters: >>> >> > > > > > > > Matrix type : >>> >> Symmetric >>> >> > > > > > > > Level of printing (0,1,2): 0 >>> >> > > > > > > > Number of refinements iterations : 0 >>> >> > > > > > > > Error : -nan >>> >> > > > > > > > linear system matrix = precond matrix: >>> >> > > > > > > > Mat Object: (fieldsplit_wp_) 4 MPI >>> processes >>> >> > > > > > > > type: mpiaij >>> >> > > > > > > > rows=34141, cols=34141 >>> >> > > > > > > > total: nonzeros=485655, allocated >>> nonzeros=485655 >>> >> > > > > > > > total number of mallocs used during MatSetValues >>> >> calls =0 >>> >> > > > > > > > not using I-node (on process 0) routines >>> >> > > > > > > > linear system matrix = precond matrix: >>> >> > > > > > > > Mat Object: 4 MPI processes >>> >> > > > > > > > type: mpiaij >>> >> > > > > > > > rows=973051, cols=973051 >>> >> > > > > > > > total: nonzeros=9.90037e+07, allocated >>> >> nonzeros=9.90037e+07 >>> >> > > > > > > > total number of mallocs used during MatSetValues >>> calls =0 >>> >> > > > > > > > using I-node (on process 0) routines: found 78749 >>> >> nodes, limit used is 5 >>> >> > > > > > > > >>> >> > > > > > > > The pattern of convergence gives a hint that this >>> system is >>> >> somehow bad/singular. But I don't know why the preconditioned error >>> goes up >>> >> too high. Anyone has an idea? >>> >> > > > > > > > >>> >> > > > > > > > Best regards >>> >> > > > > > > > Giang Bui >>> >> > > > > > > > >>> >> > > > > > > >>> >> > > > > > > >>> >> > > > > > >>> >> > > > > > >>> >> > > > > >>> >> > > > > >>> >> > > > >>> >> > > > >>> >> > > >>> >> > > >>> >> > >>> >> > >>> >> >>> >> >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Lukasz.Kaczmarczyk at glasgow.ac.uk Wed May 3 18:19:51 2017 From: Lukasz.Kaczmarczyk at glasgow.ac.uk (Lukasz Kaczmarczyk) Date: Wed, 3 May 2017 23:19:51 +0000 Subject: [petsc-users] strange convergence In-Reply-To: References: <7891536D-91FE-4BFF-8DAD-CE7AB85A4E57@mcs.anl.gov> <425BBB58-9721-49F3-8C86-940F08E925F7@mcs.anl.gov> <42EB791A-40C2-439F-A5F7-5F8C15CECA6F@mcs.anl.gov> <82193784-B4C4-47D7-80EA-25F549C9091B@mcs.anl.gov> <87wpa3wd5j.fsf@jedbrown.org> Message-ID: <4E1AE917-3FB2-4EE9-88A5-AB3EA612D5D6@glasgow.ac.uk> On 3 May 2017, at 22:19, Hoang Giang Bui > wrote: Hi Lukasz, thanks for sharing very interesting slide. Both of you are right, the mortar method starts from continuum argument then reduce to discrete space by discretizing the Lagrange multiplier. However, the way to choose the interpolation space has some implication on the properties of the mortar matrices. For example, the dual mortar space can help to reduce the multiplier by static condensation but it creates some numerical oscillation. In my opinion I think it's not stable despite a very sound theoretical foundation is developed. Both standard and dual mortar approach impose some drawbacks for high order contact because of negative shape function can create some spurious negative nodal gap. How do you cope with that case in your code? We exploit that we can set approx. order independently to Lagrange multiplier and displacements. Having hierarchical base, we have dofs on vertices, edges, faces and volumes (in a case of displacements) we can apply order to each entity independently. This gives us some control over spurious modes, we do not see them at the moment, but this is work in progress, and we do not enough testing. It is as well important where you approximate Lagrange multipliers, master or slave side. We have as well some flexibility of choosing a base; right choice would be to use Bernstein polynomial which has only positive values. We do have them yet, but we can use now Legendre, Lobatto (integrated Legendre) or Jacobi. Many things to test, more to read, and see what will happen. However, this question may be a bit off-topic. Come back to the main question, for mesh "gluing" using mortar method, the Schur matrix is S=[0 -D^T M^T] A^-1 [0 D M]^T, has the form of A_10 (A_00)^-1 A01 since A_11=0. The magnitude of S (~E^-1) is too small compare to A_00 (which is ~E for elasticity). I think in some case it's also rank deficient if three Lagrange multiplier is used per node (S is very ill-conditioned although A_00 is well). I'm skeptical here do you really solve the system of mortar with schur complement? You could be right here about Schur complement. We do not try this yet, but note that you can multiply constraint by the scalar, this is exploited for example in Popp_et_al-2009-A finite deformation mortar contact formulation using a primal?dual active set strategy, where constrain equation is scaled by constant equal to the young modulus. Lukasz Giang On Wed, May 3, 2017 at 2:55 PM, Lukasz Kaczmarczyk > wrote: On 3 May 2017, at 13:22, Matthew Knepley > wrote: On Wed, May 3, 2017 at 2:29 AM, Hoang Giang Bui > wrote: Dear Jed If I understood you correctly you suggest to avoid penalty by using the Lagrange multiplier for the mortar constraint? In this case it leads to the use of discrete Lagrange multiplier space. Sorry for being ignorant here, but why is the space "discrete"? It looks like you should have a continuum formulation of the mortar as well. Maybe I do not understand something fundamental. 
From this (https://en.wikipedia.org/wiki/Mortar_methods) short description, it seems that mortars begin from a continuum formulation, but are then reduced to the discrete level. This is no problem if done consistently, as for instance in the FETI method where efficient preconditioners exist. Hello, I copied the wrong link to mortar method, how we implemented it, see presentation http://doi.org/10.5281/zenodo.556996 You right that we always start from continuum formulation, on this we apply some discretisation, at the end Lagrange multiplier is expressed by a finite vector of discrete unknowns. It is better to formulate problem first for the continuum; you have better control on what you are doing and stability of the solution. Of course, you can add some constraints at the discreet level, after you discretised problem, but implicitly you have some continuous space for Lagrange multipliers, which is associated with shape functions which you use to discretise problem. In our problem which we have, we try to avoid rebuilding of the system of equations each time contact area is changing. We going to construct DM sub-problem for each body in contact, each sub-problem going to be solved using MG (adjacency for those matrices is fixed in time). All will go to put in nested matrix with the separate block for Lagrange multipliers (adjacency will change in each time step). For solving Lagrange multipliers we going to use FIELDSPLIT using Schur complement. I need to look more detail to FETI method, at are still at development stage for contact problem and direct solver works, for now, small problems at that point. In our code, we using higher order elements with hierarchical base, for this we using specialise MG solver, as you can see here, it works pretty well for moderate size problems, <100M http://mofem.eng.gla.ac.uk/mofem/html/_p_c_m_g_set_up_via_approx_orders_8cpp.html Regards, Lukasz Thanks, Matt Do you or anyone already have experience using discrete Lagrange multiplier space with Petsc? There is also similar question on stackexchange https://scicomp.stackexchange.com/questions/25113/preconditioners-and-discrete-lagrange-multipliers Giang On Sat, Apr 29, 2017 at 3:34 PM, Jed Brown > wrote: Hoang Giang Bui > writes: > Hi Barry > > The first block is from a standard solid mechanics discretization based on > balance of momentum equation. There is some material involved but in > principal it's well-posed elasticity equation with positive definite > tangent operator. The "gluing business" uses the mortar method to keep the > continuity of displacement. Instead of using Lagrange multiplier to treat > the constraint I used penalty method to penalize the energy. The > discretization form of mortar is quite simple > > \int_{\Gamma_1} { rho * (\delta u_1 - \delta u_2) * (u_1 - u_2) dA } > > rho is penalty parameter. In the simulation I initially set it low (~E) to > preserve the conditioning of the system. There are two things that can go wrong here with AMG: * The penalty term can mess up the strength of connection heuristics such that you get poor choice of C-points (classical AMG like BoomerAMG) or poor choice of aggregates (smoothed aggregation). * The penalty term can prevent Jacobi smoothing from being effective; in this case, it can lead to poor coarse basis functions (higher energy than they should be) and poor smoothing in an MG cycle. You can fix the poor smoothing in the MG cycle by using a stronger smoother, like ASM with some overlap. 
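As a rough illustration of the stronger-smoother suggestion above, in PETSc options (a sketch only, using GAMG for the AMG; the right smoother settings are problem-dependent):

-pc_type gamg
-mg_levels_ksp_type richardson
-mg_levels_pc_type asm
-mg_levels_pc_asm_overlap 1
-mg_levels_sub_pc_type ilu

When the AMG sits inside a fieldsplit, the same options are spelled with the split prefix, e.g. -fieldsplit_u_mg_levels_pc_type asm; BoomerAMG selects its smoother through its own -pc_hypre_boomeramg_relax_type_* options instead.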
I'm generally not a fan of penalty methods due to the irritating tradeoffs and often poor solver performance. > In the figure below, the colorful blocks are u_1 and the base is u_2. Both > u_1 and u_2 use isoparametric quadratic approximation. > > ? > Snapshot.png > > ??? > > Giang > > On Fri, Apr 28, 2017 at 6:21 PM, Barry Smith > wrote: > >> >> Ok, so boomerAMG algebraic multigrid is not good for the first block. >> You mentioned the first block has two things glued together? AMG is >> fantastic for certain problems but doesn't work for everything. >> >> Tell us more about the first block, what PDE it comes from, what >> discretization, and what the "gluing business" is and maybe we'll have >> suggestions for how to precondition it. >> >> Barry >> >> > On Apr 28, 2017, at 3:56 AM, Hoang Giang Bui > wrote: >> > >> > It's in fact quite good >> > >> > Residual norms for fieldsplit_u_ solve. >> > 0 KSP Residual norm 4.014715925568e+00 >> > 1 KSP Residual norm 2.160497019264e-10 >> > Residual norms for fieldsplit_wp_ solve. >> > 0 KSP Residual norm 0.000000000000e+00 >> > 0 KSP preconditioned resid norm 4.014715925568e+00 true resid norm >> 9.006493082896e+06 ||r(i)||/||b|| 1.000000000000e+00 >> > Residual norms for fieldsplit_u_ solve. >> > 0 KSP Residual norm 9.999999999416e-01 >> > 1 KSP Residual norm 7.118380416383e-11 >> > Residual norms for fieldsplit_wp_ solve. >> > 0 KSP Residual norm 0.000000000000e+00 >> > 1 KSP preconditioned resid norm 1.701150951035e-10 true resid norm >> 5.494262251846e-04 ||r(i)||/||b|| 6.100334726599e-11 >> > Linear solve converged due to CONVERGED_ATOL iterations 1 >> > >> > Giang >> > >> > On Thu, Apr 27, 2017 at 5:25 PM, Barry Smith > wrote: >> > >> > Run again using LU on both blocks to see what happens. >> > >> > >> > > On Apr 27, 2017, at 2:14 AM, Hoang Giang Bui > >> wrote: >> > > >> > > I have changed the way to tie the nonconforming mesh. It seems the >> matrix now is better >> > > >> > > with -pc_type lu the output is >> > > 0 KSP preconditioned resid norm 3.308678584240e-01 true resid norm >> 9.006493082896e+06 ||r(i)||/||b|| 1.000000000000e+00 >> > > 1 KSP preconditioned resid norm 2.004313395301e-12 true resid norm >> 2.549872332830e-05 ||r(i)||/||b|| 2.831148938173e-12 >> > > Linear solve converged due to CONVERGED_ATOL iterations 1 >> > > >> > > >> > > with -pc_type fieldsplit -fieldsplit_u_pc_type hypre >> -fieldsplit_wp_pc_type lu the convergence is slow >> > > 0 KSP preconditioned resid norm 1.116302362553e-01 true resid norm >> 9.006493083520e+06 ||r(i)||/||b|| 1.000000000000e+00 >> > > 1 KSP preconditioned resid norm 2.582134825666e-02 true resid norm >> 9.268347719866e+06 ||r(i)||/||b|| 1.029073984060e+00 >> > > ... >> > > 824 KSP preconditioned resid norm 1.018542387738e-09 true resid norm >> 2.906608839310e+02 ||r(i)||/||b|| 3.227237074804e-05 >> > > 825 KSP preconditioned resid norm 9.743727947637e-10 true resid norm >> 2.820369993061e+02 ||r(i)||/||b|| 3.131485215062e-05 >> > > Linear solve converged due to CONVERGED_ATOL iterations 825 >> > > >> > > checking with additional -fieldsplit_u_ksp_type richardson >> -fieldsplit_u_ksp_monitor -fieldsplit_u_ksp_max_it 1 >> -fieldsplit_wp_ksp_type richardson -fieldsplit_wp_ksp_monitor >> -fieldsplit_wp_ksp_max_it 1 gives >> > > >> > > 0 KSP preconditioned resid norm 1.116302362553e-01 true resid norm >> 9.006493083520e+06 ||r(i)||/||b|| 1.000000000000e+00 >> > > Residual norms for fieldsplit_u_ solve. 
>> > > 0 KSP Residual norm 5.803507549280e-01 >> > > 1 KSP Residual norm 2.069538175950e-01 >> > > Residual norms for fieldsplit_wp_ solve. >> > > 0 KSP Residual norm 0.000000000000e+00 >> > > 1 KSP preconditioned resid norm 2.582134825666e-02 true resid norm >> 9.268347719866e+06 ||r(i)||/||b|| 1.029073984060e+00 >> > > Residual norms for fieldsplit_u_ solve. >> > > 0 KSP Residual norm 7.831796195225e-01 >> > > 1 KSP Residual norm 1.734608520110e-01 >> > > Residual norms for fieldsplit_wp_ solve. >> > > 0 KSP Residual norm 0.000000000000e+00 >> > > .... >> > > 823 KSP preconditioned resid norm 1.065070135605e-09 true resid norm >> 3.081881356833e+02 ||r(i)||/||b|| 3.421843916665e-05 >> > > Residual norms for fieldsplit_u_ solve. >> > > 0 KSP Residual norm 6.113806394327e-01 >> > > 1 KSP Residual norm 1.535465290944e-01 >> > > Residual norms for fieldsplit_wp_ solve. >> > > 0 KSP Residual norm 0.000000000000e+00 >> > > 824 KSP preconditioned resid norm 1.018542387746e-09 true resid norm >> 2.906608839353e+02 ||r(i)||/||b|| 3.227237074851e-05 >> > > Residual norms for fieldsplit_u_ solve. >> > > 0 KSP Residual norm 6.123437055586e-01 >> > > 1 KSP Residual norm 1.524661826133e-01 >> > > Residual norms for fieldsplit_wp_ solve. >> > > 0 KSP Residual norm 0.000000000000e+00 >> > > 825 KSP preconditioned resid norm 9.743727947718e-10 true resid norm >> 2.820369990571e+02 ||r(i)||/||b|| 3.131485212298e-05 >> > > Linear solve converged due to CONVERGED_ATOL iterations 825 >> > > >> > > >> > > The residual for wp block is zero since in this first step the rhs is >> zero. As can see in the output, the multigrid does not perform well to >> reduce the residual in the sub-solve. Is my observation right? what can be >> done to improve this? >> > > >> > > >> > > Giang >> > > >> > > On Tue, Apr 25, 2017 at 12:17 AM, Barry Smith > >> wrote: >> > > >> > > This can happen in the matrix is singular or nearly singular or if >> the factorization generates small pivots, which can occur for even >> nonsingular problems if the matrix is poorly scaled or just plain nasty. 
>> > > >> > > >> > > > On Apr 24, 2017, at 5:10 PM, Hoang Giang Bui > >> wrote: >> > > > >> > > > It took a while, here I send you the output >> > > > >> > > > 0 KSP preconditioned resid norm 3.129073545457e+05 true resid norm >> 9.015150492169e+06 ||r(i)||/||b|| 1.000000000000e+00 >> > > > 1 KSP preconditioned resid norm 7.442444222843e-01 true resid norm >> 1.003356247696e+02 ||r(i)||/||b|| 1.112966720375e-05 >> > > > 2 KSP preconditioned resid norm 3.267453132529e-07 true resid norm >> 3.216722968300e+01 ||r(i)||/||b|| 3.568130084011e-06 >> > > > 3 KSP preconditioned resid norm 1.155046883816e-11 true resid norm >> 3.234460376820e+01 ||r(i)||/||b|| 3.587805194854e-06 >> > > > Linear solve converged due to CONVERGED_ATOL iterations 3 >> > > > KSP Object: 4 MPI processes >> > > > type: gmres >> > > > GMRES: restart=1000, using Modified Gram-Schmidt >> Orthogonalization >> > > > GMRES: happy breakdown tolerance 1e-30 >> > > > maximum iterations=1000, initial guess is zero >> > > > tolerances: relative=1e-20, absolute=1e-09, divergence=10000 >> > > > left preconditioning >> > > > using PRECONDITIONED norm type for convergence test >> > > > PC Object: 4 MPI processes >> > > > type: lu >> > > > LU: out-of-place factorization >> > > > tolerance for zero pivot 2.22045e-14 >> > > > matrix ordering: natural >> > > > factor fill ratio given 0, needed 0 >> > > > Factored matrix follows: >> > > > Mat Object: 4 MPI processes >> > > > type: mpiaij >> > > > rows=973051, cols=973051 >> > > > package used to perform factorization: pastix >> > > > Error : 3.24786e-14 >> > > > total: nonzeros=0, allocated nonzeros=0 >> > > > total number of mallocs used during MatSetValues calls =0 >> > > > PaStiX run parameters: >> > > > Matrix type : Unsymmetric >> > > > Level of printing (0,1,2): 0 >> > > > Number of refinements iterations : 3 >> > > > Error : 3.24786e-14 >> > > > linear system matrix = precond matrix: >> > > > Mat Object: 4 MPI processes >> > > > type: mpiaij >> > > > rows=973051, cols=973051 >> > > > Error : 3.24786e-14 >> > > > total: nonzeros=9.90037e+07, allocated nonzeros=9.90037e+07 >> > > > total number of mallocs used during MatSetValues calls =0 >> > > > using I-node (on process 0) routines: found 78749 nodes, limit >> used is 5 >> > > > Error : 3.24786e-14 >> > > > >> > > > It doesn't do as you said. Something is not right here. I will look >> in depth. >> > > > >> > > > Giang >> > > > >> > > > On Mon, Apr 24, 2017 at 8:21 PM, Barry Smith > >> wrote: >> > > > >> > > > > On Apr 24, 2017, at 12:47 PM, Hoang Giang Bui > >> wrote: >> > > > > >> > > > > Good catch. I get this for the very first step, maybe at that time >> the rhs_w is zero. >> > > > >> > > > With the multiplicative composition the right hand side of the >> second solve is the initial right hand side of the second solve minus >> A_10*x where x is the solution to the first sub solve and A_10 is the lower >> left block of the outer matrix. So unless both the initial right hand side >> has a zero for the second block and A_10 is identically zero the right hand >> side for the second sub solve should not be zero. Is A_10 == 0? >> > > > >> > > > >> > > > > In the later step, it shows 2 step convergence >> > > > > >> > > > > Residual norms for fieldsplit_u_ solve. >> > > > > 0 KSP Residual norm 3.165886479830e+04 >> > > > > 1 KSP Residual norm 2.905922877684e-01 >> > > > > Residual norms for fieldsplit_wp_ solve. 
>> > > > > 0 KSP Residual norm 2.397669419027e-01 >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> > > > > 0 KSP preconditioned resid norm 3.165886479920e+04 true resid >> norm 7.963616922323e+05 ||r(i)||/||b|| 1.000000000000e+00 >> > > > > Residual norms for fieldsplit_u_ solve. >> > > > > 0 KSP Residual norm 9.999891813771e-01 >> > > > > 1 KSP Residual norm 1.512000395579e-05 >> > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > 0 KSP Residual norm 8.192702188243e-06 >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> > > > > 1 KSP preconditioned resid norm 5.252183822848e-02 true resid >> norm 7.135927677844e+04 ||r(i)||/||b|| 8.960661653427e-02 >> > > > >> > > > The outer residual norms are still wonky, the preconditioned >> residual norm goes from 3.165886479920e+04 to 5.252183822848e-02 which is a >> huge drop but the 7.963616922323e+05 drops very much less >> 7.135927677844e+04. This is not normal. >> > > > >> > > > What if you just use -pc_type lu for the entire system (no >> fieldsplit), does the true residual drop to almost zero in the first >> iteration (as it should?). Send the output. >> > > > >> > > > >> > > > >> > > > > Residual norms for fieldsplit_u_ solve. >> > > > > 0 KSP Residual norm 6.946213936597e-01 >> > > > > 1 KSP Residual norm 1.195514007343e-05 >> > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > 0 KSP Residual norm 1.025694497535e+00 >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> > > > > 2 KSP preconditioned resid norm 8.785709535405e-03 true resid >> norm 1.419341799277e+04 ||r(i)||/||b|| 1.782282866091e-02 >> > > > > Residual norms for fieldsplit_u_ solve. >> > > > > 0 KSP Residual norm 7.255149996405e-01 >> > > > > 1 KSP Residual norm 6.583512434218e-06 >> > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > 0 KSP Residual norm 1.015229700337e+00 >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> > > > > 3 KSP preconditioned resid norm 7.110407712709e-04 true resid >> norm 5.284940654154e+02 ||r(i)||/||b|| 6.636357205153e-04 >> > > > > Residual norms for fieldsplit_u_ solve. >> > > > > 0 KSP Residual norm 3.512243341400e-01 >> > > > > 1 KSP Residual norm 2.032490351200e-06 >> > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > 0 KSP Residual norm 1.282327290982e+00 >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> > > > > 4 KSP preconditioned resid norm 3.482036620521e-05 true resid >> norm 4.291231924307e+01 ||r(i)||/||b|| 5.388546393133e-05 >> > > > > Residual norms for fieldsplit_u_ solve. >> > > > > 0 KSP Residual norm 3.423609338053e-01 >> > > > > 1 KSP Residual norm 4.213703301972e-07 >> > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > 0 KSP Residual norm 1.157384757538e+00 >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> > > > > 5 KSP preconditioned resid norm 1.203470314534e-06 true resid >> norm 4.544956156267e+00 ||r(i)||/||b|| 5.707150658550e-06 >> > > > > Residual norms for fieldsplit_u_ solve. >> > > > > 0 KSP Residual norm 3.838596289995e-01 >> > > > > 1 KSP Residual norm 9.927864176103e-08 >> > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > 0 KSP Residual norm 1.066298905618e+00 >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> > > > > 6 KSP preconditioned resid norm 3.331619244266e-08 true resid >> norm 2.821511729024e+00 ||r(i)||/||b|| 3.543002829675e-06 >> > > > > Residual norms for fieldsplit_u_ solve. >> > > > > 0 KSP Residual norm 4.624964188094e-01 >> > > > > 1 KSP Residual norm 6.418229775372e-08 >> > > > > Residual norms for fieldsplit_wp_ solve. 
>> > > > > 0 KSP Residual norm 9.800784311614e-01 >> > > > > 1 KSP Residual norm 0.000000000000e+00 >> > > > > 7 KSP preconditioned resid norm 8.788046233297e-10 true resid >> norm 2.849209671705e+00 ||r(i)||/||b|| 3.577783436215e-06 >> > > > > Linear solve converged due to CONVERGED_ATOL iterations 7 >> > > > > >> > > > > The outer operator is an explicit matrix. >> > > > > >> > > > > Giang >> > > > > >> > > > > On Mon, Apr 24, 2017 at 7:32 PM, Barry Smith > >> wrote: >> > > > > >> > > > > > On Apr 24, 2017, at 3:16 AM, Hoang Giang Bui > >> wrote: >> > > > > > >> > > > > > Thanks Barry, trying with -fieldsplit_u_type lu gives better >> convergence. I still used 4 procs though, probably with 1 proc it should >> also be the same. >> > > > > > >> > > > > > The u block used a Nitsche-type operator to connect two >> non-matching domains. I don't think it will leave some rigid body motion >> leads to not sufficient constraints. Maybe you have other idea? >> > > > > > >> > > > > > Residual norms for fieldsplit_u_ solve. >> > > > > > 0 KSP Residual norm 3.129067184300e+05 >> > > > > > 1 KSP Residual norm 5.906261468196e-01 >> > > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > > 0 KSP Residual norm 0.000000000000e+00 >> > > > > >> > > > > ^^^^ something is wrong here. The sub solve should not be >> starting with a 0 residual (this means the right hand side for this sub >> solve is zero which it should not be). >> > > > > >> > > > > > FieldSplit with MULTIPLICATIVE composition: total splits = 2 >> > > > > >> > > > > >> > > > > How are you providing the outer operator? As an explicit matrix >> or with some shell matrix? >> > > > > >> > > > > >> > > > > >> > > > > > 0 KSP preconditioned resid norm 3.129067184300e+05 true resid >> norm 9.015150492169e+06 ||r(i)||/||b|| 1.000000000000e+00 >> > > > > > Residual norms for fieldsplit_u_ solve. >> > > > > > 0 KSP Residual norm 9.999955993437e-01 >> > > > > > 1 KSP Residual norm 4.019774691831e-06 >> > > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > > 0 KSP Residual norm 0.000000000000e+00 >> > > > > > 1 KSP preconditioned resid norm 5.003913641475e-01 true resid >> norm 4.692996324114e+01 ||r(i)||/||b|| 5.205677185522e-06 >> > > > > > Residual norms for fieldsplit_u_ solve. >> > > > > > 0 KSP Residual norm 1.000012180204e+00 >> > > > > > 1 KSP Residual norm 1.017367950422e-05 >> > > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > > 0 KSP Residual norm 0.000000000000e+00 >> > > > > > 2 KSP preconditioned resid norm 2.330910333756e-07 true resid >> norm 3.474855463983e+01 ||r(i)||/||b|| 3.854461960453e-06 >> > > > > > Residual norms for fieldsplit_u_ solve. >> > > > > > 0 KSP Residual norm 1.000004200085e+00 >> > > > > > 1 KSP Residual norm 6.231613102458e-06 >> > > > > > Residual norms for fieldsplit_wp_ solve. 
>> > > > > > 0 KSP Residual norm 0.000000000000e+00 >> > > > > > 3 KSP preconditioned resid norm 8.671259838389e-11 true resid >> norm 3.545103468011e+01 ||r(i)||/||b|| 3.932384125024e-06 >> > > > > > Linear solve converged due to CONVERGED_ATOL iterations 3 >> > > > > > KSP Object: 4 MPI processes >> > > > > > type: gmres >> > > > > > GMRES: restart=1000, using Modified Gram-Schmidt >> Orthogonalization >> > > > > > GMRES: happy breakdown tolerance 1e-30 >> > > > > > maximum iterations=1000, initial guess is zero >> > > > > > tolerances: relative=1e-20, absolute=1e-09, divergence=10000 >> > > > > > left preconditioning >> > > > > > using PRECONDITIONED norm type for convergence test >> > > > > > PC Object: 4 MPI processes >> > > > > > type: fieldsplit >> > > > > > FieldSplit with MULTIPLICATIVE composition: total splits = 2 >> > > > > > Solver info for each split is in the following KSP objects: >> > > > > > Split number 0 Defined by IS >> > > > > > KSP Object: (fieldsplit_u_) 4 MPI processes >> > > > > > type: richardson >> > > > > > Richardson: damping factor=1 >> > > > > > maximum iterations=1, initial guess is zero >> > > > > > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000 >> > > > > > left preconditioning >> > > > > > using PRECONDITIONED norm type for convergence test >> > > > > > PC Object: (fieldsplit_u_) 4 MPI processes >> > > > > > type: lu >> > > > > > LU: out-of-place factorization >> > > > > > tolerance for zero pivot 2.22045e-14 >> > > > > > matrix ordering: natural >> > > > > > factor fill ratio given 0, needed 0 >> > > > > > Factored matrix follows: >> > > > > > Mat Object: 4 MPI processes >> > > > > > type: mpiaij >> > > > > > rows=938910, cols=938910 >> > > > > > package used to perform factorization: pastix >> > > > > > total: nonzeros=0, allocated nonzeros=0 >> > > > > > Error : 3.36878e-14 >> > > > > > total number of mallocs used during MatSetValues calls >> =0 >> > > > > > PaStiX run parameters: >> > > > > > Matrix type : Unsymmetric >> > > > > > Level of printing (0,1,2): 0 >> > > > > > Number of refinements iterations : 3 >> > > > > > Error : 3.36878e-14 >> > > > > > linear system matrix = precond matrix: >> > > > > > Mat Object: (fieldsplit_u_) 4 MPI processes >> > > > > > type: mpiaij >> > > > > > rows=938910, cols=938910, bs=3 >> > > > > > Error : 3.36878e-14 >> > > > > > Error : 3.36878e-14 >> > > > > > total: nonzeros=8.60906e+07, allocated >> nonzeros=8.60906e+07 >> > > > > > total number of mallocs used during MatSetValues calls =0 >> > > > > > using I-node (on process 0) routines: found 78749 >> nodes, limit used is 5 >> > > > > > Split number 1 Defined by IS >> > > > > > KSP Object: (fieldsplit_wp_) 4 MPI processes >> > > > > > type: richardson >> > > > > > Richardson: damping factor=1 >> > > > > > maximum iterations=1, initial guess is zero >> > > > > > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000 >> > > > > > left preconditioning >> > > > > > using PRECONDITIONED norm type for convergence test >> > > > > > PC Object: (fieldsplit_wp_) 4 MPI processes >> > > > > > type: lu >> > > > > > LU: out-of-place factorization >> > > > > > tolerance for zero pivot 2.22045e-14 >> > > > > > matrix ordering: natural >> > > > > > factor fill ratio given 0, needed 0 >> > > > > > Factored matrix follows: >> > > > > > Mat Object: 4 MPI processes >> > > > > > type: mpiaij >> > > > > > rows=34141, cols=34141 >> > > > > > package used to perform factorization: pastix >> > > > > > Error : -nan >> > > > > > Error : -nan >> > > > > > Error : 
-nan >> > > > > > total: nonzeros=0, allocated nonzeros=0 >> > > > > > total number of mallocs used during MatSetValues >> calls =0 >> > > > > > PaStiX run parameters: >> > > > > > Matrix type : Symmetric >> > > > > > Level of printing (0,1,2): 0 >> > > > > > Number of refinements iterations : 0 >> > > > > > Error : -nan >> > > > > > linear system matrix = precond matrix: >> > > > > > Mat Object: (fieldsplit_wp_) 4 MPI processes >> > > > > > type: mpiaij >> > > > > > rows=34141, cols=34141 >> > > > > > total: nonzeros=485655, allocated nonzeros=485655 >> > > > > > total number of mallocs used during MatSetValues calls =0 >> > > > > > not using I-node (on process 0) routines >> > > > > > linear system matrix = precond matrix: >> > > > > > Mat Object: 4 MPI processes >> > > > > > type: mpiaij >> > > > > > rows=973051, cols=973051 >> > > > > > total: nonzeros=9.90037e+07, allocated nonzeros=9.90037e+07 >> > > > > > total number of mallocs used during MatSetValues calls =0 >> > > > > > using I-node (on process 0) routines: found 78749 nodes, >> limit used is 5 >> > > > > > >> > > > > > >> > > > > > >> > > > > > Giang >> > > > > > >> > > > > > On Sun, Apr 23, 2017 at 10:19 PM, Barry Smith < >> bsmith at mcs.anl.gov> wrote: >> > > > > > >> > > > > > > On Apr 23, 2017, at 2:42 PM, Hoang Giang Bui < >> hgbk2008 at gmail.com> wrote: >> > > > > > > >> > > > > > > Dear Matt/Barry >> > > > > > > >> > > > > > > With your options, it results in >> > > > > > > >> > > > > > > 0 KSP preconditioned resid norm 1.106709687386e+31 true >> resid norm 9.015150491938e+06 ||r(i)||/||b|| 1.000000000000e+00 >> > > > > > > Residual norms for fieldsplit_u_ solve. >> > > > > > > 0 KSP Residual norm 2.407308987203e+36 >> > > > > > > 1 KSP Residual norm 5.797185652683e+72 >> > > > > > >> > > > > > It looks like Matt is right, hypre is seemly producing useless >> garbage. >> > > > > > >> > > > > > First how do things run on one process. If you have similar >> problems then debug on one process (debugging any kind of problem is always >> far easy on one process). >> > > > > > >> > > > > > First run with -fieldsplit_u_type lu (instead of using hypre) to >> see if that works or also produces something bad. >> > > > > > >> > > > > > What is the operator and the boundary conditions for u? It could >> be singular. >> > > > > > >> > > > > > >> > > > > > >> > > > > > >> > > > > > >> > > > > > >> > > > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > > > 0 KSP Residual norm 0.000000000000e+00 >> > > > > > > ... >> > > > > > > 999 KSP preconditioned resid norm 2.920157329174e+12 true >> resid norm 9.015683504616e+06 ||r(i)||/||b|| 1.000059124102e+00 >> > > > > > > Residual norms for fieldsplit_u_ solve. >> > > > > > > 0 KSP Residual norm 1.533726746719e+36 >> > > > > > > 1 KSP Residual norm 3.692757392261e+72 >> > > > > > > Residual norms for fieldsplit_wp_ solve. >> > > > > > > 0 KSP Residual norm 0.000000000000e+00 >> > > > > > > >> > > > > > > Do you suggest that the pastix solver for the "wp" block >> encounters small pivot? In addition, seem like the "u" block is also >> singular. >> > > > > > > >> > > > > > > Giang >> > > > > > > >> > > > > > > On Sun, Apr 23, 2017 at 7:39 PM, Barry Smith < >> bsmith at mcs.anl.gov> wrote: >> > > > > > > >> > > > > > > Huge preconditioned norms but normal unpreconditioned norms >> almost always come from a very small pivot in an LU or ILU factorization. >> > > > > > > >> > > > > > > The first thing to do is monitor the two sub solves. 
Run >> with the additional options -fieldsplit_u_ksp_type richardson >> -fieldsplit_u_ksp_monitor -fieldsplit_u_ksp_max_it 1 >> -fieldsplit_wp_ksp_type richardson -fieldsplit_wp_ksp_monitor >> -fieldsplit_wp_ksp_max_it 1 >> > > > > > > >> > > > > > > > On Apr 23, 2017, at 12:22 PM, Hoang Giang Bui < >> hgbk2008 at gmail.com> wrote: >> > > > > > > > >> > > > > > > > Hello >> > > > > > > > >> > > > > > > > I encountered a strange convergence behavior that I have >> trouble to understand >> > > > > > > > >> > > > > > > > KSPSetFromOptions completed >> > > > > > > > 0 KSP preconditioned resid norm 1.106709687386e+31 true >> resid norm 9.015150491938e+06 ||r(i)||/||b|| 1.000000000000e+00 >> > > > > > > > 1 KSP preconditioned resid norm 2.933141742664e+29 true >> resid norm 9.015152282123e+06 ||r(i)||/||b|| 1.000000198575e+00 >> > > > > > > > 2 KSP preconditioned resid norm 9.686409637174e+16 true >> resid norm 9.015354521944e+06 ||r(i)||/||b|| 1.000022631902e+00 >> > > > > > > > 3 KSP preconditioned resid norm 4.219243615809e+15 true >> resid norm 9.017157702420e+06 ||r(i)||/||b|| 1.000222648583e+00 >> > > > > > > > ..... >> > > > > > > > 999 KSP preconditioned resid norm 3.043754298076e+12 true >> resid norm 9.015425041089e+06 ||r(i)||/||b|| 1.000030454195e+00 >> > > > > > > > 1000 KSP preconditioned resid norm 3.043000287819e+12 true >> resid norm 9.015424313455e+06 ||r(i)||/||b|| 1.000030373483e+00 >> > > > > > > > Linear solve did not converge due to DIVERGED_ITS iterations >> 1000 >> > > > > > > > KSP Object: 4 MPI processes >> > > > > > > > type: gmres >> > > > > > > > GMRES: restart=1000, using Modified Gram-Schmidt >> Orthogonalization >> > > > > > > > GMRES: happy breakdown tolerance 1e-30 >> > > > > > > > maximum iterations=1000, initial guess is zero >> > > > > > > > tolerances: relative=1e-20, absolute=1e-09, >> divergence=10000 >> > > > > > > > left preconditioning >> > > > > > > > using PRECONDITIONED norm type for convergence test >> > > > > > > > PC Object: 4 MPI processes >> > > > > > > > type: fieldsplit >> > > > > > > > FieldSplit with MULTIPLICATIVE composition: total splits >> = 2 >> > > > > > > > Solver info for each split is in the following KSP >> objects: >> > > > > > > > Split number 0 Defined by IS >> > > > > > > > KSP Object: (fieldsplit_u_) 4 MPI processes >> > > > > > > > type: preonly >> > > > > > > > maximum iterations=10000, initial guess is zero >> > > > > > > > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000 >> > > > > > > > left preconditioning >> > > > > > > > using NONE norm type for convergence test >> > > > > > > > PC Object: (fieldsplit_u_) 4 MPI processes >> > > > > > > > type: hypre >> > > > > > > > HYPRE BoomerAMG preconditioning >> > > > > > > > HYPRE BoomerAMG: Cycle type V >> > > > > > > > HYPRE BoomerAMG: Maximum number of levels 25 >> > > > > > > > HYPRE BoomerAMG: Maximum number of iterations PER >> hypre call 1 >> > > > > > > > HYPRE BoomerAMG: Convergence tolerance PER hypre >> call 0 >> > > > > > > > HYPRE BoomerAMG: Threshold for strong coupling 0.6 >> > > > > > > > HYPRE BoomerAMG: Interpolation truncation factor 0 >> > > > > > > > HYPRE BoomerAMG: Interpolation: max elements per row >> 0 >> > > > > > > > HYPRE BoomerAMG: Number of levels of aggressive >> coarsening 0 >> > > > > > > > HYPRE BoomerAMG: Number of paths for aggressive >> coarsening 1 >> > > > > > > > HYPRE BoomerAMG: Maximum row sums 0.9 >> > > > > > > > HYPRE BoomerAMG: Sweeps down 1 >> > > > > > > > HYPRE BoomerAMG: Sweeps up 1 >> > > > > > > > HYPRE 
BoomerAMG: Sweeps on coarse 1 >> > > > > > > > HYPRE BoomerAMG: Relax down >> symmetric-SOR/Jacobi >> > > > > > > > HYPRE BoomerAMG: Relax up >> symmetric-SOR/Jacobi >> > > > > > > > HYPRE BoomerAMG: Relax on coarse >> Gaussian-elimination >> > > > > > > > HYPRE BoomerAMG: Relax weight (all) 1 >> > > > > > > > HYPRE BoomerAMG: Outer relax weight (all) 1 >> > > > > > > > HYPRE BoomerAMG: Using CF-relaxation >> > > > > > > > HYPRE BoomerAMG: Measure type local >> > > > > > > > HYPRE BoomerAMG: Coarsen type PMIS >> > > > > > > > HYPRE BoomerAMG: Interpolation type classical >> > > > > > > > linear system matrix = precond matrix: >> > > > > > > > Mat Object: (fieldsplit_u_) 4 MPI processes >> > > > > > > > type: mpiaij >> > > > > > > > rows=938910, cols=938910, bs=3 >> > > > > > > > total: nonzeros=8.60906e+07, allocated >> nonzeros=8.60906e+07 >> > > > > > > > total number of mallocs used during MatSetValues >> calls =0 >> > > > > > > > using I-node (on process 0) routines: found 78749 >> nodes, limit used is 5 >> > > > > > > > Split number 1 Defined by IS >> > > > > > > > KSP Object: (fieldsplit_wp_) 4 MPI processes >> > > > > > > > type: preonly >> > > > > > > > maximum iterations=10000, initial guess is zero >> > > > > > > > tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000 >> > > > > > > > left preconditioning >> > > > > > > > using NONE norm type for convergence test >> > > > > > > > PC Object: (fieldsplit_wp_) 4 MPI processes >> > > > > > > > type: lu >> > > > > > > > LU: out-of-place factorization >> > > > > > > > tolerance for zero pivot 2.22045e-14 >> > > > > > > > matrix ordering: natural >> > > > > > > > factor fill ratio given 0, needed 0 >> > > > > > > > Factored matrix follows: >> > > > > > > > Mat Object: 4 MPI processes >> > > > > > > > type: mpiaij >> > > > > > > > rows=34141, cols=34141 >> > > > > > > > package used to perform factorization: pastix >> > > > > > > > Error : -nan >> > > > > > > > Error : -nan >> > > > > > > > total: nonzeros=0, allocated nonzeros=0 >> > > > > > > > Error : -nan >> > > > > > > > total number of mallocs used during MatSetValues calls =0 >> > > > > > > > PaStiX run parameters: >> > > > > > > > Matrix type : >> Symmetric >> > > > > > > > Level of printing (0,1,2): 0 >> > > > > > > > Number of refinements iterations : 0 >> > > > > > > > Error : -nan >> > > > > > > > linear system matrix = precond matrix: >> > > > > > > > Mat Object: (fieldsplit_wp_) 4 MPI processes >> > > > > > > > type: mpiaij >> > > > > > > > rows=34141, cols=34141 >> > > > > > > > total: nonzeros=485655, allocated nonzeros=485655 >> > > > > > > > total number of mallocs used during MatSetValues >> calls =0 >> > > > > > > > not using I-node (on process 0) routines >> > > > > > > > linear system matrix = precond matrix: >> > > > > > > > Mat Object: 4 MPI processes >> > > > > > > > type: mpiaij >> > > > > > > > rows=973051, cols=973051 >> > > > > > > > total: nonzeros=9.90037e+07, allocated >> nonzeros=9.90037e+07 >> > > > > > > > total number of mallocs used during MatSetValues calls =0 >> > > > > > > > using I-node (on process 0) routines: found 78749 >> nodes, limit used is 5 >> > > > > > > > >> > > > > > > > The pattern of convergence gives a hint that this system is >> somehow bad/singular. But I don't know why the preconditioned error goes up >> too high. Anyone has an idea? 
>> > > > > > > > >> > > > > > > > Best regards >> > > > > > > > Giang Bui >> > > > > > > > >> > > > > > > >> > > > > > > >> > > > > > >> > > > > > >> > > > > >> > > > > >> > > > >> > > > >> > > >> > > >> > >> > >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Wed May 3 21:05:57 2017 From: hzhang at mcs.anl.gov (Hong) Date: Wed, 3 May 2017 21:05:57 -0500 Subject: [petsc-users] GAMG scaling In-Reply-To: References: Message-ID: I basically used 'runex56' and set '-ne' be compatible with np. Then I used option '-matptap_via scalable' '-matptap_via hypre' '-matptap_via nonscalable' I attached a job script below. In master branch, I set default as 'nonscalable' for small - medium size matrices, and automatically switch to 'scalable' when matrix size gets larger. Petsc solver uses MatPtAP, which does local RAP to reduce communication and accelerate computation. I suggest you simply use default setting. Let me know if you encounter trouble. Hong job.ne174.n8.np125.sh: runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 -ne 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1 -pc_gamg_reuse_interpolation true -ksp_converged_reason -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1 -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01 -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30 -pc_gamg_repartition false -pc_mg_cycle_type v -pc_gamg_use_parallel_coarse_grid_solver -mg_coarse_pc_type jacobi -mg_coarse_ksp_type cg -ksp_monitor -log_view -matptap_via scalable > log.ne174.n8.np125.scalable runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 -ne 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1 -pc_gamg_reuse_interpolation true -ksp_converged_reason -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1 -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01 -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30 -pc_gamg_repartition false -pc_mg_cycle_type v -pc_gamg_use_parallel_coarse_grid_solver -mg_coarse_pc_type jacobi -mg_coarse_ksp_type cg -ksp_monitor -log_view -matptap_via hypre > log.ne174.n8.np125.hypre runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 -ne 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1 -pc_gamg_reuse_interpolation true -ksp_converged_reason -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1 -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01 -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30 -pc_gamg_repartition false -pc_mg_cycle_type v 
-pc_gamg_use_parallel_coarse_grid_solver -mg_coarse_pc_type jacobi -mg_coarse_ksp_type cg -ksp_monitor -log_view -matptap_via nonscalable > log.ne174.n8.np125.nonscalable runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 -ne 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1 -pc_gamg_reuse_interpolation true -ksp_converged_reason -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1 -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01 -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30 -pc_gamg_repartition false -pc_mg_cycle_type v -pc_gamg_use_parallel_coarse_grid_solver -mg_coarse_pc_type jacobi -mg_coarse_ksp_type cg -ksp_monitor -log_view > log.ne174.n8.np125 On Wed, May 3, 2017 at 2:08 PM, Mark Adams wrote: > Hong,the input files do not seem to be accessible. What are the command > line option? (I don't see a "rap" or "scale" in the source). > > > > On Wed, May 3, 2017 at 12:17 PM, Hong wrote: > >> Mark, >> Below is the copy of my email sent to you on Feb 27: >> >> I implemented scalable MatPtAP and did comparisons of three >> implementations using ex56.c on alcf cetus machine (this machine has >> small memory, 1GB/core): >> - nonscalable PtAP: use an array of length PN to do dense axpy >> - scalable PtAP: do sparse axpy without use of PN array >> - hypre PtAP. >> >> The results are attached. Summary: >> - nonscalable PtAP is 2x faster than scalable, 8x faster than hypre PtAP >> - scalable PtAP is 4x faster than hypre PtAP >> - hypre uses less memory (see job.ne399.n63.np1000.sh) >> >> Based on above observation, I set the default PtAP algorithm as >> 'nonscalable'. >> When PN > local estimated nonzero of C=PtAP, then switch default to >> 'scalable'. >> User can overwrite default. >> >> For the case of np=8000, ne=599 (see job.ne599.n500.np8000.sh), I get >> MatPtAP 3.6224e+01 (nonscalable for small mats, >> scalable for larger ones) >> scalable MatPtAP 4.6129e+01 >> hypre 1.9389e+02 >> >> This work in on petsc-master. Give it a try. If you encounter any >> problem, let me know. >> >> Hong >> >> On Wed, May 3, 2017 at 10:01 AM, Mark Adams wrote: >> >>> (Hong), what is the current state of optimizing RAP for scaling? >>> >>> Nate, is driving 3D elasticity problems at scaling with GAMG and we are >>> working out performance problems. They are hitting problems at ~1.5B dof >>> problems on a basic Cray (XC30 I think). >>> >>> Thanks, >>> Mark >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu May 4 07:44:04 2017 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 4 May 2017 08:44:04 -0400 Subject: [petsc-users] GAMG scaling In-Reply-To: References: Message-ID: Thanks Hong, I am not seeing these options with -help ... On Wed, May 3, 2017 at 10:05 PM, Hong wrote: > I basically used 'runex56' and set '-ne' be compatible with np. > Then I used option > '-matptap_via scalable' > '-matptap_via hypre' > '-matptap_via nonscalable' > > I attached a job script below. > > In master branch, I set default as 'nonscalable' for small - medium size > matrices, and automatically switch to 'scalable' when matrix size gets > larger. > > Petsc solver uses MatPtAP, which does local RAP to reduce communication > and accelerate computation. 
> I suggest you simply use default setting. Let me know if you encounter > trouble. > > Hong > > job.ne174.n8.np125.sh: > runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 -ne > 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1 > -pc_gamg_reuse_interpolation true -ksp_converged_reason > -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg > -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1 > -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev > -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg > -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu > -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01 > -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30 > -pc_gamg_repartition false -pc_mg_cycle_type v -pc_gamg_use_parallel_coarse_grid_solver > -mg_coarse_pc_type jacobi -mg_coarse_ksp_type cg -ksp_monitor -log_view > -matptap_via scalable > log.ne174.n8.np125.scalable > > runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 -ne > 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1 > -pc_gamg_reuse_interpolation true -ksp_converged_reason > -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg > -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1 > -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev > -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg > -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu > -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01 > -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30 > -pc_gamg_repartition false -pc_mg_cycle_type v -pc_gamg_use_parallel_coarse_grid_solver > -mg_coarse_pc_type jacobi -mg_coarse_ksp_type cg -ksp_monitor -log_view > -matptap_via hypre > log.ne174.n8.np125.hypre > > runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 -ne > 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1 > -pc_gamg_reuse_interpolation true -ksp_converged_reason > -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg > -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1 > -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev > -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg > -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu > -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01 > -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30 > -pc_gamg_repartition false -pc_mg_cycle_type v -pc_gamg_use_parallel_coarse_grid_solver > -mg_coarse_pc_type jacobi -mg_coarse_ksp_type cg -ksp_monitor -log_view > -matptap_via nonscalable > log.ne174.n8.np125.nonscalable > > runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 -ne > 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1 > -pc_gamg_reuse_interpolation true -ksp_converged_reason > -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg > -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1 > -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev > -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg > -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu > -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01 > -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30 > -pc_gamg_repartition false -pc_mg_cycle_type v -pc_gamg_use_parallel_coarse_grid_solver > -mg_coarse_pc_type jacobi -mg_coarse_ksp_type cg -ksp_monitor -log_view > > log.ne174.n8.np125 > > On Wed, May 3, 2017 at 2:08 PM, Mark 
Adams wrote: > >> Hong,the input files do not seem to be accessible. What are the command >> line option? (I don't see a "rap" or "scale" in the source). >> >> >> >> On Wed, May 3, 2017 at 12:17 PM, Hong wrote: >> >>> Mark, >>> Below is the copy of my email sent to you on Feb 27: >>> >>> I implemented scalable MatPtAP and did comparisons of three >>> implementations using ex56.c on alcf cetus machine (this machine has >>> small memory, 1GB/core): >>> - nonscalable PtAP: use an array of length PN to do dense axpy >>> - scalable PtAP: do sparse axpy without use of PN array >>> - hypre PtAP. >>> >>> The results are attached. Summary: >>> - nonscalable PtAP is 2x faster than scalable, 8x faster than hypre PtAP >>> - scalable PtAP is 4x faster than hypre PtAP >>> - hypre uses less memory (see job.ne399.n63.np1000.sh) >>> >>> Based on above observation, I set the default PtAP algorithm as >>> 'nonscalable'. >>> When PN > local estimated nonzero of C=PtAP, then switch default to >>> 'scalable'. >>> User can overwrite default. >>> >>> For the case of np=8000, ne=599 (see job.ne599.n500.np8000.sh), I get >>> MatPtAP 3.6224e+01 (nonscalable for small mats, >>> scalable for larger ones) >>> scalable MatPtAP 4.6129e+01 >>> hypre 1.9389e+02 >>> >>> This work in on petsc-master. Give it a try. If you encounter any >>> problem, let me know. >>> >>> Hong >>> >>> On Wed, May 3, 2017 at 10:01 AM, Mark Adams wrote: >>> >>>> (Hong), what is the current state of optimizing RAP for scaling? >>>> >>>> Nate, is driving 3D elasticity problems at scaling with GAMG and we are >>>> working out performance problems. They are hitting problems at ~1.5B dof >>>> problems on a basic Cray (XC30 I think). >>>> >>>> Thanks, >>>> Mark >>>> >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu May 4 08:09:10 2017 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 4 May 2017 09:09:10 -0400 Subject: [petsc-users] SNES error In-Reply-To: References: <677760BF-5666-4C9D-A064-B495ACD80889@mcs.anl.gov> Message-ID: OK, that makes sense, it fails when my velocity grid gets not tiny. I can use tine velocity grids for now. 
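For reference: the function-evaluation cap that Matt describes in the quoted reply below is part of the standard SNES tolerances. A minimal, illustrative sketch of raising it, where the value 100000 is only a placeholder and not a tuned recommendation:

    -snes_max_funcs 100000

or equivalently from the application code, leaving the other tolerances at their current values:

    /* SNESSetTolerances(snes, abstol, rtol, stol, max_its, max_funcs);
       only the last argument, the maximum number of allowed function
       evaluations, is raised here; 100000 is an arbitrary example value */
    ierr = SNESSetTolerances(snes, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT, 100000);CHKERRQ(ierr);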
On Tue, May 2, 2017 at 11:18 AM, Matthew Knepley wrote: > On Tue, May 2, 2017 at 10:10 AM, Mark Adams wrote: > >> /Users/markadams/Codes/petsc/arch-macosx-gnu-O/bin/mpiexec -n 1 ./vml >> -v_coord_cylinder -x_dm_refine 2 -v_dm_refine 2 -snes_rtol 1.e-6 -snes_stol >> 1.e-6 -ts_type cn -snes_fd -pc_type lu -ksp_type preonly >> -x_petscspace_order 1 -x_petscspace_poly_tensor -v_petscspace_order 1 >> -v_petscspace_poly_tensor -ts_dt .1 -ts_max_steps 10 -ts_final_time 1e10 >> -verbose 3 -num_species 1 -snes_monitor -masses 1,2,4 -thermal_temps >> 30,30,30 -domainv_lo -2,-2 -domainv_hi 2,2 -domainx_lo -12,-12 -domainx_hi >> 12,12 -E 0,0 -blobx_radius 2 -x_dm_view hdf5:x.h5 -x_vec_view >> hdf5:x.h5::append -v_dm_view hdf5:v.h5 -v_vec_view hdf5:v.h5::append >> -x_pre_dm_view hdf5:prex.h5 -x_pre_vec_view hdf5:prex.h5::append >> -snes_converged_reason -snes_linesearch_monitor -ts_adapt_monitor >> main call SetupXDiscretization >> main call SetInitialConditionDomain >> VMLViewX DMGetOutputSequenceNumber=-1, >> cmd_str=-x_pre_vec_view >> 0) species 0: charge density= -2.3940791757186e+00, z-momentum= >> 5.9851979392559e-01, energy= 3.2314073646197e-01, thermal-flux= >> 2.4419137539877e-01 >> 0) Normalized: charge density= -2.3940791757186e+00, z >> momentum= 5.9851979392559e-01, energy= 3.2314073646197e-01, thermal flux= >> 2.4419137539877e-01, local: 64 X cells, 81 X vertices >> VMLViewX DMGetOutputSequenceNumber=0, cmd_str=(null) >> VMLViewV DMGetOutputSequenceNumber=-1 >> 0 SNES Function norm 4.097052680599e+00 >> 1 SNES Function norm 1.213148652908e-09 >> Nonlinear solve did not converge due to DIVERGED_FUNCTION_COUNT >> iterations 1 >> > > Neat! Mark, I think this has to do with you calling SNESEvaluateFunc() > inside another one. We limit the number of function evaluations > to 10,000 by default, mostly to corral line searches. I think you hit > this, and thus need to up the count. > > Thanks, > > Matt > > >> TSAdapt none step 0 stage rejected t=0 + 1.000e-01, >> nonlinear solve failures 1 greater than current TS allowed >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [0]PETSC ERROR: >> [0]PETSC ERROR: TSStep has failed due to DIVERGED_NONLINEAR_SOLVE, >> increase -ts_max_snes_failures or make negative to attempt recovery >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for trouble shooting. >> [0]PETSC ERROR: Petsc Development GIT revision: v3.7.6-3659-g699918129c >> GIT Date: 2017-04-26 08:18:35 -0400 >> [0]PETSC ERROR: ./vml on a arch-macosx-gnu-O named MarksMac-5.local by >> markadams Tue May 2 11:04:02 2017 >> [0]PETSC ERROR: Configure options --with-cc=clang --with-cc++=clang++ >> COPTFLAGS="-O3 -g -mavx2" CXXOPTFLAGS="-O3 -g -mavx2" FOPTFLAGS="-O3 -g >> -mavx2" --download-mpich=1 --download-parmetis=1 --download-metis=1 >> --download-hypre=1 --download-ml=1 --download-triangle=1 >> --download-ctetgen=1 --download-p4est=1 --with-x=0 --download-superlu_dist >> --download-superlu --download-ctetgen --with-debugging=0 --download-hdf5=1 >> PETSC_ARCH=arch-macosx-gnu-O --download-chaco --with-viewfromoptions=1 >> >> >> On Mon, May 1, 2017 at 10:25 PM, Barry Smith wrote: >> >>> >>> and >>> >>> -snes_linesearch_monitor >>> -ts_adapt_monitor >>> >>> >>> > On May 1, 2017, at 7:51 PM, Matthew Knepley wrote: >>> > >>> > Run with -snes_converged_reason. 
>>> > >>> > Matt >>> > >>> > On Mon, May 1, 2017 at 7:14 PM, Mark Adams wrote: >>> > I get this SNES failure and I don't understand what the problem is. >>> The rtol is 1.e-6 and the first iteration reduces the residual by 9 orders >>> of magnitude. Yet, TS is not satisfied. What is going on here? >>> > >>> > mpiexec -n 1 ./vml -v_coord_cylinder -x_dm_refine 2 -v_dm_refine 2 >>> -snes_rtol 1.e-6 -snes_stol 1.e-6 -ts_type cn -snes_fd -pc_type lu >>> -ksp_type preonly -x_petscspace_order 1 -x_petscspace_poly_tensor >>> -v_petscspace_order 1 -v_petscspace_poly_tensor -ts_dt .1 -ts_max_steps 10 >>> -ts_final_time 1e10 -verbose 3 -num_species 1 -snes_monitor -masses 1,2,4 >>> -thermal_temps 30,30,30 -domainv_lo -2,-2 -domainv_hi 2,2 -domainx_lo >>> -12,-12 -domainx_hi 12,12 -E 0,0 -blobx_radius 2 -x_dm_view hdf5:x.h5 >>> -x_vec_view hdf5:x.h5::append -v_dm_view hdf5:v.h5 -v_vec_view >>> hdf5:v.h5::append -x_pre_dm_view hdf5:prex.h5 -x_pre_vec_view >>> hdf5:prex.h5::append >>> > .... >>> > >>> > 0 SNES Function norm 4.097052680599e+00 >>> > 1 SNES Function norm 1.213148652908e-09 >>> > [0]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> > [0]PETSC ERROR: >>> > [0]PETSC ERROR: TSStep has failed due to DIVERGED_NONLINEAR_SOLVE, >>> increase -ts_max_snes_failures or make negative to attempt recovery >>> > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/d >>> ocumentation/faq.html for trouble shooting. >>> > [0]PETSC ERROR: Petsc Development GIT revision: >>> v3.7.6-3659-g699918129c GIT Date: 2017-04-26 08:18:35 -0400 >>> > [0]PETSC ERROR: ./vml on a arch-macosx-gnu-O named MarksMac-5.local by >>> markadams Mon May 1 19:21:32 2017 >>> > [0]PETSC ERROR: Configure options --with-cc=clang --with-cc++=clang++ >>> COPTFLAGS="-O3 -g -mavx2" CXXOPTFLAGS="-O3 -g -mavx2" FOPTFLAGS="-O3 -g >>> -mavx2" --download-mpich=1 --download-parmetis=1 --download-metis=1 >>> --download-hypre=1 --download-ml=1 --download-triangle=1 >>> --download-ctetgen=1 --download-p4est=1 --with-x=0 --download-superlu_dist >>> --download-superlu --download-ctetgen --with-debugging=0 --download-hdf5=1 >>> PETSC_ARCH=arch-macosx-gnu-O --download-chaco --with-viewfromoptions=1 >>> > >>> > >>> > >>> > >>> > -- >>> > What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> > -- Norbert Wiener >>> >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From natacha.bereux at gmail.com Thu May 4 09:17:33 2017 From: natacha.bereux at gmail.com (Natacha BEREUX) Date: Thu, 4 May 2017 16:17:33 +0200 Subject: [petsc-users] Configure nested PCFIELDSPLIT with general index sets In-Reply-To: References: <6496846F-19F8-4494-87E1-DDC390513370@imperial.ac.uk> Message-ID: Dear Matt, I re-checked the master branch. To be precise, I downloaded the nightly tarball this morning (from http://ftp.mcs.anl.gov/pub/petsc/petsc-master.tar.gz) I am sure that the Fortran interface of DMSellSetCreateFieldDecomposition is missing. And it is quite tricky to add it. I have tried to write something in src/dm/impls/shell/ftn-custom/zdmshellf.c but I am not familiar with callbacks. Any help would be greatly appreciated! 
Best regards Natacha On Fri, Apr 28, 2017 at 8:11 PM, Matthew Knepley wrote: > On Fri, Apr 28, 2017 at 1:09 PM, Matthew Knepley > wrote: > >> On Fri, Apr 28, 2017 at 11:48 AM, Natacha BEREUX < >> natacha.bereux at gmail.com> wrote: >> >>> Dear Matt, >>> Sorry for my (very) late reply. >>> I was not able to find the Fortran interface of >>> DMSellSetCreateFieldDecomposition in the late petsc-3.7.6 fortran (and >>> my code still fails to link). >>> I have the feeling that it is missing in the master branch. >>> And I was not able to get it on bitbucket either. >>> Is there a branch from which I can pull your commit ? >>> >> >> I would either: >> >> a) Use the 'next' branch >> >> or >> >> b) wait until Monday for me to merge to 'master' >> >> This merge has been held up, but can now go forward. >> > > I just checked master. It was already merged. Please recheck your master. > > Thanks, > > Matt > > >> Thanks, >> >> Matt >> >> >>> Thans a lot for your help, >>> Natacha >>> >>> On Thu, Mar 30, 2017 at 9:25 PM, Matthew Knepley >>> wrote: >>> >>>> On Wed, Mar 22, 2017 at 1:45 PM, Natacha BEREUX < >>>> natacha.bereux at gmail.com> wrote: >>>> >>>>> Hello Matt, >>>>> Thanks a lot for your answers. >>>>> Since I am working on a large FEM Fortran code, I have to stick to >>>>> Fortran. >>>>> Do you know if someone plans to add this Fortran interface? Or may be >>>>> I could do it myself ? Is this particular interface very hard to add ? >>>>> Perhaps could I mimic some other interface ? >>>>> What would you advise ? >>>>> >>>> >>>> I have added the interface in branch knepley/feature-fortran-compose. >>>> I also put this in the 'next' branch. It >>>> should make it to master soon. There is a test in >>>> sys/examples/tests/ex13f >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Best regards, >>>>> Natacha >>>>> >>>>> On Wed, Mar 22, 2017 at 12:33 PM, Matthew Knepley >>>>> wrote: >>>>> >>>>>> On Wed, Mar 22, 2017 at 10:03 AM, Natacha BEREUX < >>>>>> natacha.bereux at gmail.com> wrote: >>>>>> >>>>>>> Hello, >>>>>>> if my understanding is correct, the approach proposed by Matt and >>>>>>> Lawrence is the following : >>>>>>> - create a DMShell (DMShellCreate) >>>>>>> - define my own CreateFieldDecomposition to return the index sets I >>>>>>> need (for displacement, pressure and temperature degrees of freedom) : >>>>>>> myCreateFieldDecomposition(... ) >>>>>>> - set it in the DMShell ( DMShellSetCreateFieldDecomposition) >>>>>>> - then sets the DM in KSP context (KSPSetDM) >>>>>>> >>>>>>> I have some more questions >>>>>>> - I did not succeed in setting my own CreateFieldDecomposition in >>>>>>> the DMShell : link fails with " unknown reference to ? >>>>>>> dmshellsetcreatefielddecomposition_ ?. Could it be a Fortran >>>>>>> problem (I am using Fortran)? Is this routine available in PETSc Fortran >>>>>>> interface ? \ >>>>>>> >>>>>> >>>>>> Yes, exactly. The Fortran interface for passing function pointers is >>>>>> complex, and no one has added this function yet. >>>>>> >>>>>> >>>>>>> - CreateFieldDecomposition is supposed to return an array of dms (to >>>>>>> define the fields). I am not able to return such datas. Do I return a >>>>>>> PETSC_NULL_OBJECT instead ? >>>>>>> >>>>>> >>>>>> Yes. >>>>>> >>>>>> >>>>>>> - do I have to provide something else to define the DMShell ? >>>>>>> >>>>>> >>>>>> I think you will have to return local and global vectors, but this >>>>>> just means creating a vector of the correct size and distribution. 
>>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Thanks a lot for your help >>>>>>> Natacha >>>>>>> >>>>>>> On Tue, Mar 21, 2017 at 2:44 PM, Natacha BEREUX < >>>>>>> natacha.bereux at gmail.com> wrote: >>>>>>> >>>>>>>> Thanks for your quick answers. To be honest, I am not familiar at >>>>>>>> all with DMShells and DMPlexes. But since it is what I need, I am going to >>>>>>>> try it. >>>>>>>> Thanks again for your advices, >>>>>>>> Natacha >>>>>>>> >>>>>>>> On Tue, Mar 21, 2017 at 2:27 PM, Lawrence Mitchell < >>>>>>>> lawrence.mitchell at imperial.ac.uk> wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> > On 21 Mar 2017, at 13:24, Matthew Knepley >>>>>>>>> wrote: >>>>>>>>> > >>>>>>>>> > I think the remedy is as easy as specifying a DMShell that has a >>>>>>>>> PetscSection (DMSetDefaultSection) with your ordering, and >>>>>>>>> > I think this is how Firedrake (http://www.firedrakeproject.org/) >>>>>>>>> does it. >>>>>>>>> >>>>>>>>> We actually don't use a section, but we do provide >>>>>>>>> DMCreateFieldDecomposition_Shell. >>>>>>>>> >>>>>>>>> If you have a section that describes all the fields, then I think >>>>>>>>> if the DMShell knows about it, you effectively get the same behaviour as >>>>>>>>> DMPlex (which does the decomposition in the same manner?). >>>>>>>>> >>>>>>>>> > However, I usually use a DMPlex which knows about my >>>>>>>>> > mesh, so I am not sure if this strategy has any holes. >>>>>>>>> >>>>>>>>> I haven't noticed anything yet. >>>>>>>>> >>>>>>>>> Lawrence >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu May 4 10:07:59 2017 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 4 May 2017 10:07:59 -0500 Subject: [petsc-users] Configure nested PCFIELDSPLIT with general index sets In-Reply-To: References: <6496846F-19F8-4494-87E1-DDC390513370@imperial.ac.uk> Message-ID: On Thu, May 4, 2017 at 9:17 AM, Natacha BEREUX wrote: > Dear Matt, > I re-checked the master branch. To be precise, I downloaded the nightly > tarball this morning (from http://ftp.mcs.anl.gov/pub/ > petsc/petsc-master.tar.gz) > I am sure that the Fortran interface of DMSellSetCreateFieldDecomposition > is missing. > And it is quite tricky to add it. I have tried to write something in > src/dm/impls/shell/ftn-custom/zdmshellf.c but I am not familiar with > callbacks. > Any help would be greatly appreciated! > I added PetscObjectCompose() to Fortran, so you could compose IS objects when needed. Setting function pointers from Fortran is indeed complicated and I do not yet know how to do it. 
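A minimal sketch of the C-wrapper route suggested just below, under these assumptions: the Fortran code first attaches its index sets to the shell DM with PetscObjectCompose() (the keys "is_u", "is_p", "is_T" are placeholders), the shell callback has the same signature as DMCreateFieldDecomposition(), returning NULL name/DM lists is acceptable, and the wrapper name my_dmshell_setup plus its Fortran binding are left to the usual name-mangling or ISO C binding details:

    #include <petscdmshell.h>

    /* Callback with the (assumed) DMCreateFieldDecomposition() signature:
       hand back the index sets that were composed onto the DM beforehand. */
    static PetscErrorCode MyCreateFieldDecomposition(DM dm, PetscInt *len, char ***namelist, IS **islist, DM **dmlist)
    {
      IS             is_u, is_p, is_T;
      PetscErrorCode ierr;

      PetscFunctionBeginUser;
      ierr = PetscObjectQuery((PetscObject)dm, "is_u", (PetscObject*)&is_u);CHKERRQ(ierr);
      ierr = PetscObjectQuery((PetscObject)dm, "is_p", (PetscObject*)&is_p);CHKERRQ(ierr);
      ierr = PetscObjectQuery((PetscObject)dm, "is_T", (PetscObject*)&is_T);CHKERRQ(ierr);
      *len = 3;
      ierr = PetscMalloc1(3, islist);CHKERRQ(ierr);
      (*islist)[0] = is_u; (*islist)[1] = is_p; (*islist)[2] = is_T;
      /* the caller destroys what it receives, so take extra references */
      ierr = PetscObjectReference((PetscObject)is_u);CHKERRQ(ierr);
      ierr = PetscObjectReference((PetscObject)is_p);CHKERRQ(ierr);
      ierr = PetscObjectReference((PetscObject)is_T);CHKERRQ(ierr);
      if (namelist) *namelist = NULL;  /* assumed acceptable; names could be allocated instead */
      if (dmlist)   *dmlist   = NULL;
      PetscFunctionReturn(0);
    }

    /* Tiny C wrapper, to be called (via the usual Fortran/C binding) after the
       DMShell is created, that installs the decomposition callback. */
    PetscErrorCode my_dmshell_setup(DM dm)
    {
      PetscErrorCode ierr;
      PetscFunctionBeginUser;
      ierr = DMShellSetCreateFieldDecomposition(dm, MyCreateFieldDecomposition);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }

As noted earlier in the thread, the shell DM may also need template vectors of the right size and distribution (for example via DMShellSetGlobalVector()/DMShellSetLocalVector()) before it is handed to KSPSetDM().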
Could you submit and Issue ( https://bitbucket.org/petsc/petsc/issues?status=new&status=open) and someone will add this as soon as we have time? In the meantime, it would not be hard to create the DMShell in C and have a small C wrapper for your Fortran function to create the decomposition. Thanks, Matt > Best regards > Natacha > > On Fri, Apr 28, 2017 at 8:11 PM, Matthew Knepley > wrote: > >> On Fri, Apr 28, 2017 at 1:09 PM, Matthew Knepley >> wrote: >> >>> On Fri, Apr 28, 2017 at 11:48 AM, Natacha BEREUX < >>> natacha.bereux at gmail.com> wrote: >>> >>>> Dear Matt, >>>> Sorry for my (very) late reply. >>>> I was not able to find the Fortran interface of >>>> DMSellSetCreateFieldDecomposition in the late petsc-3.7.6 fortran (and >>>> my code still fails to link). >>>> I have the feeling that it is missing in the master branch. >>>> And I was not able to get it on bitbucket either. >>>> Is there a branch from which I can pull your commit ? >>>> >>> >>> I would either: >>> >>> a) Use the 'next' branch >>> >>> or >>> >>> b) wait until Monday for me to merge to 'master' >>> >>> This merge has been held up, but can now go forward. >>> >> >> I just checked master. It was already merged. Please recheck your master. >> >> Thanks, >> >> Matt >> >> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thans a lot for your help, >>>> Natacha >>>> >>>> On Thu, Mar 30, 2017 at 9:25 PM, Matthew Knepley >>>> wrote: >>>> >>>>> On Wed, Mar 22, 2017 at 1:45 PM, Natacha BEREUX < >>>>> natacha.bereux at gmail.com> wrote: >>>>> >>>>>> Hello Matt, >>>>>> Thanks a lot for your answers. >>>>>> Since I am working on a large FEM Fortran code, I have to stick to >>>>>> Fortran. >>>>>> Do you know if someone plans to add this Fortran interface? Or may >>>>>> be I could do it myself ? Is this particular interface very hard to add ? >>>>>> Perhaps could I mimic some other interface ? >>>>>> What would you advise ? >>>>>> >>>>> >>>>> I have added the interface in branch knepley/feature-fortran-compose. >>>>> I also put this in the 'next' branch. It >>>>> should make it to master soon. There is a test in >>>>> sys/examples/tests/ex13f >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Best regards, >>>>>> Natacha >>>>>> >>>>>> On Wed, Mar 22, 2017 at 12:33 PM, Matthew Knepley >>>>>> wrote: >>>>>> >>>>>>> On Wed, Mar 22, 2017 at 10:03 AM, Natacha BEREUX < >>>>>>> natacha.bereux at gmail.com> wrote: >>>>>>> >>>>>>>> Hello, >>>>>>>> if my understanding is correct, the approach proposed by Matt and >>>>>>>> Lawrence is the following : >>>>>>>> - create a DMShell (DMShellCreate) >>>>>>>> - define my own CreateFieldDecomposition to return the index sets I >>>>>>>> need (for displacement, pressure and temperature degrees of freedom) : >>>>>>>> myCreateFieldDecomposition(... ) >>>>>>>> - set it in the DMShell ( DMShellSetCreateFieldDecomposition) >>>>>>>> - then sets the DM in KSP context (KSPSetDM) >>>>>>>> >>>>>>>> I have some more questions >>>>>>>> - I did not succeed in setting my own CreateFieldDecomposition in >>>>>>>> the DMShell : link fails with " unknown reference to ? >>>>>>>> dmshellsetcreatefielddecomposition_ ?. Could it be a Fortran >>>>>>>> problem (I am using Fortran)? Is this routine available in PETSc Fortran >>>>>>>> interface ? \ >>>>>>>> >>>>>>> >>>>>>> Yes, exactly. The Fortran interface for passing function pointers is >>>>>>> complex, and no one has added this function yet. >>>>>>> >>>>>>> >>>>>>>> - CreateFieldDecomposition is supposed to return an array of dms >>>>>>>> (to define the fields). 
I am not able to return such datas. Do I return a >>>>>>>> PETSC_NULL_OBJECT instead ? >>>>>>>> >>>>>>> >>>>>>> Yes. >>>>>>> >>>>>>> >>>>>>>> - do I have to provide something else to define the DMShell ? >>>>>>>> >>>>>>> >>>>>>> I think you will have to return local and global vectors, but this >>>>>>> just means creating a vector of the correct size and distribution. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> Thanks a lot for your help >>>>>>>> Natacha >>>>>>>> >>>>>>>> On Tue, Mar 21, 2017 at 2:44 PM, Natacha BEREUX < >>>>>>>> natacha.bereux at gmail.com> wrote: >>>>>>>> >>>>>>>>> Thanks for your quick answers. To be honest, I am not familiar at >>>>>>>>> all with DMShells and DMPlexes. But since it is what I need, I am going to >>>>>>>>> try it. >>>>>>>>> Thanks again for your advices, >>>>>>>>> Natacha >>>>>>>>> >>>>>>>>> On Tue, Mar 21, 2017 at 2:27 PM, Lawrence Mitchell < >>>>>>>>> lawrence.mitchell at imperial.ac.uk> wrote: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> > On 21 Mar 2017, at 13:24, Matthew Knepley >>>>>>>>>> wrote: >>>>>>>>>> > >>>>>>>>>> > I think the remedy is as easy as specifying a DMShell that has >>>>>>>>>> a PetscSection (DMSetDefaultSection) with your ordering, and >>>>>>>>>> > I think this is how Firedrake (http://www.firedrakeproject.org/) >>>>>>>>>> does it. >>>>>>>>>> >>>>>>>>>> We actually don't use a section, but we do provide >>>>>>>>>> DMCreateFieldDecomposition_Shell. >>>>>>>>>> >>>>>>>>>> If you have a section that describes all the fields, then I think >>>>>>>>>> if the DMShell knows about it, you effectively get the same behaviour as >>>>>>>>>> DMPlex (which does the decomposition in the same manner?). >>>>>>>>>> >>>>>>>>>> > However, I usually use a DMPlex which knows about my >>>>>>>>>> > mesh, so I am not sure if this strategy has any holes. >>>>>>>>>> >>>>>>>>>> I haven't noticed anything yet. >>>>>>>>>> >>>>>>>>>> Lawrence >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Thu May 4 10:33:51 2017 From: hzhang at mcs.anl.gov (Hong) Date: Thu, 4 May 2017 10:33:51 -0500 Subject: [petsc-users] GAMG scaling In-Reply-To: References: Message-ID: Mark: > > I am not seeing these options with -help ... > Hmm, this might be a bug - I'll check it. Hong > > On Wed, May 3, 2017 at 10:05 PM, Hong wrote: > >> I basically used 'runex56' and set '-ne' be compatible with np. 
>> Then I used option >> '-matptap_via scalable' >> '-matptap_via hypre' >> '-matptap_via nonscalable' >> >> I attached a job script below. >> >> In master branch, I set default as 'nonscalable' for small - medium size >> matrices, and automatically switch to 'scalable' when matrix size gets >> larger. >> >> Petsc solver uses MatPtAP, which does local RAP to reduce communication >> and accelerate computation. >> I suggest you simply use default setting. Let me know if you encounter >> trouble. >> >> Hong >> >> job.ne174.n8.np125.sh: >> runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 >> -ne 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1 >> -pc_gamg_reuse_interpolation true -ksp_converged_reason >> -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg >> -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1 >> -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev >> -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg >> -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu >> -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01 >> -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30 >> -pc_gamg_repartition false -pc_mg_cycle_type v >> -pc_gamg_use_parallel_coarse_grid_solver -mg_coarse_pc_type jacobi >> -mg_coarse_ksp_type cg -ksp_monitor -log_view -matptap_via scalable > >> log.ne174.n8.np125.scalable >> >> runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 >> -ne 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1 >> -pc_gamg_reuse_interpolation true -ksp_converged_reason >> -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg >> -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1 >> -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev >> -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg >> -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu >> -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01 >> -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30 >> -pc_gamg_repartition false -pc_mg_cycle_type v >> -pc_gamg_use_parallel_coarse_grid_solver -mg_coarse_pc_type jacobi >> -mg_coarse_ksp_type cg -ksp_monitor -log_view -matptap_via hypre > >> log.ne174.n8.np125.hypre >> >> runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 >> -ne 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1 >> -pc_gamg_reuse_interpolation true -ksp_converged_reason >> -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg >> -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1 >> -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev >> -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg >> -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu >> -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01 >> -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30 >> -pc_gamg_repartition false -pc_mg_cycle_type v >> -pc_gamg_use_parallel_coarse_grid_solver -mg_coarse_pc_type jacobi >> -mg_coarse_ksp_type cg -ksp_monitor -log_view -matptap_via nonscalable > >> log.ne174.n8.np125.nonscalable >> >> runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 >> -ne 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1 >> -pc_gamg_reuse_interpolation true -ksp_converged_reason >> -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg >> -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1 >> -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev >> 
-mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg >> -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu >> -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01 >> -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30 >> -pc_gamg_repartition false -pc_mg_cycle_type v >> -pc_gamg_use_parallel_coarse_grid_solver -mg_coarse_pc_type jacobi >> -mg_coarse_ksp_type cg -ksp_monitor -log_view > log.ne174.n8.np125 >> >> On Wed, May 3, 2017 at 2:08 PM, Mark Adams wrote: >> >>> Hong,the input files do not seem to be accessible. What are the command >>> line option? (I don't see a "rap" or "scale" in the source). >>> >>> >>> >>> On Wed, May 3, 2017 at 12:17 PM, Hong wrote: >>> >>>> Mark, >>>> Below is the copy of my email sent to you on Feb 27: >>>> >>>> I implemented scalable MatPtAP and did comparisons of three >>>> implementations using ex56.c on alcf cetus machine (this machine has >>>> small memory, 1GB/core): >>>> - nonscalable PtAP: use an array of length PN to do dense axpy >>>> - scalable PtAP: do sparse axpy without use of PN array >>>> - hypre PtAP. >>>> >>>> The results are attached. Summary: >>>> - nonscalable PtAP is 2x faster than scalable, 8x faster than hypre PtAP >>>> - scalable PtAP is 4x faster than hypre PtAP >>>> - hypre uses less memory (see job.ne399.n63.np1000.sh) >>>> >>>> Based on above observation, I set the default PtAP algorithm as >>>> 'nonscalable'. >>>> When PN > local estimated nonzero of C=PtAP, then switch default to >>>> 'scalable'. >>>> User can overwrite default. >>>> >>>> For the case of np=8000, ne=599 (see job.ne599.n500.np8000.sh), I get >>>> MatPtAP 3.6224e+01 (nonscalable for small mats, >>>> scalable for larger ones) >>>> scalable MatPtAP 4.6129e+01 >>>> hypre 1.9389e+02 >>>> >>>> This work in on petsc-master. Give it a try. If you encounter any >>>> problem, let me know. >>>> >>>> Hong >>>> >>>> On Wed, May 3, 2017 at 10:01 AM, Mark Adams wrote: >>>> >>>>> (Hong), what is the current state of optimizing RAP for scaling? >>>>> >>>>> Nate, is driving 3D elasticity problems at scaling with GAMG and we >>>>> are working out performance problems. They are hitting problems at ~1.5B >>>>> dof problems on a basic Cray (XC30 I think). >>>>> >>>>> Thanks, >>>>> Mark >>>>> >>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hng.email at gmail.com Thu May 4 12:10:12 2017 From: hng.email at gmail.com (Hom Nath Gharti) Date: Thu, 4 May 2017 13:10:12 -0400 Subject: [petsc-users] Suggestion for large scale Poisson's solver Message-ID: Dear all, I am trying to solve a Poisson's equation on the Earth models with the following information - Degrees of freedom ~300,000,000 - I use MPIAIJ matrix - Coefficient matrix is symmetric and doesn't change with time steps - Need to compute for a large number of time steps Which solver/preconditioner is the most efficient for this problem? I would be grateful for your suggestion. 
Thanks, Hom Nath From bsmith at mcs.anl.gov Thu May 4 12:47:04 2017 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 4 May 2017 12:47:04 -0500 Subject: [petsc-users] Suggestion for large scale Poisson's solver In-Reply-To: References: Message-ID: > On May 4, 2017, at 12:10 PM, Hom Nath Gharti wrote: > > Dear all, > > I am trying to solve a Poisson's equation on the Earth models with the > following information > > - Degrees of freedom ~300,000,000 > - I use MPIAIJ matrix > - Coefficient matrix is symmetric and doesn't change with time steps > - Need to compute for a large number of time steps > > Which solver/preconditioner is the most efficient for this problem? I > would be grateful for your suggestion. Geometric multigrid is always best if you can use it. If not I would use hypre BoomerAMG, it should have very good convergence. > > Thanks, > Hom Nath From hng.email at gmail.com Thu May 4 12:52:50 2017 From: hng.email at gmail.com (Hom Nath Gharti) Date: Thu, 4 May 2017 13:52:50 -0400 Subject: [petsc-users] Suggestion for large scale Poisson's solver In-Reply-To: References: Message-ID: Thanks, Barry. Is there a way to take advantage of the fact that the matrix remains same during time steps? On Thu, May 4, 2017 at 1:47 PM, Barry Smith wrote: > >> On May 4, 2017, at 12:10 PM, Hom Nath Gharti wrote: >> >> Dear all, >> >> I am trying to solve a Poisson's equation on the Earth models with the >> following information >> >> - Degrees of freedom ~300,000,000 >> - I use MPIAIJ matrix >> - Coefficient matrix is symmetric and doesn't change with time steps >> - Need to compute for a large number of time steps >> >> Which solver/preconditioner is the most efficient for this problem? I >> would be grateful for your suggestion. > > Geometric multigrid is always best if you can use it. If not I would use hypre BoomerAMG, it should have very good convergence. > > >> >> Thanks, >> Hom Nath > From bsmith at mcs.anl.gov Thu May 4 13:00:13 2017 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 4 May 2017 13:00:13 -0500 Subject: [petsc-users] Suggestion for large scale Poisson's solver In-Reply-To: References: Message-ID: <0B8CA754-9781-48F1-B5F5-CF67E52E5AC0@mcs.anl.gov> > On May 4, 2017, at 12:52 PM, Hom Nath Gharti wrote: > > Thanks, Barry. Is there a way to take advantage of the fact that the > matrix remains same during time steps? Yes since you are not changing the matrix it will construct the preconditioner once and just use it forever. > > On Thu, May 4, 2017 at 1:47 PM, Barry Smith wrote: >> >>> On May 4, 2017, at 12:10 PM, Hom Nath Gharti wrote: >>> >>> Dear all, >>> >>> I am trying to solve a Poisson's equation on the Earth models with the >>> following information >>> >>> - Degrees of freedom ~300,000,000 >>> - I use MPIAIJ matrix >>> - Coefficient matrix is symmetric and doesn't change with time steps >>> - Need to compute for a large number of time steps >>> >>> Which solver/preconditioner is the most efficient for this problem? I >>> would be grateful for your suggestion. >> >> Geometric multigrid is always best if you can use it. If not I would use hypre BoomerAMG, it should have very good convergence. 
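A minimal sketch of that setup in C, assuming the assembled Poisson matrix A and the vectors b, x already exist and that only the right-hand side changes between time steps (A, b, x, nsteps and UpdateRHS are placeholder names, not from this thread; error checking is omitted):

    KSP      ksp;
    PC       pc;
    PetscInt step;

    KSPCreate(PETSC_COMM_WORLD, &ksp);
    KSPSetOperators(ksp, A, A);        /* the operator never changes */
    KSPSetType(ksp, KSPCG);            /* symmetric matrix, so CG */
    KSPGetPC(ksp, &pc);
    PCSetType(pc, PCHYPRE);            /* or PCMG / PCGAMG for geometric / algebraic multigrid */
    PCHYPRESetType(pc, "boomeramg");
    KSPSetFromOptions(ksp);            /* -ksp_type cg -pc_type hypre -pc_hypre_type boomeramg also works */

    for (step = 0; step < nsteps; step++) {
      UpdateRHS(b, step);              /* placeholder: only b changes each step */
      KSPSolve(ksp, b, x);             /* preconditioner is built on the first solve */
    }
    KSPDestroy(&ksp);

Because KSPSetOperators() is never called again inside the loop, the BoomerAMG setup is done on the first KSPSolve() and reused for every later step, which is the reuse Barry confirms in his follow-up below.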
>> >> >>> >>> Thanks, >>> Hom Nath >> From hng.email at gmail.com Thu May 4 13:02:42 2017 From: hng.email at gmail.com (Hom Nath Gharti) Date: Thu, 4 May 2017 14:02:42 -0400 Subject: [petsc-users] Suggestion for large scale Poisson's solver In-Reply-To: <0B8CA754-9781-48F1-B5F5-CF67E52E5AC0@mcs.anl.gov> References: <0B8CA754-9781-48F1-B5F5-CF67E52E5AC0@mcs.anl.gov> Message-ID: Thanks a lot! On Thu, May 4, 2017 at 2:00 PM, Barry Smith wrote: > >> On May 4, 2017, at 12:52 PM, Hom Nath Gharti wrote: >> >> Thanks, Barry. Is there a way to take advantage of the fact that the >> matrix remains same during time steps? > > Yes since you are not changing the matrix it will construct the preconditioner once and just use it forever. > > >> >> On Thu, May 4, 2017 at 1:47 PM, Barry Smith wrote: >>> >>>> On May 4, 2017, at 12:10 PM, Hom Nath Gharti wrote: >>>> >>>> Dear all, >>>> >>>> I am trying to solve a Poisson's equation on the Earth models with the >>>> following information >>>> >>>> - Degrees of freedom ~300,000,000 >>>> - I use MPIAIJ matrix >>>> - Coefficient matrix is symmetric and doesn't change with time steps >>>> - Need to compute for a large number of time steps >>>> >>>> Which solver/preconditioner is the most efficient for this problem? I >>>> would be grateful for your suggestion. >>> >>> Geometric multigrid is always best if you can use it. If not I would use hypre BoomerAMG, it should have very good convergence. >>> >>> >>>> >>>> Thanks, >>>> Hom Nath >>> > From hzhang at mcs.anl.gov Thu May 4 14:33:11 2017 From: hzhang at mcs.anl.gov (Hong) Date: Thu, 4 May 2017 14:33:11 -0500 Subject: [petsc-users] GAMG scaling In-Reply-To: References: Message-ID: Mark, Fixed https://bitbucket.org/petsc/petsc/commits/68eacb73b84ae7f3fd7363217d47f23a8f967155 Run ex56 gives mpiexec -n 8 ./ex56 -ne 13 ... -h |grep via -mattransposematmult_via Algorithmic approach (choose one of) scalable nonscalable matmatmult (MatTransposeMatMult) -matmatmult_via Algorithmic approach (choose one of) scalable nonscalable hypre (MatMatMult) -matptap_via Algorithmic approach (choose one of) scalable nonscalable hypre (MatPtAP) ... I'll merge it to master after regression tests. Hong On Thu, May 4, 2017 at 10:33 AM, Hong wrote: > Mark: >> >> I am not seeing these options with -help ... >> > Hmm, this might be a bug - I'll check it. > Hong > > >> >> On Wed, May 3, 2017 at 10:05 PM, Hong wrote: >> >>> I basically used 'runex56' and set '-ne' be compatible with np. >>> Then I used option >>> '-matptap_via scalable' >>> '-matptap_via hypre' >>> '-matptap_via nonscalable' >>> >>> I attached a job script below. >>> >>> In master branch, I set default as 'nonscalable' for small - medium size >>> matrices, and automatically switch to 'scalable' when matrix size gets >>> larger. >>> >>> Petsc solver uses MatPtAP, which does local RAP to reduce communication >>> and accelerate computation. >>> I suggest you simply use default setting. Let me know if you encounter >>> trouble. 
>>> >>> Hong >>> >>> job.ne174.n8.np125.sh: >>> runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 >>> -ne 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1 >>> -pc_gamg_reuse_interpolation true -ksp_converged_reason >>> -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg >>> -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1 >>> -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev >>> -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg >>> -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu >>> -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01 >>> -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30 >>> -pc_gamg_repartition false -pc_mg_cycle_type v >>> -pc_gamg_use_parallel_coarse_grid_solver -mg_coarse_pc_type jacobi >>> -mg_coarse_ksp_type cg -ksp_monitor -log_view -matptap_via scalable > >>> log.ne174.n8.np125.scalable >>> >>> runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 >>> -ne 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1 >>> -pc_gamg_reuse_interpolation true -ksp_converged_reason >>> -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg >>> -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1 >>> -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev >>> -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg >>> -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu >>> -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01 >>> -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30 >>> -pc_gamg_repartition false -pc_mg_cycle_type v >>> -pc_gamg_use_parallel_coarse_grid_solver -mg_coarse_pc_type jacobi >>> -mg_coarse_ksp_type cg -ksp_monitor -log_view -matptap_via hypre > >>> log.ne174.n8.np125.hypre >>> >>> runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 >>> -ne 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1 >>> -pc_gamg_reuse_interpolation true -ksp_converged_reason >>> -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg >>> -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1 >>> -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev >>> -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg >>> -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu >>> -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01 >>> -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30 >>> -pc_gamg_repartition false -pc_mg_cycle_type v >>> -pc_gamg_use_parallel_coarse_grid_solver -mg_coarse_pc_type jacobi >>> -mg_coarse_ksp_type cg -ksp_monitor -log_view -matptap_via nonscalable > >>> log.ne174.n8.np125.nonscalable >>> >>> runjob --np 125 -p 16 --block $COBALT_PARTNAME --verbose=INFO : ./ex56 >>> -ne 174 -alpha 1.e-3 -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1 >>> -pc_gamg_reuse_interpolation true -ksp_converged_reason >>> -use_mat_nearnullspace -mg_levels_esteig_ksp_type cg >>> -mg_levels_esteig_ksp_max_it 10 -pc_gamg_square_graph 1 >>> -mg_levels_ksp_max_it 1 -mg_levels_ksp_type chebyshev >>> -mg_levels_ksp_chebyshev_esteig 0,0.2,0,1.05 -gamg_est_ksp_type cg >>> -gamg_est_ksp_max_it 10 -pc_gamg_asm_use_agg true -mg_levels_sub_pc_type lu >>> -mg_levels_pc_asm_overlap 0 -pc_gamg_threshold -0.01 >>> -pc_gamg_coarse_eq_limit 200 -pc_gamg_process_eq_limit 30 >>> -pc_gamg_repartition false -pc_mg_cycle_type v >>> -pc_gamg_use_parallel_coarse_grid_solver -mg_coarse_pc_type jacobi >>> -mg_coarse_ksp_type cg -ksp_monitor -log_view > 
log.ne174.n8.np125 >>> >>> On Wed, May 3, 2017 at 2:08 PM, Mark Adams wrote: >>> >>>> Hong,the input files do not seem to be accessible. What are the command >>>> line option? (I don't see a "rap" or "scale" in the source). >>>> >>>> >>>> >>>> On Wed, May 3, 2017 at 12:17 PM, Hong wrote: >>>> >>>>> Mark, >>>>> Below is the copy of my email sent to you on Feb 27: >>>>> >>>>> I implemented scalable MatPtAP and did comparisons of three >>>>> implementations using ex56.c on alcf cetus machine (this machine has >>>>> small memory, 1GB/core): >>>>> - nonscalable PtAP: use an array of length PN to do dense axpy >>>>> - scalable PtAP: do sparse axpy without use of PN array >>>>> - hypre PtAP. >>>>> >>>>> The results are attached. Summary: >>>>> - nonscalable PtAP is 2x faster than scalable, 8x faster than hypre >>>>> PtAP >>>>> - scalable PtAP is 4x faster than hypre PtAP >>>>> - hypre uses less memory (see job.ne399.n63.np1000.sh) >>>>> >>>>> Based on above observation, I set the default PtAP algorithm as >>>>> 'nonscalable'. >>>>> When PN > local estimated nonzero of C=PtAP, then switch default to >>>>> 'scalable'. >>>>> User can overwrite default. >>>>> >>>>> For the case of np=8000, ne=599 (see job.ne599.n500.np8000.sh), I get >>>>> MatPtAP 3.6224e+01 (nonscalable for small mats, >>>>> scalable for larger ones) >>>>> scalable MatPtAP 4.6129e+01 >>>>> hypre 1.9389e+02 >>>>> >>>>> This work in on petsc-master. Give it a try. If you encounter any >>>>> problem, let me know. >>>>> >>>>> Hong >>>>> >>>>> On Wed, May 3, 2017 at 10:01 AM, Mark Adams wrote: >>>>> >>>>>> (Hong), what is the current state of optimizing RAP for scaling? >>>>>> >>>>>> Nate, is driving 3D elasticity problems at scaling with GAMG and we >>>>>> are working out performance problems. They are hitting problems at ~1.5B >>>>>> dof problems on a basic Cray (XC30 I think). >>>>>> >>>>>> Thanks, >>>>>> Mark >>>>>> >>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From natacha.bereux at gmail.com Thu May 4 15:39:15 2017 From: natacha.bereux at gmail.com (Natacha BEREUX) Date: Thu, 4 May 2017 22:39:15 +0200 Subject: [petsc-users] Configure nested PCFIELDSPLIT with general index sets In-Reply-To: References: <6496846F-19F8-4494-87E1-DDC390513370@imperial.ac.uk> Message-ID: Thanks for your explanation. It is much clearer now. I have just submitted an issue on the bugtracker (for DMShellSetCreateFieldDecomposition Fortran interface) I am going to work on your other proposals (using PetscObjectCompose or wrap my decomposition). I'll let you know what it gives ! Thanks a lot Natacha On Thu, May 4, 2017 at 5:07 PM, Matthew Knepley wrote: > On Thu, May 4, 2017 at 9:17 AM, Natacha BEREUX > wrote: > >> Dear Matt, >> I re-checked the master branch. To be precise, I downloaded the nightly >> tarball this morning (from http://ftp.mcs.anl.gov/pub/pet >> sc/petsc-master.tar.gz) >> I am sure that the Fortran interface of DMSellSetCreateFieldDecomposition >> is missing. >> And it is quite tricky to add it. I have tried to write something in >> src/dm/impls/shell/ftn-custom/zdmshellf.c but I am not familiar with >> callbacks. >> Any help would be greatly appreciated! >> > > I added PetscObjectCompose() to Fortran, so you could compose IS objects > when needed. Setting function pointers from Fortran is indeed > complicated and I do not yet know how to do it. 
Could you submit and Issue > (https://bitbucket.org/petsc/petsc/issues?status=new&status=open) > and someone will add this as soon as we have time? > > In the meantime, it would not be hard to create the DMShell in C and have > a small C wrapper for your Fortran function to create the decomposition. > > Thanks, > > Matt > > >> Best regards >> Natacha >> >> On Fri, Apr 28, 2017 at 8:11 PM, Matthew Knepley >> wrote: >> >>> On Fri, Apr 28, 2017 at 1:09 PM, Matthew Knepley >>> wrote: >>> >>>> On Fri, Apr 28, 2017 at 11:48 AM, Natacha BEREUX < >>>> natacha.bereux at gmail.com> wrote: >>>> >>>>> Dear Matt, >>>>> Sorry for my (very) late reply. >>>>> I was not able to find the Fortran interface of >>>>> DMSellSetCreateFieldDecomposition in the late petsc-3.7.6 fortran >>>>> (and my code still fails to link). >>>>> I have the feeling that it is missing in the master branch. >>>>> And I was not able to get it on bitbucket either. >>>>> Is there a branch from which I can pull your commit ? >>>>> >>>> >>>> I would either: >>>> >>>> a) Use the 'next' branch >>>> >>>> or >>>> >>>> b) wait until Monday for me to merge to 'master' >>>> >>>> This merge has been held up, but can now go forward. >>>> >>> >>> I just checked master. It was already merged. Please recheck your master. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thans a lot for your help, >>>>> Natacha >>>>> >>>>> On Thu, Mar 30, 2017 at 9:25 PM, Matthew Knepley >>>>> wrote: >>>>> >>>>>> On Wed, Mar 22, 2017 at 1:45 PM, Natacha BEREUX < >>>>>> natacha.bereux at gmail.com> wrote: >>>>>> >>>>>>> Hello Matt, >>>>>>> Thanks a lot for your answers. >>>>>>> Since I am working on a large FEM Fortran code, I have to stick to >>>>>>> Fortran. >>>>>>> Do you know if someone plans to add this Fortran interface? Or may >>>>>>> be I could do it myself ? Is this particular interface very hard to add ? >>>>>>> Perhaps could I mimic some other interface ? >>>>>>> What would you advise ? >>>>>>> >>>>>> >>>>>> I have added the interface in branch knepley/feature-fortran-compose. >>>>>> I also put this in the 'next' branch. It >>>>>> should make it to master soon. There is a test in >>>>>> sys/examples/tests/ex13f >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Best regards, >>>>>>> Natacha >>>>>>> >>>>>>> On Wed, Mar 22, 2017 at 12:33 PM, Matthew Knepley >>>>>> > wrote: >>>>>>> >>>>>>>> On Wed, Mar 22, 2017 at 10:03 AM, Natacha BEREUX < >>>>>>>> natacha.bereux at gmail.com> wrote: >>>>>>>> >>>>>>>>> Hello, >>>>>>>>> if my understanding is correct, the approach proposed by Matt and >>>>>>>>> Lawrence is the following : >>>>>>>>> - create a DMShell (DMShellCreate) >>>>>>>>> - define my own CreateFieldDecomposition to return the index sets >>>>>>>>> I need (for displacement, pressure and temperature degrees of freedom) : >>>>>>>>> myCreateFieldDecomposition(... ) >>>>>>>>> - set it in the DMShell ( DMShellSetCreateFieldDecomposition) >>>>>>>>> - then sets the DM in KSP context (KSPSetDM) >>>>>>>>> >>>>>>>>> I have some more questions >>>>>>>>> - I did not succeed in setting my own CreateFieldDecomposition in >>>>>>>>> the DMShell : link fails with " unknown reference to ? >>>>>>>>> dmshellsetcreatefielddecomposition_ ?. Could it be a Fortran >>>>>>>>> problem (I am using Fortran)? Is this routine available in PETSc Fortran >>>>>>>>> interface ? \ >>>>>>>>> >>>>>>>> >>>>>>>> Yes, exactly. The Fortran interface for passing function pointers >>>>>>>> is complex, and no one has added this function yet. 
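A rough, untested sketch of the small C wrapper Matt suggests above, assuming the application has already built its three index sets and composed them onto the DM under the hypothetical keys "is_u", "is_p" and "is_T" (these names and MyCreateFieldDecomposition are not from the thread):

    static PetscErrorCode MyCreateFieldDecomposition(DM dm, PetscInt *nfields, char ***names, IS **is, DM **dms)
    {
      IS             is_u, is_p, is_T;
      PetscErrorCode ierr;

      PetscFunctionBegin;
      /* retrieve the index sets the application composed onto the DMShell */
      ierr = PetscObjectQuery((PetscObject)dm, "is_u", (PetscObject*)&is_u);CHKERRQ(ierr);
      ierr = PetscObjectQuery((PetscObject)dm, "is_p", (PetscObject*)&is_p);CHKERRQ(ierr);
      ierr = PetscObjectQuery((PetscObject)dm, "is_T", (PetscObject*)&is_T);CHKERRQ(ierr);
      *nfields = 3;
      if (names) {
        ierr = PetscMalloc1(3, names);CHKERRQ(ierr);
        ierr = PetscStrallocpy("displacement", &(*names)[0]);CHKERRQ(ierr);
        ierr = PetscStrallocpy("pressure",     &(*names)[1]);CHKERRQ(ierr);
        ierr = PetscStrallocpy("temperature",  &(*names)[2]);CHKERRQ(ierr);
      }
      if (is) {
        ierr = PetscMalloc1(3, is);CHKERRQ(ierr);
        (*is)[0] = is_u; (*is)[1] = is_p; (*is)[2] = is_T;
        /* the caller destroys these, so take an extra reference */
        ierr = PetscObjectReference((PetscObject)is_u);CHKERRQ(ierr);
        ierr = PetscObjectReference((PetscObject)is_p);CHKERRQ(ierr);
        ierr = PetscObjectReference((PetscObject)is_T);CHKERRQ(ierr);
      }
      if (dms) *dms = NULL;            /* no sub-DMs, as discussed in the quoted thread */
      PetscFunctionReturn(0);
    }

    /* wrapper called once from the Fortran code; x is a template global vector */
    DMShellCreate(PETSC_COMM_WORLD, &dm);
    DMShellSetGlobalVector(dm, x);
    PetscObjectCompose((PetscObject)dm, "is_u", (PetscObject)is_u);   /* likewise is_p, is_T */
    DMShellSetCreateFieldDecomposition(dm, MyCreateFieldDecomposition);
    KSPSetDM(ksp, dm);
    KSPSetDMActive(ksp, PETSC_FALSE); /* the operator is still supplied by the application */

With the DMShell attached to the KSP, PCFieldSplit should then be able to pick the splits up on its own, so -pc_type fieldsplit plus the usual -fieldsplit_* options work without any Fortran callback.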
>>>>>>>> >>>>>>>> >>>>>>>>> - CreateFieldDecomposition is supposed to return an array of dms >>>>>>>>> (to define the fields). I am not able to return such datas. Do I return a >>>>>>>>> PETSC_NULL_OBJECT instead ? >>>>>>>>> >>>>>>>> >>>>>>>> Yes. >>>>>>>> >>>>>>>> >>>>>>>>> - do I have to provide something else to define the DMShell ? >>>>>>>>> >>>>>>>> >>>>>>>> I think you will have to return local and global vectors, but this >>>>>>>> just means creating a vector of the correct size and distribution. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> >>>>>>>>> Thanks a lot for your help >>>>>>>>> Natacha >>>>>>>>> >>>>>>>>> On Tue, Mar 21, 2017 at 2:44 PM, Natacha BEREUX < >>>>>>>>> natacha.bereux at gmail.com> wrote: >>>>>>>>> >>>>>>>>>> Thanks for your quick answers. To be honest, I am not familiar at >>>>>>>>>> all with DMShells and DMPlexes. But since it is what I need, I am going to >>>>>>>>>> try it. >>>>>>>>>> Thanks again for your advices, >>>>>>>>>> Natacha >>>>>>>>>> >>>>>>>>>> On Tue, Mar 21, 2017 at 2:27 PM, Lawrence Mitchell < >>>>>>>>>> lawrence.mitchell at imperial.ac.uk> wrote: >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> > On 21 Mar 2017, at 13:24, Matthew Knepley >>>>>>>>>>> wrote: >>>>>>>>>>> > >>>>>>>>>>> > I think the remedy is as easy as specifying a DMShell that has >>>>>>>>>>> a PetscSection (DMSetDefaultSection) with your ordering, and >>>>>>>>>>> > I think this is how Firedrake (http://www.firedrakeproject.o >>>>>>>>>>> rg/) does it. >>>>>>>>>>> >>>>>>>>>>> We actually don't use a section, but we do provide >>>>>>>>>>> DMCreateFieldDecomposition_Shell. >>>>>>>>>>> >>>>>>>>>>> If you have a section that describes all the fields, then I >>>>>>>>>>> think if the DMShell knows about it, you effectively get the same behaviour >>>>>>>>>>> as DMPlex (which does the decomposition in the same manner?). >>>>>>>>>>> >>>>>>>>>>> > However, I usually use a DMPlex which knows about my >>>>>>>>>>> > mesh, so I am not sure if this strategy has any holes. >>>>>>>>>>> >>>>>>>>>>> I haven't noticed anything yet. >>>>>>>>>>> >>>>>>>>>>> Lawrence >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pvsang002 at gmail.com Fri May 5 05:45:30 2017 From: pvsang002 at gmail.com (Pham Pham) Date: Fri, 5 May 2017 18:45:30 +0800 Subject: [petsc-users] Installation question In-Reply-To: References: Message-ID: *Hi,* *I can configure now, but fail when testing:* [mpepvs at atlas7-c10 petsc-3.7.5]$ make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt test Running test examples to verify correct installation Using PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 and PETSC_ARCH=arch-linux-cxx-opt Possible error running C/C++ src/snes/examples/tutorials/ex19 with 1 MPI process See http://www.mcs.anl.gov/petsc/documentation/faq.html mpiexec_atlas7-c10: cannot connect to local mpd (/tmp/mpd2.console_mpepvs); possible causes: 1. no mpd is running on this host 2. an mpd is running but was started without a "console" (-n option) Possible error running C/C++ src/snes/examples/tutorials/ex19 with 2 MPI processes See http://www.mcs.anl.gov/petsc/documentation/faq.html mpiexec_atlas7-c10: cannot connect to local mpd (/tmp/mpd2.console_mpepvs); possible causes: 1. no mpd is running on this host 2. an mpd is running but was started without a "console" (-n option) Possible error running Fortran example src/snes/examples/tutorials/ex5f with 1 MPI process See http://www.mcs.anl.gov/petsc/documentation/faq.html mpiexec_atlas7-c10: cannot connect to local mpd (/tmp/mpd2.console_mpepvs); possible causes: 1. no mpd is running on this host 2. an mpd is running but was started without a "console" (-n option) Completed test examples ========================================= Now to evaluate the computer systems you plan use - do: make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt streams *Please help on this.* *Many thanks!* On Thu, Apr 20, 2017 at 2:02 AM, Satish Balay wrote: > Sorry - should have mentioned: > > do 'rm -rf arch-linux-cxx-opt' and rerun configure again. > > The mpich install from previous build [that is currently in > arch-linux-cxx-opt/] > is conflicting with --with-mpi-dir=/app1/centos6.3/gnu/mvapich2-1.9/ > > Satish > > > On Wed, 19 Apr 2017, Pham Pham wrote: > > > I reconfigured PETSs with installed MPI, however, I got serous error: > > > > **************************ERROR************************************* > > Error during compile, check arch-linux-cxx-opt/lib/petsc/conf/make.log > > Send it and arch-linux-cxx-opt/lib/petsc/conf/configure.log to > > petsc-maint at mcs.anl.gov > > ******************************************************************** > > > > Please explain what is happening? > > > > Thank you very much. > > > > > > > > > > On Wed, Apr 19, 2017 at 11:43 PM, Satish Balay > wrote: > > > > > Presumably your cluster already has a recommended MPI to use [which is > > > already installed. So you should use that - instead of > > > --download-mpich=1 > > > > > > Satish > > > > > > On Wed, 19 Apr 2017, Pham Pham wrote: > > > > > > > Hi, > > > > > > > > I just installed petsc-3.7.5 into my university cluster. When > evaluating > > > > the computer system, PETSc reports "It appears you have 1 node(s)", I > > > donot > > > > understand this, since the system is a multinodes system. Could you > > > please > > > > explain this to me? > > > > > > > > Thank you very much. > > > > > > > > S. 
> > > > > > > > Output: > > > > ========================================= > > > > Now to evaluate the computer systems you plan use - do: > > > > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > > > > PETSC_ARCH=arch-linux-cxx-opt streams > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ make > > > > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > > > PETSC_ARCH=arch-linux-cxx-opt > > > > streams > > > > cd src/benchmarks/streams; /usr/bin/gmake --no-print-directory > > > > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > > > PETSC_ARCH=arch-linux-cxx-opt > > > > streams > > > > /home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/bin/mpicxx -o > > > > MPIVersion.o -c -Wall -Wwrite-strings -Wno-strict-aliasing > > > > -Wno-unknown-pragmas -fvisibility=hidden -g -O > > > > -I/home/svu/mpepvs/petsc/petsc-3.7.5/include > > > > -I/home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/include > > > > `pwd`/MPIVersion.c > > > > Running streams with > > > > '/home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/bin/mpiexec ' > > > using > > > > 'NPMAX=12' > > > > Number of MPI processes 1 Processor names atlas7-c10 > > > > Triad: 9137.5025 Rate (MB/s) > > > > Number of MPI processes 2 Processor names atlas7-c10 atlas7-c10 > > > > Triad: 9707.2815 Rate (MB/s) > > > > Number of MPI processes 3 Processor names atlas7-c10 atlas7-c10 > > > atlas7-c10 > > > > Triad: 13559.5275 Rate (MB/s) > > > > Number of MPI processes 4 Processor names atlas7-c10 atlas7-c10 > > > atlas7-c10 > > > > atlas7-c10 > > > > Triad: 14193.0597 Rate (MB/s) > > > > Number of MPI processes 5 Processor names atlas7-c10 atlas7-c10 > > > atlas7-c10 > > > > atlas7-c10 atlas7-c10 > > > > Triad: 14492.9234 Rate (MB/s) > > > > Number of MPI processes 6 Processor names atlas7-c10 atlas7-c10 > > > atlas7-c10 > > > > atlas7-c10 atlas7-c10 atlas7-c10 > > > > Triad: 15476.5912 Rate (MB/s) > > > > Number of MPI processes 7 Processor names atlas7-c10 atlas7-c10 > > > atlas7-c10 > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > Triad: 15148.7388 Rate (MB/s) > > > > Number of MPI processes 8 Processor names atlas7-c10 atlas7-c10 > > > atlas7-c10 > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > Triad: 15799.1290 Rate (MB/s) > > > > Number of MPI processes 9 Processor names atlas7-c10 atlas7-c10 > > > atlas7-c10 > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > Triad: 15671.3104 Rate (MB/s) > > > > Number of MPI processes 10 Processor names atlas7-c10 atlas7-c10 > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > atlas7-c10 atlas7-c10 > > > > Triad: 15601.4754 Rate (MB/s) > > > > Number of MPI processes 11 Processor names atlas7-c10 atlas7-c10 > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > atlas7-c10 atlas7-c10 atlas7-c10 > > > > Triad: 15434.5790 Rate (MB/s) > > > > Number of MPI processes 12 Processor names atlas7-c10 atlas7-c10 > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > Triad: 15134.1263 Rate (MB/s) > > > > ------------------------------------------------ > > > > np speedup > > > > 1 1.0 > > > > 2 1.06 > > > > 3 1.48 > > > > 4 1.55 > > > > 5 1.59 > > > > 6 1.69 > > > > 7 1.66 > > > > 8 1.73 > > > > 9 1.72 > > > > 10 1.71 > > > > 11 1.69 > > > > 12 1.66 > > > > Estimation of possible speedup of MPI programs based on Streams > > > benchmark. 
> > > > It appears you have 1 node(s) > > > > Unable to plot speedup to a file > > > > Unable to open matplotlib to plot speedup > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: make.log Type: text/x-log Size: 102067 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: text/x-log Size: 6026195 bytes Desc: not available URL: From balay at mcs.anl.gov Fri May 5 09:02:53 2017 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 5 May 2017 09:02:53 -0500 Subject: [petsc-users] Installation question In-Reply-To: References: Message-ID: With Intel MPI - its best to use mpiexec.hydra [and not mpiexec] So you can do: make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt MPIEXEC=mpiexec.hydra test [you can also specify --with-mpiexec=mpiexec.hydra at configure time] Satish On Fri, 5 May 2017, Pham Pham wrote: > *Hi,* > *I can configure now, but fail when testing:* > > [mpepvs at atlas7-c10 petsc-3.7.5]$ make > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt > test Running test examples to verify correct installation > Using PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 and > PETSC_ARCH=arch-linux-cxx-opt > Possible error running C/C++ src/snes/examples/tutorials/ex19 with 1 MPI > process > See http://www.mcs.anl.gov/petsc/documentation/faq.html > mpiexec_atlas7-c10: cannot connect to local mpd (/tmp/mpd2.console_mpepvs); > possible causes: > 1. no mpd is running on this host > 2. an mpd is running but was started without a "console" (-n option) > Possible error running C/C++ src/snes/examples/tutorials/ex19 with 2 MPI > processes > See http://www.mcs.anl.gov/petsc/documentation/faq.html > mpiexec_atlas7-c10: cannot connect to local mpd (/tmp/mpd2.console_mpepvs); > possible causes: > 1. no mpd is running on this host > 2. an mpd is running but was started without a "console" (-n option) > Possible error running Fortran example src/snes/examples/tutorials/ex5f > with 1 MPI process > See http://www.mcs.anl.gov/petsc/documentation/faq.html > mpiexec_atlas7-c10: cannot connect to local mpd (/tmp/mpd2.console_mpepvs); > possible causes: > 1. no mpd is running on this host > 2. an mpd is running but was started without a "console" (-n option) > Completed test examples > ========================================= > Now to evaluate the computer systems you plan use - do: > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > PETSC_ARCH=arch-linux-cxx-opt streams > > > > > *Please help on this.* > *Many thanks!* > > > On Thu, Apr 20, 2017 at 2:02 AM, Satish Balay wrote: > > > Sorry - should have mentioned: > > > > do 'rm -rf arch-linux-cxx-opt' and rerun configure again. 
> > > > The mpich install from previous build [that is currently in > > arch-linux-cxx-opt/] > > is conflicting with --with-mpi-dir=/app1/centos6.3/gnu/mvapich2-1.9/ > > > > Satish > > > > > > On Wed, 19 Apr 2017, Pham Pham wrote: > > > > > I reconfigured PETSs with installed MPI, however, I got serous error: > > > > > > **************************ERROR************************************* > > > Error during compile, check arch-linux-cxx-opt/lib/petsc/conf/make.log > > > Send it and arch-linux-cxx-opt/lib/petsc/conf/configure.log to > > > petsc-maint at mcs.anl.gov > > > ******************************************************************** > > > > > > Please explain what is happening? > > > > > > Thank you very much. > > > > > > > > > > > > > > > On Wed, Apr 19, 2017 at 11:43 PM, Satish Balay > > wrote: > > > > > > > Presumably your cluster already has a recommended MPI to use [which is > > > > already installed. So you should use that - instead of > > > > --download-mpich=1 > > > > > > > > Satish > > > > > > > > On Wed, 19 Apr 2017, Pham Pham wrote: > > > > > > > > > Hi, > > > > > > > > > > I just installed petsc-3.7.5 into my university cluster. When > > evaluating > > > > > the computer system, PETSc reports "It appears you have 1 node(s)", I > > > > donot > > > > > understand this, since the system is a multinodes system. Could you > > > > please > > > > > explain this to me? > > > > > > > > > > Thank you very much. > > > > > > > > > > S. > > > > > > > > > > Output: > > > > > ========================================= > > > > > Now to evaluate the computer systems you plan use - do: > > > > > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > > > > > PETSC_ARCH=arch-linux-cxx-opt streams > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ make > > > > > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > > > > PETSC_ARCH=arch-linux-cxx-opt > > > > > streams > > > > > cd src/benchmarks/streams; /usr/bin/gmake --no-print-directory > > > > > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > > > > PETSC_ARCH=arch-linux-cxx-opt > > > > > streams > > > > > /home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/bin/mpicxx -o > > > > > MPIVersion.o -c -Wall -Wwrite-strings -Wno-strict-aliasing > > > > > -Wno-unknown-pragmas -fvisibility=hidden -g -O > > > > > -I/home/svu/mpepvs/petsc/petsc-3.7.5/include > > > > > -I/home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/include > > > > > `pwd`/MPIVersion.c > > > > > Running streams with > > > > > '/home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/bin/mpiexec ' > > > > using > > > > > 'NPMAX=12' > > > > > Number of MPI processes 1 Processor names atlas7-c10 > > > > > Triad: 9137.5025 Rate (MB/s) > > > > > Number of MPI processes 2 Processor names atlas7-c10 atlas7-c10 > > > > > Triad: 9707.2815 Rate (MB/s) > > > > > Number of MPI processes 3 Processor names atlas7-c10 atlas7-c10 > > > > atlas7-c10 > > > > > Triad: 13559.5275 Rate (MB/s) > > > > > Number of MPI processes 4 Processor names atlas7-c10 atlas7-c10 > > > > atlas7-c10 > > > > > atlas7-c10 > > > > > Triad: 14193.0597 Rate (MB/s) > > > > > Number of MPI processes 5 Processor names atlas7-c10 atlas7-c10 > > > > atlas7-c10 > > > > > atlas7-c10 atlas7-c10 > > > > > Triad: 14492.9234 Rate (MB/s) > > > > > Number of MPI processes 6 Processor names atlas7-c10 atlas7-c10 > > > > atlas7-c10 > > > > > atlas7-c10 atlas7-c10 atlas7-c10 > > > > > Triad: 15476.5912 Rate (MB/s) > > > > > Number of MPI processes 7 Processor names atlas7-c10 atlas7-c10 > > > > atlas7-c10 > > > > > atlas7-c10 atlas7-c10 atlas7-c10 
atlas7-c10 > > > > > Triad: 15148.7388 Rate (MB/s) > > > > > Number of MPI processes 8 Processor names atlas7-c10 atlas7-c10 > > > > atlas7-c10 > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > > Triad: 15799.1290 Rate (MB/s) > > > > > Number of MPI processes 9 Processor names atlas7-c10 atlas7-c10 > > > > atlas7-c10 > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > > Triad: 15671.3104 Rate (MB/s) > > > > > Number of MPI processes 10 Processor names atlas7-c10 atlas7-c10 > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > > atlas7-c10 atlas7-c10 > > > > > Triad: 15601.4754 Rate (MB/s) > > > > > Number of MPI processes 11 Processor names atlas7-c10 atlas7-c10 > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > > atlas7-c10 atlas7-c10 atlas7-c10 > > > > > Triad: 15434.5790 Rate (MB/s) > > > > > Number of MPI processes 12 Processor names atlas7-c10 atlas7-c10 > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > > Triad: 15134.1263 Rate (MB/s) > > > > > ------------------------------------------------ > > > > > np speedup > > > > > 1 1.0 > > > > > 2 1.06 > > > > > 3 1.48 > > > > > 4 1.55 > > > > > 5 1.59 > > > > > 6 1.69 > > > > > 7 1.66 > > > > > 8 1.73 > > > > > 9 1.72 > > > > > 10 1.71 > > > > > 11 1.69 > > > > > 12 1.66 > > > > > Estimation of possible speedup of MPI programs based on Streams > > > > benchmark. > > > > > It appears you have 1 node(s) > > > > > Unable to plot speedup to a file > > > > > Unable to open matplotlib to plot speedup > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ > > > > > > > > > > > > > > > > > > > > > From pvsang002 at gmail.com Fri May 5 10:18:29 2017 From: pvsang002 at gmail.com (Pham Pham) Date: Fri, 5 May 2017 23:18:29 +0800 Subject: [petsc-users] Installation question In-Reply-To: References: Message-ID: Hi Satish, It runs now, and shows a bad speed up: Please help to improve this. Thank you. ? On Fri, May 5, 2017 at 10:02 PM, Satish Balay wrote: > With Intel MPI - its best to use mpiexec.hydra [and not mpiexec] > > So you can do: > > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > PETSC_ARCH=arch-linux-cxx-opt MPIEXEC=mpiexec.hydra test > > > [you can also specify --with-mpiexec=mpiexec.hydra at configure time] > > Satish > > > On Fri, 5 May 2017, Pham Pham wrote: > > > *Hi,* > > *I can configure now, but fail when testing:* > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ make > > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > PETSC_ARCH=arch-linux-cxx-opt > > test Running test examples to verify correct installation > > Using PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 and > > PETSC_ARCH=arch-linux-cxx-opt > > Possible error running C/C++ src/snes/examples/tutorials/ex19 with 1 MPI > > process > > See http://www.mcs.anl.gov/petsc/documentation/faq.html > > mpiexec_atlas7-c10: cannot connect to local mpd > (/tmp/mpd2.console_mpepvs); > > possible causes: > > 1. no mpd is running on this host > > 2. an mpd is running but was started without a "console" (-n option) > > Possible error running C/C++ src/snes/examples/tutorials/ex19 with 2 MPI > > processes > > See http://www.mcs.anl.gov/petsc/documentation/faq.html > > mpiexec_atlas7-c10: cannot connect to local mpd > (/tmp/mpd2.console_mpepvs); > > possible causes: > > 1. no mpd is running on this host > > 2. 
an mpd is running but was started without a "console" (-n option) > > Possible error running Fortran example src/snes/examples/tutorials/ex5f > > with 1 MPI process > > See http://www.mcs.anl.gov/petsc/documentation/faq.html > > mpiexec_atlas7-c10: cannot connect to local mpd > (/tmp/mpd2.console_mpepvs); > > possible causes: > > 1. no mpd is running on this host > > 2. an mpd is running but was started without a "console" (-n option) > > Completed test examples > > ========================================= > > Now to evaluate the computer systems you plan use - do: > > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > > PETSC_ARCH=arch-linux-cxx-opt streams > > > > > > > > > > *Please help on this.* > > *Many thanks!* > > > > > > On Thu, Apr 20, 2017 at 2:02 AM, Satish Balay wrote: > > > > > Sorry - should have mentioned: > > > > > > do 'rm -rf arch-linux-cxx-opt' and rerun configure again. > > > > > > The mpich install from previous build [that is currently in > > > arch-linux-cxx-opt/] > > > is conflicting with --with-mpi-dir=/app1/centos6.3/gnu/mvapich2-1.9/ > > > > > > Satish > > > > > > > > > On Wed, 19 Apr 2017, Pham Pham wrote: > > > > > > > I reconfigured PETSs with installed MPI, however, I got serous error: > > > > > > > > **************************ERROR************************************* > > > > Error during compile, check arch-linux-cxx-opt/lib/petsc/ > conf/make.log > > > > Send it and arch-linux-cxx-opt/lib/petsc/conf/configure.log to > > > > petsc-maint at mcs.anl.gov > > > > ******************************************************************** > > > > > > > > Please explain what is happening? > > > > > > > > Thank you very much. > > > > > > > > > > > > > > > > > > > > On Wed, Apr 19, 2017 at 11:43 PM, Satish Balay > > > wrote: > > > > > > > > > Presumably your cluster already has a recommended MPI to use > [which is > > > > > already installed. So you should use that - instead of > > > > > --download-mpich=1 > > > > > > > > > > Satish > > > > > > > > > > On Wed, 19 Apr 2017, Pham Pham wrote: > > > > > > > > > > > Hi, > > > > > > > > > > > > I just installed petsc-3.7.5 into my university cluster. When > > > evaluating > > > > > > the computer system, PETSc reports "It appears you have 1 > node(s)", I > > > > > donot > > > > > > understand this, since the system is a multinodes system. Could > you > > > > > please > > > > > > explain this to me? > > > > > > > > > > > > Thank you very much. > > > > > > > > > > > > S. 
> > > > > > > > > > > > Output: > > > > > > ========================================= > > > > > > Now to evaluate the computer systems you plan use - do: > > > > > > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > > > > > > PETSC_ARCH=arch-linux-cxx-opt streams > > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ make > > > > > > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > > > > > PETSC_ARCH=arch-linux-cxx-opt > > > > > > streams > > > > > > cd src/benchmarks/streams; /usr/bin/gmake --no-print-directory > > > > > > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > > > > > PETSC_ARCH=arch-linux-cxx-opt > > > > > > streams > > > > > > /home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/bin/mpicxx > -o > > > > > > MPIVersion.o -c -Wall -Wwrite-strings -Wno-strict-aliasing > > > > > > -Wno-unknown-pragmas -fvisibility=hidden -g -O > > > > > > -I/home/svu/mpepvs/petsc/petsc-3.7.5/include > > > > > > -I/home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/include > > > > > > `pwd`/MPIVersion.c > > > > > > Running streams with > > > > > > '/home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/bin/mpiexec > ' > > > > > using > > > > > > 'NPMAX=12' > > > > > > Number of MPI processes 1 Processor names atlas7-c10 > > > > > > Triad: 9137.5025 Rate (MB/s) > > > > > > Number of MPI processes 2 Processor names atlas7-c10 atlas7-c10 > > > > > > Triad: 9707.2815 Rate (MB/s) > > > > > > Number of MPI processes 3 Processor names atlas7-c10 atlas7-c10 > > > > > atlas7-c10 > > > > > > Triad: 13559.5275 Rate (MB/s) > > > > > > Number of MPI processes 4 Processor names atlas7-c10 atlas7-c10 > > > > > atlas7-c10 > > > > > > atlas7-c10 > > > > > > Triad: 14193.0597 Rate (MB/s) > > > > > > Number of MPI processes 5 Processor names atlas7-c10 atlas7-c10 > > > > > atlas7-c10 > > > > > > atlas7-c10 atlas7-c10 > > > > > > Triad: 14492.9234 Rate (MB/s) > > > > > > Number of MPI processes 6 Processor names atlas7-c10 atlas7-c10 > > > > > atlas7-c10 > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 > > > > > > Triad: 15476.5912 Rate (MB/s) > > > > > > Number of MPI processes 7 Processor names atlas7-c10 atlas7-c10 > > > > > atlas7-c10 > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > > > Triad: 15148.7388 Rate (MB/s) > > > > > > Number of MPI processes 8 Processor names atlas7-c10 atlas7-c10 > > > > > atlas7-c10 > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > > > Triad: 15799.1290 Rate (MB/s) > > > > > > Number of MPI processes 9 Processor names atlas7-c10 atlas7-c10 > > > > > atlas7-c10 > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > > > Triad: 15671.3104 Rate (MB/s) > > > > > > Number of MPI processes 10 Processor names atlas7-c10 atlas7-c10 > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > > > atlas7-c10 atlas7-c10 > > > > > > Triad: 15601.4754 Rate (MB/s) > > > > > > Number of MPI processes 11 Processor names atlas7-c10 atlas7-c10 > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 > > > > > > Triad: 15434.5790 Rate (MB/s) > > > > > > Number of MPI processes 12 Processor names atlas7-c10 atlas7-c10 > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > > > Triad: 15134.1263 Rate (MB/s) > > > > > > ------------------------------------------------ > > > > > > np speedup > > > > > > 1 1.0 > > > > > > 2 1.06 > > > > > > 3 1.48 > > > > > > 4 1.55 > > > > > > 
5 1.59 > > > > > > 6 1.69 > > > > > > 7 1.66 > > > > > > 8 1.73 > > > > > > 9 1.72 > > > > > > 10 1.71 > > > > > > 11 1.69 > > > > > > 12 1.66 > > > > > > Estimation of possible speedup of MPI programs based on Streams > > > > > benchmark. > > > > > > It appears you have 1 node(s) > > > > > > Unable to plot speedup to a file > > > > > > Unable to open matplotlib to plot speedup > > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ > > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: scaling.png Type: image/png Size: 46047 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: test.log Type: text/x-log Size: 636 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: test.log Type: text/x-log Size: 636 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: make.log Type: text/x-log Size: 102045 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: text/x-log Size: 4616950 bytes Desc: not available URL: From niko.karin at gmail.com Fri May 5 11:14:03 2017 From: niko.karin at gmail.com (Karin&NiKo) Date: Fri, 5 May 2017 18:14:03 +0200 Subject: [petsc-users] Using SNES in a legacy code Message-ID: Dear PETSc team, I am part of the development team of legacy fortran code with a tailored Newton's method. The software is already using PETSc's linear solvers and we enjoy it. Now I would like to evaluate the SNES solver. I have already extracted a function in order to compute the Jacobian and another one to compute the residual. But there is something I cannot figure out : at each Newton's iteration, our solver needs to know the unknowns value in order to compute the Jacobian. But the increment vector is computed within the SNES. How can I synchronize PETSc's vector of unknowns and mine? Is there some kind of SNESSetPostSolveShell ? Thanks for developping PETSc, Nicolas -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri May 5 11:26:25 2017 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 5 May 2017 11:26:25 -0500 Subject: [petsc-users] Installation question In-Reply-To: References: Message-ID: On Fri, May 5, 2017 at 10:18 AM, Pham Pham wrote: > Hi Satish, > > It runs now, and shows a bad speed up: > Please help to improve this. > http://www.mcs.anl.gov/petsc/documentation/faq.html#computers The short answer is: You cannot improve this without buying a different machine. This is a fundamental algorithmic limitation that cannot be helped by threads, or vectorization, or anything else. Matt > Thank you. > > > ? 
> > On Fri, May 5, 2017 at 10:02 PM, Satish Balay wrote: > >> With Intel MPI - its best to use mpiexec.hydra [and not mpiexec] >> >> So you can do: >> >> make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >> PETSC_ARCH=arch-linux-cxx-opt MPIEXEC=mpiexec.hydra test >> >> >> [you can also specify --with-mpiexec=mpiexec.hydra at configure time] >> >> Satish >> >> >> On Fri, 5 May 2017, Pham Pham wrote: >> >> > *Hi,* >> > *I can configure now, but fail when testing:* >> > >> > [mpepvs at atlas7-c10 petsc-3.7.5]$ make >> > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >> PETSC_ARCH=arch-linux-cxx-opt >> > test Running test examples to verify correct installation >> > Using PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 and >> > PETSC_ARCH=arch-linux-cxx-opt >> > Possible error running C/C++ src/snes/examples/tutorials/ex19 with 1 >> MPI >> > process >> > See http://www.mcs.anl.gov/petsc/documentation/faq.html >> > mpiexec_atlas7-c10: cannot connect to local mpd >> (/tmp/mpd2.console_mpepvs); >> > possible causes: >> > 1. no mpd is running on this host >> > 2. an mpd is running but was started without a "console" (-n option) >> > Possible error running C/C++ src/snes/examples/tutorials/ex19 with 2 >> MPI >> > processes >> > See http://www.mcs.anl.gov/petsc/documentation/faq.html >> > mpiexec_atlas7-c10: cannot connect to local mpd >> (/tmp/mpd2.console_mpepvs); >> > possible causes: >> > 1. no mpd is running on this host >> > 2. an mpd is running but was started without a "console" (-n option) >> > Possible error running Fortran example src/snes/examples/tutorials/ex5f >> > with 1 MPI process >> > See http://www.mcs.anl.gov/petsc/documentation/faq.html >> > mpiexec_atlas7-c10: cannot connect to local mpd >> (/tmp/mpd2.console_mpepvs); >> > possible causes: >> > 1. no mpd is running on this host >> > 2. an mpd is running but was started without a "console" (-n option) >> > Completed test examples >> > ========================================= >> > Now to evaluate the computer systems you plan use - do: >> > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >> > PETSC_ARCH=arch-linux-cxx-opt streams >> > >> > >> > >> > >> > *Please help on this.* >> > *Many thanks!* >> > >> > >> > On Thu, Apr 20, 2017 at 2:02 AM, Satish Balay >> wrote: >> > >> > > Sorry - should have mentioned: >> > > >> > > do 'rm -rf arch-linux-cxx-opt' and rerun configure again. >> > > >> > > The mpich install from previous build [that is currently in >> > > arch-linux-cxx-opt/] >> > > is conflicting with --with-mpi-dir=/app1/centos6.3/gnu/mvapich2-1.9/ >> > > >> > > Satish >> > > >> > > >> > > On Wed, 19 Apr 2017, Pham Pham wrote: >> > > >> > > > I reconfigured PETSs with installed MPI, however, I got serous >> error: >> > > > >> > > > **************************ERROR***************************** >> ******** >> > > > Error during compile, check arch-linux-cxx-opt/lib/petsc/c >> onf/make.log >> > > > Send it and arch-linux-cxx-opt/lib/petsc/conf/configure.log to >> > > > petsc-maint at mcs.anl.gov >> > > > ************************************************************ >> ******** >> > > > >> > > > Please explain what is happening? >> > > > >> > > > Thank you very much. >> > > > >> > > > >> > > > >> > > > >> > > > On Wed, Apr 19, 2017 at 11:43 PM, Satish Balay >> > > wrote: >> > > > >> > > > > Presumably your cluster already has a recommended MPI to use >> [which is >> > > > > already installed. 
So you should use that - instead of >> > > > > --download-mpich=1 >> > > > > >> > > > > Satish >> > > > > >> > > > > On Wed, 19 Apr 2017, Pham Pham wrote: >> > > > > >> > > > > > Hi, >> > > > > > >> > > > > > I just installed petsc-3.7.5 into my university cluster. When >> > > evaluating >> > > > > > the computer system, PETSc reports "It appears you have 1 >> node(s)", I >> > > > > donot >> > > > > > understand this, since the system is a multinodes system. Could >> you >> > > > > please >> > > > > > explain this to me? >> > > > > > >> > > > > > Thank you very much. >> > > > > > >> > > > > > S. >> > > > > > >> > > > > > Output: >> > > > > > ========================================= >> > > > > > Now to evaluate the computer systems you plan use - do: >> > > > > > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >> > > > > > PETSC_ARCH=arch-linux-cxx-opt streams >> > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ make >> > > > > > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >> > > > > PETSC_ARCH=arch-linux-cxx-opt >> > > > > > streams >> > > > > > cd src/benchmarks/streams; /usr/bin/gmake --no-print-directory >> > > > > > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >> > > > > PETSC_ARCH=arch-linux-cxx-opt >> > > > > > streams >> > > > > > /home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/bin/mpicxx >> -o >> > > > > > MPIVersion.o -c -Wall -Wwrite-strings -Wno-strict-aliasing >> > > > > > -Wno-unknown-pragmas -fvisibility=hidden -g -O >> > > > > > -I/home/svu/mpepvs/petsc/petsc-3.7.5/include >> > > > > > -I/home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/include >> > > > > > `pwd`/MPIVersion.c >> > > > > > Running streams with >> > > > > > '/home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/bin/mpiexec >> ' >> > > > > using >> > > > > > 'NPMAX=12' >> > > > > > Number of MPI processes 1 Processor names atlas7-c10 >> > > > > > Triad: 9137.5025 Rate (MB/s) >> > > > > > Number of MPI processes 2 Processor names atlas7-c10 atlas7-c10 >> > > > > > Triad: 9707.2815 Rate (MB/s) >> > > > > > Number of MPI processes 3 Processor names atlas7-c10 atlas7-c10 >> > > > > atlas7-c10 >> > > > > > Triad: 13559.5275 Rate (MB/s) >> > > > > > Number of MPI processes 4 Processor names atlas7-c10 atlas7-c10 >> > > > > atlas7-c10 >> > > > > > atlas7-c10 >> > > > > > Triad: 14193.0597 Rate (MB/s) >> > > > > > Number of MPI processes 5 Processor names atlas7-c10 atlas7-c10 >> > > > > atlas7-c10 >> > > > > > atlas7-c10 atlas7-c10 >> > > > > > Triad: 14492.9234 Rate (MB/s) >> > > > > > Number of MPI processes 6 Processor names atlas7-c10 atlas7-c10 >> > > > > atlas7-c10 >> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 >> > > > > > Triad: 15476.5912 Rate (MB/s) >> > > > > > Number of MPI processes 7 Processor names atlas7-c10 atlas7-c10 >> > > > > atlas7-c10 >> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >> > > > > > Triad: 15148.7388 Rate (MB/s) >> > > > > > Number of MPI processes 8 Processor names atlas7-c10 atlas7-c10 >> > > > > atlas7-c10 >> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >> > > > > > Triad: 15799.1290 Rate (MB/s) >> > > > > > Number of MPI processes 9 Processor names atlas7-c10 atlas7-c10 >> > > > > atlas7-c10 >> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >> atlas7-c10 >> > > > > > Triad: 15671.3104 Rate (MB/s) >> > > > > > Number of MPI processes 10 Processor names atlas7-c10 >> atlas7-c10 >> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >> atlas7-c10 >> > > > > > atlas7-c10 atlas7-c10 >> > > > > > Triad: 15601.4754 Rate 
(MB/s) >> > > > > > Number of MPI processes 11 Processor names atlas7-c10 >> atlas7-c10 >> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >> atlas7-c10 >> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 >> > > > > > Triad: 15434.5790 Rate (MB/s) >> > > > > > Number of MPI processes 12 Processor names atlas7-c10 >> atlas7-c10 >> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >> atlas7-c10 >> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >> > > > > > Triad: 15134.1263 Rate (MB/s) >> > > > > > ------------------------------------------------ >> > > > > > np speedup >> > > > > > 1 1.0 >> > > > > > 2 1.06 >> > > > > > 3 1.48 >> > > > > > 4 1.55 >> > > > > > 5 1.59 >> > > > > > 6 1.69 >> > > > > > 7 1.66 >> > > > > > 8 1.73 >> > > > > > 9 1.72 >> > > > > > 10 1.71 >> > > > > > 11 1.69 >> > > > > > 12 1.66 >> > > > > > Estimation of possible speedup of MPI programs based on Streams >> > > > > benchmark. >> > > > > > It appears you have 1 node(s) >> > > > > > Unable to plot speedup to a file >> > > > > > Unable to open matplotlib to plot speedup >> > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ >> > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ >> > > > > > >> > > > > >> > > > > >> > > > >> > > >> > > >> > >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: scaling.png Type: image/png Size: 46047 bytes Desc: not available URL: From knepley at gmail.com Fri May 5 11:28:38 2017 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 5 May 2017 11:28:38 -0500 Subject: [petsc-users] Using SNES in a legacy code In-Reply-To: References: Message-ID: On Fri, May 5, 2017 at 11:14 AM, Karin&NiKo wrote: > Dear PETSc team, > > I am part of the development team of legacy fortran code with a tailored > Newton's method. The software is already using PETSc's linear solvers and > we enjoy it. Now I would like to evaluate the SNES solver. > I have already extracted a function in order to compute the Jacobian and > another one to compute the residual. > But there is something I cannot figure out : at each Newton's iteration, > our solver needs to know the unknowns value in order to compute the > Jacobian. But the increment vector is computed within the SNES. > The FormJacobian() function that you pass in gets the current guess as an argument. Thanks, Matt > How can I synchronize PETSc's vector of unknowns and mine? Is there some > kind of SNESSetPostSolveShell ? > > Thanks for developping PETSc, > Nicolas > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From tinap89 at yahoo.com Mon May 8 01:13:51 2017 From: tinap89 at yahoo.com (Tina Patel) Date: Mon, 8 May 2017 06:13:51 +0000 (UTC) Subject: [petsc-users] PETSc module using modules~ both using PETSc sys, DMDA, vec, etc References: <992950763.4463130.1494224031189.ref@mail.yahoo.com> Message-ID: <992950763.4463130.1494224031189@mail.yahoo.com> Hello, I created a few standalone programs that use a DMDA structure, calculate and create matrices. 
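Returning to Nicolas's SNES question above, a minimal sketch of what Matt describes: the current Newton iterate arrives as the second argument of the Jacobian callback, so no extra synchronization is needed (AppCtx and MyAssembleJacobian are placeholders for the legacy code, not PETSc names):

    typedef struct {
      /* whatever the legacy assembly routines need */
    } AppCtx;

    static PetscErrorCode FormJacobian(SNES snes, Vec x, Mat J, Mat P, void *ctx)
    {
      AppCtx            *user = (AppCtx*)ctx;
      const PetscScalar *xarr;
      PetscErrorCode     ierr;

      PetscFunctionBegin;
      ierr = VecGetArrayRead(x, &xarr);CHKERRQ(ierr);   /* x is the current iterate from SNES */
      /* MyAssembleJacobian(user, xarr, P);  placeholder for the existing assembly routine */
      ierr = VecRestoreArrayRead(x, &xarr);CHKERRQ(ierr);
      ierr = MatAssemblyBegin(P, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
      ierr = MatAssemblyEnd(P, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }

    /* registration; the residual callback receives the same current iterate */
    SNESSetJacobian(snes, J, J, FormJacobian, &user);

If a hook is still wanted at the start of every Newton iteration, SNESSetUpdate() provides one, but for passing the unknowns to the assembly routines the callback argument is enough.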
However, now that I am trying to combine them using a main and using the files as modules, the header files seem to consistently conflict. I am currently only trying to compile 3 files out of the several that i have completed.? when trying to compile 2 modules that i have, where the 1st module uses the 2nd module.?Common errors that i am getting is the? "symbol 'xxx' ... conflicts with symbol from module 'utils', use-associated at ..."and"Cannot change attributes of USE-associated symbol xxxx at" Is there a method to go about this? thank you. -Tina -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Mon May 8 08:37:17 2017 From: jed at jedbrown.org (Jed Brown) Date: Mon, 08 May 2017 07:37:17 -0600 Subject: [petsc-users] PETSc module using modules~ both using PETSc sys, DMDA, vec, etc In-Reply-To: <992950763.4463130.1494224031189@mail.yahoo.com> References: <992950763.4463130.1494224031189.ref@mail.yahoo.com> <992950763.4463130.1494224031189@mail.yahoo.com> Message-ID: <87r2zzbhbm.fsf@jedbrown.org> Tina Patel writes: > Hello, > I created a few standalone programs that use a DMDA structure, calculate and create matrices. However, now that I am trying to combine them using a main and using the files as modules, the header files seem to consistently conflict. I am currently only trying to compile 3 files out of the several that i have completed.? > when trying to compile 2 modules that i have, where the 1st module uses the 2nd module.?Common errors that i am getting is the? > "symbol 'xxx' ... conflicts with symbol from module 'utils', use-associated at ..."and"Cannot change attributes of USE-associated symbol xxxx at" > Is there a method to go about this? Are these errors related to PETSc? Note that if you reuse names in different modules, you may need to do a selective import. use modulename, only: foo, bar=>baz -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 832 bytes Desc: not available URL: From bsmith at mcs.anl.gov Mon May 8 11:25:58 2017 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 8 May 2017 11:25:58 -0500 Subject: [petsc-users] PETSc module using modules~ both using PETSc sys, DMDA, vec, etc In-Reply-To: <992950763.4463130.1494224031189@mail.yahoo.com> References: <992950763.4463130.1494224031189.ref@mail.yahoo.com> <992950763.4463130.1494224031189@mail.yahoo.com> Message-ID: Which version of PETSc are you using. The way we handle Fortran in the git master branch of the repository makes this easier than in the current release so you might consider upgrading. Send the entire error messages, likely you are creating two modules that contain the same PETSc variables which is not allowed with modules. Barry > On May 8, 2017, at 1:13 AM, Tina Patel wrote: > > Hello, > > I created a few standalone programs that use a DMDA structure, calculate and create matrices. However, now that I am trying to combine them using a main and using the files as modules, the header files seem to consistently conflict. I am currently only trying to compile 3 files out of the several that i have completed. > > when trying to compile 2 modules that i have, where the 1st module uses the 2nd module. > Common errors that i am getting is the > > "symbol 'xxx' ... conflicts with symbol from module 'utils', use-associated at ..." > and > "Cannot change attributes of USE-associated symbol xxxx at" > > Is there a method to go about this? thank you. 
> > -Tina From leidy-catherine.ramirez-villalba at ec-nantes.fr Tue May 9 11:04:06 2017 From: leidy-catherine.ramirez-villalba at ec-nantes.fr (Leidy Catherine Ramirez Villalba) Date: Tue, 9 May 2017 18:04:06 +0200 (CEST) Subject: [petsc-users] Optimization time parallel version of structural solver Message-ID: <432790006.651849.1494345846204.JavaMail.zimbra@ec-nantes.fr> Dear PETSc team, I'm currently working on the parallelization of the assembling of a system, previously assembled in a serial way (manual), but solved using PETSc in parallel. The problem I have is that when comparing computational time with the previous implementation, it seem that the parallel version is slower than the serial one. The type of matrices we deal with are sparse and might change their size in a significant order (kind of contact problems, where relations between elements change). For the example I'm using, for giving an example, the initial size of the matrix is : 139905, after several iteratinos it changes to: 141501 and finally to: 254172. The system is assembled and solved at each iteration and the matrix can not be re-used, therefore for each new iteration the matrix is set to zero keeping the previous non-zero pattern, and the option 'MAT_NEW_NONZERO_LOCATIONS' is set to 'TRUE'. In order to do the assembling I use the function 'MatSetValues' , inserting 3 lines and 3 rows, which might not be next to each other, and thus might no constitute a block. I believe that, what makes an important difference in time is the fact of adding almost the double of elements (from 139905 to 254172), but i don't know how what could I implement to retain a larger preallocation or to solve in any other way. I don't know, neither, in advance the position of new elements so that I can think in placing zeros to, maybe, generate a pre-pattern. Do you have any idea of how could I improve the time of the parallel version? Thanks in advance! Regards, Catherine -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 9 11:11:24 2017 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 9 May 2017 11:11:24 -0500 Subject: [petsc-users] Optimization time parallel version of structural solver In-Reply-To: <432790006.651849.1494345846204.JavaMail.zimbra@ec-nantes.fr> References: <432790006.651849.1494345846204.JavaMail.zimbra@ec-nantes.fr> Message-ID: On Tue, May 9, 2017 at 11:04 AM, "Leidy Catherine Ramirez Villalba" < leidy-catherine.ramirez-villalba at ec-nantes.fr> wrote: > Dear PETSc team, > > I'm currently working on the parallelization of the assembling of a > system, previously assembled in a serial way (manual), but solved using > PETSc in parallel. > The problem I have is that when comparing computational time with the > previous implementation, it seem that the parallel version is slower than > the serial one. > > The type of matrices we deal with are sparse and might change their size > in a significant order (kind of contact problems, where relations between > elements change). > For the example I'm using, for giving an example, the initial size of the > matrix is : 139905, after several iteratinos it changes to: 141501 and > finally to: 254172. > > The system is assembled and solved at each iteration and the matrix can > not be re-used, therefore for each new iteration the matrix is set to zero > keeping the previous non-zero pattern, and the option > 'MAT_NEW_NONZERO_LOCATIONS' is set to 'TRUE'. 
> In order to do the assembling I use the function 'MatSetValues' , > inserting 3 lines and 3 rows, which might not be next to each other, and > thus might no constitute a block. > > I believe that, what makes an important difference in time is the fact of > adding almost the double of elements (from 139905 to 254172), but i don't > know how what could I implement to retain a larger preallocation or to > solve in any other way. > I don't know, neither, in advance the position of new elements so that I > can think in placing zeros to, maybe, generate a pre-pattern. > 1) Conceivably, the difference in parallel might be that you are setting elements owned by other processes. However for all performance questions, we need to see the output of -log_view 2) Certainly inserting new elements is slow because reallocating the matrix is slow. It would be faster to a) Throw away the first matrix b) Count all nonzeros in the second matrix c) Set preallocation for the second matrix d) Fill in the second matrix Thanks, Matt > Do you have any idea of how could I improve the time of the parallel > version? > > Thanks in advance! > > Regards, > Catherine > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From pvsang002 at gmail.com Thu May 11 07:08:24 2017 From: pvsang002 at gmail.com (Pham Pham) Date: Thu, 11 May 2017 19:08:24 +0700 Subject: [petsc-users] Installation question In-Reply-To: References: Message-ID: Hi Matt, Thank you for the reply. I am using University HPC which has multiple nodes, and should be good for parallel computing. The bad performance might be due to the way I install and run PETSc... Looking at the output when running streams, I can see that the Processor names were the same. Does that mean only one processor involved in computing, did it cause the bad performance? Thank you very much. Ph. 
Below is testing output: [mpepvs at atlas5-c01 petsc-3.7.5]$ make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt streams cd src/benchmarks/streams; /usr/bin/gmake --no-print-directory PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt streams /app1/centos6.3/Intel/xe_2015/impi/5.0.3.048/intel64/bin/mpicxx -o MPIVersion.o c -wd1572 -g -O3 -fPIC -I/home/svu/mpepvs/petsc/petsc-3.7.5/include -I/hom e/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/include -I/app1/centos6.3/Intel/xe_2015/impi/5.0.3.048/intel64/include `pwd`/MPIVersion.c +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ The version of PETSc you are using is out-of-date, we recommend updating to the new release Available Version: 3.7.6 Installed Version: 3.7.5 http://www.mcs.anl.gov/petsc/download/index.html +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Running streams with 'mpiexec.hydra ' using 'NPMAX=12' Number of MPI processes 1 Processor names atlas5-c01 Triad: 11026.7604 Rate (MB/s) Number of MPI processes 2 Processor names atlas5-c01 atlas5-c01 Triad: 14669.6730 Rate (MB/s) Number of MPI processes 3 Processor names atlas5-c01 atlas5-c01 atlas5-c01 Triad: 12848.2644 Rate (MB/s) Number of MPI processes 4 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 Triad: 15033.7687 Rate (MB/s) Number of MPI processes 5 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 Triad: 13299.3830 Rate (MB/s) Number of MPI processes 6 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 Triad: 14382.2116 Rate (MB/s) Number of MPI processes 7 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 Triad: 13194.2573 Rate (MB/s) Number of MPI processes 8 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 Triad: 14199.7255 Rate (MB/s) Number of MPI processes 9 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 Triad: 13045.8946 Rate (MB/s) Number of MPI processes 10 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 Triad: 13058.3283 Rate (MB/s) Number of MPI processes 11 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 Triad: 13037.3334 Rate (MB/s) Number of MPI processes 12 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 Triad: 12526.6096 Rate (MB/s) ------------------------------------------------ np speedup 1 1.0 2 1.33 3 1.17 4 1.36 5 1.21 6 1.3 7 1.2 8 1.29 9 1.18 10 1.18 11 1.18 12 1.14 Estimation of possible speedup of MPI programs based on Streams benchmark. It appears you have 1 node(s) See graph in the file src/benchmarks/streams/scaling.png On Fri, May 5, 2017 at 11:26 PM, Matthew Knepley wrote: > On Fri, May 5, 2017 at 10:18 AM, Pham Pham wrote: > >> Hi Satish, >> >> It runs now, and shows a bad speed up: >> Please help to improve this. >> > > http://www.mcs.anl.gov/petsc/documentation/faq.html#computers > > The short answer is: You cannot improve this without buying a different > machine. This is > a fundamental algorithmic limitation that cannot be helped by threads, or > vectorization, or > anything else. 
> > Matt > > >> Thank you. >> >> >> ? >> >> On Fri, May 5, 2017 at 10:02 PM, Satish Balay wrote: >> >>> With Intel MPI - its best to use mpiexec.hydra [and not mpiexec] >>> >>> So you can do: >>> >>> make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >>> PETSC_ARCH=arch-linux-cxx-opt MPIEXEC=mpiexec.hydra test >>> >>> >>> [you can also specify --with-mpiexec=mpiexec.hydra at configure time] >>> >>> Satish >>> >>> >>> On Fri, 5 May 2017, Pham Pham wrote: >>> >>> > *Hi,* >>> > *I can configure now, but fail when testing:* >>> > >>> > [mpepvs at atlas7-c10 petsc-3.7.5]$ make >>> > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >>> PETSC_ARCH=arch-linux-cxx-opt >>> > test Running test examples to verify correct installation >>> > Using PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 and >>> > PETSC_ARCH=arch-linux-cxx-opt >>> > Possible error running C/C++ src/snes/examples/tutorials/ex19 with 1 >>> MPI >>> > process >>> > See http://www.mcs.anl.gov/petsc/documentation/faq.html >>> > mpiexec_atlas7-c10: cannot connect to local mpd >>> (/tmp/mpd2.console_mpepvs); >>> > possible causes: >>> > 1. no mpd is running on this host >>> > 2. an mpd is running but was started without a "console" (-n option) >>> > Possible error running C/C++ src/snes/examples/tutorials/ex19 with 2 >>> MPI >>> > processes >>> > See http://www.mcs.anl.gov/petsc/documentation/faq.html >>> > mpiexec_atlas7-c10: cannot connect to local mpd >>> (/tmp/mpd2.console_mpepvs); >>> > possible causes: >>> > 1. no mpd is running on this host >>> > 2. an mpd is running but was started without a "console" (-n option) >>> > Possible error running Fortran example src/snes/examples/tutorials/ex >>> 5f >>> > with 1 MPI process >>> > See http://www.mcs.anl.gov/petsc/documentation/faq.html >>> > mpiexec_atlas7-c10: cannot connect to local mpd >>> (/tmp/mpd2.console_mpepvs); >>> > possible causes: >>> > 1. no mpd is running on this host >>> > 2. an mpd is running but was started without a "console" (-n option) >>> > Completed test examples >>> > ========================================= >>> > Now to evaluate the computer systems you plan use - do: >>> > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >>> > PETSC_ARCH=arch-linux-cxx-opt streams >>> > >>> > >>> > >>> > >>> > *Please help on this.* >>> > *Many thanks!* >>> > >>> > >>> > On Thu, Apr 20, 2017 at 2:02 AM, Satish Balay >>> wrote: >>> > >>> > > Sorry - should have mentioned: >>> > > >>> > > do 'rm -rf arch-linux-cxx-opt' and rerun configure again. >>> > > >>> > > The mpich install from previous build [that is currently in >>> > > arch-linux-cxx-opt/] >>> > > is conflicting with --with-mpi-dir=/app1/centos6.3/gnu/mvapich2-1.9/ >>> > > >>> > > Satish >>> > > >>> > > >>> > > On Wed, 19 Apr 2017, Pham Pham wrote: >>> > > >>> > > > I reconfigured PETSs with installed MPI, however, I got serous >>> error: >>> > > > >>> > > > **************************ERROR***************************** >>> ******** >>> > > > Error during compile, check arch-linux-cxx-opt/lib/petsc/c >>> onf/make.log >>> > > > Send it and arch-linux-cxx-opt/lib/petsc/conf/configure.log to >>> > > > petsc-maint at mcs.anl.gov >>> > > > ************************************************************ >>> ******** >>> > > > >>> > > > Please explain what is happening? >>> > > > >>> > > > Thank you very much. 
>>> > > > >>> > > > >>> > > > >>> > > > >>> > > > On Wed, Apr 19, 2017 at 11:43 PM, Satish Balay >>> > > wrote: >>> > > > >>> > > > > Presumably your cluster already has a recommended MPI to use >>> [which is >>> > > > > already installed. So you should use that - instead of >>> > > > > --download-mpich=1 >>> > > > > >>> > > > > Satish >>> > > > > >>> > > > > On Wed, 19 Apr 2017, Pham Pham wrote: >>> > > > > >>> > > > > > Hi, >>> > > > > > >>> > > > > > I just installed petsc-3.7.5 into my university cluster. When >>> > > evaluating >>> > > > > > the computer system, PETSc reports "It appears you have 1 >>> node(s)", I >>> > > > > donot >>> > > > > > understand this, since the system is a multinodes system. >>> Could you >>> > > > > please >>> > > > > > explain this to me? >>> > > > > > >>> > > > > > Thank you very much. >>> > > > > > >>> > > > > > S. >>> > > > > > >>> > > > > > Output: >>> > > > > > ========================================= >>> > > > > > Now to evaluate the computer systems you plan use - do: >>> > > > > > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >>> > > > > > PETSC_ARCH=arch-linux-cxx-opt streams >>> > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ make >>> > > > > > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >>> > > > > PETSC_ARCH=arch-linux-cxx-opt >>> > > > > > streams >>> > > > > > cd src/benchmarks/streams; /usr/bin/gmake --no-print-directory >>> > > > > > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >>> > > > > PETSC_ARCH=arch-linux-cxx-opt >>> > > > > > streams >>> > > > > > /home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/bin/mpicxx >>> -o >>> > > > > > MPIVersion.o -c -Wall -Wwrite-strings -Wno-strict-aliasing >>> > > > > > -Wno-unknown-pragmas -fvisibility=hidden -g -O >>> > > > > > -I/home/svu/mpepvs/petsc/petsc-3.7.5/include >>> > > > > > -I/home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/incl >>> ude >>> > > > > > `pwd`/MPIVersion.c >>> > > > > > Running streams with >>> > > > > > '/home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/bin/mpiexec >>> ' >>> > > > > using >>> > > > > > 'NPMAX=12' >>> > > > > > Number of MPI processes 1 Processor names atlas7-c10 >>> > > > > > Triad: 9137.5025 Rate (MB/s) >>> > > > > > Number of MPI processes 2 Processor names atlas7-c10 >>> atlas7-c10 >>> > > > > > Triad: 9707.2815 Rate (MB/s) >>> > > > > > Number of MPI processes 3 Processor names atlas7-c10 >>> atlas7-c10 >>> > > > > atlas7-c10 >>> > > > > > Triad: 13559.5275 Rate (MB/s) >>> > > > > > Number of MPI processes 4 Processor names atlas7-c10 >>> atlas7-c10 >>> > > > > atlas7-c10 >>> > > > > > atlas7-c10 >>> > > > > > Triad: 14193.0597 Rate (MB/s) >>> > > > > > Number of MPI processes 5 Processor names atlas7-c10 >>> atlas7-c10 >>> > > > > atlas7-c10 >>> > > > > > atlas7-c10 atlas7-c10 >>> > > > > > Triad: 14492.9234 Rate (MB/s) >>> > > > > > Number of MPI processes 6 Processor names atlas7-c10 >>> atlas7-c10 >>> > > > > atlas7-c10 >>> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 >>> > > > > > Triad: 15476.5912 Rate (MB/s) >>> > > > > > Number of MPI processes 7 Processor names atlas7-c10 >>> atlas7-c10 >>> > > > > atlas7-c10 >>> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >>> > > > > > Triad: 15148.7388 Rate (MB/s) >>> > > > > > Number of MPI processes 8 Processor names atlas7-c10 >>> atlas7-c10 >>> > > > > atlas7-c10 >>> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >>> > > > > > Triad: 15799.1290 Rate (MB/s) >>> > > > > > Number of MPI processes 9 Processor names atlas7-c10 >>> atlas7-c10 >>> > > > > atlas7-c10 >>> > 
> > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >>> atlas7-c10 >>> > > > > > Triad: 15671.3104 Rate (MB/s) >>> > > > > > Number of MPI processes 10 Processor names atlas7-c10 >>> atlas7-c10 >>> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >>> atlas7-c10 >>> > > > > > atlas7-c10 atlas7-c10 >>> > > > > > Triad: 15601.4754 Rate (MB/s) >>> > > > > > Number of MPI processes 11 Processor names atlas7-c10 >>> atlas7-c10 >>> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >>> atlas7-c10 >>> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 >>> > > > > > Triad: 15434.5790 Rate (MB/s) >>> > > > > > Number of MPI processes 12 Processor names atlas7-c10 >>> atlas7-c10 >>> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >>> atlas7-c10 >>> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >>> > > > > > Triad: 15134.1263 Rate (MB/s) >>> > > > > > ------------------------------------------------ >>> > > > > > np speedup >>> > > > > > 1 1.0 >>> > > > > > 2 1.06 >>> > > > > > 3 1.48 >>> > > > > > 4 1.55 >>> > > > > > 5 1.59 >>> > > > > > 6 1.69 >>> > > > > > 7 1.66 >>> > > > > > 8 1.73 >>> > > > > > 9 1.72 >>> > > > > > 10 1.71 >>> > > > > > 11 1.69 >>> > > > > > 12 1.66 >>> > > > > > Estimation of possible speedup of MPI programs based on Streams >>> > > > > benchmark. >>> > > > > > It appears you have 1 node(s) >>> > > > > > Unable to plot speedup to a file >>> > > > > > Unable to open matplotlib to plot speedup >>> > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ >>> > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ >>> > > > > > >>> > > > > >>> > > > > >>> > > > >>> > > >>> > > >>> > >>> >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: scaling.png Type: image/png Size: 46047 bytes Desc: not available URL: From knepley at gmail.com Thu May 11 07:27:19 2017 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 11 May 2017 07:27:19 -0500 Subject: [petsc-users] Installation question In-Reply-To: References: Message-ID: On Thu, May 11, 2017 at 7:08 AM, Pham Pham wrote: > Hi Matt, > > Thank you for the reply. > > I am using University HPC which has multiple nodes, and should be good for > parallel computing. The bad performance might be due to the way I install > and run PETSc... > > Looking at the output when running streams, I can see that the Processor > names were the same. > Does that mean only one processor involved in computing, did it cause the > bad performance? > Yes. From the data, it appears that the kind of processor you have has 12 cores, but only enough memory bandwidth to support 1.5 cores. Try running the STREAMS with only 1 process per node. This is a setting in your submission script, but it is different for every cluster. Thus I would ask the local sysdamin for this machine to help you do that. You should see almost perfect scaling with that configuration. You might also try 2 processes per node to compare. Thanks, Matt > Thank you very much. > > Ph. 
> > Below is testing output: > > [mpepvs at atlas5-c01 petsc-3.7.5]$ make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > PETSC_ARCH=arch-linux-cxx-opt streams > > > > > cd src/benchmarks/streams; /usr/bin/gmake --no-print-directory > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > PETSC_ARCH=arch-linux-cxx-opt streams > /app1/centos6.3/Intel/xe_2015/impi/5.0.3.048/intel64/bin/mpicxx -o > MPIVersion.o c -wd1572 -g -O3 -fPIC -I/home/svu/mpepvs/petsc/petsc-3.7.5/include > -I/hom > > > e/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/include > -I/app1/centos6.3/Intel/xe_2015/impi/5.0.3.048/intel64/include > `pwd`/MPIVersion.c > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > +++++++++++++++++++++++++++++++ > The version of PETSc you are using is out-of-date, we recommend updating > to the new release > Available Version: 3.7.6 Installed Version: 3.7.5 > http://www.mcs.anl.gov/petsc/download/index.html > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > +++++++++++++++++++++++++++++++ > Running streams with 'mpiexec.hydra ' using 'NPMAX=12' > Number of MPI processes 1 Processor names atlas5-c01 > Triad: 11026.7604 Rate (MB/s) > Number of MPI processes 2 Processor names atlas5-c01 atlas5-c01 > Triad: 14669.6730 Rate (MB/s) > Number of MPI processes 3 Processor names atlas5-c01 atlas5-c01 atlas5-c01 > Triad: 12848.2644 Rate (MB/s) > Number of MPI processes 4 Processor names atlas5-c01 atlas5-c01 > atlas5-c01 atlas5-c01 > Triad: 15033.7687 Rate (MB/s) > Number of MPI processes 5 Processor names atlas5-c01 atlas5-c01 > atlas5-c01 atlas5-c01 atlas5-c01 > Triad: 13299.3830 Rate (MB/s) > Number of MPI processes 6 Processor names atlas5-c01 atlas5-c01 > atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > Triad: 14382.2116 Rate (MB/s) > Number of MPI processes 7 Processor names atlas5-c01 atlas5-c01 > atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > Triad: 13194.2573 Rate (MB/s) > Number of MPI processes 8 Processor names atlas5-c01 atlas5-c01 > atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > Triad: 14199.7255 Rate (MB/s) > Number of MPI processes 9 Processor names atlas5-c01 atlas5-c01 > atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > Triad: 13045.8946 Rate (MB/s) > Number of MPI processes 10 Processor names atlas5-c01 atlas5-c01 > atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > atlas5-c01 atlas5-c01 > Triad: 13058.3283 Rate (MB/s) > Number of MPI processes 11 Processor names atlas5-c01 atlas5-c01 > atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > atlas5-c01 atlas5-c01 atlas5-c01 > Triad: 13037.3334 Rate (MB/s) > Number of MPI processes 12 Processor names atlas5-c01 atlas5-c01 > atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > Triad: 12526.6096 Rate (MB/s) > ------------------------------------------------ > np speedup > 1 1.0 > 2 1.33 > 3 1.17 > 4 1.36 > 5 1.21 > 6 1.3 > 7 1.2 > 8 1.29 > 9 1.18 > 10 1.18 > 11 1.18 > 12 1.14 > Estimation of possible speedup of MPI programs based on Streams benchmark. > It appears you have 1 node(s) > See graph in the file src/benchmarks/streams/scaling.png > > On Fri, May 5, 2017 at 11:26 PM, Matthew Knepley > wrote: > >> On Fri, May 5, 2017 at 10:18 AM, Pham Pham wrote: >> >>> Hi Satish, >>> >>> It runs now, and shows a bad speed up: >>> Please help to improve this. 
>>> >> >> http://www.mcs.anl.gov/petsc/documentation/faq.html#computers >> >> The short answer is: You cannot improve this without buying a different >> machine. This is >> a fundamental algorithmic limitation that cannot be helped by threads, or >> vectorization, or >> anything else. >> >> Matt >> >> >>> Thank you. >>> >>> >>> ? >>> >>> On Fri, May 5, 2017 at 10:02 PM, Satish Balay wrote: >>> >>>> With Intel MPI - its best to use mpiexec.hydra [and not mpiexec] >>>> >>>> So you can do: >>>> >>>> make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >>>> PETSC_ARCH=arch-linux-cxx-opt MPIEXEC=mpiexec.hydra test >>>> >>>> >>>> [you can also specify --with-mpiexec=mpiexec.hydra at configure time] >>>> >>>> Satish >>>> >>>> >>>> On Fri, 5 May 2017, Pham Pham wrote: >>>> >>>> > *Hi,* >>>> > *I can configure now, but fail when testing:* >>>> > >>>> > [mpepvs at atlas7-c10 petsc-3.7.5]$ make >>>> > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >>>> PETSC_ARCH=arch-linux-cxx-opt >>>> > test Running test examples to verify correct installation >>>> > Using PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 and >>>> > PETSC_ARCH=arch-linux-cxx-opt >>>> > Possible error running C/C++ src/snes/examples/tutorials/ex19 with 1 >>>> MPI >>>> > process >>>> > See http://www.mcs.anl.gov/petsc/documentation/faq.html >>>> > mpiexec_atlas7-c10: cannot connect to local mpd >>>> (/tmp/mpd2.console_mpepvs); >>>> > possible causes: >>>> > 1. no mpd is running on this host >>>> > 2. an mpd is running but was started without a "console" (-n option) >>>> > Possible error running C/C++ src/snes/examples/tutorials/ex19 with 2 >>>> MPI >>>> > processes >>>> > See http://www.mcs.anl.gov/petsc/documentation/faq.html >>>> > mpiexec_atlas7-c10: cannot connect to local mpd >>>> (/tmp/mpd2.console_mpepvs); >>>> > possible causes: >>>> > 1. no mpd is running on this host >>>> > 2. an mpd is running but was started without a "console" (-n option) >>>> > Possible error running Fortran example src/snes/examples/tutorials/ex >>>> 5f >>>> > with 1 MPI process >>>> > See http://www.mcs.anl.gov/petsc/documentation/faq.html >>>> > mpiexec_atlas7-c10: cannot connect to local mpd >>>> (/tmp/mpd2.console_mpepvs); >>>> > possible causes: >>>> > 1. no mpd is running on this host >>>> > 2. an mpd is running but was started without a "console" (-n option) >>>> > Completed test examples >>>> > ========================================= >>>> > Now to evaluate the computer systems you plan use - do: >>>> > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >>>> > PETSC_ARCH=arch-linux-cxx-opt streams >>>> > >>>> > >>>> > >>>> > >>>> > *Please help on this.* >>>> > *Many thanks!* >>>> > >>>> > >>>> > On Thu, Apr 20, 2017 at 2:02 AM, Satish Balay >>>> wrote: >>>> > >>>> > > Sorry - should have mentioned: >>>> > > >>>> > > do 'rm -rf arch-linux-cxx-opt' and rerun configure again. 
>>>> > > >>>> > > The mpich install from previous build [that is currently in >>>> > > arch-linux-cxx-opt/] >>>> > > is conflicting with --with-mpi-dir=/app1/centos6.3 >>>> /gnu/mvapich2-1.9/ >>>> > > >>>> > > Satish >>>> > > >>>> > > >>>> > > On Wed, 19 Apr 2017, Pham Pham wrote: >>>> > > >>>> > > > I reconfigured PETSs with installed MPI, however, I got serous >>>> error: >>>> > > > >>>> > > > **************************ERROR***************************** >>>> ******** >>>> > > > Error during compile, check arch-linux-cxx-opt/lib/petsc/c >>>> onf/make.log >>>> > > > Send it and arch-linux-cxx-opt/lib/petsc/conf/configure.log to >>>> > > > petsc-maint at mcs.anl.gov >>>> > > > ************************************************************ >>>> ******** >>>> > > > >>>> > > > Please explain what is happening? >>>> > > > >>>> > > > Thank you very much. >>>> > > > >>>> > > > >>>> > > > >>>> > > > >>>> > > > On Wed, Apr 19, 2017 at 11:43 PM, Satish Balay >>> > >>>> > > wrote: >>>> > > > >>>> > > > > Presumably your cluster already has a recommended MPI to use >>>> [which is >>>> > > > > already installed. So you should use that - instead of >>>> > > > > --download-mpich=1 >>>> > > > > >>>> > > > > Satish >>>> > > > > >>>> > > > > On Wed, 19 Apr 2017, Pham Pham wrote: >>>> > > > > >>>> > > > > > Hi, >>>> > > > > > >>>> > > > > > I just installed petsc-3.7.5 into my university cluster. When >>>> > > evaluating >>>> > > > > > the computer system, PETSc reports "It appears you have 1 >>>> node(s)", I >>>> > > > > donot >>>> > > > > > understand this, since the system is a multinodes system. >>>> Could you >>>> > > > > please >>>> > > > > > explain this to me? >>>> > > > > > >>>> > > > > > Thank you very much. >>>> > > > > > >>>> > > > > > S. >>>> > > > > > >>>> > > > > > Output: >>>> > > > > > ========================================= >>>> > > > > > Now to evaluate the computer systems you plan use - do: >>>> > > > > > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >>>> > > > > > PETSC_ARCH=arch-linux-cxx-opt streams >>>> > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ make >>>> > > > > > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >>>> > > > > PETSC_ARCH=arch-linux-cxx-opt >>>> > > > > > streams >>>> > > > > > cd src/benchmarks/streams; /usr/bin/gmake >>>> --no-print-directory >>>> > > > > > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >>>> > > > > PETSC_ARCH=arch-linux-cxx-opt >>>> > > > > > streams >>>> > > > > > /home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/bin/mpicxx >>>> -o >>>> > > > > > MPIVersion.o -c -Wall -Wwrite-strings -Wno-strict-aliasing >>>> > > > > > -Wno-unknown-pragmas -fvisibility=hidden -g -O >>>> > > > > > -I/home/svu/mpepvs/petsc/petsc-3.7.5/include >>>> > > > > > -I/home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/incl >>>> ude >>>> > > > > > `pwd`/MPIVersion.c >>>> > > > > > Running streams with >>>> > > > > > '/home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/bin/mpiexec >>>> ' >>>> > > > > using >>>> > > > > > 'NPMAX=12' >>>> > > > > > Number of MPI processes 1 Processor names atlas7-c10 >>>> > > > > > Triad: 9137.5025 Rate (MB/s) >>>> > > > > > Number of MPI processes 2 Processor names atlas7-c10 >>>> atlas7-c10 >>>> > > > > > Triad: 9707.2815 Rate (MB/s) >>>> > > > > > Number of MPI processes 3 Processor names atlas7-c10 >>>> atlas7-c10 >>>> > > > > atlas7-c10 >>>> > > > > > Triad: 13559.5275 Rate (MB/s) >>>> > > > > > Number of MPI processes 4 Processor names atlas7-c10 >>>> atlas7-c10 >>>> > > > > atlas7-c10 >>>> > > > > > atlas7-c10 >>>> > > > > > 
Triad: 14193.0597 Rate (MB/s) >>>> > > > > > Number of MPI processes 5 Processor names atlas7-c10 >>>> atlas7-c10 >>>> > > > > atlas7-c10 >>>> > > > > > atlas7-c10 atlas7-c10 >>>> > > > > > Triad: 14492.9234 Rate (MB/s) >>>> > > > > > Number of MPI processes 6 Processor names atlas7-c10 >>>> atlas7-c10 >>>> > > > > atlas7-c10 >>>> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 >>>> > > > > > Triad: 15476.5912 Rate (MB/s) >>>> > > > > > Number of MPI processes 7 Processor names atlas7-c10 >>>> atlas7-c10 >>>> > > > > atlas7-c10 >>>> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >>>> > > > > > Triad: 15148.7388 Rate (MB/s) >>>> > > > > > Number of MPI processes 8 Processor names atlas7-c10 >>>> atlas7-c10 >>>> > > > > atlas7-c10 >>>> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >>>> > > > > > Triad: 15799.1290 Rate (MB/s) >>>> > > > > > Number of MPI processes 9 Processor names atlas7-c10 >>>> atlas7-c10 >>>> > > > > atlas7-c10 >>>> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >>>> atlas7-c10 >>>> > > > > > Triad: 15671.3104 Rate (MB/s) >>>> > > > > > Number of MPI processes 10 Processor names atlas7-c10 >>>> atlas7-c10 >>>> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >>>> atlas7-c10 >>>> > > > > > atlas7-c10 atlas7-c10 >>>> > > > > > Triad: 15601.4754 Rate (MB/s) >>>> > > > > > Number of MPI processes 11 Processor names atlas7-c10 >>>> atlas7-c10 >>>> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >>>> atlas7-c10 >>>> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 >>>> > > > > > Triad: 15434.5790 Rate (MB/s) >>>> > > > > > Number of MPI processes 12 Processor names atlas7-c10 >>>> atlas7-c10 >>>> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >>>> atlas7-c10 >>>> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >>>> > > > > > Triad: 15134.1263 Rate (MB/s) >>>> > > > > > ------------------------------------------------ >>>> > > > > > np speedup >>>> > > > > > 1 1.0 >>>> > > > > > 2 1.06 >>>> > > > > > 3 1.48 >>>> > > > > > 4 1.55 >>>> > > > > > 5 1.59 >>>> > > > > > 6 1.69 >>>> > > > > > 7 1.66 >>>> > > > > > 8 1.73 >>>> > > > > > 9 1.72 >>>> > > > > > 10 1.71 >>>> > > > > > 11 1.69 >>>> > > > > > 12 1.66 >>>> > > > > > Estimation of possible speedup of MPI programs based on >>>> Streams >>>> > > > > benchmark. >>>> > > > > > It appears you have 1 node(s) >>>> > > > > > Unable to plot speedup to a file >>>> > > > > > Unable to open matplotlib to plot speedup >>>> > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ >>>> > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ >>>> > > > > > >>>> > > > > >>>> > > > > >>>> > > > >>>> > > >>>> > > >>>> > >>>> >>>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: scaling.png Type: image/png Size: 46047 bytes Desc: not available URL: From gbisht at lbl.gov Thu May 11 11:24:39 2017 From: gbisht at lbl.gov (Gautam Bisht) Date: Thu, 11 May 2017 09:24:39 -0700 Subject: [petsc-users] [petsc-dev] For Fortran users of PETSc development version In-Reply-To: <4726FF04-1D07-455F-88B4-FF2488DCAE07@mcs.anl.gov> References: <4726FF04-1D07-455F-88B4-FF2488DCAE07@mcs.anl.gov> Message-ID: Hi Barry, I'm wondering if these changes will be a part of future 3.8.0 release or 4.0.0 release. And, do you have a tentative timeline when such a release tag would be made? -Gautam. On Sun, Dec 4, 2016 at 11:13 AM, Barry Smith wrote: > > Jed noticed a small mistake in my description. It is type(tXXX) not > type(iXXX) if you chose to declare your variables that way. Note that > declaring them via type(tXXX) or XXX is identical (XXX is just a macro for > type(tXXX)). > > Barry > > > > On Dec 4, 2016, at 11:57 AM, Barry Smith wrote: > > > > > > For Fortran users of the PETSc development (git master branch) version > > > > > > I have updated and simplified the Fortran usage of PETSc in the past > few weeks. I will put the branch barry/fortran-update into the master > branch on Monday. The usage changes are > > > > A) for each Fortran function (and main) use the following > > > > subroutine mysubroutine(.....) > > #include > > use petscxxx > > implicit none > > > > For example if you are using SNES in your code you would have > > > > #include > > use petscsnes > > implicit none > > > > B) Instead of PETSC_NULL_OBJECT you must pass PETSC_NULL_XXX (for > example PETSC_NULL_VEC) using the specific object type XXX that the > function call is expecting. > > > > C) Objects can be declared either as XXX a or type(iXXX) a, for > example Mat a or type(iMat) a. (Note that previously for those who used > types it was type(Mat) but that can no longer be used. > > > > Notes: > > > > 1) There are no longer any .h90 files that may be included > > > > 2) Like C the include files are now nested so you no longer need to > include for example > > > > #include > > #include > > #include > > #include > > #include > > > > you can just include > > > > #include > > > > 3) there is now type checking of most function calls. This will help > eliminate bugs due to incorrect calling sequences. Note that Fortran > distinguishes between a argument that is a scalar (zero dimensional array), > a one dimensional array and a two dimensional array (etc). So you may get > compile warnings because you are passing in an array when PETSc expects a > scalar or vis-versa. If you get these simply fix your declaration of the > variable to match what is expected. In some routines like MatSetValues() > and friends you can pass either scalars, one dimensional arrays or two > dimensional arrays, if you get errors here please send mail to > petsc-maint at mcs.anl.gov and include enough of your code so we can see the > dimensions of all your variables so we can fix the problems. > > > > 4) You can continue to use either fixed (.F extension) or free format > (.F90 extension) for your source > > > > 5) All the examples in PETSc have been updated so consult them for > clarifications. > > > > > > Please report any problems to petsc-maint at mcs.anl.gov > > > > Thanks > > > > Barry > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From leejearl at 126.com Thu May 11 22:57:03 2017 From: leejearl at 126.com (=?GBK?B?wO68vg==?=) Date: Fri, 12 May 2017 11:57:03 +0800 (CST) Subject: [petsc-users] how to get the vertices belongs to a control volume in 2D? Message-ID: <298d38df.50bb.15bfacd8032.Coremail.leejearl@126.com> Hi developers: I have such a question that I want to get the vertices of a cell. I know I can get the points by 1. Getting the faces of a cell such as "DMPlexGetCone(dm, c, &faces"; 2. Getting the vertices of every face of the cell such as "DMPlexGetCone(dm, f, &vertices)". Then I can obtain the vertices belongs to a cell. Is there any concise routine which I can choose to get the vertices of a cell directly? Thanks leejearl -------------- next part -------------- An HTML attachment was scrubbed... URL: From mbaker112 at outlook.de Fri May 12 01:50:41 2017 From: mbaker112 at outlook.de (Matt Baker) Date: Fri, 12 May 2017 06:50:41 +0000 Subject: [petsc-users] Some general questions Message-ID: Hello, I have a few questions on how to improve performance of my program. I'm solving Poisson's equation on a (large) 3D FD grid with Dirichlet boundary conditions and multiple right hand sides. I set up the matrix and everything's working fine so far, but I'm sure the solving process could go faster. I know multigrid is generally the best preconditioner in such a case and algebraic multigrid currently works best. So generally speaking: Should I make the effort of symmetrizising the system matrix? I know how to do it, but it would probably take some time. CG does currently work, but is not competitive against other methods, so I guess the matrix might not be "symmetric enough"? For the various multigrid preconditioners: I always read that the problem should be solved exactly on the coarsest grid, but wouldn't an iterative solver do the same job if its provided accuracy is high enough, since the coarse discretization and the subsequent interpolation process introduce errors themselves? I submit my program to a batch system, but PETSc was compiled on the login node with different hardware. Is this affecting performance? What parts of the configuration process should I perform on a compute node then? Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Fri May 12 02:25:43 2017 From: dave.mayhem23 at gmail.com (Dave May) Date: Fri, 12 May 2017 08:25:43 +0100 Subject: [petsc-users] Some general questions In-Reply-To: References: Message-ID: On 12 May 2017 at 07:50, Matt Baker wrote: > Hello, > > > I have a few questions on how to improve performance of my program. I'm > solving Poisson's equation on a (large) 3D FD grid with Dirichlet boundary > conditions and multiple right hand sides. I set up the matrix and > everything's working fine so far, but I'm sure the solving process could go > faster. I know multigrid is generally the best preconditioner in such a > case and algebraic multigrid currently works best. > If you use a DMDA for your FD problem, consider using PCMG with Galerkin. It will set up a geometric multigrid hierarchy. Depending on the specifics of your Poisson problem (constant coefficient versus highly hetegoneous), geometric MG is likely superior (faster time to solution) than AMG. > > So generally speaking: > > > Should I make the effort of symmetrizising the system matrix? I know how > to do it, but it would probably take some time. 
CG does currently work, but > is not competitive against other methods, so I guess the matrix might not > be "symmetric enough"? > In its basic form, CG is only guaranteed to converge with with an SPD operator. If you want to use CG, definitely do the work and make the operator symmetric. > > For the various multigrid preconditioners: I always read that the problem > should be solved exactly on the coarsest grid, but wouldn't an iterative > solver do the same job if its provided accuracy is high enough, since the > coarse discretization and the subsequent interpolation process introduce > errors themselves? > Yes iterative can work well. If your Poisson problem has a constant coefficient, rtol 1.0e-1 is likely a sufficient tolerance to use for an the coarse grid solve (e.g. overall convergence of solve won't be affected). If the Poisson problem has a highly variable coefficient (jumps of O(1e3) or more), or it has very large gradients say 1e3 variation over a few cells, then you will have to perform a more accurate iterative coarse level solve (say rtol 1e-4 to 1e-6). Note that the numbers for rtol I quote are purely empirical. > > I submit my program to a batch system, but PETSc was compiled on the login > node with different hardware. Is this affecting performance? What parts of > the configuration process should I perform on a compute node then? > If the login and compute nodes are fundamentally different, you should configure petsc with the option --with-batch and following the instructions. Thanks, Dave > > Thanks. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lawrence.mitchell at imperial.ac.uk Fri May 12 03:48:37 2017 From: lawrence.mitchell at imperial.ac.uk (Lawrence Mitchell) Date: Fri, 12 May 2017 09:48:37 +0100 Subject: [petsc-users] how to get the vertices belongs to a control volume in 2D? In-Reply-To: <298d38df.50bb.15bfacd8032.Coremail.leejearl@126.com> References: <298d38df.50bb.15bfacd8032.Coremail.leejearl@126.com> Message-ID: <14503BBE-816D-4287-B724-AFC857AE63F1@imperial.ac.uk> > On 12 May 2017, at 04:57, ?? wrote: > > Hi developers: > I have such a question that I want to get the vertices of a cell. I know I can get the points by > 1. Getting the faces of a cell such as "DMPlexGetCone(dm, c, &faces"; > 2. Getting the vertices of every face of the cell such as "DMPlexGetCone(dm, f, &vertices)". > > Then I can obtain the vertices belongs to a cell. Is there any concise routine which I can choose to get the > vertices of a cell directly? You should use the interface for the transitive closure. Find bounds of points that are vertices: DMPlexGetDepthStratum(dm, &vStart, &vEnd); ... DMPlexGetTransitiveClosure(dm, c, PETSC_TRUE, &nclosure, &closure); for (PetscInt i = 0; i < nclosure; i++) { const PetscInt p = closure[2*i]; if (p >= vStart && p < vEnd) { p is a vertex } } This works regardless of the topological dimension of the "cell" point you are using (the same code is good to find the vertices in the closure of a facet, say). Matt's course notes (http://www.caam.rice.edu/~caam519/CSBook.pdf) have nice pictures that help understand this language in section 7.1. 
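For completeness, a minimal sketch of that loop with the matching restore call; the closure array should start out NULL so Plex manages the scratch storage, and the depth argument 0 selects the vertex stratum (argument order as in the 3.7-era DMPlex API, so treat this as a sketch rather than verbatim usage):

  PetscInt  vStart, vEnd, nclosure, i;
  PetscInt *closure = NULL;   /* must be NULL-initialized; Plex hands back a scratch array */

  DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd);               /* depth 0 = vertices */
  DMPlexGetTransitiveClosure(dm, c, PETSC_TRUE, &nclosure, &closure);
  for (i = 0; i < nclosure; i++) {
    const PetscInt p = closure[2*i];                          /* entries come as (point, orientation) pairs */
    if (p >= vStart && p < vEnd) {
      /* p is a vertex in the closure of cell c */
    }
  }
  DMPlexRestoreTransitiveClosure(dm, c, PETSC_TRUE, &nclosure, &closure);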
Cheers, Lawrence From knepley at gmail.com Fri May 12 04:08:23 2017 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 12 May 2017 04:08:23 -0500 Subject: [petsc-users] Some general questions In-Reply-To: References: Message-ID: On Fri, May 12, 2017 at 2:25 AM, Dave May wrote: > > > On 12 May 2017 at 07:50, Matt Baker wrote: > >> Hello, >> >> >> I have a few questions on how to improve performance of my program. I'm >> solving Poisson's equation on a (large) 3D FD grid with Dirichlet boundary >> conditions and multiple right hand sides. I set up the matrix and >> everything's working fine so far, but I'm sure the solving process could go >> faster. I know multigrid is generally the best preconditioner in such a >> case and algebraic multigrid currently works best. >> > > If you use a DMDA for your FD problem, consider using PCMG with Galerkin. > It will set up a geometric multigrid hierarchy. Depending on the specifics > of your Poisson problem (constant coefficient versus highly hetegoneous), > geometric MG is likely superior (faster time to solution) than AMG. > > >> >> So generally speaking: >> >> >> Should I make the effort of symmetrizising the system matrix? I know how >> to do it, but it would probably take some time. CG does currently work, but >> is not competitive against other methods, so I guess the matrix might not >> be "symmetric enough"? >> > > In its basic form, CG is only guaranteed to converge with with an SPD > operator. > If you want to use CG, definitely do the work and make the operator > symmetric. > For Poisson, CG is never, ever ever, ever ever faster than Full Multigrid (FMG). Don't use it. All the people publishing that are idiots :) but of course try it out for yourself with -pc_mg_type full It should converge to discretization error in 1 iterate if the smoother is strong enough (you might need to use 2 iterates on the downsmooth). > >> For the various multigrid preconditioners: I always read that the problem >> should be solved exactly on the coarsest grid, but wouldn't an iterative >> solver do the same job if its provided accuracy is high enough, since the >> coarse discretization and the subsequent interpolation process introduce >> errors themselves? >> > > Yes iterative can work well. If your Poisson problem has a constant > coefficient, rtol 1.0e-1 is likely a sufficient tolerance to use for an the > coarse grid solve (e.g. overall convergence of solve won't be affected). If > the Poisson problem has a highly variable coefficient (jumps of O(1e3) or > more), or it has very large gradients say 1e3 variation over a few cells, > then you will have to perform a more accurate iterative coarse level solve > (say rtol 1e-4 to 1e-6). Note that the numbers for rtol I quote are purely > empirical. > Always compare to direct. I don't think anything beats direct on problems the size of your coarse problem. If iterative is winning, likely your coarse problem is too big. However, again you can try it yourself easily with options. Matt > >> I submit my program to a batch system, but PETSc was compiled on the >> login node with different hardware. Is this affecting performance? What >> parts of the configuration process should I perform on a compute node then? >> > If the login and compute nodes are fundamentally different, you should > configure petsc with the option > --with-batch > and following the instructions. > > Thanks, > Dave > >> >> Thanks. 
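To make the geometric multigrid suggestion concrete, here is a minimal sketch (not taken from this thread) of driving a 3D constant-coefficient Poisson solve through a DMDA so that PCMG can build the grid hierarchy itself. ComputeMatrix and ComputeRHS are assumed user callbacks in the style of the KSP tutorials, and the option names are indicative rather than verbatim:

  /* Sketch only: ComputeMatrix()/ComputeRHS() are assumed user routines that assemble
     the 7-point FD Laplacian and the right-hand side on the DMDA. */
  DM  da;
  KSP ksp;

  DMDACreate3d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE,
               DMDA_STENCIL_STAR, 17, 17, 17, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE,
               1, 1, NULL, NULL, NULL, &da);
  KSPCreate(PETSC_COMM_WORLD, &ksp);
  KSPSetDM(ksp, da);                                 /* lets PCMG coarsen/refine the DMDA for its levels */
  KSPSetComputeOperators(ksp, ComputeMatrix, NULL);
  KSPSetComputeRHS(ksp, ComputeRHS, NULL);
  KSPSetFromOptions(ksp);                            /* e.g. -da_refine 4 -pc_type mg -pc_mg_galerkin -pc_mg_type full */
  KSPSolve(ksp, NULL, NULL);
  KSPDestroy(&ksp);
  DMDestroy(&da);

With KSPSetDM() the number of levels follows from how far the DMDA is refined, so the coarsest problem stays small enough for the direct coarse solve recommended above.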
>> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri May 12 04:10:12 2017 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 12 May 2017 04:10:12 -0500 Subject: [petsc-users] how to get the vertices belongs to a control volume in 2D? In-Reply-To: <14503BBE-816D-4287-B724-AFC857AE63F1@imperial.ac.uk> References: <298d38df.50bb.15bfacd8032.Coremail.leejearl@126.com> <14503BBE-816D-4287-B724-AFC857AE63F1@imperial.ac.uk> Message-ID: On Fri, May 12, 2017 at 3:48 AM, Lawrence Mitchell < lawrence.mitchell at imperial.ac.uk> wrote: > > > On 12 May 2017, at 04:57, ?? wrote: > > > > Hi developers: > > I have such a question that I want to get the vertices of a cell. I > know I can get the points by > > 1. Getting the faces of a cell such as "DMPlexGetCone(dm, c, &faces"; > > 2. Getting the vertices of every face of the cell such as > "DMPlexGetCone(dm, f, &vertices)". > > > > Then I can obtain the vertices belongs to a cell. Is there any > concise routine which I can choose to get the > > vertices of a cell directly? > > You should use the interface for the transitive closure. > > Find bounds of points that are vertices: > > DMPlexGetDepthStratum(dm, &vStart, &vEnd); > > ... > DMPlexGetTransitiveClosure(dm, c, PETSC_TRUE, &nclosure, &closure); > for (PetscInt i = 0; i < nclosure; i++) { > const PetscInt p = closure[2*i]; > if (p >= vStart && p < vEnd) { > p is a vertex > } > } > > This works regardless of the topological dimension of the "cell" point you > are using (the same code is good to find the vertices in the closure of a > facet, say). > > Matt's course notes (http://www.caam.rice.edu/~caam519/CSBook.pdf) have > nice pictures that help understand this language in section 7.1. > Also note that this is fine for getting vertices if you want to do topological things. However, if what you really want is some function over the vertices (like coordinates), you should use http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DM/DMPlexVecGetClosure.html Thanks, Matt > Cheers, > > Lawrence -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From leejearl at 126.com Mon May 15 03:20:44 2017 From: leejearl at 126.com (leejearl) Date: Mon, 15 May 2017 16:20:44 +0800 Subject: [petsc-users] how to get the vertices belongs to a control volume in 2D? In-Reply-To: References: <298d38df.50bb.15bfacd8032.Coremail.leejearl@126.com> <14503BBE-816D-4287-B724-AFC857AE63F1@imperial.ac.uk> Message-ID: <0db70c6b-1c1e-6b28-c4c9-66ae05a08ca8@126.com> Hi, all: Thanks for your kind reply. Matt's course notes looks very nice. Thanks, leejearl On 2017?05?12? 17:10, Matthew Knepley wrote: > On Fri, May 12, 2017 at 3:48 AM, Lawrence Mitchell > > wrote: > > > > On 12 May 2017, at 04:57, ?? > wrote: > > > > Hi developers: > > I have such a question that I want to get the vertices of a > cell. I know I can get the points by > > 1. Getting the faces of a cell such as "DMPlexGetCone(dm, c, > &faces"; > > 2. Getting the vertices of every face of the cell such as > "DMPlexGetCone(dm, f, &vertices)". > > > > Then I can obtain the vertices belongs to a cell. 
Is there > any concise routine which I can choose to get the > > vertices of a cell directly? > > You should use the interface for the transitive closure. > > Find bounds of points that are vertices: > > DMPlexGetDepthStratum(dm, &vStart, &vEnd); > > ... > DMPlexGetTransitiveClosure(dm, c, PETSC_TRUE, &nclosure, &closure); > for (PetscInt i = 0; i < nclosure; i++) { > const PetscInt p = closure[2*i]; > if (p >= vStart && p < vEnd) { > p is a vertex > } > } > > This works regardless of the topological dimension of the "cell" > point you are using (the same code is good to find the vertices in > the closure of a facet, say). > > Matt's course notes (http://www.caam.rice.edu/~caam519/CSBook.pdf > ) have nice > pictures that help understand this language in section 7.1. > > > Also note that this is fine for getting vertices if you want to do > topological things. However, if what you really want is > some function over the vertices (like coordinates), you should use > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DM/DMPlexVecGetClosure.html > > Thanks, > > Matt > > Cheers, > > Lawrence > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From ibarletta at inogs.it Mon May 15 10:37:48 2017 From: ibarletta at inogs.it (Barletta, Ivano) Date: Mon, 15 May 2017 17:37:48 +0200 Subject: [petsc-users] Problem with IS and VecScatter Message-ID: Hello users/developers I'm trying to build a vecscatter object to migrate data from a vector x to a vector x2 having same global size but different parallel layout. Prior to this, I build an Index Set using the method ISCreateStride The IS is created correctly, since the program returns ierr=0 when I call the subroutine (I'm using Fortran 90). but when I run the program in parallel I get this error 0:[0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- 0:[0]PETSC ERROR: Unknown type. Check for miss-spelling or missing package: http://www.mcs.anl.gov/petsc/documentation/installation.html#external 0:[0]PETSC ERROR: Unknown IS type: general 0:[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 0:[0]PETSC ERROR: Petsc Release Version 3.7.5, Jan, 01, 2017 0:[0]PETSC ERROR: ./opa on a arch-linux2-c-debug named n419.cluster.net by ib04116 Mon May 15 17:19:25 2017 0:[0]PETSC ERROR: Configure options --with-cc=mpiicc --with-fc=mpiifort --with-cxx=mpiicpc --with-mpiexec=mpirun --with-blas-lapack-dir=/users/home/opt/intel/composer_xe_2013/mkl --with-scalapack-lib="-L/users/home/opt/intel/composer_xe_2013/mkl//lib/intel64 -lmkl_scalapack_ilp64 -lmkl_blacs_intelmpi_ilp64" --with-scalapack-include=/users/home/opt/intel/composer_xe_2013/mkl/include --download-metis --download-parmetis --download-mumps --download-superlu --with-debugging=yes CFLAGS=-I/users/home/opt/netcdf/netcdf-4.3/include -I/users/home/opt/szip/szip-2.1/include -I/users/home/opt/hdf5/hdf5-1.8.11/include -I/usr/include FFLAGS=-xHost -no-prec-div -O3 -I/users/home/opt/netcdf/netcdf-4.3/include LDFLAGS=-L/users/home/opt/netcdf/netcdf-4.3/lib -lnetcdff -L/users/home/opt/hdf5/hdf5-1.8.11/lib -L/users/home/opt/netcdf/netcdf-4.3/lib -L/usr/lib64/ -lz -lgpfs -lnetcdf -lcurl -lnetcdf The program cannot complete the scatter process and remains hanging. 
One odd thing is that, though I create a stride, the IS object is marked as general. An even more odd thing is that I've used the same code in a simple test case and everything worked fine... Have you got any hint about this? Thanks Ivano -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon May 15 12:26:04 2017 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 15 May 2017 12:26:04 -0500 Subject: [petsc-users] Problem with IS and VecScatter In-Reply-To: References: Message-ID: <92FF3A9E-FEAD-442B-BDED-423FF47F4DFC@mcs.anl.gov> First run with valgrind: http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind then run with -start_in_debugger and look at the variables where it crashes to see why it might crash. > On May 15, 2017, at 10:37 AM, Barletta, Ivano wrote: > > Hello users/developers > > I'm trying to build a vecscatter object to > migrate data from a vector x to a vector x2 > having same global size but different parallel layout. > > Prior to this, I build an Index Set using the method > > ISCreateStride > > The IS is created correctly, since the program returns > ierr=0 when I call the subroutine (I'm using Fortran 90). > > but when I run the program in parallel I get this error > > 0:[0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > 0:[0]PETSC ERROR: Unknown type. Check for miss-spelling or missing package: http://www.mcs.anl.gov/petsc/documentation/installation.html#external > 0:[0]PETSC ERROR: Unknown IS type: general > 0:[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > 0:[0]PETSC ERROR: Petsc Release Version 3.7.5, Jan, 01, 2017 > 0:[0]PETSC ERROR: ./opa on a arch-linux2-c-debug named n419.cluster.net by ib04116 Mon May 15 17:19:25 2017 > 0:[0]PETSC ERROR: Configure options --with-cc=mpiicc --with-fc=mpiifort --with-cxx=mpiicpc --with-mpiexec=mpirun --with-blas-lapack-dir=/users/home/opt/intel/composer_xe_2013/mkl --with-scalapack-lib="-L/users/home/opt/intel/composer_xe_2013/mkl//lib/intel64 -lmkl_scalapack_ilp64 -lmkl_blacs_intelmpi_ilp64" --with-scalapack-include=/users/home/opt/intel/composer_xe_2013/mkl/include --download-metis --download-parmetis --download-mumps --download-superlu --with-debugging=yes CFLAGS=-I/users/home/opt/netcdf/netcdf-4.3/include -I/users/home/opt/szip/szip-2.1/include -I/users/home/opt/hdf5/hdf5-1.8.11/include -I/usr/include FFLAGS=-xHost -no-prec-div -O3 -I/users/home/opt/netcdf/netcdf-4.3/include LDFLAGS=-L/users/home/opt/netcdf/netcdf-4.3/lib -lnetcdff -L/users/home/opt/hdf5/hdf5-1.8.11/lib -L/users/home/opt/netcdf/netcdf-4.3/lib -L/usr/lib64/ -lz -lgpfs -lnetcdf -lcurl -lnetcdf > > The program cannot complete the scatter process > and remains hanging. > > One odd thing is that, though I create a stride, the IS > object is marked as general. An even more odd thing is > that I've used the same code in a simple test case and > everything worked fine... > > Have you got any hint about this? > > Thanks > Ivano > > > From mfadams at lbl.gov Mon May 15 17:03:09 2017 From: mfadams at lbl.gov (Mark Adams) Date: Mon, 15 May 2017 18:03:09 -0400 Subject: [petsc-users] SNES error In-Reply-To: References: <677760BF-5666-4C9D-A064-B495ACD80889@mcs.anl.gov> Message-ID: I could use this fix for this global op counter that is getting trigger because I have an operator inside of another operator. 
Thanks, On Thu, May 4, 2017 at 9:09 AM, Mark Adams wrote: > OK, that makes sense, it fails when my velocity grid gets not tiny. > > I can use tine velocity grids for now. > > On Tue, May 2, 2017 at 11:18 AM, Matthew Knepley > wrote: > >> On Tue, May 2, 2017 at 10:10 AM, Mark Adams wrote: >> >>> /Users/markadams/Codes/petsc/arch-macosx-gnu-O/bin/mpiexec -n 1 ./vml >>> -v_coord_cylinder -x_dm_refine 2 -v_dm_refine 2 -snes_rtol 1.e-6 -snes_stol >>> 1.e-6 -ts_type cn -snes_fd -pc_type lu -ksp_type preonly >>> -x_petscspace_order 1 -x_petscspace_poly_tensor -v_petscspace_order 1 >>> -v_petscspace_poly_tensor -ts_dt .1 -ts_max_steps 10 -ts_final_time 1e10 >>> -verbose 3 -num_species 1 -snes_monitor -masses 1,2,4 -thermal_temps >>> 30,30,30 -domainv_lo -2,-2 -domainv_hi 2,2 -domainx_lo -12,-12 -domainx_hi >>> 12,12 -E 0,0 -blobx_radius 2 -x_dm_view hdf5:x.h5 -x_vec_view >>> hdf5:x.h5::append -v_dm_view hdf5:v.h5 -v_vec_view hdf5:v.h5::append >>> -x_pre_dm_view hdf5:prex.h5 -x_pre_vec_view hdf5:prex.h5::append >>> -snes_converged_reason -snes_linesearch_monitor -ts_adapt_monitor >>> main call SetupXDiscretization >>> main call SetInitialConditionDomain >>> VMLViewX DMGetOutputSequenceNumber=-1, >>> cmd_str=-x_pre_vec_view >>> 0) species 0: charge density= -2.3940791757186e+00, z-momentum= >>> 5.9851979392559e-01, energy= 3.2314073646197e-01, thermal-flux= >>> 2.4419137539877e-01 >>> 0) Normalized: charge density= -2.3940791757186e+00, z >>> momentum= 5.9851979392559e-01, energy= 3.2314073646197e-01, thermal flux= >>> 2.4419137539877e-01, local: 64 X cells, 81 X vertices >>> VMLViewX DMGetOutputSequenceNumber=0, cmd_str=(null) >>> VMLViewV DMGetOutputSequenceNumber=-1 >>> 0 SNES Function norm 4.097052680599e+00 >>> 1 SNES Function norm 1.213148652908e-09 >>> Nonlinear solve did not converge due to DIVERGED_FUNCTION_COUNT >>> iterations 1 >>> >> >> Neat! Mark, I think this has to do with you calling SNESEvaluateFunc() >> inside another one. We limit the number of function evaluations >> to 10,000 by default, mostly to corral line searches. I think you hit >> this, and thus need to up the count. >> >> Thanks, >> >> Matt >> >> >>> TSAdapt none step 0 stage rejected t=0 + 1.000e-01, >>> nonlinear solve failures 1 greater than current TS allowed >>> [0]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> [0]PETSC ERROR: >>> [0]PETSC ERROR: TSStep has failed due to DIVERGED_NONLINEAR_SOLVE, >>> increase -ts_max_snes_failures or make negative to attempt recovery >>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >>> for trouble shooting. 
>>> [0]PETSC ERROR: Petsc Development GIT revision: v3.7.6-3659-g699918129c >>> GIT Date: 2017-04-26 08:18:35 -0400 >>> [0]PETSC ERROR: ./vml on a arch-macosx-gnu-O named MarksMac-5.local by >>> markadams Tue May 2 11:04:02 2017 >>> [0]PETSC ERROR: Configure options --with-cc=clang --with-cc++=clang++ >>> COPTFLAGS="-O3 -g -mavx2" CXXOPTFLAGS="-O3 -g -mavx2" FOPTFLAGS="-O3 -g >>> -mavx2" --download-mpich=1 --download-parmetis=1 --download-metis=1 >>> --download-hypre=1 --download-ml=1 --download-triangle=1 >>> --download-ctetgen=1 --download-p4est=1 --with-x=0 --download-superlu_dist >>> --download-superlu --download-ctetgen --with-debugging=0 --download-hdf5=1 >>> PETSC_ARCH=arch-macosx-gnu-O --download-chaco --with-viewfromoptions=1 >>> >>> >>> On Mon, May 1, 2017 at 10:25 PM, Barry Smith wrote: >>> >>>> >>>> and >>>> >>>> -snes_linesearch_monitor >>>> -ts_adapt_monitor >>>> >>>> >>>> > On May 1, 2017, at 7:51 PM, Matthew Knepley >>>> wrote: >>>> > >>>> > Run with -snes_converged_reason. >>>> > >>>> > Matt >>>> > >>>> > On Mon, May 1, 2017 at 7:14 PM, Mark Adams wrote: >>>> > I get this SNES failure and I don't understand what the problem is. >>>> The rtol is 1.e-6 and the first iteration reduces the residual by 9 orders >>>> of magnitude. Yet, TS is not satisfied. What is going on here? >>>> > >>>> > mpiexec -n 1 ./vml -v_coord_cylinder -x_dm_refine 2 -v_dm_refine 2 >>>> -snes_rtol 1.e-6 -snes_stol 1.e-6 -ts_type cn -snes_fd -pc_type lu >>>> -ksp_type preonly -x_petscspace_order 1 -x_petscspace_poly_tensor >>>> -v_petscspace_order 1 -v_petscspace_poly_tensor -ts_dt .1 -ts_max_steps 10 >>>> -ts_final_time 1e10 -verbose 3 -num_species 1 -snes_monitor -masses 1,2,4 >>>> -thermal_temps 30,30,30 -domainv_lo -2,-2 -domainv_hi 2,2 -domainx_lo >>>> -12,-12 -domainx_hi 12,12 -E 0,0 -blobx_radius 2 -x_dm_view hdf5:x.h5 >>>> -x_vec_view hdf5:x.h5::append -v_dm_view hdf5:v.h5 -v_vec_view >>>> hdf5:v.h5::append -x_pre_dm_view hdf5:prex.h5 -x_pre_vec_view >>>> hdf5:prex.h5::append >>>> > .... >>>> > >>>> > 0 SNES Function norm 4.097052680599e+00 >>>> > 1 SNES Function norm 1.213148652908e-09 >>>> > [0]PETSC ERROR: --------------------- Error Message >>>> -------------------------------------------------------------- >>>> > [0]PETSC ERROR: >>>> > [0]PETSC ERROR: TSStep has failed due to DIVERGED_NONLINEAR_SOLVE, >>>> increase -ts_max_snes_failures or make negative to attempt recovery >>>> > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/d >>>> ocumentation/faq.html for trouble shooting. >>>> > [0]PETSC ERROR: Petsc Development GIT revision: >>>> v3.7.6-3659-g699918129c GIT Date: 2017-04-26 08:18:35 -0400 >>>> > [0]PETSC ERROR: ./vml on a arch-macosx-gnu-O named MarksMac-5.local >>>> by markadams Mon May 1 19:21:32 2017 >>>> > [0]PETSC ERROR: Configure options --with-cc=clang --with-cc++=clang++ >>>> COPTFLAGS="-O3 -g -mavx2" CXXOPTFLAGS="-O3 -g -mavx2" FOPTFLAGS="-O3 -g >>>> -mavx2" --download-mpich=1 --download-parmetis=1 --download-metis=1 >>>> --download-hypre=1 --download-ml=1 --download-triangle=1 >>>> --download-ctetgen=1 --download-p4est=1 --with-x=0 --download-superlu_dist >>>> --download-superlu --download-ctetgen --with-debugging=0 --download-hdf5=1 >>>> PETSC_ARCH=arch-macosx-gnu-O --download-chaco --with-viewfromoptions=1 >>>> > >>>> > >>>> > >>>> > >>>> > -- >>>> > What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. 
>>>> > -- Norbert Wiener >>>> >>>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 15 17:12:20 2017 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 15 May 2017 17:12:20 -0500 Subject: [petsc-users] SNES error In-Reply-To: References: <677760BF-5666-4C9D-A064-B495ACD80889@mcs.anl.gov> Message-ID: On Mon, May 15, 2017 at 5:03 PM, Mark Adams wrote: > I could use this fix for this global op counter that is getting trigger > because I have an operator inside of another operator. > Set the last arugment to PETSC_MAX_INT http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetTolerances.html#SNESSetTolerances Matt > Thanks, > > On Thu, May 4, 2017 at 9:09 AM, Mark Adams wrote: > >> OK, that makes sense, it fails when my velocity grid gets not tiny. >> >> I can use tine velocity grids for now. >> >> On Tue, May 2, 2017 at 11:18 AM, Matthew Knepley >> wrote: >> >>> On Tue, May 2, 2017 at 10:10 AM, Mark Adams wrote: >>> >>>> /Users/markadams/Codes/petsc/arch-macosx-gnu-O/bin/mpiexec -n 1 ./vml >>>> -v_coord_cylinder -x_dm_refine 2 -v_dm_refine 2 -snes_rtol 1.e-6 -snes_stol >>>> 1.e-6 -ts_type cn -snes_fd -pc_type lu -ksp_type preonly >>>> -x_petscspace_order 1 -x_petscspace_poly_tensor -v_petscspace_order 1 >>>> -v_petscspace_poly_tensor -ts_dt .1 -ts_max_steps 10 -ts_final_time 1e10 >>>> -verbose 3 -num_species 1 -snes_monitor -masses 1,2,4 -thermal_temps >>>> 30,30,30 -domainv_lo -2,-2 -domainv_hi 2,2 -domainx_lo -12,-12 -domainx_hi >>>> 12,12 -E 0,0 -blobx_radius 2 -x_dm_view hdf5:x.h5 -x_vec_view >>>> hdf5:x.h5::append -v_dm_view hdf5:v.h5 -v_vec_view hdf5:v.h5::append >>>> -x_pre_dm_view hdf5:prex.h5 -x_pre_vec_view hdf5:prex.h5::append >>>> -snes_converged_reason -snes_linesearch_monitor -ts_adapt_monitor >>>> main call SetupXDiscretization >>>> main call SetInitialConditionDomain >>>> VMLViewX DMGetOutputSequenceNumber=-1, >>>> cmd_str=-x_pre_vec_view >>>> 0) species 0: charge density= -2.3940791757186e+00, z-momentum= >>>> 5.9851979392559e-01, energy= 3.2314073646197e-01, thermal-flux= >>>> 2.4419137539877e-01 >>>> 0) Normalized: charge density= -2.3940791757186e+00, z >>>> momentum= 5.9851979392559e-01, energy= 3.2314073646197e-01, thermal flux= >>>> 2.4419137539877e-01, local: 64 X cells, 81 X vertices >>>> VMLViewX DMGetOutputSequenceNumber=0, cmd_str=(null) >>>> VMLViewV DMGetOutputSequenceNumber=-1 >>>> 0 SNES Function norm 4.097052680599e+00 >>>> 1 SNES Function norm 1.213148652908e-09 >>>> Nonlinear solve did not converge due to DIVERGED_FUNCTION_COUNT >>>> iterations 1 >>>> >>> >>> Neat! Mark, I think this has to do with you calling SNESEvaluateFunc() >>> inside another one. We limit the number of function evaluations >>> to 10,000 by default, mostly to corral line searches. I think you hit >>> this, and thus need to up the count. 
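A minimal sketch of the fix Matt describes above, raising the function-evaluation limit; this assumes the SNES in question is the one owned by the TS (as in the runs shown), and only the last argument of SNESSetTolerances, the maximum number of function evaluations, is changed:

  SNES snes;
  ierr = TSGetSNES(ts, &snes);CHKERRQ(ierr);
  /* keep the default tolerances, lift only the maximum number of function evaluations */
  ierr = SNESSetTolerances(snes, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_MAX_INT);CHKERRQ(ierr);

The options-database route would be -snes_max_funcs with a suitably large value.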
>>> >>> Thanks, >>> >>> Matt >>> >>> >>>> TSAdapt none step 0 stage rejected t=0 + 1.000e-01, >>>> nonlinear solve failures 1 greater than current TS allowed >>>> [0]PETSC ERROR: --------------------- Error Message >>>> -------------------------------------------------------------- >>>> [0]PETSC ERROR: >>>> [0]PETSC ERROR: TSStep has failed due to DIVERGED_NONLINEAR_SOLVE, >>>> increase -ts_max_snes_failures or make negative to attempt recovery >>>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >>>> for trouble shooting. >>>> [0]PETSC ERROR: Petsc Development GIT revision: v3.7.6-3659-g699918129c >>>> GIT Date: 2017-04-26 08:18:35 -0400 >>>> [0]PETSC ERROR: ./vml on a arch-macosx-gnu-O named MarksMac-5.local by >>>> markadams Tue May 2 11:04:02 2017 >>>> [0]PETSC ERROR: Configure options --with-cc=clang --with-cc++=clang++ >>>> COPTFLAGS="-O3 -g -mavx2" CXXOPTFLAGS="-O3 -g -mavx2" FOPTFLAGS="-O3 -g >>>> -mavx2" --download-mpich=1 --download-parmetis=1 --download-metis=1 >>>> --download-hypre=1 --download-ml=1 --download-triangle=1 >>>> --download-ctetgen=1 --download-p4est=1 --with-x=0 --download-superlu_dist >>>> --download-superlu --download-ctetgen --with-debugging=0 --download-hdf5=1 >>>> PETSC_ARCH=arch-macosx-gnu-O --download-chaco --with-viewfromoptions=1 >>>> >>>> >>>> On Mon, May 1, 2017 at 10:25 PM, Barry Smith >>>> wrote: >>>> >>>>> >>>>> and >>>>> >>>>> -snes_linesearch_monitor >>>>> -ts_adapt_monitor >>>>> >>>>> >>>>> > On May 1, 2017, at 7:51 PM, Matthew Knepley >>>>> wrote: >>>>> > >>>>> > Run with -snes_converged_reason. >>>>> > >>>>> > Matt >>>>> > >>>>> > On Mon, May 1, 2017 at 7:14 PM, Mark Adams wrote: >>>>> > I get this SNES failure and I don't understand what the problem is. >>>>> The rtol is 1.e-6 and the first iteration reduces the residual by 9 orders >>>>> of magnitude. Yet, TS is not satisfied. What is going on here? >>>>> > >>>>> > mpiexec -n 1 ./vml -v_coord_cylinder -x_dm_refine 2 -v_dm_refine 2 >>>>> -snes_rtol 1.e-6 -snes_stol 1.e-6 -ts_type cn -snes_fd -pc_type lu >>>>> -ksp_type preonly -x_petscspace_order 1 -x_petscspace_poly_tensor >>>>> -v_petscspace_order 1 -v_petscspace_poly_tensor -ts_dt .1 -ts_max_steps 10 >>>>> -ts_final_time 1e10 -verbose 3 -num_species 1 -snes_monitor -masses 1,2,4 >>>>> -thermal_temps 30,30,30 -domainv_lo -2,-2 -domainv_hi 2,2 -domainx_lo >>>>> -12,-12 -domainx_hi 12,12 -E 0,0 -blobx_radius 2 -x_dm_view hdf5:x.h5 >>>>> -x_vec_view hdf5:x.h5::append -v_dm_view hdf5:v.h5 -v_vec_view >>>>> hdf5:v.h5::append -x_pre_dm_view hdf5:prex.h5 -x_pre_vec_view >>>>> hdf5:prex.h5::append >>>>> > .... >>>>> > >>>>> > 0 SNES Function norm 4.097052680599e+00 >>>>> > 1 SNES Function norm 1.213148652908e-09 >>>>> > [0]PETSC ERROR: --------------------- Error Message >>>>> -------------------------------------------------------------- >>>>> > [0]PETSC ERROR: >>>>> > [0]PETSC ERROR: TSStep has failed due to DIVERGED_NONLINEAR_SOLVE, >>>>> increase -ts_max_snes_failures or make negative to attempt recovery >>>>> > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/d >>>>> ocumentation/faq.html for trouble shooting. 
>>>>> > [0]PETSC ERROR: Petsc Development GIT revision: >>>>> v3.7.6-3659-g699918129c GIT Date: 2017-04-26 08:18:35 -0400 >>>>> > [0]PETSC ERROR: ./vml on a arch-macosx-gnu-O named MarksMac-5.local >>>>> by markadams Mon May 1 19:21:32 2017 >>>>> > [0]PETSC ERROR: Configure options --with-cc=clang >>>>> --with-cc++=clang++ COPTFLAGS="-O3 -g -mavx2" CXXOPTFLAGS="-O3 -g -mavx2" >>>>> FOPTFLAGS="-O3 -g -mavx2" --download-mpich=1 --download-parmetis=1 >>>>> --download-metis=1 --download-hypre=1 --download-ml=1 --download-triangle=1 >>>>> --download-ctetgen=1 --download-p4est=1 --with-x=0 --download-superlu_dist >>>>> --download-superlu --download-ctetgen --with-debugging=0 --download-hdf5=1 >>>>> PETSC_ARCH=arch-macosx-gnu-O --download-chaco --with-viewfromoptions=1 >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > -- >>>>> > What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> > -- Norbert Wiener >>>>> >>>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed May 17 17:22:24 2017 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 17 May 2017 17:22:24 -0500 Subject: [petsc-users] [petsc-dev] For Fortran users of PETSc development version In-Reply-To: References: <4726FF04-1D07-455F-88B4-FF2488DCAE07@mcs.anl.gov> Message-ID: <57A4F0B7-AFDC-4FE8-920A-A9700F827360@mcs.anl.gov> > On May 11, 2017, at 11:24 AM, Gautam Bisht wrote: > > Hi Barry, > > I'm wondering if these changes will be a part of future 3.8.0 release or 4.0.0 release. And, do you have a tentative timeline when such a release tag would be made? This will be in 3.8.0 and we hope to make the release in early June. Barry > > -Gautam. > > On Sun, Dec 4, 2016 at 11:13 AM, Barry Smith wrote: > > Jed noticed a small mistake in my description. It is type(tXXX) not type(iXXX) if you chose to declare your variables that way. Note that declaring them via type(tXXX) or XXX is identical (XXX is just a macro for type(tXXX)). > > Barry > > > > On Dec 4, 2016, at 11:57 AM, Barry Smith wrote: > > > > > > For Fortran users of the PETSc development (git master branch) version > > > > > > I have updated and simplified the Fortran usage of PETSc in the past few weeks. I will put the branch barry/fortran-update into the master branch on Monday. The usage changes are > > > > A) for each Fortran function (and main) use the following > > > > subroutine mysubroutine(.....) > > #include > > use petscxxx > > implicit none > > > > For example if you are using SNES in your code you would have > > > > #include > > use petscsnes > > implicit none > > > > B) Instead of PETSC_NULL_OBJECT you must pass PETSC_NULL_XXX (for example PETSC_NULL_VEC) using the specific object type XXX that the function call is expecting. > > > > C) Objects can be declared either as XXX a or type(iXXX) a, for example Mat a or type(iMat) a. (Note that previously for those who used types it was type(Mat) but that can no longer be used. 
> > > > Notes: > > > > 1) There are no longer any .h90 files that may be included > > > > 2) Like C the include files are now nested so you no longer need to include for example > > > > #include > > #include > > #include > > #include > > #include > > > > you can just include > > > > #include > > > > 3) there is now type checking of most function calls. This will help eliminate bugs due to incorrect calling sequences. Note that Fortran distinguishes between a argument that is a scalar (zero dimensional array), a one dimensional array and a two dimensional array (etc). So you may get compile warnings because you are passing in an array when PETSc expects a scalar or vis-versa. If you get these simply fix your declaration of the variable to match what is expected. In some routines like MatSetValues() and friends you can pass either scalars, one dimensional arrays or two dimensional arrays, if you get errors here please send mail to petsc-maint at mcs.anl.gov and include enough of your code so we can see the dimensions of all your variables so we can fix the problems. > > > > 4) You can continue to use either fixed (.F extension) or free format (.F90 extension) for your source > > > > 5) All the examples in PETSc have been updated so consult them for clarifications. > > > > > > Please report any problems to petsc-maint at mcs.anl.gov > > > > Thanks > > > > Barry > > > > > > From Fabian.Jakub at physik.uni-muenchen.de Thu May 18 05:11:40 2017 From: Fabian.Jakub at physik.uni-muenchen.de (Fabian.Jakub) Date: Thu, 18 May 2017 12:11:40 +0200 Subject: [petsc-users] Problems with PetscObjectViewFromOptions in Fortran Message-ID: <8603540a-38de-ead0-8690-3a9e5d063e7f@physik.uni-muenchen.de> Dear Petsc Team, I have a problem with object viewing through PetscObjectViewFromOptions The C Version works fine, e.g. static char help[] = "Testing multiple PetscObjectViewFromOptions"; #include int main(int argc,char **argv) { DM dmA, dmB; PetscInitialize(&argc,&argv,(char*)0,help); PetscErrorCode ierr; ierr = DMPlexCreate(PETSC_COMM_WORLD, &dmA); CHKERRQ(ierr); ierr = PetscObjectSetName((PetscObject) dmA, "DMPlex_A"); CHKERRQ(ierr); ierr = DMPlexCreate(PETSC_COMM_WORLD, &dmB); CHKERRQ(ierr); ierr = PetscObjectSetName((PetscObject) dmB, "DMPlex_B"); CHKERRQ(ierr); PetscObjectViewFromOptions((PetscObject) dmA, NULL, "-dmA"); PetscObjectViewFromOptions((PetscObject) dmB, NULL, "-dmB"); ierr = DMDestroy(&dmA); CHKERRQ(ierr); ierr = DMDestroy(&dmB); CHKERRQ(ierr); PetscFinalize(); } and running it with -help, correctly produces the options and views as: -dmA -dmB but the equivalent in Fortran, e.g.: program main #include "petsc/finclude/petsc.h" use petsc implicit none PetscErrorCode :: ierr DM :: dmA, dmB call PetscInitialize(PETSC_NULL_CHARACTER, ierr); CHKERRQ(ierr) call DMPlexCreate(PETSC_COMM_WORLD, dmA, ierr);CHKERRQ(ierr) call PetscObjectSetName(dmA, 'DMPlex_A', ierr);CHKERRQ(ierr) call DMPlexCreate(PETSC_COMM_WORLD, dmB, ierr);CHKERRQ(ierr) call PetscObjectSetName(dmB, 'DMPlex_B', ierr);CHKERRQ(ierr) call PetscObjectViewFromOptions(dmA, PETSC_NULL_CHARACTER, "-dmA", ierr); CHKERRQ(ierr) call PetscObjectViewFromOptions(dmB, PETSC_NULL_CHARACTER, "-dmB", ierr); CHKERRQ(ierr) call DMDestroy(dmA, ierr);CHKERRQ(ierr) call DMDestroy(dmB, ierr);CHKERRQ(ierr) call PetscFinalize(ierr) end program produces the options to be: -dmA-dmB -dmB While this works as expected when running with: ./example -dmA-dmB -dmB This is not intuitive. Is the hickup on my side or is it somewhere in the Fortran stubs? 
Please, let me know if you need more details on the build or if you cannot reproduce this. Many thanks, Fabian Petsc Development GIT revision: v3.7.6-3910-gd04c6f6 GIT Date: 2017-05-15 17:09:20 -0500 ./configure \ --with-cc=$(which mpicc) \ --with-fc=$(which mpif90) \ --with-cxx=$(which mpicxx) \ --with-fortran \ --with-fortran-interfaces \ --with-shared-libraries=1 \ --download-hdf5 \ --download-triangle \ --download-ctetgen \ --with-cmake=$(which cmake) \ --with-debugging=1 \ COPTFLAGS='-O2 ' \ FOPTFLAGS='-O2 ' \ \ && make all test GNU Fortran (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609 (Open MPI) 1.10.2 Complete output of -help -info (Fortran Version): [0] petscinitialize_internal(): (Fortran):PETSc successfully started: procs 1 [0] PetscGetHostName(): Rejecting domainname, likely is NIS met-ws-740m19.(none) [0] petscinitialize_internal(): Running on machine: met-ws-740m19 ------Additional PETSc component options-------- -log_exclude: -info_exclude: ----------------------------------------------- [0] PetscCommDuplicate(): Duplicating a communicator 47693199447680 11260976 max tags = 2147483647 [0] PetscCommDuplicate(): Using internal PETSc communicator 47693199447680 11260976 [0] PetscCommDuplicate(): Using internal PETSc communicator 47693199447680 11260976 [0] PetscCommDuplicate(): Using internal PETSc communicator 47693199447680 11260976 [0] PetscCommDuplicate(): Using internal PETSc communicator 47693199447680 11260976 [0] PetscCommDuplicate(): Using internal PETSc communicator 47693199447680 11260976 [0] PetscCommDuplicate(): Using internal PETSc communicator 47693199447680 11260976 [0] PetscCommDuplicate(): Using internal PETSc communicator 47693199447680 11260976 -dmA-dmB ascii[:[filename][:[format][:append]]]: Prints object to stdout or ASCII file (PetscOptionsGetViewer) -dmA-dmB binary[:[filename][:[format][:append]]]: Saves object to a binary file (PetscOptionsGetViewer) -dmA-dmB draw[:drawtype[:filename]] Draws object (PetscOptionsGetViewer) -dmA-dmB socket[:port]: Pushes object to a Unix socket (PetscOptionsGetViewer) -dmA-dmB saws[:communicatorname]: Publishes object to SAWs (PetscOptionsGetViewer) DM Object: DMPlex_A 1 MPI processes type: plex [0] PetscCommDuplicate(): Duplicating a communicator 47693199449728 13799472 max tags = 2147483647 [0] PetscCommDuplicate(): Using internal PETSc communicator 47693199449728 13799472 DMPlex_A in 0 dimensions: 0-cells: 0 -dmB ascii[:[filename][:[format][:append]]]: Prints object to stdout or ASCII file (PetscOptionsGetViewer) -dmB binary[:[filename][:[format][:append]]]: Saves object to a binary file (PetscOptionsGetViewer) -dmB draw[:drawtype[:filename]] Draws object (PetscOptionsGetViewer) -dmB socket[:port]: Pushes object to a Unix socket (PetscOptionsGetViewer) -dmB saws[:communicatorname]: Publishes object to SAWs (PetscOptionsGetViewer) DM Object: DMPlex_B 1 MPI processes type: plex [0] PetscCommDuplicate(): Using internal PETSc communicator 47693199449728 13799472 [0] PetscCommDuplicate(): Using internal PETSc communicator 47693199449728 13799472 DMPlex_B in 0 dimensions: 0-cells: 0 [0] Petsc_DelComm_Inner(): Removing reference to PETSc communicator embedded in a user MPI_Comm 13799472 [0] Petsc_DelComm_Outer(): User MPI_Comm 47693199449728 is being freed after removing reference from inner PETSc comm to this outer comm [0] PetscCommDestroy(): Deleting PETSc MPI_Comm 13799472 [0] Petsc_DelCounter(): Deleting counter data in an MPI_Comm 13799472 [0] PetscFinalize(): PetscFinalize() called [0] PetscGetHostName(): Rejecting 
domainname, likely is NIS met-ws-740m19.(none) [0] PetscFOpen(): Opening file Log.0 [0] Petsc_DelViewer(): Removing viewer data attribute in an MPI_Comm 11260976 [0] Petsc_DelComm_Inner(): Removing reference to PETSc communicator embedded in a user MPI_Comm 11260976 [0] Petsc_DelComm_Outer(): User MPI_Comm 47693199447680 is being freed after removing reference from inner PETSc comm to this outer comm [0] PetscCommDestroy(): Deleting PETSc MPI_Comm 11260976 [0] Petsc_DelCounter(): Deleting counter data in an MPI_Comm 11260976 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 181 bytes Desc: OpenPGP digital signature URL: From mbaker112 at outlook.de Thu May 18 08:38:41 2017 From: mbaker112 at outlook.de (Matt Baker) Date: Thu, 18 May 2017 13:38:41 +0000 Subject: [petsc-users] Regarding the conjugate gradient method In-Reply-To: References: Message-ID: Hello, just a quick question: The CG method is generally derived for spd matrices. However, the PETSc man page states Notes: The PCG method requires both the matrix and preconditioner to be symmetric positive (or negative) (semi) definite Only left preconditioning is supported. Does this mean CG works for semi-definite problems as well? Is that guaranteed then? Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu May 18 08:46:29 2017 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 18 May 2017 08:46:29 -0500 Subject: [petsc-users] Regarding the conjugate gradient method In-Reply-To: References: Message-ID: On Thu, May 18, 2017 at 8:38 AM, Matt Baker wrote: > Hello, > > > just a quick question: > > The CG method is generally derived for spd matrices. However, the PETSc > man page states > > > Notes: The PCG method requires both the matrix and preconditioner to be > symmetric positive (or negative) (semi) definite Only left preconditioning > is supported. > > > Does this mean CG works for semi-definite problems as well? Is that > guaranteed then? > I believe that CG converges to A^+ b if b is in the range space of A, but I would have to look it up. Its probably in Hestenes, "Optimization theory: the finite dimensional case" 1975 Thanks, Matt > Thanks. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu May 18 16:01:18 2017 From: jed at jedbrown.org (Jed Brown) Date: Thu, 18 May 2017 16:01:18 -0500 Subject: [petsc-users] [Yousef Saad] Preconditioning-17 Travel awards Message-ID: <8737c1yj4x.fsf@jedbrown.org> Travel awards for early career researchers are available. -------------- next part -------------- An embedded message was scrubbed... From: saad at cs.umn.edu (Yousef Saad) Subject: Preconditioning-17 Travel awards Date: Thu, 18 May 2017 14:26:33 -0500 (CDT) Size: 3408 URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 832 bytes Desc: not available URL: From ling.zou at inl.gov Fri May 19 09:25:39 2017 From: ling.zou at inl.gov (Zou, Ling) Date: Fri, 19 May 2017 08:25:39 -0600 Subject: [petsc-users] Understanding log summary Message-ID: Hi All, In terms of code performance, sometimes people would ask for info about total non-linear iteration numbers, total linear iteration numbers, etc. I suppose all these could be found in the log summary. For the attached log summary, can I say? total non-linear iteration number = 573 total linear iteration number = 2321 Thank you. Ling ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecDot 607 1.0 1.8729e-04 1.0 1.95e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1040 VecMDot 2321 1.0 1.1075e-03 1.0 2.87e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 2590 VecNorm 5422 1.0 1.2229e-03 1.0 1.74e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 1423 VecScale 5822 1.0 1.2764e-03 1.0 9.37e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 734 VecCopy 14334 1.0 1.8302e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 1231 1.0 3.0700e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 14961 1.0 3.3679e-03 1.0 4.82e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 9 0 0 0 0 9 0 0 0 1430 VecWAXPY 20842 1.0 5.5537e-03 1.0 4.00e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 8 0 0 0 0 8 0 0 0 721 VecMAXPY 2894 1.0 1.4292e-03 1.0 3.62e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 7 0 0 0 0 7 0 0 0 2536 VecSetRandom 34 1.0 1.0322e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecReduceArith 1146 1.0 2.9907e-04 1.0 3.68e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1230 VecReduceComm 573 1.0 9.2384e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 2894 1.0 1.9604e-03 1.0 1.39e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 712 SNESJacobianEval 573 1.0 2.2410e+00 1.0 5.60e+06 1.0 0.0e+00 0.0e+00 0.0e+00 62 11 0 0 0 62 11 0 0 0 2 MatMult MF 2928 1.0 5.7922e-01 1.0 2.83e+06 1.0 0.0e+00 0.0e+00 0.0e+00 16 5 0 0 0 16 5 0 0 0 5 MatMult 2928 1.0 5.7963e-01 1.0 2.83e+06 1.0 0.0e+00 0.0e+00 0.0e+00 16 5 0 0 0 16 5 0 0 0 5 MatSolve 2894 1.0 7.3171e-03 1.0 1.76e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 33 0 0 0 0 33 0 0 0 2405 MatLUFactorNum 573 1.0 1.4733e-02 1.0 1.71e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 32 0 0 0 0 32 0 0 0 1158 MatILUFactorSym 1 1.0 2.4543e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyBegin 1147 1.0 7.9204e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 1147 1.0 2.5825e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 2 1.0 6.3280e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 1 1.0 1.5754e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatZeroEntries 573 1.0 5.7120e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatFDColorCreate 1 1.0 2.0541e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatFDColorSetUp 1 1.0 1.6103e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatFDColorApply 
573 1.0 2.2386e+00 1.0 5.60e+06 1.0 0.0e+00 0.0e+00 0.0e+00 62 11 0 0 0 62 11 0 0 0 2 MatFDColorFunc 11460 1.0 2.2264e+00 1.0 1.85e+06 1.0 0.0e+00 0.0e+00 0.0e+00 62 3 0 0 0 62 3 0 0 0 1 MatColoringApply 1 1.0 3.9990e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPGMRESOrthog 2321 1.0 2.9685e-03 1.0 5.75e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 11 0 0 0 0 11 0 0 0 1935 KSPSetUp 573 1.0 2.3291e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 573 1.0 5.0164e-01 1.0 4.50e+07 1.0 0.0e+00 0.0e+00 0.0e+00 14 85 0 0 0 14 85 0 0 0 90 PCSetUp 573 1.0 1.5172e-02 1.0 1.71e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 32 0 0 0 0 32 0 0 0 1124 PCApply 2894 1.0 7.8614e-03 1.0 1.76e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 33 0 0 0 0 33 0 0 0 2239 ------------------------------------------------------------------------------------------------------------------------ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri May 19 13:02:05 2017 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 19 May 2017 13:02:05 -0500 Subject: [petsc-users] Understanding log summary In-Reply-To: References: Message-ID: <8D633024-16A3-41C6-8C08-5F8AA82BBD70@mcs.anl.gov> > On May 19, 2017, at 9:25 AM, Zou, Ling wrote: > > Hi All, > > In terms of code performance, sometimes people would ask for info about total non-linear iteration numbers, total linear iteration numbers, etc. I suppose all these could be found in the log summary. For the attached log summary, can I say? > total non-linear iteration number = 573 > total linear iteration number = 2321 Yes, The log file is kind of funny. It spends 62% of the time in MatFDColorApply() which is computing the Jacobian via differencing and coloring, this is a lot of time. You might consider lagging the Jacobian; that is not recompute the Jacobian for each new linear solve. You can use -snes_lag_jacobian 2 or -snes_lag_jacobian 3 etc and see how this affects the run time. Barry > MatMult MF > > Thank you. 
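For reference, the lagging Barry suggests above can also be requested in code rather than on the command line; a minimal sketch, assuming a SNES handle is available:

  /* rebuild the Jacobian only every 2nd nonlinear iteration instead of every iteration */
  ierr = SNESSetLagJacobian(snes, 2);CHKERRQ(ierr);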
> > Ling > > > > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flops --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > VecDot 607 1.0 1.8729e-04 1.0 1.95e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1040 > VecMDot 2321 1.0 1.1075e-03 1.0 2.87e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 2590 > VecNorm 5422 1.0 1.2229e-03 1.0 1.74e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 1423 > VecScale 5822 1.0 1.2764e-03 1.0 9.37e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 734 > VecCopy 14334 1.0 1.8302e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 1231 1.0 3.0700e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 14961 1.0 3.3679e-03 1.0 4.82e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 9 0 0 0 0 9 0 0 0 1430 > VecWAXPY 20842 1.0 5.5537e-03 1.0 4.00e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 8 0 0 0 0 8 0 0 0 721 > VecMAXPY 2894 1.0 1.4292e-03 1.0 3.62e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 7 0 0 0 0 7 0 0 0 2536 > VecSetRandom 34 1.0 1.0322e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecReduceArith 1146 1.0 2.9907e-04 1.0 3.68e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1230 > VecReduceComm 573 1.0 9.2384e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecNormalize 2894 1.0 1.9604e-03 1.0 1.39e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 712 > SNESJacobianEval 573 1.0 2.2410e+00 1.0 5.60e+06 1.0 0.0e+00 0.0e+00 0.0e+00 62 11 0 0 0 62 11 0 0 0 2 > MatMult MF 2928 1.0 5.7922e-01 1.0 2.83e+06 1.0 0.0e+00 0.0e+00 0.0e+00 16 5 0 0 0 16 5 0 0 0 5 > MatMult 2928 1.0 5.7963e-01 1.0 2.83e+06 1.0 0.0e+00 0.0e+00 0.0e+00 16 5 0 0 0 16 5 0 0 0 5 > MatSolve 2894 1.0 7.3171e-03 1.0 1.76e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 33 0 0 0 0 33 0 0 0 2405 > MatLUFactorNum 573 1.0 1.4733e-02 1.0 1.71e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 32 0 0 0 0 32 0 0 0 1158 > MatILUFactorSym 1 1.0 2.4543e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatAssemblyBegin 1147 1.0 7.9204e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatAssemblyEnd 1147 1.0 2.5825e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetRowIJ 2 1.0 6.3280e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetOrdering 1 1.0 1.5754e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatZeroEntries 573 1.0 5.7120e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatFDColorCreate 1 1.0 2.0541e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatFDColorSetUp 1 1.0 1.6103e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatFDColorApply 573 1.0 2.2386e+00 1.0 5.60e+06 1.0 0.0e+00 0.0e+00 0.0e+00 62 11 0 0 0 62 11 0 0 0 2 > MatFDColorFunc 11460 1.0 2.2264e+00 1.0 1.85e+06 1.0 0.0e+00 0.0e+00 0.0e+00 62 3 0 0 0 62 3 0 0 0 1 > MatColoringApply 1 1.0 3.9990e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPGMRESOrthog 2321 1.0 2.9685e-03 1.0 5.75e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 11 0 0 0 0 11 0 0 0 1935 > KSPSetUp 573 1.0 2.3291e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 573 1.0 5.0164e-01 1.0 4.50e+07 1.0 
0.0e+00 0.0e+00 0.0e+00 14 85 0 0 0 14 85 0 0 0 90 > PCSetUp 573 1.0 1.5172e-02 1.0 1.71e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 32 0 0 0 0 32 0 0 0 1124 > PCApply 2894 1.0 7.8614e-03 1.0 1.76e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 33 0 0 0 0 33 0 0 0 2239 > ------------------------------------------------------------------------------------------------------------------------ > From ling.zou at inl.gov Fri May 19 13:12:25 2017 From: ling.zou at inl.gov (Zou, Ling) Date: Fri, 19 May 2017 12:12:25 -0600 Subject: [petsc-users] Understanding log summary In-Reply-To: <8D633024-16A3-41C6-8C08-5F8AA82BBD70@mcs.anl.gov> References: <8D633024-16A3-41C6-8C08-5F8AA82BBD70@mcs.anl.gov> Message-ID: Barry, thanks for your comments and advise. Lagging Jacobian evaluation certainly helped (just tested it). However, eventually the finite differencing for Jacobian evaluation should be replaced with some sort of approximated Jacobian evaluation subroutine. So at this moment I don't worry too much on its cost. Thanks again, Ling On Fri, May 19, 2017 at 12:02 PM, Barry Smith wrote: > > > On May 19, 2017, at 9:25 AM, Zou, Ling wrote: > > > > Hi All, > > > > In terms of code performance, sometimes people would ask for info about > total non-linear iteration numbers, total linear iteration numbers, etc. I > suppose all these could be found in the log summary. For the attached log > summary, can I say? > > total non-linear iteration number = 573 > > total linear iteration number = 2321 > > Yes, > > The log file is kind of funny. It spends 62% of the time in > MatFDColorApply() which is computing the Jacobian via differencing and > coloring, this is a lot of time. You might consider lagging the Jacobian; > that is not recompute the Jacobian for each new linear solve. You can use > -snes_lag_jacobian 2 or -snes_lag_jacobian 3 etc and see how this affects > the run time. > > Barry > > > > > > MatMult MF > > > > Thank you. 
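On replacing the finite-difference/coloring Jacobian mentioned above: if the solves go through SNES, an approximate hand-coded Jacobian is registered roughly as sketched below. MyApproxJacobian, snes, J and P are illustrative placeholders, not anything from this thread:

  PetscErrorCode MyApproxJacobian(SNES snes, Vec x, Mat J, Mat P, void *ctx)
  {
    PetscErrorCode ierr;
    /* fill P (the preconditioning matrix) with the approximate Jacobian entries based on x */
    ierr = MatAssemblyBegin(P, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(P, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    if (J != P) {
      ierr = MatAssemblyBegin(J, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
      ierr = MatAssemblyEnd(J, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    }
    return 0;
  }

  /* registered in place of the coloring-based default */
  ierr = SNESSetJacobian(snes, J, P, MyApproxJacobian, NULL);CHKERRQ(ierr);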
> > > > Ling > > > > > > > > ------------------------------------------------------------ > ------------------------------------------------------------ > > Event Count Time (sec) Flops > --- Global --- --- Stage --- Total > > Max Ratio Max Ratio Max Ratio Mess Avg len > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > ------------------------------------------------------------ > ------------------------------------------------------------ > > > > --- Event Stage 0: Main Stage > > > > VecDot 607 1.0 1.8729e-04 1.0 1.95e+05 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 1040 > > VecMDot 2321 1.0 1.1075e-03 1.0 2.87e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 5 0 0 0 0 5 0 0 0 2590 > > VecNorm 5422 1.0 1.2229e-03 1.0 1.74e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 3 0 0 0 0 3 0 0 0 1423 > > VecScale 5822 1.0 1.2764e-03 1.0 9.37e+05 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 2 0 0 0 0 2 0 0 0 734 > > VecCopy 14334 1.0 1.8302e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecSet 1231 1.0 3.0700e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecAXPY 14961 1.0 3.3679e-03 1.0 4.82e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 9 0 0 0 0 9 0 0 0 1430 > > VecWAXPY 20842 1.0 5.5537e-03 1.0 4.00e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 8 0 0 0 0 8 0 0 0 721 > > VecMAXPY 2894 1.0 1.4292e-03 1.0 3.62e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 7 0 0 0 0 7 0 0 0 2536 > > VecSetRandom 34 1.0 1.0322e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecReduceArith 1146 1.0 2.9907e-04 1.0 3.68e+05 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 1230 > > VecReduceComm 573 1.0 9.2384e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecNormalize 2894 1.0 1.9604e-03 1.0 1.39e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 3 0 0 0 0 3 0 0 0 712 > > SNESJacobianEval 573 1.0 2.2410e+00 1.0 5.60e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 62 11 0 0 0 62 11 0 0 0 2 > > MatMult MF 2928 1.0 5.7922e-01 1.0 2.83e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 16 5 0 0 0 16 5 0 0 0 5 > > MatMult 2928 1.0 5.7963e-01 1.0 2.83e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 16 5 0 0 0 16 5 0 0 0 5 > > MatSolve 2894 1.0 7.3171e-03 1.0 1.76e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 33 0 0 0 0 33 0 0 0 2405 > > MatLUFactorNum 573 1.0 1.4733e-02 1.0 1.71e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 32 0 0 0 0 32 0 0 0 1158 > > MatILUFactorSym 1 1.0 2.4543e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatAssemblyBegin 1147 1.0 7.9204e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatAssemblyEnd 1147 1.0 2.5825e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetRowIJ 2 1.0 6.3280e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetOrdering 1 1.0 1.5754e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatZeroEntries 573 1.0 5.7120e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatFDColorCreate 1 1.0 2.0541e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatFDColorSetUp 1 1.0 1.6103e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatFDColorApply 573 1.0 2.2386e+00 1.0 5.60e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 62 11 0 0 0 62 11 0 0 0 2 > > MatFDColorFunc 11460 1.0 2.2264e+00 1.0 1.85e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 62 3 0 0 0 62 3 0 0 0 1 > > MatColoringApply 1 1.0 3.9990e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > KSPGMRESOrthog 2321 1.0 2.9685e-03 1.0 5.75e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 11 0 0 0 0 11 0 0 
0 1935 > > KSPSetUp 573 1.0 2.3291e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > KSPSolve 573 1.0 5.0164e-01 1.0 4.50e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 14 85 0 0 0 14 85 0 0 0 90 > > PCSetUp 573 1.0 1.5172e-02 1.0 1.71e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 32 0 0 0 0 32 0 0 0 1124 > > PCApply 2894 1.0 7.8614e-03 1.0 1.76e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 33 0 0 0 0 33 0 0 0 2239 > > ------------------------------------------------------------ > ------------------------------------------------------------ > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongzhang at anl.gov Fri May 19 13:53:47 2017 From: hongzhang at anl.gov (Zhang, Hong) Date: Fri, 19 May 2017 18:53:47 +0000 Subject: [petsc-users] Understanding log summary In-Reply-To: References: Message-ID: <633AA7C7-6B11-4DAE-B855-90873FD8B330@anl.gov> On May 19, 2017, at 9:25 AM, Zou, Ling > wrote: Hi All, In terms of code performance, sometimes people would ask for info about total non-linear iteration numbers, total linear iteration numbers, etc. I suppose all these could be found in the log summary. For the attached log summary, can I say? total non-linear iteration number = 573 total linear iteration number = 2321 Usually SNESSolve corresponds to the number of nonlinear iterations in the summary. It seems that you are solving some linear systems with GMRES. And there are 573 linear solves with 2321 GMRES iterations in total. Hong (Mr.) Thank you. Ling ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecDot 607 1.0 1.8729e-04 1.0 1.95e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1040 VecMDot 2321 1.0 1.1075e-03 1.0 2.87e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 2590 VecNorm 5422 1.0 1.2229e-03 1.0 1.74e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 1423 VecScale 5822 1.0 1.2764e-03 1.0 9.37e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 734 VecCopy 14334 1.0 1.8302e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 1231 1.0 3.0700e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 14961 1.0 3.3679e-03 1.0 4.82e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 9 0 0 0 0 9 0 0 0 1430 VecWAXPY 20842 1.0 5.5537e-03 1.0 4.00e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 8 0 0 0 0 8 0 0 0 721 VecMAXPY 2894 1.0 1.4292e-03 1.0 3.62e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 7 0 0 0 0 7 0 0 0 2536 VecSetRandom 34 1.0 1.0322e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecReduceArith 1146 1.0 2.9907e-04 1.0 3.68e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1230 VecReduceComm 573 1.0 9.2384e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 2894 1.0 1.9604e-03 1.0 1.39e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 712 SNESJacobianEval 573 1.0 2.2410e+00 1.0 5.60e+06 1.0 0.0e+00 0.0e+00 0.0e+00 62 11 0 0 0 62 11 0 0 0 2 MatMult MF 2928 1.0 5.7922e-01 1.0 2.83e+06 1.0 0.0e+00 0.0e+00 0.0e+00 16 5 0 0 0 16 5 0 0 0 5 MatMult 2928 1.0 5.7963e-01 1.0 2.83e+06 1.0 0.0e+00 0.0e+00 0.0e+00 16 5 0 0 0 16 5 0 0 0 5 MatSolve 2894 1.0 7.3171e-03 1.0 1.76e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 33 0 0 0 0 33 0 0 0 2405 
MatLUFactorNum 573 1.0 1.4733e-02 1.0 1.71e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 32 0 0 0 0 32 0 0 0 1158 MatILUFactorSym 1 1.0 2.4543e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyBegin 1147 1.0 7.9204e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 1147 1.0 2.5825e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 2 1.0 6.3280e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 1 1.0 1.5754e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatZeroEntries 573 1.0 5.7120e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatFDColorCreate 1 1.0 2.0541e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatFDColorSetUp 1 1.0 1.6103e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatFDColorApply 573 1.0 2.2386e+00 1.0 5.60e+06 1.0 0.0e+00 0.0e+00 0.0e+00 62 11 0 0 0 62 11 0 0 0 2 MatFDColorFunc 11460 1.0 2.2264e+00 1.0 1.85e+06 1.0 0.0e+00 0.0e+00 0.0e+00 62 3 0 0 0 62 3 0 0 0 1 MatColoringApply 1 1.0 3.9990e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPGMRESOrthog 2321 1.0 2.9685e-03 1.0 5.75e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 11 0 0 0 0 11 0 0 0 1935 KSPSetUp 573 1.0 2.3291e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 573 1.0 5.0164e-01 1.0 4.50e+07 1.0 0.0e+00 0.0e+00 0.0e+00 14 85 0 0 0 14 85 0 0 0 90 PCSetUp 573 1.0 1.5172e-02 1.0 1.71e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 32 0 0 0 0 32 0 0 0 1124 PCApply 2894 1.0 7.8614e-03 1.0 1.76e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 33 0 0 0 0 33 0 0 0 2239 ------------------------------------------------------------------------------------------------------------------------ -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongzhang at anl.gov Fri May 19 14:03:08 2017 From: hongzhang at anl.gov (Zhang, Hong) Date: Fri, 19 May 2017 19:03:08 +0000 Subject: [petsc-users] Understanding log summary In-Reply-To: <633AA7C7-6B11-4DAE-B855-90873FD8B330@anl.gov> References: <633AA7C7-6B11-4DAE-B855-90873FD8B330@anl.gov> Message-ID: On May 19, 2017, at 1:53 PM, Zhang, Hong > wrote: On May 19, 2017, at 9:25 AM, Zou, Ling > wrote: Hi All, In terms of code performance, sometimes people would ask for info about total non-linear iteration numbers, total linear iteration numbers, etc. I suppose all these could be found in the log summary. For the attached log summary, can I say? total non-linear iteration number = 573 total linear iteration number = 2321 Usually SNESSolve corresponds to the number of nonlinear iterations in the summary. It seems that you are solving some linear systems with GMRES. And there are 573 linear solves with 2321 GMRES iterations in total. Correction: SNESSolve gives the number of nonlinear solves. My mistake. I just realize you are probably using your own nonlinear solver (not PETSc SNES). Then you can say 573 non-linear iterations and 2321 linear iterations. Hong (Mr.) Hong (Mr.) Thank you. 
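As an aside, the counts discussed above can also be queried directly in code after a solve instead of being read off the log summary; a sketch assuming a SNES-based solve (with a hand-rolled Newton loop one would instead accumulate KSPGetIterationNumber over the individual linear solves):

  PetscInt newton_its, linear_its;
  ierr = SNESGetIterationNumber(snes, &newton_its);CHKERRQ(ierr);       /* nonlinear iterations of the last SNESSolve */
  ierr = SNESGetLinearSolveIterations(snes, &linear_its);CHKERRQ(ierr); /* linear iterations accumulated during it */
  ierr = PetscPrintf(PETSC_COMM_WORLD, "nonlinear its %D, linear its %D\n", newton_its, linear_its);CHKERRQ(ierr);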
Ling ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecDot 607 1.0 1.8729e-04 1.0 1.95e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1040 VecMDot 2321 1.0 1.1075e-03 1.0 2.87e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 2590 VecNorm 5422 1.0 1.2229e-03 1.0 1.74e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 1423 VecScale 5822 1.0 1.2764e-03 1.0 9.37e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 734 VecCopy 14334 1.0 1.8302e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 1231 1.0 3.0700e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 14961 1.0 3.3679e-03 1.0 4.82e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 9 0 0 0 0 9 0 0 0 1430 VecWAXPY 20842 1.0 5.5537e-03 1.0 4.00e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 8 0 0 0 0 8 0 0 0 721 VecMAXPY 2894 1.0 1.4292e-03 1.0 3.62e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 7 0 0 0 0 7 0 0 0 2536 VecSetRandom 34 1.0 1.0322e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecReduceArith 1146 1.0 2.9907e-04 1.0 3.68e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1230 VecReduceComm 573 1.0 9.2384e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 2894 1.0 1.9604e-03 1.0 1.39e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 712 SNESJacobianEval 573 1.0 2.2410e+00 1.0 5.60e+06 1.0 0.0e+00 0.0e+00 0.0e+00 62 11 0 0 0 62 11 0 0 0 2 MatMult MF 2928 1.0 5.7922e-01 1.0 2.83e+06 1.0 0.0e+00 0.0e+00 0.0e+00 16 5 0 0 0 16 5 0 0 0 5 MatMult 2928 1.0 5.7963e-01 1.0 2.83e+06 1.0 0.0e+00 0.0e+00 0.0e+00 16 5 0 0 0 16 5 0 0 0 5 MatSolve 2894 1.0 7.3171e-03 1.0 1.76e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 33 0 0 0 0 33 0 0 0 2405 MatLUFactorNum 573 1.0 1.4733e-02 1.0 1.71e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 32 0 0 0 0 32 0 0 0 1158 MatILUFactorSym 1 1.0 2.4543e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyBegin 1147 1.0 7.9204e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 1147 1.0 2.5825e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 2 1.0 6.3280e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 1 1.0 1.5754e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatZeroEntries 573 1.0 5.7120e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatFDColorCreate 1 1.0 2.0541e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatFDColorSetUp 1 1.0 1.6103e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatFDColorApply 573 1.0 2.2386e+00 1.0 5.60e+06 1.0 0.0e+00 0.0e+00 0.0e+00 62 11 0 0 0 62 11 0 0 0 2 MatFDColorFunc 11460 1.0 2.2264e+00 1.0 1.85e+06 1.0 0.0e+00 0.0e+00 0.0e+00 62 3 0 0 0 62 3 0 0 0 1 MatColoringApply 1 1.0 3.9990e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPGMRESOrthog 2321 1.0 2.9685e-03 1.0 5.75e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 11 0 0 0 0 11 0 0 0 1935 KSPSetUp 573 1.0 2.3291e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 573 1.0 5.0164e-01 1.0 4.50e+07 1.0 0.0e+00 0.0e+00 0.0e+00 14 85 0 0 0 14 85 0 0 0 90 PCSetUp 573 1.0 1.5172e-02 1.0 1.71e+07 1.0 
0.0e+00 0.0e+00 0.0e+00 0 32 0 0 0 0 32 0 0 0 1124 PCApply 2894 1.0 7.8614e-03 1.0 1.76e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 33 0 0 0 0 33 0 0 0 2239 ------------------------------------------------------------------------------------------------------------------------ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ling.zou at inl.gov Fri May 19 14:28:28 2017 From: ling.zou at inl.gov (Zou, Ling) Date: Fri, 19 May 2017 13:28:28 -0600 Subject: [petsc-users] Understanding log summary In-Reply-To: References: <633AA7C7-6B11-4DAE-B855-90873FD8B330@anl.gov> Message-ID: No problem :) Thanks as well. Ling On Fri, May 19, 2017 at 1:03 PM, Zhang, Hong wrote: > > On May 19, 2017, at 1:53 PM, Zhang, Hong wrote: > > > On May 19, 2017, at 9:25 AM, Zou, Ling wrote: > > Hi All, > > In terms of code performance, sometimes people would ask for info about > total non-linear iteration numbers, total linear iteration numbers, etc. I > suppose all these could be found in the log summary. For the attached log > summary, can I say? > total non-linear iteration number = 573 > total linear iteration number = 2321 > > > Usually SNESSolve corresponds to the number of nonlinear iterations in the > summary. > > It seems that you are solving some linear systems with GMRES. And there > are 573 linear solves with 2321 GMRES iterations in total. > > > Correction: SNESSolve gives the number of nonlinear solves. > > My mistake. I just realize you are probably using your own nonlinear > solver (not PETSc SNES). Then you can say 573 non-linear iterations and > 2321 linear iterations. > > Hong (Mr.) > > Hong (Mr.) > > Thank you. > > Ling > > > > ------------------------------------------------------------ > ------------------------------------------------------------ > Event Count Time (sec) Flops > --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------ > ------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > VecDot 607 1.0 1.8729e-04 1.0 1.95e+05 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 1040 > VecMDot 2321 1.0 1.1075e-03 1.0 2.87e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 5 0 0 0 0 5 0 0 0 2590 > VecNorm 5422 1.0 1.2229e-03 1.0 1.74e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 3 0 0 0 0 3 0 0 0 1423 > VecScale 5822 1.0 1.2764e-03 1.0 9.37e+05 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 2 0 0 0 0 2 0 0 0 734 > VecCopy 14334 1.0 1.8302e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 1231 1.0 3.0700e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 14961 1.0 3.3679e-03 1.0 4.82e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 9 0 0 0 0 9 0 0 0 1430 > VecWAXPY 20842 1.0 5.5537e-03 1.0 4.00e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 8 0 0 0 0 8 0 0 0 721 > VecMAXPY 2894 1.0 1.4292e-03 1.0 3.62e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 7 0 0 0 0 7 0 0 0 2536 > VecSetRandom 34 1.0 1.0322e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecReduceArith 1146 1.0 2.9907e-04 1.0 3.68e+05 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 1230 > VecReduceComm 573 1.0 9.2384e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecNormalize 2894 1.0 1.9604e-03 1.0 1.39e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 3 0 0 0 0 3 0 0 0 712 > SNESJacobianEval 573 1.0 2.2410e+00 1.0 5.60e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 62 11 0 0 0 62 11 0 0 0 2 > MatMult MF 2928 1.0 5.7922e-01 1.0 2.83e+06 
1.0 0.0e+00 0.0e+00 > 0.0e+00 16 5 0 0 0 16 5 0 0 0 5 > MatMult 2928 1.0 5.7963e-01 1.0 2.83e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 16 5 0 0 0 16 5 0 0 0 5 > MatSolve 2894 1.0 7.3171e-03 1.0 1.76e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 33 0 0 0 0 33 0 0 0 2405 > MatLUFactorNum 573 1.0 1.4733e-02 1.0 1.71e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 32 0 0 0 0 32 0 0 0 1158 > MatILUFactorSym 1 1.0 2.4543e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatAssemblyBegin 1147 1.0 7.9204e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatAssemblyEnd 1147 1.0 2.5825e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetRowIJ 2 1.0 6.3280e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetOrdering 1 1.0 1.5754e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatZeroEntries 573 1.0 5.7120e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatFDColorCreate 1 1.0 2.0541e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatFDColorSetUp 1 1.0 1.6103e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatFDColorApply 573 1.0 2.2386e+00 1.0 5.60e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 62 11 0 0 0 62 11 0 0 0 2 > MatFDColorFunc 11460 1.0 2.2264e+00 1.0 1.85e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 62 3 0 0 0 62 3 0 0 0 1 > MatColoringApply 1 1.0 3.9990e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPGMRESOrthog 2321 1.0 2.9685e-03 1.0 5.75e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 11 0 0 0 0 11 0 0 0 1935 > KSPSetUp 573 1.0 2.3291e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 573 1.0 5.0164e-01 1.0 4.50e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 14 85 0 0 0 14 85 0 0 0 90 > PCSetUp 573 1.0 1.5172e-02 1.0 1.71e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 32 0 0 0 0 32 0 0 0 1124 > PCApply 2894 1.0 7.8614e-03 1.0 1.76e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 33 0 0 0 0 33 0 0 0 2239 > ------------------------------------------------------------ > ------------------------------------------------------------ > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat May 20 15:09:55 2017 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 20 May 2017 15:09:55 -0500 Subject: [petsc-users] Problems with PetscObjectViewFromOptions in Fortran In-Reply-To: <8603540a-38de-ead0-8690-3a9e5d063e7f@physik.uni-muenchen.de> References: <8603540a-38de-ead0-8690-3a9e5d063e7f@physik.uni-muenchen.de> Message-ID: <13EF3C0A-9E70-4CC3-9F84-1B6BA62104C2@mcs.anl.gov> > On May 18, 2017, at 5:11 AM, Fabian.Jakub wrote: > > Dear Petsc Team, > > I have a problem with object viewing through PetscObjectViewFromOptions > > The C Version works fine, e.g. 
> > static char help[] = "Testing multiple PetscObjectViewFromOptions"; > #include > > int main(int argc,char **argv) { > DM dmA, dmB; > PetscInitialize(&argc,&argv,(char*)0,help); > > PetscErrorCode ierr; > > ierr = DMPlexCreate(PETSC_COMM_WORLD, &dmA); CHKERRQ(ierr); > ierr = PetscObjectSetName((PetscObject) dmA, "DMPlex_A"); CHKERRQ(ierr); > > ierr = DMPlexCreate(PETSC_COMM_WORLD, &dmB); CHKERRQ(ierr); > ierr = PetscObjectSetName((PetscObject) dmB, "DMPlex_B"); CHKERRQ(ierr); > > PetscObjectViewFromOptions((PetscObject) dmA, NULL, "-dmA"); > PetscObjectViewFromOptions((PetscObject) dmB, NULL, "-dmB"); > > ierr = DMDestroy(&dmA); CHKERRQ(ierr); > ierr = DMDestroy(&dmB); CHKERRQ(ierr); > > PetscFinalize(); > } > > and running it with -help, correctly produces the options and views as: > > -dmA > > -dmB > > > but the equivalent in Fortran, e.g.: > > program main > #include "petsc/finclude/petsc.h" > use petsc > implicit none > > PetscErrorCode :: ierr > > DM :: dmA, dmB > > call PetscInitialize(PETSC_NULL_CHARACTER, ierr); CHKERRQ(ierr) > > call DMPlexCreate(PETSC_COMM_WORLD, dmA, ierr);CHKERRQ(ierr) > call PetscObjectSetName(dmA, 'DMPlex_A', ierr);CHKERRQ(ierr) > > call DMPlexCreate(PETSC_COMM_WORLD, dmB, ierr);CHKERRQ(ierr) > call PetscObjectSetName(dmB, 'DMPlex_B', ierr);CHKERRQ(ierr) > > call PetscObjectViewFromOptions(dmA, PETSC_NULL_CHARACTER, "-dmA", > ierr); CHKERRQ(ierr) > call PetscObjectViewFromOptions(dmB, PETSC_NULL_CHARACTER, "-dmB", The second argument is a PETScObject, not a character string. This is what is causing the error. You should replace the PETSC_NULL_CHARACTER with PETSC_NULL_OBJECT in PETSc 3.7.x or earlier or with PETSC_NULL_VEC with the master branch development version of PETSc. I have added more error checking to the branch barry/errorcheck-fortran-petscobjectviewfromoptions that will detect this error in the future. Thanks for reporting the problem, Barry > ierr); CHKERRQ(ierr) > > call DMDestroy(dmA, ierr);CHKERRQ(ierr) > call DMDestroy(dmB, ierr);CHKERRQ(ierr) > > call PetscFinalize(ierr) > > end program > > produces the options to be: > > -dmA-dmB > > -dmB > > > > While this works as expected when running with: > ./example -dmA-dmB -dmB > > This is not intuitive. > > Is the hickup on my side or is it somewhere in the Fortran stubs? > > Please, let me know if you need more details on the build or if you > cannot reproduce this. 
> > Many thanks, > > Fabian > > > > Petsc Development GIT revision: v3.7.6-3910-gd04c6f6 GIT Date: > 2017-05-15 17:09:20 -0500 > ./configure \ > --with-cc=$(which mpicc) \ > --with-fc=$(which mpif90) \ > --with-cxx=$(which mpicxx) \ > --with-fortran \ > --with-fortran-interfaces \ > --with-shared-libraries=1 \ > --download-hdf5 \ > --download-triangle \ > --download-ctetgen \ > --with-cmake=$(which cmake) \ > --with-debugging=1 \ > COPTFLAGS='-O2 ' \ > FOPTFLAGS='-O2 ' \ > \ > && make all test > > GNU Fortran (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609 > (Open MPI) 1.10.2 > > > Complete output of -help -info (Fortran Version): > [0] petscinitialize_internal(): (Fortran):PETSc successfully started: > procs 1 > [0] PetscGetHostName(): Rejecting domainname, likely is NIS > met-ws-740m19.(none) > [0] petscinitialize_internal(): Running on machine: met-ws-740m19 > ------Additional PETSc component options-------- > -log_exclude: > -info_exclude: > ----------------------------------------------- > [0] PetscCommDuplicate(): Duplicating a communicator 47693199447680 > 11260976 max tags = 2147483647 > [0] PetscCommDuplicate(): Using internal PETSc communicator > 47693199447680 11260976 > [0] PetscCommDuplicate(): Using internal PETSc communicator > 47693199447680 11260976 > [0] PetscCommDuplicate(): Using internal PETSc communicator > 47693199447680 11260976 > [0] PetscCommDuplicate(): Using internal PETSc communicator > 47693199447680 11260976 > [0] PetscCommDuplicate(): Using internal PETSc communicator > 47693199447680 11260976 > [0] PetscCommDuplicate(): Using internal PETSc communicator > 47693199447680 11260976 > [0] PetscCommDuplicate(): Using internal PETSc communicator > 47693199447680 11260976 > > -dmA-dmB ascii[:[filename][:[format][:append]]]: Prints object to > stdout or ASCII file (PetscOptionsGetViewer) > -dmA-dmB binary[:[filename][:[format][:append]]]: Saves object to a > binary file (PetscOptionsGetViewer) > -dmA-dmB draw[:drawtype[:filename]] Draws object (PetscOptionsGetViewer) > -dmA-dmB socket[:port]: Pushes object to a Unix socket > (PetscOptionsGetViewer) > -dmA-dmB saws[:communicatorname]: Publishes object to SAWs > (PetscOptionsGetViewer) > > DM Object: DMPlex_A 1 MPI processes > type: plex > [0] PetscCommDuplicate(): Duplicating a communicator 47693199449728 > 13799472 max tags = 2147483647 > [0] PetscCommDuplicate(): Using internal PETSc communicator > 47693199449728 13799472 > DMPlex_A in 0 dimensions: > 0-cells: 0 > > -dmB ascii[:[filename][:[format][:append]]]: Prints object to stdout > or ASCII file (PetscOptionsGetViewer) > -dmB binary[:[filename][:[format][:append]]]: Saves object to a > binary file (PetscOptionsGetViewer) > -dmB draw[:drawtype[:filename]] Draws object (PetscOptionsGetViewer) > -dmB socket[:port]: Pushes object to a Unix socket > (PetscOptionsGetViewer) > -dmB saws[:communicatorname]: Publishes object to SAWs > (PetscOptionsGetViewer) > > DM Object: DMPlex_B 1 MPI processes > type: plex > [0] PetscCommDuplicate(): Using internal PETSc communicator > 47693199449728 13799472 > [0] PetscCommDuplicate(): Using internal PETSc communicator > 47693199449728 13799472 > DMPlex_B in 0 dimensions: > 0-cells: 0 > [0] Petsc_DelComm_Inner(): Removing reference to PETSc communicator > embedded in a user MPI_Comm 13799472 > [0] Petsc_DelComm_Outer(): User MPI_Comm 47693199449728 is being freed > after removing reference from inner PETSc comm to this outer comm > [0] PetscCommDestroy(): Deleting PETSc MPI_Comm 13799472 > [0] Petsc_DelCounter(): Deleting counter 
data in an MPI_Comm 13799472 > [0] PetscFinalize(): PetscFinalize() called > [0] PetscGetHostName(): Rejecting domainname, likely is NIS > met-ws-740m19.(none) > [0] PetscFOpen(): Opening file Log.0 > [0] Petsc_DelViewer(): Removing viewer data attribute in an MPI_Comm > 11260976 > [0] Petsc_DelComm_Inner(): Removing reference to PETSc communicator > embedded in a user MPI_Comm 11260976 > [0] Petsc_DelComm_Outer(): User MPI_Comm 47693199447680 is being freed > after removing reference from inner PETSc comm to this outer comm > [0] PetscCommDestroy(): Deleting PETSc MPI_Comm 11260976 > [0] Petsc_DelCounter(): Deleting counter data in an MPI_Comm 11260976 > From franck.houssen at inria.fr Sun May 21 11:11:57 2017 From: franck.houssen at inria.fr (Franck Houssen) Date: Sun, 21 May 2017 18:11:57 +0200 (CEST) Subject: [petsc-users] Is xout always already zero'ed when being called back on PCShellSetApply ? In-Reply-To: <2135243060.6755933.1495382556901.JavaMail.zimbra@inria.fr> Message-ID: <453436828.6756843.1495383117808.JavaMail.zimbra@inria.fr> When using http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCShellSetApply.html, do I have the guaranty that xout from " PetscErrorCode apply ( PC pc, Vec xin, Vec xout)" has always been previously filled with zeros ? I may have to fill xout by blocks (I would += several possibly overlapping blocks: I need to make sure xout is first filled with zero to get the correct result). Franck -------------- next part -------------- An HTML attachment was scrubbed... URL: From franck.houssen at inria.fr Sun May 21 11:23:00 2017 From: franck.houssen at inria.fr (Franck Houssen) Date: Sun, 21 May 2017 18:23:00 +0200 (CEST) Subject: [petsc-users] Using PETSc MatIS, how to get local matrix (= one domain) before and after assembly ? In-Reply-To: <867421313.6757137.1495383596545.JavaMail.zimbra@inria.fr> Message-ID: <1253777447.6757298.1495383780337.JavaMail.zimbra@inria.fr> I have a 3x3 global matrix is built (diag: 1, 2, 1): it's made of 2 overlapping 2x2 local matrix (diag: 1, 1). Getting non assembled local matrix is OK with MatISGetLocalMat. How to get assembled local matrix (initial local matrix + neigbhor contributions on the borders) ? (expected result is diag: 2, 1) Franck -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: matISLocalMat.cpp Type: text/x-c++src Size: 2354 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: matISLocalMat.log Type: text/x-log Size: 2285 bytes Desc: not available URL: From franck.houssen at inria.fr Sun May 21 11:26:14 2017 From: franck.houssen at inria.fr (Franck Houssen) Date: Sun, 21 May 2017 18:26:14 +0200 (CEST) Subject: [petsc-users] Using PETSc MatIS, how to matmult a global IS matrix and a global vector ? In-Reply-To: <2012394521.6757315.1495383841678.JavaMail.zimbra@inria.fr> Message-ID: <1564257107.6757440.1495383974167.JavaMail.zimbra@inria.fr> Using PETSc MatIS, how to matmult a global IS matrix and a global vector ? Example is attached : I don't get what I expect that is a vector such that proc0 = [1, 2] and proc1 = [2, 1] Franck -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: matISProdMatVec.cpp Type: text/x-c++src Size: 2104 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: matISProdMatVec.log Type: text/x-log Size: 441 bytes Desc: not available URL: From knepley at gmail.com Sun May 21 11:41:10 2017 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 21 May 2017 11:41:10 -0500 Subject: [petsc-users] Is xout always already zero'ed when being called back on PCShellSetApply ? In-Reply-To: <453436828.6756843.1495383117808.JavaMail.zimbra@inria.fr> References: <2135243060.6755933.1495382556901.JavaMail.zimbra@inria.fr> <453436828.6756843.1495383117808.JavaMail.zimbra@inria.fr> Message-ID: On Sun, May 21, 2017 at 11:11 AM, Franck Houssen wrote: > When using http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/ > PCShellSetApply.html, do I have the guaranty that xout from " > PetscErrorCode > > apply (PC > > pc,Vec > > xin,Vec > > xout)" has always been previously filled with zeros ? > I may have to fill xout by blocks (I would += several possibly overlapping > blocks: I need to make sure xout is first filled with zero to get the > correct result). > No, we do not initialize the output vector. Yo ucan call VecSet(xout, 0.0); Thanks, Matt > Franck > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun May 21 11:42:59 2017 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 21 May 2017 11:42:59 -0500 Subject: [petsc-users] [petsc-dev] Using PETSc MatIS, how to get local matrix (= one domain) before and after assembly ? In-Reply-To: <1253777447.6757298.1495383780337.JavaMail.zimbra@inria.fr> References: <867421313.6757137.1495383596545.JavaMail.zimbra@inria.fr> <1253777447.6757298.1495383780337.JavaMail.zimbra@inria.fr> Message-ID: On Sun, May 21, 2017 at 11:23 AM, Franck Houssen wrote: > I have a 3x3 global matrix is built (diag: 1, 2, 1): it's made of 2 > overlapping 2x2 local matrix (diag: 1, 1). > Getting non assembled local matrix is OK with MatISGetLocalMat. > How to get assembled local matrix (initial local matrix + neigbhor > contributions on the borders) ? (expected result is diag: 2, 1) > You can always use http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetSubMatrix.html http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetSubMatrices.html to get copies, but if you just want to build things, you can use http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetLocalSubMatrix.html Thanks, Matt > Franck > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun May 21 11:47:10 2017 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 21 May 2017 11:47:10 -0500 Subject: [petsc-users] [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix and a global vector ? 
In-Reply-To: <1564257107.6757440.1495383974167.JavaMail.zimbra@inria.fr> References: <2012394521.6757315.1495383841678.JavaMail.zimbra@inria.fr> <1564257107.6757440.1495383974167.JavaMail.zimbra@inria.fr> Message-ID: On Sun, May 21, 2017 at 11:26 AM, Franck Houssen wrote: > Using PETSc MatIS, how to matmult a global IS matrix and a global vector ? > Example is attached : I don't get what I expect that is a vector such that > proc0 = [1, 2] and proc1 = [2, 1] > 1) I think the global size of your matrix is wrong. You seem to want 3, not 4 2) Global vectors have a non-overlapping row partition. You might be thinking of local vectors Thanks, Matt > Franck > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Sun May 21 15:51:34 2017 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Sun, 21 May 2017 23:51:34 +0300 Subject: [petsc-users] [petsc-dev] Using PETSc MatIS, how to get local matrix (= one domain) before and after assembly ? In-Reply-To: References: <867421313.6757137.1495383596545.JavaMail.zimbra@inria.fr> <1253777447.6757298.1495383780337.JavaMail.zimbra@inria.fr> Message-ID: To assemble the operator in aij format, use MatISGetMPIXAIJ http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatISGetMPIXAIJ.html Il 21 Mag 2017 18:43, "Matthew Knepley" ha scritto: > On Sun, May 21, 2017 at 11:23 AM, Franck Houssen > wrote: > >> I have a 3x3 global matrix is built (diag: 1, 2, 1): it's made of 2 >> overlapping 2x2 local matrix (diag: 1, 1). >> Getting non assembled local matrix is OK with MatISGetLocalMat. >> How to get assembled local matrix (initial local matrix + neigbhor >> contributions on the borders) ? (expected result is diag: 2, 1) >> > > You can always use > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/ > MatGetSubMatrix.html > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/ > MatGetSubMatrices.html > > to get copies, but if you just want to build things, you can use > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/ > MatGetLocalSubMatrix.html > > Thanks, > > Matt > > >> Franck >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Sun May 21 16:02:37 2017 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Sun, 21 May 2017 23:02:37 +0200 Subject: [petsc-users] [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix and a global vector ? In-Reply-To: References: <2012394521.6757315.1495383841678.JavaMail.zimbra@inria.fr> <1564257107.6757440.1495383974167.JavaMail.zimbra@inria.fr> Message-ID: <264DC59D-B914-42E5-9A89-0746F21A37BF@gmail.com> Franck, PETSc takes care of doing the matrix-vector multiplication properly using MatIS. As Matt said, the layout of the vectors is the usual parallel layout. The local sizes of the MatIS matrix (i.e. the local size of the left and right vectors used in MatMult) are not the sizes of the local subdomain matrices in MatIS. 
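To make the layout point concrete, here is a minimal editorial sketch (not taken from the thread) for an already assembled MATIS matrix A, assuming the same Fortran preamble as the example earlier in this thread (petsc/finclude/petsc.h plus "use petsc"). The vectors passed to MatMult() are created from A and therefore carry its usual parallel layout; the subdomain matrix returned by MatISGetLocalMat() has the overlapping subdomain sizes, which in general differ from those local sizes and should not be used to size the vectors:

  Mat            :: Aloc
  Vec            :: x, y
  PetscInt       :: mloc, nloc, msub, nsub
  PetscScalar    :: one
  PetscErrorCode :: ierr

  call MatCreateVecs(A, x, y, ierr); CHKERRQ(ierr)           ! x, y get the parallel layout of A
  call MatGetLocalSize(A, mloc, nloc, ierr); CHKERRQ(ierr)   ! non-overlapping local sizes
  call MatISGetLocalMat(A, Aloc, ierr); CHKERRQ(ierr)
  call MatGetSize(Aloc, msub, nsub, ierr); CHKERRQ(ierr)     ! overlapping subdomain sizes
  one = 1.0
  call VecSet(x, one, ierr); CHKERRQ(ierr)
  call MatMult(A, x, y, ierr); CHKERRQ(ierr)                 ! PETSc handles the MatIS scatters internally

To compare the product against a conventionally assembled operator, MatISGetMPIXAIJ() (mentioned above) can first be used to build the equivalent MPIAIJ matrix.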
> On May 21, 2017, at 6:47 PM, Matthew Knepley wrote: > > On Sun, May 21, 2017 at 11:26 AM, Franck Houssen > wrote: > Using PETSc MatIS, how to matmult a global IS matrix and a global vector ? Example is attached : I don't get what I expect that is a vector such that proc0 = [1, 2] and proc1 = [2, 1] > > 1) I think the global size of your matrix is wrong. You seem to want 3, not 4 > > 2) Global vectors have a non-overlapping row partition. You might be thinking of local vectors > > Thanks, > > Matt > > Franck > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From franck.houssen at inria.fr Mon May 22 02:23:52 2017 From: franck.houssen at inria.fr (Franck Houssen) Date: Mon, 22 May 2017 09:23:52 +0200 (CEST) Subject: [petsc-users] Is xout always already zero'ed when being called back on PCShellSetApply ? In-Reply-To: References: <2135243060.6755933.1495382556901.JavaMail.zimbra@inria.fr> <453436828.6756843.1495383117808.JavaMail.zimbra@inria.fr> Message-ID: <492707441.6837624.1495437832703.JavaMail.zimbra@inria.fr> OK, thanks. Franck ----- Mail original ----- > De: "Matthew Knepley" > ?: "Franck Houssen" > Cc: "PETSc" , "PETSc" > Envoy?: Dimanche 21 Mai 2017 18:41:10 > Objet: Re: [petsc-users] Is xout always already zero'ed when being called > back on PCShellSetApply ? > On Sun, May 21, 2017 at 11:11 AM, Franck Houssen < franck.houssen at inria.fr > > wrote: > > When using > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCShellSetApply.html, > > do I have the guaranty that xout from " PetscErrorCode apply ( PC pc, Vec > > xin, Vec xout)" has always been previously filled with zeros ? > > > I may have to fill xout by blocks (I would += several possibly overlapping > > blocks: I need to make sure xout is first filled with zero to get the > > correct result). > > No, we do not initialize the output vector. Yo ucan call VecSet(xout, 0.0); > Thanks, > Matt > > Franck > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From pvsang002 at gmail.com Mon May 22 11:25:09 2017 From: pvsang002 at gmail.com (Pham Pham) Date: Tue, 23 May 2017 00:25:09 +0800 Subject: [petsc-users] Installation question In-Reply-To: References: Message-ID: Hi Matt, For the machine I have, Is it a good idea if I mix MPI and OpenMP: MPI for cores with Rank%12==0 and OpenMP for the others ? Thank you, PVS. On Thu, May 11, 2017 at 8:27 PM, Matthew Knepley wrote: > On Thu, May 11, 2017 at 7:08 AM, Pham Pham wrote: > >> Hi Matt, >> >> Thank you for the reply. >> >> I am using University HPC which has multiple nodes, and should be good >> for parallel computing. The bad performance might be due to the way I >> install and run PETSc... >> >> Looking at the output when running streams, I can see that the Processor >> names were the same. >> Does that mean only one processor involved in computing, did it cause the >> bad performance? >> > > Yes. From the data, it appears that the kind of processor you have has 12 > cores, but only enough memory bandwidth to support 1.5 cores. 
> Try running the STREAMS with only 1 process per node. This is a setting in > your submission script, but it is different for every cluster. Thus > I would ask the local sysdamin for this machine to help you do that. You > should see almost perfect scaling with that configuration. You might > also try 2 processes per node to compare. > > Thanks, > > Matt > > >> Thank you very much. >> >> Ph. >> >> Below is testing output: >> >> [mpepvs at atlas5-c01 petsc-3.7.5]$ make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >> PETSC_ARCH=arch-linux-cxx-opt streams >> >> >> >> >> cd src/benchmarks/streams; /usr/bin/gmake --no-print-directory >> PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >> PETSC_ARCH=arch-linux-cxx-opt streams >> /app1/centos6.3/Intel/xe_2015/impi/5.0.3.048/intel64/bin/mpicxx -o >> MPIVersion.o c -wd1572 -g -O3 -fPIC -I/home/svu/mpepvs/petsc/petsc-3.7.5/include >> -I/hom >> >> >> e/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/include >> -I/app1/centos6.3/Intel/xe_2015/impi/5.0.3.048/intel64/include >> `pwd`/MPIVersion.c >> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ >> +++++++++++++++++++++++++++++++ >> The version of PETSc you are using is out-of-date, we recommend updating >> to the new release >> Available Version: 3.7.6 Installed Version: 3.7.5 >> http://www.mcs.anl.gov/petsc/download/index.html >> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ >> +++++++++++++++++++++++++++++++ >> Running streams with 'mpiexec.hydra ' using 'NPMAX=12' >> Number of MPI processes 1 Processor names atlas5-c01 >> Triad: 11026.7604 Rate (MB/s) >> Number of MPI processes 2 Processor names atlas5-c01 atlas5-c01 >> Triad: 14669.6730 Rate (MB/s) >> Number of MPI processes 3 Processor names atlas5-c01 atlas5-c01 >> atlas5-c01 >> Triad: 12848.2644 Rate (MB/s) >> Number of MPI processes 4 Processor names atlas5-c01 atlas5-c01 >> atlas5-c01 atlas5-c01 >> Triad: 15033.7687 Rate (MB/s) >> Number of MPI processes 5 Processor names atlas5-c01 atlas5-c01 >> atlas5-c01 atlas5-c01 atlas5-c01 >> Triad: 13299.3830 Rate (MB/s) >> Number of MPI processes 6 Processor names atlas5-c01 atlas5-c01 >> atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 >> Triad: 14382.2116 Rate (MB/s) >> Number of MPI processes 7 Processor names atlas5-c01 atlas5-c01 >> atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 >> Triad: 13194.2573 Rate (MB/s) >> Number of MPI processes 8 Processor names atlas5-c01 atlas5-c01 >> atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 >> Triad: 14199.7255 Rate (MB/s) >> Number of MPI processes 9 Processor names atlas5-c01 atlas5-c01 >> atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 >> Triad: 13045.8946 Rate (MB/s) >> Number of MPI processes 10 Processor names atlas5-c01 atlas5-c01 >> atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 >> atlas5-c01 atlas5-c01 >> Triad: 13058.3283 Rate (MB/s) >> Number of MPI processes 11 Processor names atlas5-c01 atlas5-c01 >> atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 >> atlas5-c01 atlas5-c01 atlas5-c01 >> Triad: 13037.3334 Rate (MB/s) >> Number of MPI processes 12 Processor names atlas5-c01 atlas5-c01 >> atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 >> atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 >> Triad: 12526.6096 Rate (MB/s) >> ------------------------------------------------ >> np speedup >> 1 1.0 >> 2 1.33 >> 3 1.17 >> 4 1.36 >> 5 1.21 >> 6 1.3 >> 7 1.2 >> 8 1.29 >> 9 1.18 >> 10 1.18 >> 11 1.18 >> 12 1.14 >> Estimation of possible 
speedup of MPI programs based on Streams benchmark. >> It appears you have 1 node(s) >> See graph in the file src/benchmarks/streams/scaling.png >> >> On Fri, May 5, 2017 at 11:26 PM, Matthew Knepley >> wrote: >> >>> On Fri, May 5, 2017 at 10:18 AM, Pham Pham wrote: >>> >>>> Hi Satish, >>>> >>>> It runs now, and shows a bad speed up: >>>> Please help to improve this. >>>> >>> >>> http://www.mcs.anl.gov/petsc/documentation/faq.html#computers >>> >>> The short answer is: You cannot improve this without buying a different >>> machine. This is >>> a fundamental algorithmic limitation that cannot be helped by threads, >>> or vectorization, or >>> anything else. >>> >>> Matt >>> >>> >>>> Thank you. >>>> >>>> >>>> ? >>>> >>>> On Fri, May 5, 2017 at 10:02 PM, Satish Balay >>>> wrote: >>>> >>>>> With Intel MPI - its best to use mpiexec.hydra [and not mpiexec] >>>>> >>>>> So you can do: >>>>> >>>>> make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >>>>> PETSC_ARCH=arch-linux-cxx-opt MPIEXEC=mpiexec.hydra test >>>>> >>>>> >>>>> [you can also specify --with-mpiexec=mpiexec.hydra at configure time] >>>>> >>>>> Satish >>>>> >>>>> >>>>> On Fri, 5 May 2017, Pham Pham wrote: >>>>> >>>>> > *Hi,* >>>>> > *I can configure now, but fail when testing:* >>>>> > >>>>> > [mpepvs at atlas7-c10 petsc-3.7.5]$ make >>>>> > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >>>>> PETSC_ARCH=arch-linux-cxx-opt >>>>> > test Running test examples to verify correct installation >>>>> > Using PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 and >>>>> > PETSC_ARCH=arch-linux-cxx-opt >>>>> > Possible error running C/C++ src/snes/examples/tutorials/ex19 with >>>>> 1 MPI >>>>> > process >>>>> > See http://www.mcs.anl.gov/petsc/documentation/faq.html >>>>> > mpiexec_atlas7-c10: cannot connect to local mpd >>>>> (/tmp/mpd2.console_mpepvs); >>>>> > possible causes: >>>>> > 1. no mpd is running on this host >>>>> > 2. an mpd is running but was started without a "console" (-n >>>>> option) >>>>> > Possible error running C/C++ src/snes/examples/tutorials/ex19 with >>>>> 2 MPI >>>>> > processes >>>>> > See http://www.mcs.anl.gov/petsc/documentation/faq.html >>>>> > mpiexec_atlas7-c10: cannot connect to local mpd >>>>> (/tmp/mpd2.console_mpepvs); >>>>> > possible causes: >>>>> > 1. no mpd is running on this host >>>>> > 2. an mpd is running but was started without a "console" (-n >>>>> option) >>>>> > Possible error running Fortran example src/snes/examples/tutorials/ex >>>>> 5f >>>>> > with 1 MPI process >>>>> > See http://www.mcs.anl.gov/petsc/documentation/faq.html >>>>> > mpiexec_atlas7-c10: cannot connect to local mpd >>>>> (/tmp/mpd2.console_mpepvs); >>>>> > possible causes: >>>>> > 1. no mpd is running on this host >>>>> > 2. an mpd is running but was started without a "console" (-n >>>>> option) >>>>> > Completed test examples >>>>> > ========================================= >>>>> > Now to evaluate the computer systems you plan use - do: >>>>> > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >>>>> > PETSC_ARCH=arch-linux-cxx-opt streams >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > *Please help on this.* >>>>> > *Many thanks!* >>>>> > >>>>> > >>>>> > On Thu, Apr 20, 2017 at 2:02 AM, Satish Balay >>>>> wrote: >>>>> > >>>>> > > Sorry - should have mentioned: >>>>> > > >>>>> > > do 'rm -rf arch-linux-cxx-opt' and rerun configure again. 
>>>>> > > >>>>> > > The mpich install from previous build [that is currently in >>>>> > > arch-linux-cxx-opt/] >>>>> > > is conflicting with --with-mpi-dir=/app1/centos6.3 >>>>> /gnu/mvapich2-1.9/ >>>>> > > >>>>> > > Satish >>>>> > > >>>>> > > >>>>> > > On Wed, 19 Apr 2017, Pham Pham wrote: >>>>> > > >>>>> > > > I reconfigured PETSs with installed MPI, however, I got serous >>>>> error: >>>>> > > > >>>>> > > > **************************ERROR***************************** >>>>> ******** >>>>> > > > Error during compile, check arch-linux-cxx-opt/lib/petsc/c >>>>> onf/make.log >>>>> > > > Send it and arch-linux-cxx-opt/lib/petsc/conf/configure.log to >>>>> > > > petsc-maint at mcs.anl.gov >>>>> > > > ************************************************************ >>>>> ******** >>>>> > > > >>>>> > > > Please explain what is happening? >>>>> > > > >>>>> > > > Thank you very much. >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > > On Wed, Apr 19, 2017 at 11:43 PM, Satish Balay < >>>>> balay at mcs.anl.gov> >>>>> > > wrote: >>>>> > > > >>>>> > > > > Presumably your cluster already has a recommended MPI to use >>>>> [which is >>>>> > > > > already installed. So you should use that - instead of >>>>> > > > > --download-mpich=1 >>>>> > > > > >>>>> > > > > Satish >>>>> > > > > >>>>> > > > > On Wed, 19 Apr 2017, Pham Pham wrote: >>>>> > > > > >>>>> > > > > > Hi, >>>>> > > > > > >>>>> > > > > > I just installed petsc-3.7.5 into my university cluster. When >>>>> > > evaluating >>>>> > > > > > the computer system, PETSc reports "It appears you have 1 >>>>> node(s)", I >>>>> > > > > donot >>>>> > > > > > understand this, since the system is a multinodes system. >>>>> Could you >>>>> > > > > please >>>>> > > > > > explain this to me? >>>>> > > > > > >>>>> > > > > > Thank you very much. >>>>> > > > > > >>>>> > > > > > S. 
>>>>> > > > > > >>>>> > > > > > Output: >>>>> > > > > > ========================================= >>>>> > > > > > Now to evaluate the computer systems you plan use - do: >>>>> > > > > > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >>>>> > > > > > PETSC_ARCH=arch-linux-cxx-opt streams >>>>> > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ make >>>>> > > > > > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >>>>> > > > > PETSC_ARCH=arch-linux-cxx-opt >>>>> > > > > > streams >>>>> > > > > > cd src/benchmarks/streams; /usr/bin/gmake >>>>> --no-print-directory >>>>> > > > > > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >>>>> > > > > PETSC_ARCH=arch-linux-cxx-opt >>>>> > > > > > streams >>>>> > > > > > /home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/bin/mpicxx >>>>> -o >>>>> > > > > > MPIVersion.o -c -Wall -Wwrite-strings -Wno-strict-aliasing >>>>> > > > > > -Wno-unknown-pragmas -fvisibility=hidden -g -O >>>>> > > > > > -I/home/svu/mpepvs/petsc/petsc-3.7.5/include >>>>> > > > > > -I/home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/incl >>>>> ude >>>>> > > > > > `pwd`/MPIVersion.c >>>>> > > > > > Running streams with >>>>> > > > > > '/home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/bin/mpiexec >>>>> ' >>>>> > > > > using >>>>> > > > > > 'NPMAX=12' >>>>> > > > > > Number of MPI processes 1 Processor names atlas7-c10 >>>>> > > > > > Triad: 9137.5025 Rate (MB/s) >>>>> > > > > > Number of MPI processes 2 Processor names atlas7-c10 >>>>> atlas7-c10 >>>>> > > > > > Triad: 9707.2815 Rate (MB/s) >>>>> > > > > > Number of MPI processes 3 Processor names atlas7-c10 >>>>> atlas7-c10 >>>>> > > > > atlas7-c10 >>>>> > > > > > Triad: 13559.5275 Rate (MB/s) >>>>> > > > > > Number of MPI processes 4 Processor names atlas7-c10 >>>>> atlas7-c10 >>>>> > > > > atlas7-c10 >>>>> > > > > > atlas7-c10 >>>>> > > > > > Triad: 14193.0597 Rate (MB/s) >>>>> > > > > > Number of MPI processes 5 Processor names atlas7-c10 >>>>> atlas7-c10 >>>>> > > > > atlas7-c10 >>>>> > > > > > atlas7-c10 atlas7-c10 >>>>> > > > > > Triad: 14492.9234 Rate (MB/s) >>>>> > > > > > Number of MPI processes 6 Processor names atlas7-c10 >>>>> atlas7-c10 >>>>> > > > > atlas7-c10 >>>>> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 >>>>> > > > > > Triad: 15476.5912 Rate (MB/s) >>>>> > > > > > Number of MPI processes 7 Processor names atlas7-c10 >>>>> atlas7-c10 >>>>> > > > > atlas7-c10 >>>>> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >>>>> > > > > > Triad: 15148.7388 Rate (MB/s) >>>>> > > > > > Number of MPI processes 8 Processor names atlas7-c10 >>>>> atlas7-c10 >>>>> > > > > atlas7-c10 >>>>> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >>>>> > > > > > Triad: 15799.1290 Rate (MB/s) >>>>> > > > > > Number of MPI processes 9 Processor names atlas7-c10 >>>>> atlas7-c10 >>>>> > > > > atlas7-c10 >>>>> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >>>>> atlas7-c10 >>>>> > > > > > Triad: 15671.3104 Rate (MB/s) >>>>> > > > > > Number of MPI processes 10 Processor names atlas7-c10 >>>>> atlas7-c10 >>>>> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >>>>> atlas7-c10 >>>>> > > > > > atlas7-c10 atlas7-c10 >>>>> > > > > > Triad: 15601.4754 Rate (MB/s) >>>>> > > > > > Number of MPI processes 11 Processor names atlas7-c10 >>>>> atlas7-c10 >>>>> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >>>>> atlas7-c10 >>>>> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 >>>>> > > > > > Triad: 15434.5790 Rate (MB/s) >>>>> > > > > > Number of MPI processes 12 Processor names atlas7-c10 
>>>>> atlas7-c10 >>>>> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >>>>> atlas7-c10 >>>>> > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >>>>> > > > > > Triad: 15134.1263 Rate (MB/s) >>>>> > > > > > ------------------------------------------------ >>>>> > > > > > np speedup >>>>> > > > > > 1 1.0 >>>>> > > > > > 2 1.06 >>>>> > > > > > 3 1.48 >>>>> > > > > > 4 1.55 >>>>> > > > > > 5 1.59 >>>>> > > > > > 6 1.69 >>>>> > > > > > 7 1.66 >>>>> > > > > > 8 1.73 >>>>> > > > > > 9 1.72 >>>>> > > > > > 10 1.71 >>>>> > > > > > 11 1.69 >>>>> > > > > > 12 1.66 >>>>> > > > > > Estimation of possible speedup of MPI programs based on >>>>> Streams >>>>> > > > > benchmark. >>>>> > > > > > It appears you have 1 node(s) >>>>> > > > > > Unable to plot speedup to a file >>>>> > > > > > Unable to open matplotlib to plot speedup >>>>> > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ >>>>> > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ >>>>> > > > > > >>>>> > > > > >>>>> > > > > >>>>> > > > >>>>> > > >>>>> > > >>>>> > >>>>> >>>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: scaling.png Type: image/png Size: 46047 bytes Desc: not available URL: From bsmith at mcs.anl.gov Mon May 22 12:58:49 2017 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 22 May 2017 12:58:49 -0500 Subject: [petsc-users] Installation question In-Reply-To: References: Message-ID: <1B05ACB6-6BDF-42C8-89FB-C5ECC5657934@mcs.anl.gov> > On May 22, 2017, at 11:25 AM, Pham Pham wrote: > > Hi Matt, > > For the machine I have, Is it a good idea if I mix MPI and OpenMP: MPI for cores with Rank%12==0 and OpenMP for the others ? > MPI+OpenMP doesn't work this way. Each "rank" is an MPI process, you cannot say some ranks are MPI and some are OpenMP. If you want to use one MPI process per node and have each MPI process have 12 OpenMP threads you need to find out for YOUR systems MPI how you tell it to put one MPI process per node; Barry > Thank you, > > PVS. > > On Thu, May 11, 2017 at 8:27 PM, Matthew Knepley wrote: > On Thu, May 11, 2017 at 7:08 AM, Pham Pham wrote: > Hi Matt, > > Thank you for the reply. > > I am using University HPC which has multiple nodes, and should be good for parallel computing. The bad performance might be due to the way I install and run PETSc... > > Looking at the output when running streams, I can see that the Processor names were the same. > Does that mean only one processor involved in computing, did it cause the bad performance? > > Yes. From the data, it appears that the kind of processor you have has 12 cores, but only enough memory bandwidth to support 1.5 cores. > Try running the STREAMS with only 1 process per node. This is a setting in your submission script, but it is different for every cluster. Thus > I would ask the local sysdamin for this machine to help you do that. You should see almost perfect scaling with that configuration. You might > also try 2 processes per node to compare. > > Thanks, > > Matt > > Thank you very much. > > Ph. 
> > Below is testing output: > > [mpepvs at atlas5-c01 petsc-3.7.5]$ make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt streams > cd src/benchmarks/streams; /usr/bin/gmake --no-print-directory PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt streams > /app1/centos6.3/Intel/xe_2015/impi/5.0.3.048/intel64/bin/mpicxx -o MPIVersion.o c -wd1572 -g -O3 -fPIC -I/home/svu/mpepvs/petsc/petsc-3.7.5/include -I/hom e/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/include -I/app1/centos6.3/Intel/xe_2015/impi/5.0.3.048/intel64/include `pwd`/MPIVersion.c > +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > The version of PETSc you are using is out-of-date, we recommend updating to the new release > Available Version: 3.7.6 Installed Version: 3.7.5 > http://www.mcs.anl.gov/petsc/download/index.html > +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > Running streams with 'mpiexec.hydra ' using 'NPMAX=12' > Number of MPI processes 1 Processor names atlas5-c01 > Triad: 11026.7604 Rate (MB/s) > Number of MPI processes 2 Processor names atlas5-c01 atlas5-c01 > Triad: 14669.6730 Rate (MB/s) > Number of MPI processes 3 Processor names atlas5-c01 atlas5-c01 atlas5-c01 > Triad: 12848.2644 Rate (MB/s) > Number of MPI processes 4 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > Triad: 15033.7687 Rate (MB/s) > Number of MPI processes 5 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > Triad: 13299.3830 Rate (MB/s) > Number of MPI processes 6 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > Triad: 14382.2116 Rate (MB/s) > Number of MPI processes 7 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > Triad: 13194.2573 Rate (MB/s) > Number of MPI processes 8 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > Triad: 14199.7255 Rate (MB/s) > Number of MPI processes 9 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > Triad: 13045.8946 Rate (MB/s) > Number of MPI processes 10 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > Triad: 13058.3283 Rate (MB/s) > Number of MPI processes 11 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > Triad: 13037.3334 Rate (MB/s) > Number of MPI processes 12 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > Triad: 12526.6096 Rate (MB/s) > ------------------------------------------------ > np speedup > 1 1.0 > 2 1.33 > 3 1.17 > 4 1.36 > 5 1.21 > 6 1.3 > 7 1.2 > 8 1.29 > 9 1.18 > 10 1.18 > 11 1.18 > 12 1.14 > Estimation of possible speedup of MPI programs based on Streams benchmark. > It appears you have 1 node(s) > See graph in the file src/benchmarks/streams/scaling.png > > On Fri, May 5, 2017 at 11:26 PM, Matthew Knepley wrote: > On Fri, May 5, 2017 at 10:18 AM, Pham Pham wrote: > Hi Satish, > > It runs now, and shows a bad speed up: > Please help to improve this. > > http://www.mcs.anl.gov/petsc/documentation/faq.html#computers > > The short answer is: You cannot improve this without buying a different machine. 
This is > a fundamental algorithmic limitation that cannot be helped by threads, or vectorization, or > anything else. > > Matt > > Thank you. > > > ? > > On Fri, May 5, 2017 at 10:02 PM, Satish Balay wrote: > With Intel MPI - its best to use mpiexec.hydra [and not mpiexec] > > So you can do: > > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt MPIEXEC=mpiexec.hydra test > > > [you can also specify --with-mpiexec=mpiexec.hydra at configure time] > > Satish > > > On Fri, 5 May 2017, Pham Pham wrote: > > > *Hi,* > > *I can configure now, but fail when testing:* > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ make > > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt > > test Running test examples to verify correct installation > > Using PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 and > > PETSC_ARCH=arch-linux-cxx-opt > > Possible error running C/C++ src/snes/examples/tutorials/ex19 with 1 MPI > > process > > See http://www.mcs.anl.gov/petsc/documentation/faq.html > > mpiexec_atlas7-c10: cannot connect to local mpd (/tmp/mpd2.console_mpepvs); > > possible causes: > > 1. no mpd is running on this host > > 2. an mpd is running but was started without a "console" (-n option) > > Possible error running C/C++ src/snes/examples/tutorials/ex19 with 2 MPI > > processes > > See http://www.mcs.anl.gov/petsc/documentation/faq.html > > mpiexec_atlas7-c10: cannot connect to local mpd (/tmp/mpd2.console_mpepvs); > > possible causes: > > 1. no mpd is running on this host > > 2. an mpd is running but was started without a "console" (-n option) > > Possible error running Fortran example src/snes/examples/tutorials/ex5f > > with 1 MPI process > > See http://www.mcs.anl.gov/petsc/documentation/faq.html > > mpiexec_atlas7-c10: cannot connect to local mpd (/tmp/mpd2.console_mpepvs); > > possible causes: > > 1. no mpd is running on this host > > 2. an mpd is running but was started without a "console" (-n option) > > Completed test examples > > ========================================= > > Now to evaluate the computer systems you plan use - do: > > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > > PETSC_ARCH=arch-linux-cxx-opt streams > > > > > > > > > > *Please help on this.* > > *Many thanks!* > > > > > > On Thu, Apr 20, 2017 at 2:02 AM, Satish Balay wrote: > > > > > Sorry - should have mentioned: > > > > > > do 'rm -rf arch-linux-cxx-opt' and rerun configure again. > > > > > > The mpich install from previous build [that is currently in > > > arch-linux-cxx-opt/] > > > is conflicting with --with-mpi-dir=/app1/centos6.3/gnu/mvapich2-1.9/ > > > > > > Satish > > > > > > > > > On Wed, 19 Apr 2017, Pham Pham wrote: > > > > > > > I reconfigured PETSs with installed MPI, however, I got serous error: > > > > > > > > **************************ERROR************************************* > > > > Error during compile, check arch-linux-cxx-opt/lib/petsc/conf/make.log > > > > Send it and arch-linux-cxx-opt/lib/petsc/conf/configure.log to > > > > petsc-maint at mcs.anl.gov > > > > ******************************************************************** > > > > > > > > Please explain what is happening? > > > > > > > > Thank you very much. > > > > > > > > > > > > > > > > > > > > On Wed, Apr 19, 2017 at 11:43 PM, Satish Balay > > > wrote: > > > > > > > > > Presumably your cluster already has a recommended MPI to use [which is > > > > > already installed. 
So you should use that - instead of > > > > > --download-mpich=1 > > > > > > > > > > Satish > > > > > > > > > > On Wed, 19 Apr 2017, Pham Pham wrote: > > > > > > > > > > > Hi, > > > > > > > > > > > > I just installed petsc-3.7.5 into my university cluster. When > > > evaluating > > > > > > the computer system, PETSc reports "It appears you have 1 node(s)", I > > > > > donot > > > > > > understand this, since the system is a multinodes system. Could you > > > > > please > > > > > > explain this to me? > > > > > > > > > > > > Thank you very much. > > > > > > > > > > > > S. > > > > > > > > > > > > Output: > > > > > > ========================================= > > > > > > Now to evaluate the computer systems you plan use - do: > > > > > > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > > > > > > PETSC_ARCH=arch-linux-cxx-opt streams > > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ make > > > > > > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > > > > > PETSC_ARCH=arch-linux-cxx-opt > > > > > > streams > > > > > > cd src/benchmarks/streams; /usr/bin/gmake --no-print-directory > > > > > > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > > > > > PETSC_ARCH=arch-linux-cxx-opt > > > > > > streams > > > > > > /home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/bin/mpicxx -o > > > > > > MPIVersion.o -c -Wall -Wwrite-strings -Wno-strict-aliasing > > > > > > -Wno-unknown-pragmas -fvisibility=hidden -g -O > > > > > > -I/home/svu/mpepvs/petsc/petsc-3.7.5/include > > > > > > -I/home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/include > > > > > > `pwd`/MPIVersion.c > > > > > > Running streams with > > > > > > '/home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/bin/mpiexec ' > > > > > using > > > > > > 'NPMAX=12' > > > > > > Number of MPI processes 1 Processor names atlas7-c10 > > > > > > Triad: 9137.5025 Rate (MB/s) > > > > > > Number of MPI processes 2 Processor names atlas7-c10 atlas7-c10 > > > > > > Triad: 9707.2815 Rate (MB/s) > > > > > > Number of MPI processes 3 Processor names atlas7-c10 atlas7-c10 > > > > > atlas7-c10 > > > > > > Triad: 13559.5275 Rate (MB/s) > > > > > > Number of MPI processes 4 Processor names atlas7-c10 atlas7-c10 > > > > > atlas7-c10 > > > > > > atlas7-c10 > > > > > > Triad: 14193.0597 Rate (MB/s) > > > > > > Number of MPI processes 5 Processor names atlas7-c10 atlas7-c10 > > > > > atlas7-c10 > > > > > > atlas7-c10 atlas7-c10 > > > > > > Triad: 14492.9234 Rate (MB/s) > > > > > > Number of MPI processes 6 Processor names atlas7-c10 atlas7-c10 > > > > > atlas7-c10 > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 > > > > > > Triad: 15476.5912 Rate (MB/s) > > > > > > Number of MPI processes 7 Processor names atlas7-c10 atlas7-c10 > > > > > atlas7-c10 > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > > > Triad: 15148.7388 Rate (MB/s) > > > > > > Number of MPI processes 8 Processor names atlas7-c10 atlas7-c10 > > > > > atlas7-c10 > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > > > Triad: 15799.1290 Rate (MB/s) > > > > > > Number of MPI processes 9 Processor names atlas7-c10 atlas7-c10 > > > > > atlas7-c10 > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > > > Triad: 15671.3104 Rate (MB/s) > > > > > > Number of MPI processes 10 Processor names atlas7-c10 atlas7-c10 > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > > > atlas7-c10 atlas7-c10 > > > > > > Triad: 15601.4754 Rate (MB/s) > > > > > > Number of MPI processes 11 Processor names atlas7-c10 atlas7-c10 > > > > > > 
atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 > > > > > > Triad: 15434.5790 Rate (MB/s) > > > > > > Number of MPI processes 12 Processor names atlas7-c10 atlas7-c10 > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > > > Triad: 15134.1263 Rate (MB/s) > > > > > > ------------------------------------------------ > > > > > > np speedup > > > > > > 1 1.0 > > > > > > 2 1.06 > > > > > > 3 1.48 > > > > > > 4 1.55 > > > > > > 5 1.59 > > > > > > 6 1.69 > > > > > > 7 1.66 > > > > > > 8 1.73 > > > > > > 9 1.72 > > > > > > 10 1.71 > > > > > > 11 1.69 > > > > > > 12 1.66 > > > > > > Estimation of possible speedup of MPI programs based on Streams > > > > > benchmark. > > > > > > It appears you have 1 node(s) > > > > > > Unable to plot speedup to a file > > > > > > Unable to open matplotlib to plot speedup > > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ > > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > From pvsang002 at gmail.com Mon May 22 18:00:09 2017 From: pvsang002 at gmail.com (Pham Pham) Date: Tue, 23 May 2017 07:00:09 +0800 Subject: [petsc-users] Installation question In-Reply-To: <1B05ACB6-6BDF-42C8-89FB-C5ECC5657934@mcs.anl.gov> References: <1B05ACB6-6BDF-42C8-89FB-C5ECC5657934@mcs.anl.gov> Message-ID: Hi Barry, My code using DMDA, the mesh is partitioned in x-direction only. Can I have MPI+OpenMP works in the following way: I want to create a new communicator which includes processes with Rank%12==0, PETSc objects will be created with this new sub-set of processes. In each node (which has 12 cores), the first core (Rank%12==0) does MPI communicating job (with Rank%12==0 process of other nodes), then commanded other 11 processes do computation works using openMP? Thank you. On Tue, May 23, 2017 at 1:58 AM, Barry Smith wrote: > > > On May 22, 2017, at 11:25 AM, Pham Pham wrote: > > > > Hi Matt, > > > > For the machine I have, Is it a good idea if I mix MPI and OpenMP: MPI > for cores with Rank%12==0 and OpenMP for the others ? > > > > MPI+OpenMP doesn't work this way. Each "rank" is an MPI process, you > cannot say some ranks are MPI and some are OpenMP. If you want to use one > MPI process per node and have each MPI process have 12 OpenMP threads you > need to find out for YOUR systems MPI how you tell it to put one MPI > process per node; > > Barry > > > Thank you, > > > > PVS. > > > > On Thu, May 11, 2017 at 8:27 PM, Matthew Knepley > wrote: > > On Thu, May 11, 2017 at 7:08 AM, Pham Pham wrote: > > Hi Matt, > > > > Thank you for the reply. > > > > I am using University HPC which has multiple nodes, and should be good > for parallel computing. The bad performance might be due to the way I > install and run PETSc... > > > > Looking at the output when running streams, I can see that the Processor > names were the same. > > Does that mean only one processor involved in computing, did it cause > the bad performance? > > > > Yes. 
From the data, it appears that the kind of processor you have has > 12 cores, but only enough memory bandwidth to support 1.5 cores. > > Try running the STREAMS with only 1 process per node. This is a setting > in your submission script, but it is different for every cluster. Thus > > I would ask the local sysdamin for this machine to help you do that. You > should see almost perfect scaling with that configuration. You might > > also try 2 processes per node to compare. > > > > Thanks, > > > > Matt > > > > Thank you very much. > > > > Ph. > > > > Below is testing output: > > > > [mpepvs at atlas5-c01 petsc-3.7.5]$ make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > PETSC_ARCH=arch-linux-cxx-opt streams > > cd src/benchmarks/streams; /usr/bin/gmake --no-print-directory > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > PETSC_ARCH=arch-linux-cxx-opt streams > > /app1/centos6.3/Intel/xe_2015/impi/5.0.3.048/intel64/bin/mpicxx -o > MPIVersion.o c -wd1572 -g -O3 -fPIC -I/home/svu/mpepvs/petsc/petsc-3.7.5/include > -I/hom > > e/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/include > -I/app1/centos6.3/Intel/xe_2015/impi/5.0.3.048/intel64/include > `pwd`/MPIVersion.c > > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > +++++++++++++++++++++++++++++++ > > The version of PETSc you are using is out-of-date, we recommend updating > to the new release > > Available Version: 3.7.6 Installed Version: 3.7.5 > > http://www.mcs.anl.gov/petsc/download/index.html > > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > +++++++++++++++++++++++++++++++ > > Running streams with 'mpiexec.hydra ' using 'NPMAX=12' > > Number of MPI processes 1 Processor names atlas5-c01 > > Triad: 11026.7604 Rate (MB/s) > > Number of MPI processes 2 Processor names atlas5-c01 atlas5-c01 > > Triad: 14669.6730 Rate (MB/s) > > Number of MPI processes 3 Processor names atlas5-c01 atlas5-c01 > atlas5-c01 > > Triad: 12848.2644 Rate (MB/s) > > Number of MPI processes 4 Processor names atlas5-c01 atlas5-c01 > atlas5-c01 atlas5-c01 > > Triad: 15033.7687 Rate (MB/s) > > Number of MPI processes 5 Processor names atlas5-c01 atlas5-c01 > atlas5-c01 atlas5-c01 atlas5-c01 > > Triad: 13299.3830 Rate (MB/s) > > Number of MPI processes 6 Processor names atlas5-c01 atlas5-c01 > atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > > Triad: 14382.2116 Rate (MB/s) > > Number of MPI processes 7 Processor names atlas5-c01 atlas5-c01 > atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > > Triad: 13194.2573 Rate (MB/s) > > Number of MPI processes 8 Processor names atlas5-c01 atlas5-c01 > atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > > Triad: 14199.7255 Rate (MB/s) > > Number of MPI processes 9 Processor names atlas5-c01 atlas5-c01 > atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > > Triad: 13045.8946 Rate (MB/s) > > Number of MPI processes 10 Processor names atlas5-c01 atlas5-c01 > atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > atlas5-c01 atlas5-c01 > > Triad: 13058.3283 Rate (MB/s) > > Number of MPI processes 11 Processor names atlas5-c01 atlas5-c01 > atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > atlas5-c01 atlas5-c01 atlas5-c01 > > Triad: 13037.3334 Rate (MB/s) > > Number of MPI processes 12 Processor names atlas5-c01 atlas5-c01 > atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > > Triad: 12526.6096 Rate (MB/s) > > ------------------------------------------------ > > np speedup 
> > 1 1.0 > > 2 1.33 > > 3 1.17 > > 4 1.36 > > 5 1.21 > > 6 1.3 > > 7 1.2 > > 8 1.29 > > 9 1.18 > > 10 1.18 > > 11 1.18 > > 12 1.14 > > Estimation of possible speedup of MPI programs based on Streams > benchmark. > > It appears you have 1 node(s) > > See graph in the file src/benchmarks/streams/scaling.png > > > > On Fri, May 5, 2017 at 11:26 PM, Matthew Knepley > wrote: > > On Fri, May 5, 2017 at 10:18 AM, Pham Pham wrote: > > Hi Satish, > > > > It runs now, and shows a bad speed up: > > Please help to improve this. > > > > http://www.mcs.anl.gov/petsc/documentation/faq.html#computers > > > > The short answer is: You cannot improve this without buying a different > machine. This is > > a fundamental algorithmic limitation that cannot be helped by threads, > or vectorization, or > > anything else. > > > > Matt > > > > Thank you. > > > > > > ? > > > > On Fri, May 5, 2017 at 10:02 PM, Satish Balay wrote: > > With Intel MPI - its best to use mpiexec.hydra [and not mpiexec] > > > > So you can do: > > > > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > PETSC_ARCH=arch-linux-cxx-opt MPIEXEC=mpiexec.hydra test > > > > > > [you can also specify --with-mpiexec=mpiexec.hydra at configure time] > > > > Satish > > > > > > On Fri, 5 May 2017, Pham Pham wrote: > > > > > *Hi,* > > > *I can configure now, but fail when testing:* > > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ make > > > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > PETSC_ARCH=arch-linux-cxx-opt > > > test Running test examples to verify correct installation > > > Using PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 and > > > PETSC_ARCH=arch-linux-cxx-opt > > > Possible error running C/C++ src/snes/examples/tutorials/ex19 with 1 > MPI > > > process > > > See http://www.mcs.anl.gov/petsc/documentation/faq.html > > > mpiexec_atlas7-c10: cannot connect to local mpd > (/tmp/mpd2.console_mpepvs); > > > possible causes: > > > 1. no mpd is running on this host > > > 2. an mpd is running but was started without a "console" (-n option) > > > Possible error running C/C++ src/snes/examples/tutorials/ex19 with 2 > MPI > > > processes > > > See http://www.mcs.anl.gov/petsc/documentation/faq.html > > > mpiexec_atlas7-c10: cannot connect to local mpd > (/tmp/mpd2.console_mpepvs); > > > possible causes: > > > 1. no mpd is running on this host > > > 2. an mpd is running but was started without a "console" (-n option) > > > Possible error running Fortran example src/snes/examples/tutorials/ > ex5f > > > with 1 MPI process > > > See http://www.mcs.anl.gov/petsc/documentation/faq.html > > > mpiexec_atlas7-c10: cannot connect to local mpd > (/tmp/mpd2.console_mpepvs); > > > possible causes: > > > 1. no mpd is running on this host > > > 2. an mpd is running but was started without a "console" (-n option) > > > Completed test examples > > > ========================================= > > > Now to evaluate the computer systems you plan use - do: > > > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > > > PETSC_ARCH=arch-linux-cxx-opt streams > > > > > > > > > > > > > > > *Please help on this.* > > > *Many thanks!* > > > > > > > > > On Thu, Apr 20, 2017 at 2:02 AM, Satish Balay > wrote: > > > > > > > Sorry - should have mentioned: > > > > > > > > do 'rm -rf arch-linux-cxx-opt' and rerun configure again. 
-------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon May 22 18:37:15 2017 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 22 May 2017 18:37:15 -0500 Subject: [petsc-users] Installation question In-Reply-To: References: <1B05ACB6-6BDF-42C8-89FB-C5ECC5657934@mcs.anl.gov> Message-ID: <9E32967B-800D-4C73-93D6-9B0A3E78FBCB@mcs.anl.gov> > On May 22, 2017, at 6:00 PM, Pham Pham wrote: > > Hi Barry, > > My code using DMDA, the mesh is partitioned in x-direction only.
Can I have MPI+OpenMP works in the following way: > > I want to create a new communicator which includes processes with Rank%12==0, PETSc objects will be created with this new sub-set of processes. In each node (which has 12 cores), the first core (Rank%12==0) does MPI communicating job (with Rank%12==0 process of other nodes), then commanded other 11 processes do computation works using openMP? You cannot convert an MPI rank process into an OpenMP thread. You would just assign one MPI rank per node and have that one rank do 12 OpenMP threads. > > Thank you. > > On Tue, May 23, 2017 at 1:58 AM, Barry Smith wrote: > > > On May 22, 2017, at 11:25 AM, Pham Pham wrote: > > > > Hi Matt, > > > > For the machine I have, Is it a good idea if I mix MPI and OpenMP: MPI for cores with Rank%12==0 and OpenMP for the others ? > > > > MPI+OpenMP doesn't work this way. Each "rank" is an MPI process, you cannot say some ranks are MPI and some are OpenMP. If you want to use one MPI process per node and have each MPI process have 12 OpenMP threads you need to find out for YOUR systems MPI how you tell it to put one MPI process per node; > > Barry > > > Thank you, > > > > PVS. > > > > On Thu, May 11, 2017 at 8:27 PM, Matthew Knepley wrote: > > On Thu, May 11, 2017 at 7:08 AM, Pham Pham wrote: > > Hi Matt, > > > > Thank you for the reply. > > > > I am using University HPC which has multiple nodes, and should be good for parallel computing. The bad performance might be due to the way I install and run PETSc... > > > > Looking at the output when running streams, I can see that the Processor names were the same. > > Does that mean only one processor involved in computing, did it cause the bad performance? > > > > Yes. From the data, it appears that the kind of processor you have has 12 cores, but only enough memory bandwidth to support 1.5 cores. > > Try running the STREAMS with only 1 process per node. This is a setting in your submission script, but it is different for every cluster. Thus > > I would ask the local sysdamin for this machine to help you do that. You should see almost perfect scaling with that configuration. You might > > also try 2 processes per node to compare. > > > > Thanks, > > > > Matt > > > > Thank you very much. > > > > Ph. 
> > > > Below is testing output: > > > > [mpepvs at atlas5-c01 petsc-3.7.5]$ make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt streams > > cd src/benchmarks/streams; /usr/bin/gmake --no-print-directory PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt streams > > /app1/centos6.3/Intel/xe_2015/impi/5.0.3.048/intel64/bin/mpicxx -o MPIVersion.o c -wd1572 -g -O3 -fPIC -I/home/svu/mpepvs/petsc/petsc-3.7.5/include -I/hom e/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/include -I/app1/centos6.3/Intel/xe_2015/impi/5.0.3.048/intel64/include `pwd`/MPIVersion.c > > +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > > The version of PETSc you are using is out-of-date, we recommend updating to the new release > > Available Version: 3.7.6 Installed Version: 3.7.5 > > http://www.mcs.anl.gov/petsc/download/index.html > > +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > > Running streams with 'mpiexec.hydra ' using 'NPMAX=12' > > Number of MPI processes 1 Processor names atlas5-c01 > > Triad: 11026.7604 Rate (MB/s) > > Number of MPI processes 2 Processor names atlas5-c01 atlas5-c01 > > Triad: 14669.6730 Rate (MB/s) > > Number of MPI processes 3 Processor names atlas5-c01 atlas5-c01 atlas5-c01 > > Triad: 12848.2644 Rate (MB/s) > > Number of MPI processes 4 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > > Triad: 15033.7687 Rate (MB/s) > > Number of MPI processes 5 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > > Triad: 13299.3830 Rate (MB/s) > > Number of MPI processes 6 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > > Triad: 14382.2116 Rate (MB/s) > > Number of MPI processes 7 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > > Triad: 13194.2573 Rate (MB/s) > > Number of MPI processes 8 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > > Triad: 14199.7255 Rate (MB/s) > > Number of MPI processes 9 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > > Triad: 13045.8946 Rate (MB/s) > > Number of MPI processes 10 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > > Triad: 13058.3283 Rate (MB/s) > > Number of MPI processes 11 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > > Triad: 13037.3334 Rate (MB/s) > > Number of MPI processes 12 Processor names atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 > > Triad: 12526.6096 Rate (MB/s) > > ------------------------------------------------ > > np speedup > > 1 1.0 > > 2 1.33 > > 3 1.17 > > 4 1.36 > > 5 1.21 > > 6 1.3 > > 7 1.2 > > 8 1.29 > > 9 1.18 > > 10 1.18 > > 11 1.18 > > 12 1.14 > > Estimation of possible speedup of MPI programs based on Streams benchmark. > > It appears you have 1 node(s) > > See graph in the file src/benchmarks/streams/scaling.png > > > > On Fri, May 5, 2017 at 11:26 PM, Matthew Knepley wrote: > > On Fri, May 5, 2017 at 10:18 AM, Pham Pham wrote: > > Hi Satish, > > > > It runs now, and shows a bad speed up: > > Please help to improve this. 
> > > > http://www.mcs.anl.gov/petsc/documentation/faq.html#computers > > > > The short answer is: You cannot improve this without buying a different machine. This is > > a fundamental algorithmic limitation that cannot be helped by threads, or vectorization, or > > anything else. > > > > Matt > > > > Thank you. > > > > > > ? > > > > On Fri, May 5, 2017 at 10:02 PM, Satish Balay wrote: > > With Intel MPI - its best to use mpiexec.hydra [and not mpiexec] > > > > So you can do: > > > > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt MPIEXEC=mpiexec.hydra test > > > > > > [you can also specify --with-mpiexec=mpiexec.hydra at configure time] > > > > Satish > > > > > > On Fri, 5 May 2017, Pham Pham wrote: > > > > > *Hi,* > > > *I can configure now, but fail when testing:* > > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ make > > > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 PETSC_ARCH=arch-linux-cxx-opt > > > test Running test examples to verify correct installation > > > Using PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 and > > > PETSC_ARCH=arch-linux-cxx-opt > > > Possible error running C/C++ src/snes/examples/tutorials/ex19 with 1 MPI > > > process > > > See http://www.mcs.anl.gov/petsc/documentation/faq.html > > > mpiexec_atlas7-c10: cannot connect to local mpd (/tmp/mpd2.console_mpepvs); > > > possible causes: > > > 1. no mpd is running on this host > > > 2. an mpd is running but was started without a "console" (-n option) > > > Possible error running C/C++ src/snes/examples/tutorials/ex19 with 2 MPI > > > processes > > > See http://www.mcs.anl.gov/petsc/documentation/faq.html > > > mpiexec_atlas7-c10: cannot connect to local mpd (/tmp/mpd2.console_mpepvs); > > > possible causes: > > > 1. no mpd is running on this host > > > 2. an mpd is running but was started without a "console" (-n option) > > > Possible error running Fortran example src/snes/examples/tutorials/ex5f > > > with 1 MPI process > > > See http://www.mcs.anl.gov/petsc/documentation/faq.html > > > mpiexec_atlas7-c10: cannot connect to local mpd (/tmp/mpd2.console_mpepvs); > > > possible causes: > > > 1. no mpd is running on this host > > > 2. an mpd is running but was started without a "console" (-n option) > > > Completed test examples > > > ========================================= > > > Now to evaluate the computer systems you plan use - do: > > > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > > > PETSC_ARCH=arch-linux-cxx-opt streams > > > > > > > > > > > > > > > *Please help on this.* > > > *Many thanks!* > > > > > > > > > On Thu, Apr 20, 2017 at 2:02 AM, Satish Balay wrote: > > > > > > > Sorry - should have mentioned: > > > > > > > > do 'rm -rf arch-linux-cxx-opt' and rerun configure again. > > > > > > > > The mpich install from previous build [that is currently in > > > > arch-linux-cxx-opt/] > > > > is conflicting with --with-mpi-dir=/app1/centos6.3/gnu/mvapich2-1.9/ > > > > > > > > Satish > > > > > > > > > > > > On Wed, 19 Apr 2017, Pham Pham wrote: > > > > > > > > > I reconfigured PETSs with installed MPI, however, I got serous error: > > > > > > > > > > **************************ERROR************************************* > > > > > Error during compile, check arch-linux-cxx-opt/lib/petsc/conf/make.log > > > > > Send it and arch-linux-cxx-opt/lib/petsc/conf/configure.log to > > > > > petsc-maint at mcs.anl.gov > > > > > ******************************************************************** > > > > > > > > > > Please explain what is happening? > > > > > > > > > > Thank you very much. 
> > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Apr 19, 2017 at 11:43 PM, Satish Balay > > > > wrote: > > > > > > > > > > > Presumably your cluster already has a recommended MPI to use [which is > > > > > > already installed. So you should use that - instead of > > > > > > --download-mpich=1 > > > > > > > > > > > > Satish > > > > > > > > > > > > On Wed, 19 Apr 2017, Pham Pham wrote: > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > I just installed petsc-3.7.5 into my university cluster. When > > > > evaluating > > > > > > > the computer system, PETSc reports "It appears you have 1 node(s)", I > > > > > > donot > > > > > > > understand this, since the system is a multinodes system. Could you > > > > > > please > > > > > > > explain this to me? > > > > > > > > > > > > > > Thank you very much. > > > > > > > > > > > > > > S. > > > > > > > > > > > > > > Output: > > > > > > > ========================================= > > > > > > > Now to evaluate the computer systems you plan use - do: > > > > > > > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > > > > > > > PETSC_ARCH=arch-linux-cxx-opt streams > > > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ make > > > > > > > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > > > > > > PETSC_ARCH=arch-linux-cxx-opt > > > > > > > streams > > > > > > > cd src/benchmarks/streams; /usr/bin/gmake --no-print-directory > > > > > > > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 > > > > > > PETSC_ARCH=arch-linux-cxx-opt > > > > > > > streams > > > > > > > /home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/bin/mpicxx -o > > > > > > > MPIVersion.o -c -Wall -Wwrite-strings -Wno-strict-aliasing > > > > > > > -Wno-unknown-pragmas -fvisibility=hidden -g -O > > > > > > > -I/home/svu/mpepvs/petsc/petsc-3.7.5/include > > > > > > > -I/home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/include > > > > > > > `pwd`/MPIVersion.c > > > > > > > Running streams with > > > > > > > '/home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/bin/mpiexec ' > > > > > > using > > > > > > > 'NPMAX=12' > > > > > > > Number of MPI processes 1 Processor names atlas7-c10 > > > > > > > Triad: 9137.5025 Rate (MB/s) > > > > > > > Number of MPI processes 2 Processor names atlas7-c10 atlas7-c10 > > > > > > > Triad: 9707.2815 Rate (MB/s) > > > > > > > Number of MPI processes 3 Processor names atlas7-c10 atlas7-c10 > > > > > > atlas7-c10 > > > > > > > Triad: 13559.5275 Rate (MB/s) > > > > > > > Number of MPI processes 4 Processor names atlas7-c10 atlas7-c10 > > > > > > atlas7-c10 > > > > > > > atlas7-c10 > > > > > > > Triad: 14193.0597 Rate (MB/s) > > > > > > > Number of MPI processes 5 Processor names atlas7-c10 atlas7-c10 > > > > > > atlas7-c10 > > > > > > > atlas7-c10 atlas7-c10 > > > > > > > Triad: 14492.9234 Rate (MB/s) > > > > > > > Number of MPI processes 6 Processor names atlas7-c10 atlas7-c10 > > > > > > atlas7-c10 > > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 > > > > > > > Triad: 15476.5912 Rate (MB/s) > > > > > > > Number of MPI processes 7 Processor names atlas7-c10 atlas7-c10 > > > > > > atlas7-c10 > > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > > > > Triad: 15148.7388 Rate (MB/s) > > > > > > > Number of MPI processes 8 Processor names atlas7-c10 atlas7-c10 > > > > > > atlas7-c10 > > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > > > > Triad: 15799.1290 Rate (MB/s) > > > > > > > Number of MPI processes 9 Processor names atlas7-c10 atlas7-c10 > > > > > > atlas7-c10 > > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 
atlas7-c10 atlas7-c10 > > > > > > > Triad: 15671.3104 Rate (MB/s) > > > > > > > Number of MPI processes 10 Processor names atlas7-c10 atlas7-c10 > > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > > > > atlas7-c10 atlas7-c10 > > > > > > > Triad: 15601.4754 Rate (MB/s) > > > > > > > Number of MPI processes 11 Processor names atlas7-c10 atlas7-c10 > > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 > > > > > > > Triad: 15434.5790 Rate (MB/s) > > > > > > > Number of MPI processes 12 Processor names atlas7-c10 atlas7-c10 > > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 > > > > > > > Triad: 15134.1263 Rate (MB/s) > > > > > > > ------------------------------------------------ > > > > > > > np speedup > > > > > > > 1 1.0 > > > > > > > 2 1.06 > > > > > > > 3 1.48 > > > > > > > 4 1.55 > > > > > > > 5 1.59 > > > > > > > 6 1.69 > > > > > > > 7 1.66 > > > > > > > 8 1.73 > > > > > > > 9 1.72 > > > > > > > 10 1.71 > > > > > > > 11 1.69 > > > > > > > 12 1.66 > > > > > > > Estimation of possible speedup of MPI programs based on Streams > > > > > > benchmark. > > > > > > > It appears you have 1 node(s) > > > > > > > Unable to plot speedup to a file > > > > > > > Unable to open matplotlib to plot speedup > > > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ > > > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > From knepley at gmail.com Mon May 22 18:38:12 2017 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 22 May 2017 18:38:12 -0500 Subject: [petsc-users] Installation question In-Reply-To: References: <1B05ACB6-6BDF-42C8-89FB-C5ECC5657934@mcs.anl.gov> Message-ID: On Mon, May 22, 2017 at 6:00 PM, Pham Pham wrote: > Hi Barry, > > My code using DMDA, the mesh is partitioned in x-direction only. Can I > have MPI+OpenMP works in the following way: > > I want to create a new communicator which includes processes with > Rank%12==0, PETSc objects will be created with this new sub-set of > processes. In each node (which has 12 cores), the first core (Rank%12==0) > does MPI communicating job (with Rank%12==0 process of other nodes), then > commanded other 11 processes do computation works using openMP? > But this is not a sensible thing. The performance here is not dependent on processing, its dependent on memory bandwidth, but you do not increase that with more threads. In addition, the overhead of threads here is just as big or bigger than processes, so you would be better off just running that many MPI processes. Matt > Thank you. > > On Tue, May 23, 2017 at 1:58 AM, Barry Smith wrote: > >> >> > On May 22, 2017, at 11:25 AM, Pham Pham wrote: >> > >> > Hi Matt, >> > >> > For the machine I have, Is it a good idea if I mix MPI and OpenMP: MPI >> for cores with Rank%12==0 and OpenMP for the others ? >> > >> >> MPI+OpenMP doesn't work this way. Each "rank" is an MPI process, you >> cannot say some ranks are MPI and some are OpenMP. 
If you want to use one >> MPI process per node and have each MPI process have 12 OpenMP threads you >> need to find out for YOUR systems MPI how you tell it to put one MPI >> process per node; >> >> Barry >> >> > Thank you, >> > >> > PVS. >> > >> > On Thu, May 11, 2017 at 8:27 PM, Matthew Knepley >> wrote: >> > On Thu, May 11, 2017 at 7:08 AM, Pham Pham wrote: >> > Hi Matt, >> > >> > Thank you for the reply. >> > >> > I am using University HPC which has multiple nodes, and should be good >> for parallel computing. The bad performance might be due to the way I >> install and run PETSc... >> > >> > Looking at the output when running streams, I can see that the >> Processor names were the same. >> > Does that mean only one processor involved in computing, did it cause >> the bad performance? >> > >> > Yes. From the data, it appears that the kind of processor you have has >> 12 cores, but only enough memory bandwidth to support 1.5 cores. >> > Try running the STREAMS with only 1 process per node. This is a setting >> in your submission script, but it is different for every cluster. Thus >> > I would ask the local sysdamin for this machine to help you do that. >> You should see almost perfect scaling with that configuration. You might >> > also try 2 processes per node to compare. >> > >> > Thanks, >> > >> > Matt >> > >> > Thank you very much. >> > >> > Ph. >> > >> > Below is testing output: >> > >> > [mpepvs at atlas5-c01 petsc-3.7.5]$ make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >> PETSC_ARCH=arch-linux-cxx-opt streams >> > cd src/benchmarks/streams; /usr/bin/gmake --no-print-directory >> PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >> PETSC_ARCH=arch-linux-cxx-opt streams >> > /app1/centos6.3/Intel/xe_2015/impi/5.0.3.048/intel64/bin/mpicxx -o >> MPIVersion.o c -wd1572 -g -O3 -fPIC -I/home/svu/mpepvs/petsc/petsc-3.7.5/include >> -I/hom >> >> e/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/include >> -I/app1/centos6.3/Intel/xe_2015/impi/5.0.3.048/intel64/include >> `pwd`/MPIVersion.c >> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ >> +++++++++++++++++++++++++++++++ >> > The version of PETSc you are using is out-of-date, we recommend >> updating to the new release >> > Available Version: 3.7.6 Installed Version: 3.7.5 >> > http://www.mcs.anl.gov/petsc/download/index.html >> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ >> +++++++++++++++++++++++++++++++ >> > Running streams with 'mpiexec.hydra ' using 'NPMAX=12' >> > Number of MPI processes 1 Processor names atlas5-c01 >> > Triad: 11026.7604 Rate (MB/s) >> > Number of MPI processes 2 Processor names atlas5-c01 atlas5-c01 >> > Triad: 14669.6730 Rate (MB/s) >> > Number of MPI processes 3 Processor names atlas5-c01 atlas5-c01 >> atlas5-c01 >> > Triad: 12848.2644 Rate (MB/s) >> > Number of MPI processes 4 Processor names atlas5-c01 atlas5-c01 >> atlas5-c01 atlas5-c01 >> > Triad: 15033.7687 Rate (MB/s) >> > Number of MPI processes 5 Processor names atlas5-c01 atlas5-c01 >> atlas5-c01 atlas5-c01 atlas5-c01 >> > Triad: 13299.3830 Rate (MB/s) >> > Number of MPI processes 6 Processor names atlas5-c01 atlas5-c01 >> atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 >> > Triad: 14382.2116 Rate (MB/s) >> > Number of MPI processes 7 Processor names atlas5-c01 atlas5-c01 >> atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 >> > Triad: 13194.2573 Rate (MB/s) >> > Number of MPI processes 8 Processor names atlas5-c01 atlas5-c01 >> atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 >> > Triad: 14199.7255 
Rate (MB/s) >> > Number of MPI processes 9 Processor names atlas5-c01 atlas5-c01 >> atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 >> > Triad: 13045.8946 Rate (MB/s) >> > Number of MPI processes 10 Processor names atlas5-c01 atlas5-c01 >> atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 >> atlas5-c01 atlas5-c01 >> > Triad: 13058.3283 Rate (MB/s) >> > Number of MPI processes 11 Processor names atlas5-c01 atlas5-c01 >> atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 >> atlas5-c01 atlas5-c01 atlas5-c01 >> > Triad: 13037.3334 Rate (MB/s) >> > Number of MPI processes 12 Processor names atlas5-c01 atlas5-c01 >> atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 >> atlas5-c01 atlas5-c01 atlas5-c01 atlas5-c01 >> > Triad: 12526.6096 Rate (MB/s) >> > ------------------------------------------------ >> > np speedup >> > 1 1.0 >> > 2 1.33 >> > 3 1.17 >> > 4 1.36 >> > 5 1.21 >> > 6 1.3 >> > 7 1.2 >> > 8 1.29 >> > 9 1.18 >> > 10 1.18 >> > 11 1.18 >> > 12 1.14 >> > Estimation of possible speedup of MPI programs based on Streams >> benchmark. >> > It appears you have 1 node(s) >> > See graph in the file src/benchmarks/streams/scaling.png >> > >> > On Fri, May 5, 2017 at 11:26 PM, Matthew Knepley >> wrote: >> > On Fri, May 5, 2017 at 10:18 AM, Pham Pham wrote: >> > Hi Satish, >> > >> > It runs now, and shows a bad speed up: >> > Please help to improve this. >> > >> > http://www.mcs.anl.gov/petsc/documentation/faq.html#computers >> > >> > The short answer is: You cannot improve this without buying a different >> machine. This is >> > a fundamental algorithmic limitation that cannot be helped by threads, >> or vectorization, or >> > anything else. >> > >> > Matt >> > >> > Thank you. >> > >> > >> > ? >> > >> > On Fri, May 5, 2017 at 10:02 PM, Satish Balay >> wrote: >> > With Intel MPI - its best to use mpiexec.hydra [and not mpiexec] >> > >> > So you can do: >> > >> > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >> PETSC_ARCH=arch-linux-cxx-opt MPIEXEC=mpiexec.hydra test >> > >> > >> > [you can also specify --with-mpiexec=mpiexec.hydra at configure time] >> > >> > Satish >> > >> > >> > On Fri, 5 May 2017, Pham Pham wrote: >> > >> > > *Hi,* >> > > *I can configure now, but fail when testing:* >> > > >> > > [mpepvs at atlas7-c10 petsc-3.7.5]$ make >> > > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >> PETSC_ARCH=arch-linux-cxx-opt >> > > test Running test examples to verify correct installation >> > > Using PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 and >> > > PETSC_ARCH=arch-linux-cxx-opt >> > > Possible error running C/C++ src/snes/examples/tutorials/ex19 with 1 >> MPI >> > > process >> > > See http://www.mcs.anl.gov/petsc/documentation/faq.html >> > > mpiexec_atlas7-c10: cannot connect to local mpd >> (/tmp/mpd2.console_mpepvs); >> > > possible causes: >> > > 1. no mpd is running on this host >> > > 2. an mpd is running but was started without a "console" (-n option) >> > > Possible error running C/C++ src/snes/examples/tutorials/ex19 with 2 >> MPI >> > > processes >> > > See http://www.mcs.anl.gov/petsc/documentation/faq.html >> > > mpiexec_atlas7-c10: cannot connect to local mpd >> (/tmp/mpd2.console_mpepvs); >> > > possible causes: >> > > 1. no mpd is running on this host >> > > 2. 
an mpd is running but was started without a "console" (-n option) >> > > Possible error running Fortran example src/snes/examples/tutorials/ex >> 5f >> > > with 1 MPI process >> > > See http://www.mcs.anl.gov/petsc/documentation/faq.html >> > > mpiexec_atlas7-c10: cannot connect to local mpd >> (/tmp/mpd2.console_mpepvs); >> > > possible causes: >> > > 1. no mpd is running on this host >> > > 2. an mpd is running but was started without a "console" (-n option) >> > > Completed test examples >> > > ========================================= >> > > Now to evaluate the computer systems you plan use - do: >> > > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >> > > PETSC_ARCH=arch-linux-cxx-opt streams >> > > >> > > >> > > >> > > >> > > *Please help on this.* >> > > *Many thanks!* >> > > >> > > >> > > On Thu, Apr 20, 2017 at 2:02 AM, Satish Balay >> wrote: >> > > >> > > > Sorry - should have mentioned: >> > > > >> > > > do 'rm -rf arch-linux-cxx-opt' and rerun configure again. >> > > > >> > > > The mpich install from previous build [that is currently in >> > > > arch-linux-cxx-opt/] >> > > > is conflicting with --with-mpi-dir=/app1/centos6.3 >> /gnu/mvapich2-1.9/ >> > > > >> > > > Satish >> > > > >> > > > >> > > > On Wed, 19 Apr 2017, Pham Pham wrote: >> > > > >> > > > > I reconfigured PETSs with installed MPI, however, I got serous >> error: >> > > > > >> > > > > **************************ERROR***************************** >> ******** >> > > > > Error during compile, check arch-linux-cxx-opt/lib/petsc/c >> onf/make.log >> > > > > Send it and arch-linux-cxx-opt/lib/petsc/conf/configure.log to >> > > > > petsc-maint at mcs.anl.gov >> > > > > ************************************************************ >> ******** >> > > > > >> > > > > Please explain what is happening? >> > > > > >> > > > > Thank you very much. >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > On Wed, Apr 19, 2017 at 11:43 PM, Satish Balay > > >> > > > wrote: >> > > > > >> > > > > > Presumably your cluster already has a recommended MPI to use >> [which is >> > > > > > already installed. So you should use that - instead of >> > > > > > --download-mpich=1 >> > > > > > >> > > > > > Satish >> > > > > > >> > > > > > On Wed, 19 Apr 2017, Pham Pham wrote: >> > > > > > >> > > > > > > Hi, >> > > > > > > >> > > > > > > I just installed petsc-3.7.5 into my university cluster. When >> > > > evaluating >> > > > > > > the computer system, PETSc reports "It appears you have 1 >> node(s)", I >> > > > > > donot >> > > > > > > understand this, since the system is a multinodes system. >> Could you >> > > > > > please >> > > > > > > explain this to me? >> > > > > > > >> > > > > > > Thank you very much. >> > > > > > > >> > > > > > > S. 
>> > > > > > > >> > > > > > > Output: >> > > > > > > ========================================= >> > > > > > > Now to evaluate the computer systems you plan use - do: >> > > > > > > make PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >> > > > > > > PETSC_ARCH=arch-linux-cxx-opt streams >> > > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ make >> > > > > > > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >> > > > > > PETSC_ARCH=arch-linux-cxx-opt >> > > > > > > streams >> > > > > > > cd src/benchmarks/streams; /usr/bin/gmake >> --no-print-directory >> > > > > > > PETSC_DIR=/home/svu/mpepvs/petsc/petsc-3.7.5 >> > > > > > PETSC_ARCH=arch-linux-cxx-opt >> > > > > > > streams >> > > > > > > /home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/bin/mpicxx >> -o >> > > > > > > MPIVersion.o -c -Wall -Wwrite-strings -Wno-strict-aliasing >> > > > > > > -Wno-unknown-pragmas -fvisibility=hidden -g -O >> > > > > > > -I/home/svu/mpepvs/petsc/petsc-3.7.5/include >> > > > > > > -I/home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/ >> include >> > > > > > > `pwd`/MPIVersion.c >> > > > > > > Running streams with >> > > > > > > '/home/svu/mpepvs/petsc/petsc-3.7.5/arch-linux-cxx-opt/bin/mpiexec >> ' >> > > > > > using >> > > > > > > 'NPMAX=12' >> > > > > > > Number of MPI processes 1 Processor names atlas7-c10 >> > > > > > > Triad: 9137.5025 Rate (MB/s) >> > > > > > > Number of MPI processes 2 Processor names atlas7-c10 >> atlas7-c10 >> > > > > > > Triad: 9707.2815 Rate (MB/s) >> > > > > > > Number of MPI processes 3 Processor names atlas7-c10 >> atlas7-c10 >> > > > > > atlas7-c10 >> > > > > > > Triad: 13559.5275 Rate (MB/s) >> > > > > > > Number of MPI processes 4 Processor names atlas7-c10 >> atlas7-c10 >> > > > > > atlas7-c10 >> > > > > > > atlas7-c10 >> > > > > > > Triad: 14193.0597 Rate (MB/s) >> > > > > > > Number of MPI processes 5 Processor names atlas7-c10 >> atlas7-c10 >> > > > > > atlas7-c10 >> > > > > > > atlas7-c10 atlas7-c10 >> > > > > > > Triad: 14492.9234 Rate (MB/s) >> > > > > > > Number of MPI processes 6 Processor names atlas7-c10 >> atlas7-c10 >> > > > > > atlas7-c10 >> > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 >> > > > > > > Triad: 15476.5912 Rate (MB/s) >> > > > > > > Number of MPI processes 7 Processor names atlas7-c10 >> atlas7-c10 >> > > > > > atlas7-c10 >> > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >> > > > > > > Triad: 15148.7388 Rate (MB/s) >> > > > > > > Number of MPI processes 8 Processor names atlas7-c10 >> atlas7-c10 >> > > > > > atlas7-c10 >> > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >> > > > > > > Triad: 15799.1290 Rate (MB/s) >> > > > > > > Number of MPI processes 9 Processor names atlas7-c10 >> atlas7-c10 >> > > > > > atlas7-c10 >> > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >> atlas7-c10 >> > > > > > > Triad: 15671.3104 Rate (MB/s) >> > > > > > > Number of MPI processes 10 Processor names atlas7-c10 >> atlas7-c10 >> > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >> atlas7-c10 >> > > > > > > atlas7-c10 atlas7-c10 >> > > > > > > Triad: 15601.4754 Rate (MB/s) >> > > > > > > Number of MPI processes 11 Processor names atlas7-c10 >> atlas7-c10 >> > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >> atlas7-c10 >> > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 >> > > > > > > Triad: 15434.5790 Rate (MB/s) >> > > > > > > Number of MPI processes 12 Processor names atlas7-c10 >> atlas7-c10 >> > > > > > > atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >> atlas7-c10 >> > > > > > > 
atlas7-c10 atlas7-c10 atlas7-c10 atlas7-c10 >> > > > > > > Triad: 15134.1263 Rate (MB/s) >> > > > > > > ------------------------------------------------ >> > > > > > > np speedup >> > > > > > > 1 1.0 >> > > > > > > 2 1.06 >> > > > > > > 3 1.48 >> > > > > > > 4 1.55 >> > > > > > > 5 1.59 >> > > > > > > 6 1.69 >> > > > > > > 7 1.66 >> > > > > > > 8 1.73 >> > > > > > > 9 1.72 >> > > > > > > 10 1.71 >> > > > > > > 11 1.69 >> > > > > > > 12 1.66 >> > > > > > > Estimation of possible speedup of MPI programs based on >> Streams >> > > > > > benchmark. >> > > > > > > It appears you have 1 node(s) >> > > > > > > Unable to plot speedup to a file >> > > > > > > Unable to open matplotlib to plot speedup >> > > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ >> > > > > > > [mpepvs at atlas7-c10 petsc-3.7.5]$ >> > > > > > > >> > > > > > >> > > > > > >> > > > > >> > > > >> > > > >> > > >> > >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> > >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From franck.houssen at inria.fr Tue May 23 04:41:32 2017 From: franck.houssen at inria.fr (Franck Houssen) Date: Tue, 23 May 2017 11:41:32 +0200 (CEST) Subject: [petsc-users] [petsc-dev] Using PETSc MatIS, how to get local matrix (= one domain) before and after assembly ? In-Reply-To: References: <867421313.6757137.1495383596545.JavaMail.zimbra@inria.fr> <1253777447.6757298.1495383780337.JavaMail.zimbra@inria.fr> Message-ID: <2033509705.7414108.1495532492501.JavaMail.zimbra@inria.fr> I have a 3x3 global matrix made of two overlapping 2x2 local matrix (= diagonal with 1.). Each local matrix correspond to one domain (each domain is delegated to one MPI proc, so, I have 2 MPI procs because I have 2 domains). This is the simplest possible example: I have two 2x2 (local) diag matrix that overlap so that the global matrix built from them is 1, 2, 1 on the diagonal (local contributions add up in the middle). Now, I need for each MPI proc to get the assembled local matrix (sometimes called the dirichlet matrix) : this is a local matrix (sequential - not distributed with MPI) that accounts for contribution of neighboring domains (MPI proc). How to get the local assembled matrix ? MatGetLocalSubMatrix does not work (throw error - see example attached). MatGetSubMatrix returns a MPI distributed matrix, not a local (sequential) one. 1. My understanding is that MatISGetMPIXAIJ should return a local matrix (sequential AIJ matrix) : the MPI in the name recall that you get the assembled matrix (with contributions from the shared border) from the other MPI processus. Correct ? In my simple example, I replaced MatGetLocalSubMatrix with MatISGetMPIXAIJ : I get a deadlock which was surprising to me... Is MatISGetMPIXAIJ a collective call ? 2. 
Supposing this is a collective call (and that point 1 is not correct), I ride up MatISGetMPIXAIJ before the "if (rank > 0)" : I don't deadlock now, but it seems I get a global matrix which is not the assembled local matrix I am looking for. 3. I am supposed to destroy the matrix returned by MatISGetMPIXAIJ ? (I believe yes - not sure as AFAIU wording should associate Destroy methods to Create methods) Franck The git diff illustrate modifications I tried to add to the initial file attached to this thread: --- a/matISLocalMat.cpp +++ b/matISLocalMat.cpp @@ -31,6 +31,8 @@ int main(int argc,char **argv) { MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY); MatView(A, PETSC_VIEWER_STDOUT_WORLD); PetscViewerFlush(PETSC_VIEWER_STDOUT_WORLD); // Diag: 1, 2, 1 + Mat assembledLocalMat; + MatISGetMPIXAIJ(A, MAT_INITIAL_MATRIX, &assembledLocalMat); if (rank > 0) { // Do not pollute stdout: print only 1 proc std::cout << std::endl << "non assembled local matrix:" << std::endl << std::endl; Mat nonAssembledLocalMat; @@ -38,11 +40,10 @@ int main(int argc,char **argv) { MatView(nonAssembledLocalMat, PETSC_VIEWER_STDOUT_SELF); // Diag: 1, 1 std::cout << std::endl << "assembled local matrix:" << std::endl << std::endl; - Mat assembledLocalMat; - IS is; ISCreateGeneral(PETSC_COMM_SELF, localSize, localIdx, PETSC_COPY_VALUES, &is); - MatGetLocalSubMatrix(A, is, is, &assembledLocalMat); // KO ?!... - MatView(assembledLocalMat, PETSC_VIEWER_STDOUT_SELF); // Would like to get => Diag: 2, 1 + //IS is; ISCreateGeneral(PETSC_COMM_SELF, localSize, localIdx, PETSC_COPY_VALUES, &is); + //MatGetLocalSubMatrix(A, is, is, &assembledLocalMat); // KO ?!... } + MatView(assembledLocalMat, PETSC_VIEWER_STDOUT_WORLD); // Would like to get => Diag: 2, 1 ----- Mail original ----- > De: "Stefano Zampini" > ?: "petsc-maint" > Cc: "petsc-dev" , "PETSc users list" > , "Franck Houssen" > Envoy?: Dimanche 21 Mai 2017 22:51:34 > Objet: Re: [petsc-dev] Using PETSc MatIS, how to get local matrix (= one > domain) before and after assembly ? > To assemble the operator in aij format, use > MatISGetMPIXAIJ > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatISGetMPIXAIJ.html > Il 21 Mag 2017 18:43, "Matthew Knepley" < knepley at gmail.com > ha scritto: > > On Sun, May 21, 2017 at 11:23 AM, Franck Houssen < franck.houssen at inria.fr > > > > > wrote: > > > > I have a 3x3 global matrix is built (diag: 1, 2, 1): it's made of 2 > > > overlapping 2x2 local matrix (diag: 1, 1). > > > > > > Getting non assembled local matrix is OK with MatISGetLocalMat. > > > > > > How to get assembled local matrix (initial local matrix + neigbhor > > > contributions on the borders) ? (expected result is diag: 2, 1) > > > > > You can always use > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetSubMatrix.html > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetSubMatrices.html > > > to get copies, but if you just want to build things, you can use > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetLocalSubMatrix.html > > > Thanks, > > > Matt > > > > Franck > > > > > -- > > > What most experimenters take for granted before they begin their > > experiments > > is infinitely more interesting than any results to which their experiments > > lead. > > > -- Norbert Wiener > > > http://www.caam.rice.edu/~mk51/ > -------------- next part -------------- An HTML attachment was scrubbed... 
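A minimal sketch of the approach suggested in the replies further down this thread: assemble the MatIS into AIJ format with MatISGetMPIXAIJ, then extract each rank's rows and columns as a sequential matrix; here the PETSc 3.7 routine MatGetSubMatrices (linked in the quoted reply above) is used for the extraction. This is illustrative only, not the attached matISLocalMat.cpp: it assumes the PETSc 3.7 calling sequences, omits error checking, and reuses A, localSize and localIdx from the example above (size 2, global indices {0,1} on rank 0 and {1,2} on rank 1). Both routines are collective, so every rank calls them, each requesting its own subdomain indices:

    Mat Aglobal;                     /* assembled (MPI)AIJ version of the MatIS A             */
    MatISGetMPIXAIJ(A, MAT_INITIAL_MATRIX, &Aglobal);

    IS  is;                          /* this rank's subdomain, in global numbering            */
    Mat *Alocal;                     /* array holding one sequential, assembled submatrix     */
    ISCreateGeneral(PETSC_COMM_SELF, localSize, localIdx, PETSC_COPY_VALUES, &is);
    MatGetSubMatrices(Aglobal, 1, &is, &is, MAT_INITIAL_MATRIX, &Alocal);

    MatView(Alocal[0], PETSC_VIEWER_STDOUT_SELF);   /* rank 1 would see diag 2, 1             */

    MatDestroyMatrices(1, &Alocal);
    ISDestroy(&is);
    MatDestroy(&Aglobal);            /* the matrix returned by MatISGetMPIXAIJ is caller-owned */

The deadlock reported above is consistent with the collective MatISGetMPIXAIJ being reached only inside the "if (rank > 0)" branch.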
URL: From franck.houssen at inria.fr Tue May 23 04:53:18 2017 From: franck.houssen at inria.fr (Franck Houssen) Date: Tue, 23 May 2017 11:53:18 +0200 (CEST) Subject: [petsc-users] [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix and a global vector ? In-Reply-To: <264DC59D-B914-42E5-9A89-0746F21A37BF@gmail.com> References: <2012394521.6757315.1495383841678.JavaMail.zimbra@inria.fr> <1564257107.6757440.1495383974167.JavaMail.zimbra@inria.fr> <264DC59D-B914-42E5-9A89-0746F21A37BF@gmail.com> Message-ID: <1392596904.7422896.1495533198072.JavaMail.zimbra@inria.fr> The first thing I did was to put 3, not 4 : I got an error thrown in MatCreateIS (see the git diff + stack below). As the error said I used globalSize = numberOfMPIProcessus * localSize : my understanding is that, when using MatIS, the global size needs to be the sum of all local sizes. Correct ? I have a 3x3 global matrix made of two overlapping 2x2 local matrix (= diagonal with 1.). Each local matrix correspond to one domain (each domain is delegated to one MPI proc, so, I have 2 MPI procs because I have 2 domains). This is the simplest possible example: I have two 2x2 (local) diag matrix that overlap so that the global matrix built from them is 1, 2, 1 on the diagonal (local contributions add up in the middle). I need to MatMult this global matrix with a global vector filled with 1. Franck Git diff : --- a/matISLocalMat.cpp +++ b/matISLocalMat.cpp @@ -16,7 +16,7 @@ int main(int argc,char **argv) { int size = 0; MPI_Comm_size(MPI_COMM_WORLD, &size); if (size != 2) return 1; int rank = 0; MPI_Comm_rank(MPI_COMM_WORLD, &rank); - PetscInt localSize = 2, globalSize = localSize*2 /*2 MPI*/; + PetscInt localSize = 2, globalSize = 3; PetscInt localIdx[2] = {0, 0}; if (rank == 0) {localIdx[0] = 0; localIdx[1] = 1;} else {localIdx[0] = 1; localIdx[1] = 2;} Stack error: [0]PETSC ERROR: Nonconforming object sizes [0]PETSC ERROR: Sum of local lengths 4 does not equal global length 3, my local length 2 [0]PETSC ERROR: [0] ISG2LMapApply line 17 /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/vec/is/utils/isltog.c [0]PETSC ERROR: [0] MatSetValues_IS line 692 /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c [0]PETSC ERROR: [0] MatSetValues line 1157 /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/interface/matrix.c [0]PETSC ERROR: [0] MatISSetPreallocation_IS line 95 /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c [0]PETSC ERROR: [0] MatISSetPreallocation line 80 /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c [0]PETSC ERROR: [0] PetscSplitOwnership line 80 /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/sys/utils/psplit.c [0]PETSC ERROR: [0] PetscLayoutSetUp line 129 /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/vec/is/utils/pmap.c [0]PETSC ERROR: [0] MatSetLocalToGlobalMapping_IS line 628 /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c [0]PETSC ERROR: [0] MatSetLocalToGlobalMapping line 1899 /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/interface/matrix.c [0]PETSC ERROR: [0] MatCreateIS line 986 /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c ----- Mail original ----- > De: "Stefano Zampini" > ?: "Matthew Knepley" > Cc: "Franck Houssen" , "PETSc" > , "PETSc" > Envoy?: Dimanche 21 Mai 2017 23:02:37 > Objet: Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix > and a global vector ? > Franck, > PETSc takes care of doing the matrix-vector multiplication properly using > MatIS. 
As Matt said, the layout of the vectors is the usual parallel layout. > The local sizes of the MatIS matrix (i.e. the local size of the left and > right vectors used in MatMult) are not the sizes of the local subdomain > matrices in MatIS. > > On May 21, 2017, at 6:47 PM, Matthew Knepley < knepley at gmail.com > wrote: > > > On Sun, May 21, 2017 at 11:26 AM, Franck Houssen < franck.houssen at inria.fr > > > > > wrote: > > > > Using PETSc MatIS, how to matmult a global IS matrix and a global vector > > > ? > > > Example is attached : I don't get what I expect that is a vector such > > > that > > > proc0 = [1, 2] and proc1 = [2, 1] > > > > > 1) I think the global size of your matrix is wrong. You seem to want 3, not > > 4 > > > 2) Global vectors have a non-overlapping row partition. You might be > > thinking > > of local vectors > > > Thanks, > > > Matt > > > -- > > > What most experimenters take for granted before they begin their > > experiments > > is infinitely more interesting than any results to which their experiments > > lead. > > > -- Norbert Wiener > > > http://www.caam.rice.edu/~mk51/ > ----- Mail original ----- > De: "Stefano Zampini" > ?: "Matthew Knepley" > Cc: "Franck Houssen" , "PETSc" > , "PETSc" > Envoy?: Dimanche 21 Mai 2017 23:02:37 > Objet: Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix > and a global vector ? > Franck, > PETSc takes care of doing the matrix-vector multiplication properly using > MatIS. As Matt said, the layout of the vectors is the usual parallel layout. > The local sizes of the MatIS matrix (i.e. the local size of the left and > right vectors used in MatMult) are not the sizes of the local subdomain > matrices in MatIS. > > On May 21, 2017, at 6:47 PM, Matthew Knepley < knepley at gmail.com > wrote: > > > On Sun, May 21, 2017 at 11:26 AM, Franck Houssen < franck.houssen at inria.fr > > > > > wrote: > > > > Using PETSc MatIS, how to matmult a global IS matrix and a global vector > > > ? > > > Example is attached : I don't get what I expect that is a vector such > > > that > > > proc0 = [1, 2] and proc1 = [2, 1] > > > > > 1) I think the global size of your matrix is wrong. You seem to want 3, not > > 4 > > > 2) Global vectors have a non-overlapping row partition. You might be > > thinking > > of local vectors > > > Thanks, > > > Matt > > > > Franck > > > > > -- > > > What most experimenters take for granted before they begin their > > experiments > > is infinitely more interesting than any results to which their experiments > > lead. > > > -- Norbert Wiener > > > http://www.caam.rice.edu/~mk51/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Tue May 23 06:16:18 2017 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Tue, 23 May 2017 13:16:18 +0200 Subject: [petsc-users] [petsc-dev] Using PETSc MatIS, how to get local matrix (= one domain) before and after assembly ? In-Reply-To: <2033509705.7414108.1495532492501.JavaMail.zimbra@inria.fr> References: <867421313.6757137.1495383596545.JavaMail.zimbra@inria.fr> <1253777447.6757298.1495383780337.JavaMail.zimbra@inria.fr> <2033509705.7414108.1495532492501.JavaMail.zimbra@inria.fr> Message-ID: MatISGetMPIXAIJ is collective, as it assembles the global operator. To get the matrices you are looking for, you should call MatCreateSubMatrix on the assembled global operator, with the global indices representing the subdomain problem. 
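A minimal sketch of how the sizes fit together for the 3x3, two-rank example above, following the replies in this thread (PETSc 3.7 calling sequences, error checking omitted; rank is the MPI rank; this is not the attached test). The overlapping subdomain size (2) is carried only by the local-to-global map, while the matrix and vector local sizes follow the usual non-overlapping parallel layout, so PETSC_DECIDE can be passed to MatCreateIS:

    PetscInt idx[2];
    if (rank == 0) { idx[0] = 0; idx[1] = 1; }   /* subdomain 0 -> global rows {0,1} */
    else           { idx[0] = 1; idx[1] = 2; }   /* subdomain 1 -> global rows {1,2} */

    ISLocalToGlobalMapping map;
    ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD, 1, 2, idx, PETSC_COPY_VALUES, &map);

    Mat A;   /* 3 global rows and columns; local sizes left to PETSc (2 and 1 here)  */
    MatCreateIS(PETSC_COMM_WORLD, 1, PETSC_DECIDE, PETSC_DECIDE, 3, 3, map, &A);
    MatISSetPreallocation(A, 2, NULL, 2, NULL);  /* generous preallocation of the local 2x2 blocks */

    PetscInt    r[2] = {0, 1};                   /* local (subdomain) indices */
    PetscScalar v[4] = {1.0, 0.0, 0.0, 1.0};     /* local 2x2 identity        */
    MatSetValuesLocal(A, 2, r, 2, r, v, ADD_VALUES);
    MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

    Vec x, y;
    MatCreateVecs(A, &x, &y);   /* usual parallel layout: 3 global entries, no overlap  */
    VecSet(x, 1.0);
    MatMult(A, x, y);           /* y = (1, 2, 1), distributed 2/1 over the two ranks    */

As with an assembled AIJ matrix, the two subdomain contributions are summed on the shared global row 1, which is why the middle entry of y is 2.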
Each process needs to call both functions Stefano Il 23 Mag 2017 11:41, "Franck Houssen" ha scritto: > I have a 3x3 global matrix made of two overlapping 2x2 local matrix (= > diagonal with 1.). Each local matrix correspond to one domain (each domain > is delegated to one MPI proc, so, I have 2 MPI procs because I have 2 > domains). > This is the simplest possible example: I have two 2x2 (local) diag matrix > that overlap so that the global matrix built from them is 1, 2, 1 on the > diagonal (local contributions add up in the middle). > > Now, I need for each MPI proc to get the assembled local matrix (sometimes > called the dirichlet matrix) : this is a local matrix (sequential - not > distributed with MPI) that accounts for contribution of neighboring domains > (MPI proc). > > How to get the local assembled matrix ? MatGetLocalSubMatrix does not work > (throw error - see example attached). MatGetSubMatrix returns a MPI > distributed matrix, not a local (sequential) one. > > 1. My understanding is that MatISGetMPIXAIJ should return a local > matrix (sequential AIJ matrix) : the MPI in the name recall that you get > the assembled matrix (with contributions from the shared border) from the > other MPI processus. Correct ? In my simple example, I replaced > MatGetLocalSubMatrix with MatISGetMPIXAIJ : I get a deadlock which was > surprising to me... Is MatISGetMPIXAIJ a collective call ? > 2. Supposing this is a collective call (and that point 1 is not > correct), I ride up MatISGetMPIXAIJ before the "if (rank > 0)" : I don't > deadlock now, but it seems I get a global matrix which is not the assembled > local matrix I am looking for. > 3. I am supposed to destroy the matrix returned by MatISGetMPIXAIJ ? > (I believe yes - not sure as AFAIU wording should associate Destroy methods > to Create methods) > > Franck > > The git diff illustrate modifications I tried to add to the initial file > attached to this thread: > --- a/matISLocalMat.cpp > +++ b/matISLocalMat.cpp > @@ -31,6 +31,8 @@ int main(int argc,char **argv) { > MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); MatAssemblyEnd(A, > MAT_FINAL_ASSEMBLY); > MatView(A, PETSC_VIEWER_STDOUT_WORLD); PetscViewerFlush(PETSC_VIEWER_STDOUT_WORLD); > // Diag: 1, 2, 1 > > + Mat assembledLocalMat; > + MatISGetMPIXAIJ(A, MAT_INITIAL_MATRIX, &assembledLocalMat); > if (rank > 0) { // Do not pollute stdout: print only 1 proc > std::cout << std::endl << "non assembled local matrix:" << std::endl > << std::endl; > Mat nonAssembledLocalMat; > @@ -38,11 +40,10 @@ int main(int argc,char **argv) { > MatView(nonAssembledLocalMat, PETSC_VIEWER_STDOUT_SELF); // Diag: 1, 1 > > std::cout << std::endl << "assembled local matrix:" << std::endl << > std::endl; > - Mat assembledLocalMat; > - IS is; ISCreateGeneral(PETSC_COMM_SELF, localSize, localIdx, > PETSC_COPY_VALUES, &is); > - MatGetLocalSubMatrix(A, is, is, &assembledLocalMat); // KO ?!... > - MatView(assembledLocalMat, PETSC_VIEWER_STDOUT_SELF); // Would like > to get => Diag: 2, 1 > + //IS is; ISCreateGeneral(PETSC_COMM_SELF, localSize, localIdx, > PETSC_COPY_VALUES, &is); > + //MatGetLocalSubMatrix(A, is, is, &assembledLocalMat); // KO ?!... 
> } > + MatView(assembledLocalMat, PETSC_VIEWER_STDOUT_WORLD); // Would like to > get => Diag: 2, 1 > > > ------------------------------ > > *De: *"Stefano Zampini" > *?: *"petsc-maint" > *Cc: *"petsc-dev" , "PETSc users list" < > petsc-users at mcs.anl.gov>, "Franck Houssen" > *Envoy?: *Dimanche 21 Mai 2017 22:51:34 > *Objet: *Re: [petsc-dev] Using PETSc MatIS, how to get local matrix (= > one domain) before and after assembly ? > > To assemble the operator in aij format, use > MatISGetMPIXAIJ > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/ > MatISGetMPIXAIJ.html > > Il 21 Mag 2017 18:43, "Matthew Knepley" ha scritto: > >> On Sun, May 21, 2017 at 11:23 AM, Franck Houssen > > wrote: >> >>> I have a 3x3 global matrix is built (diag: 1, 2, 1): it's made of 2 >>> overlapping 2x2 local matrix (diag: 1, 1). >>> Getting non assembled local matrix is OK with MatISGetLocalMat. >>> How to get assembled local matrix (initial local matrix + neigbhor >>> contributions on the borders) ? (expected result is diag: 2, 1) >>> >> >> You can always use >> >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/ >> MatGetSubMatrix.html >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/ >> MatGetSubMatrices.html >> >> to get copies, but if you just want to build things, you can use >> >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/ >> MatGetLocalSubMatrix.html >> >> Thanks, >> >> Matt >> >> >>> Franck >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> http://www.caam.rice.edu/~mk51/ >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 23 06:21:21 2017 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 23 May 2017 06:21:21 -0500 Subject: [petsc-users] [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix and a global vector ? In-Reply-To: <1392596904.7422896.1495533198072.JavaMail.zimbra@inria.fr> References: <2012394521.6757315.1495383841678.JavaMail.zimbra@inria.fr> <1564257107.6757440.1495383974167.JavaMail.zimbra@inria.fr> <264DC59D-B914-42E5-9A89-0746F21A37BF@gmail.com> <1392596904.7422896.1495533198072.JavaMail.zimbra@inria.fr> Message-ID: On Tue, May 23, 2017 at 4:53 AM, Franck Houssen wrote: > The first thing I did was to put 3, not 4 : I got an error thrown in > MatCreateIS (see the git diff + stack below). As the error said I used > globalSize = numberOfMPIProcessus * localSize : my understanding is that, > when using MatIS, the global size needs to be the sum of all local sizes. > Correct ? > No. MatIS means that the matrix is not assembled. The easiest way (for me) to think of this is that processes do not have to hold full rows. One process can hold part of row i, and another processes can hold another part. However, there are still the same number of global rows. > I have a 3x3 global matrix made of two overlapping 2x2 local matrix (= > diagonal with 1.). Each local matrix correspond to one domain (each domain > is delegated to one MPI proc, so, I have 2 MPI procs because I have 2 > domains). > So the global size is 3. The local size here is not the size of the local IS block, since that is a property only of MatIS. It is the size of the local piece of the vector you multiply. This allows PETSc to understand the parallel layout of the Vec, and how it matched the Mat. 
This is somewhat confusing because FEM people mean something different by "local" than we do here, and in fact we use this other definition of local when assembling operators. Matt > This is the simplest possible example: I have two 2x2 (local) diag matrix > that overlap so that the global matrix built from them is 1, 2, 1 on the > diagonal (local contributions add up in the middle). > I need to MatMult this global matrix with a global vector filled with 1. > > Franck > > Git diff : > > --- a/matISLocalMat.cpp > +++ b/matISLocalMat.cpp > @@ -16,7 +16,7 @@ int main(int argc,char **argv) { > int size = 0; MPI_Comm_size(MPI_COMM_WORLD, &size); if (size != 2) > return 1; > int rank = 0; MPI_Comm_rank(MPI_COMM_WORLD, &rank); > > - PetscInt localSize = 2, globalSize = localSize*2 /*2 MPI*/; > + PetscInt localSize = 2, globalSize = 3; > PetscInt localIdx[2] = {0, 0}; > if (rank == 0) {localIdx[0] = 0; localIdx[1] = 1;} > else {localIdx[0] = 1; localIdx[1] = 2;} > > > > Stack error: > > [0]PETSC ERROR: Nonconforming object sizes > [0]PETSC ERROR: Sum of local lengths 4 does not equal global length 3, my > local length 2 > [0]PETSC ERROR: [0] ISG2LMapApply line 17 /home/fghoussen/Documents/ > INRIA/petsc-3.7.6/src/vec/is/utils/isltog.c > [0]PETSC ERROR: [0] MatSetValues_IS line 692 /home/fghoussen/Documents/ > INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > [0]PETSC ERROR: [0] MatSetValues line 1157 /home/fghoussen/Documents/ > INRIA/petsc-3.7.6/src/mat/interface/matrix.c > [0]PETSC ERROR: [0] MatISSetPreallocation_IS line 95 > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > [0]PETSC ERROR: [0] MatISSetPreallocation line 80 > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > [0]PETSC ERROR: [0] PetscSplitOwnership line 80 /home/fghoussen/Documents/ > INRIA/petsc-3.7.6/src/sys/utils/psplit.c > [0]PETSC ERROR: [0] PetscLayoutSetUp line 129 /home/fghoussen/Documents/ > INRIA/petsc-3.7.6/src/vec/is/utils/pmap.c > [0]PETSC ERROR: [0] MatSetLocalToGlobalMapping_IS line 628 > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > [0]PETSC ERROR: [0] MatSetLocalToGlobalMapping line 1899 > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/interface/matrix.c > [0]PETSC ERROR: [0] MatCreateIS line 986 /home/fghoussen/Documents/ > INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > > > > ------------------------------ > > *De: *"Stefano Zampini" > *?: *"Matthew Knepley" > *Cc: *"Franck Houssen" , "PETSc" < > petsc-users at mcs.anl.gov>, "PETSc" > *Envoy?: *Dimanche 21 Mai 2017 23:02:37 > *Objet: *Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS > matrix and a global vector ? > > Franck, > > PETSc takes care of doing the matrix-vector multiplication properly using > MatIS. As Matt said, the layout of the vectors is the usual parallel > layout. > The local sizes of the MatIS matrix (i.e. the local size of the left and > right vectors used in MatMult) are not the sizes of the local subdomain > matrices in MatIS. > > > On May 21, 2017, at 6:47 PM, Matthew Knepley wrote: > > On Sun, May 21, 2017 at 11:26 AM, Franck Houssen > wrote: > >> Using PETSc MatIS, how to matmult a global IS matrix and a global vector >> ? Example is attached : I don't get what I expect that is a vector such >> that proc0 = [1, 2] and proc1 = [2, 1] >> > > 1) I think the global size of your matrix is wrong. You seem to want 3, > not 4 > > 2) Global vectors have a non-overlapping row partition. 
You might be > thinking of local vectors > > Thanks, > > Matt > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ > > > ------------------------------ > > *De: *"Stefano Zampini" > *?: *"Matthew Knepley" > *Cc: *"Franck Houssen" , "PETSc" < > petsc-users at mcs.anl.gov>, "PETSc" > *Envoy?: *Dimanche 21 Mai 2017 23:02:37 > *Objet: *Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS > matrix and a global vector ? > > Franck, > > PETSc takes care of doing the matrix-vector multiplication properly using > MatIS. As Matt said, the layout of the vectors is the usual parallel > layout. > The local sizes of the MatIS matrix (i.e. the local size of the left and > right vectors used in MatMult) are not the sizes of the local subdomain > matrices in MatIS. > > > On May 21, 2017, at 6:47 PM, Matthew Knepley wrote: > > On Sun, May 21, 2017 at 11:26 AM, Franck Houssen > wrote: > >> Using PETSc MatIS, how to matmult a global IS matrix and a global vector >> ? Example is attached : I don't get what I expect that is a vector such >> that proc0 = [1, 2] and proc1 = [2, 1] >> > > 1) I think the global size of your matrix is wrong. You seem to want 3, > not 4 > > 2) Global vectors have a non-overlapping row partition. You might be > thinking of local vectors > > Thanks, > > Matt > > >> Franck >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Tue May 23 06:23:52 2017 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Tue, 23 May 2017 13:23:52 +0200 Subject: [petsc-users] [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix and a global vector ? In-Reply-To: <1392596904.7422896.1495533198072.JavaMail.zimbra@inria.fr> References: <2012394521.6757315.1495383841678.JavaMail.zimbra@inria.fr> <1564257107.6757440.1495383974167.JavaMail.zimbra@inria.fr> <264DC59D-B914-42E5-9A89-0746F21A37BF@gmail.com> <1392596904.7422896.1495533198072.JavaMail.zimbra@inria.fr> Message-ID: As I said, the local sizes of the MATIS matrix are NOT the sizes of the subdomain problem. As in all PETSc code, the local sizes of the matrix correspond to the local size of the non-overlapping right and left vectors used in matmult operations. You can use PETSC_DECIDE in place of localsize in your call to MatCreateIS. The size of the subdomain problem in MATIS is the local size of the l2g map. Il 23 Mag 2017 11:53, "Franck Houssen" ha scritto: > The first thing I did was to put 3, not 4 : I got an error thrown in > MatCreateIS (see the git diff + stack below). As the error said I used > globalSize = numberOfMPIProcessus * localSize : my understanding is that, > when using MatIS, the global size needs to be the sum of all local sizes. > Correct ? > > I have a 3x3 global matrix made of two overlapping 2x2 local matrix (= > diagonal with 1.). 
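One way to see the two notions of "local" size that Stefano distinguishes above is to query them on a MatIS built as in the sketch earlier in this thread (an untested fragment; A and map as created there):

  PetscInt m, n, nl, ml, nloc;
  Mat      Aloc;

  /* Local size of the non-overlapping parallel layout: what the MatMult vectors use. */
  MatGetLocalSize(A, &m, &n);              /* with PETSC_DECIDE and M = 3: 2 on rank 0, 1 on rank 1 */

  /* Size of the subdomain problem: the local size of the l2g map ... */
  ISLocalToGlobalMappingGetSize(map, &nl); /* 2 on both ranks */

  /* ... which is also the size of the unassembled local matrix. */
  MatISGetLocalMat(A, &Aloc);
  MatGetSize(Aloc, &ml, &nloc);            /* 2 x 2 on both ranks */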
Each local matrix correspond to one domain (each domain > is delegated to one MPI proc, so, I have 2 MPI procs because I have 2 > domains). > This is the simplest possible example: I have two 2x2 (local) diag matrix > that overlap so that the global matrix built from them is 1, 2, 1 on the > diagonal (local contributions add up in the middle). > I need to MatMult this global matrix with a global vector filled with 1. > > Franck > > Git diff : > > --- a/matISLocalMat.cpp > +++ b/matISLocalMat.cpp > @@ -16,7 +16,7 @@ int main(int argc,char **argv) { > int size = 0; MPI_Comm_size(MPI_COMM_WORLD, &size); if (size != 2) > return 1; > int rank = 0; MPI_Comm_rank(MPI_COMM_WORLD, &rank); > > - PetscInt localSize = 2, globalSize = localSize*2 /*2 MPI*/; > + PetscInt localSize = 2, globalSize = 3; > PetscInt localIdx[2] = {0, 0}; > if (rank == 0) {localIdx[0] = 0; localIdx[1] = 1;} > else {localIdx[0] = 1; localIdx[1] = 2;} > > > > Stack error: > > [0]PETSC ERROR: Nonconforming object sizes > [0]PETSC ERROR: Sum of local lengths 4 does not equal global length 3, my > local length 2 > [0]PETSC ERROR: [0] ISG2LMapApply line 17 /home/fghoussen/Documents/ > INRIA/petsc-3.7.6/src/vec/is/utils/isltog.c > [0]PETSC ERROR: [0] MatSetValues_IS line 692 /home/fghoussen/Documents/ > INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > [0]PETSC ERROR: [0] MatSetValues line 1157 /home/fghoussen/Documents/ > INRIA/petsc-3.7.6/src/mat/interface/matrix.c > [0]PETSC ERROR: [0] MatISSetPreallocation_IS line 95 > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > [0]PETSC ERROR: [0] MatISSetPreallocation line 80 > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > [0]PETSC ERROR: [0] PetscSplitOwnership line 80 /home/fghoussen/Documents/ > INRIA/petsc-3.7.6/src/sys/utils/psplit.c > [0]PETSC ERROR: [0] PetscLayoutSetUp line 129 /home/fghoussen/Documents/ > INRIA/petsc-3.7.6/src/vec/is/utils/pmap.c > [0]PETSC ERROR: [0] MatSetLocalToGlobalMapping_IS line 628 > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > [0]PETSC ERROR: [0] MatSetLocalToGlobalMapping line 1899 > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/interface/matrix.c > [0]PETSC ERROR: [0] MatCreateIS line 986 /home/fghoussen/Documents/ > INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > > > > ------------------------------ > > *De: *"Stefano Zampini" > *?: *"Matthew Knepley" > *Cc: *"Franck Houssen" , "PETSc" < > petsc-users at mcs.anl.gov>, "PETSc" > *Envoy?: *Dimanche 21 Mai 2017 23:02:37 > *Objet: *Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS > matrix and a global vector ? > > Franck, > > PETSc takes care of doing the matrix-vector multiplication properly using > MatIS. As Matt said, the layout of the vectors is the usual parallel > layout. > The local sizes of the MatIS matrix (i.e. the local size of the left and > right vectors used in MatMult) are not the sizes of the local subdomain > matrices in MatIS. > > > On May 21, 2017, at 6:47 PM, Matthew Knepley wrote: > > On Sun, May 21, 2017 at 11:26 AM, Franck Houssen > wrote: > >> Using PETSc MatIS, how to matmult a global IS matrix and a global vector >> ? Example is attached : I don't get what I expect that is a vector such >> that proc0 = [1, 2] and proc1 = [2, 1] >> > > 1) I think the global size of your matrix is wrong. You seem to want 3, > not 4 > > 2) Global vectors have a non-overlapping row partition. 
You might be > thinking of local vectors > > Thanks, > > Matt > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ > > > ------------------------------ > > *De: *"Stefano Zampini" > *?: *"Matthew Knepley" > *Cc: *"Franck Houssen" , "PETSc" < > petsc-users at mcs.anl.gov>, "PETSc" > *Envoy?: *Dimanche 21 Mai 2017 23:02:37 > *Objet: *Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS > matrix and a global vector ? > > Franck, > > PETSc takes care of doing the matrix-vector multiplication properly using > MatIS. As Matt said, the layout of the vectors is the usual parallel > layout. > The local sizes of the MatIS matrix (i.e. the local size of the left and > right vectors used in MatMult) are not the sizes of the local subdomain > matrices in MatIS. > > > On May 21, 2017, at 6:47 PM, Matthew Knepley wrote: > > On Sun, May 21, 2017 at 11:26 AM, Franck Houssen > wrote: > >> Using PETSc MatIS, how to matmult a global IS matrix and a global vector >> ? Example is attached : I don't get what I expect that is a vector such >> that proc0 = [1, 2] and proc1 = [2, 1] >> > > 1) I think the global size of your matrix is wrong. You seem to want 3, > not 4 > > 2) Global vectors have a non-overlapping row partition. You might be > thinking of local vectors > > Thanks, > > Matt > > >> Franck >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 23 06:27:14 2017 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 23 May 2017 06:27:14 -0500 Subject: [petsc-users] [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix and a global vector ? In-Reply-To: References: <2012394521.6757315.1495383841678.JavaMail.zimbra@inria.fr> <1564257107.6757440.1495383974167.JavaMail.zimbra@inria.fr> <264DC59D-B914-42E5-9A89-0746F21A37BF@gmail.com> <1392596904.7422896.1495533198072.JavaMail.zimbra@inria.fr> Message-ID: On Tue, May 23, 2017 at 6:23 AM, Stefano Zampini wrote: > As I said, the local sizes of the MATIS matrix are NOT the sizes of the > subdomain problem. As in all PETSc code, the local sizes of the matrix > correspond to the local size of the non-overlapping right and left vectors > used in matmult operations. You can use PETSC_DECIDE in place of localsize > in your call to MatCreateIS. The size of the subdomain problem in MATIS is > the local size of the l2g map. > I just want to make sure that MatIS is really what you want, since its a little more complex than other options. MatIS is useful for methods that need unassembled matrices, like domain decomposition methods which use Neumann problems on subdomains. If you are fine with assembled sparse matrices, than the normal AIJ should be easier to handle. Just checking. Thanks, Matt > Il 23 Mag 2017 11:53, "Franck Houssen" ha > scritto: > >> The first thing I did was to put 3, not 4 : I got an error thrown in >> MatCreateIS (see the git diff + stack below). As the error said I used >> globalSize = numberOfMPIProcessus * localSize : my understanding is that, >> when using MatIS, the global size needs to be the sum of all local sizes. >> Correct ? 
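To make the alternative Matt mentions above concrete: the same diag(1, 2, 1) operator can be built as an ordinary assembled AIJ matrix, in which case the contribution to the shared row is summed automatically during assembly. A hedged, untested fragment (localIdx as in the earlier sketch; error checking omitted):

  Mat B;

  MatCreate(PETSC_COMM_WORLD, &B);
  MatSetSizes(B, PETSC_DECIDE, PETSC_DECIDE, 3, 3);
  MatSetType(B, MATAIJ);
  MatSetUp(B);

  /* Each rank adds its subdomain's 2x2 identity with global indices;
     off-process contributions are communicated and summed at assembly time. */
  MatSetValue(B, localIdx[0], localIdx[0], 1.0, ADD_VALUES);
  MatSetValue(B, localIdx[1], localIdx[1], 1.0, ADD_VALUES);
  MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY);
  MatView(B, PETSC_VIEWER_STDOUT_WORLD); /* diagonal 1, 2, 1 */

The trade-off is that the unassembled subdomain matrices needed by Neumann-type domain decomposition preconditioners are no longer available, which is what MatIS keeps.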
>> >> I have a 3x3 global matrix made of two overlapping 2x2 local matrix (= >> diagonal with 1.). Each local matrix correspond to one domain (each domain >> is delegated to one MPI proc, so, I have 2 MPI procs because I have 2 >> domains). >> This is the simplest possible example: I have two 2x2 (local) diag matrix >> that overlap so that the global matrix built from them is 1, 2, 1 on the >> diagonal (local contributions add up in the middle). >> I need to MatMult this global matrix with a global vector filled with 1. >> >> Franck >> >> Git diff : >> >> --- a/matISLocalMat.cpp >> +++ b/matISLocalMat.cpp >> @@ -16,7 +16,7 @@ int main(int argc,char **argv) { >> int size = 0; MPI_Comm_size(MPI_COMM_WORLD, &size); if (size != 2) >> return 1; >> int rank = 0; MPI_Comm_rank(MPI_COMM_WORLD, &rank); >> >> - PetscInt localSize = 2, globalSize = localSize*2 /*2 MPI*/; >> + PetscInt localSize = 2, globalSize = 3; >> PetscInt localIdx[2] = {0, 0}; >> if (rank == 0) {localIdx[0] = 0; localIdx[1] = 1;} >> else {localIdx[0] = 1; localIdx[1] = 2;} >> >> >> >> Stack error: >> >> [0]PETSC ERROR: Nonconforming object sizes >> [0]PETSC ERROR: Sum of local lengths 4 does not equal global length 3, my >> local length 2 >> [0]PETSC ERROR: [0] ISG2LMapApply line 17 /home/fghoussen/Documents/INRI >> A/petsc-3.7.6/src/vec/is/utils/isltog.c >> [0]PETSC ERROR: [0] MatSetValues_IS line 692 >> /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c >> [0]PETSC ERROR: [0] MatSetValues line 1157 /home/fghoussen/Documents/INRI >> A/petsc-3.7.6/src/mat/interface/matrix.c >> [0]PETSC ERROR: [0] MatISSetPreallocation_IS line 95 >> /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c >> [0]PETSC ERROR: [0] MatISSetPreallocation line 80 >> /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c >> [0]PETSC ERROR: [0] PetscSplitOwnership line 80 >> /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/sys/utils/psplit.c >> [0]PETSC ERROR: [0] PetscLayoutSetUp line 129 >> /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/vec/is/utils/pmap.c >> [0]PETSC ERROR: [0] MatSetLocalToGlobalMapping_IS line 628 >> /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c >> [0]PETSC ERROR: [0] MatSetLocalToGlobalMapping line 1899 >> /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/interface/matrix.c >> [0]PETSC ERROR: [0] MatCreateIS line 986 /home/fghoussen/Documents/INRI >> A/petsc-3.7.6/src/mat/impls/is/matis.c >> >> >> >> ------------------------------ >> >> *De: *"Stefano Zampini" >> *?: *"Matthew Knepley" >> *Cc: *"Franck Houssen" , "PETSc" < >> petsc-users at mcs.anl.gov>, "PETSc" >> *Envoy?: *Dimanche 21 Mai 2017 23:02:37 >> *Objet: *Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS >> matrix and a global vector ? >> >> Franck, >> >> PETSc takes care of doing the matrix-vector multiplication properly using >> MatIS. As Matt said, the layout of the vectors is the usual parallel >> layout. >> The local sizes of the MatIS matrix (i.e. the local size of the left and >> right vectors used in MatMult) are not the sizes of the local subdomain >> matrices in MatIS. >> >> >> On May 21, 2017, at 6:47 PM, Matthew Knepley wrote: >> >> On Sun, May 21, 2017 at 11:26 AM, Franck Houssen > > wrote: >> >>> Using PETSc MatIS, how to matmult a global IS matrix and a global vector >>> ? Example is attached : I don't get what I expect that is a vector such >>> that proc0 = [1, 2] and proc1 = [2, 1] >>> >> >> 1) I think the global size of your matrix is wrong. 
You seem to want 3, >> not 4 >> >> 2) Global vectors have a non-overlapping row partition. You might be >> thinking of local vectors >> >> Thanks, >> >> Matt >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> http://www.caam.rice.edu/~mk51/ >> >> >> ------------------------------ >> >> *De: *"Stefano Zampini" >> *?: *"Matthew Knepley" >> *Cc: *"Franck Houssen" , "PETSc" < >> petsc-users at mcs.anl.gov>, "PETSc" >> *Envoy?: *Dimanche 21 Mai 2017 23:02:37 >> *Objet: *Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS >> matrix and a global vector ? >> >> Franck, >> >> PETSc takes care of doing the matrix-vector multiplication properly using >> MatIS. As Matt said, the layout of the vectors is the usual parallel >> layout. >> The local sizes of the MatIS matrix (i.e. the local size of the left and >> right vectors used in MatMult) are not the sizes of the local subdomain >> matrices in MatIS. >> >> >> On May 21, 2017, at 6:47 PM, Matthew Knepley wrote: >> >> On Sun, May 21, 2017 at 11:26 AM, Franck Houssen > > wrote: >> >>> Using PETSc MatIS, how to matmult a global IS matrix and a global vector >>> ? Example is attached : I don't get what I expect that is a vector such >>> that proc0 = [1, 2] and proc1 = [2, 1] >>> >> >> 1) I think the global size of your matrix is wrong. You seem to want 3, >> not 4 >> >> 2) Global vectors have a non-overlapping row partition. You might be >> thinking of local vectors >> >> Thanks, >> >> Matt >> >> >>> Franck >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> http://www.caam.rice.edu/~mk51/ >> >> >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From franck.houssen at inria.fr Tue May 23 11:28:03 2017 From: franck.houssen at inria.fr (Franck Houssen) Date: Tue, 23 May 2017 18:28:03 +0200 (CEST) Subject: [petsc-users] [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix and a global vector ? In-Reply-To: References: <2012394521.6757315.1495383841678.JavaMail.zimbra@inria.fr> <1564257107.6757440.1495383974167.JavaMail.zimbra@inria.fr> <264DC59D-B914-42E5-9A89-0746F21A37BF@gmail.com> <1392596904.7422896.1495533198072.JavaMail.zimbra@inria.fr> Message-ID: <1784716977.7683031.1495556883254.JavaMail.zimbra@inria.fr> OK, thanks. This is helpfull... But I really think the doc should be more verbose about that: this is really confusing and I didn't find any simple example to begin with which make all this even more confusing (personal opinion). Franck ----- Mail original ----- > De: "Matthew Knepley" > ?: "Franck Houssen" > Cc: "Stefano Zampini" , "PETSc" > , "PETSc" > Envoy?: Mardi 23 Mai 2017 13:21:21 > Objet: Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix > and a global vector ? > On Tue, May 23, 2017 at 4:53 AM, Franck Houssen < franck.houssen at inria.fr > > wrote: > > The first thing I did was to put 3, not 4 : I got an error thrown in > > MatCreateIS (see the git diff + stack below). 
As the error said I used > > globalSize = numberOfMPIProcessus * localSize : my understanding is that, > > when using MatIS, the global size needs to be the sum of all local sizes. > > Correct ? > > No. MatIS means that the matrix is not assembled. The easiest way (for me) to > think of this is that processes do not have > to hold full rows. One process can hold part of row i, and another processes > can hold another part. However, there are still > the same number of global rows. > > I have a 3x3 global matrix made of two overlapping 2x2 local matrix (= > > diagonal with 1.). Each local matrix correspond to one domain (each domain > > is delegated to one MPI proc, so, I have 2 MPI procs because I have 2 > > domains). > > So the global size is 3. The local size here is not the size of the local IS > block, since that is a property only of MatIS. It is the > size of the local piece of the vector you multiply. This allows PETSc to > understand the parallel layout of the Vec, and how it > matched the Mat. > This is somewhat confusing because FEM people mean something different by > "local" than we do here, and in fact we use this > other definition of local when assembling operators. > Matt > > This is the simplest possible example: I have two 2x2 (local) diag matrix > > that overlap so that the global matrix built from them is 1, 2, 1 on the > > diagonal (local contributions add up in the middle). > > > I need to MatMult this global matrix with a global vector filled with 1. > > > Franck > > > Git diff : > > > --- a/matISLocalMat.cpp > > > +++ b/matISLocalMat.cpp > > > @@ -16,7 +16,7 @@ int main(int argc,char **argv) { > > > int size = 0; MPI_Comm_size(MPI_COMM_WORLD, &size); if (size != 2) return > > 1; > > > int rank = 0; MPI_Comm_rank(MPI_COMM_WORLD, &rank); > > > - PetscInt localSize = 2, globalSize = localSize*2 /*2 MPI*/; > > > + PetscInt localSize = 2, globalSize = 3; > > > PetscInt localIdx[2] = {0, 0}; > > > if (rank == 0) {localIdx[0] = 0; localIdx[1] = 1;} > > > else {localIdx[0] = 1; localIdx[1] = 2;} > > > Stack error: > > > [0]PETSC ERROR: Nonconforming object sizes > > > [0]PETSC ERROR: Sum of local lengths 4 does not equal global length 3, my > > local length 2 > > > [0]PETSC ERROR: [0] ISG2LMapApply line 17 > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/vec/is/utils/isltog.c > > > [0]PETSC ERROR: [0] MatSetValues_IS line 692 > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > > > [0]PETSC ERROR: [0] MatSetValues line 1157 > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/interface/matrix.c > > > [0]PETSC ERROR: [0] MatISSetPreallocation_IS line 95 > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > > > [0]PETSC ERROR: [0] MatISSetPreallocation line 80 > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > > > [0]PETSC ERROR: [0] PetscSplitOwnership line 80 > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/sys/utils/psplit.c > > > [0]PETSC ERROR: [0] PetscLayoutSetUp line 129 > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/vec/is/utils/pmap.c > > > [0]PETSC ERROR: [0] MatSetLocalToGlobalMapping_IS line 628 > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > > > [0]PETSC ERROR: [0] MatSetLocalToGlobalMapping line 1899 > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/interface/matrix.c > > > [0]PETSC ERROR: [0] MatCreateIS line 986 > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > > > > De: "Stefano Zampini" < stefano.zampini at gmail.com > > 
> > > > > ?: "Matthew Knepley" < knepley at gmail.com > > > > > > > Cc: "Franck Houssen" < franck.houssen at inria.fr >, "PETSc" < > > > petsc-users at mcs.anl.gov >, "PETSc" < petsc-dev at mcs.anl.gov > > > > > > > Envoy?: Dimanche 21 Mai 2017 23:02:37 > > > > > > Objet: Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS > > > matrix > > > and a global vector ? > > > > > > Franck, > > > > > > PETSc takes care of doing the matrix-vector multiplication properly using > > > MatIS. As Matt said, the layout of the vectors is the usual parallel > > > layout. > > > > > > The local sizes of the MatIS matrix (i.e. the local size of the left and > > > right vectors used in MatMult) are not the sizes of the local subdomain > > > matrices in MatIS. > > > > > > > On May 21, 2017, at 6:47 PM, Matthew Knepley < knepley at gmail.com > > > > > wrote: > > > > > > > > > > On Sun, May 21, 2017 at 11:26 AM, Franck Houssen < > > > > franck.houssen at inria.fr > > > > > > > > > wrote: > > > > > > > > > > > Using PETSc MatIS, how to matmult a global IS matrix and a global > > > > > vector > > > > > ? > > > > > Example is attached : I don't get what I expect that is a vector such > > > > > that > > > > > proc0 = [1, 2] and proc1 = [2, 1] > > > > > > > > > > > > > > 1) I think the global size of your matrix is wrong. You seem to want 3, > > > > not > > > > 4 > > > > > > > > > > 2) Global vectors have a non-overlapping row partition. You might be > > > > thinking > > > > of local vectors > > > > > > > > > > Thanks, > > > > > > > > > > Matt > > > > > > > > > > -- > > > > > > > > > > What most experimenters take for granted before they begin their > > > > experiments > > > > is infinitely more interesting than any results to which their > > > > experiments > > > > lead. > > > > > > > > > > -- Norbert Wiener > > > > > > > > > > http://www.caam.rice.edu/~mk51/ > > > > > > > > > De: "Stefano Zampini" < stefano.zampini at gmail.com > > > > > > > ?: "Matthew Knepley" < knepley at gmail.com > > > > > > > Cc: "Franck Houssen" < franck.houssen at inria.fr >, "PETSc" < > > > petsc-users at mcs.anl.gov >, "PETSc" < petsc-dev at mcs.anl.gov > > > > > > > Envoy?: Dimanche 21 Mai 2017 23:02:37 > > > > > > Objet: Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS > > > matrix > > > and a global vector ? > > > > > > Franck, > > > > > > PETSc takes care of doing the matrix-vector multiplication properly using > > > MatIS. As Matt said, the layout of the vectors is the usual parallel > > > layout. > > > > > > The local sizes of the MatIS matrix (i.e. the local size of the left and > > > right vectors used in MatMult) are not the sizes of the local subdomain > > > matrices in MatIS. > > > > > > > On May 21, 2017, at 6:47 PM, Matthew Knepley < knepley at gmail.com > > > > > wrote: > > > > > > > > > > On Sun, May 21, 2017 at 11:26 AM, Franck Houssen < > > > > franck.houssen at inria.fr > > > > > > > > > wrote: > > > > > > > > > > > Using PETSc MatIS, how to matmult a global IS matrix and a global > > > > > vector > > > > > ? > > > > > Example is attached : I don't get what I expect that is a vector such > > > > > that > > > > > proc0 = [1, 2] and proc1 = [2, 1] > > > > > > > > > > > > > > 1) I think the global size of your matrix is wrong. You seem to want 3, > > > > not > > > > 4 > > > > > > > > > > 2) Global vectors have a non-overlapping row partition. 
You might be > > > > thinking > > > > of local vectors > > > > > > > > > > Thanks, > > > > > > > > > > Matt > > > > > > > > > > > Franck > > > > > > > > > > > > > > -- > > > > > > > > > > What most experimenters take for granted before they begin their > > > > experiments > > > > is infinitely more interesting than any results to which their > > > > experiments > > > > lead. > > > > > > > > > > -- Norbert Wiener > > > > > > > > > > http://www.caam.rice.edu/~mk51/ > > > > > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From franck.houssen at inria.fr Tue May 23 11:34:53 2017 From: franck.houssen at inria.fr (Franck Houssen) Date: Tue, 23 May 2017 18:34:53 +0200 (CEST) Subject: [petsc-users] [petsc-dev] Using PETSc MatIS, how to get local matrix (= one domain) before and after assembly ? In-Reply-To: References: <867421313.6757137.1495383596545.JavaMail.zimbra@inria.fr> <1253777447.6757298.1495383780337.JavaMail.zimbra@inria.fr> <2033509705.7414108.1495532492501.JavaMail.zimbra@inria.fr> Message-ID: <740691579.7684644.1495557293858.JavaMail.zimbra@inria.fr> OK. I am supposed to destroy the matrix returned by MatISGetMPIXAIJ ? Also, my example still not get the final assembled local matrix (the MatCreateSubMatrix returns an empty matrix) but as far as I understand my (global) index set is OK: what did I miss ? Franck ----- Mail original ----- > De: "Stefano Zampini" > ?: "Franck Houssen" > Cc: "petsc-dev" , "PETSc users list" > , "petsc-maint" > Envoy?: Mardi 23 Mai 2017 13:16:18 > Objet: Re: [petsc-dev] Using PETSc MatIS, how to get local matrix (= one > domain) before and after assembly ? > MatISGetMPIXAIJ is collective, as it assembles the global operator. To get > the matrices you are looking for, you should call MatCreateSubMatrix on the > assembled global operator, with the global indices representing the > subdomain problem. Each process needs to call both functions > Stefano > Il 23 Mag 2017 11:41, "Franck Houssen" < franck.houssen at inria.fr > ha > scritto: > > I have a 3x3 global matrix made of two overlapping 2x2 local matrix (= > > diagonal with 1.). Each local matrix correspond to one domain (each domain > > is delegated to one MPI proc, so, I have 2 MPI procs because I have 2 > > domains). > > > This is the simplest possible example: I have two 2x2 (local) diag matrix > > that overlap so that the global matrix built from them is 1, 2, 1 on the > > diagonal (local contributions add up in the middle). > > > Now, I need for each MPI proc to get the assembled local matrix (sometimes > > called the dirichlet matrix) : this is a local matrix (sequential - not > > distributed with MPI) that accounts for contribution of neighboring domains > > (MPI proc). > > > How to get the local assembled matrix ? MatGetLocalSubMatrix does not work > > (throw error - see example attached). MatGetSubMatrix returns a MPI > > distributed matrix, not a local (sequential) one. > > > 1. My understanding is that MatISGetMPIXAIJ should return a local matrix > > (sequential AIJ matrix) : the MPI in the name recall that you get the > > assembled matrix (with contributions from the shared border) from the other > > MPI processus. Correct ? 
In my simple example, I replaced > > MatGetLocalSubMatrix with MatISGetMPIXAIJ : I get a deadlock which was > > surprising to me... Is MatISGetMPIXAIJ a collective call ? > > > 2. Supposing this is a collective call (and that point 1 is not correct), I > > ride up MatISGetMPIXAIJ before the "if (rank > 0)" : I don't deadlock now, > > but it seems I get a global matrix which is not the assembled local matrix > > I > > am looking for. > > > 3. I am supposed to destroy the matrix returned by MatISGetMPIXAIJ ? (I > > believe yes - not sure as AFAIU wording should associate Destroy methods to > > Create methods) > > > Franck > > > The git diff illustrate modifications I tried to add to the initial file > > attached to this thread: > > > --- a/matISLocalMat.cpp > > > +++ b/matISLocalMat.cpp > > > @@ -31,6 +31,8 @@ int main(int argc,char **argv) { > > > MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); MatAssemblyEnd(A, > > MAT_FINAL_ASSEMBLY); > > > MatView(A, PETSC_VIEWER_STDOUT_WORLD); > > PetscViewerFlush(PETSC_VIEWER_STDOUT_WORLD); // Diag: 1, 2, 1 > > > + Mat assembledLocalMat; > > > + MatISGetMPIXAIJ(A, MAT_INITIAL_MATRIX, &assembledLocalMat); > > > if (rank > 0) { // Do not pollute stdout: print only 1 proc > > > std::cout << std::endl << "non assembled local matrix:" << std::endl << > > std::endl; > > > Mat nonAssembledLocalMat; > > > @@ -38,11 +40,10 @@ int main(int argc,char **argv) { > > > MatView(nonAssembledLocalMat, PETSC_VIEWER_STDOUT_SELF); // Diag: 1, 1 > > > std::cout << std::endl << "assembled local matrix:" << std::endl << > > std::endl; > > > - Mat assembledLocalMat; > > > - IS is; ISCreateGeneral(PETSC_COMM_SELF, localSize, localIdx, > > PETSC_COPY_VALUES, &is); > > > - MatGetLocalSubMatrix(A, is, is, &assembledLocalMat); // KO ?!... > > > - MatView(assembledLocalMat, PETSC_VIEWER_STDOUT_SELF); // Would like to > > get > > => Diag: 2, 1 > > > + //IS is; ISCreateGeneral(PETSC_COMM_SELF, localSize, localIdx, > > PETSC_COPY_VALUES, &is); > > > + //MatGetLocalSubMatrix(A, is, is, &assembledLocalMat); // KO ?!... > > > } > > > + MatView(assembledLocalMat, PETSC_VIEWER_STDOUT_WORLD); // Would like to > > get > > => Diag: 2, 1 > > > > De: "Stefano Zampini" < stefano.zampini at gmail.com > > > > > > > ?: "petsc-maint" < knepley at gmail.com > > > > > > > Cc: "petsc-dev" < petsc-dev at mcs.anl.gov >, "PETSc users list" < > > > petsc-users at mcs.anl.gov >, "Franck Houssen" < franck.houssen at inria.fr > > > > > > > Envoy?: Dimanche 21 Mai 2017 22:51:34 > > > > > > Objet: Re: [petsc-dev] Using PETSc MatIS, how to get local matrix (= one > > > domain) before and after assembly ? > > > > > > To assemble the operator in aij format, use > > > > > > MatISGetMPIXAIJ > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatISGetMPIXAIJ.html > > > > > > Il 21 Mag 2017 18:43, "Matthew Knepley" < knepley at gmail.com > ha scritto: > > > > > > > On Sun, May 21, 2017 at 11:23 AM, Franck Houssen < > > > > franck.houssen at inria.fr > > > > > > > > > wrote: > > > > > > > > > > > I have a 3x3 global matrix is built (diag: 1, 2, 1): it's made of 2 > > > > > overlapping 2x2 local matrix (diag: 1, 1). > > > > > > > > > > > > > > > Getting non assembled local matrix is OK with MatISGetLocalMat. > > > > > > > > > > > > > > > How to get assembled local matrix (initial local matrix + neigbhor > > > > > contributions on the borders) ? 
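A hedged sketch of the pattern Stefano describes above, restricted to the calls available in the 3.7 release used in this thread (MatISGetMPIXAIJ and MatGetSubMatrices, both linked earlier in the thread, are collective, so every rank executes them; A and localIdx as in the earlier sketch, localIdx being the global indices of this rank's overlapping subdomain):

  Mat B, *Bloc;
  IS  is;

  /* Collective: assemble the global operator in AIJ format. */
  MatISGetMPIXAIJ(A, MAT_INITIAL_MATRIX, &B);

  /* Global indices of this rank's overlapping subdomain, as a sequential IS. */
  ISCreateGeneral(PETSC_COMM_SELF, 2, localIdx, PETSC_COPY_VALUES, &is);

  /* Collective: extract one sequential (assembled) submatrix per rank. */
  MatGetSubMatrices(B, 1, &is, &is, MAT_INITIAL_MATRIX, &Bloc);
  MatView(Bloc[0], PETSC_VIEWER_STDOUT_SELF); /* rank 1 should see diag: 2, 1 */

  /* With MAT_INITIAL_MATRIX, B and Bloc are new matrices owned by the caller. */
  MatDestroyMatrices(1, &Bloc);
  MatDestroy(&B);
  ISDestroy(&is);

This is one reading of Stefano's advice; the deadlock reported above is consistent with MatISGetMPIXAIJ being collective while being called only under "if (rank > 0)".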
(expected result is diag: 2, 1) > > > > > > > > > > > > > > You can always use > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetSubMatrix.html > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetSubMatrices.html > > > > > > > > > > to get copies, but if you just want to build things, you can use > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetLocalSubMatrix.html > > > > > > > > > > Thanks, > > > > > > > > > > Matt > > > > > > > > > > > Franck > > > > > > > > > > > > > > -- > > > > > > > > > > What most experimenters take for granted before they begin their > > > > experiments > > > > is infinitely more interesting than any results to which their > > > > experiments > > > > lead. > > > > > > > > > > -- Norbert Wiener > > > > > > > > > > http://www.caam.rice.edu/~mk51/ > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: matISLocalMat.cpp Type: text/x-c++src Size: 3685 bytes Desc: not available URL: From franck.houssen at inria.fr Tue May 23 11:40:59 2017 From: franck.houssen at inria.fr (Franck Houssen) Date: Tue, 23 May 2017 18:40:59 +0200 (CEST) Subject: [petsc-users] [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix and a global vector ? In-Reply-To: <1784716977.7683031.1495556883254.JavaMail.zimbra@inria.fr> References: <2012394521.6757315.1495383841678.JavaMail.zimbra@inria.fr> <1564257107.6757440.1495383974167.JavaMail.zimbra@inria.fr> <264DC59D-B914-42E5-9A89-0746F21A37BF@gmail.com> <1392596904.7422896.1495533198072.JavaMail.zimbra@inria.fr> <1784716977.7683031.1495556883254.JavaMail.zimbra@inria.fr> Message-ID: <241093803.7685734.1495557659717.JavaMail.zimbra@inria.fr> I let this small piece of code to demonstrate the basics.... If you believe (like I do) that this is worth to be added to the available examples: feel free to do it !... Franck ----- Mail original ----- > De: "Franck Houssen" > ?: "Matthew Knepley" > Cc: "PETSc" , "PETSc" > Envoy?: Mardi 23 Mai 2017 18:28:03 > Objet: Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix > and a global vector ? > OK, thanks. This is helpfull... But I really think the doc should be more > verbose about that: this is really confusing and I didn't find any simple > example to begin with which make all this even more confusing (personal > opinion). > Franck > ----- Mail original ----- > > De: "Matthew Knepley" > > > ?: "Franck Houssen" > > > Cc: "Stefano Zampini" , "PETSc" > > , "PETSc" > > > Envoy?: Mardi 23 Mai 2017 13:21:21 > > > Objet: Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix > > and a global vector ? > > > On Tue, May 23, 2017 at 4:53 AM, Franck Houssen < franck.houssen at inria.fr > > > wrote: > > > > The first thing I did was to put 3, not 4 : I got an error thrown in > > > MatCreateIS (see the git diff + stack below). As the error said I used > > > globalSize = numberOfMPIProcessus * localSize : my understanding is that, > > > when using MatIS, the global size needs to be the sum of all local sizes. > > > Correct ? > > > > > No. MatIS means that the matrix is not assembled. The easiest way (for me) > > to > > think of this is that processes do not have > > > to hold full rows. One process can hold part of row i, and another > > processes > > can hold another part. However, there are still > > > the same number of global rows. 
> > > > I have a 3x3 global matrix made of two overlapping 2x2 local matrix (= > > > diagonal with 1.). Each local matrix correspond to one domain (each > > > domain > > > is delegated to one MPI proc, so, I have 2 MPI procs because I have 2 > > > domains). > > > > > So the global size is 3. The local size here is not the size of the local > > IS > > block, since that is a property only of MatIS. It is the > > > size of the local piece of the vector you multiply. This allows PETSc to > > understand the parallel layout of the Vec, and how it > > > matched the Mat. > > > This is somewhat confusing because FEM people mean something different by > > "local" than we do here, and in fact we use this > > > other definition of local when assembling operators. > > > Matt > > > > This is the simplest possible example: I have two 2x2 (local) diag matrix > > > that overlap so that the global matrix built from them is 1, 2, 1 on the > > > diagonal (local contributions add up in the middle). > > > > > > I need to MatMult this global matrix with a global vector filled with 1. > > > > > > Franck > > > > > > Git diff : > > > > > > --- a/matISLocalMat.cpp > > > > > > +++ b/matISLocalMat.cpp > > > > > > @@ -16,7 +16,7 @@ int main(int argc,char **argv) { > > > > > > int size = 0; MPI_Comm_size(MPI_COMM_WORLD, &size); if (size != 2) return > > > 1; > > > > > > int rank = 0; MPI_Comm_rank(MPI_COMM_WORLD, &rank); > > > > > > - PetscInt localSize = 2, globalSize = localSize*2 /*2 MPI*/; > > > > > > + PetscInt localSize = 2, globalSize = 3; > > > > > > PetscInt localIdx[2] = {0, 0}; > > > > > > if (rank == 0) {localIdx[0] = 0; localIdx[1] = 1;} > > > > > > else {localIdx[0] = 1; localIdx[1] = 2;} > > > > > > Stack error: > > > > > > [0]PETSC ERROR: Nonconforming object sizes > > > > > > [0]PETSC ERROR: Sum of local lengths 4 does not equal global length 3, my > > > local length 2 > > > > > > [0]PETSC ERROR: [0] ISG2LMapApply line 17 > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/vec/is/utils/isltog.c > > > > > > [0]PETSC ERROR: [0] MatSetValues_IS line 692 > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > > > > > > [0]PETSC ERROR: [0] MatSetValues line 1157 > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/interface/matrix.c > > > > > > [0]PETSC ERROR: [0] MatISSetPreallocation_IS line 95 > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > > > > > > [0]PETSC ERROR: [0] MatISSetPreallocation line 80 > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > > > > > > [0]PETSC ERROR: [0] PetscSplitOwnership line 80 > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/sys/utils/psplit.c > > > > > > [0]PETSC ERROR: [0] PetscLayoutSetUp line 129 > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/vec/is/utils/pmap.c > > > > > > [0]PETSC ERROR: [0] MatSetLocalToGlobalMapping_IS line 628 > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > > > > > > [0]PETSC ERROR: [0] MatSetLocalToGlobalMapping line 1899 > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/interface/matrix.c > > > > > > [0]PETSC ERROR: [0] MatCreateIS line 986 > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > > > > > > > De: "Stefano Zampini" < stefano.zampini at gmail.com > > > > > > > > > > > ?: "Matthew Knepley" < knepley at gmail.com > > > > > > > > > > > Cc: "Franck Houssen" < franck.houssen at inria.fr >, "PETSc" < > > > > petsc-users at mcs.anl.gov >, "PETSc" < petsc-dev at mcs.anl.gov > > > > > > > > > > 
> Envoy?: Dimanche 21 Mai 2017 23:02:37 > > > > > > > > > > Objet: Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS > > > > matrix > > > > and a global vector ? > > > > > > > > > > Franck, > > > > > > > > > > PETSc takes care of doing the matrix-vector multiplication properly > > > > using > > > > MatIS. As Matt said, the layout of the vectors is the usual parallel > > > > layout. > > > > > > > > > > The local sizes of the MatIS matrix (i.e. the local size of the left > > > > and > > > > right vectors used in MatMult) are not the sizes of the local subdomain > > > > matrices in MatIS. > > > > > > > > > > > On May 21, 2017, at 6:47 PM, Matthew Knepley < knepley at gmail.com > > > > > > wrote: > > > > > > > > > > > > > > > On Sun, May 21, 2017 at 11:26 AM, Franck Houssen < > > > > > franck.houssen at inria.fr > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > Using PETSc MatIS, how to matmult a global IS matrix and a global > > > > > > vector > > > > > > ? > > > > > > Example is attached : I don't get what I expect that is a vector > > > > > > such > > > > > > that > > > > > > proc0 = [1, 2] and proc1 = [2, 1] > > > > > > > > > > > > > > > > > > > > 1) I think the global size of your matrix is wrong. You seem to want > > > > > 3, > > > > > not > > > > > 4 > > > > > > > > > > > > > > > 2) Global vectors have a non-overlapping row partition. You might be > > > > > thinking > > > > > of local vectors > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > Matt > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > What most experimenters take for granted before they begin their > > > > > experiments > > > > > is infinitely more interesting than any results to which their > > > > > experiments > > > > > lead. > > > > > > > > > > > > > > > -- Norbert Wiener > > > > > > > > > > > > > > > http://www.caam.rice.edu/~mk51/ > > > > > > > > > > > > > > De: "Stefano Zampini" < stefano.zampini at gmail.com > > > > > > > > > > > ?: "Matthew Knepley" < knepley at gmail.com > > > > > > > > > > > Cc: "Franck Houssen" < franck.houssen at inria.fr >, "PETSc" < > > > > petsc-users at mcs.anl.gov >, "PETSc" < petsc-dev at mcs.anl.gov > > > > > > > > > > > Envoy?: Dimanche 21 Mai 2017 23:02:37 > > > > > > > > > > Objet: Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS > > > > matrix > > > > and a global vector ? > > > > > > > > > > Franck, > > > > > > > > > > PETSc takes care of doing the matrix-vector multiplication properly > > > > using > > > > MatIS. As Matt said, the layout of the vectors is the usual parallel > > > > layout. > > > > > > > > > > The local sizes of the MatIS matrix (i.e. the local size of the left > > > > and > > > > right vectors used in MatMult) are not the sizes of the local subdomain > > > > matrices in MatIS. > > > > > > > > > > > On May 21, 2017, at 6:47 PM, Matthew Knepley < knepley at gmail.com > > > > > > wrote: > > > > > > > > > > > > > > > On Sun, May 21, 2017 at 11:26 AM, Franck Houssen < > > > > > franck.houssen at inria.fr > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > Using PETSc MatIS, how to matmult a global IS matrix and a global > > > > > > vector > > > > > > ? > > > > > > Example is attached : I don't get what I expect that is a vector > > > > > > such > > > > > > that > > > > > > proc0 = [1, 2] and proc1 = [2, 1] > > > > > > > > > > > > > > > > > > > > 1) I think the global size of your matrix is wrong. 
You seem to want > > > > > 3, > > > > > not > > > > > 4 > > > > > > > > > > > > > > > 2) Global vectors have a non-overlapping row partition. You might be > > > > > thinking > > > > > of local vectors > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > Matt > > > > > > > > > > > > > > > > Franck > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > What most experimenters take for granted before they begin their > > > > > experiments > > > > > is infinitely more interesting than any results to which their > > > > > experiments > > > > > lead. > > > > > > > > > > > > > > > -- Norbert Wiener > > > > > > > > > > > > > > > http://www.caam.rice.edu/~mk51/ > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > > experiments > > is infinitely more interesting than any results to which their experiments > > lead. > > > -- Norbert Wiener > > > http://www.caam.rice.edu/~mk51/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: matISProdMatVec.cpp Type: text/x-c++src Size: 2563 bytes Desc: not available URL: From knepley at gmail.com Tue May 23 11:46:34 2017 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 23 May 2017 11:46:34 -0500 Subject: [petsc-users] [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix and a global vector ? In-Reply-To: <1784716977.7683031.1495556883254.JavaMail.zimbra@inria.fr> References: <2012394521.6757315.1495383841678.JavaMail.zimbra@inria.fr> <1564257107.6757440.1495383974167.JavaMail.zimbra@inria.fr> <264DC59D-B914-42E5-9A89-0746F21A37BF@gmail.com> <1392596904.7422896.1495533198072.JavaMail.zimbra@inria.fr> <1784716977.7683031.1495556883254.JavaMail.zimbra@inria.fr> Message-ID: On Tue, May 23, 2017 at 11:28 AM, Franck Houssen wrote: > OK, thanks. This is helpfull... But I really think the doc should be more > verbose about that: this is really confusing and I didn't find any simple > example to begin with which make all this even more confusing (personal > opinion). > Did you respond to my other question (how are you using them)? That would help me understand how to phrase it. Thanks, Matt > Franck > > > ------------------------------ > > *De: *"Matthew Knepley" > *?: *"Franck Houssen" > *Cc: *"Stefano Zampini" , "PETSc" < > petsc-users at mcs.anl.gov>, "PETSc" > *Envoy?: *Mardi 23 Mai 2017 13:21:21 > *Objet: *Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS > matrix and a global vector ? > > On Tue, May 23, 2017 at 4:53 AM, Franck Houssen > wrote: > >> The first thing I did was to put 3, not 4 : I got an error thrown in >> MatCreateIS (see the git diff + stack below). As the error said I used >> globalSize = numberOfMPIProcessus * localSize : my understanding is that, >> when using MatIS, the global size needs to be the sum of all local sizes. >> Correct ? >> > > No. MatIS means that the matrix is not assembled. The easiest way (for me) > to think of this is that processes do not have > to hold full rows. One process can hold part of row i, and another > processes can hold another part. However, there are still > the same number of global rows. > > I have a 3x3 global matrix made of two overlapping 2x2 local matrix (= >> diagonal with 1.). Each local matrix correspond to one domain (each domain >> is delegated to one MPI proc, so, I have 2 MPI procs because I have 2 >> domains). >> > > So the global size is 3. 
The local size here is not the size of the local > IS block, since that is a property only of MatIS. It is the > size of the local piece of the vector you multiply. This allows PETSc to > understand the parallel layout of the Vec, and how it > matched the Mat. > > This is somewhat confusing because FEM people mean something different by > "local" than we do here, and in fact we use this > other definition of local when assembling operators. > > Matt > > >> This is the simplest possible example: I have two 2x2 (local) diag matrix >> that overlap so that the global matrix built from them is 1, 2, 1 on the >> diagonal (local contributions add up in the middle). >> I need to MatMult this global matrix with a global vector filled with 1. >> >> Franck >> >> Git diff : >> >> --- a/matISLocalMat.cpp >> +++ b/matISLocalMat.cpp >> @@ -16,7 +16,7 @@ int main(int argc,char **argv) { >> int size = 0; MPI_Comm_size(MPI_COMM_WORLD, &size); if (size != 2) >> return 1; >> int rank = 0; MPI_Comm_rank(MPI_COMM_WORLD, &rank); >> >> - PetscInt localSize = 2, globalSize = localSize*2 /*2 MPI*/; >> + PetscInt localSize = 2, globalSize = 3; >> PetscInt localIdx[2] = {0, 0}; >> if (rank == 0) {localIdx[0] = 0; localIdx[1] = 1;} >> else {localIdx[0] = 1; localIdx[1] = 2;} >> >> >> >> Stack error: >> >> [0]PETSC ERROR: Nonconforming object sizes >> [0]PETSC ERROR: Sum of local lengths 4 does not equal global length 3, my >> local length 2 >> [0]PETSC ERROR: [0] ISG2LMapApply line 17 /home/fghoussen/Documents/ >> INRIA/petsc-3.7.6/src/vec/is/utils/isltog.c >> [0]PETSC ERROR: [0] MatSetValues_IS line 692 /home/fghoussen/Documents/ >> INRIA/petsc-3.7.6/src/mat/impls/is/matis.c >> [0]PETSC ERROR: [0] MatSetValues line 1157 /home/fghoussen/Documents/ >> INRIA/petsc-3.7.6/src/mat/interface/matrix.c >> [0]PETSC ERROR: [0] MatISSetPreallocation_IS line 95 >> /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c >> [0]PETSC ERROR: [0] MatISSetPreallocation line 80 >> /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c >> [0]PETSC ERROR: [0] PetscSplitOwnership line 80 /home/fghoussen/Documents/ >> INRIA/petsc-3.7.6/src/sys/utils/psplit.c >> [0]PETSC ERROR: [0] PetscLayoutSetUp line 129 /home/fghoussen/Documents/ >> INRIA/petsc-3.7.6/src/vec/is/utils/pmap.c >> [0]PETSC ERROR: [0] MatSetLocalToGlobalMapping_IS line 628 >> /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c >> [0]PETSC ERROR: [0] MatSetLocalToGlobalMapping line 1899 >> /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/interface/matrix.c >> [0]PETSC ERROR: [0] MatCreateIS line 986 /home/fghoussen/Documents/ >> INRIA/petsc-3.7.6/src/mat/impls/is/matis.c >> >> >> >> ------------------------------ >> >> *De: *"Stefano Zampini" >> *?: *"Matthew Knepley" >> *Cc: *"Franck Houssen" , "PETSc" < >> petsc-users at mcs.anl.gov>, "PETSc" >> *Envoy?: *Dimanche 21 Mai 2017 23:02:37 >> *Objet: *Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS >> matrix and a global vector ? >> >> Franck, >> >> PETSc takes care of doing the matrix-vector multiplication properly using >> MatIS. As Matt said, the layout of the vectors is the usual parallel >> layout. >> The local sizes of the MatIS matrix (i.e. the local size of the left and >> right vectors used in MatMult) are not the sizes of the local subdomain >> matrices in MatIS. 
>> >> >> On May 21, 2017, at 6:47 PM, Matthew Knepley wrote: >> >> On Sun, May 21, 2017 at 11:26 AM, Franck Houssen > > wrote: >> >>> Using PETSc MatIS, how to matmult a global IS matrix and a global vector >>> ? Example is attached : I don't get what I expect that is a vector such >>> that proc0 = [1, 2] and proc1 = [2, 1] >>> >> >> 1) I think the global size of your matrix is wrong. You seem to want 3, >> not 4 >> >> 2) Global vectors have a non-overlapping row partition. You might be >> thinking of local vectors >> >> Thanks, >> >> Matt >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> http://www.caam.rice.edu/~mk51/ >> >> >> ------------------------------ >> >> *De: *"Stefano Zampini" >> *?: *"Matthew Knepley" >> *Cc: *"Franck Houssen" , "PETSc" < >> petsc-users at mcs.anl.gov>, "PETSc" >> *Envoy?: *Dimanche 21 Mai 2017 23:02:37 >> *Objet: *Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS >> matrix and a global vector ? >> >> Franck, >> >> PETSc takes care of doing the matrix-vector multiplication properly using >> MatIS. As Matt said, the layout of the vectors is the usual parallel >> layout. >> The local sizes of the MatIS matrix (i.e. the local size of the left and >> right vectors used in MatMult) are not the sizes of the local subdomain >> matrices in MatIS. >> >> >> On May 21, 2017, at 6:47 PM, Matthew Knepley wrote: >> >> On Sun, May 21, 2017 at 11:26 AM, Franck Houssen > > wrote: >> >>> Using PETSc MatIS, how to matmult a global IS matrix and a global vector >>> ? Example is attached : I don't get what I expect that is a vector such >>> that proc0 = [1, 2] and proc1 = [2, 1] >>> >> >> 1) I think the global size of your matrix is wrong. You seem to want 3, >> not 4 >> >> 2) Global vectors have a non-overlapping row partition. You might be >> thinking of local vectors >> >> Thanks, >> >> Matt >> >> >>> Franck >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> http://www.caam.rice.edu/~mk51/ >> >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From franck.houssen at inria.fr Tue May 23 11:51:27 2017 From: franck.houssen at inria.fr (Franck Houssen) Date: Tue, 23 May 2017 18:51:27 +0200 (CEST) Subject: [petsc-users] [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix and a global vector ? In-Reply-To: References: <2012394521.6757315.1495383841678.JavaMail.zimbra@inria.fr> <1564257107.6757440.1495383974167.JavaMail.zimbra@inria.fr> <264DC59D-B914-42E5-9A89-0746F21A37BF@gmail.com> <1392596904.7422896.1495533198072.JavaMail.zimbra@inria.fr> <1784716977.7683031.1495556883254.JavaMail.zimbra@inria.fr> Message-ID: <855172682.7687763.1495558287122.JavaMail.zimbra@inria.fr> Not sure to know what question you're talking about ?!... 
I use MatIS to test some kind of domain decomposition methods. I define my own preconditioner for that: in the apply callback, I need to matmult my (matIS) matrix with the incoming vector. Franck ----- Mail original ----- > De: "Matthew Knepley" > ?: "Franck Houssen" > Cc: "Stefano Zampini" , "PETSc" > , "PETSc" > Envoy?: Mardi 23 Mai 2017 18:46:34 > Objet: Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix > and a global vector ? > On Tue, May 23, 2017 at 11:28 AM, Franck Houssen < franck.houssen at inria.fr > > wrote: > > OK, thanks. This is helpfull... But I really think the doc should be more > > verbose about that: this is really confusing and I didn't find any simple > > example to begin with which make all this even more confusing (personal > > opinion). > > Did you respond to my other question (how are you using them)? That would > help me understand how to phrase it. > Thanks, > Matt > > Franck > > > > De: "Matthew Knepley" < knepley at gmail.com > > > > > > > ?: "Franck Houssen" < franck.houssen at inria.fr > > > > > > > Cc: "Stefano Zampini" < stefano.zampini at gmail.com >, "PETSc" < > > > petsc-users at mcs.anl.gov >, "PETSc" < petsc-dev at mcs.anl.gov > > > > > > > Envoy?: Mardi 23 Mai 2017 13:21:21 > > > > > > Objet: Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS > > > matrix > > > and a global vector ? > > > > > > On Tue, May 23, 2017 at 4:53 AM, Franck Houssen < franck.houssen at inria.fr > > > > > > > wrote: > > > > > > > The first thing I did was to put 3, not 4 : I got an error thrown in > > > > MatCreateIS (see the git diff + stack below). As the error said I used > > > > globalSize = numberOfMPIProcessus * localSize : my understanding is > > > > that, > > > > when using MatIS, the global size needs to be the sum of all local > > > > sizes. > > > > Correct ? > > > > > > > > > No. MatIS means that the matrix is not assembled. The easiest way (for > > > me) > > > to > > > think of this is that processes do not have > > > > > > to hold full rows. One process can hold part of row i, and another > > > processes > > > can hold another part. However, there are still > > > > > > the same number of global rows. > > > > > > > I have a 3x3 global matrix made of two overlapping 2x2 local matrix (= > > > > diagonal with 1.). Each local matrix correspond to one domain (each > > > > domain > > > > is delegated to one MPI proc, so, I have 2 MPI procs because I have 2 > > > > domains). > > > > > > > > > So the global size is 3. The local size here is not the size of the local > > > IS > > > block, since that is a property only of MatIS. It is the > > > > > > size of the local piece of the vector you multiply. This allows PETSc to > > > understand the parallel layout of the Vec, and how it > > > > > > matched the Mat. > > > > > > This is somewhat confusing because FEM people mean something different by > > > "local" than we do here, and in fact we use this > > > > > > other definition of local when assembling operators. > > > > > > Matt > > > > > > > This is the simplest possible example: I have two 2x2 (local) diag > > > > matrix > > > > that overlap so that the global matrix built from them is 1, 2, 1 on > > > > the > > > > diagonal (local contributions add up in the middle). > > > > > > > > > > I need to MatMult this global matrix with a global vector filled with > > > > 1. 
> > > > > > > > > > Franck > > > > > > > > > > Git diff : > > > > > > > > > > --- a/matISLocalMat.cpp > > > > > > > > > > +++ b/matISLocalMat.cpp > > > > > > > > > > @@ -16,7 +16,7 @@ int main(int argc,char **argv) { > > > > > > > > > > int size = 0; MPI_Comm_size(MPI_COMM_WORLD, &size); if (size != 2) > > > > return > > > > 1; > > > > > > > > > > int rank = 0; MPI_Comm_rank(MPI_COMM_WORLD, &rank); > > > > > > > > > > - PetscInt localSize = 2, globalSize = localSize*2 /*2 MPI*/; > > > > > > > > > > + PetscInt localSize = 2, globalSize = 3; > > > > > > > > > > PetscInt localIdx[2] = {0, 0}; > > > > > > > > > > if (rank == 0) {localIdx[0] = 0; localIdx[1] = 1;} > > > > > > > > > > else {localIdx[0] = 1; localIdx[1] = 2;} > > > > > > > > > > Stack error: > > > > > > > > > > [0]PETSC ERROR: Nonconforming object sizes > > > > > > > > > > [0]PETSC ERROR: Sum of local lengths 4 does not equal global length 3, > > > > my > > > > local length 2 > > > > > > > > > > [0]PETSC ERROR: [0] ISG2LMapApply line 17 > > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/vec/is/utils/isltog.c > > > > > > > > > > [0]PETSC ERROR: [0] MatSetValues_IS line 692 > > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > > > > > > > > > > [0]PETSC ERROR: [0] MatSetValues line 1157 > > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/interface/matrix.c > > > > > > > > > > [0]PETSC ERROR: [0] MatISSetPreallocation_IS line 95 > > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > > > > > > > > > > [0]PETSC ERROR: [0] MatISSetPreallocation line 80 > > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > > > > > > > > > > [0]PETSC ERROR: [0] PetscSplitOwnership line 80 > > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/sys/utils/psplit.c > > > > > > > > > > [0]PETSC ERROR: [0] PetscLayoutSetUp line 129 > > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/vec/is/utils/pmap.c > > > > > > > > > > [0]PETSC ERROR: [0] MatSetLocalToGlobalMapping_IS line 628 > > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > > > > > > > > > > [0]PETSC ERROR: [0] MatSetLocalToGlobalMapping line 1899 > > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/interface/matrix.c > > > > > > > > > > [0]PETSC ERROR: [0] MatCreateIS line 986 > > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > > > > > > > > > > > De: "Stefano Zampini" < stefano.zampini at gmail.com > > > > > > > > > > > > > > > > ?: "Matthew Knepley" < knepley at gmail.com > > > > > > > > > > > > > > > > Cc: "Franck Houssen" < franck.houssen at inria.fr >, "PETSc" < > > > > > petsc-users at mcs.anl.gov >, "PETSc" < petsc-dev at mcs.anl.gov > > > > > > > > > > > > > > > > Envoy?: Dimanche 21 Mai 2017 23:02:37 > > > > > > > > > > > > > > > Objet: Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS > > > > > matrix > > > > > and a global vector ? > > > > > > > > > > > > > > > Franck, > > > > > > > > > > > > > > > PETSc takes care of doing the matrix-vector multiplication properly > > > > > using > > > > > MatIS. As Matt said, the layout of the vectors is the usual parallel > > > > > layout. > > > > > > > > > > > > > > > The local sizes of the MatIS matrix (i.e. the local size of the left > > > > > and > > > > > right vectors used in MatMult) are not the sizes of the local > > > > > subdomain > > > > > matrices in MatIS. 
> > > > > > > > > > > > > > > > On May 21, 2017, at 6:47 PM, Matthew Knepley < knepley at gmail.com > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > On Sun, May 21, 2017 at 11:26 AM, Franck Houssen < > > > > > > franck.houssen at inria.fr > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > Using PETSc MatIS, how to matmult a global IS matrix and a global > > > > > > > vector > > > > > > > ? > > > > > > > Example is attached : I don't get what I expect that is a vector > > > > > > > such > > > > > > > that > > > > > > > proc0 = [1, 2] and proc1 = [2, 1] > > > > > > > > > > > > > > > > > > > > > > > > > > > 1) I think the global size of your matrix is wrong. You seem to > > > > > > want > > > > > > 3, > > > > > > not > > > > > > 4 > > > > > > > > > > > > > > > > > > > > > 2) Global vectors have a non-overlapping row partition. You might > > > > > > be > > > > > > thinking > > > > > > of local vectors > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > > > > Matt > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > > > > > > What most experimenters take for granted before they begin their > > > > > > experiments > > > > > > is infinitely more interesting than any results to which their > > > > > > experiments > > > > > > lead. > > > > > > > > > > > > > > > > > > > > > -- Norbert Wiener > > > > > > > > > > > > > > > > > > > > > http://www.caam.rice.edu/~mk51/ > > > > > > > > > > > > > > > > > > > > De: "Stefano Zampini" < stefano.zampini at gmail.com > > > > > > > > > > > > > > > > ?: "Matthew Knepley" < knepley at gmail.com > > > > > > > > > > > > > > > > Cc: "Franck Houssen" < franck.houssen at inria.fr >, "PETSc" < > > > > > petsc-users at mcs.anl.gov >, "PETSc" < petsc-dev at mcs.anl.gov > > > > > > > > > > > > > > > > Envoy?: Dimanche 21 Mai 2017 23:02:37 > > > > > > > > > > > > > > > Objet: Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS > > > > > matrix > > > > > and a global vector ? > > > > > > > > > > > > > > > Franck, > > > > > > > > > > > > > > > PETSc takes care of doing the matrix-vector multiplication properly > > > > > using > > > > > MatIS. As Matt said, the layout of the vectors is the usual parallel > > > > > layout. > > > > > > > > > > > > > > > The local sizes of the MatIS matrix (i.e. the local size of the left > > > > > and > > > > > right vectors used in MatMult) are not the sizes of the local > > > > > subdomain > > > > > matrices in MatIS. > > > > > > > > > > > > > > > > On May 21, 2017, at 6:47 PM, Matthew Knepley < knepley at gmail.com > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > On Sun, May 21, 2017 at 11:26 AM, Franck Houssen < > > > > > > franck.houssen at inria.fr > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > Using PETSc MatIS, how to matmult a global IS matrix and a global > > > > > > > vector > > > > > > > ? > > > > > > > Example is attached : I don't get what I expect that is a vector > > > > > > > such > > > > > > > that > > > > > > > proc0 = [1, 2] and proc1 = [2, 1] > > > > > > > > > > > > > > > > > > > > > > > > > > > 1) I think the global size of your matrix is wrong. You seem to > > > > > > want > > > > > > 3, > > > > > > not > > > > > > 4 > > > > > > > > > > > > > > > > > > > > > 2) Global vectors have a non-overlapping row partition. 
You might > > > > > > be > > > > > > thinking > > > > > > of local vectors > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > > > > Matt > > > > > > > > > > > > > > > > > > > > > > Franck > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > > > > > > What most experimenters take for granted before they begin their > > > > > > experiments > > > > > > is infinitely more interesting than any results to which their > > > > > > experiments > > > > > > lead. > > > > > > > > > > > > > > > > > > > > > -- Norbert Wiener > > > > > > > > > > > > > > > > > > > > > http://www.caam.rice.edu/~mk51/ > > > > > > > > > > > > > > > > > > -- > > > > > > What most experimenters take for granted before they begin their > > > experiments > > > is infinitely more interesting than any results to which their > > > experiments > > > lead. > > > > > > -- Norbert Wiener > > > > > > http://www.caam.rice.edu/~mk51/ > > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 23 12:02:28 2017 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 23 May 2017 12:02:28 -0500 Subject: [petsc-users] [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix and a global vector ? In-Reply-To: <855172682.7687763.1495558287122.JavaMail.zimbra@inria.fr> References: <2012394521.6757315.1495383841678.JavaMail.zimbra@inria.fr> <1564257107.6757440.1495383974167.JavaMail.zimbra@inria.fr> <264DC59D-B914-42E5-9A89-0746F21A37BF@gmail.com> <1392596904.7422896.1495533198072.JavaMail.zimbra@inria.fr> <1784716977.7683031.1495556883254.JavaMail.zimbra@inria.fr> <855172682.7687763.1495558287122.JavaMail.zimbra@inria.fr> Message-ID: On Tue, May 23, 2017 at 11:51 AM, Franck Houssen wrote: > Not sure to know what question you're talking about ?!... > I use MatIS to test some kind of domain decomposition methods. I define my > own preconditioner for that: in the apply callback, I need to matmult my > (matIS) matrix with the incoming vector. > Okay. I will create an example using your suggestion. Thanks, Matt > Franck > > ------------------------------ > > *De: *"Matthew Knepley" > *?: *"Franck Houssen" > *Cc: *"Stefano Zampini" , "PETSc" < > petsc-users at mcs.anl.gov>, "PETSc" > *Envoy?: *Mardi 23 Mai 2017 18:46:34 > *Objet: *Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS > matrix and a global vector ? > > On Tue, May 23, 2017 at 11:28 AM, Franck Houssen > wrote: > >> OK, thanks. This is helpfull... But I really think the doc should be more >> verbose about that: this is really confusing and I didn't find any simple >> example to begin with which make all this even more confusing (personal >> opinion). >> > > Did you respond to my other question (how are you using them)? That would > help me understand how to phrase it. > > Thanks, > > Matt > > >> Franck >> >> >> ------------------------------ >> >> *De: *"Matthew Knepley" >> *?: *"Franck Houssen" >> *Cc: *"Stefano Zampini" , "PETSc" < >> petsc-users at mcs.anl.gov>, "PETSc" >> *Envoy?: *Mardi 23 Mai 2017 13:21:21 >> *Objet: *Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS >> matrix and a global vector ? 
>> >> On Tue, May 23, 2017 at 4:53 AM, Franck Houssen >> wrote: >> >>> The first thing I did was to put 3, not 4 : I got an error thrown in >>> MatCreateIS (see the git diff + stack below). As the error said I used >>> globalSize = numberOfMPIProcessus * localSize : my understanding is that, >>> when using MatIS, the global size needs to be the sum of all local sizes. >>> Correct ? >>> >> >> No. MatIS means that the matrix is not assembled. The easiest way (for >> me) to think of this is that processes do not have >> to hold full rows. One process can hold part of row i, and another >> processes can hold another part. However, there are still >> the same number of global rows. >> >> I have a 3x3 global matrix made of two overlapping 2x2 local matrix (= >>> diagonal with 1.). Each local matrix correspond to one domain (each domain >>> is delegated to one MPI proc, so, I have 2 MPI procs because I have 2 >>> domains). >>> >> >> So the global size is 3. The local size here is not the size of the local >> IS block, since that is a property only of MatIS. It is the >> size of the local piece of the vector you multiply. This allows PETSc to >> understand the parallel layout of the Vec, and how it >> matched the Mat. >> >> This is somewhat confusing because FEM people mean something different by >> "local" than we do here, and in fact we use this >> other definition of local when assembling operators. >> >> Matt >> >> >>> This is the simplest possible example: I have two 2x2 (local) diag >>> matrix that overlap so that the global matrix built from them is 1, 2, 1 on >>> the diagonal (local contributions add up in the middle). >>> I need to MatMult this global matrix with a global vector filled with 1. >>> >>> Franck >>> >>> Git diff : >>> >>> --- a/matISLocalMat.cpp >>> +++ b/matISLocalMat.cpp >>> @@ -16,7 +16,7 @@ int main(int argc,char **argv) { >>> int size = 0; MPI_Comm_size(MPI_COMM_WORLD, &size); if (size != 2) >>> return 1; >>> int rank = 0; MPI_Comm_rank(MPI_COMM_WORLD, &rank); >>> >>> - PetscInt localSize = 2, globalSize = localSize*2 /*2 MPI*/; >>> + PetscInt localSize = 2, globalSize = 3; >>> PetscInt localIdx[2] = {0, 0}; >>> if (rank == 0) {localIdx[0] = 0; localIdx[1] = 1;} >>> else {localIdx[0] = 1; localIdx[1] = 2;} >>> >>> >>> >>> Stack error: >>> >>> [0]PETSC ERROR: Nonconforming object sizes >>> [0]PETSC ERROR: Sum of local lengths 4 does not equal global length 3, >>> my local length 2 >>> [0]PETSC ERROR: [0] ISG2LMapApply line 17 /home/fghoussen/Documents/ >>> INRIA/petsc-3.7.6/src/vec/is/utils/isltog.c >>> [0]PETSC ERROR: [0] MatSetValues_IS line 692 /home/fghoussen/Documents/ >>> INRIA/petsc-3.7.6/src/mat/impls/is/matis.c >>> [0]PETSC ERROR: [0] MatSetValues line 1157 /home/fghoussen/Documents/ >>> INRIA/petsc-3.7.6/src/mat/interface/matrix.c >>> [0]PETSC ERROR: [0] MatISSetPreallocation_IS line 95 >>> /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c >>> [0]PETSC ERROR: [0] MatISSetPreallocation line 80 >>> /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c >>> [0]PETSC ERROR: [0] PetscSplitOwnership line 80 >>> /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/sys/utils/psplit.c >>> [0]PETSC ERROR: [0] PetscLayoutSetUp line 129 /home/fghoussen/Documents/ >>> INRIA/petsc-3.7.6/src/vec/is/utils/pmap.c >>> [0]PETSC ERROR: [0] MatSetLocalToGlobalMapping_IS line 628 >>> /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c >>> [0]PETSC ERROR: [0] MatSetLocalToGlobalMapping line 1899 >>> 
/home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/interface/matrix.c >>> [0]PETSC ERROR: [0] MatCreateIS line 986 /home/fghoussen/Documents/ >>> INRIA/petsc-3.7.6/src/mat/impls/is/matis.c >>> >>> >>> >>> ------------------------------ >>> >>> *De: *"Stefano Zampini" >>> *?: *"Matthew Knepley" >>> *Cc: *"Franck Houssen" , "PETSc" < >>> petsc-users at mcs.anl.gov>, "PETSc" >>> *Envoy?: *Dimanche 21 Mai 2017 23:02:37 >>> *Objet: *Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS >>> matrix and a global vector ? >>> >>> Franck, >>> >>> PETSc takes care of doing the matrix-vector multiplication properly >>> using MatIS. As Matt said, the layout of the vectors is the usual parallel >>> layout. >>> The local sizes of the MatIS matrix (i.e. the local size of the left and >>> right vectors used in MatMult) are not the sizes of the local subdomain >>> matrices in MatIS. >>> >>> >>> On May 21, 2017, at 6:47 PM, Matthew Knepley wrote: >>> >>> On Sun, May 21, 2017 at 11:26 AM, Franck Houssen < >>> franck.houssen at inria.fr> wrote: >>> >>>> Using PETSc MatIS, how to matmult a global IS matrix and a global >>>> vector ? Example is attached : I don't get what I expect that is a vector >>>> such that proc0 = [1, 2] and proc1 = [2, 1] >>>> >>> >>> 1) I think the global size of your matrix is wrong. You seem to want 3, >>> not 4 >>> >>> 2) Global vectors have a non-overlapping row partition. You might be >>> thinking of local vectors >>> >>> Thanks, >>> >>> Matt >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> http://www.caam.rice.edu/~mk51/ >>> >>> >>> ------------------------------ >>> >>> *De: *"Stefano Zampini" >>> *?: *"Matthew Knepley" >>> *Cc: *"Franck Houssen" , "PETSc" < >>> petsc-users at mcs.anl.gov>, "PETSc" >>> *Envoy?: *Dimanche 21 Mai 2017 23:02:37 >>> *Objet: *Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS >>> matrix and a global vector ? >>> >>> Franck, >>> >>> PETSc takes care of doing the matrix-vector multiplication properly >>> using MatIS. As Matt said, the layout of the vectors is the usual parallel >>> layout. >>> The local sizes of the MatIS matrix (i.e. the local size of the left and >>> right vectors used in MatMult) are not the sizes of the local subdomain >>> matrices in MatIS. >>> >>> >>> On May 21, 2017, at 6:47 PM, Matthew Knepley wrote: >>> >>> On Sun, May 21, 2017 at 11:26 AM, Franck Houssen < >>> franck.houssen at inria.fr> wrote: >>> >>>> Using PETSc MatIS, how to matmult a global IS matrix and a global >>>> vector ? Example is attached : I don't get what I expect that is a vector >>>> such that proc0 = [1, 2] and proc1 = [2, 1] >>>> >>> >>> 1) I think the global size of your matrix is wrong. You seem to want 3, >>> not 4 >>> >>> 2) Global vectors have a non-overlapping row partition. You might be >>> thinking of local vectors >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Franck >>>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> http://www.caam.rice.edu/~mk51/ >>> >>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> >> http://www.caam.rice.edu/~mk51/ >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From friedenhe at gmail.com Tue May 23 12:09:05 2017 From: friedenhe at gmail.com (Ping He) Date: Tue, 23 May 2017 13:09:05 -0400 Subject: [petsc-users] How to manually set the matrix-free differencing parameter h? Message-ID: <59246CB1.4020000@gmail.com> Hi, I am using PETSc-SNES matrix free approach, and I would like to know how to manually set the differencing parameter h. I tried to use the SNESDefaultMatrixFreeSetParameters2 function but I got an error when compiling: ?SNESDefaultMatrixFreeSetParameters2? was not declared in this scope. Thanks very much in advance. Regards, Ping -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue May 23 13:02:53 2017 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 23 May 2017 13:02:53 -0500 Subject: [petsc-users] How to manually set the matrix-free differencing parameter h? In-Reply-To: <59246CB1.4020000@gmail.com> References: <59246CB1.4020000@gmail.com> Message-ID: <6D3FA436-3969-45A8-ADAD-AAEDE2FFC891@mcs.anl.gov> That's not really the right function; you can use MatMFFDSetFunctionError(), MatMFFDSetType(), MatMFFDSetPeriod() to set the available parameters. Barry > On May 23, 2017, at 12:09 PM, Ping He wrote: > > Hi, > > I am using PETSc-SNES matrix free approach, and I would like to know how to manually set the differencing parameter h. I tried to use the SNESDefaultMatrixFreeSetParameters2 function but I got an error when compiling: ?SNESDefaultMatrixFreeSetParameters2? was not declared in this scope. > > Thanks very much in advance. > > Regards, > Ping From stefano.zampini at gmail.com Tue May 23 13:23:49 2017 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Tue, 23 May 2017 20:23:49 +0200 Subject: [petsc-users] [petsc-dev] Using PETSc MatIS, how to get local matrix (= one domain) before and after assembly ? In-Reply-To: <740691579.7684644.1495557293858.JavaMail.zimbra@inria.fr> References: <867421313.6757137.1495383596545.JavaMail.zimbra@inria.fr> <1253777447.6757298.1495383780337.JavaMail.zimbra@inria.fr> <2033509705.7414108.1495532492501.JavaMail.zimbra@inria.fr> <740691579.7684644.1495557293858.JavaMail.zimbra@inria.fr> Message-ID: <9EFA5BCF-FDD3-45FA-A41A-6AA304D58C74@gmail.com> > On May 23, 2017, at 6:34 PM, Franck Houssen wrote: > > OK. I am supposed to destroy the matrix returned by MatISGetMPIXAIJ ? Yes > Also, my example still not get the final assembled local matrix (the MatCreateSubMatrix returns an empty matrix) but as far as I understand my (global) index set is OK: what did I miss ? I really doubt you can use the example you have sent. It doesn?t compile, as MatCreateSubMatrix needs an extra argument. Attached a modified version that does what I guess is what you are looking for (sequential Dirichlet problems on the subdomains). 
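(The attachment is scrubbed in this archive. As a rough sketch of the flow described here -- using the petsc-3.7.x names, where MatGetSubMatrices is the spelling of what petsc-dev calls MatCreateSubMatrices, and assuming localIdx/nLocalIdx hold this rank's sorted global subdomain indices:)

    Mat B, *subA;
    IS  rows;
    MatISGetMPIXAIJ(A, MAT_INITIAL_MATRIX, &B);                        /* collective: assembled MPIAIJ operator */
    ISCreateGeneral(PETSC_COMM_SELF, nLocalIdx, localIdx, PETSC_COPY_VALUES, &rows);
    MatGetSubMatrices(B, 1, &rows, &rows, MAT_INITIAL_MATRIX, &subA);  /* collective; subA[0] is this rank's
                                                                          sequential assembled block,
                                                                          e.g. diag 2, 1 on rank 1 */
    /* ... use subA[0] as the local "Dirichlet" matrix ... */
    MatDestroyMatrices(1, &subA);
    ISDestroy(&rows);
    MatDestroy(&B);   /* the matrix returned by MatISGetMPIXAIJ must be destroyed by the caller */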
> > Franck > > > De: "Stefano Zampini" > ?: "Franck Houssen" > Cc: "petsc-dev" , "PETSc users list" , "petsc-maint" > Envoy?: Mardi 23 Mai 2017 13:16:18 > Objet: Re: [petsc-dev] Using PETSc MatIS, how to get local matrix (= one domain) before and after assembly ? > > MatISGetMPIXAIJ is collective, as it assembles the global operator. To get the matrices you are looking for, you should call MatCreateSubMatrix on the assembled global operator, with the global indices representing the subdomain problem. Each process needs to call both functions > > Stefano > > Il 23 Mag 2017 11:41, "Franck Houssen" > ha scritto: > I have a 3x3 global matrix made of two overlapping 2x2 local matrix (= diagonal with 1.). Each local matrix correspond to one domain (each domain is delegated to one MPI proc, so, I have 2 MPI procs because I have 2 domains). > This is the simplest possible example: I have two 2x2 (local) diag matrix that overlap so that the global matrix built from them is 1, 2, 1 on the diagonal (local contributions add up in the middle). > > Now, I need for each MPI proc to get the assembled local matrix (sometimes called the dirichlet matrix) : this is a local matrix (sequential - not distributed with MPI) that accounts for contribution of neighboring domains (MPI proc). > > How to get the local assembled matrix ? MatGetLocalSubMatrix does not work (throw error - see example attached). MatGetSubMatrix returns a MPI distributed matrix, not a local (sequential) one. > My understanding is that MatISGetMPIXAIJ should return a local matrix (sequential AIJ matrix) : the MPI in the name recall that you get the assembled matrix (with contributions from the shared border) from the other MPI processus. Correct ? In my simple example, I replaced MatGetLocalSubMatrix with MatISGetMPIXAIJ : I get a deadlock which was surprising to me... Is MatISGetMPIXAIJ a collective call ? > Supposing this is a collective call (and that point 1 is not correct), I ride up MatISGetMPIXAIJ before the "if (rank > 0)" : I don't deadlock now, but it seems I get a global matrix which is not the assembled local matrix I am looking for. > I am supposed to destroy the matrix returned by MatISGetMPIXAIJ ? (I believe yes - not sure as AFAIU wording should associate Destroy methods to Create methods) > Franck > > The git diff illustrate modifications I tried to add to the initial file attached to this thread: > --- a/matISLocalMat.cpp > +++ b/matISLocalMat.cpp > @@ -31,6 +31,8 @@ int main(int argc,char **argv) { > MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY); > MatView(A, PETSC_VIEWER_STDOUT_WORLD); PetscViewerFlush(PETSC_VIEWER_STDOUT_WORLD); // Diag: 1, 2, 1 > > + Mat assembledLocalMat; > + MatISGetMPIXAIJ(A, MAT_INITIAL_MATRIX, &assembledLocalMat); > if (rank > 0) { // Do not pollute stdout: print only 1 proc > std::cout << std::endl << "non assembled local matrix:" << std::endl << std::endl; > Mat nonAssembledLocalMat; > @@ -38,11 +40,10 @@ int main(int argc,char **argv) { > MatView(nonAssembledLocalMat, PETSC_VIEWER_STDOUT_SELF); // Diag: 1, 1 > > std::cout << std::endl << "assembled local matrix:" << std::endl << std::endl; > - Mat assembledLocalMat; > - IS is; ISCreateGeneral(PETSC_COMM_SELF, localSize, localIdx, PETSC_COPY_VALUES, &is); > - MatGetLocalSubMatrix(A, is, is, &assembledLocalMat); // KO ?!... 
> - MatView(assembledLocalMat, PETSC_VIEWER_STDOUT_SELF); // Would like to get => Diag: 2, 1 > + //IS is; ISCreateGeneral(PETSC_COMM_SELF, localSize, localIdx, PETSC_COPY_VALUES, &is); > + //MatGetLocalSubMatrix(A, is, is, &assembledLocalMat); // KO ?!... > } > + MatView(assembledLocalMat, PETSC_VIEWER_STDOUT_WORLD); // Would like to get => Diag: 2, 1 > > > De: "Stefano Zampini" > > ?: "petsc-maint" > > Cc: "petsc-dev" >, "PETSc users list" >, "Franck Houssen" > > Envoy?: Dimanche 21 Mai 2017 22:51:34 > Objet: Re: [petsc-dev] Using PETSc MatIS, how to get local matrix (= one domain) before and after assembly ? > > To assemble the operator in aij format, use > MatISGetMPIXAIJ > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatISGetMPIXAIJ.html > > Il 21 Mag 2017 18:43, "Matthew Knepley" > ha scritto: > On Sun, May 21, 2017 at 11:23 AM, Franck Houssen > wrote: > I have a 3x3 global matrix is built (diag: 1, 2, 1): it's made of 2 overlapping 2x2 local matrix (diag: 1, 1). > Getting non assembled local matrix is OK with MatISGetLocalMat. > How to get assembled local matrix (initial local matrix + neigbhor contributions on the borders) ? (expected result is diag: 2, 1) > > You can always use > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetSubMatrix.html > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetSubMatrices.html > > to get copies, but if you just want to build things, you can use > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetLocalSubMatrix.html > > Thanks, > > Matt > > Franck > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: matISLocalMat.cpp Type: application/octet-stream Size: 3788 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From luvsharma11 at gmail.com Tue May 23 14:05:08 2017 From: luvsharma11 at gmail.com (Luv Sharma) Date: Tue, 23 May 2017 21:05:08 +0200 Subject: [petsc-users] matshell for spectral methods in fortran Message-ID: Dear PETSc team, I am working on a code which solves mechanical equilibrium using spectral methods. I want to make use of the matshell to get the action J*v. I have been able to successfully implement it using petsc4py. But having difficulties to get it working in a fortran code. I am using petsc-3.7.6. Below is a stripped down version of the existing fortran code (module). Can you please help me in figuring out how the right way to do it a code with following structure? !-------------------------------------------------------------------------------------------------- module spectral_mech_basic implicit none private #include ! *PETSc data here* DM .. SNES .. .. contains !-------------------------------------------------------------------------------------------------- subroutine basicPETSc_init external :: & *petsc functions here* ! 
initialize solver specific parts of PETSc call SNESCreate(PETSC_COMM_WORLD,snes,ierr); CHKERRQ(ierr) call SNESSetOptionsPrefix(snes,'mech_',ierr);CHKERRQ(ierr) call DMDACreate3d(PETSC_COMM_WORLD, & DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, & DMDA_STENCIL_BOX, & grid(1),grid(2),grid(3), & 1 , 1, worldsize, & 9, 0, & grid(1),grid(2),localK, & da,ierr) CHKERRQ(ierr) call SNESSetDM(snes,da,ierr); CHKERRQ(ierr) call DMCreateGlobalVector(da,solution_vec,ierr); CHKERRQ(ierr) call DMDASNESSetFunctionLocal(da,INSERT_VALUES,BasicPETSC_formResidual,PETSC_NULL_OBJECT,ierr) CHKERRQ(ierr) call SNESSetDM(snes,da,ierr); CHKERRQ(ierr) call SNESGetKSP(snes,ksp,ierr); CHKERRQ(ierr) call KSPGetPC(ksp,pc,ierr); CHKERRQ(ierr) call PCSetType(pc,PCNONE,ierr); CHKERRQ(ierr) call SNESSetFromOptions(snes,ierr); CHKERRQ(ierr) ! init fields call DMDAVecGetArrayF90(da,solution_vec,F,ierr); CHKERRQ(ierr) call DMDAVecRestoreArrayF90(da,solution_vec,F,ierr); CHKERRQ(ierr) end subroutine basicPETSc_init !-------------------------------------------------------------------------------------------------- type(tSolutionState) function & basicPETSc_solution(incInfoIn,timeinc,timeinc_old,stress_BC,rotation_BC) implicit none ! PETSc Data PetscErrorCode :: ierr SNESConvergedReason :: reason external :: & SNESSolve, & ! solve BVP call SNESSolve(snes,PETSC_NULL_OBJECT,solution_vec,ierr) CHKERRQ(ierr) end function BasicPETSc_solution !-------------------------------------------------------------------------------------------------- !> @brief forms the basic residual vector !-------------------------------------------------------------------------------------------------- subroutine BasicPETSC_formResidual(in,x_scal,f_scal,dummy,ierr) implicit none DMDALocalInfo, dimension(DMDA_LOCAL_INFO_SIZE) :: & in PetscScalar, dimension(3,3, & XG_RANGE,YG_RANGE,ZG_RANGE), intent(in) :: & x_scal PetscScalar, dimension(3,3, & X_RANGE,Y_RANGE,Z_RANGE), intent(out) :: & f_scal ! constructing residual ?.. ?. f_scal = tensorField_real(1:3,1:3,1:grid(1),1:grid(2),1:grid3) end subroutine BasicPETSc_formResidual !-------------------------------------------------------------------------------------------------- end module spectral_mech_basic !-------------------------------------------------------------------------------------------------- Best regards, Luv > On 3 Nov 2016, at 01:17, Barry Smith wrote: > > > Is anyone away of cases where PETSc has been used with spectral methods? > > Thanks > > Barry > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Tue May 23 14:09:16 2017 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Tue, 23 May 2017 21:09:16 +0200 Subject: [petsc-users] [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix and a global vector ? In-Reply-To: <1784716977.7683031.1495556883254.JavaMail.zimbra@inria.fr> References: <2012394521.6757315.1495383841678.JavaMail.zimbra@inria.fr> <1564257107.6757440.1495383974167.JavaMail.zimbra@inria.fr> <264DC59D-B914-42E5-9A89-0746F21A37BF@gmail.com> <1392596904.7422896.1495533198072.JavaMail.zimbra@inria.fr> <1784716977.7683031.1495556883254.JavaMail.zimbra@inria.fr> Message-ID: Il 23 Mag 2017 6:28 PM, "Franck Houssen" ha scritto: OK, thanks. This is helpfull... But I really think the doc should be more verbose about that: this is really confusing and I didn't find any simple example to begin with which make all this even more confusing (personal opinion). 
The man page of MatCreateIS is clear to me http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/Mat/MatCreateIS.html#MatCreateIS Franck ------------------------------ *De: *"Matthew Knepley" *?: *"Franck Houssen" *Cc: *"Stefano Zampini" , "PETSc" < petsc-users at mcs.anl.gov>, "PETSc" *Envoy?: *Mardi 23 Mai 2017 13:21:21 *Objet: *Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix and a global vector ? On Tue, May 23, 2017 at 4:53 AM, Franck Houssen wrote: > The first thing I did was to put 3, not 4 : I got an error thrown in > MatCreateIS (see the git diff + stack below). As the error said I used > globalSize = numberOfMPIProcessus * localSize : my understanding is that, > when using MatIS, the global size needs to be the sum of all local sizes. > Correct ? > No. MatIS means that the matrix is not assembled. The easiest way (for me) to think of this is that processes do not have to hold full rows. One process can hold part of row i, and another processes can hold another part. However, there are still the same number of global rows. I have a 3x3 global matrix made of two overlapping 2x2 local matrix (= > diagonal with 1.). Each local matrix correspond to one domain (each domain > is delegated to one MPI proc, so, I have 2 MPI procs because I have 2 > domains). > So the global size is 3. The local size here is not the size of the local IS block, since that is a property only of MatIS. It is the size of the local piece of the vector you multiply. This allows PETSc to understand the parallel layout of the Vec, and how it matched the Mat. This is somewhat confusing because FEM people mean something different by "local" than we do here, and in fact we use this other definition of local when assembling operators. Matt > This is the simplest possible example: I have two 2x2 (local) diag matrix > that overlap so that the global matrix built from them is 1, 2, 1 on the > diagonal (local contributions add up in the middle). > I need to MatMult this global matrix with a global vector filled with 1. 
> > Franck > > Git diff : > > --- a/matISLocalMat.cpp > +++ b/matISLocalMat.cpp > @@ -16,7 +16,7 @@ int main(int argc,char **argv) { > int size = 0; MPI_Comm_size(MPI_COMM_WORLD, &size); if (size != 2) > return 1; > int rank = 0; MPI_Comm_rank(MPI_COMM_WORLD, &rank); > > - PetscInt localSize = 2, globalSize = localSize*2 /*2 MPI*/; > + PetscInt localSize = 2, globalSize = 3; > PetscInt localIdx[2] = {0, 0}; > if (rank == 0) {localIdx[0] = 0; localIdx[1] = 1;} > else {localIdx[0] = 1; localIdx[1] = 2;} > > > > Stack error: > > [0]PETSC ERROR: Nonconforming object sizes > [0]PETSC ERROR: Sum of local lengths 4 does not equal global length 3, my > local length 2 > [0]PETSC ERROR: [0] ISG2LMapApply line 17 /home/fghoussen/Documents/ > INRIA/petsc-3.7.6/src/vec/is/utils/isltog.c > [0]PETSC ERROR: [0] MatSetValues_IS line 692 /home/fghoussen/Documents/ > INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > [0]PETSC ERROR: [0] MatSetValues line 1157 /home/fghoussen/Documents/ > INRIA/petsc-3.7.6/src/mat/interface/matrix.c > [0]PETSC ERROR: [0] MatISSetPreallocation_IS line 95 > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > [0]PETSC ERROR: [0] MatISSetPreallocation line 80 > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > [0]PETSC ERROR: [0] PetscSplitOwnership line 80 /home/fghoussen/Documents/ > INRIA/petsc-3.7.6/src/sys/utils/psplit.c > [0]PETSC ERROR: [0] PetscLayoutSetUp line 129 /home/fghoussen/Documents/ > INRIA/petsc-3.7.6/src/vec/is/utils/pmap.c > [0]PETSC ERROR: [0] MatSetLocalToGlobalMapping_IS line 628 > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > [0]PETSC ERROR: [0] MatSetLocalToGlobalMapping line 1899 > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/interface/matrix.c > [0]PETSC ERROR: [0] MatCreateIS line 986 /home/fghoussen/Documents/ > INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > > > > ------------------------------ > > *De: *"Stefano Zampini" > *?: *"Matthew Knepley" > *Cc: *"Franck Houssen" , "PETSc" < > petsc-users at mcs.anl.gov>, "PETSc" > *Envoy?: *Dimanche 21 Mai 2017 23:02:37 > *Objet: *Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS > matrix and a global vector ? > > Franck, > > PETSc takes care of doing the matrix-vector multiplication properly using > MatIS. As Matt said, the layout of the vectors is the usual parallel > layout. > The local sizes of the MatIS matrix (i.e. the local size of the left and > right vectors used in MatMult) are not the sizes of the local subdomain > matrices in MatIS. > > > On May 21, 2017, at 6:47 PM, Matthew Knepley wrote: > > On Sun, May 21, 2017 at 11:26 AM, Franck Houssen > wrote: > >> Using PETSc MatIS, how to matmult a global IS matrix and a global vector >> ? Example is attached : I don't get what I expect that is a vector such >> that proc0 = [1, 2] and proc1 = [2, 1] >> > > 1) I think the global size of your matrix is wrong. You seem to want 3, > not 4 > > 2) Global vectors have a non-overlapping row partition. You might be > thinking of local vectors > > Thanks, > > Matt > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ > > > ------------------------------ > > *De: *"Stefano Zampini" > *?: *"Matthew Knepley" > *Cc: *"Franck Houssen" , "PETSc" < > petsc-users at mcs.anl.gov>, "PETSc" > *Envoy?: *Dimanche 21 Mai 2017 23:02:37 > *Objet: *Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS > matrix and a global vector ? > > Franck, > > PETSc takes care of doing the matrix-vector multiplication properly using > MatIS. As Matt said, the layout of the vectors is the usual parallel > layout. > The local sizes of the MatIS matrix (i.e. the local size of the left and > right vectors used in MatMult) are not the sizes of the local subdomain > matrices in MatIS. > > > On May 21, 2017, at 6:47 PM, Matthew Knepley wrote: > > On Sun, May 21, 2017 at 11:26 AM, Franck Houssen > wrote: > >> Using PETSc MatIS, how to matmult a global IS matrix and a global vector >> ? Example is attached : I don't get what I expect that is a vector such >> that proc0 = [1, 2] and proc1 = [2, 1] >> > > 1) I think the global size of your matrix is wrong. You seem to want 3, > not 4 > > 2) Global vectors have a non-overlapping row partition. You might be > thinking of local vectors > > Thanks, > > Matt > > >> Franck >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue May 23 14:16:27 2017 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 23 May 2017 14:16:27 -0500 Subject: [petsc-users] matshell for spectral methods in fortran In-Reply-To: References: Message-ID: <72BF559A-E663-4F04-93A2-222D4AD8A94B@mcs.anl.gov> You didn't include any code related to creating or setting the MATSHELL. What goes wrong with your Fortran code. > On May 23, 2017, at 2:05 PM, Luv Sharma wrote: > > Dear PETSc team, > > I am working on a code which solves mechanical equilibrium using spectral methods. > I want to make use of the matshell to get the action J*v. > > I have been able to successfully implement it using petsc4py. But having difficulties to get it working in a fortran code. > I am using petsc-3.7.6. > > Below is a stripped down version of the existing fortran code (module). Can you please help me in figuring out how the right way to do it a code with following structure? > > !-------------------------------------------------------------------------------------------------- > module spectral_mech_basic > > implicit none > private > #include > > ! *PETSc data here* > DM .. > SNES .. > .. > contains > > !-------------------------------------------------------------------------------------------------- > subroutine basicPETSc_init > > external :: & > *petsc functions here* > > ! 
initialize solver specific parts of PETSc > call SNESCreate(PETSC_COMM_WORLD,snes,ierr); CHKERRQ(ierr) > call SNESSetOptionsPrefix(snes,'mech_',ierr);CHKERRQ(ierr) > call DMDACreate3d(PETSC_COMM_WORLD, & > DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, & > DMDA_STENCIL_BOX, & > grid(1),grid(2),grid(3), & > 1 , 1, worldsize, & > 9, 0, & > grid(1),grid(2),localK, & > da,ierr) > CHKERRQ(ierr) > call SNESSetDM(snes,da,ierr); CHKERRQ(ierr) > call DMCreateGlobalVector(da,solution_vec,ierr); CHKERRQ(ierr) > call DMDASNESSetFunctionLocal(da,INSERT_VALUES,BasicPETSC_formResidual,PETSC_NULL_OBJECT,ierr) > CHKERRQ(ierr) > call SNESSetDM(snes,da,ierr); CHKERRQ(ierr) > call SNESGetKSP(snes,ksp,ierr); CHKERRQ(ierr) > call KSPGetPC(ksp,pc,ierr); CHKERRQ(ierr) > call PCSetType(pc,PCNONE,ierr); CHKERRQ(ierr) > call SNESSetFromOptions(snes,ierr); CHKERRQ(ierr) > > ! init fields > call DMDAVecGetArrayF90(da,solution_vec,F,ierr); CHKERRQ(ierr) > call DMDAVecRestoreArrayF90(da,solution_vec,F,ierr); CHKERRQ(ierr) > > end subroutine basicPETSc_init > !-------------------------------------------------------------------------------------------------- > > type(tSolutionState) function & > basicPETSc_solution(incInfoIn,timeinc,timeinc_old,stress_BC,rotation_BC) > implicit none > ! PETSc Data > PetscErrorCode :: ierr > SNESConvergedReason :: reason > external :: & > SNESSolve, & > ! solve BVP > call SNESSolve(snes,PETSC_NULL_OBJECT,solution_vec,ierr) > CHKERRQ(ierr) > end function BasicPETSc_solution > !-------------------------------------------------------------------------------------------------- > !> @brief forms the basic residual vector > !-------------------------------------------------------------------------------------------------- > subroutine BasicPETSC_formResidual(in,x_scal,f_scal,dummy,ierr) > > implicit none > DMDALocalInfo, dimension(DMDA_LOCAL_INFO_SIZE) :: & > in > PetscScalar, dimension(3,3, & > XG_RANGE,YG_RANGE,ZG_RANGE), intent(in) :: & > x_scal > PetscScalar, dimension(3,3, & > X_RANGE,Y_RANGE,Z_RANGE), intent(out) :: & > f_scal > ! constructing residual > ?.. > ?. > > f_scal = tensorField_real(1:3,1:3,1:grid(1),1:grid(2),1:grid3) > > end subroutine BasicPETSc_formResidual > !-------------------------------------------------------------------------------------------------- > > end module spectral_mech_basic > !-------------------------------------------------------------------------------------------------- > > Best regards, > Luv > >> On 3 Nov 2016, at 01:17, Barry Smith wrote: >> >> >> Is anyone away of cases where PETSc has been used with spectral methods? >> >> Thanks >> >> Barry >> From luvsharma11 at gmail.com Tue May 23 14:36:12 2017 From: luvsharma11 at gmail.com (Luv Sharma) Date: Tue, 23 May 2017 21:36:12 +0200 Subject: [petsc-users] matshell for spectral methods in fortran In-Reply-To: <72BF559A-E663-4F04-93A2-222D4AD8A94B@mcs.anl.gov> References: <72BF559A-E663-4F04-93A2-222D4AD8A94B@mcs.anl.gov> Message-ID: <8B5B69FE-F0A1-4442-9B03-3A9DBA447B2E@gmail.com> Dear Barry, Thanks for your quick reply. I have tried following: !-------------------------------------------------------------------------------------------------- module spectral_mech_basic implicit none private #include ! *PETSc data here* DM .. SNES .. .. contains !-------------------------------------------------------------------------------------------------- subroutine basicPETSc_init external :: & *petsc functions here* ! 
initialize solver specific parts of PETSc call SNESCreate(PETSC_COMM_WORLD,snes,ierr); CHKERRQ(ierr) call SNESSetOptionsPrefix(snes,'mech_',ierr);CHKERRQ(ierr) call DMDACreate3d(PETSC_COMM_WORLD, & DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, & DMDA_STENCIL_BOX, & grid(1),grid(2),grid(3), & 1 , 1, worldsize, & 9, 0, & grid(1),grid(2),localK, & da,ierr) CHKERRQ(ierr) call SNESSetDM(snes,da,ierr); CHKERRQ(ierr) call DMCreateGlobalVector(da,solution_vec,ierr); CHKERRQ(ierr) call DMDASNESSetFunctionLocal(da,INSERT_VALUES,BasicPETSC_formResidual,PETSC_NULL_OBJECT,ierr) CHKERRQ(ierr) !call DMCreateMatrix(da,J_shell,ierr) !CHKERRQ(ierr) !call DMSetMatType(da,MATSHELL,ierr) !CHKERRQ(ierr) !call DMSNESSetJacobianLocal(da,SPEC_mech_formJacobian,PETSC_NULL_OBJECT,ierr) !< function to evaluate stiffness matrix !CHKERRQ(ierr) matsize = 9_pInt*grid(1)*grid(2)*grid(3) call MatCreateShell( PETSC_COMM_WORLD, matsize, matsize, matsize, matsize, PETSC_NULL_OBJECT, J_shell, ierr ) CHKERRQ(ierr) call SNESSetJacobian( snes, J_shell, J_shell, SPEC_mech_formJacobian, PETSC_NULL_OBJECT, ierr) CHKERRQ(ierr) call MatShellSetOperation(J_shell, MATOP_MULT, jac_shell, ierr ) CHKERRQ(ierr) call SNESSetDM(snes,da,ierr); CHKERRQ(ierr) call SNESGetKSP(snes,ksp,ierr); CHKERRQ(ierr) call KSPGetPC(ksp,pc,ierr); CHKERRQ(ierr) call PCSetType(pc,PCNONE,ierr); CHKERRQ(ierr) call SNESSetFromOptions(snes,ierr); CHKERRQ(ierr) ! init fields call DMDAVecGetArrayF90(da,solution_vec,F,ierr); CHKERRQ(ierr) call DMDAVecRestoreArrayF90(da,solution_vec,F,ierr); CHKERRQ(ierr) end subroutine basicPETSc_init !-------------------------------------------------------------------------------------------------- type(tSolutionState) function & basicPETSc_solution(incInfoIn,timeinc,timeinc_old,stress_BC,rotation_BC) implicit none ! PETSc Data PetscErrorCode :: ierr SNESConvergedReason :: reason external :: & SNESSolve, & ! solve BVP call SNESSolve(snes,PETSC_NULL_OBJECT,solution_vec,ierr) CHKERRQ(ierr) end function BasicPETSc_solution !-------------------------------------------------------------------------------------------------- !> @brief forms the basic residual vector !-------------------------------------------------------------------------------------------------- subroutine BasicPETSC_formResidual(in,x_scal,f_scal,dummy,ierr) implicit none DMDALocalInfo, dimension(DMDA_LOCAL_INFO_SIZE) :: & in PetscScalar, dimension(3,3, & XG_RANGE,YG_RANGE,ZG_RANGE), intent(in) :: & x_scal PetscScalar, dimension(3,3, & X_RANGE,Y_RANGE,Z_RANGE), intent(out) :: & f_scal ! constructing residual ?.. ?. f_scal = tensorField_real(1:3,1:3,1:grid(1),1:grid(2),1:grid3) end subroutine BasicPETSc_formResidual !????????????????????????????????????????????????? !> @brief matmult routine !-------------------------------------------------------------------------------------------------- ! a shell jacobian; returns the action J*v subroutine jac_shell(Jshell,v_in,v_out) use math, only: & math_rotate_backward33, & math_transpose33, & math_mul3333xx33 use mesh, only: & grid, & grid3 use spectral_utilities, only: & wgt, & tensorField_real, & utilities_FFTtensorForward, & utilities_fourierGammaConvolution, & utilities_FFTtensorBackward, & Utilities_constitutiveResponse, & Utilities_divergenceRMS implicit none ! DMDALocalInfo, dimension(DMDA_LOCAL_INFO_SIZE) :: & ! 
in PetscScalar, dimension(3,3, & 1000,1,1), intent(in) :: & v_in PetscScalar, dimension(3,3, & 1000,1,1), intent(out) :: & v_out Mat :: Jshell PetscErrorCode :: ierr integer(pInt) :: & i,j,k,e e = 0_pInt tensorField_real = 0.0_pReal print*, SHAPE(v_in) print*, SHAPE(v_out) do k = 1_pInt, grid3; do j = 1_pInt, grid(2); do i = 1_pInt, grid(1) e = e + 1_pInt tensorField_real(1:3,1:3,i,j,k) = j*v enddo; enddo; enddo call utilities_FFTtensorForward() call utilities_fourierGammaConvolution() call utilities_FFTtensorBackward() v_out = tensorField_real(1:3,1:3,1:grid(1),1:grid(2),1:grid3) end subroutine jac_shell subroutine SPEC_mech_formJacobian(snes,xx_local,Jac_pre,Jac,dummy,ierr) implicit none SNES :: snes DM :: dm_local Vec :: x_local, xx_local Mat :: Jac_pre, Jac PetscObject :: dummy PetscErrorCode :: ierr end subroutine SPEC_mech_formJacobian end module spectral_mech_basic !-------------------------------------------------------------------------------------------------- Best regards, Luv > On 23 May 2017, at 21:16, Barry Smith wrote: > > > You didn't include any code related to creating or setting the MATSHELL. What goes wrong with your Fortran code. > > >> On May 23, 2017, at 2:05 PM, Luv Sharma wrote: >> >> Dear PETSc team, >> >> I am working on a code which solves mechanical equilibrium using spectral methods. >> I want to make use of the matshell to get the action J*v. >> >> I have been able to successfully implement it using petsc4py. But having difficulties to get it working in a fortran code. >> I am using petsc-3.7.6. >> >> Below is a stripped down version of the existing fortran code (module). Can you please help me in figuring out how the right way to do it a code with following structure? >> >> !-------------------------------------------------------------------------------------------------- >> module spectral_mech_basic >> >> implicit none >> private >> #include >> >> ! *PETSc data here* >> DM .. >> SNES .. >> .. >> contains >> >> !-------------------------------------------------------------------------------------------------- >> subroutine basicPETSc_init >> >> external :: & >> *petsc functions here* >> >> ! initialize solver specific parts of PETSc >> call SNESCreate(PETSC_COMM_WORLD,snes,ierr); CHKERRQ(ierr) >> call SNESSetOptionsPrefix(snes,'mech_',ierr);CHKERRQ(ierr) >> call DMDACreate3d(PETSC_COMM_WORLD, & >> DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, & >> DMDA_STENCIL_BOX, & >> grid(1),grid(2),grid(3), & >> 1 , 1, worldsize, & >> 9, 0, & >> grid(1),grid(2),localK, & >> da,ierr) >> CHKERRQ(ierr) >> call SNESSetDM(snes,da,ierr); CHKERRQ(ierr) >> call DMCreateGlobalVector(da,solution_vec,ierr); CHKERRQ(ierr) >> call DMDASNESSetFunctionLocal(da,INSERT_VALUES,BasicPETSC_formResidual,PETSC_NULL_OBJECT,ierr) >> CHKERRQ(ierr) >> call SNESSetDM(snes,da,ierr); CHKERRQ(ierr) >> call SNESGetKSP(snes,ksp,ierr); CHKERRQ(ierr) >> call KSPGetPC(ksp,pc,ierr); CHKERRQ(ierr) >> call PCSetType(pc,PCNONE,ierr); CHKERRQ(ierr) >> call SNESSetFromOptions(snes,ierr); CHKERRQ(ierr) >> >> ! init fields >> call DMDAVecGetArrayF90(da,solution_vec,F,ierr); CHKERRQ(ierr) >> call DMDAVecRestoreArrayF90(da,solution_vec,F,ierr); CHKERRQ(ierr) >> >> end subroutine basicPETSc_init >> !-------------------------------------------------------------------------------------------------- >> >> type(tSolutionState) function & >> basicPETSc_solution(incInfoIn,timeinc,timeinc_old,stress_BC,rotation_BC) >> implicit none >> ! 
PETSc Data >> PetscErrorCode :: ierr >> SNESConvergedReason :: reason >> external :: & >> SNESSolve, & >> ! solve BVP >> call SNESSolve(snes,PETSC_NULL_OBJECT,solution_vec,ierr) >> CHKERRQ(ierr) >> end function BasicPETSc_solution >> !-------------------------------------------------------------------------------------------------- >> !> @brief forms the basic residual vector >> !-------------------------------------------------------------------------------------------------- >> subroutine BasicPETSC_formResidual(in,x_scal,f_scal,dummy,ierr) >> >> implicit none >> DMDALocalInfo, dimension(DMDA_LOCAL_INFO_SIZE) :: & >> in >> PetscScalar, dimension(3,3, & >> XG_RANGE,YG_RANGE,ZG_RANGE), intent(in) :: & >> x_scal >> PetscScalar, dimension(3,3, & >> X_RANGE,Y_RANGE,Z_RANGE), intent(out) :: & >> f_scal >> ! constructing residual >> ?.. >> ?. >> >> f_scal = tensorField_real(1:3,1:3,1:grid(1),1:grid(2),1:grid3) >> >> end subroutine BasicPETSc_formResidual >> !-------------------------------------------------------------------------------------------------- >> >> end module spectral_mech_basic >> !-------------------------------------------------------------------------------------------------- >> >> Best regards, >> Luv >> >>> On 3 Nov 2016, at 01:17, Barry Smith wrote: >>> >>> >>> Is anyone away of cases where PETSc has been used with spectral methods? >>> >>> Thanks >>> >>> Barry >>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue May 23 15:20:11 2017 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 23 May 2017 15:20:11 -0500 Subject: [petsc-users] matshell for spectral methods in fortran In-Reply-To: <8B5B69FE-F0A1-4442-9B03-3A9DBA447B2E@gmail.com> References: <72BF559A-E663-4F04-93A2-222D4AD8A94B@mcs.anl.gov> <8B5B69FE-F0A1-4442-9B03-3A9DBA447B2E@gmail.com> Message-ID: <19D5F486-4121-4954-A8C4-072E78CFF0CA@mcs.anl.gov> I cannot easily see why this would or wouldn't work. If you send me the entire code as an attachment and makefile I can try to run it. Barry > On May 23, 2017, at 2:36 PM, Luv Sharma wrote: > > Dear Barry, > > Thanks for your quick reply. > I have tried following: > > !-------------------------------------------------------------------------------------------------- > module spectral_mech_basic > > implicit none > private > #include > > ! *PETSc data here* > DM .. > SNES .. > .. > contains > > !-------------------------------------------------------------------------------------------------- > subroutine basicPETSc_init > > external :: & > *petsc functions here* > > ! 
initialize solver specific parts of PETSc > call SNESCreate(PETSC_COMM_WORLD,snes,ierr); CHKERRQ(ierr) > call SNESSetOptionsPrefix(snes,'mech_',ierr);CHKERRQ(ierr) > call DMDACreate3d(PETSC_COMM_WORLD, & > DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, & > DMDA_STENCIL_BOX, & > grid(1),grid(2),grid(3), & > 1 , 1, worldsize, & > 9, 0, & > grid(1),grid(2),localK, & > da,ierr) > CHKERRQ(ierr) > call SNESSetDM(snes,da,ierr); CHKERRQ(ierr) > call DMCreateGlobalVector(da,solution_vec,ierr); CHKERRQ(ierr) > call DMDASNESSetFunctionLocal(da,INSERT_VALUES,BasicPETSC_formResidual,PETSC_NULL_OBJECT,ierr) > CHKERRQ(ierr) > > > !call DMCreateMatrix(da,J_shell,ierr) > !CHKERRQ(ierr) > !call DMSetMatType(da,MATSHELL,ierr) > !CHKERRQ(ierr) > !call DMSNESSetJacobianLocal(da,SPEC_mech_formJacobian,PETSC_NULL_OBJECT,ierr) !< function to evaluate stiffness matrix > !CHKERRQ(ierr) > > > matsize = 9_pInt*grid(1)*grid(2)*grid(3) > call MatCreateShell( PETSC_COMM_WORLD, matsize, matsize, matsize, matsize, PETSC_NULL_OBJECT, J_shell, ierr ) > CHKERRQ(ierr) > > call SNESSetJacobian( snes, J_shell, J_shell, SPEC_mech_formJacobian, PETSC_NULL_OBJECT, ierr) > CHKERRQ(ierr) > > call MatShellSetOperation(J_shell, MATOP_MULT, jac_shell, ierr ) > CHKERRQ(ierr) > > call SNESSetDM(snes,da,ierr); CHKERRQ(ierr) > call SNESGetKSP(snes,ksp,ierr); CHKERRQ(ierr) > call KSPGetPC(ksp,pc,ierr); CHKERRQ(ierr) > call PCSetType(pc,PCNONE,ierr); CHKERRQ(ierr) > call SNESSetFromOptions(snes,ierr); CHKERRQ(ierr) > > ! init fields > call DMDAVecGetArrayF90(da,solution_vec,F,ierr); CHKERRQ(ierr) > call DMDAVecRestoreArrayF90(da,solution_vec,F,ierr); CHKERRQ(ierr) > > end subroutine basicPETSc_init > !-------------------------------------------------------------------------------------------------- > > type(tSolutionState) function & > basicPETSc_solution(incInfoIn,timeinc,timeinc_old,stress_BC,rotation_BC) > implicit none > ! PETSc Data > PetscErrorCode :: ierr > SNESConvergedReason :: reason > external :: & > SNESSolve, & > ! solve BVP > call SNESSolve(snes,PETSC_NULL_OBJECT,solution_vec,ierr) > CHKERRQ(ierr) > end function BasicPETSc_solution > !-------------------------------------------------------------------------------------------------- > !> @brief forms the basic residual vector > !-------------------------------------------------------------------------------------------------- > subroutine BasicPETSC_formResidual(in,x_scal,f_scal,dummy,ierr) > > implicit none > DMDALocalInfo, dimension(DMDA_LOCAL_INFO_SIZE) :: & > in > PetscScalar, dimension(3,3, & > XG_RANGE,YG_RANGE,ZG_RANGE), intent(in) :: & > x_scal > PetscScalar, dimension(3,3, & > X_RANGE,Y_RANGE,Z_RANGE), intent(out) :: & > f_scal > ! constructing residual > ?.. > ?. > > f_scal = tensorField_real(1:3,1:3,1:grid(1),1:grid(2),1:grid3) > > end subroutine BasicPETSc_formResidual > !????????????????????????????????????????????????? > !> @brief matmult routine > !-------------------------------------------------------------------------------------------------- > ! a shell jacobian; returns the action J*v > subroutine jac_shell(Jshell,v_in,v_out) > > use math, only: & > math_rotate_backward33, & > math_transpose33, & > math_mul3333xx33 > use mesh, only: & > grid, & > grid3 > use spectral_utilities, only: & > wgt, & > tensorField_real, & > utilities_FFTtensorForward, & > utilities_fourierGammaConvolution, & > utilities_FFTtensorBackward, & > Utilities_constitutiveResponse, & > Utilities_divergenceRMS > implicit none > ! 
DMDALocalInfo, dimension(DMDA_LOCAL_INFO_SIZE) :: & > ! in > PetscScalar, dimension(3,3, & > 1000,1,1), intent(in) :: & > v_in > PetscScalar, dimension(3,3, & > 1000,1,1), intent(out) :: & > v_out > > Mat :: Jshell > PetscErrorCode :: ierr > integer(pInt) :: & > i,j,k,e > > e = 0_pInt > > tensorField_real = 0.0_pReal > print*, SHAPE(v_in) > print*, SHAPE(v_out) > > do k = 1_pInt, grid3; do j = 1_pInt, grid(2); do i = 1_pInt, grid(1) > e = e + 1_pInt > tensorField_real(1:3,1:3,i,j,k) = j*v > enddo; enddo; enddo > call utilities_FFTtensorForward() > call utilities_fourierGammaConvolution() > call utilities_FFTtensorBackward() > v_out = tensorField_real(1:3,1:3,1:grid(1),1:grid(2),1:grid3) > > end subroutine jac_shell > > > subroutine SPEC_mech_formJacobian(snes,xx_local,Jac_pre,Jac,dummy,ierr) > implicit none > SNES :: snes > DM :: dm_local > Vec :: x_local, xx_local > Mat :: Jac_pre, Jac > PetscObject :: dummy > PetscErrorCode :: ierr > > end subroutine SPEC_mech_formJacobian > > > > end module spectral_mech_basic > !-------------------------------------------------------------------------------------------------- > > Best regards, > Luv > > > > >> On 23 May 2017, at 21:16, Barry Smith wrote: >> >> >> You didn't include any code related to creating or setting the MATSHELL. What goes wrong with your Fortran code. >> >> >>> On May 23, 2017, at 2:05 PM, Luv Sharma wrote: >>> >>> Dear PETSc team, >>> >>> I am working on a code which solves mechanical equilibrium using spectral methods. >>> I want to make use of the matshell to get the action J*v. >>> >>> I have been able to successfully implement it using petsc4py. But having difficulties to get it working in a fortran code. >>> I am using petsc-3.7.6. >>> >>> Below is a stripped down version of the existing fortran code (module). Can you please help me in figuring out how the right way to do it a code with following structure? >>> >>> !-------------------------------------------------------------------------------------------------- >>> module spectral_mech_basic >>> >>> implicit none >>> private >>> #include >>> >>> ! *PETSc data here* >>> DM .. >>> SNES .. >>> .. >>> contains >>> >>> !-------------------------------------------------------------------------------------------------- >>> subroutine basicPETSc_init >>> >>> external :: & >>> *petsc functions here* >>> >>> ! initialize solver specific parts of PETSc >>> call SNESCreate(PETSC_COMM_WORLD,snes,ierr); CHKERRQ(ierr) >>> call SNESSetOptionsPrefix(snes,'mech_',ierr);CHKERRQ(ierr) >>> call DMDACreate3d(PETSC_COMM_WORLD, & >>> DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, & >>> DMDA_STENCIL_BOX, & >>> grid(1),grid(2),grid(3), & >>> 1 , 1, worldsize, & >>> 9, 0, & >>> grid(1),grid(2),localK, & >>> da,ierr) >>> CHKERRQ(ierr) >>> call SNESSetDM(snes,da,ierr); CHKERRQ(ierr) >>> call DMCreateGlobalVector(da,solution_vec,ierr); CHKERRQ(ierr) >>> call DMDASNESSetFunctionLocal(da,INSERT_VALUES,BasicPETSC_formResidual,PETSC_NULL_OBJECT,ierr) >>> CHKERRQ(ierr) >>> call SNESSetDM(snes,da,ierr); CHKERRQ(ierr) >>> call SNESGetKSP(snes,ksp,ierr); CHKERRQ(ierr) >>> call KSPGetPC(ksp,pc,ierr); CHKERRQ(ierr) >>> call PCSetType(pc,PCNONE,ierr); CHKERRQ(ierr) >>> call SNESSetFromOptions(snes,ierr); CHKERRQ(ierr) >>> >>> ! 
init fields >>> call DMDAVecGetArrayF90(da,solution_vec,F,ierr); CHKERRQ(ierr) >>> call DMDAVecRestoreArrayF90(da,solution_vec,F,ierr); CHKERRQ(ierr) >>> >>> end subroutine basicPETSc_init >>> !-------------------------------------------------------------------------------------------------- >>> >>> type(tSolutionState) function & >>> basicPETSc_solution(incInfoIn,timeinc,timeinc_old,stress_BC,rotation_BC) >>> implicit none >>> ! PETSc Data >>> PetscErrorCode :: ierr >>> SNESConvergedReason :: reason >>> external :: & >>> SNESSolve, & >>> ! solve BVP >>> call SNESSolve(snes,PETSC_NULL_OBJECT,solution_vec,ierr) >>> CHKERRQ(ierr) >>> end function BasicPETSc_solution >>> !-------------------------------------------------------------------------------------------------- >>> !> @brief forms the basic residual vector >>> !-------------------------------------------------------------------------------------------------- >>> subroutine BasicPETSC_formResidual(in,x_scal,f_scal,dummy,ierr) >>> >>> implicit none >>> DMDALocalInfo, dimension(DMDA_LOCAL_INFO_SIZE) :: & >>> in >>> PetscScalar, dimension(3,3, & >>> XG_RANGE,YG_RANGE,ZG_RANGE), intent(in) :: & >>> x_scal >>> PetscScalar, dimension(3,3, & >>> X_RANGE,Y_RANGE,Z_RANGE), intent(out) :: & >>> f_scal >>> ! constructing residual >>> ?.. >>> ?. >>> >>> f_scal = tensorField_real(1:3,1:3,1:grid(1),1:grid(2),1:grid3) >>> >>> end subroutine BasicPETSc_formResidual >>> !-------------------------------------------------------------------------------------------------- >>> >>> end module spectral_mech_basic >>> !-------------------------------------------------------------------------------------------------- >>> >>> Best regards, >>> Luv >>> >>>> On 3 Nov 2016, at 01:17, Barry Smith wrote: >>>> >>>> >>>> Is anyone away of cases where PETSc has been used with spectral methods? >>>> >>>> Thanks >>>> >>>> Barry >>>> >> > From friedenhe at gmail.com Tue May 23 18:40:44 2017 From: friedenhe at gmail.com (Ping He) Date: Tue, 23 May 2017 19:40:44 -0400 Subject: [petsc-users] How to manually set the matrix-free differencing parameter h? In-Reply-To: <6D3FA436-3969-45A8-ADAD-AAEDE2FFC891@mcs.anl.gov> References: <59246CB1.4020000@gmail.com> <6D3FA436-3969-45A8-ADAD-AAEDE2FFC891@mcs.anl.gov> Message-ID: <5924C87C.90405@gmail.com> Hi Barry, Thanks for your reply. It is working. Regards, Ping On 05/23/2017 02:02 PM, Barry Smith wrote: > That's not really the right function; you can use MatMFFDSetFunctionError(), MatMFFDSetType(), MatMFFDSetPeriod() to set the available parameters. > > > Barry > > >> On May 23, 2017, at 12:09 PM, Ping He wrote: >> >> Hi, >> >> I am using PETSc-SNES matrix free approach, and I would like to know how to manually set the differencing parameter h. I tried to use the SNESDefaultMatrixFreeSetParameters2 function but I got an error when compiling: ?SNESDefaultMatrixFreeSetParameters2? was not declared in this scope. >> >> Thanks very much in advance. >> >> Regards, >> Ping From lirui319 at hnu.edu.cn Wed May 24 00:14:54 2017 From: lirui319 at hnu.edu.cn (=?GBK?B?wO7I8A==?=) Date: Wed, 24 May 2017 13:14:54 +0800 (GMT+08:00) Subject: [petsc-users] Installation Error Message-ID: <15e1cc1.5bb6.15c38e117d7.Coremail.lirui319@hnu.edu.cn> Dear professor or engineer: I meet a problem about installation to petsc. When I type the code "./configure --with-cc=gcc --with-cxx=0 --with-fc=0 --download-f2cblaslapack --download-mpich" on my terminal,the answer reveals the following results. >>>ERROR:root:code for hash md5 was not found. 
Traceback (most recent call last): File "/home/zhuizhuluori/lirui/software/vapor-2.5.0-Linux_x86_64/vapor/vapor-2.5.0/lib/python2.7/hashlib.py", line 139, in Hi, I want to be able to perform matrix operations on several contiguous submatrices of a full matrix, without allocating the memory redundantly for the submatrices (in addition to the memory that is already allocated for the full matrix). I tried using MatGetSubMatrix, but this function appears to allocate the additional memory. The other way I found to do this is to create the smallest submatrices I need first, then use MatCreateNest to combine them into bigger ones (including the full matrix). The documentation of MatCreateNest seems to indicate that it does not allocate additional memory for storing the new matrix. Is this the right approach, or is there a better one? Thanks, Michal Derezinski. -------------- next part -------------- An HTML attachment was scrubbed... URL: From danyang.su at gmail.com Wed May 24 02:21:22 2017 From: danyang.su at gmail.com (Danyang Su) Date: Wed, 24 May 2017 00:21:22 -0700 Subject: [petsc-users] Question on incomplete factorization level and fill Message-ID: <93217794-9c63-fd52-ab36-4174de8cb9c8@gmail.com> Dear All, I use PCFactorSetLevels for ILU and PCFactorSetFill for other preconditioning in my code to help solve the problems that the default option is hard to solve. However, I found the latter one, PCFactorSetFill does not take effect for my problem. The matrices and rhs as well as the solutions are attached from the link below. I obtain the solution using hypre preconditioner and it takes 7 and 38 iterations for matrix 1 and matrix 2. However, if I use other preconditioner, the solver just failed at the first matrix. I have tested this matrix using the native sequential solver (not PETSc) with ILU preconditioning. If I set the incomplete factorization level to 0, this sequential solver will take more than 100 iterations. If I increase the factorization level to 1 or more, it just takes several iterations. This remind me that the PC factor for this matrices should be increased. However, when I tried it in PETSc, it just does not work. Matrix and rhs can be obtained from the link below. https://eilinator.eos.ubc.ca:8443/index.php/s/CalUcq9CMeblk4R Would anyone help to check if you can make this work by increasing the PC factor level or fill? Thanks and regards, Danyang From franck.houssen at inria.fr Wed May 24 04:45:58 2017 From: franck.houssen at inria.fr (Franck Houssen) Date: Wed, 24 May 2017 11:45:58 +0200 (CEST) Subject: [petsc-users] [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix and a global vector ? In-Reply-To: References: <2012394521.6757315.1495383841678.JavaMail.zimbra@inria.fr> <264DC59D-B914-42E5-9A89-0746F21A37BF@gmail.com> <1392596904.7422896.1495533198072.JavaMail.zimbra@inria.fr> <1784716977.7683031.1495556883254.JavaMail.zimbra@inria.fr> <855172682.7687763.1495558287122.JavaMail.zimbra@inria.fr> Message-ID: <1238048783.7876567.1495619158445.JavaMail.zimbra@inria.fr> Coming from FEM, I believe the very confusing thing is that the local size of the user problem (math, physics point of view - DDM domain size) is not (can not be ?) the local size expected in MatCreateIS. My understanding is that the local size in MatIS is "just" related to backend implementation problems (it's logical that this local size is necessary, but, for another purpose: MPI machinery). 
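To make this concrete, here is how I currently read it for the small 3x3 example with two overlapping 2x2 subdomains discussed below (rough sketch only, not tested, error checking omitted; I assume the petsc-3.7 calling sequence of MatCreateIS with a single local-to-global map, and the 2/1 split of the parallel row layout is my own guess):

#include <petscmat.h>

int main(int argc, char **argv) {
  Mat                    A;
  ISLocalToGlobalMapping map;
  PetscMPIInt            rank;
  PetscInt               i, idx[2];
  PetscInt               m, M = 3;        /* M = global size of the assembled operator */
  PetscScalar            one = 1.0;

  PetscInitialize(&argc, &argv, NULL, NULL);
  MPI_Comm_rank(PETSC_COMM_WORLD, &rank);

  /* The local-to-global map carries the overlapping subdomain numbering: 2 rows per rank. */
  if (rank == 0) { idx[0] = 0; idx[1] = 1; m = 2; }  /* m = rows of the parallel vector layout, */
  else           { idx[0] = 1; idx[1] = 2; m = 1; }  /* not the subdomain size: 2 + 1 = 3 = M.  */
  ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD, 1, 2, idx, PETSC_COPY_VALUES, &map);

  MatCreateIS(PETSC_COMM_WORLD, 1, m, m, M, M, map, &A);
  /* Values are set in the subdomain (local) numbering; for real sizes one would call MatISSetPreallocation here. */
  for (i = 0; i < 2; i++) MatSetValuesLocal(A, 1, &i, 1, &i, &one, ADD_VALUES);
  MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);  /* action of A is diag(1, 2, 1) */

  ISLocalToGlobalMappingDestroy(&map);
  MatDestroy(&A);
  PetscFinalize();
  return 0;
}

If that reading is right, the subdomain size only ever enters through the mapping, never through m and n.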
Taking a few steps back, I can not see a case (I may be wrong) when a user does know how to compute or set "by hand" the local size that MatIS will expect: my understanding (once again, not sure) is that in most cases, the user will need local size to be PETSC_DECIDE in MatIS (because he doesn't want to "bother" with that or can not guess / compute it => unfortunatelly, as is, this jam the whole thing). I guess this kind of signature for MatIS would avoid/limit confusion in most cases and for most users : PetscErrorCode MatCreateIS(MPI_Comm comm,PetscInt bs,PetscInt M,PetscInt N,ISLocalToGlobalMapping rmap,ISLocalToGlobalMapping cmap,Mat *A,PetscInt m = PETSC_DECIDE ,PetscInt n = PETSC_DECIDE ) Or even PetscErrorCode MatCreateIS(MPI_Comm comm,PetscInt bs,PetscInt M,PetscInt N,ISLocalToGlobalMapping rmap,ISLocalToGlobalMapping cmap,Mat *A ) // Always use PETSC_DECIDE backstage ? Franck ----- Mail original ----- > De: "Matthew Knepley" > ?: "Franck Houssen" > Cc: "Stefano Zampini" , "PETSc" > , "PETSc" > Envoy?: Mardi 23 Mai 2017 19:02:28 > Objet: Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix > and a global vector ? > On Tue, May 23, 2017 at 11:51 AM, Franck Houssen < franck.houssen at inria.fr > > wrote: > > Not sure to know what question you're talking about ?!... > > > I use MatIS to test some kind of domain decomposition methods. I define my > > own preconditioner for that: in the apply callback, I need to matmult my > > (matIS) matrix with the incoming vector. > > Okay. I will create an example using your suggestion. > Thanks, > Matt > > Franck > > > > De: "Matthew Knepley" < knepley at gmail.com > > > > > > > ?: "Franck Houssen" < franck.houssen at inria.fr > > > > > > > Cc: "Stefano Zampini" < stefano.zampini at gmail.com >, "PETSc" < > > > petsc-users at mcs.anl.gov >, "PETSc" < petsc-dev at mcs.anl.gov > > > > > > > Envoy?: Mardi 23 Mai 2017 18:46:34 > > > > > > Objet: Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS > > > matrix > > > and a global vector ? > > > > > > On Tue, May 23, 2017 at 11:28 AM, Franck Houssen < > > > franck.houssen at inria.fr > > > > > > > wrote: > > > > > > > OK, thanks. This is helpfull... But I really think the doc should be > > > > more > > > > verbose about that: this is really confusing and I didn't find any > > > > simple > > > > example to begin with which make all this even more confusing (personal > > > > opinion). > > > > > > > > > Did you respond to my other question (how are you using them)? That would > > > help me understand how to phrase it. > > > > > > Thanks, > > > > > > Matt > > > > > > > Franck > > > > > > > > > > > De: "Matthew Knepley" < knepley at gmail.com > > > > > > > > > > > > > > > > ?: "Franck Houssen" < franck.houssen at inria.fr > > > > > > > > > > > > > > > > Cc: "Stefano Zampini" < stefano.zampini at gmail.com >, "PETSc" < > > > > > petsc-users at mcs.anl.gov >, "PETSc" < petsc-dev at mcs.anl.gov > > > > > > > > > > > > > > > > Envoy?: Mardi 23 Mai 2017 13:21:21 > > > > > > > > > > > > > > > Objet: Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS > > > > > matrix > > > > > and a global vector ? > > > > > > > > > > > > > > > On Tue, May 23, 2017 at 4:53 AM, Franck Houssen < > > > > > franck.houssen at inria.fr > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > The first thing I did was to put 3, not 4 : I got an error thrown > > > > > > in > > > > > > MatCreateIS (see the git diff + stack below). 
As the error said I > > > > > > used > > > > > > globalSize = numberOfMPIProcessus * localSize : my understanding is > > > > > > that, > > > > > > when using MatIS, the global size needs to be the sum of all local > > > > > > sizes. > > > > > > Correct ? > > > > > > > > > > > > > > > > > > > > No. MatIS means that the matrix is not assembled. The easiest way > > > > > (for > > > > > me) > > > > > to > > > > > think of this is that processes do not have > > > > > > > > > > > > > > > to hold full rows. One process can hold part of row i, and another > > > > > processes > > > > > can hold another part. However, there are still > > > > > > > > > > > > > > > the same number of global rows. > > > > > > > > > > > > > > > > I have a 3x3 global matrix made of two overlapping 2x2 local matrix > > > > > > (= > > > > > > diagonal with 1.). Each local matrix correspond to one domain (each > > > > > > domain > > > > > > is delegated to one MPI proc, so, I have 2 MPI procs because I have > > > > > > 2 > > > > > > domains). > > > > > > > > > > > > > > > > > > > > So the global size is 3. The local size here is not the size of the > > > > > local > > > > > IS > > > > > block, since that is a property only of MatIS. It is the > > > > > > > > > > > > > > > size of the local piece of the vector you multiply. This allows PETSc > > > > > to > > > > > understand the parallel layout of the Vec, and how it > > > > > > > > > > > > > > > matched the Mat. > > > > > > > > > > > > > > > This is somewhat confusing because FEM people mean something > > > > > different > > > > > by > > > > > "local" than we do here, and in fact we use this > > > > > > > > > > > > > > > other definition of local when assembling operators. > > > > > > > > > > > > > > > Matt > > > > > > > > > > > > > > > > This is the simplest possible example: I have two 2x2 (local) diag > > > > > > matrix > > > > > > that overlap so that the global matrix built from them is 1, 2, 1 > > > > > > on > > > > > > the > > > > > > diagonal (local contributions add up in the middle). > > > > > > > > > > > > > > > > > > > > > I need to MatMult this global matrix with a global vector filled > > > > > > with > > > > > > 1. 
> > > > > > > > > > > > > > > > > > > > > Franck > > > > > > > > > > > > > > > > > > > > > Git diff : > > > > > > > > > > > > > > > > > > > > > --- a/matISLocalMat.cpp > > > > > > > > > > > > > > > > > > > > > +++ b/matISLocalMat.cpp > > > > > > > > > > > > > > > > > > > > > @@ -16,7 +16,7 @@ int main(int argc,char **argv) { > > > > > > > > > > > > > > > > > > > > > int size = 0; MPI_Comm_size(MPI_COMM_WORLD, &size); if (size != 2) > > > > > > return > > > > > > 1; > > > > > > > > > > > > > > > > > > > > > int rank = 0; MPI_Comm_rank(MPI_COMM_WORLD, &rank); > > > > > > > > > > > > > > > > > > > > > - PetscInt localSize = 2, globalSize = localSize*2 /*2 MPI*/; > > > > > > > > > > > > > > > > > > > > > + PetscInt localSize = 2, globalSize = 3; > > > > > > > > > > > > > > > > > > > > > PetscInt localIdx[2] = {0, 0}; > > > > > > > > > > > > > > > > > > > > > if (rank == 0) {localIdx[0] = 0; localIdx[1] = 1;} > > > > > > > > > > > > > > > > > > > > > else {localIdx[0] = 1; localIdx[1] = 2;} > > > > > > > > > > > > > > > > > > > > > Stack error: > > > > > > > > > > > > > > > > > > > > > [0]PETSC ERROR: Nonconforming object sizes > > > > > > > > > > > > > > > > > > > > > [0]PETSC ERROR: Sum of local lengths 4 does not equal global length > > > > > > 3, > > > > > > my > > > > > > local length 2 > > > > > > > > > > > > > > > > > > > > > [0]PETSC ERROR: [0] ISG2LMapApply line 17 > > > > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/vec/is/utils/isltog.c > > > > > > > > > > > > > > > > > > > > > [0]PETSC ERROR: [0] MatSetValues_IS line 692 > > > > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > > > > > > > > > > > > > > > > > > > > > [0]PETSC ERROR: [0] MatSetValues line 1157 > > > > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/interface/matrix.c > > > > > > > > > > > > > > > > > > > > > [0]PETSC ERROR: [0] MatISSetPreallocation_IS line 95 > > > > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > > > > > > > > > > > > > > > > > > > > > [0]PETSC ERROR: [0] MatISSetPreallocation line 80 > > > > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > > > > > > > > > > > > > > > > > > > > > [0]PETSC ERROR: [0] PetscSplitOwnership line 80 > > > > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/sys/utils/psplit.c > > > > > > > > > > > > > > > > > > > > > [0]PETSC ERROR: [0] PetscLayoutSetUp line 129 > > > > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/vec/is/utils/pmap.c > > > > > > > > > > > > > > > > > > > > > [0]PETSC ERROR: [0] MatSetLocalToGlobalMapping_IS line 628 > > > > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > > > > > > > > > > > > > > > > > > > > > [0]PETSC ERROR: [0] MatSetLocalToGlobalMapping line 1899 > > > > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/interface/matrix.c > > > > > > > > > > > > > > > > > > > > > [0]PETSC ERROR: [0] MatCreateIS line 986 > > > > > > /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > > > > > > > > > > > > > > > > > > > > > > De: "Stefano Zampini" < stefano.zampini at gmail.com > > > > > > > > > > > > > > > > > > > > > > > > > > > > > ?: "Matthew Knepley" < knepley at gmail.com > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Cc: "Franck Houssen" < franck.houssen at inria.fr >, "PETSc" < > > > > > > > petsc-users at mcs.anl.gov >, "PETSc" < petsc-dev at mcs.anl.gov > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Envoy?: Dimanche 21 Mai 2017 23:02:37 > > > > > > > > > > > > > > > > > > 
> > > > > > > > > > Objet: Re: [petsc-dev] Using PETSc MatIS, how to matmult a global > > > > > > > IS > > > > > > > matrix > > > > > > > and a global vector ? > > > > > > > > > > > > > > > > > > > > > > > > > > > > Franck, > > > > > > > > > > > > > > > > > > > > > > > > > > > > PETSc takes care of doing the matrix-vector multiplication > > > > > > > properly > > > > > > > using > > > > > > > MatIS. As Matt said, the layout of the vectors is the usual > > > > > > > parallel > > > > > > > layout. > > > > > > > > > > > > > > > > > > > > > > > > > > > > The local sizes of the MatIS matrix (i.e. the local size of the > > > > > > > left > > > > > > > and > > > > > > > right vectors used in MatMult) are not the sizes of the local > > > > > > > subdomain > > > > > > > matrices in MatIS. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On May 21, 2017, at 6:47 PM, Matthew Knepley < > > > > > > > > knepley at gmail.com > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Sun, May 21, 2017 at 11:26 AM, Franck Houssen < > > > > > > > > franck.houssen at inria.fr > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Using PETSc MatIS, how to matmult a global IS matrix and a > > > > > > > > > global > > > > > > > > > vector > > > > > > > > > ? > > > > > > > > > Example is attached : I don't get what I expect that is a > > > > > > > > > vector > > > > > > > > > such > > > > > > > > > that > > > > > > > > > proc0 = [1, 2] and proc1 = [2, 1] > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 1) I think the global size of your matrix is wrong. You seem to > > > > > > > > want > > > > > > > > 3, > > > > > > > > not > > > > > > > > 4 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 2) Global vectors have a non-overlapping row partition. You > > > > > > > > might > > > > > > > > be > > > > > > > > thinking > > > > > > > > of local vectors > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Matt > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > What most experimenters take for granted before they begin > > > > > > > > their > > > > > > > > experiments > > > > > > > > is infinitely more interesting than any results to which their > > > > > > > > experiments > > > > > > > > lead. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- Norbert Wiener > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > http://www.caam.rice.edu/~mk51/ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > De: "Stefano Zampini" < stefano.zampini at gmail.com > > > > > > > > > > > > > > > > > > > > > > > > > > > > > ?: "Matthew Knepley" < knepley at gmail.com > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Cc: "Franck Houssen" < franck.houssen at inria.fr >, "PETSc" < > > > > > > > petsc-users at mcs.anl.gov >, "PETSc" < petsc-dev at mcs.anl.gov > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Envoy?: Dimanche 21 Mai 2017 23:02:37 > > > > > > > > > > > > > > > > > > > > > > > > > > > > Objet: Re: [petsc-dev] Using PETSc MatIS, how to matmult a global > > > > > > > IS > > > > > > > matrix > > > > > > > and a global vector ? 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > Franck, > > > > > > > > > > > > > > > > > > > > > > > > > > > > PETSc takes care of doing the matrix-vector multiplication > > > > > > > properly > > > > > > > using > > > > > > > MatIS. As Matt said, the layout of the vectors is the usual > > > > > > > parallel > > > > > > > layout. > > > > > > > > > > > > > > > > > > > > > > > > > > > > The local sizes of the MatIS matrix (i.e. the local size of the > > > > > > > left > > > > > > > and > > > > > > > right vectors used in MatMult) are not the sizes of the local > > > > > > > subdomain > > > > > > > matrices in MatIS. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On May 21, 2017, at 6:47 PM, Matthew Knepley < > > > > > > > > knepley at gmail.com > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Sun, May 21, 2017 at 11:26 AM, Franck Houssen < > > > > > > > > franck.houssen at inria.fr > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Using PETSc MatIS, how to matmult a global IS matrix and a > > > > > > > > > global > > > > > > > > > vector > > > > > > > > > ? > > > > > > > > > Example is attached : I don't get what I expect that is a > > > > > > > > > vector > > > > > > > > > such > > > > > > > > > that > > > > > > > > > proc0 = [1, 2] and proc1 = [2, 1] > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 1) I think the global size of your matrix is wrong. You seem to > > > > > > > > want > > > > > > > > 3, > > > > > > > > not > > > > > > > > 4 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 2) Global vectors have a non-overlapping row partition. You > > > > > > > > might > > > > > > > > be > > > > > > > > thinking > > > > > > > > of local vectors > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Matt > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Franck > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > What most experimenters take for granted before they begin > > > > > > > > their > > > > > > > > experiments > > > > > > > > is infinitely more interesting than any results to which their > > > > > > > > experiments > > > > > > > > lead. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- Norbert Wiener > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > http://www.caam.rice.edu/~mk51/ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > What most experimenters take for granted before they begin their > > > > > experiments > > > > > is infinitely more interesting than any results to which their > > > > > experiments > > > > > lead. > > > > > > > > > > > > > > > -- Norbert Wiener > > > > > > > > > > > > > > > http://www.caam.rice.edu/~mk51/ > > > > > > > > > > > > > -- > > > > > > What most experimenters take for granted before they begin their > > > experiments > > > is infinitely more interesting than any results to which their > > > experiments > > > lead. 
> > > > > > -- Norbert Wiener > > > > > > http://www.caam.rice.edu/~mk51/ > > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From franck.houssen at inria.fr Wed May 24 04:46:01 2017 From: franck.houssen at inria.fr (Franck Houssen) Date: Wed, 24 May 2017 11:46:01 +0200 (CEST) Subject: [petsc-users] [petsc-dev] Using PETSc MatIS, how to get local matrix (= one domain) before and after assembly ? In-Reply-To: <9EFA5BCF-FDD3-45FA-A41A-6AA304D58C74@gmail.com> References: <867421313.6757137.1495383596545.JavaMail.zimbra@inria.fr> <1253777447.6757298.1495383780337.JavaMail.zimbra@inria.fr> <2033509705.7414108.1495532492501.JavaMail.zimbra@inria.fr> <740691579.7684644.1495557293858.JavaMail.zimbra@inria.fr> <9EFA5BCF-FDD3-45FA-A41A-6AA304D58C74@gmail.com> Message-ID: <694141704.7876584.1495619161488.JavaMail.zimbra@inria.fr> The code I sent compile and run at my side with petsc-3.7.6 (on debian/testing with gcc-6.3). The code you sent does not compile at my side. Anyway, no big deal. The modification you propose as far as I understand is to replace "ISCreateGeneral(PETSC_COMM_WORLD" with "ISCreateGeneral(PETSC_COMM_SELF" : still not working at my side (empty dirichlet local matrix). I will try to get that with a MPI matrix (that would contain same data that MatIS : that's what I tried to avoid as this doubles allocations - anyway, no big deal). Franck ----- Mail original ----- > De: "Stefano Zampini" > ?: "Franck Houssen" > Cc: "petsc-dev" , "PETSc users list" > , "petsc-maint" > Envoy?: Mardi 23 Mai 2017 20:23:49 > Objet: Re: [petsc-dev] Using PETSc MatIS, how to get local matrix (= one > domain) before and after assembly ? > > On May 23, 2017, at 6:34 PM, Franck Houssen < franck.houssen at inria.fr > > > wrote: > > > OK. I am supposed to destroy the matrix returned by MatISGetMPIXAIJ ? > > Yes > > Also, my example still not get the final assembled local matrix (the > > MatCreateSubMatrix returns an empty matrix) but as far as I understand my > > (global) index set is OK: what did I miss ? > > I really doubt you can use the example you have sent. It doesn?t compile, as > MatCreateSubMatrix needs an extra argument. > Attached a modified version that does what I guess is what you are looking > for (sequential Dirichlet problems on the subdomains). > > Franck > > > ----- Mail original ----- > > > > De: "Stefano Zampini" < stefano.zampini at gmail.com > > > > > > > ?: "Franck Houssen" < franck.houssen at inria.fr > > > > > > > Cc: "petsc-dev" < petsc-dev at mcs.anl.gov >, "PETSc users list" < > > > petsc-users at mcs.anl.gov >, "petsc-maint" < knepley at gmail.com > > > > > > > Envoy?: Mardi 23 Mai 2017 13:16:18 > > > > > > Objet: Re: [petsc-dev] Using PETSc MatIS, how to get local matrix (= one > > > domain) before and after assembly ? > > > > > > MatISGetMPIXAIJ is collective, as it assembles the global operator. To > > > get > > > the matrices you are looking for, you should call MatCreateSubMatrix on > > > the > > > assembled global operator, with the global indices representing the > > > subdomain problem. 
Each process needs to call both functions > > > > > > Stefano > > > > > > Il 23 Mag 2017 11:41, "Franck Houssen" < franck.houssen at inria.fr > ha > > > scritto: > > > > > > > I have a 3x3 global matrix made of two overlapping 2x2 local matrix (= > > > > diagonal with 1.). Each local matrix correspond to one domain (each > > > > domain > > > > is delegated to one MPI proc, so, I have 2 MPI procs because I have 2 > > > > domains). > > > > > > > > > > This is the simplest possible example: I have two 2x2 (local) diag > > > > matrix > > > > that overlap so that the global matrix built from them is 1, 2, 1 on > > > > the > > > > diagonal (local contributions add up in the middle). > > > > > > > > > > Now, I need for each MPI proc to get the assembled local matrix > > > > (sometimes > > > > called the dirichlet matrix) : this is a local matrix (sequential - not > > > > distributed with MPI) that accounts for contribution of neighboring > > > > domains > > > > (MPI proc). > > > > > > > > > > How to get the local assembled matrix ? MatGetLocalSubMatrix does not > > > > work > > > > (throw error - see example attached). MatGetSubMatrix returns a MPI > > > > distributed matrix, not a local (sequential) one. > > > > > > > > > > 1. My understanding is that MatISGetMPIXAIJ should return a local > > > > matrix > > > > (sequential AIJ matrix) : the MPI in the name recall that you get the > > > > assembled matrix (with contributions from the shared border) from the > > > > other > > > > MPI processus. Correct ? In my simple example, I replaced > > > > MatGetLocalSubMatrix with MatISGetMPIXAIJ : I get a deadlock which was > > > > surprising to me... Is MatISGetMPIXAIJ a collective call ? > > > > > > > > > > 2. Supposing this is a collective call (and that point 1 is not > > > > correct), > > > > I > > > > ride up MatISGetMPIXAIJ before the "if (rank > 0)" : I don't deadlock > > > > now, > > > > but it seems I get a global matrix which is not the assembled local > > > > matrix > > > > I > > > > am looking for. > > > > > > > > > > 3. I am supposed to destroy the matrix returned by MatISGetMPIXAIJ ? 
(I > > > > believe yes - not sure as AFAIU wording should associate Destroy > > > > methods > > > > to > > > > Create methods) > > > > > > > > > > Franck > > > > > > > > > > The git diff illustrate modifications I tried to add to the initial > > > > file > > > > attached to this thread: > > > > > > > > > > --- a/matISLocalMat.cpp > > > > > > > > > > +++ b/matISLocalMat.cpp > > > > > > > > > > @@ -31,6 +31,8 @@ int main(int argc,char **argv) { > > > > > > > > > > MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); MatAssemblyEnd(A, > > > > MAT_FINAL_ASSEMBLY); > > > > > > > > > > MatView(A, PETSC_VIEWER_STDOUT_WORLD); > > > > PetscViewerFlush(PETSC_VIEWER_STDOUT_WORLD); // Diag: 1, 2, 1 > > > > > > > > > > + Mat assembledLocalMat; > > > > > > > > > > + MatISGetMPIXAIJ(A, MAT_INITIAL_MATRIX, &assembledLocalMat); > > > > > > > > > > if (rank > 0) { // Do not pollute stdout: print only 1 proc > > > > > > > > > > std::cout << std::endl << "non assembled local matrix:" << std::endl << > > > > std::endl; > > > > > > > > > > Mat nonAssembledLocalMat; > > > > > > > > > > @@ -38,11 +40,10 @@ int main(int argc,char **argv) { > > > > > > > > > > MatView(nonAssembledLocalMat, PETSC_VIEWER_STDOUT_SELF); // Diag: 1, 1 > > > > > > > > > > std::cout << std::endl << "assembled local matrix:" << std::endl << > > > > std::endl; > > > > > > > > > > - Mat assembledLocalMat; > > > > > > > > > > - IS is; ISCreateGeneral(PETSC_COMM_SELF, localSize, localIdx, > > > > PETSC_COPY_VALUES, &is); > > > > > > > > > > - MatGetLocalSubMatrix(A, is, is, &assembledLocalMat); // KO ?!... > > > > > > > > > > - MatView(assembledLocalMat, PETSC_VIEWER_STDOUT_SELF); // Would like > > > > to > > > > get > > > > => Diag: 2, 1 > > > > > > > > > > + //IS is; ISCreateGeneral(PETSC_COMM_SELF, localSize, localIdx, > > > > PETSC_COPY_VALUES, &is); > > > > > > > > > > + //MatGetLocalSubMatrix(A, is, is, &assembledLocalMat); // KO ?!... > > > > > > > > > > } > > > > > > > > > > + MatView(assembledLocalMat, PETSC_VIEWER_STDOUT_WORLD); // Would like > > > > to > > > > get > > > > => Diag: 2, 1 > > > > > > > > > > > De: "Stefano Zampini" < stefano.zampini at gmail.com > > > > > > > > > > > > > > > > ?: "petsc-maint" < knepley at gmail.com > > > > > > > > > > > > > > > > Cc: "petsc-dev" < petsc-dev at mcs.anl.gov >, "PETSc users list" < > > > > > petsc-users at mcs.anl.gov >, "Franck Houssen" < franck.houssen at inria.fr > > > > > > > > > > > > > > > > > > > > > Envoy?: Dimanche 21 Mai 2017 22:51:34 > > > > > > > > > > > > > > > Objet: Re: [petsc-dev] Using PETSc MatIS, how to get local matrix (= > > > > > one > > > > > domain) before and after assembly ? > > > > > > > > > > > > > > > To assemble the operator in aij format, use > > > > > > > > > > > > > > > MatISGetMPIXAIJ > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatISGetMPIXAIJ.html > > > > > > > > > > > > > > > Il 21 Mag 2017 18:43, "Matthew Knepley" < knepley at gmail.com > ha > > > > > scritto: > > > > > > > > > > > > > > > > On Sun, May 21, 2017 at 11:23 AM, Franck Houssen < > > > > > > franck.houssen at inria.fr > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > I have a 3x3 global matrix is built (diag: 1, 2, 1): it's made of > > > > > > > 2 > > > > > > > overlapping 2x2 local matrix (diag: 1, 1). > > > > > > > > > > > > > > > > > > > > > > > > > > > > Getting non assembled local matrix is OK with MatISGetLocalMat. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > How to get assembled local matrix (initial local matrix + > > > > > > > neigbhor > > > > > > > contributions on the borders) ? (expected result is diag: 2, 1) > > > > > > > > > > > > > > > > > > > > > > > > > > > You can always use > > > > > > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetSubMatrix.html > > > > > > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetSubMatrices.html > > > > > > > > > > > > > > > > > > > > > to get copies, but if you just want to build things, you can use > > > > > > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetLocalSubMatrix.html > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > > > > Matt > > > > > > > > > > > > > > > > > > > > > > Franck > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > > > > > > What most experimenters take for granted before they begin their > > > > > > experiments > > > > > > is infinitely more interesting than any results to which their > > > > > > experiments > > > > > > lead. > > > > > > > > > > > > > > > > > > > > > -- Norbert Wiener > > > > > > > > > > > > > > > > > > > > > http://www.caam.rice.edu/~mk51/ > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Wed May 24 06:42:10 2017 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Wed, 24 May 2017 13:42:10 +0200 Subject: [petsc-users] [petsc-dev] Using PETSc MatIS, how to get local matrix (= one domain) before and after assembly ? In-Reply-To: <694141704.7876584.1495619161488.JavaMail.zimbra@inria.fr> References: <867421313.6757137.1495383596545.JavaMail.zimbra@inria.fr> <1253777447.6757298.1495383780337.JavaMail.zimbra@inria.fr> <2033509705.7414108.1495532492501.JavaMail.zimbra@inria.fr> <740691579.7684644.1495557293858.JavaMail.zimbra@inria.fr> <9EFA5BCF-FDD3-45FA-A41A-6AA304D58C74@gmail.com> <694141704.7876584.1495619161488.JavaMail.zimbra@inria.fr> Message-ID: <2C9AF920-14AF-4BB4-B2E0-D1162FA0A0BB@gmail.com> > On May 24, 2017, at 11:46 AM, Franck Houssen wrote: > > The code I sent compile and run at my side with petsc-3.7.6 (on debian/testing with gcc-6.3). The code you sent does not compile at my side. Anyway, no big deal. > MatGetSubMatrix/MatGetSubMatrices have been renamed to MatCreateSubMatrix/MatCreateSubMatrices in petsc-dev. I thought you were using the master branch and not the latest release. Sorry for the confusion. To compile the code I have sent, just rename MatCreateSubMatrices with MatGetSubMatrices and it should work. http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetSubMatrices.html#MatGetSubMatrices http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetSubMatrix.html#MatGetSubMatrix > The modification you propose as far as I understand is to replace "ISCreateGeneral(PETSC_COMM_WORLD" with "ISCreateGeneral(PETSC_COMM_SELF" : still not working at my side (empty dirichlet local matrix). > I will try to get that with a MPI matrix (that would contain same data that MatIS : that's what I tried to avoid as this doubles allocations - anyway, no big deal). > In the code, you are already extracting submatrices from MPIAIJ format, not from MATIS. 
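In outline, what you want is something like the following inside your matISLocalMat example (rough sketch with the petsc-3.7.6 names, error checking omitted; A, localSize and localIdx are the variables you already have, B and locDirichlet are just illustrative names):

  Mat B;              /* globally assembled (MPIAIJ) version of the MATIS operator A */
  Mat *locDirichlet;  /* array of sequential submatrices, one per requested IS       */
  IS  is;

  MatISGetMPIXAIJ(A, MAT_INITIAL_MATRIX, &B);                                    /* collective */
  ISCreateGeneral(PETSC_COMM_SELF, localSize, localIdx, PETSC_COPY_VALUES, &is); /* global indices of this subdomain */
  MatGetSubMatrices(B, 1, &is, &is, MAT_INITIAL_MATRIX, &locDirichlet);          /* collective: every rank calls it */
  MatView(locDirichlet[0], PETSC_VIEWER_STDOUT_SELF); /* sequential, assembled local ("Dirichlet") matrix */

  MatDestroyMatrices(1, &locDirichlet);
  ISDestroy(&is);
  MatDestroy(&B);  /* you own the matrix returned by MatISGetMPIXAIJ */

Each rank then holds in locDirichlet[0] the local matrix with the neighbor contributions already summed (diag 2, 1 on the printing rank of your toy case).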
Attached a code that compiles and runs with petsc-3.7.6 > Franck > > > De: "Stefano Zampini" > ?: "Franck Houssen" > Cc: "petsc-dev" , "PETSc users list" , "petsc-maint" > Envoy?: Mardi 23 Mai 2017 20:23:49 > Objet: Re: [petsc-dev] Using PETSc MatIS, how to get local matrix (= one domain) before and after assembly ? > > > On May 23, 2017, at 6:34 PM, Franck Houssen > wrote: > > OK. I am supposed to destroy the matrix returned by MatISGetMPIXAIJ ? > Yes > Also, my example still not get the final assembled local matrix (the MatCreateSubMatrix returns an empty matrix) but as far as I understand my (global) index set is OK: what did I miss ? > > I really doubt you can use the example you have sent. It doesn?t compile, as MatCreateSubMatrix needs an extra argument. > Attached a modified version that does what I guess is what you are looking for (sequential Dirichlet problems on the subdomains). > > > Franck > > > De: "Stefano Zampini" > > ?: "Franck Houssen" > > Cc: "petsc-dev" >, "PETSc users list" >, "petsc-maint" > > Envoy?: Mardi 23 Mai 2017 13:16:18 > Objet: Re: [petsc-dev] Using PETSc MatIS, how to get local matrix (= one domain) before and after assembly ? > > MatISGetMPIXAIJ is collective, as it assembles the global operator. To get the matrices you are looking for, you should call MatCreateSubMatrix on the assembled global operator, with the global indices representing the subdomain problem. Each process needs to call both functions > > Stefano > > Il 23 Mag 2017 11:41, "Franck Houssen" > ha scritto: > I have a 3x3 global matrix made of two overlapping 2x2 local matrix (= diagonal with 1.). Each local matrix correspond to one domain (each domain is delegated to one MPI proc, so, I have 2 MPI procs because I have 2 domains). > This is the simplest possible example: I have two 2x2 (local) diag matrix that overlap so that the global matrix built from them is 1, 2, 1 on the diagonal (local contributions add up in the middle). > > Now, I need for each MPI proc to get the assembled local matrix (sometimes called the dirichlet matrix) : this is a local matrix (sequential - not distributed with MPI) that accounts for contribution of neighboring domains (MPI proc). > > How to get the local assembled matrix ? MatGetLocalSubMatrix does not work (throw error - see example attached). MatGetSubMatrix returns a MPI distributed matrix, not a local (sequential) one. > My understanding is that MatISGetMPIXAIJ should return a local matrix (sequential AIJ matrix) : the MPI in the name recall that you get the assembled matrix (with contributions from the shared border) from the other MPI processus. Correct ? In my simple example, I replaced MatGetLocalSubMatrix with MatISGetMPIXAIJ : I get a deadlock which was surprising to me... Is MatISGetMPIXAIJ a collective call ? > Supposing this is a collective call (and that point 1 is not correct), I ride up MatISGetMPIXAIJ before the "if (rank > 0)" : I don't deadlock now, but it seems I get a global matrix which is not the assembled local matrix I am looking for. > I am supposed to destroy the matrix returned by MatISGetMPIXAIJ ? 
(I believe yes - not sure as AFAIU wording should associate Destroy methods to Create methods) > Franck > > The git diff illustrate modifications I tried to add to the initial file attached to this thread: > --- a/matISLocalMat.cpp > +++ b/matISLocalMat.cpp > @@ -31,6 +31,8 @@ int main(int argc,char **argv) { > MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY); > MatView(A, PETSC_VIEWER_STDOUT_WORLD); PetscViewerFlush(PETSC_VIEWER_STDOUT_WORLD); // Diag: 1, 2, 1 > > + Mat assembledLocalMat; > + MatISGetMPIXAIJ(A, MAT_INITIAL_MATRIX, &assembledLocalMat); > if (rank > 0) { // Do not pollute stdout: print only 1 proc > std::cout << std::endl << "non assembled local matrix:" << std::endl << std::endl; > Mat nonAssembledLocalMat; > @@ -38,11 +40,10 @@ int main(int argc,char **argv) { > MatView(nonAssembledLocalMat, PETSC_VIEWER_STDOUT_SELF); // Diag: 1, 1 > > std::cout << std::endl << "assembled local matrix:" << std::endl << std::endl; > - Mat assembledLocalMat; > - IS is; ISCreateGeneral(PETSC_COMM_SELF, localSize, localIdx, PETSC_COPY_VALUES, &is); > - MatGetLocalSubMatrix(A, is, is, &assembledLocalMat); // KO ?!... > - MatView(assembledLocalMat, PETSC_VIEWER_STDOUT_SELF); // Would like to get => Diag: 2, 1 > + //IS is; ISCreateGeneral(PETSC_COMM_SELF, localSize, localIdx, PETSC_COPY_VALUES, &is); > + //MatGetLocalSubMatrix(A, is, is, &assembledLocalMat); // KO ?!... > } > + MatView(assembledLocalMat, PETSC_VIEWER_STDOUT_WORLD); // Would like to get => Diag: 2, 1 > > > De: "Stefano Zampini" > > ?: "petsc-maint" > > Cc: "petsc-dev" >, "PETSc users list" >, "Franck Houssen" > > Envoy?: Dimanche 21 Mai 2017 22:51:34 > Objet: Re: [petsc-dev] Using PETSc MatIS, how to get local matrix (= one domain) before and after assembly ? > > To assemble the operator in aij format, use > MatISGetMPIXAIJ > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatISGetMPIXAIJ.html > > Il 21 Mag 2017 18:43, "Matthew Knepley" > ha scritto: > On Sun, May 21, 2017 at 11:23 AM, Franck Houssen > wrote: > I have a 3x3 global matrix is built (diag: 1, 2, 1): it's made of 2 overlapping 2x2 local matrix (diag: 1, 1). > Getting non assembled local matrix is OK with MatISGetLocalMat. > How to get assembled local matrix (initial local matrix + neigbhor contributions on the borders) ? (expected result is diag: 2, 1) > > You can always use > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetSubMatrix.html > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetSubMatrices.html > > to get copies, but if you just want to build things, you can use > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetLocalSubMatrix.html > > Thanks, > > Matt > > Franck > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: out.log Type: application/octet-stream Size: 929 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: matISLocalMat.cpp Type: application/octet-stream Size: 3889 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 24 06:54:18 2017 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 May 2017 06:54:18 -0500 Subject: [petsc-users] Installation Error In-Reply-To: <15e1cc1.5bb6.15c38e117d7.Coremail.lirui319@hnu.edu.cn> References: <15e1cc1.5bb6.15c38e117d7.Coremail.lirui319@hnu.edu.cn> Message-ID: On Wed, May 24, 2017 at 12:14 AM, ?? wrote: > > Dear professor or engineer: > I meet a problem about installation to petsc. > When I type the code "./configure --with-cc=gcc --with-cxx=0 > --with-fc=0 --download-f2cblaslapack --download-mpich" on my terminal,the > answer reveals the following results. > > >>>ERROR:root:code for hash md5 was not found. > Traceback (most recent call last): > File "/home/zhuizhuluori/lirui/software/vapor-2.5.0-Linux_ > x86_64/vapor/vapor-2.5.0/lib/python2.7/hashlib.py", line 139, in globals()[__func_name] = __get_hash(__func_name) > File "/home/zhuizhuluori/lirui/software/vapor-2.5.0-Linux_ > x86_64/vapor/vapor-2.5.0/lib/python2.7/hashlib.py", line 91, in > __get_builtin_constructor > raise ValueError('unsupported hash type ' + name) > ValueError: unsupported hash type md5 > ERROR:root:code for hash sha1 was not found ..... > > I have used petsc for a long time,and never see the this problem.my > laptop is installed an old version of petsc and I wanna change it to a new > version.How can I fix it?Thanks for your heartful suggestion! > We are asking Python to do md5 and yours cannot. Is this the entire error message? Can you send configure.log? I think upgrading your Python will fix this. Thanks, Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 24 06:55:29 2017 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 May 2017 06:55:29 -0500 Subject: [petsc-users] Accessing submatrices without additional memory usage In-Reply-To: References: Message-ID: On Wed, May 24, 2017 at 1:09 AM, Michal Derezinski wrote: > Hi, > > I want to be able to perform matrix operations on several contiguous > submatrices of a full matrix, without allocating the memory redundantly for > the submatrices (in addition to the memory that is already allocated for > the full matrix). > I tried using MatGetSubMatrix, but this function appears to allocate the > additional memory. > > The other way I found to do this is to create the smallest submatrices I > need first, then use MatCreateNest to combine them into bigger ones > (including the full matrix). > The documentation of MatCreateNest seems to indicate that it does not > allocate additional memory for storing the new matrix. > Is this the right approach, or is there a better one? > Yes, that is the right approach. Thanks, Matt > Thanks, > Michal Derezinski. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Wed May 24 06:59:48 2017 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 May 2017 06:59:48 -0500 Subject: [petsc-users] Question on incomplete factorization level and fill In-Reply-To: <93217794-9c63-fd52-ab36-4174de8cb9c8@gmail.com> References: <93217794-9c63-fd52-ab36-4174de8cb9c8@gmail.com> Message-ID: On Wed, May 24, 2017 at 2:21 AM, Danyang Su wrote: > Dear All, > > I use PCFactorSetLevels for ILU and PCFactorSetFill for other > preconditioning in my code to help solve the problems that the default > option is hard to solve. However, I found the latter one, PCFactorSetFill > does not take effect for my problem. The matrices and rhs as well as the > solutions are attached from the link below. I obtain the solution using > hypre preconditioner and it takes 7 and 38 iterations for matrix 1 and > matrix 2. However, if I use other preconditioner, the solver just failed at > the first matrix. I have tested this matrix using the native sequential > solver (not PETSc) with ILU preconditioning. If I set the incomplete > factorization level to 0, this sequential solver will take more than 100 > iterations. If I increase the factorization level to 1 or more, it just > takes several iterations. This remind me that the PC factor for this > matrices should be increased. However, when I tried it in PETSc, it just > does not work. > > Matrix and rhs can be obtained from the link below. > > https://eilinator.eos.ubc.ca:8443/index.php/s/CalUcq9CMeblk4R > > Would anyone help to check if you can make this work by increasing the PC > factor level or fill? > We have ILU(k) supported in serial. However ILU(dt) which takes a tolerance only works through Hypre http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html I recommend you try SuperLU or MUMPS, which can both be downloaded automatically by configure, and do a full sparse LU. Thanks, Matt > Thanks and regards, > > Danyang > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 24 07:03:49 2017 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 May 2017 07:03:49 -0500 Subject: [petsc-users] [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix and a global vector ? In-Reply-To: <1238048783.7876567.1495619158445.JavaMail.zimbra@inria.fr> References: <2012394521.6757315.1495383841678.JavaMail.zimbra@inria.fr> <264DC59D-B914-42E5-9A89-0746F21A37BF@gmail.com> <1392596904.7422896.1495533198072.JavaMail.zimbra@inria.fr> <1784716977.7683031.1495556883254.JavaMail.zimbra@inria.fr> <855172682.7687763.1495558287122.JavaMail.zimbra@inria.fr> <1238048783.7876567.1495619158445.JavaMail.zimbra@inria.fr> Message-ID: On Wed, May 24, 2017 at 4:45 AM, Franck Houssen wrote: > Coming from FEM, I believe the very confusing thing is that the local size > of the user problem (math, physics point of view - DDM domain size) is not > (can not be ?) the local size expected in MatCreateIS. > > My understanding is that the local size in MatIS is "just" related to > backend implementation problems (it's logical that this local size is > necessary, but, for another purpose: MPI machinery). 
Taking a few steps > back, I can not see a case (I may be wrong) when a user does know how to > compute or set "by hand" the local size that MatIS will expect: my > understanding (once again, not sure) is that in most cases, the user will > need local size to be PETSC_DECIDE in MatIS (because he doesn't want to > "bother" with that or can not guess / compute it => unfortunatelly, as is, > this jam the whole thing). > > I guess this kind of signature for MatIS would avoid/limit confusion in > most cases and for most users : > PetscErrorCode MatCreateIS(MPI_Comm comm,PetscInt bs,PetscInt M,PetscInt > N,ISLocalToGlobalMapping rmap,ISLocalToGlobalMapping cmap,Mat *A,PetscInt m *= > PETSC_DECIDE*,PetscInt n*= PETSC_DECIDE*) > Or even > PetscErrorCode MatCreateIS(MPI_Comm comm,PetscInt bs,PetscInt M,PetscInt > N,ISLocalToGlobalMapping rmap,ISLocalToGlobalMapping cmap,Mat *A) // > Always use PETSC_DECIDE backstage ? > I have added a MatIS example with the 1D Laplacian. https://bitbucket.org/petsc/petsc/branch/knepley/feature-matis-example Thanks, Matt > Franck > > ------------------------------ > > *De: *"Matthew Knepley" > *?: *"Franck Houssen" > *Cc: *"Stefano Zampini" , "PETSc" < > petsc-users at mcs.anl.gov>, "PETSc" > *Envoy?: *Mardi 23 Mai 2017 19:02:28 > *Objet: *Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS > matrix and a global vector ? > > On Tue, May 23, 2017 at 11:51 AM, Franck Houssen > wrote: > >> Not sure to know what question you're talking about ?!... >> I use MatIS to test some kind of domain decomposition methods. I define >> my own preconditioner for that: in the apply callback, I need to matmult my >> (matIS) matrix with the incoming vector. >> > > Okay. I will create an example using your suggestion. > > Thanks, > > Matt > > >> Franck >> >> ------------------------------ >> >> *De: *"Matthew Knepley" >> *?: *"Franck Houssen" >> *Cc: *"Stefano Zampini" , "PETSc" < >> petsc-users at mcs.anl.gov>, "PETSc" >> *Envoy?: *Mardi 23 Mai 2017 18:46:34 >> *Objet: *Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS >> matrix and a global vector ? >> >> On Tue, May 23, 2017 at 11:28 AM, Franck Houssen > > wrote: >> >>> OK, thanks. This is helpfull... But I really think the doc should be >>> more verbose about that: this is really confusing and I didn't find any >>> simple example to begin with which make all this even more confusing >>> (personal opinion). >>> >> >> Did you respond to my other question (how are you using them)? That would >> help me understand how to phrase it. >> >> Thanks, >> >> Matt >> >> >>> Franck >>> >>> >>> ------------------------------ >>> >>> *De: *"Matthew Knepley" >>> *?: *"Franck Houssen" >>> *Cc: *"Stefano Zampini" , "PETSc" < >>> petsc-users at mcs.anl.gov>, "PETSc" >>> *Envoy?: *Mardi 23 Mai 2017 13:21:21 >>> *Objet: *Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS >>> matrix and a global vector ? >>> >>> On Tue, May 23, 2017 at 4:53 AM, Franck Houssen >> > wrote: >>> >>>> The first thing I did was to put 3, not 4 : I got an error thrown in >>>> MatCreateIS (see the git diff + stack below). As the error said I used >>>> globalSize = numberOfMPIProcessus * localSize : my understanding is that, >>>> when using MatIS, the global size needs to be the sum of all local sizes. >>>> Correct ? >>>> >>> >>> No. MatIS means that the matrix is not assembled. The easiest way (for >>> me) to think of this is that processes do not have >>> to hold full rows. 
One process can hold part of row i, and another >>> processes can hold another part. However, there are still >>> the same number of global rows. >>> >>> I have a 3x3 global matrix made of two overlapping 2x2 local matrix (= >>>> diagonal with 1.). Each local matrix correspond to one domain (each domain >>>> is delegated to one MPI proc, so, I have 2 MPI procs because I have 2 >>>> domains). >>>> >>> >>> So the global size is 3. The local size here is not the size of the >>> local IS block, since that is a property only of MatIS. It is the >>> size of the local piece of the vector you multiply. This allows PETSc to >>> understand the parallel layout of the Vec, and how it >>> matched the Mat. >>> >>> This is somewhat confusing because FEM people mean something different >>> by "local" than we do here, and in fact we use this >>> other definition of local when assembling operators. >>> >>> Matt >>> >>> >>>> This is the simplest possible example: I have two 2x2 (local) diag >>>> matrix that overlap so that the global matrix built from them is 1, 2, 1 on >>>> the diagonal (local contributions add up in the middle). >>>> I need to MatMult this global matrix with a global vector filled with 1. >>>> >>>> Franck >>>> >>>> Git diff : >>>> >>>> --- a/matISLocalMat.cpp >>>> +++ b/matISLocalMat.cpp >>>> @@ -16,7 +16,7 @@ int main(int argc,char **argv) { >>>> int size = 0; MPI_Comm_size(MPI_COMM_WORLD, &size); if (size != 2) >>>> return 1; >>>> int rank = 0; MPI_Comm_rank(MPI_COMM_WORLD, &rank); >>>> >>>> - PetscInt localSize = 2, globalSize = localSize*2 /*2 MPI*/; >>>> + PetscInt localSize = 2, globalSize = 3; >>>> PetscInt localIdx[2] = {0, 0}; >>>> if (rank == 0) {localIdx[0] = 0; localIdx[1] = 1;} >>>> else {localIdx[0] = 1; localIdx[1] = 2;} >>>> >>>> >>>> >>>> Stack error: >>>> >>>> [0]PETSC ERROR: Nonconforming object sizes >>>> [0]PETSC ERROR: Sum of local lengths 4 does not equal global length 3, >>>> my local length 2 >>>> [0]PETSC ERROR: [0] ISG2LMapApply line 17 /home/fghoussen/Documents/ >>>> INRIA/petsc-3.7.6/src/vec/is/utils/isltog.c >>>> [0]PETSC ERROR: [0] MatSetValues_IS line 692 /home/fghoussen/Documents/ >>>> INRIA/petsc-3.7.6/src/mat/impls/is/matis.c >>>> [0]PETSC ERROR: [0] MatSetValues line 1157 /home/fghoussen/Documents/ >>>> INRIA/petsc-3.7.6/src/mat/interface/matrix.c >>>> [0]PETSC ERROR: [0] MatISSetPreallocation_IS line 95 >>>> /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c >>>> [0]PETSC ERROR: [0] MatISSetPreallocation line 80 >>>> /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c >>>> [0]PETSC ERROR: [0] PetscSplitOwnership line 80 >>>> /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/sys/utils/psplit.c >>>> [0]PETSC ERROR: [0] PetscLayoutSetUp line 129 /home/fghoussen/Documents/ >>>> INRIA/petsc-3.7.6/src/vec/is/utils/pmap.c >>>> [0]PETSC ERROR: [0] MatSetLocalToGlobalMapping_IS line 628 >>>> /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c >>>> [0]PETSC ERROR: [0] MatSetLocalToGlobalMapping line 1899 >>>> /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/interface/matrix.c >>>> [0]PETSC ERROR: [0] MatCreateIS line 986 /home/fghoussen/Documents/ >>>> INRIA/petsc-3.7.6/src/mat/impls/is/matis.c >>>> >>>> >>>> >>>> ------------------------------ >>>> >>>> *De: *"Stefano Zampini" >>>> *?: *"Matthew Knepley" >>>> *Cc: *"Franck Houssen" , "PETSc" < >>>> petsc-users at mcs.anl.gov>, "PETSc" >>>> *Envoy?: *Dimanche 21 Mai 2017 23:02:37 >>>> *Objet: *Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS >>>> 
matrix and a global vector ? >>>> >>>> Franck, >>>> >>>> PETSc takes care of doing the matrix-vector multiplication properly >>>> using MatIS. As Matt said, the layout of the vectors is the usual parallel >>>> layout. >>>> The local sizes of the MatIS matrix (i.e. the local size of the left >>>> and right vectors used in MatMult) are not the sizes of the local subdomain >>>> matrices in MatIS. >>>> >>>> >>>> On May 21, 2017, at 6:47 PM, Matthew Knepley wrote: >>>> >>>> On Sun, May 21, 2017 at 11:26 AM, Franck Houssen < >>>> franck.houssen at inria.fr> wrote: >>>> >>>>> Using PETSc MatIS, how to matmult a global IS matrix and a global >>>>> vector ? Example is attached : I don't get what I expect that is a vector >>>>> such that proc0 = [1, 2] and proc1 = [2, 1] >>>>> >>>> >>>> 1) I think the global size of your matrix is wrong. You seem to want 3, >>>> not 4 >>>> >>>> 2) Global vectors have a non-overlapping row partition. You might be >>>> thinking of local vectors >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> http://www.caam.rice.edu/~mk51/ >>>> >>>> >>>> ------------------------------ >>>> >>>> *De: *"Stefano Zampini" >>>> *?: *"Matthew Knepley" >>>> *Cc: *"Franck Houssen" , "PETSc" < >>>> petsc-users at mcs.anl.gov>, "PETSc" >>>> *Envoy?: *Dimanche 21 Mai 2017 23:02:37 >>>> *Objet: *Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS >>>> matrix and a global vector ? >>>> >>>> Franck, >>>> >>>> PETSc takes care of doing the matrix-vector multiplication properly >>>> using MatIS. As Matt said, the layout of the vectors is the usual parallel >>>> layout. >>>> The local sizes of the MatIS matrix (i.e. the local size of the left >>>> and right vectors used in MatMult) are not the sizes of the local subdomain >>>> matrices in MatIS. >>>> >>>> >>>> On May 21, 2017, at 6:47 PM, Matthew Knepley wrote: >>>> >>>> On Sun, May 21, 2017 at 11:26 AM, Franck Houssen < >>>> franck.houssen at inria.fr> wrote: >>>> >>>>> Using PETSc MatIS, how to matmult a global IS matrix and a global >>>>> vector ? Example is attached : I don't get what I expect that is a vector >>>>> such that proc0 = [1, 2] and proc1 = [2, 1] >>>>> >>>> >>>> 1) I think the global size of your matrix is wrong. You seem to want 3, >>>> not 4 >>>> >>>> 2) Global vectors have a non-overlapping row partition. You might be >>>> thinking of local vectors >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Franck >>>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> http://www.caam.rice.edu/~mk51/ >>>> >>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> http://www.caam.rice.edu/~mk51/ >>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> >> http://www.caam.rice.edu/~mk51/ >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Wed May 24 07:57:09 2017 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 24 May 2017 07:57:09 -0500 Subject: [petsc-users] Installation Error In-Reply-To: <15e1cc1.5bb6.15c38e117d7.Coremail.lirui319@hnu.edu.cn> References: <15e1cc1.5bb6.15c38e117d7.Coremail.lirui319@hnu.edu.cn> Message-ID: What do you have for: which python echo $PYTHONPATH The following might work.. PYTHONPATH='' /usr/bin/python ./configure --with-cc=gcc --with-cxx=0 --with-fc=0 --download-f2cblaslapack --download-mpich Satish On Wed, 24 May 2017, ?? wrote: > > Dear professor or engineer: > I meet a problem about installation to petsc. > When I type the code "./configure --with-cc=gcc --with-cxx=0 --with-fc=0 --download-f2cblaslapack --download-mpich" on my terminal,the answer reveals the following results. > > >>>ERROR:root:code for hash md5 was not found. > Traceback (most recent call last): > File "/home/zhuizhuluori/lirui/software/vapor-2.5.0-Linux_x86_64/vapor/vapor-2.5.0/lib/python2.7/hashlib.py", line 139, in globals()[__func_name] = __get_hash(__func_name) > File "/home/zhuizhuluori/lirui/software/vapor-2.5.0-Linux_x86_64/vapor/vapor-2.5.0/lib/python2.7/hashlib.py", line 91, in __get_builtin_constructor > raise ValueError('unsupported hash type ' + name) > ValueError: unsupported hash type md5 > ERROR:root:code for hash sha1 was not found ..... > > I have used petsc for a long time,and never see the this problem.my laptop is installed an old version of petsc and I wanna change it to a new version.How can I fix it?Thanks for your heartful suggestion! > > > > > > From jchludzinski at gmail.com Wed May 24 08:03:01 2017 From: jchludzinski at gmail.com (John Chludzinski) Date: Wed, 24 May 2017 09:03:01 -0400 Subject: [petsc-users] PETSC OO C guide/standard? Message-ID: Is there a guide for how to write/develop PETSC OO C code? How a "class" is defined/implemented? How you implement inheritance? Memory management? Etc? ---John -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 24 08:11:46 2017 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 May 2017 08:11:46 -0500 Subject: [petsc-users] PETSC OO C guide/standard? In-Reply-To: References: Message-ID: On Wed, May 24, 2017 at 8:03 AM, John Chludzinski wrote: > Is there a guide for how to write/develop PETSC OO C code? How a "class" > is defined/implemented? How you implement inheritance? Memory management? > Etc? > We have a guide: http://www.mcs.anl.gov/petsc/developers/developers.pdf If its not in there, you can mail the list. Thanks, Matt > ---John > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From franck.houssen at inria.fr Wed May 24 08:11:28 2017 From: franck.houssen at inria.fr (Franck Houssen) Date: Wed, 24 May 2017 15:11:28 +0200 (CEST) Subject: [petsc-users] [petsc-dev] Using PETSc MatIS, how to get local matrix (= one domain) before and after assembly ? In-Reply-To: <2C9AF920-14AF-4BB4-B2E0-D1162FA0A0BB@gmail.com> References: <867421313.6757137.1495383596545.JavaMail.zimbra@inria.fr> <2033509705.7414108.1495532492501.JavaMail.zimbra@inria.fr> <740691579.7684644.1495557293858.JavaMail.zimbra@inria.fr> <9EFA5BCF-FDD3-45FA-A41A-6AA304D58C74@gmail.com> <694141704.7876584.1495619161488.JavaMail.zimbra@inria.fr> <2C9AF920-14AF-4BB4-B2E0-D1162FA0A0BB@gmail.com> Message-ID: <1874209065.7990175.1495631488066.JavaMail.zimbra@inria.fr> OK, this is working now ! As the API changed between the latest stable and the master branch, I was actually not using the correct method. Thanks Stefano, Franck ----- Mail original ----- > De: "Stefano Zampini" > ?: "Franck Houssen" > Cc: "petsc-dev" , "PETSc users list" > , "petsc-maint" > Envoy?: Mercredi 24 Mai 2017 13:42:10 > Objet: Re: [petsc-dev] Using PETSc MatIS, how to get local matrix (= one > domain) before and after assembly ? > > On May 24, 2017, at 11:46 AM, Franck Houssen < franck.houssen at inria.fr > > > wrote: > > > The code I sent compile and run at my side with petsc-3.7.6 (on > > debian/testing with gcc-6.3). The code you sent does not compile at my > > side. > > Anyway, no big deal. > > MatGetSubMatrix/MatGetSubMatrices have been renamed to > MatCreateSubMatrix/MatCreateSubMatrices in petsc-dev. I thought you were > using the master branch and not the latest release. Sorry for the confusion. > To compile the code I have sent, just rename MatCreateSubMatrices with > MatGetSubMatrices and it should work. > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetSubMatrices.html#MatGetSubMatrices > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetSubMatrix.html#MatGetSubMatrix > > The modification you propose as far as I understand is to replace > > "ISCreateGeneral(PETSC_COMM_WORLD" with "ISCreateGeneral(PETSC_COMM_SELF" : > > still not working at my side (empty dirichlet local matrix). > > > I will try to get that with a MPI matrix (that would contain same data that > > MatIS : that's what I tried to avoid as this doubles allocations - anyway, > > no big deal). > > In the code, you are already extracting submatrices from MPIAIJ format, not > from MATIS. Attached a code that compiles and runs with petsc-3.7.6 > > Franck > > > ----- Mail original ----- > > > > De: "Stefano Zampini" < stefano.zampini at gmail.com > > > > > > > ?: "Franck Houssen" < franck.houssen at inria.fr > > > > > > > Cc: "petsc-dev" < petsc-dev at mcs.anl.gov >, "PETSc users list" < > > > petsc-users at mcs.anl.gov >, "petsc-maint" < knepley at gmail.com > > > > > > > Envoy?: Mardi 23 Mai 2017 20:23:49 > > > > > > Objet: Re: [petsc-dev] Using PETSc MatIS, how to get local matrix (= one > > > domain) before and after assembly ? > > > > > > > On May 23, 2017, at 6:34 PM, Franck Houssen < franck.houssen at inria.fr > > > > > wrote: > > > > > > > > > > OK. I am supposed to destroy the matrix returned by MatISGetMPIXAIJ ? > > > > > > > > > Yes > > > > > > > Also, my example still not get the final assembled local matrix (the > > > > MatCreateSubMatrix returns an empty matrix) but as far as I understand > > > > my > > > > (global) index set is OK: what did I miss ? 
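(For concreteness, a minimal sketch of the approach that ends up working in this thread: assemble the MatIS A into a global MPIAIJ matrix, then pull out each rank's assembled "Dirichlet" block as a sequential matrix. This uses petsc-3.7 names -- the master branch calls the extraction routine MatCreateSubMatrices -- and nLocalDofs/globalDofs stand for this rank's subdomain size and its dof indices in the global numbering; error checking omitted.)

    Mat B, *Aloc;
    IS  is;

    MatISGetMPIXAIJ(A, MAT_INITIAL_MATRIX, &B);                   /* collective: assemble MatIS to MPIAIJ   */
    ISCreateGeneral(PETSC_COMM_SELF, nLocalDofs, globalDofs,
                    PETSC_COPY_VALUES, &is);                      /* this rank's subdomain, global numbering */
    MatGetSubMatrices(B, 1, &is, &is, MAT_INITIAL_MATRIX, &Aloc); /* collective: Aloc[0] is a SeqAIJ matrix  */
    MatView(Aloc[0], PETSC_VIEWER_STDOUT_SELF);                   /* e.g. diag (1,2) on rank 0, (2,1) on rank 1 */
    MatDestroyMatrices(1, &Aloc);
    ISDestroy(&is);
    MatDestroy(&B);

Both MatISGetMPIXAIJ and MatGetSubMatrices are collective, so every rank must call them even if it only prints its own block.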
> > > > > > > > > I really doubt you can use the example you have sent. It doesn?t compile, > > > as > > > MatCreateSubMatrix needs an extra argument. > > > > > > Attached a modified version that does what I guess is what you are > > > looking > > > for (sequential Dirichlet problems on the subdomains). > > > > > > > Franck > > > > > > > > > > ----- Mail original ----- > > > > > > > > > > > De: "Stefano Zampini" < stefano.zampini at gmail.com > > > > > > > > > > > > > > > > ?: "Franck Houssen" < franck.houssen at inria.fr > > > > > > > > > > > > > > > > Cc: "petsc-dev" < petsc-dev at mcs.anl.gov >, "PETSc users list" < > > > > > petsc-users at mcs.anl.gov >, "petsc-maint" < knepley at gmail.com > > > > > > > > > > > > > > > > Envoy?: Mardi 23 Mai 2017 13:16:18 > > > > > > > > > > > > > > > Objet: Re: [petsc-dev] Using PETSc MatIS, how to get local matrix (= > > > > > one > > > > > domain) before and after assembly ? > > > > > > > > > > > > > > > MatISGetMPIXAIJ is collective, as it assembles the global operator. > > > > > To > > > > > get > > > > > the matrices you are looking for, you should call MatCreateSubMatrix > > > > > on > > > > > the > > > > > assembled global operator, with the global indices representing the > > > > > subdomain problem. Each process needs to call both functions > > > > > > > > > > > > > > > Stefano > > > > > > > > > > > > > > > Il 23 Mag 2017 11:41, "Franck Houssen" < franck.houssen at inria.fr > ha > > > > > scritto: > > > > > > > > > > > > > > > > I have a 3x3 global matrix made of two overlapping 2x2 local matrix > > > > > > (= > > > > > > diagonal with 1.). Each local matrix correspond to one domain (each > > > > > > domain > > > > > > is delegated to one MPI proc, so, I have 2 MPI procs because I have > > > > > > 2 > > > > > > domains). > > > > > > > > > > > > > > > > > > > > > This is the simplest possible example: I have two 2x2 (local) diag > > > > > > matrix > > > > > > that overlap so that the global matrix built from them is 1, 2, 1 > > > > > > on > > > > > > the > > > > > > diagonal (local contributions add up in the middle). > > > > > > > > > > > > > > > > > > > > > Now, I need for each MPI proc to get the assembled local matrix > > > > > > (sometimes > > > > > > called the dirichlet matrix) : this is a local matrix (sequential - > > > > > > not > > > > > > distributed with MPI) that accounts for contribution of neighboring > > > > > > domains > > > > > > (MPI proc). > > > > > > > > > > > > > > > > > > > > > How to get the local assembled matrix ? MatGetLocalSubMatrix does > > > > > > not > > > > > > work > > > > > > (throw error - see example attached). MatGetSubMatrix returns a MPI > > > > > > distributed matrix, not a local (sequential) one. > > > > > > > > > > > > > > > > > > > > > 1. My understanding is that MatISGetMPIXAIJ should return a local > > > > > > matrix > > > > > > (sequential AIJ matrix) : the MPI in the name recall that you get > > > > > > the > > > > > > assembled matrix (with contributions from the shared border) from > > > > > > the > > > > > > other > > > > > > MPI processus. Correct ? In my simple example, I replaced > > > > > > MatGetLocalSubMatrix with MatISGetMPIXAIJ : I get a deadlock which > > > > > > was > > > > > > surprising to me... Is MatISGetMPIXAIJ a collective call ? > > > > > > > > > > > > > > > > > > > > > 2. 
Supposing this is a collective call (and that point 1 is not > > > > > > correct), > > > > > > I > > > > > > ride up MatISGetMPIXAIJ before the "if (rank > 0)" : I don't > > > > > > deadlock > > > > > > now, > > > > > > but it seems I get a global matrix which is not the assembled local > > > > > > matrix > > > > > > I > > > > > > am looking for. > > > > > > > > > > > > > > > > > > > > > 3. I am supposed to destroy the matrix returned by MatISGetMPIXAIJ > > > > > > ? > > > > > > (I > > > > > > believe yes - not sure as AFAIU wording should associate Destroy > > > > > > methods > > > > > > to > > > > > > Create methods) > > > > > > > > > > > > > > > > > > > > > Franck > > > > > > > > > > > > > > > > > > > > > The git diff illustrate modifications I tried to add to the initial > > > > > > file > > > > > > attached to this thread: > > > > > > > > > > > > > > > > > > > > > --- a/matISLocalMat.cpp > > > > > > > > > > > > > > > > > > > > > +++ b/matISLocalMat.cpp > > > > > > > > > > > > > > > > > > > > > @@ -31,6 +31,8 @@ int main(int argc,char **argv) { > > > > > > > > > > > > > > > > > > > > > MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); MatAssemblyEnd(A, > > > > > > MAT_FINAL_ASSEMBLY); > > > > > > > > > > > > > > > > > > > > > MatView(A, PETSC_VIEWER_STDOUT_WORLD); > > > > > > PetscViewerFlush(PETSC_VIEWER_STDOUT_WORLD); // Diag: 1, 2, 1 > > > > > > > > > > > > > > > > > > > > > + Mat assembledLocalMat; > > > > > > > > > > > > > > > > > > > > > + MatISGetMPIXAIJ(A, MAT_INITIAL_MATRIX, &assembledLocalMat); > > > > > > > > > > > > > > > > > > > > > if (rank > 0) { // Do not pollute stdout: print only 1 proc > > > > > > > > > > > > > > > > > > > > > std::cout << std::endl << "non assembled local matrix:" << > > > > > > std::endl > > > > > > << > > > > > > std::endl; > > > > > > > > > > > > > > > > > > > > > Mat nonAssembledLocalMat; > > > > > > > > > > > > > > > > > > > > > @@ -38,11 +40,10 @@ int main(int argc,char **argv) { > > > > > > > > > > > > > > > > > > > > > MatView(nonAssembledLocalMat, PETSC_VIEWER_STDOUT_SELF); // Diag: > > > > > > 1, > > > > > > 1 > > > > > > > > > > > > > > > > > > > > > std::cout << std::endl << "assembled local matrix:" << std::endl << > > > > > > std::endl; > > > > > > > > > > > > > > > > > > > > > - Mat assembledLocalMat; > > > > > > > > > > > > > > > > > > > > > - IS is; ISCreateGeneral(PETSC_COMM_SELF, localSize, localIdx, > > > > > > PETSC_COPY_VALUES, &is); > > > > > > > > > > > > > > > > > > > > > - MatGetLocalSubMatrix(A, is, is, &assembledLocalMat); // KO ?!... > > > > > > > > > > > > > > > > > > > > > - MatView(assembledLocalMat, PETSC_VIEWER_STDOUT_SELF); // Would > > > > > > like > > > > > > to > > > > > > get > > > > > > => Diag: 2, 1 > > > > > > > > > > > > > > > > > > > > > + //IS is; ISCreateGeneral(PETSC_COMM_SELF, localSize, localIdx, > > > > > > PETSC_COPY_VALUES, &is); > > > > > > > > > > > > > > > > > > > > > + //MatGetLocalSubMatrix(A, is, is, &assembledLocalMat); // KO > > > > > > ?!... 
> > > > > > > > > > > > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > > + MatView(assembledLocalMat, PETSC_VIEWER_STDOUT_WORLD); // Would > > > > > > like > > > > > > to > > > > > > get > > > > > > => Diag: 2, 1 > > > > > > > > > > > > > > > > > > > > > > De: "Stefano Zampini" < stefano.zampini at gmail.com > > > > > > > > > > > > > > > > > > > > > > > > > > > > > ?: "petsc-maint" < knepley at gmail.com > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Cc: "petsc-dev" < petsc-dev at mcs.anl.gov >, "PETSc users list" < > > > > > > > petsc-users at mcs.anl.gov >, "Franck Houssen" < > > > > > > > franck.houssen at inria.fr > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Envoy?: Dimanche 21 Mai 2017 22:51:34 > > > > > > > > > > > > > > > > > > > > > > > > > > > > Objet: Re: [petsc-dev] Using PETSc MatIS, how to get local matrix > > > > > > > (= > > > > > > > one > > > > > > > domain) before and after assembly ? > > > > > > > > > > > > > > > > > > > > > > > > > > > > To assemble the operator in aij format, use > > > > > > > > > > > > > > > > > > > > > > > > > > > > MatISGetMPIXAIJ > > > > > > > > > > > > > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatISGetMPIXAIJ.html > > > > > > > > > > > > > > > > > > > > > > > > > > > > Il 21 Mag 2017 18:43, "Matthew Knepley" < knepley at gmail.com > ha > > > > > > > scritto: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Sun, May 21, 2017 at 11:23 AM, Franck Houssen < > > > > > > > > franck.houssen at inria.fr > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I have a 3x3 global matrix is built (diag: 1, 2, 1): it's > > > > > > > > > made > > > > > > > > > of > > > > > > > > > 2 > > > > > > > > > overlapping 2x2 local matrix (diag: 1, 1). > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Getting non assembled local matrix is OK with > > > > > > > > > MatISGetLocalMat. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > How to get assembled local matrix (initial local matrix + > > > > > > > > > neigbhor > > > > > > > > > contributions on the borders) ? 
(expected result is diag: 2, > > > > > > > > > 1) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > You can always use > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetSubMatrix.html > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetSubMatrices.html > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > to get copies, but if you just want to build things, you can > > > > > > > > use > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetLocalSubMatrix.html > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Matt > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Franck > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > What most experimenters take for granted before they begin > > > > > > > > their > > > > > > > > experiments > > > > > > > > is infinitely more interesting than any results to which their > > > > > > > > experiments > > > > > > > > lead. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- Norbert Wiener > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > http://www.caam.rice.edu/~mk51/ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jchludzinski at gmail.com Wed May 24 08:50:39 2017 From: jchludzinski at gmail.com (John Chludzinski) Date: Wed, 24 May 2017 09:50:39 -0400 Subject: [petsc-users] PETSC OO C guide/standard? In-Reply-To: References: Message-ID: Considering that the current C++ standard is >1600 pages and counting (still glomming on new "features"), I'm planning to try an OO style of C coding style. The standard's size (number of pages) being the best (and only *practical*) means to measure language complexity. On Wed, May 24, 2017 at 9:11 AM, Matthew Knepley wrote: > On Wed, May 24, 2017 at 8:03 AM, John Chludzinski > wrote: > >> Is there a guide for how to write/develop PETSC OO C code? How a "class" >> is defined/implemented? How you implement inheritance? Memory management? >> Etc? >> > > We have a guide: http://www.mcs.anl.gov/petsc/developers/developers.pdf > > If its not in there, you can mail the list. > > Thanks, > > Matt > > >> ---John >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 24 08:53:35 2017 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 May 2017 08:53:35 -0500 Subject: [petsc-users] PETSC OO C guide/standard? In-Reply-To: References: Message-ID: On Wed, May 24, 2017 at 8:50 AM, John Chludzinski wrote: > Considering that the current C++ standard is >1600 pages and counting > (still glomming on new "features"), I'm planning to try an OO style of C > coding style. 
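(A toy illustration of the OO-in-C pattern the developers guide mentioned above describes -- this is not PETSc's actual header layout, just the general shape: a "class" is a struct whose first member is a table of function pointers, public "methods" dispatch through that table, and each implementation hangs its private state off a void pointer.)

    typedef struct _p_Solver *Solver;

    struct _SolverOps {
      int (*setup)(Solver);
      int (*solve)(Solver, const double *b, double *x);
      int (*destroy)(Solver);
    };

    struct _p_Solver {
      struct _SolverOps ops;   /* "virtual function table"                              */
      int               n;     /* data shared by every implementation ("base class")    */
      void             *data;  /* implementation-specific state ("derived class")       */
    };

    /* The public API forwards through the table, the way MatMult() forwards
       to whatever Mat type was registered: */
    int SolverSolve(Solver s, const double *b, double *x) { return (*s->ops.solve)(s, b, x); }

Data "inheritance" is by containment (the common header comes first), and memory management follows the Create/Destroy pairing used throughout the library.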
> > The standard's size (number of pages) being the best (and only *practical*) > means to measure language complexity. > Here is another thing I wrote talking about OO in PETSc: https://arxiv.org/abs/1209.1711 Matt > On Wed, May 24, 2017 at 9:11 AM, Matthew Knepley > wrote: > >> On Wed, May 24, 2017 at 8:03 AM, John Chludzinski > > wrote: >> >>> Is there a guide for how to write/develop PETSC OO C code? How a "class" >>> is defined/implemented? How you implement inheritance? Memory management? >>> Etc? >>> >> >> We have a guide: http://www.mcs.anl.gov/petsc/developers/developers.pdf >> >> If its not in there, you can mail the list. >> >> Thanks, >> >> Matt >> >> >>> ---John >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> http://www.caam.rice.edu/~mk51/ >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed May 24 12:37:24 2017 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 24 May 2017 12:37:24 -0500 Subject: [petsc-users] Question on incomplete factorization level and fill In-Reply-To: <93217794-9c63-fd52-ab36-4174de8cb9c8@gmail.com> References: <93217794-9c63-fd52-ab36-4174de8cb9c8@gmail.com> Message-ID: <1BBCC24A-97A1-49CD-A234-09837E24FCA8@mcs.anl.gov> > On May 24, 2017, at 2:21 AM, Danyang Su wrote: > > Dear All, > > I use PCFactorSetLevels for ILU and PCFactorSetFill for other preconditioning in my code to help solve the problems that the default option is hard to solve. However, I found the latter one, PCFactorSetFill does not take effect for my problem. SetFill doesn't affect the numerical answers at all. It is just a prediction you make of how much memory you expect to be used inside the factorization. > The matrices and rhs as well as the solutions are attached from the link below. I obtain the solution using hypre preconditioner and it takes 7 and 38 iterations for matrix 1 and matrix 2. However, if I use other preconditioner, the solver just failed at the first matrix. I have tested this matrix using the native sequential solver (not PETSc) with ILU preconditioning. If I set the incomplete factorization level to 0, this sequential solver will take more than 100 iterations. If I increase the factorization level to 1 or more, it just takes several iterations. This remind me that the PC factor for this matrices should be increased. However, when I tried it in PETSc, it just does not work. > > Matrix and rhs can be obtained from the link below. > > https://eilinator.eos.ubc.ca:8443/index.php/s/CalUcq9CMeblk4R > > Would anyone help to check if you can make this work by increasing the PC factor level or fill? > > Thanks and regards, > > Danyang > > From michal.derezinski at gmail.com Wed May 24 12:37:11 2017 From: michal.derezinski at gmail.com (=?utf-8?Q?Micha=C5=82_Derezi=C5=84ski?=) Date: Wed, 24 May 2017 10:37:11 -0700 Subject: [petsc-users] Accessing submatrices without additional memory usage In-Reply-To: References: Message-ID: <0B69CBD4-9524-429A-8478-0BBF0C236F94@gmail.com> Great! 
Then I have a follow-up question: My goal is to be able to load the full matrix X from disk, while at the same time in parallel, performing computations on the submatrices that have already been loaded. Essentially, I want to think of X as a block matrix (where the blocks are horizontal, spanning the full width of the matrix), where I?m loading one block at a time, and all the blocks that have already been loaded are combined using MatCreateNest, so that I can make computations on that portion of the matrix. In this scenario, every process needs to be simultaneously loading the next block of X, and perform computations on the previously loaded portion. My strategy is for each MPI process to spawn a thread for data loading (so that the memory between the process and the thread is shared), while the process does computations. My concern is that the data loading thread may be using up computational resources of the processor, even though it is mainly doing IO. Will this be an issue? What is the best way to minimize the cpu time of this parallel data loading scheme? Thanks, Michal. > Wiadomo?? napisana przez Matthew Knepley w dniu 24.05.2017, o godz. 04:55: > > On Wed, May 24, 2017 at 1:09 AM, Michal Derezinski > wrote: > Hi, > > I want to be able to perform matrix operations on several contiguous submatrices of a full matrix, without allocating the memory redundantly for the submatrices (in addition to the memory that is already allocated for the full matrix). > I tried using MatGetSubMatrix, but this function appears to allocate the additional memory. > > The other way I found to do this is to create the smallest submatrices I need first, then use MatCreateNest to combine them into bigger ones (including the full matrix). > The documentation of MatCreateNest seems to indicate that it does not allocate additional memory for storing the new matrix. > Is this the right approach, or is there a better one? > > Yes, that is the right approach. > > Thanks, > > Matt > > Thanks, > Michal Derezinski. > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed May 24 12:43:49 2017 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 24 May 2017 12:43:49 -0500 Subject: [petsc-users] Accessing submatrices without additional memory usage In-Reply-To: <0B69CBD4-9524-429A-8478-0BBF0C236F94@gmail.com> References: <0B69CBD4-9524-429A-8478-0BBF0C236F94@gmail.com> Message-ID: <226AA580-DAC1-4455-AB26-A98ECB76A2FA@mcs.anl.gov> How big are the sub matrices, how many MPI processes are you hoping to use, how fast/sophisticated is your file system? All of these things and others will determine whether this approach will buy you anything or not. I recommend NOT doing this first, instead just sequentially read in the matrices and perform the computations and then run profiling to determine where the time is being spent and whether even trying this kind of optimization makes sense. I suspect it does not. Barry > On May 24, 2017, at 12:37 PM, Micha? Derezi?ski wrote: > > Great! Then I have a follow-up question: > > My goal is to be able to load the full matrix X from disk, while at the same time in parallel, performing computations on the submatrices that have already been loaded. 
Essentially, I want to think of X as a block matrix (where the blocks are horizontal, spanning the full width of the matrix), where I?m loading one block at a time, and all the blocks that have already been loaded are combined using MatCreateNest, so that I can make computations on that portion of the matrix. > > In this scenario, every process needs to be simultaneously loading the next block of X, and perform computations on the previously loaded portion. My strategy is for each MPI process to spawn a thread for data loading (so that the memory between the process and the thread is shared), while the process does computations. My concern is that the data loading thread may be using up computational resources of the processor, even though it is mainly doing IO. Will this be an issue? What is the best way to minimize the cpu time of this parallel data loading scheme? > > Thanks, > Michal. > > >> Wiadomo?? napisana przez Matthew Knepley w dniu 24.05.2017, o godz. 04:55: >> >> On Wed, May 24, 2017 at 1:09 AM, Michal Derezinski wrote: >> Hi, >> >> I want to be able to perform matrix operations on several contiguous submatrices of a full matrix, without allocating the memory redundantly for the submatrices (in addition to the memory that is already allocated for the full matrix). >> I tried using MatGetSubMatrix, but this function appears to allocate the additional memory. >> >> The other way I found to do this is to create the smallest submatrices I need first, then use MatCreateNest to combine them into bigger ones (including the full matrix). >> The documentation of MatCreateNest seems to indicate that it does not allocate additional memory for storing the new matrix. >> Is this the right approach, or is there a better one? >> >> Yes, that is the right approach. >> >> Thanks, >> >> Matt >> >> Thanks, >> Michal Derezinski. >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> http://www.caam.rice.edu/~mk51/ > From knepley at gmail.com Wed May 24 12:44:47 2017 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 May 2017 12:44:47 -0500 Subject: [petsc-users] Accessing submatrices without additional memory usage In-Reply-To: <0B69CBD4-9524-429A-8478-0BBF0C236F94@gmail.com> References: <0B69CBD4-9524-429A-8478-0BBF0C236F94@gmail.com> Message-ID: On Wed, May 24, 2017 at 12:37 PM, Micha? Derezi?ski < michal.derezinski at gmail.com> wrote: > Great! Then I have a follow-up question: > > My goal is to be able to load the full matrix X from disk, while at the > same time in parallel, performing computations on the submatrices that have > already been loaded. Essentially, I want to think of X as a block matrix > (where the blocks are horizontal, spanning the full width of the matrix), > where I?m loading one block at a time, and all the blocks that have already > been loaded are combined using MatCreateNest, so that I can make > computations on that portion of the matrix. > I need to understand better. So 1) You want to load a sparse matrix from disk 2) You are imagining that it is loaded row-wise, since you can do a calculation with some rows before others are loaded. What calculation, a MatMult? How long does that MatMult take compared to loading? 3) If you are talking about a dense matrix, you should be loading in parallel using MPI-I/O. We do this for Vec. 
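(For concreteness, one hedged sketch of the block-wise scheme under discussion, assuming each horizontal block of X sits in its own PETSc binary file -- the file naming and block count here are made up for illustration. MatCreateNest only keeps references to the loaded blocks, so stacking them adds no copy.)

    Mat      blocks[64];       /* assumed upper bound on the number of blocks */
    PetscInt b, nblocks = 8;   /* illustrative                                */

    for (b = 0; b < nblocks; b++) {
      PetscViewer viewer;
      char        fname[PETSC_MAX_PATH_LEN];
      Mat         Xpart;

      PetscSNPrintf(fname, sizeof(fname), "Xblock_%D.dat", b);    /* hypothetical file names  */
      PetscViewerBinaryOpen(PETSC_COMM_WORLD, fname, FILE_MODE_READ, &viewer);
      MatCreate(PETSC_COMM_WORLD, &blocks[b]);
      MatSetType(blocks[b], MATMPIAIJ);
      MatLoad(blocks[b], viewer);                                  /* parallel binary read     */
      PetscViewerDestroy(&viewer);

      /* Stack everything loaded so far; Xpart shares storage with blocks[0..b]. */
      MatCreateNest(PETSC_COMM_WORLD, b + 1, NULL, 1, NULL, blocks, &Xpart);
      /* ... run optimization steps with MatMult/MatMultTranspose on Xpart ... */
      MatDestroy(&Xpart);   /* the individual blocks keep their own references */
    }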
Before you do complicated programming, I would assure myself that the performance gain is worth it. > In this scenario, every process needs to be simultaneously loading the > next block of X, and perform computations on the previously loaded portion. > My strategy is for each MPI process to spawn a thread for data loading (so > that the memory between the process and the thread is shared), while the > process does computations. My concern is that the data loading thread may > be using up computational resources of the processor, even though it is > mainly doing IO. Will this be an issue? What is the best way to minimize > the cpu time of this parallel data loading scheme? > Oh, you want to load each block in parallel, but there are many blocks. I would really caution you against using threads. They are death to clean code. Use non-blocking reads. Thanks, Matt > Thanks, > Michal. > > > Wiadomo?? napisana przez Matthew Knepley w dniu > 24.05.2017, o godz. 04:55: > > On Wed, May 24, 2017 at 1:09 AM, Michal Derezinski > wrote: > >> Hi, >> >> I want to be able to perform matrix operations on several contiguous >> submatrices of a full matrix, without allocating the memory redundantly for >> the submatrices (in addition to the memory that is already allocated for >> the full matrix). >> I tried using MatGetSubMatrix, but this function appears to allocate the >> additional memory. >> >> The other way I found to do this is to create the smallest submatrices I >> need first, then use MatCreateNest to combine them into bigger ones >> (including the full matrix). >> The documentation of MatCreateNest seems to indicate that it does not >> allocate additional memory for storing the new matrix. >> Is this the right approach, or is there a better one? >> > > Yes, that is the right approach. > > Thanks, > > Matt > > >> Thanks, >> Michal Derezinski. >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From danyang.su at gmail.com Wed May 24 12:50:25 2017 From: danyang.su at gmail.com (Danyang Su) Date: Wed, 24 May 2017 10:50:25 -0700 Subject: [petsc-users] Question on incomplete factorization level and fill In-Reply-To: References: <93217794-9c63-fd52-ab36-4174de8cb9c8@gmail.com> Message-ID: Hi Matthew and Barry, Thanks for the quick response. I also tried superlu and mumps, both work but it is about four times slower than ILU(dt) prec through hypre, with 24 processors I have tested. When I look into the convergence information, the method using ILU(dt) still takes 200 to 3000 linear iterations for each newton iteration. One reason is this equation is hard to solve. As for the general cases, the same method works awesome and get very good speedup. I also doubt if I use hypre correctly for this case. Is there anyway to check this problem, or is it possible to increase the factorization level through hypre? 
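(A minimal run-time way to experiment with a higher fill level using PETSc's own ILU(k) on each subdomain block -- a sketch only, with a hypothetical executable name, and no claim that it helps this particular matrix:)

    mpiexec -n 24 ./my_app \
        -ksp_type bcgs -ksp_monitor_true_residual -ksp_converged_reason \
        -pc_type asm -sub_pc_type ilu -sub_pc_factor_levels 2

The same -sub_pc_factor_levels option also works with -pc_type bjacobi; raising the level trades more memory and setup time for a stronger local factorization.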
Thanks, Danyang On 17-05-24 04:59 AM, Matthew Knepley wrote: > On Wed, May 24, 2017 at 2:21 AM, Danyang Su > wrote: > > Dear All, > > I use PCFactorSetLevels for ILU and PCFactorSetFill for other > preconditioning in my code to help solve the problems that the > default option is hard to solve. However, I found the latter one, > PCFactorSetFill does not take effect for my problem. The matrices > and rhs as well as the solutions are attached from the link below. > I obtain the solution using hypre preconditioner and it takes 7 > and 38 iterations for matrix 1 and matrix 2. However, if I use > other preconditioner, the solver just failed at the first matrix. > I have tested this matrix using the native sequential solver (not > PETSc) with ILU preconditioning. If I set the incomplete > factorization level to 0, this sequential solver will take more > than 100 iterations. If I increase the factorization level to 1 or > more, it just takes several iterations. This remind me that the PC > factor for this matrices should be increased. However, when I > tried it in PETSc, it just does not work. > > Matrix and rhs can be obtained from the link below. > > https://eilinator.eos.ubc.ca:8443/index.php/s/CalUcq9CMeblk4R > > > Would anyone help to check if you can make this work by increasing > the PC factor level or fill? > > > We have ILU(k) supported in serial. However ILU(dt) which takes a > tolerance only works through Hypre > > http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html > > I recommend you try SuperLU or MUMPS, which can both be downloaded > automatically by configure, and > do a full sparse LU. > > Thanks, > > Matt > > Thanks and regards, > > Danyang > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 24 13:12:07 2017 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 May 2017 13:12:07 -0500 Subject: [petsc-users] Question on incomplete factorization level and fill In-Reply-To: References: <93217794-9c63-fd52-ab36-4174de8cb9c8@gmail.com> Message-ID: On Wed, May 24, 2017 at 12:50 PM, Danyang Su wrote: > Hi Matthew and Barry, > > Thanks for the quick response. > > I also tried superlu and mumps, both work but it is about four times > slower than ILU(dt) prec through hypre, with 24 processors I have tested. > You mean the total time is 4x? And you are taking hundreds of iterates? That seems hard to believe, unless you are dropping a huge number of elements. > When I look into the convergence information, the method using ILU(dt) > still takes 200 to 3000 linear iterations for each newton iteration. One > reason is this equation is hard to solve. As for the general cases, the > same method works awesome and get very good speedup. > I do not understand what you mean here. > I also doubt if I use hypre correctly for this case. Is there anyway to > check this problem, or is it possible to increase the factorization level > through hypre? > > I don't know. Matt > Thanks, > > Danyang > > On 17-05-24 04:59 AM, Matthew Knepley wrote: > > On Wed, May 24, 2017 at 2:21 AM, Danyang Su wrote: > >> Dear All, >> >> I use PCFactorSetLevels for ILU and PCFactorSetFill for other >> preconditioning in my code to help solve the problems that the default >> option is hard to solve. 
However, I found the latter one, PCFactorSetFill >> does not take effect for my problem. The matrices and rhs as well as the >> solutions are attached from the link below. I obtain the solution using >> hypre preconditioner and it takes 7 and 38 iterations for matrix 1 and >> matrix 2. However, if I use other preconditioner, the solver just failed at >> the first matrix. I have tested this matrix using the native sequential >> solver (not PETSc) with ILU preconditioning. If I set the incomplete >> factorization level to 0, this sequential solver will take more than 100 >> iterations. If I increase the factorization level to 1 or more, it just >> takes several iterations. This remind me that the PC factor for this >> matrices should be increased. However, when I tried it in PETSc, it just >> does not work. >> >> Matrix and rhs can be obtained from the link below. >> >> https://eilinator.eos.ubc.ca:8443/index.php/s/CalUcq9CMeblk4R >> >> Would anyone help to check if you can make this work by increasing the PC >> factor level or fill? >> > > We have ILU(k) supported in serial. However ILU(dt) which takes a > tolerance only works through Hypre > > http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html > > I recommend you try SuperLU or MUMPS, which can both be downloaded > automatically by configure, and > do a full sparse LU. > > Thanks, > > Matt > > >> Thanks and regards, >> >> Danyang >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From michal.derezinski at gmail.com Wed May 24 13:13:42 2017 From: michal.derezinski at gmail.com (=?utf-8?Q?Micha=C5=82_Derezi=C5=84ski?=) Date: Wed, 24 May 2017 11:13:42 -0700 Subject: [petsc-users] Accessing submatrices without additional memory usage In-Reply-To: References: <0B69CBD4-9524-429A-8478-0BBF0C236F94@gmail.com> Message-ID: > Wiadomo?? napisana przez Matthew Knepley w dniu 24.05.2017, o godz. 10:44: > > On Wed, May 24, 2017 at 12:37 PM, Micha? Derezi?ski > wrote: > Great! Then I have a follow-up question: > > My goal is to be able to load the full matrix X from disk, while at the same time in parallel, performing computations on the submatrices that have already been loaded. Essentially, I want to think of X as a block matrix (where the blocks are horizontal, spanning the full width of the matrix), where I?m loading one block at a time, and all the blocks that have already been loaded are combined using MatCreateNest, so that I can make computations on that portion of the matrix. > > I need to understand better. So > > 1) You want to load a sparse matrix from disk > Yes, the matrix is sparse, stored on disk in row-wise chunks (one per process), with total size of around 3TB. > 2) You are imagining that it is loaded row-wise, since you can do a calculation with some rows before others are loaded. > > What calculation, a MatMult? > How long does that MatMult take compared to loading? > Yes, a MatMult. I already have a more straightforward implementation where the matrix is loaded completely at the beginning, and then all of the multiplications are performed. 
Based on the loading time and computation time with the current implementation, it appears that most of the computation time could be subsumed into the loading time. > 3) If you are talking about a dense matrix, you should be loading in parallel using MPI-I/O. We do this for Vec. > > Before you do complicated programming, I would assure myself that the performance gain is worth it. > > In this scenario, every process needs to be simultaneously loading the next block of X, and perform computations on the previously loaded portion. My strategy is for each MPI process to spawn a thread for data loading (so that the memory between the process and the thread is shared), while the process does computations. My concern is that the data loading thread may be using up computational resources of the processor, even though it is mainly doing IO. Will this be an issue? What is the best way to minimize the cpu time of this parallel data loading scheme? > > Oh, you want to load each block in parallel, but there are many blocks. I would really caution you against using threads. They > are death to clean code. Use non-blocking reads. I see. Could you expand on your suggestion regarding non-blocking reads? Are you proposing that each process makes an asynchronous read request in between every, say, MatMult operation? > > Thanks, > > Matt > > Thanks, > Michal. > > >> Wiadomo?? napisana przez Matthew Knepley > w dniu 24.05.2017, o godz. 04:55: >> >> On Wed, May 24, 2017 at 1:09 AM, Michal Derezinski > wrote: >> Hi, >> >> I want to be able to perform matrix operations on several contiguous submatrices of a full matrix, without allocating the memory redundantly for the submatrices (in addition to the memory that is already allocated for the full matrix). >> I tried using MatGetSubMatrix, but this function appears to allocate the additional memory. >> >> The other way I found to do this is to create the smallest submatrices I need first, then use MatCreateNest to combine them into bigger ones (including the full matrix). >> The documentation of MatCreateNest seems to indicate that it does not allocate additional memory for storing the new matrix. >> Is this the right approach, or is there a better one? >> >> Yes, that is the right approach. >> >> Thanks, >> >> Matt >> >> Thanks, >> Michal Derezinski. >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> http://www.caam.rice.edu/~mk51/ > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 24 13:19:09 2017 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 May 2017 13:19:09 -0500 Subject: [petsc-users] Accessing submatrices without additional memory usage In-Reply-To: References: <0B69CBD4-9524-429A-8478-0BBF0C236F94@gmail.com> Message-ID: On Wed, May 24, 2017 at 1:13 PM, Micha? Derezi?ski < michal.derezinski at gmail.com> wrote: > > Wiadomo?? napisana przez Matthew Knepley w dniu > 24.05.2017, o godz. 10:44: > > On Wed, May 24, 2017 at 12:37 PM, Micha? Derezi?ski gmail.com> wrote: > >> Great! 
Then I have a follow-up question: >> >> My goal is to be able to load the full matrix X from disk, while at the >> same time in parallel, performing computations on the submatrices that have >> already been loaded. Essentially, I want to think of X as a block matrix >> (where the blocks are horizontal, spanning the full width of the matrix), >> where I?m loading one block at a time, and all the blocks that have already >> been loaded are combined using MatCreateNest, so that I can make >> computations on that portion of the matrix. >> > > I need to understand better. So > > 1) You want to load a sparse matrix from disk > > > Yes, the matrix is sparse, stored on disk in row-wise chunks (one per > process), with total size of around 3TB. > > 2) You are imagining that it is loaded row-wise, since you can do a > calculation with some rows before others are loaded. > > What calculation, a MatMult? > How long does that MatMult take compared to loading? > > > Yes, a MatMult. > I already have a more straightforward implementation where the matrix is > loaded completely at the beginning, and then all of the multiplications are > performed. > Based on the loading time and computation time with the current > implementation, it appears that most of the computation time could be > subsumed into the loading time. > > 3) If you are talking about a dense matrix, you should be loading in > parallel using MPI-I/O. We do this for Vec. > > Before you do complicated programming, I would assure myself that the > performance gain is worth it. > > >> In this scenario, every process needs to be simultaneously loading the >> next block of X, and perform computations on the previously loaded portion. >> My strategy is for each MPI process to spawn a thread for data loading (so >> that the memory between the process and the thread is shared), while the >> process does computations. My concern is that the data loading thread may >> be using up computational resources of the processor, even though it is >> mainly doing IO. Will this be an issue? What is the best way to minimize >> the cpu time of this parallel data loading scheme? >> > > Oh, you want to load each block in parallel, but there are many blocks. I > would really caution you against using threads. They > are death to clean code. Use non-blocking reads. > > > I see. Could you expand on your suggestion regarding non-blocking reads? > Are you proposing that each process makes an asynchronous read request in > between every, say, MatMult operation? > Check this out: http://beige.ucs.indiana.edu/I590/node109.html PETSc does not do this currently, but it sounds like you are handling the load. Thanks, Matt > > Thanks, > > Matt > > >> Thanks, >> Michal. >> >> >> Wiadomo?? napisana przez Matthew Knepley w dniu >> 24.05.2017, o godz. 04:55: >> >> On Wed, May 24, 2017 at 1:09 AM, Michal Derezinski >> wrote: >> >>> Hi, >>> >>> I want to be able to perform matrix operations on several contiguous >>> submatrices of a full matrix, without allocating the memory redundantly for >>> the submatrices (in addition to the memory that is already allocated for >>> the full matrix). >>> I tried using MatGetSubMatrix, but this function appears to allocate the >>> additional memory. >>> >>> The other way I found to do this is to create the smallest submatrices I >>> need first, then use MatCreateNest to combine them into bigger ones >>> (including the full matrix). 
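(A rough sketch of what a non-blocking read of one block could look like with MPI-I/O, along the lines of the link above; the file name, offset, and byte count are placeholders, and the raw bytes would still have to be turned into a Mat afterwards.)

    MPI_File    fh;
    MPI_Request req;
    MPI_Status  status;
    char       *buf = (char*)malloc(block_bytes);  /* block_bytes, my_offset: assumed known from the file layout */

    MPI_File_open(PETSC_COMM_WORLD, "Xblock_3.dat", MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);
    MPI_File_iread_at(fh, my_offset, buf, block_bytes, MPI_BYTE, &req);  /* returns immediately            */
    /* ... keep iterating with MatMult on the blocks already in memory ... */
    MPI_Wait(&req, &status);                                             /* the new block is now in buf    */
    MPI_File_close(&fh);
    /* build the next Mat from buf, then free(buf) */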
>>> The documentation of MatCreateNest seems to indicate that it does not >>> allocate additional memory for storing the new matrix. >>> Is this the right approach, or is there a better one? >>> >> >> Yes, that is the right approach. >> >> Thanks, >> >> Matt >> >> >>> Thanks, >>> Michal Derezinski. >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> http://www.caam.rice.edu/~mk51/ >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed May 24 13:32:18 2017 From: jed at jedbrown.org (Jed Brown) Date: Wed, 24 May 2017 12:32:18 -0600 Subject: [petsc-users] Accessing submatrices without additional memory usage In-Reply-To: <0B69CBD4-9524-429A-8478-0BBF0C236F94@gmail.com> References: <0B69CBD4-9524-429A-8478-0BBF0C236F94@gmail.com> Message-ID: <87shju2jil.fsf@jedbrown.org> Micha? Derezi?ski writes: > Great! Then I have a follow-up question: > > My goal is to be able to load the full matrix X from disk, while at > the same time in parallel, performing computations on the submatrices > that have already been loaded. Essentially, I want to think of X as a > block matrix (where the blocks are horizontal, spanning the full width > of the matrix), What would be the distribution of the vector that this non-square submatrix (probably with many empty columns) is applied to? Could you back up and explain what problem you're trying to solve? It sounds like you're about to code yourself into a dungeon. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 832 bytes Desc: not available URL: From bsmith at mcs.anl.gov Wed May 24 13:38:28 2017 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 24 May 2017 13:38:28 -0500 Subject: [petsc-users] [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix and a global vector ? In-Reply-To: <1238048783.7876567.1495619158445.JavaMail.zimbra@inria.fr> References: <2012394521.6757315.1495383841678.JavaMail.zimbra@inria.fr> <264DC59D-B914-42E5-9A89-0746F21A37BF@gmail.com> <1392596904.7422896.1495533198072.JavaMail.zimbra@inria.fr> <1784716977.7683031.1495556883254.JavaMail.zimbra@inria.fr> <855172682.7687763.1495558287122.JavaMail.zimbra@inria.fr> <1238048783.7876567.1495619158445.JavaMail.zimbra@inria.fr> Message-ID: <02FA1C44-2E29-4000-B21E-B9D96C5B14A0@mcs.anl.gov> > On May 24, 2017, at 4:45 AM, Franck Houssen wrote: > > Coming from FEM, I believe the very confusing thing is that the local size of the user problem (math, physics point of view - DDM domain size) is not (can not be ?) the local size expected in MatCreateIS. > > My understanding is that the local size in MatIS is "just" related to backend implementation problems (it's logical that this local size is necessary, but, for another purpose: MPI machinery). 
Taking a few steps back, I can not see a case (I may be wrong) when a user does know how to compute or set "by hand" the local size that MatIS will expect: my understanding (once again, not sure) is that in most cases, the user will need local size to be PETSC_DECIDE in MatIS (because he doesn't want to "bother" with that or can not guess / compute it => unfortunatelly, as is, this jam the whole thing). > > I guess this kind of signature for MatIS would avoid/limit confusion in most cases and for most users : > PetscErrorCode MatCreateIS(MPI_Comm comm,PetscInt bs,PetscInt M,PetscInt N,ISLocalToGlobalMapping rmap,ISLocalToGlobalMapping cmap,Mat *A,PetscInt m = PETSC_DECIDE,PetscInt n= PETSC_DECIDE) > Or even > PetscErrorCode MatCreateIS(MPI_Comm comm,PetscInt bs,PetscInt M,PetscInt N,ISLocalToGlobalMapping rmap,ISLocalToGlobalMapping cmap,Mat *A) // Always use PETSC_DECIDE backstage ? > You are correct that often m and n may be PETSC_DECIDE, however there are also valid reasons for them to be determined by the user and not just set automatically. With finite elements and PETSc one often partitions first the elements and then partitions the degrees of freedom on the elements subservient to the partitioning of the elements; by this I mean any degree of freedom that is on an element interior to a process in the element partitioning (degree of freedom in no way "shared" between processes) would be a assigned to that MPI process while "shared" elements are assigned by some rule to one of the processes that "share" the degree of freedom. In this case if the user computes the correct local m and n they will get exactly the partitioning of degrees of freedom they want (in the global vector) but if they let PETSc decide they won't get neccessarily the same partitioning. The reason the m and n are the 3rd and 4th argument instead of the last arguments is to match the calls for, for example MatCreateAIJ() and MatCreateBAIJ() so that users understand the m and n have the same meaning as that case. Unfortunately this does not seem to have worked since the m and n arguments appear to have been confusing to you. Barry > Franck > > De: "Matthew Knepley" > ?: "Franck Houssen" > Cc: "Stefano Zampini" , "PETSc" , "PETSc" > Envoy?: Mardi 23 Mai 2017 19:02:28 > Objet: Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix and a global vector ? > > On Tue, May 23, 2017 at 11:51 AM, Franck Houssen wrote: > Not sure to know what question you're talking about ?!... > I use MatIS to test some kind of domain decomposition methods. I define my own preconditioner for that: in the apply callback, I need to matmult my (matIS) matrix with the incoming vector. > > Okay. I will create an example using your suggestion. > > Thanks, > > Matt > > Franck > > De: "Matthew Knepley" > ?: "Franck Houssen" > Cc: "Stefano Zampini" , "PETSc" , "PETSc" > Envoy?: Mardi 23 Mai 2017 18:46:34 > Objet: Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix and a global vector ? > > On Tue, May 23, 2017 at 11:28 AM, Franck Houssen wrote: > OK, thanks. This is helpfull... But I really think the doc should be more verbose about that: this is really confusing and I didn't find any simple example to begin with which make all this even more confusing (personal opinion). > > Did you respond to my other question (how are you using them)? That would help me understand how to phrase it. 
> > Thanks, > > Matt > > Franck > > > De: "Matthew Knepley" > ?: "Franck Houssen" > Cc: "Stefano Zampini" , "PETSc" , "PETSc" > Envoy?: Mardi 23 Mai 2017 13:21:21 > Objet: Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix and a global vector ? > > On Tue, May 23, 2017 at 4:53 AM, Franck Houssen wrote: > The first thing I did was to put 3, not 4 : I got an error thrown in MatCreateIS (see the git diff + stack below). As the error said I used globalSize = numberOfMPIProcessus * localSize : my understanding is that, when using MatIS, the global size needs to be the sum of all local sizes. Correct ? > > No. MatIS means that the matrix is not assembled. The easiest way (for me) to think of this is that processes do not have > to hold full rows. One process can hold part of row i, and another processes can hold another part. However, there are still > the same number of global rows. > > I have a 3x3 global matrix made of two overlapping 2x2 local matrix (= diagonal with 1.). Each local matrix correspond to one domain (each domain is delegated to one MPI proc, so, I have 2 MPI procs because I have 2 domains). > > So the global size is 3. The local size here is not the size of the local IS block, since that is a property only of MatIS. It is the > size of the local piece of the vector you multiply. This allows PETSc to understand the parallel layout of the Vec, and how it > matched the Mat. > > This is somewhat confusing because FEM people mean something different by "local" than we do here, and in fact we use this > other definition of local when assembling operators. > > Matt > > This is the simplest possible example: I have two 2x2 (local) diag matrix that overlap so that the global matrix built from them is 1, 2, 1 on the diagonal (local contributions add up in the middle). > I need to MatMult this global matrix with a global vector filled with 1. 
> > Franck > > Git diff : > > --- a/matISLocalMat.cpp > +++ b/matISLocalMat.cpp > @@ -16,7 +16,7 @@ int main(int argc,char **argv) { > int size = 0; MPI_Comm_size(MPI_COMM_WORLD, &size); if (size != 2) return 1; > int rank = 0; MPI_Comm_rank(MPI_COMM_WORLD, &rank); > > - PetscInt localSize = 2, globalSize = localSize*2 /*2 MPI*/; > + PetscInt localSize = 2, globalSize = 3; > PetscInt localIdx[2] = {0, 0}; > if (rank == 0) {localIdx[0] = 0; localIdx[1] = 1;} > else {localIdx[0] = 1; localIdx[1] = 2;} > > > > Stack error: > > [0]PETSC ERROR: Nonconforming object sizes > [0]PETSC ERROR: Sum of local lengths 4 does not equal global length 3, my local length 2 > [0]PETSC ERROR: [0] ISG2LMapApply line 17 /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/vec/is/utils/isltog.c > [0]PETSC ERROR: [0] MatSetValues_IS line 692 /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > [0]PETSC ERROR: [0] MatSetValues line 1157 /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/interface/matrix.c > [0]PETSC ERROR: [0] MatISSetPreallocation_IS line 95 /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > [0]PETSC ERROR: [0] MatISSetPreallocation line 80 /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > [0]PETSC ERROR: [0] PetscSplitOwnership line 80 /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/sys/utils/psplit.c > [0]PETSC ERROR: [0] PetscLayoutSetUp line 129 /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/vec/is/utils/pmap.c > [0]PETSC ERROR: [0] MatSetLocalToGlobalMapping_IS line 628 /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > [0]PETSC ERROR: [0] MatSetLocalToGlobalMapping line 1899 /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/interface/matrix.c > [0]PETSC ERROR: [0] MatCreateIS line 986 /home/fghoussen/Documents/INRIA/petsc-3.7.6/src/mat/impls/is/matis.c > > > > De: "Stefano Zampini" > ?: "Matthew Knepley" > Cc: "Franck Houssen" , "PETSc" , "PETSc" > Envoy?: Dimanche 21 Mai 2017 23:02:37 > Objet: Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix and a global vector ? > > Franck, > > PETSc takes care of doing the matrix-vector multiplication properly using MatIS. As Matt said, the layout of the vectors is the usual parallel layout. > The local sizes of the MatIS matrix (i.e. the local size of the left and right vectors used in MatMult) are not the sizes of the local subdomain matrices in MatIS. > > > On May 21, 2017, at 6:47 PM, Matthew Knepley wrote: > > On Sun, May 21, 2017 at 11:26 AM, Franck Houssen wrote: > Using PETSc MatIS, how to matmult a global IS matrix and a global vector ? Example is attached : I don't get what I expect that is a vector such that proc0 = [1, 2] and proc1 = [2, 1] > > 1) I think the global size of your matrix is wrong. You seem to want 3, not 4 > > 2) Global vectors have a non-overlapping row partition. You might be thinking of local vectors > > Thanks, > > Matt > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ > > De: "Stefano Zampini" > ?: "Matthew Knepley" > Cc: "Franck Houssen" , "PETSc" , "PETSc" > Envoy?: Dimanche 21 Mai 2017 23:02:37 > Objet: Re: [petsc-dev] Using PETSc MatIS, how to matmult a global IS matrix and a global vector ? > > Franck, > > PETSc takes care of doing the matrix-vector multiplication properly using MatIS. 
As Matt said, the layout of the vectors is the usual parallel layout. > The local sizes of the MatIS matrix (i.e. the local size of the left and right vectors used in MatMult) are not the sizes of the local subdomain matrices in MatIS. > > > On May 21, 2017, at 6:47 PM, Matthew Knepley wrote: > > On Sun, May 21, 2017 at 11:26 AM, Franck Houssen wrote: > Using PETSc MatIS, how to matmult a global IS matrix and a global vector ? Example is attached : I don't get what I expect that is a vector such that proc0 = [1, 2] and proc1 = [2, 1] > > 1) I think the global size of your matrix is wrong. You seem to want 3, not 4 > > 2) Global vectors have a non-overlapping row partition. You might be thinking of local vectors > > Thanks, > > Matt > > Franck > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ > From danyang.su at gmail.com Wed May 24 13:49:38 2017 From: danyang.su at gmail.com (Danyang Su) Date: Wed, 24 May 2017 11:49:38 -0700 Subject: [petsc-users] Question on incomplete factorization level and fill In-Reply-To: References: <93217794-9c63-fd52-ab36-4174de8cb9c8@gmail.com> Message-ID: Hi Matt, Yes. The matrix is 450000x450000 sparse. The hypre takes hundreds of iterates, not for all but in most of the timesteps. The matrix is not well conditioned, with nonzero entries range from 1.0e-29 to 1.0e2. I also made double check if there is anything wrong in the parallel version, however, the matrix is the same with sequential version except some round error which is relatively very small. Usually for those not well conditioned matrix, direct solver should be faster than iterative solver, right? But when I use the sequential iterative solver with ILU prec developed almost 20 years go by others, the solver converge fast with appropriate factorization level. In other words, when I use 24 processor using hypre, the speed is almost the same as as the old sequential iterative solver using 1 processor. I use most of the default configuration for the general case with pretty good speedup. And I am not sure if I miss something for this problem. Thanks, Danyang On 17-05-24 11:12 AM, Matthew Knepley wrote: > On Wed, May 24, 2017 at 12:50 PM, Danyang Su > wrote: > > Hi Matthew and Barry, > > Thanks for the quick response. > > I also tried superlu and mumps, both work but it is about four > times slower than ILU(dt) prec through hypre, with 24 processors I > have tested. > > You mean the total time is 4x? And you are taking hundreds of > iterates? That seems hard to believe, unless you are dropping > a huge number of elements. 
> > When I look into the convergence information, the method using > ILU(dt) still takes 200 to 3000 linear iterations for each newton > iteration. One reason is this equation is hard to solve. As for > the general cases, the same method works awesome and get very good > speedup. > > I do not understand what you mean here. > > I also doubt if I use hypre correctly for this case. Is there > anyway to check this problem, or is it possible to increase the > factorization level through hypre? > > I don't know. > > Matt > > Thanks, > > Danyang > > > On 17-05-24 04:59 AM, Matthew Knepley wrote: >> On Wed, May 24, 2017 at 2:21 AM, Danyang Su > > wrote: >> >> Dear All, >> >> I use PCFactorSetLevels for ILU and PCFactorSetFill for other >> preconditioning in my code to help solve the problems that >> the default option is hard to solve. However, I found the >> latter one, PCFactorSetFill does not take effect for my >> problem. The matrices and rhs as well as the solutions are >> attached from the link below. I obtain the solution using >> hypre preconditioner and it takes 7 and 38 iterations for >> matrix 1 and matrix 2. However, if I use other >> preconditioner, the solver just failed at the first matrix. I >> have tested this matrix using the native sequential solver >> (not PETSc) with ILU preconditioning. If I set the incomplete >> factorization level to 0, this sequential solver will take >> more than 100 iterations. If I increase the factorization >> level to 1 or more, it just takes several iterations. This >> remind me that the PC factor for this matrices should be >> increased. However, when I tried it in PETSc, it just does >> not work. >> >> Matrix and rhs can be obtained from the link below. >> >> https://eilinator.eos.ubc.ca:8443/index.php/s/CalUcq9CMeblk4R >> >> >> Would anyone help to check if you can make this work by >> increasing the PC factor level or fill? >> >> >> We have ILU(k) supported in serial. However ILU(dt) which takes a >> tolerance only works through Hypre >> >> http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html >> >> >> I recommend you try SuperLU or MUMPS, which can both be >> downloaded automatically by configure, and >> do a full sparse LU. >> >> Thanks, >> >> Matt >> >> Thanks and regards, >> >> Danyang >> >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which their experiments lead. >> -- Norbert Wiener >> >> http://www.caam.rice.edu/~mk51/ > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From michal.derezinski at gmail.com Wed May 24 13:53:07 2017 From: michal.derezinski at gmail.com (=?utf-8?Q?Micha=C5=82_Derezi=C5=84ski?=) Date: Wed, 24 May 2017 11:53:07 -0700 Subject: [petsc-users] Accessing submatrices without additional memory usage In-Reply-To: <87shju2jil.fsf@jedbrown.org> References: <0B69CBD4-9524-429A-8478-0BBF0C236F94@gmail.com> <87shju2jil.fsf@jedbrown.org> Message-ID: <5BF044B5-49B3-49CB-A58D-A06B11DD6000@gmail.com> It is an optimization problem minimizing a convex objective for a binary classification task, which I?m solving using a Tao solver. The multiplication operations are performing gradient computation for each step of the optimization. 
So I?m performing both a MatMult and a MatMultTranspose, in both cases the vector may be a dense vector. The crucial part of the implementation is that at the beginning I am not running on the entire dataset (rows of the full matrix). As a consequence I don?t need to have the entire matrix loaded right away. In fact, in some cases I may choose to stop the optimization before the entire matrix has been loaded (I already verified that this scenario may come up as a use case). That is why it is important that I don?t load it at the beginning. Parallel loading is not a necessary part of the implementation. Initially, I intend to alternate between loading a portion of the matrix, then doing computations, then loading more of the matrix, etc. But, given that I observed large loading times for some datasets, parallel loading may make sense, if done efficiently. Thanks, Michal. > Wiadomo?? napisana przez Jed Brown w dniu 24.05.2017, o godz. 11:32: > > Micha? Derezi?ski writes: > >> Great! Then I have a follow-up question: >> >> My goal is to be able to load the full matrix X from disk, while at >> the same time in parallel, performing computations on the submatrices >> that have already been loaded. Essentially, I want to think of X as a >> block matrix (where the blocks are horizontal, spanning the full width >> of the matrix), > > What would be the distribution of the vector that this non-square > submatrix (probably with many empty columns) is applied to? > > Could you back up and explain what problem you're trying to solve? It > sounds like you're about to code yourself into a dungeon. From jed at jedbrown.org Wed May 24 14:06:11 2017 From: jed at jedbrown.org (Jed Brown) Date: Wed, 24 May 2017 13:06:11 -0600 Subject: [petsc-users] Accessing submatrices without additional memory usage In-Reply-To: <5BF044B5-49B3-49CB-A58D-A06B11DD6000@gmail.com> References: <0B69CBD4-9524-429A-8478-0BBF0C236F94@gmail.com> <87shju2jil.fsf@jedbrown.org> <5BF044B5-49B3-49CB-A58D-A06B11DD6000@gmail.com> Message-ID: <87poey2hy4.fsf@jedbrown.org> Okay, do you have more parameters than observations? And each segment of the matrix will be fully distributed? Do you have a parallel file system? Is your matrix sparse or dense? Micha? Derezi?ski writes: > It is an optimization problem minimizing a convex objective for a binary classification task, which I?m solving using a Tao solver. > The multiplication operations are performing gradient computation for each step of the optimization. > So I?m performing both a MatMult and a MatMultTranspose, in both cases the vector may be a dense vector. > > The crucial part of the implementation is that at the beginning I am not running on the entire dataset (rows of the full matrix). > As a consequence I don?t need to have the entire matrix loaded right away. In fact, in some cases I may choose to stop the optimization before the entire matrix has been loaded (I already verified that this scenario may come up as a use case). That is why it is important that I don?t load it at the beginning. > > Parallel loading is not a necessary part of the implementation. Initially, I intend to alternate between loading a portion of the matrix, then doing computations, then loading more of the matrix, etc. But, given that I observed large loading times for some datasets, parallel loading may make sense, if done efficiently. > > Thanks, > Michal. > >> Wiadomo?? napisana przez Jed Brown w dniu 24.05.2017, o godz. 11:32: >> >> Micha? Derezi?ski writes: >> >>> Great! 
Then I have a follow-up question: >>> >>> My goal is to be able to load the full matrix X from disk, while at >>> the same time in parallel, performing computations on the submatrices >>> that have already been loaded. Essentially, I want to think of X as a >>> block matrix (where the blocks are horizontal, spanning the full width >>> of the matrix), >> >> What would be the distribution of the vector that this non-square >> submatrix (probably with many empty columns) is applied to? >> >> Could you back up and explain what problem you're trying to solve? It >> sounds like you're about to code yourself into a dungeon. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 832 bytes Desc: not available URL: From michal.derezinski at gmail.com Wed May 24 14:15:21 2017 From: michal.derezinski at gmail.com (=?utf-8?Q?Micha=C5=82_Derezi=C5=84ski?=) Date: Wed, 24 May 2017 12:15:21 -0700 Subject: [petsc-users] Accessing submatrices without additional memory usage In-Reply-To: <87poey2hy4.fsf@jedbrown.org> References: <0B69CBD4-9524-429A-8478-0BBF0C236F94@gmail.com> <87shju2jil.fsf@jedbrown.org> <5BF044B5-49B3-49CB-A58D-A06B11DD6000@gmail.com> <87poey2hy4.fsf@jedbrown.org> Message-ID: <0D256C07-0556-4D0E-A9D3-D7D3F5D8B2C6@gmail.com> > Wiadomo?? napisana przez Jed Brown w dniu 24.05.2017, o godz. 12:06: > > Okay, do you have more parameters than observations? No (not necessarily). The biggest matrix is 50M observations and 12M parameters. > And each segment > of the matrix will be fully distributed? Yes. > Do you have a parallel file > system? Yes. > Is your matrix sparse or dense? Yes. > > Micha? Derezi?ski writes: > >> It is an optimization problem minimizing a convex objective for a binary classification task, which I?m solving using a Tao solver. >> The multiplication operations are performing gradient computation for each step of the optimization. >> So I?m performing both a MatMult and a MatMultTranspose, in both cases the vector may be a dense vector. >> >> The crucial part of the implementation is that at the beginning I am not running on the entire dataset (rows of the full matrix). >> As a consequence I don?t need to have the entire matrix loaded right away. In fact, in some cases I may choose to stop the optimization before the entire matrix has been loaded (I already verified that this scenario may come up as a use case). That is why it is important that I don?t load it at the beginning. >> >> Parallel loading is not a necessary part of the implementation. Initially, I intend to alternate between loading a portion of the matrix, then doing computations, then loading more of the matrix, etc. But, given that I observed large loading times for some datasets, parallel loading may make sense, if done efficiently. >> >> Thanks, >> Michal. >> >>> Wiadomo?? napisana przez Jed Brown w dniu 24.05.2017, o godz. 11:32: >>> >>> Micha? Derezi?ski writes: >>> >>>> Great! Then I have a follow-up question: >>>> >>>> My goal is to be able to load the full matrix X from disk, while at >>>> the same time in parallel, performing computations on the submatrices >>>> that have already been loaded. Essentially, I want to think of X as a >>>> block matrix (where the blocks are horizontal, spanning the full width >>>> of the matrix), >>> >>> What would be the distribution of the vector that this non-square >>> submatrix (probably with many empty columns) is applied to? 
>>> >>> Could you back up and explain what problem you're trying to solve? It >>> sounds like you're about to code yourself into a dungeon. From hzhang at mcs.anl.gov Wed May 24 14:21:05 2017 From: hzhang at mcs.anl.gov (Hong) Date: Wed, 24 May 2017 14:21:05 -0500 Subject: [petsc-users] Question on incomplete factorization level and fill In-Reply-To: References: <93217794-9c63-fd52-ab36-4174de8cb9c8@gmail.com> Message-ID: Danyang : I tested your data. Your matrices encountered zero pivots, e.g. petsc/src/ksp/ksp/examples/tutorials (master) $ mpiexec -n 24 ./ex10 -f0 a_react_in_2.bin -rhs b_react_in_2.bin -ksp_monitor -ksp_error_if_not_converged [15]PETSC ERROR: Zero pivot in LU factorization: http://www.mcs.anl.gov/petsc/documentation/faq.html#zeropivot [15]PETSC ERROR: Zero pivot row 1249 value 2.05808e-14 tolerance 2.22045e-14 ... Adding option '-sub_pc_factor_shift_type nonzero', I got mpiexec -n 24 ./ex10 -f0 a_react_in_2.bin -rhs b_react_in_2.bin -ksp_monitor -ksp_error_if_not_converged -sub_pc_factor_shift_type nonzero -mat_view ascii::ascii_info Mat Object: 24 MPI processes type: mpiaij rows=450000, cols=450000 total: nonzeros=6991400, allocated nonzeros=6991400 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines 0 KSP Residual norm 5.849777711755e+01 1 KSP Residual norm 6.824179430230e-01 2 KSP Residual norm 3.994483555787e-02 3 KSP Residual norm 6.085841461433e-03 4 KSP Residual norm 8.876162583511e-04 5 KSP Residual norm 9.407780665278e-05 Number of iterations = 5 Residual norm 0.00542891 Hong > Hi Matt, > > Yes. The matrix is 450000x450000 sparse. The hypre takes hundreds of > iterates, not for all but in most of the timesteps. The matrix is not well > conditioned, with nonzero entries range from 1.0e-29 to 1.0e2. I also made > double check if there is anything wrong in the parallel version, however, > the matrix is the same with sequential version except some round error > which is relatively very small. Usually for those not well conditioned > matrix, direct solver should be faster than iterative solver, right? But > when I use the sequential iterative solver with ILU prec developed almost > 20 years go by others, the solver converge fast with appropriate > factorization level. In other words, when I use 24 processor using hypre, > the speed is almost the same as as the old sequential iterative solver > using 1 processor. > > I use most of the default configuration for the general case with pretty > good speedup. And I am not sure if I miss something for this problem. > > Thanks, > > Danyang > > On 17-05-24 11:12 AM, Matthew Knepley wrote: > > On Wed, May 24, 2017 at 12:50 PM, Danyang Su wrote: > >> Hi Matthew and Barry, >> >> Thanks for the quick response. >> >> I also tried superlu and mumps, both work but it is about four times >> slower than ILU(dt) prec through hypre, with 24 processors I have tested. >> > You mean the total time is 4x? And you are taking hundreds of iterates? > That seems hard to believe, unless you are dropping > a huge number of elements. > >> When I look into the convergence information, the method using ILU(dt) >> still takes 200 to 3000 linear iterations for each newton iteration. One >> reason is this equation is hard to solve. As for the general cases, the >> same method works awesome and get very good speedup. >> > I do not understand what you mean here. > >> I also doubt if I use hypre correctly for this case. 
Is there anyway to >> check this problem, or is it possible to increase the factorization level >> through hypre? >> > I don't know. > > Matt > >> Thanks, >> >> Danyang >> >> On 17-05-24 04:59 AM, Matthew Knepley wrote: >> >> On Wed, May 24, 2017 at 2:21 AM, Danyang Su wrote: >> >>> Dear All, >>> >>> I use PCFactorSetLevels for ILU and PCFactorSetFill for other >>> preconditioning in my code to help solve the problems that the default >>> option is hard to solve. However, I found the latter one, PCFactorSetFill >>> does not take effect for my problem. The matrices and rhs as well as the >>> solutions are attached from the link below. I obtain the solution using >>> hypre preconditioner and it takes 7 and 38 iterations for matrix 1 and >>> matrix 2. However, if I use other preconditioner, the solver just failed at >>> the first matrix. I have tested this matrix using the native sequential >>> solver (not PETSc) with ILU preconditioning. If I set the incomplete >>> factorization level to 0, this sequential solver will take more than 100 >>> iterations. If I increase the factorization level to 1 or more, it just >>> takes several iterations. This remind me that the PC factor for this >>> matrices should be increased. However, when I tried it in PETSc, it just >>> does not work. >>> >>> Matrix and rhs can be obtained from the link below. >>> >>> https://eilinator.eos.ubc.ca:8443/index.php/s/CalUcq9CMeblk4R >>> >>> Would anyone help to check if you can make this work by increasing the >>> PC factor level or fill? >>> >> >> We have ILU(k) supported in serial. However ILU(dt) which takes a >> tolerance only works through Hypre >> >> http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html >> >> I recommend you try SuperLU or MUMPS, which can both be downloaded >> automatically by configure, and >> do a full sparse LU. >> >> Thanks, >> >> Matt >> >> >>> Thanks and regards, >>> >>> Danyang >>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> http://www.caam.rice.edu/~mk51/ >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From danyang.su at gmail.com Wed May 24 14:28:51 2017 From: danyang.su at gmail.com (Danyang Su) Date: Wed, 24 May 2017 12:28:51 -0700 Subject: [petsc-users] Question on incomplete factorization level and fill In-Reply-To: References: <93217794-9c63-fd52-ab36-4174de8cb9c8@gmail.com> Message-ID: Hi Hong, Awesome. Thanks for testing the case. I will try your options for the code and get back to you later. Regards, Danyang On 17-05-24 12:21 PM, Hong wrote: > Danyang : > I tested your data. > Your matrices encountered zero pivots, e.g. > petsc/src/ksp/ksp/examples/tutorials (master) > $ mpiexec -n 24 ./ex10 -f0 a_react_in_2.bin -rhs b_react_in_2.bin > -ksp_monitor -ksp_error_if_not_converged > > [15]PETSC ERROR: Zero pivot in LU factorization: > http://www.mcs.anl.gov/petsc/documentation/faq.html#zeropivot > [15]PETSC ERROR: Zero pivot row 1249 value 2.05808e-14 tolerance > 2.22045e-14 > ... 
> > Adding option '-sub_pc_factor_shift_type nonzero', I got > mpiexec -n 24 ./ex10 -f0 a_react_in_2.bin -rhs b_react_in_2.bin > -ksp_monitor -ksp_error_if_not_converged -sub_pc_factor_shift_type > nonzero -mat_view ascii::ascii_info > > Mat Object: 24 MPI processes > type: mpiaij > rows=450000, cols=450000 > total: nonzeros=6991400, allocated nonzeros=6991400 > total number of mallocs used during MatSetValues calls =0 > not using I-node (on process 0) routines > 0 KSP Residual norm 5.849777711755e+01 > 1 KSP Residual norm 6.824179430230e-01 > 2 KSP Residual norm 3.994483555787e-02 > 3 KSP Residual norm 6.085841461433e-03 > 4 KSP Residual norm 8.876162583511e-04 > 5 KSP Residual norm 9.407780665278e-05 > Number of iterations = 5 > Residual norm 0.00542891 > > Hong > > Hi Matt, > > Yes. The matrix is 450000x450000 sparse. The hypre takes hundreds > of iterates, not for all but in most of the timesteps. The matrix > is not well conditioned, with nonzero entries range from 1.0e-29 > to 1.0e2. I also made double check if there is anything wrong in > the parallel version, however, the matrix is the same with > sequential version except some round error which is relatively > very small. Usually for those not well conditioned matrix, direct > solver should be faster than iterative solver, right? But when I > use the sequential iterative solver with ILU prec developed almost > 20 years go by others, the solver converge fast with appropriate > factorization level. In other words, when I use 24 processor using > hypre, the speed is almost the same as as the old sequential > iterative solver using 1 processor. > > I use most of the default configuration for the general case with > pretty good speedup. And I am not sure if I miss something for > this problem. > > Thanks, > > Danyang > > > On 17-05-24 11:12 AM, Matthew Knepley wrote: >> On Wed, May 24, 2017 at 12:50 PM, Danyang Su >> > wrote: >> >> Hi Matthew and Barry, >> >> Thanks for the quick response. >> >> I also tried superlu and mumps, both work but it is about >> four times slower than ILU(dt) prec through hypre, with 24 >> processors I have tested. >> >> You mean the total time is 4x? And you are taking hundreds of >> iterates? That seems hard to believe, unless you are dropping >> a huge number of elements. >> >> When I look into the convergence information, the method >> using ILU(dt) still takes 200 to 3000 linear iterations for >> each newton iteration. One reason is this equation is hard to >> solve. As for the general cases, the same method works >> awesome and get very good speedup. >> >> I do not understand what you mean here. >> >> I also doubt if I use hypre correctly for this case. Is there >> anyway to check this problem, or is it possible to increase >> the factorization level through hypre? >> >> I don't know. >> >> Matt >> >> Thanks, >> >> Danyang >> >> >> On 17-05-24 04:59 AM, Matthew Knepley wrote: >>> On Wed, May 24, 2017 at 2:21 AM, Danyang Su >>> > wrote: >>> >>> Dear All, >>> >>> I use PCFactorSetLevels for ILU and PCFactorSetFill for >>> other preconditioning in my code to help solve the >>> problems that the default option is hard to solve. >>> However, I found the latter one, PCFactorSetFill does >>> not take effect for my problem. The matrices and rhs as >>> well as the solutions are attached from the link below. >>> I obtain the solution using hypre preconditioner and it >>> takes 7 and 38 iterations for matrix 1 and matrix 2. >>> However, if I use other preconditioner, the solver just >>> failed at the first matrix. 
I have tested this matrix >>> using the native sequential solver (not PETSc) with ILU >>> preconditioning. If I set the incomplete factorization >>> level to 0, this sequential solver will take more than >>> 100 iterations. If I increase the factorization level to >>> 1 or more, it just takes several iterations. This remind >>> me that the PC factor for this matrices should be >>> increased. However, when I tried it in PETSc, it just >>> does not work. >>> >>> Matrix and rhs can be obtained from the link below. >>> >>> https://eilinator.eos.ubc.ca:8443/index.php/s/CalUcq9CMeblk4R >>> >>> >>> Would anyone help to check if you can make this work by >>> increasing the PC factor level or fill? >>> >>> >>> We have ILU(k) supported in serial. However ILU(dt) which >>> takes a tolerance only works through Hypre >>> >>> http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html >>> >>> >>> I recommend you try SuperLU or MUMPS, which can both be >>> downloaded automatically by configure, and >>> do a full sparse LU. >>> >>> Thanks, >>> >>> Matt >>> >>> Thanks and regards, >>> >>> Danyang >>> >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin >>> their experiments is infinitely more interesting than any >>> results to which their experiments lead. >>> -- Norbert Wiener >>> >>> http://www.caam.rice.edu/~mk51/ >>> >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which their experiments lead. >> -- Norbert Wiener >> >> http://www.caam.rice.edu/~mk51/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed May 24 14:28:54 2017 From: jed at jedbrown.org (Jed Brown) Date: Wed, 24 May 2017 13:28:54 -0600 Subject: [petsc-users] Accessing submatrices without additional memory usage In-Reply-To: <0D256C07-0556-4D0E-A9D3-D7D3F5D8B2C6@gmail.com> References: <0B69CBD4-9524-429A-8478-0BBF0C236F94@gmail.com> <87shju2jil.fsf@jedbrown.org> <5BF044B5-49B3-49CB-A58D-A06B11DD6000@gmail.com> <87poey2hy4.fsf@jedbrown.org> <0D256C07-0556-4D0E-A9D3-D7D3F5D8B2C6@gmail.com> Message-ID: <87mva22gw9.fsf@jedbrown.org> Micha? Derezi?ski writes: >> Wiadomo?? napisana przez Jed Brown w dniu 24.05.2017, o godz. 12:06: >> >> Okay, do you have more parameters than observations? > > No (not necessarily). The biggest matrix is 50M observations and 12M parameters. > >> And each segment >> of the matrix will be fully distributed? > > Yes. > >> Do you have a parallel file >> system? > > Yes. > >> Is your matrix sparse or dense? > > Yes. By that you mean sparse? You'll need some sort of segmented storage (could be separate files or a file format that allows seeking). (If the matrix is generated by some other process, you'd benefit from skipping the file system entirely, but I understand that may not be possible.) I would use MatNest, creating a new one after each segment is loaded. There isn't currently a MatLoadBegin/End interface, but that could be created if it would be useful. -------------- next part -------------- A non-text attachment was scrubbed... 
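A minimal sketch of the MatNest approach described above, assuming each row segment is stored in its own PETSc binary file; the helper names, the file layout, and the rebuild-per-segment flow are illustrative assumptions, not an existing PETSc interface:

#include <petscmat.h>

/* Load one horizontal segment (a parallel AIJ matrix) from its own binary file. */
static PetscErrorCode LoadSegment(MPI_Comm comm, const char *fname, Mat *seg)
{
  PetscViewer    viewer;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PetscViewerBinaryOpen(comm, fname, FILE_MODE_READ, &viewer);CHKERRQ(ierr);
  ierr = MatCreate(comm, seg);CHKERRQ(ierr);
  ierr = MatSetType(*seg, MATAIJ);CHKERRQ(ierr);
  ierr = MatLoad(*seg, viewer);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

/* Stack the nseg segments loaded so far into one logical matrix without copying them.
   The resulting X supports MatMult/MatMultTranspose for the gradient evaluations. */
static PetscErrorCode StackSegments(MPI_Comm comm, PetscInt nseg, Mat segs[], Mat *X)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  /* NULL index sets: let MatNest take the row and column layouts from the blocks. */
  ierr = MatCreateNest(comm, nseg, NULL, 1, NULL, segs, X);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

After each LoadSegment() call the previous nest would be destroyed and rebuilt with StackSegments(), and the Tao objective/gradient routine would pick up the new Mat through its user context; MatCreateNest() only keeps references to the blocks, so rebuilding the nest does not copy the already loaded data.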
Name: signature.asc Type: application/pgp-signature Size: 832 bytes Desc: not available URL: From danyang.su at gmail.com Wed May 24 15:06:47 2017 From: danyang.su at gmail.com (Danyang Su) Date: Wed, 24 May 2017 13:06:47 -0700 Subject: [petsc-users] Question on incomplete factorization level and fill In-Reply-To: References: <93217794-9c63-fd52-ab36-4174de8cb9c8@gmail.com> Message-ID: Dear Hong, I just tested with different number of processors for the same matrix. It sometimes got "ERROR: Arguments are incompatible" for different number of processors. It works fine using 4, 8, or 24 processors, but failed with "ERROR: Arguments are incompatible" using 16 or 48 processors. The error information is attached. I tested this on my local computer with 6 cores 12 threads. Any suggestion on this? Thanks, Danyang On 17-05-24 12:28 PM, Danyang Su wrote: > > Hi Hong, > > Awesome. Thanks for testing the case. I will try your options for the > code and get back to you later. > > Regards, > > Danyang > > > On 17-05-24 12:21 PM, Hong wrote: >> Danyang : >> I tested your data. >> Your matrices encountered zero pivots, e.g. >> petsc/src/ksp/ksp/examples/tutorials (master) >> $ mpiexec -n 24 ./ex10 -f0 a_react_in_2.bin -rhs b_react_in_2.bin >> -ksp_monitor -ksp_error_if_not_converged >> >> [15]PETSC ERROR: Zero pivot in LU factorization: >> http://www.mcs.anl.gov/petsc/documentation/faq.html#zeropivot >> [15]PETSC ERROR: Zero pivot row 1249 value 2.05808e-14 tolerance >> 2.22045e-14 >> ... >> >> Adding option '-sub_pc_factor_shift_type nonzero', I got >> mpiexec -n 24 ./ex10 -f0 a_react_in_2.bin -rhs b_react_in_2.bin >> -ksp_monitor -ksp_error_if_not_converged -sub_pc_factor_shift_type >> nonzero -mat_view ascii::ascii_info >> >> Mat Object: 24 MPI processes >> type: mpiaij >> rows=450000, cols=450000 >> total: nonzeros=6991400, allocated nonzeros=6991400 >> total number of mallocs used during MatSetValues calls =0 >> not using I-node (on process 0) routines >> 0 KSP Residual norm 5.849777711755e+01 >> 1 KSP Residual norm 6.824179430230e-01 >> 2 KSP Residual norm 3.994483555787e-02 >> 3 KSP Residual norm 6.085841461433e-03 >> 4 KSP Residual norm 8.876162583511e-04 >> 5 KSP Residual norm 9.407780665278e-05 >> Number of iterations = 5 >> Residual norm 0.00542891 >> >> Hong >> >> Hi Matt, >> >> Yes. The matrix is 450000x450000 sparse. The hypre takes hundreds >> of iterates, not for all but in most of the timesteps. The matrix >> is not well conditioned, with nonzero entries range from 1.0e-29 >> to 1.0e2. I also made double check if there is anything wrong in >> the parallel version, however, the matrix is the same with >> sequential version except some round error which is relatively >> very small. Usually for those not well conditioned matrix, direct >> solver should be faster than iterative solver, right? But when I >> use the sequential iterative solver with ILU prec developed >> almost 20 years go by others, the solver converge fast with >> appropriate factorization level. In other words, when I use 24 >> processor using hypre, the speed is almost the same as as the old >> sequential iterative solver using 1 processor. >> >> I use most of the default configuration for the general case with >> pretty good speedup. And I am not sure if I miss something for >> this problem. >> >> Thanks, >> >> Danyang >> >> >> On 17-05-24 11:12 AM, Matthew Knepley wrote: >>> On Wed, May 24, 2017 at 12:50 PM, Danyang Su >>> > wrote: >>> >>> Hi Matthew and Barry, >>> >>> Thanks for the quick response. 
>>> >>> I also tried superlu and mumps, both work but it is about >>> four times slower than ILU(dt) prec through hypre, with 24 >>> processors I have tested. >>> >>> You mean the total time is 4x? And you are taking hundreds of >>> iterates? That seems hard to believe, unless you are dropping >>> a huge number of elements. >>> >>> When I look into the convergence information, the method >>> using ILU(dt) still takes 200 to 3000 linear iterations for >>> each newton iteration. One reason is this equation is hard >>> to solve. As for the general cases, the same method works >>> awesome and get very good speedup. >>> >>> I do not understand what you mean here. >>> >>> I also doubt if I use hypre correctly for this case. Is >>> there anyway to check this problem, or is it possible to >>> increase the factorization level through hypre? >>> >>> I don't know. >>> >>> Matt >>> >>> Thanks, >>> >>> Danyang >>> >>> >>> On 17-05-24 04:59 AM, Matthew Knepley wrote: >>>> On Wed, May 24, 2017 at 2:21 AM, Danyang Su >>>> > wrote: >>>> >>>> Dear All, >>>> >>>> I use PCFactorSetLevels for ILU and PCFactorSetFill for >>>> other preconditioning in my code to help solve the >>>> problems that the default option is hard to solve. >>>> However, I found the latter one, PCFactorSetFill does >>>> not take effect for my problem. The matrices and rhs as >>>> well as the solutions are attached from the link below. >>>> I obtain the solution using hypre preconditioner and it >>>> takes 7 and 38 iterations for matrix 1 and matrix 2. >>>> However, if I use other preconditioner, the solver just >>>> failed at the first matrix. I have tested this matrix >>>> using the native sequential solver (not PETSc) with ILU >>>> preconditioning. If I set the incomplete factorization >>>> level to 0, this sequential solver will take more than >>>> 100 iterations. If I increase the factorization level >>>> to 1 or more, it just takes several iterations. This >>>> remind me that the PC factor for this matrices should >>>> be increased. However, when I tried it in PETSc, it >>>> just does not work. >>>> >>>> Matrix and rhs can be obtained from the link below. >>>> >>>> https://eilinator.eos.ubc.ca:8443/index.php/s/CalUcq9CMeblk4R >>>> >>>> >>>> Would anyone help to check if you can make this work by >>>> increasing the PC factor level or fill? >>>> >>>> >>>> We have ILU(k) supported in serial. However ILU(dt) which >>>> takes a tolerance only works through Hypre >>>> >>>> http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html >>>> >>>> >>>> I recommend you try SuperLU or MUMPS, which can both be >>>> downloaded automatically by configure, and >>>> do a full sparse LU. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> Thanks and regards, >>>> >>>> Danyang >>>> >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin >>>> their experiments is infinitely more interesting than any >>>> results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> http://www.caam.rice.edu/~mk51/ >>>> >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to >>> which their experiments lead. >>> -- Norbert Wiener >>> >>> http://www.caam.rice.edu/~mk51/ >> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- Mat Object: 16 MPI processes type: mpiaij rows=450000, cols=450000 total: nonzeros=6991400, allocated nonzeros=6991400 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Arguments are incompatible [0]PETSC ERROR: Incompatible vector local lengths 28130 != 28125 [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.5, Jan, 01, 2017 [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Arguments are incompatible [1]PETSC ERROR: Incompatible vector local lengths 28130 != 28125 [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [1]PETSC ERROR: Petsc Release Version 3.7.5, Jan, 01, 2017 [1]PETSC ERROR: ./ex10 on a linux-gnu-dbg named nwmop by dsu Wed May 24 12:54:10 2017 [1]PETSC ERROR: [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [2]PETSC ERROR: Arguments are incompatible [2]PETSC ERROR: Incompatible vector local lengths 28130 != 28125 [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [2]PETSC ERROR: Petsc Release Version 3.7.5, Jan, 01, 2017 [2]PETSC ERROR: ./ex10 on a linux-gnu-dbg named nwmop by dsu Wed May 24 12:54:10 2017 [2]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mumps --download-scalapack --download-parmetis --download-metis --download-ptscotch --download-fblaslapack --download-mpich --download-hypre --download-superlu_dist --download-hdf5=yes [2]PETSC ERROR: #1 VecCopy() line 1639 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/vec/vec/interface/vector.c [4]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [4]PETSC ERROR: Arguments are incompatible [4]PETSC ERROR: Incompatible vector local lengths 28130 != 28125 [4]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [4]PETSC ERROR: Petsc Release Version 3.7.5, Jan, 01, 2017 [4]PETSC ERROR: ./ex10 on a linux-gnu-dbg named nwmop by dsu Wed May 24 12:54:10 2017 [4]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mumps --download-scalapack --download-parmetis --download-metis --download-ptscotch --download-fblaslapack --download-mpich --download-hypre --download-superlu_dist --download-hdf5=yes [4]PETSC ERROR: #1 VecCopy() line 1639 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/vec/vec/interface/vector.c [6]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [6]PETSC ERROR: Arguments are incompatible [6]PETSC ERROR: Incompatible vector local lengths 28130 != 28125 [6]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[6]PETSC ERROR: Petsc Release Version 3.7.5, Jan, 01, 2017 [6]PETSC ERROR: ./ex10 on a linux-gnu-dbg named nwmop by dsu Wed May 24 12:54:10 2017 [6]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mumps --download-scalapack --download-parmetis --download-metis --download-ptscotch --download-fblaslapack --download-mpich --download-hypre --download-superlu_dist --download-hdf5=yes [6]PETSC ERROR: #1 VecCopy() line 1639 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/vec/vec/interface/vector.c [8]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [8]PETSC ERROR: Arguments are incompatible [8]PETSC ERROR: Incompatible vector local lengths 28120 != 28125 [8]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [8]PETSC ERROR: Petsc Release Version 3.7.5, Jan, 01, 2017 [8]PETSC ERROR: ./ex10 on a linux-gnu-dbg named nwmop by dsu Wed May 24 12:54:10 2017 [8]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mumps --download-scalapack --download-parmetis --download-metis --download-ptscotch --download-fblaslapack --download-mpich --download-hypre --download-superlu_dist --download-hdf5=yes [8]PETSC ERROR: #1 VecCopy() line 1639 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/vec/vec/interface/vector.c [8]PETSC ERROR: ./ex10 on a linux-gnu-dbg named nwmop by dsu Wed May 24 12:54:10 2017 [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mumps --download-scalapack --download-parmetis --download-metis --download-ptscotch --download-fblaslapack --download-mpich --download-hypre --download-superlu_dist --download-hdf5=yes [0]PETSC ERROR: #1 VecCopy() line 1639 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/vec/vec/interface/vector.c [0]PETSC ERROR: #2 KSPInitialResidual() line 65 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itres.c [0]PETSC ERROR: #3 KSPSolve_GMRES() line 239 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/impls/gmres/gmres.c [0]PETSC ERROR: #4 KSPSolve() line 656 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itfunc.c Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mumps --download-scalapack --download-parmetis --download-metis --download-ptscotch --download-fblaslapack --download-mpich --download-hypre --download-superlu_dist --download-hdf5=yes [1]PETSC ERROR: #1 VecCopy() line 1639 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/vec/vec/interface/vector.c [1]PETSC ERROR: #2 KSPInitialResidual() line 65 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itres.c [1]PETSC ERROR: #3 KSPSolve_GMRES() line 239 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/impls/gmres/gmres.c [1]PETSC ERROR: #4 KSPSolve() line 656 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itfunc.c [2]PETSC ERROR: #2 KSPInitialResidual() line 65 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itres.c [2]PETSC ERROR: #3 KSPSolve_GMRES() line 239 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/impls/gmres/gmres.c [2]PETSC ERROR: #4 KSPSolve() line 656 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itfunc.c [2]PETSC ERROR: #5 main() line 330 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/examples/tutorials/ex10.c [3]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [3]PETSC ERROR: Arguments are incompatible [3]PETSC ERROR: Incompatible vector local lengths 28130 != 28125 [3]PETSC 
ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [3]PETSC ERROR: Petsc Release Version 3.7.5, Jan, 01, 2017 [3]PETSC ERROR: ./ex10 on a linux-gnu-dbg named nwmop by dsu Wed May 24 12:54:10 2017 [3]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mumps --download-scalapack --download-parmetis --download-metis --download-ptscotch --download-fblaslapack --download-mpich --download-hypre --download-superlu_dist --download-hdf5=yes [3]PETSC ERROR: #1 VecCopy() line 1639 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/vec/vec/interface/vector.c [3]PETSC ERROR: #2 KSPInitialResidual() line 65 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itres.c [3]PETSC ERROR: [4]PETSC ERROR: #2 KSPInitialResidual() line 65 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itres.c [4]PETSC ERROR: #3 KSPSolve_GMRES() line 239 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/impls/gmres/gmres.c [4]PETSC ERROR: #4 KSPSolve() line 656 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itfunc.c [4]PETSC ERROR: #5 main() line 330 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/examples/tutorials/ex10.c [4]PETSC ERROR: PETSc Option Table entries: [4]PETSC ERROR: [5]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [5]PETSC ERROR: Arguments are incompatible [5]PETSC ERROR: Incompatible vector local lengths 28130 != 28125 [5]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [5]PETSC ERROR: Petsc Release Version 3.7.5, Jan, 01, 2017 [5]PETSC ERROR: ./ex10 on a linux-gnu-dbg named nwmop by dsu Wed May 24 12:54:10 2017 [5]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mumps --download-scalapack --download-parmetis --download-metis --download-ptscotch --download-fblaslapack --download-mpich --download-hypre --download-superlu_dist --download-hdf5=yes [5]PETSC ERROR: #1 VecCopy() line 1639 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/vec/vec/interface/vector.c [5]PETSC ERROR: #2 KSPInitialResidual() line 65 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itres.c [5]PETSC ERROR: #3 KSPSolve_GMRES() line 239 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/impls/gmres/gmres.c [5]PETSC ERROR: [6]PETSC ERROR: #2 KSPInitialResidual() line 65 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itres.c [6]PETSC ERROR: #3 KSPSolve_GMRES() line 239 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/impls/gmres/gmres.c [6]PETSC ERROR: #4 KSPSolve() line 656 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itfunc.c [6]PETSC ERROR: #5 main() line 330 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/examples/tutorials/ex10.c [6]PETSC ERROR: PETSc Option Table entries: [7]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [7]PETSC ERROR: Arguments are incompatible [7]PETSC ERROR: Incompatible vector local lengths 28130 != 28125 [7]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[7]PETSC ERROR: Petsc Release Version 3.7.5, Jan, 01, 2017 [7]PETSC ERROR: ./ex10 on a linux-gnu-dbg named nwmop by dsu Wed May 24 12:54:10 2017 [7]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mumps --download-scalapack --download-parmetis --download-metis --download-ptscotch --download-fblaslapack --download-mpich --download-hypre --download-superlu_dist --download-hdf5=yes [7]PETSC ERROR: #1 VecCopy() line 1639 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/vec/vec/interface/vector.c [7]PETSC ERROR: #2 KSPInitialResidual() line 65 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itres.c [7]PETSC ERROR: #3 KSPSolve_GMRES() line 239 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/impls/gmres/gmres.c #2 KSPInitialResidual() line 65 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itres.c [8]PETSC ERROR: #3 KSPSolve_GMRES() line 239 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/impls/gmres/gmres.c [8]PETSC ERROR: #4 KSPSolve() line 656 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itfunc.c [8]PETSC ERROR: #5 main() line 330 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/examples/tutorials/ex10.c [8]PETSC ERROR: PETSc Option Table entries: [8]PETSC ERROR: -f0 ./mat_rhs/a_react_in_2.bin [8]PETSC ERROR: -ksp_error_if_not_converged [8]PETSC ERROR: [11]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [11]PETSC ERROR: Arguments are incompatible [11]PETSC ERROR: Incompatible vector local lengths 28120 != 28125 [11]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [11]PETSC ERROR: Petsc Release Version 3.7.5, Jan, 01, 2017 [11]PETSC ERROR: [12]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [12]PETSC ERROR: Arguments are incompatible [12]PETSC ERROR: Incompatible vector local lengths 28120 != 28125 [12]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[12]PETSC ERROR: Petsc Release Version 3.7.5, Jan, 01, 2017 [12]PETSC ERROR: ./ex10 on a linux-gnu-dbg named nwmop by dsu Wed May 24 12:54:10 2017 [12]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mumps --download-scalapack --download-parmetis --download-metis --download-ptscotch --download-fblaslapack --download-mpich --download-hypre --download-superlu_dist --download-hdf5=yes [12]PETSC ERROR: #1 VecCopy() line 1639 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/vec/vec/interface/vector.c [12]PETSC ERROR: #2 KSPInitialResidual() line 65 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itres.c [12]PETSC ERROR: #3 KSPSolve_GMRES() line 239 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/impls/gmres/gmres.c [12]PETSC ERROR: #4 KSPSolve() line 656 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: #5 main() line 330 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/examples/tutorials/ex10.c [0]PETSC ERROR: PETSc Option Table entries: [0]PETSC ERROR: -f0 ./mat_rhs/a_react_in_2.bin [0]PETSC ERROR: -ksp_error_if_not_converged [0]PETSC ERROR: -mat_view ascii::ascii_info [0]PETSC ERROR: -matload_block_size 1 [0]PETSC ERROR: -rhs ./mat_rhs/b_react_in_2.bin [0]PETSC ERROR: -skp_monitor [0]PETSC ERROR: -sub_pc_factor_shift_type nonzero [0]PETSC ERROR: -vecload_block_size 10 [1]PETSC ERROR: #5 main() line 330 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/examples/tutorials/ex10.c [1]PETSC ERROR: PETSc Option Table entries: [1]PETSC ERROR: -f0 ./mat_rhs/a_react_in_2.bin [1]PETSC ERROR: -ksp_error_if_not_converged [1]PETSC ERROR: -mat_view ascii::ascii_info [1]PETSC ERROR: -matload_block_size 1 [1]PETSC ERROR: -rhs ./mat_rhs/b_react_in_2.bin [1]PETSC ERROR: -skp_monitor [1]PETSC ERROR: -sub_pc_factor_shift_type nonzero [1]PETSC ERROR: -vecload_block_size 10 [2]PETSC ERROR: PETSc Option Table entries: [2]PETSC ERROR: -f0 ./mat_rhs/a_react_in_2.bin [2]PETSC ERROR: -ksp_error_if_not_converged [2]PETSC ERROR: -mat_view ascii::ascii_info [2]PETSC ERROR: -matload_block_size 1 [2]PETSC ERROR: -rhs ./mat_rhs/b_react_in_2.bin [2]PETSC ERROR: -skp_monitor [2]PETSC ERROR: -sub_pc_factor_shift_type nonzero [2]PETSC ERROR: -vecload_block_size 10 [2]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- #3 KSPSolve_GMRES() line 239 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/impls/gmres/gmres.c [3]PETSC ERROR: #4 KSPSolve() line 656 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itfunc.c [3]PETSC ERROR: #5 main() line 330 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/examples/tutorials/ex10.c [3]PETSC ERROR: PETSc Option Table entries: [3]PETSC ERROR: -f0 ./mat_rhs/a_react_in_2.bin [3]PETSC ERROR: -ksp_error_if_not_converged [3]PETSC ERROR: -mat_view ascii::ascii_info [3]PETSC ERROR: -matload_block_size 1 [3]PETSC ERROR: -rhs ./mat_rhs/b_react_in_2.bin [3]PETSC ERROR: -skp_monitor -f0 ./mat_rhs/a_react_in_2.bin [4]PETSC ERROR: -ksp_error_if_not_converged [4]PETSC ERROR: -mat_view ascii::ascii_info [4]PETSC ERROR: -matload_block_size 1 [4]PETSC ERROR: -rhs ./mat_rhs/b_react_in_2.bin [4]PETSC ERROR: -skp_monitor [4]PETSC ERROR: -sub_pc_factor_shift_type nonzero [4]PETSC ERROR: -vecload_block_size 10 [4]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- #4 KSPSolve() line 656 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itfunc.c [5]PETSC ERROR: #5 main() line 330 in 
/home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/examples/tutorials/ex10.c [5]PETSC ERROR: PETSc Option Table entries: [5]PETSC ERROR: -f0 ./mat_rhs/a_react_in_2.bin [5]PETSC ERROR: -ksp_error_if_not_converged [5]PETSC ERROR: -mat_view ascii::ascii_info [5]PETSC ERROR: -matload_block_size 1 [5]PETSC ERROR: -rhs ./mat_rhs/b_react_in_2.bin [5]PETSC ERROR: -skp_monitor [5]PETSC ERROR: -sub_pc_factor_shift_type nonzero [6]PETSC ERROR: -f0 ./mat_rhs/a_react_in_2.bin [6]PETSC ERROR: -ksp_error_if_not_converged [6]PETSC ERROR: -mat_view ascii::ascii_info [6]PETSC ERROR: -matload_block_size 1 [6]PETSC ERROR: -rhs ./mat_rhs/b_react_in_2.bin [6]PETSC ERROR: -skp_monitor [6]PETSC ERROR: -sub_pc_factor_shift_type nonzero [6]PETSC ERROR: -vecload_block_size 10 [6]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- [7]PETSC ERROR: #4 KSPSolve() line 656 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itfunc.c [7]PETSC ERROR: #5 main() line 330 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/examples/tutorials/ex10.c [7]PETSC ERROR: PETSc Option Table entries: [7]PETSC ERROR: -f0 ./mat_rhs/a_react_in_2.bin [7]PETSC ERROR: -ksp_error_if_not_converged [7]PETSC ERROR: -mat_view ascii::ascii_info [7]PETSC ERROR: -matload_block_size 1 [7]PETSC ERROR: -rhs ./mat_rhs/b_react_in_2.bin [7]PETSC ERROR: -skp_monitor -mat_view ascii::ascii_info [8]PETSC ERROR: -matload_block_size 1 [8]PETSC ERROR: -rhs ./mat_rhs/b_react_in_2.bin [8]PETSC ERROR: -skp_monitor [8]PETSC ERROR: -sub_pc_factor_shift_type nonzero [8]PETSC ERROR: -vecload_block_size 10 [8]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- ./ex10 on a linux-gnu-dbg named nwmop by dsu Wed May 24 12:54:10 2017 [11]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mumps --download-scalapack --download-parmetis --download-metis --download-ptscotch --download-fblaslapack --download-mpich --download-hypre --download-superlu_dist --download-hdf5=yes [11]PETSC ERROR: #1 VecCopy() line 1639 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/vec/vec/interface/vector.c [11]PETSC ERROR: #2 KSPInitialResidual() line 65 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itres.c [11]PETSC ERROR: #3 KSPSolve_GMRES() line 239 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/impls/gmres/gmres.c [11]PETSC ERROR: #4 KSPSolve() line 656 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itfunc.c [11]PETSC ERROR: [12]PETSC ERROR: #5 main() line 330 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/examples/tutorials/ex10.c [12]PETSC ERROR: PETSc Option Table entries: [12]PETSC ERROR: -f0 ./mat_rhs/a_react_in_2.bin [12]PETSC ERROR: -ksp_error_if_not_converged [12]PETSC ERROR: -mat_view ascii::ascii_info [12]PETSC ERROR: -matload_block_size 1 [12]PETSC ERROR: -rhs ./mat_rhs/b_react_in_2.bin [12]PETSC ERROR: -skp_monitor [12]PETSC ERROR: -sub_pc_factor_shift_type nonzero [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- [1]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- application called MPI_Abort(MPI_COMM_WORLD, 75) - process 2 [cli_2]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 75) - process 2 [3]PETSC ERROR: -sub_pc_factor_shift_type nonzero [3]PETSC ERROR: -vecload_block_size 10 [3]PETSC ERROR: ----------------End of Error 
Message -------send entire error message to petsc-maint at mcs.anl.gov---------- [5]PETSC ERROR: -vecload_block_size 10 [5]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- [7]PETSC ERROR: -sub_pc_factor_shift_type nonzero [7]PETSC ERROR: -vecload_block_size 10 [7]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- #5 main() line 330 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/examples/tutorials/ex10.c [11]PETSC ERROR: PETSc Option Table entries: [11]PETSC ERROR: -f0 ./mat_rhs/a_react_in_2.bin [11]PETSC ERROR: -ksp_error_if_not_converged [11]PETSC ERROR: -mat_view ascii::ascii_info [11]PETSC ERROR: -matload_block_size 1 [11]PETSC ERROR: [12]PETSC ERROR: -vecload_block_size 10 [12]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- application called MPI_Abort(MPI_COMM_WORLD, 75) - process 1 [cli_1]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 75) - process 1 application called MPI_Abort(MPI_COMM_WORLD, 75) - process 3 [cli_3]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 75) - process 3 application called MPI_Abort(MPI_COMM_WORLD, 75) - process 5 [cli_5]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 75) - process 5 application called MPI_Abort(MPI_COMM_WORLD, 75) - process 6 [cli_6]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 75) - process 6 application called MPI_Abort(MPI_COMM_WORLD, 75) - process 7 [cli_7]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 75) - process 7 application called MPI_Abort(MPI_COMM_WORLD, 75) - process 8 [cli_8]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 75) - process 8 -rhs ./mat_rhs/b_react_in_2.bin [11]PETSC ERROR: -skp_monitor [11]PETSC ERROR: -sub_pc_factor_shift_type nonzero [11]PETSC ERROR: -vecload_block_size 10 [11]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- application called MPI_Abort(MPI_COMM_WORLD, 75) - process 12 [cli_12]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 75) - process 12 application called MPI_Abort(MPI_COMM_WORLD, 75) - process 0 [cli_0]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 75) - process 0 application called MPI_Abort(MPI_COMM_WORLD, 75) - process 11 [cli_11]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 75) - process 11 application called MPI_Abort(MPI_COMM_WORLD, 75) - process 4 [cli_4]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 75) - process 4 [9]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [9]PETSC ERROR: Arguments are incompatible [9]PETSC ERROR: Incompatible vector local lengths 28120 != 28125 [9]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[9]PETSC ERROR: Petsc Release Version 3.7.5, Jan, 01, 2017 [9]PETSC ERROR: ./ex10 on a linux-gnu-dbg named nwmop by dsu Wed May 24 12:54:10 2017 [9]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mumps --download-scalapack --download-parmetis --download-metis --download-ptscotch --download-fblaslapack --download-mpich --download-hypre --download-superlu_dist --download-hdf5=yes [9]PETSC ERROR: #1 VecCopy() line 1639 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/vec/vec/interface/vector.c [9]PETSC ERROR: #2 KSPInitialResidual() line 65 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itres.c [9]PETSC ERROR: #3 KSPSolve_GMRES() line 239 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/impls/gmres/gmres.c [9]PETSC ERROR: #4 KSPSolve() line 656 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itfunc.c [9]PETSC ERROR: #5 main() line 330 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/examples/tutorials/ex10.c [9]PETSC ERROR: PETSc Option Table entries: [9]PETSC ERROR: -f0 ./mat_rhs/a_react_in_2.bin [9]PETSC ERROR: -ksp_error_if_not_converged [9]PETSC ERROR: -mat_view ascii::ascii_info [9]PETSC ERROR: -matload_block_size 1 [9]PETSC ERROR: -rhs ./mat_rhs/b_react_in_2.bin [9]PETSC ERROR: -skp_monitor [9]PETSC ERROR: -sub_pc_factor_shift_type nonzero [9]PETSC ERROR: -vecload_block_size 10 [9]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- application called MPI_Abort(MPI_COMM_WORLD, 75) - process 9 [cli_9]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 75) - process 9 =================================================================================== = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES = EXIT CODE: 75 = CLEANING UP REMAINING PROCESSES = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES =================================================================================== -------------- next part -------------- Mat Object: 48 MPI processes type: mpiaij rows=450000, cols=450000 total: nonzeros=6991400, allocated nonzeros=6991400 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Arguments are incompatible [0]PETSC ERROR: Incompatible vector local lengths 9380 != 9375 [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.7.5, Jan, 01, 2017 [0]PETSC ERROR: ./ex10 on a linux-gnu-dbg named nwmop by dsu Wed May 24 12:57:30 2017 [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mumps --download-scalapack --download-parmetis --download-metis --download-ptscotch --download-fblaslapack --download-mpich --download-hypre --download-superlu_dist --download-hdf5=yes [0]PETSC ERROR: #1 VecCopy() line 1639 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/vec/vec/interface/vector.c [0]PETSC ERROR: #2 KSPInitialResidual() line 65 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itres.c [0]PETSC ERROR: #3 KSPSolve_GMRES() line 239 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/impls/gmres/gmres.c [0]PETSC ERROR: #4 KSPSolve() line 656 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: #5 main() line 330 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/examples/tutorials/ex10.c [0]PETSC ERROR: PETSc Option Table entries: [0]PETSC ERROR: -f0 ./mat_rhs/a_react_in_2.bin [0]PETSC ERROR: -ksp_error_if_not_converged [0]PETSC ERROR: -mat_view ascii::ascii_info [0]PETSC ERROR: -matload_block_size 1 [0]PETSC ERROR: -rhs ./mat_rhs/b_react_in_2.bin [0]PETSC ERROR: -skp_monitor [0]PETSC ERROR: -sub_pc_factor_shift_type nonzero [0]PETSC ERROR: -vecload_block_size 10 [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- application called MPI_Abort(MPI_COMM_WORLD, 75) - process 0 [cli_0]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 75) - process 0 [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Arguments are incompatible [1]PETSC ERROR: Incompatible vector local lengths 9380 != 9375 [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[The same "Arguments are incompatible / Incompatible vector local lengths" error report and MPI_Abort(MPI_COMM_WORLD, 75) message were repeated by ranks 1, 2, 4, 5, 6, 8, 10, and 16 (local lengths 9380 != 9375) and by ranks 24, 28, 32, 33, 34, 40, 42, and 44 (local lengths 9370 != 9375); the duplicate reports are omitted here. The remainder of rank 44's report follows.]
[44]PETSC ERROR: Petsc Release Version 3.7.5, Jan, 01, 2017 [44]PETSC ERROR: ./ex10 on a linux-gnu-dbg named nwmop by dsu Wed May 24 12:57:30 2017 [44]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mumps --download-scalapack --download-parmetis --download-metis --download-ptscotch --download-fblaslapack --download-mpich --download-hypre --download-superlu_dist --download-hdf5=yes [44]PETSC ERROR: #1 VecCopy() line 1639 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/vec/vec/interface/vector.c [44]PETSC ERROR: #2 KSPInitialResidual() line 65 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itres.c [44]PETSC ERROR: #3 KSPSolve_GMRES() line 239 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/impls/gmres/gmres.c [44]PETSC ERROR: #4 KSPSolve() line 656 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/interface/itfunc.c [44]PETSC ERROR: #5 main() line 330 in /home/dsu/Soft/PETSc/petsc-3.7.5/src/ksp/ksp/examples/tutorials/ex10.c [44]PETSC ERROR: PETSc Option Table entries: [44]PETSC ERROR: -f0 ./mat_rhs/a_react_in_2.bin [44]PETSC ERROR: -ksp_error_if_not_converged [44]PETSC ERROR: -mat_view ascii::ascii_info [44]PETSC ERROR: -matload_block_size 1 [44]PETSC ERROR: -rhs ./mat_rhs/b_react_in_2.bin [44]PETSC ERROR: -skp_monitor [44]PETSC ERROR: -sub_pc_factor_shift_type nonzero =================================================================================== = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES = EXIT CODE: 75 = CLEANING UP REMAINING PROCESSES = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES =================================================================================== -------------- next part -------------- Mat Object: 8 MPI processes type: mpiaij rows=450000, cols=450000 total: nonzeros=6991400, allocated nonzeros=6991400 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Number of iterations = 5 Residual norm 0.00542965 WARNING! There are options you set that were not used! WARNING! could be spelling mistake, etc! Option left: name:-skp_monitor (no value) From bsmith at mcs.anl.gov Wed May 24 15:18:09 2017 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 24 May 2017 15:18:09 -0500 Subject: [petsc-users] Question on incomplete factorization level and fill In-Reply-To: References: <93217794-9c63-fd52-ab36-4174de8cb9c8@gmail.com> Message-ID: <2596E6EB-1F05-46F1-8B44-A36825EF9AEC@mcs.anl.gov> I don't think this has anything to do with the specific solver but is because you are loading both a vector and matrix from a file and when it uses the default parallel layout for each, because you have -matload_block_size 1 and -vecload_block_size 10 they do not get the same layout. Remove the -matload_block_size 1 and -vecload_block_size 10 they don't mean anything here anyways. Does this resolve the problem? Barry > On May 24, 2017, at 3:06 PM, Danyang Su wrote: > > Dear Hong, > > I just tested with different number of processors for the same matrix. It sometimes got "ERROR: Arguments are incompatible" for different number of processors. It works fine using 4, 8, or 24 processors, but failed with "ERROR: Arguments are incompatible" using 16 or 48 processors. The error information is attached. I tested this on my local computer with 6 cores 12 threads. Any suggestion on this? > > Thanks, > Danyang > > On 17-05-24 12:28 PM, Danyang Su wrote: >> Hi Hong, >> >> Awesome. Thanks for testing the case. I will try your options for the code and get back to you later. 
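As an illustration of Barry's point above about the matrix and the right-hand side picking up different default parallel layouts when they are loaded with mismatched block sizes, here is a minimal sketch of one way to tie the two layouts together when loading from binary files. This is not the ex10 source; it assumes the PETSc 3.7-era C API, and the file names are simply taken from the option table in the log above. The right-hand side is created from the matrix's own row layout before VecLoad() fills it, so the two objects cannot end up with incompatible local sizes regardless of any block-size options.

#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat            A;
  Vec            b;
  PetscViewer    viewer;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);CHKERRQ(ierr);

  /* Load the matrix; its parallel row distribution is fixed here. */
  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, "./mat_rhs/a_react_in_2.bin",
                               FILE_MODE_READ, &viewer);CHKERRQ(ierr);
  ierr = MatLoad(A, viewer);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);

  /* Create b in the row (range) space of A, then load into it, so the
     vector's distribution follows the matrix rather than a file default. */
  ierr = MatCreateVecs(A, NULL, &b);CHKERRQ(ierr);
  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, "./mat_rhs/b_react_in_2.bin",
                               FILE_MODE_READ, &viewer);CHKERRQ(ierr);
  ierr = VecLoad(b, viewer);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);

  /* ... create a KSP, KSPSetOperators(ksp, A, A), and solve as usual ... */

  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}

With the layouts tied together like this, the -matload_block_size and -vecload_block_size options have nothing left to do, which is consistent with Barry's suggestion to drop them.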
>> >> Regards, >> >> Danyang >> >> On 17-05-24 12:21 PM, Hong wrote: >>> Danyang : >>> I tested your data. >>> Your matrices encountered zero pivots, e.g. >>> petsc/src/ksp/ksp/examples/tutorials (master) >>> $ mpiexec -n 24 ./ex10 -f0 a_react_in_2.bin -rhs b_react_in_2.bin -ksp_monitor -ksp_error_if_not_converged >>> >>> [15]PETSC ERROR: Zero pivot in LU factorization: http://www.mcs.anl.gov/petsc/documentation/faq.html#zeropivot >>> [15]PETSC ERROR: Zero pivot row 1249 value 2.05808e-14 tolerance 2.22045e-14 >>> ... >>> >>> Adding option '-sub_pc_factor_shift_type nonzero', I got >>> mpiexec -n 24 ./ex10 -f0 a_react_in_2.bin -rhs b_react_in_2.bin -ksp_monitor -ksp_error_if_not_converged -sub_pc_factor_shift_type nonzero -mat_view ascii::ascii_info >>> >>> Mat Object: 24 MPI processes >>> type: mpiaij >>> rows=450000, cols=450000 >>> total: nonzeros=6991400, allocated nonzeros=6991400 >>> total number of mallocs used during MatSetValues calls =0 >>> not using I-node (on process 0) routines >>> 0 KSP Residual norm 5.849777711755e+01 >>> 1 KSP Residual norm 6.824179430230e-01 >>> 2 KSP Residual norm 3.994483555787e-02 >>> 3 KSP Residual norm 6.085841461433e-03 >>> 4 KSP Residual norm 8.876162583511e-04 >>> 5 KSP Residual norm 9.407780665278e-05 >>> Number of iterations = 5 >>> Residual norm 0.00542891 >>> >>> Hong >>> Hi Matt, >>> >>> Yes. The matrix is 450000x450000 sparse. The hypre takes hundreds of iterates, not for all but in most of the timesteps. The matrix is not well conditioned, with nonzero entries range from 1.0e-29 to 1.0e2. I also made double check if there is anything wrong in the parallel version, however, the matrix is the same with sequential version except some round error which is relatively very small. Usually for those not well conditioned matrix, direct solver should be faster than iterative solver, right? But when I use the sequential iterative solver with ILU prec developed almost 20 years go by others, the solver converge fast with appropriate factorization level. In other words, when I use 24 processor using hypre, the speed is almost the same as as the old sequential iterative solver using 1 processor. >>> >>> I use most of the default configuration for the general case with pretty good speedup. And I am not sure if I miss something for this problem. >>> >>> Thanks, >>> >>> Danyang >>> >>> On 17-05-24 11:12 AM, Matthew Knepley wrote: >>>> On Wed, May 24, 2017 at 12:50 PM, Danyang Su wrote: >>>> Hi Matthew and Barry, >>>> >>>> Thanks for the quick response. >>>> I also tried superlu and mumps, both work but it is about four times slower than ILU(dt) prec through hypre, with 24 processors I have tested. >>>> >>>> You mean the total time is 4x? And you are taking hundreds of iterates? That seems hard to believe, unless you are dropping >>>> a huge number of elements. >>>> When I look into the convergence information, the method using ILU(dt) still takes 200 to 3000 linear iterations for each newton iteration. One reason is this equation is hard to solve. As for the general cases, the same method works awesome and get very good speedup. >>>> >>>> I do not understand what you mean here. >>>> I also doubt if I use hypre correctly for this case. Is there anyway to check this problem, or is it possible to increase the factorization level through hypre? >>>> >>>> I don't know. 
>>>> >>>> Matt >>>> Thanks, >>>> >>>> Danyang >>>> >>>> On 17-05-24 04:59 AM, Matthew Knepley wrote: >>>>> On Wed, May 24, 2017 at 2:21 AM, Danyang Su wrote: >>>>> Dear All, >>>>> >>>>> I use PCFactorSetLevels for ILU and PCFactorSetFill for other preconditioning in my code to help solve the problems that the default option is hard to solve. However, I found the latter one, PCFactorSetFill does not take effect for my problem. The matrices and rhs as well as the solutions are attached from the link below. I obtain the solution using hypre preconditioner and it takes 7 and 38 iterations for matrix 1 and matrix 2. However, if I use other preconditioner, the solver just failed at the first matrix. I have tested this matrix using the native sequential solver (not PETSc) with ILU preconditioning. If I set the incomplete factorization level to 0, this sequential solver will take more than 100 iterations. If I increase the factorization level to 1 or more, it just takes several iterations. This remind me that the PC factor for this matrices should be increased. However, when I tried it in PETSc, it just does not work. >>>>> >>>>> Matrix and rhs can be obtained from the link below. >>>>> >>>>> https://eilinator.eos.ubc.ca:8443/index.php/s/CalUcq9CMeblk4R >>>>> >>>>> Would anyone help to check if you can make this work by increasing the PC factor level or fill? >>>>> >>>>> We have ILU(k) supported in serial. However ILU(dt) which takes a tolerance only works through Hypre >>>>> >>>>> http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html >>>>> >>>>> I recommend you try SuperLU or MUMPS, which can both be downloaded automatically by configure, and >>>>> do a full sparse LU. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> Thanks and regards, >>>>> >>>>> Danyang >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> http://www.caam.rice.edu/~mk51/ >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> http://www.caam.rice.edu/~mk51/ >>> >>> >> > > From michal.derezinski at gmail.com Wed May 24 15:20:56 2017 From: michal.derezinski at gmail.com (=?utf-8?Q?Micha=C5=82_Derezi=C5=84ski?=) Date: Wed, 24 May 2017 13:20:56 -0700 Subject: [petsc-users] Accessing submatrices without additional memory usage In-Reply-To: <87mva22gw9.fsf@jedbrown.org> References: <0B69CBD4-9524-429A-8478-0BBF0C236F94@gmail.com> <87shju2jil.fsf@jedbrown.org> <5BF044B5-49B3-49CB-A58D-A06B11DD6000@gmail.com> <87poey2hy4.fsf@jedbrown.org> <0D256C07-0556-4D0E-A9D3-D7D3F5D8B2C6@gmail.com> <87mva22gw9.fsf@jedbrown.org> Message-ID: > Wiadomo?? napisana przez Jed Brown w dniu 24.05.2017, o godz. 12:28: > > Micha? Derezi?ski > writes: > >>> Wiadomo?? napisana przez Jed Brown w dniu 24.05.2017, o godz. 12:06: >>> >>> Okay, do you have more parameters than observations? >> >> No (not necessarily). The biggest matrix is 50M observations and 12M parameters. >> >>> And each segment >>> of the matrix will be fully distributed? >> >> Yes. >> >>> Do you have a parallel file >>> system? >> >> Yes. >> >>> Is your matrix sparse or dense? >> >> Yes. > > By that you mean sparse? > Yes, sorry, that?s what I meant. 
> You'll need some sort of segmented storage (could be separate files or a > file format that allows seeking). (If the matrix is generated by some > other process, you'd benefit from skipping the file system entirely, but > I understand that may not be possible.) > I have the segmented storage in place. > I would use MatNest, creating a new one after each segment is loaded. > There isn't currently a MatLoadBegin/End interface, but that could be > created if it would be useful. Ok, yeah, that was my plan with MatNest. As far as loading in parallel with computation goes, the feedback that I?m hearing so far is: 1. Don?t do it, unless you really have to; 2. If you?re going to do it, instead of spawning a separate thread, use asynchronous read, eg MPI_File_iread. Does this make sense? Thanks, Michal. -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Wed May 24 20:32:40 2017 From: hzhang at mcs.anl.gov (Hong) Date: Wed, 24 May 2017 20:32:40 -0500 Subject: [petsc-users] Question on incomplete factorization level and fill In-Reply-To: References: <93217794-9c63-fd52-ab36-4174de8cb9c8@gmail.com> Message-ID: Remove your option '-vecload_block_size 10'. Hong On Wed, May 24, 2017 at 3:06 PM, Danyang Su wrote: > Dear Hong, > > I just tested with different number of processors for the same matrix. It > sometimes got "ERROR: Arguments are incompatible" for different number of > processors. It works fine using 4, 8, or 24 processors, but failed with > "ERROR: Arguments are incompatible" using 16 or 48 processors. The error > information is attached. I tested this on my local computer with 6 cores 12 > threads. Any suggestion on this? > > Thanks, > > Danyang > > On 17-05-24 12:28 PM, Danyang Su wrote: > > Hi Hong, > > Awesome. Thanks for testing the case. I will try your options for the code > and get back to you later. > > Regards, > > Danyang > > On 17-05-24 12:21 PM, Hong wrote: > > Danyang : > I tested your data. > Your matrices encountered zero pivots, e.g. > petsc/src/ksp/ksp/examples/tutorials (master) > $ mpiexec -n 24 ./ex10 -f0 a_react_in_2.bin -rhs b_react_in_2.bin > -ksp_monitor -ksp_error_if_not_converged > > [15]PETSC ERROR: Zero pivot in LU factorization: > http://www.mcs.anl.gov/petsc/documentation/faq.html#zeropivot > [15]PETSC ERROR: Zero pivot row 1249 value 2.05808e-14 tolerance > 2.22045e-14 > ... > > Adding option '-sub_pc_factor_shift_type nonzero', I got > mpiexec -n 24 ./ex10 -f0 a_react_in_2.bin -rhs b_react_in_2.bin > -ksp_monitor -ksp_error_if_not_converged -sub_pc_factor_shift_type nonzero > -mat_view ascii::ascii_info > > Mat Object: 24 MPI processes > type: mpiaij > rows=450000, cols=450000 > total: nonzeros=6991400, allocated nonzeros=6991400 > total number of mallocs used during MatSetValues calls =0 > not using I-node (on process 0) routines > 0 KSP Residual norm 5.849777711755e+01 > 1 KSP Residual norm 6.824179430230e-01 > 2 KSP Residual norm 3.994483555787e-02 > 3 KSP Residual norm 6.085841461433e-03 > 4 KSP Residual norm 8.876162583511e-04 > 5 KSP Residual norm 9.407780665278e-05 > Number of iterations = 5 > Residual norm 0.00542891 > > Hong > >> Hi Matt, >> >> Yes. The matrix is 450000x450000 sparse. The hypre takes hundreds of >> iterates, not for all but in most of the timesteps. The matrix is not well >> conditioned, with nonzero entries range from 1.0e-29 to 1.0e2. 
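Going back to Michał's loading plan above (segments exposed through MatNest, with the next segment read asynchronously rather than on a separate thread), a rough sketch of the two pieces is below. It is only a sketch under stated assumptions: each already-loaded segment is a fully distributed Mat sharing one column layout, the segment file name and read buffer are hypothetical placeholders, and turning the raw bytes into the next segment Mat is not shown.

#include <petscmat.h>
#include <mpi.h>

/* Rebuild an nseg x 1 MatNest over the row segments loaded so far. MatNest
   only references its blocks, so redoing this after each segment is cheap. */
static PetscErrorCode RebuildNest(MPI_Comm comm, Mat segs[], PetscInt nseg, Mat *nest)
{
  PetscErrorCode ierr;
  if (*nest) { ierr = MatDestroy(nest);CHKERRQ(ierr); }
  /* NULL index sets let PETSc derive the row/column layouts of the blocks. */
  ierr = MatCreateNest(comm, nseg, NULL, 1, NULL, segs, nest);CHKERRQ(ierr);
  return 0;
}

/* Start a nonblocking read of the next segment's bytes so the file I/O can
   overlap computation on the current nest; completed later with MPI_Wait(). */
static int StartSegmentRead(MPI_Comm comm, const char *fname, void *buf,
                            int nbytes, MPI_File *fh, MPI_Request *req)
{
  MPI_File_open(comm, fname, MPI_MODE_RDONLY, MPI_INFO_NULL, fh);
  return MPI_File_iread(*fh, buf, nbytes, MPI_BYTE, req);
}

The nonblocking read is finished with MPI_Wait() (and the file closed with MPI_File_close()) before the buffered bytes are assembled into the next segment and RebuildNest() is called again.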
I also made >> double check if there is anything wrong in the parallel version, however, >> the matrix is the same with sequential version except some round error >> which is relatively very small. Usually for those not well conditioned >> matrix, direct solver should be faster than iterative solver, right? But >> when I use the sequential iterative solver with ILU prec developed almost >> 20 years go by others, the solver converge fast with appropriate >> factorization level. In other words, when I use 24 processor using hypre, >> the speed is almost the same as as the old sequential iterative solver >> using 1 processor. >> >> I use most of the default configuration for the general case with pretty >> good speedup. And I am not sure if I miss something for this problem. >> >> Thanks, >> >> Danyang >> >> On 17-05-24 11:12 AM, Matthew Knepley wrote: >> >> On Wed, May 24, 2017 at 12:50 PM, Danyang Su >> wrote: >> >>> Hi Matthew and Barry, >>> >>> Thanks for the quick response. >>> >>> I also tried superlu and mumps, both work but it is about four times >>> slower than ILU(dt) prec through hypre, with 24 processors I have tested. >>> >> You mean the total time is 4x? And you are taking hundreds of iterates? >> That seems hard to believe, unless you are dropping >> a huge number of elements. >> >>> When I look into the convergence information, the method using ILU(dt) >>> still takes 200 to 3000 linear iterations for each newton iteration. One >>> reason is this equation is hard to solve. As for the general cases, the >>> same method works awesome and get very good speedup. >>> >> I do not understand what you mean here. >> >>> I also doubt if I use hypre correctly for this case. Is there anyway to >>> check this problem, or is it possible to increase the factorization level >>> through hypre? >>> >> I don't know. >> >> Matt >> >>> Thanks, >>> >>> Danyang >>> >>> On 17-05-24 04:59 AM, Matthew Knepley wrote: >>> >>> On Wed, May 24, 2017 at 2:21 AM, Danyang Su >>> wrote: >>> >>>> Dear All, >>>> >>>> I use PCFactorSetLevels for ILU and PCFactorSetFill for other >>>> preconditioning in my code to help solve the problems that the default >>>> option is hard to solve. However, I found the latter one, PCFactorSetFill >>>> does not take effect for my problem. The matrices and rhs as well as the >>>> solutions are attached from the link below. I obtain the solution using >>>> hypre preconditioner and it takes 7 and 38 iterations for matrix 1 and >>>> matrix 2. However, if I use other preconditioner, the solver just failed at >>>> the first matrix. I have tested this matrix using the native sequential >>>> solver (not PETSc) with ILU preconditioning. If I set the incomplete >>>> factorization level to 0, this sequential solver will take more than 100 >>>> iterations. If I increase the factorization level to 1 or more, it just >>>> takes several iterations. This remind me that the PC factor for this >>>> matrices should be increased. However, when I tried it in PETSc, it just >>>> does not work. >>>> >>>> Matrix and rhs can be obtained from the link below. >>>> >>>> https://eilinator.eos.ubc.ca:8443/index.php/s/CalUcq9CMeblk4R >>>> >>>> Would anyone help to check if you can make this work by increasing the >>>> PC factor level or fill? >>>> >>> >>> We have ILU(k) supported in serial. 
However ILU(dt) which takes a >>> tolerance only works through Hypre >>> >>> http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html >>> >>> I recommend you try SuperLU or MUMPS, which can both be downloaded >>> automatically by configure, and >>> do a full sparse LU. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks and regards, >>>> >>>> Danyang >>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> http://www.caam.rice.edu/~mk51/ >>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> http://www.caam.rice.edu/~mk51/ >> >> >> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From danyang.su at gmail.com Wed May 24 22:05:27 2017 From: danyang.su at gmail.com (Danyang Su) Date: Wed, 24 May 2017 20:05:27 -0700 Subject: [petsc-users] Question on incomplete factorization level and fill In-Reply-To: References: <93217794-9c63-fd52-ab36-4174de8cb9c8@gmail.com> Message-ID: <04d53b3d-a949-c389-2501-69845c5416d3@gmail.com> Hi All, I just delete the .info file and it works without problem now. Thanks, Danyang On 17-05-24 06:32 PM, Hong wrote: > Remove your option '-vecload_block_size 10'. > Hong > > On Wed, May 24, 2017 at 3:06 PM, Danyang Su > wrote: > > Dear Hong, > > I just tested with different number of processors for the same > matrix. It sometimes got "ERROR: Arguments are incompatible" for > different number of processors. It works fine using 4, 8, or 24 > processors, but failed with "ERROR: Arguments are incompatible" > using 16 or 48 processors. The error information is attached. I > tested this on my local computer with 6 cores 12 threads. Any > suggestion on this? > > Thanks, > > Danyang > > > On 17-05-24 12:28 PM, Danyang Su wrote: >> >> Hi Hong, >> >> Awesome. Thanks for testing the case. I will try your options for >> the code and get back to you later. >> >> Regards, >> >> Danyang >> >> >> On 17-05-24 12:21 PM, Hong wrote: >>> Danyang : >>> I tested your data. >>> Your matrices encountered zero pivots, e.g. >>> petsc/src/ksp/ksp/examples/tutorials (master) >>> $ mpiexec -n 24 ./ex10 -f0 a_react_in_2.bin -rhs >>> b_react_in_2.bin -ksp_monitor -ksp_error_if_not_converged >>> >>> [15]PETSC ERROR: Zero pivot in LU factorization: >>> http://www.mcs.anl.gov/petsc/documentation/faq.html#zeropivot >>> >>> [15]PETSC ERROR: Zero pivot row 1249 value 2.05808e-14 tolerance >>> 2.22045e-14 >>> ... 
>>> >>> Adding option '-sub_pc_factor_shift_type nonzero', I got >>> mpiexec -n 24 ./ex10 -f0 a_react_in_2.bin -rhs b_react_in_2.bin >>> -ksp_monitor -ksp_error_if_not_converged >>> -sub_pc_factor_shift_type nonzero -mat_view ascii::ascii_info >>> >>> Mat Object: 24 MPI processes >>> type: mpiaij >>> rows=450000, cols=450000 >>> total: nonzeros=6991400, allocated nonzeros=6991400 >>> total number of mallocs used during MatSetValues calls =0 >>> not using I-node (on process 0) routines >>> 0 KSP Residual norm 5.849777711755e+01 >>> 1 KSP Residual norm 6.824179430230e-01 >>> 2 KSP Residual norm 3.994483555787e-02 >>> 3 KSP Residual norm 6.085841461433e-03 >>> 4 KSP Residual norm 8.876162583511e-04 >>> 5 KSP Residual norm 9.407780665278e-05 >>> Number of iterations = 5 >>> Residual norm 0.00542891 >>> >>> Hong >>> >>> Hi Matt, >>> >>> Yes. The matrix is 450000x450000 sparse. The hypre takes >>> hundreds of iterates, not for all but in most of the >>> timesteps. The matrix is not well conditioned, with nonzero >>> entries range from 1.0e-29 to 1.0e2. I also made double >>> check if there is anything wrong in the parallel version, >>> however, the matrix is the same with sequential version >>> except some round error which is relatively very small. >>> Usually for those not well conditioned matrix, direct solver >>> should be faster than iterative solver, right? But when I >>> use the sequential iterative solver with ILU prec developed >>> almost 20 years go by others, the solver converge fast with >>> appropriate factorization level. In other words, when I use >>> 24 processor using hypre, the speed is almost the same as as >>> the old sequential iterative solver using 1 processor. >>> >>> I use most of the default configuration for the general case >>> with pretty good speedup. And I am not sure if I miss >>> something for this problem. >>> >>> Thanks, >>> >>> Danyang >>> >>> >>> On 17-05-24 11:12 AM, Matthew Knepley wrote: >>>> On Wed, May 24, 2017 at 12:50 PM, Danyang Su >>>> > wrote: >>>> >>>> Hi Matthew and Barry, >>>> >>>> Thanks for the quick response. >>>> >>>> I also tried superlu and mumps, both work but it is >>>> about four times slower than ILU(dt) prec through >>>> hypre, with 24 processors I have tested. >>>> >>>> You mean the total time is 4x? And you are taking hundreds >>>> of iterates? That seems hard to believe, unless you are >>>> dropping >>>> a huge number of elements. >>>> >>>> When I look into the convergence information, the >>>> method using ILU(dt) still takes 200 to 3000 linear >>>> iterations for each newton iteration. One reason is >>>> this equation is hard to solve. As for the general >>>> cases, the same method works awesome and get very good >>>> speedup. >>>> >>>> I do not understand what you mean here. >>>> >>>> I also doubt if I use hypre correctly for this case. Is >>>> there anyway to check this problem, or is it possible >>>> to increase the factorization level through hypre? >>>> >>>> I don't know. >>>> >>>> Matt >>>> >>>> Thanks, >>>> >>>> Danyang >>>> >>>> >>>> On 17-05-24 04:59 AM, Matthew Knepley wrote: >>>>> On Wed, May 24, 2017 at 2:21 AM, Danyang Su >>>>> > >>>>> wrote: >>>>> >>>>> Dear All, >>>>> >>>>> I use PCFactorSetLevels for ILU and >>>>> PCFactorSetFill for other preconditioning in my >>>>> code to help solve the problems that the default >>>>> option is hard to solve. However, I found the >>>>> latter one, PCFactorSetFill does not take effect >>>>> for my problem. 
The matrices and rhs as well as >>>>> the solutions are attached from the link below. I >>>>> obtain the solution using hypre preconditioner and >>>>> it takes 7 and 38 iterations for matrix 1 and >>>>> matrix 2. However, if I use other preconditioner, >>>>> the solver just failed at the first matrix. I have >>>>> tested this matrix using the native sequential >>>>> solver (not PETSc) with ILU preconditioning. If I >>>>> set the incomplete factorization level to 0, this >>>>> sequential solver will take more than 100 >>>>> iterations. If I increase the factorization level >>>>> to 1 or more, it just takes several iterations. >>>>> This remind me that the PC factor for this >>>>> matrices should be increased. However, when I >>>>> tried it in PETSc, it just does not work. >>>>> >>>>> Matrix and rhs can be obtained from the link below. >>>>> >>>>> https://eilinator.eos.ubc.ca:8443/index.php/s/CalUcq9CMeblk4R >>>>> >>>>> >>>>> Would anyone help to check if you can make this >>>>> work by increasing the PC factor level or fill? >>>>> >>>>> >>>>> We have ILU(k) supported in serial. However ILU(dt) >>>>> which takes a tolerance only works through Hypre >>>>> >>>>> http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html >>>>> >>>>> >>>>> I recommend you try SuperLU or MUMPS, which can both >>>>> be downloaded automatically by configure, and >>>>> do a full sparse LU. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> Thanks and regards, >>>>> >>>>> Danyang >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they >>>>> begin their experiments is infinitely more interesting >>>>> than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> http://www.caam.rice.edu/~mk51/ >>>>> >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin >>>> their experiments is infinitely more interesting than any >>>> results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> http://www.caam.rice.edu/~mk51/ >>>> >>> >>> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From danyang.su at gmail.com Thu May 25 02:26:16 2017 From: danyang.su at gmail.com (Danyang Su) Date: Thu, 25 May 2017 00:26:16 -0700 Subject: [petsc-users] PCFactorSetShiftType does not work in code but -pc_factor_set_shift_type works In-Reply-To: References: <93217794-9c63-fd52-ab36-4174de8cb9c8@gmail.com> Message-ID: <8634589f-d1a5-bf4f-b158-3ddb5a18026b@gmail.com> Dear Hong and Barry, I have implemented this option in the code, as we also need to use configuration from file for convenience. When I run the code using options, it works fine, however, when I run the code using configuration file, it does not work. The code has two set of equations, flow and reactive, with prefix been set to "flow_" and "react_". When I run the code using mpiexec -n 4 ../executable -flow_sub_pc_factor_shift_type nonzero -react_sub_pc_factor_shift_type nonzero it works. However, if I run using mpiexec -n 4 ../executable and let the executable file read the options from file, it just does not work at "call PCFactorSetShiftType(pc_flow,MAT_SHIFT_NONZERO, ierr) or none, positive_definite ...". Do I miss something here? Below is the pseudo code I have used for flow equations, similar for reactive equations. 
call MatCreateAIJ(Petsc_Comm_World,nndof,nndof,nngbldof,        &
                  nngbldof,d_nz,PETSC_NULL_INTEGER,o_nz,        &
                  PETSC_NULL_INTEGER,a_flow,ierr)
CHKERRQ(ierr)

call MatSetFromOptions(a_flow,ierr)
CHKERRQ(ierr)

call KSPCreate(Petsc_Comm_World, ksp_flow, ierr)
CHKERRQ(ierr)

call KSPAppendOptionsPrefix(ksp_flow,"flow_",ierr)
CHKERRQ(ierr)

call KSPSetInitialGuessNonzero(ksp_flow,                        &
                               b_initial_guess_nonzero_flow, ierr)
CHKERRQ(ierr)

call KSPSetDM(ksp_flow,dmda_flow%da,ierr)
CHKERRQ(ierr)

call KSPSetDMActive(ksp_flow,PETSC_FALSE,ierr)
CHKERRQ(ierr)

!!!!*********CHECK IF READ OPTION FROM FILE*********!!!!
if (read_option_from_file) then

  call KSPSetType(ksp_flow, KSPGMRES, ierr)      ! or KSPBCGS or others...
  CHKERRQ(ierr)

  call KSPGetPC(ksp_flow, pc_flow, ierr)
  CHKERRQ(ierr)

  call PCSetType(pc_flow, PCBJACOBI, ierr)       ! or PCILU or PCJACOBI or PCHYPRE ...
  CHKERRQ(ierr)

  call PCFactorSetShiftType(pc_flow, MAT_SHIFT_NONZERO, ierr)   ! or MAT_SHIFT_NONE, MAT_SHIFT_POSITIVE_DEFINITE ...
  CHKERRQ(ierr)

end if

call PCFactorGetMatSolverPackage(pc_flow,solver_pkg_flow,ierr)
CHKERRQ(ierr)

call compute_jacobian(rank,dmda_flow%da,                        &
                      a_flow,a_in,ia_in,ja_in,nngl_in,          &
                      row_idx_l2pg,col_idx_l2pg,                &
                      b_non_interlaced)

call KSPSetFromOptions(ksp_flow,ierr)
CHKERRQ(ierr)

call KSPSetUp(ksp_flow,ierr)
CHKERRQ(ierr)

call KSPSetUpOnBlocks(ksp_flow,ierr)
CHKERRQ(ierr)

call KSPSolve(ksp_flow,b_flow,x_flow,ierr)
CHKERRQ(ierr)

Thanks and Regards,

Danyang

On 17-05-24 06:32 PM, Hong wrote: > Remove your option '-vecload_block_size 10'. > Hong > > On Wed, May 24, 2017 at 3:06 PM, Danyang Su > wrote: > > Dear Hong, > > I just tested with different number of processors for the same > matrix. It sometimes got "ERROR: Arguments are incompatible" for > different number of processors. It works fine using 4, 8, or 24 > processors, but failed with "ERROR: Arguments are incompatible" > using 16 or 48 processors. The error information is attached. I > tested this on my local computer with 6 cores 12 threads. Any > suggestion on this? > > Thanks, > > Danyang > > > On 17-05-24 12:28 PM, Danyang Su wrote: >> >> Hi Hong, >> >> Awesome. Thanks for testing the case. I will try your options for >> the code and get back to you later. >> >> Regards, >> >> Danyang >> >> >> On 17-05-24 12:21 PM, Hong wrote: >>> Danyang : >>> I tested your data. >>> Your matrices encountered zero pivots, e.g. >>> petsc/src/ksp/ksp/examples/tutorials (master) >>> $ mpiexec -n 24 ./ex10 -f0 a_react_in_2.bin -rhs >>> b_react_in_2.bin -ksp_monitor -ksp_error_if_not_converged >>> >>> [15]PETSC ERROR: Zero pivot in LU factorization: >>> http://www.mcs.anl.gov/petsc/documentation/faq.html#zeropivot >>> >>> [15]PETSC ERROR: Zero pivot row 1249 value 2.05808e-14 tolerance >>> 2.22045e-14 >>> ...
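One detail about the pseudocode above: PCFactorSetShiftType() configures factorization preconditioners (ILU, ICC, LU, Cholesky); on other PC types, such as the PCBJACOBI set just before it, the call is silently ignored, whereas the command-line form -flow_sub_pc_factor_shift_type nonzero reaches the factorizations inside the blocks through the "sub_" prefix. Below is a hedged sketch of applying the shift to the block sub-PCs directly, shown in C; ksp_flow and pc_flow are assumed to be the objects from the snippet above, and the corresponding Fortran calls mirror these. The sub-KSPs exist only after KSPSetUp(), so this belongs after KSPSetUp() and before KSPSetUpOnBlocks()/KSPSolve().

  KSP            *subksp;
  PC             subpc;
  PetscInt       i, nlocal;
  PetscErrorCode ierr;

  /* The block sub-solvers are created during KSPSetUp(). */
  ierr = KSPSetUp(ksp_flow);CHKERRQ(ierr);
  ierr = PCBJacobiGetSubKSP(pc_flow, &nlocal, NULL, &subksp);CHKERRQ(ierr);
  for (i = 0; i < nlocal; i++) {
    ierr = KSPGetPC(subksp[i], &subpc);CHKERRQ(ierr);
    /* Same effect as -flow_sub_pc_factor_shift_type nonzero */
    ierr = PCFactorSetShiftType(subpc, MAT_SHIFT_NONZERO);CHKERRQ(ierr);
  }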
>>> >>> Adding option '-sub_pc_factor_shift_type nonzero', I got >>> mpiexec -n 24 ./ex10 -f0 a_react_in_2.bin -rhs b_react_in_2.bin >>> -ksp_monitor -ksp_error_if_not_converged >>> -sub_pc_factor_shift_type nonzero -mat_view ascii::ascii_info >>> >>> Mat Object: 24 MPI processes >>> type: mpiaij >>> rows=450000, cols=450000 >>> total: nonzeros=6991400, allocated nonzeros=6991400 >>> total number of mallocs used during MatSetValues calls =0 >>> not using I-node (on process 0) routines >>> 0 KSP Residual norm 5.849777711755e+01 >>> 1 KSP Residual norm 6.824179430230e-01 >>> 2 KSP Residual norm 3.994483555787e-02 >>> 3 KSP Residual norm 6.085841461433e-03 >>> 4 KSP Residual norm 8.876162583511e-04 >>> 5 KSP Residual norm 9.407780665278e-05 >>> Number of iterations = 5 >>> Residual norm 0.00542891 >>> >>> Hong >>> >>> Hi Matt, >>> >>> Yes. The matrix is 450000x450000 sparse. The hypre takes >>> hundreds of iterates, not for all but in most of the >>> timesteps. The matrix is not well conditioned, with nonzero >>> entries range from 1.0e-29 to 1.0e2. I also made double >>> check if there is anything wrong in the parallel version, >>> however, the matrix is the same with sequential version >>> except some round error which is relatively very small. >>> Usually for those not well conditioned matrix, direct solver >>> should be faster than iterative solver, right? But when I >>> use the sequential iterative solver with ILU prec developed >>> almost 20 years go by others, the solver converge fast with >>> appropriate factorization level. In other words, when I use >>> 24 processor using hypre, the speed is almost the same as as >>> the old sequential iterative solver using 1 processor. >>> >>> I use most of the default configuration for the general case >>> with pretty good speedup. And I am not sure if I miss >>> something for this problem. >>> >>> Thanks, >>> >>> Danyang >>> >>> >>> On 17-05-24 11:12 AM, Matthew Knepley wrote: >>>> On Wed, May 24, 2017 at 12:50 PM, Danyang Su >>>> > wrote: >>>> >>>> Hi Matthew and Barry, >>>> >>>> Thanks for the quick response. >>>> >>>> I also tried superlu and mumps, both work but it is >>>> about four times slower than ILU(dt) prec through >>>> hypre, with 24 processors I have tested. >>>> >>>> You mean the total time is 4x? And you are taking hundreds >>>> of iterates? That seems hard to believe, unless you are >>>> dropping >>>> a huge number of elements. >>>> >>>> When I look into the convergence information, the >>>> method using ILU(dt) still takes 200 to 3000 linear >>>> iterations for each newton iteration. One reason is >>>> this equation is hard to solve. As for the general >>>> cases, the same method works awesome and get very good >>>> speedup. >>>> >>>> I do not understand what you mean here. >>>> >>>> I also doubt if I use hypre correctly for this case. Is >>>> there anyway to check this problem, or is it possible >>>> to increase the factorization level through hypre? >>>> >>>> I don't know. >>>> >>>> Matt >>>> >>>> Thanks, >>>> >>>> Danyang >>>> >>>> >>>> On 17-05-24 04:59 AM, Matthew Knepley wrote: >>>>> On Wed, May 24, 2017 at 2:21 AM, Danyang Su >>>>> > >>>>> wrote: >>>>> >>>>> Dear All, >>>>> >>>>> I use PCFactorSetLevels for ILU and >>>>> PCFactorSetFill for other preconditioning in my >>>>> code to help solve the problems that the default >>>>> option is hard to solve. However, I found the >>>>> latter one, PCFactorSetFill does not take effect >>>>> for my problem. 
The matrices and rhs as well as >>>>> the solutions are attached from the link below. I >>>>> obtain the solution using hypre preconditioner and >>>>> it takes 7 and 38 iterations for matrix 1 and >>>>> matrix 2. However, if I use other preconditioner, >>>>> the solver just failed at the first matrix. I have >>>>> tested this matrix using the native sequential >>>>> solver (not PETSc) with ILU preconditioning. If I >>>>> set the incomplete factorization level to 0, this >>>>> sequential solver will take more than 100 >>>>> iterations. If I increase the factorization level >>>>> to 1 or more, it just takes several iterations. >>>>> This remind me that the PC factor for this >>>>> matrices should be increased. However, when I >>>>> tried it in PETSc, it just does not work. >>>>> >>>>> Matrix and rhs can be obtained from the link below. >>>>> >>>>> https://eilinator.eos.ubc.ca:8443/index.php/s/CalUcq9CMeblk4R >>>>> >>>>> >>>>> Would anyone help to check if you can make this >>>>> work by increasing the PC factor level or fill? >>>>> >>>>> >>>>> We have ILU(k) supported in serial. However ILU(dt) >>>>> which takes a tolerance only works through Hypre >>>>> >>>>> http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html >>>>> >>>>> >>>>> I recommend you try SuperLU or MUMPS, which can both >>>>> be downloaded automatically by configure, and >>>>> do a full sparse LU. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> Thanks and regards, >>>>> >>>>> Danyang >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they >>>>> begin their experiments is infinitely more interesting >>>>> than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> http://www.caam.rice.edu/~mk51/ >>>>> >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin >>>> their experiments is infinitely more interesting than any >>>> results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> http://www.caam.rice.edu/~mk51/ >>>> >>> >>> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael.afanasiev at erdw.ethz.ch Thu May 25 03:43:27 2017 From: michael.afanasiev at erdw.ethz.ch (Michael Afanasiev) Date: Thu, 25 May 2017 10:43:27 +0200 Subject: [petsc-users] Postdoc Position in the Computational Seismology Group at ETH Zurich. Message-ID: <8EADEF87-4EEA-4174-892C-1AD17FA556EE@erdw.ethz.ch> Hi everyone, The computational seismology group at ETH Zurich is looking for a postdoc to work with us on Salvus (www.salvus.io ) - a spectral-element software package for full-waveform modelling and inversion. The exact focus of the job is tied to the applicant's strengths and interests, and ranges from HPC engineering to tackling large-scale frequency domain (Helmholtz) applications. The code is currently integrated with PETSc, and utilizes DMPLEX for unstructured mesh management. Please find more details below. Cheers, Mike. _____ Postdoctoral research position: Full-waveform modeling and inversion across the scales The Computational Seismology Group at ETH Z?rich is seeking to appoint a postdoctoral researcher to work on Salvus, an open-source framework for full-waveform modeling and inversion (http://salvus.io ). The position is full-time (100%) for a duration of 24 months, with possibility for extension. Earliest starting date is 1 June 2017. 
Background: Salvus is a modular open-source code package for large-scale waveform modelling and inversion built on the basis of modern programming principles. This project will enable Salvus to (1) harness the large homogeneous and various heterogeneous HPC architectures that are available today, and (2) adapt easily to future architectures, requiring minimal code modifications. The project is intended to position Salvus as a top wavefield modelling and inversion package in the exascale era.

To ensure performance of Salvus on today's and tomorrow's supercomputing platforms, work will focus on cross-architecture developments, code and I/O optimisation, and systematic testing and validation. This will be complemented by actions to increase and broaden the usability and impact of Salvus, including workflow developments, the implementation of frequency-domain solvers, and extensions of the physics that can be modelled.

The successful candidate will be embedded into the team of Salvus developers and users, covering a wide range of fields including Computational Science, Applied Mathematics, Seismology, Exploration and Environmental Geophysics, Geothermal Energy, and Geofluids. She or he will have access to Piz Daint, currently Europe's fastest supercomputer, located at the Swiss National Supercomputing Center (CSCS, www.cscs.ch). Apart from the core responsibilities listed below, the successful candidate will have considerable freedom of research in order to develop an independent scientific career. Topics of interest to the group include, but are not limited to, real-world waveform modelling and inversion applications, the development of methods for uncertainty analysis, and the transfer of Salvus to new domains outside traditional seismology.

Core responsibilities:

- Cross-architecture developments, leveraging Salvus' mixin-based design to implement hardware-specific versions of compute-intensive code segments while leaving most of the code unchanged.
- General code optimisations to achieve maximal performance from single nodes to full-machine runs.
- I/O optimisation to handle the enormous data volumes needed in adjoint simulations. Sub-tasks include the incorporation and extension of a previously developed wavefield compression library, and interfacing to modern parallel seismic data formats.
- Workflow developments to facilitate the solution of large-scale inverse problems, including the automatic orchestration of a large number of HPC jobs.

Expected qualifications: The ideal candidate should have the following attributes:

- PhD degree in geophysics, computer science, physics, applied mathematics or a related field,
- strong programming skills in C or C++,
- experience developing software that exploits large-scale HPC platforms, with a strong knowledge of MPI and experience with at least one other parallel paradigm (OpenMP, CUDA, OpenCL),
- experience with collaborative software development (e.g. continuous integration services),
- experience with finite element methods, numerical wave propagation, and/or inverse problems,
- experience with Krylov methods and preconditioners, specifically domain-decomposition and/or multigrid methods (geometric, algebraic).

Furthermore, the successful candidate is expected to have excellent organizational, communication and interpersonal skills that allow her or him to work in a highly collaborative and interdisciplinary environment.
Application: To apply for this position, please send your full resume, cover letter and the names of three references to Prof. Andreas Fichtner (andreas.fichtner at erdw.ethz.ch ). If possible, please also attach a link to one or more software packages you have been involved with (GitHub, GitLab, Bitbucket, ?). The position will remain open until filled. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jchludzinski at gmail.com Thu May 25 07:54:29 2017 From: jchludzinski at gmail.com (John Chludzinski) Date: Thu, 25 May 2017 08:54:29 -0400 Subject: [petsc-users] PETSC OO C guide/standard? In-Reply-To: References: Message-ID: Thanks. C++ has now become the apotheosis of "no value-added complexity". Even Bjarne Stroustrup admits to understanding only a small fraction of the whole. On Wed, May 24, 2017 at 9:53 AM, Matthew Knepley wrote: > On Wed, May 24, 2017 at 8:50 AM, John Chludzinski > wrote: > >> Considering that the current C++ standard is >1600 pages and counting >> (still glomming on new "features"), I'm planning to try an OO style of C >> coding style. >> >> The standard's size (number of pages) being the best (and only >> *practical*) means to measure language complexity. >> > > Here is another thing I wrote talking about OO in PETSc: > > https://arxiv.org/abs/1209.1711 > > Matt > > >> On Wed, May 24, 2017 at 9:11 AM, Matthew Knepley >> wrote: >> >>> On Wed, May 24, 2017 at 8:03 AM, John Chludzinski < >>> jchludzinski at gmail.com> wrote: >>> >>>> Is there a guide for how to write/develop PETSC OO C code? How a >>>> "class" is defined/implemented? How you implement inheritance? Memory >>>> management? Etc? >>>> >>> >>> We have a guide: http://www.mcs.anl.gov/petsc/developers/developers.pdf >>> >>> If its not in there, you can mail the list. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> ---John >>>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> http://www.caam.rice.edu/~mk51/ >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lawrence.mitchell at imperial.ac.uk Thu May 25 09:27:39 2017 From: lawrence.mitchell at imperial.ac.uk (Lawrence Mitchell) Date: Thu, 25 May 2017 15:27:39 +0100 Subject: [petsc-users] DMPlex distribution with FVM adjacency Message-ID: Dear petsc-users, I am trying to distribute a triangle mesh with a cell halo defined by FVM adjacency (i.e. if I have a facet in the initial (0-overlap) distribution, I want the cell on the other side of it). Reading the documentation, I think I do: DMPlexSetAdjacencyUseCone(PETSC_TRUE) DMPlexSetAdjacencyUseClosure(PETSC_FALSE) and then DMPlexDistribute(..., ovelap=1) If I do this for a simple mesh and then try and do anything on it, I run into all sorts of problems because I have a plex where I have some facets, but not even one cell in the support of the facet. Is this to be expected? 
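Spelled out in C, that setup is roughly the following. This is a minimal sketch only: it assumes dm is an already-created, interpolated DMPlex, and dmDist/sf are names introduced here just for illustration.

  DM      dmDist = NULL;
  PetscSF sf     = NULL;

  /* FVM-style adjacency: go through the cone of a point, but not its full closure */
  ierr = DMPlexSetAdjacencyUseCone(dm, PETSC_TRUE);CHKERRQ(ierr);
  ierr = DMPlexSetAdjacencyUseClosure(dm, PETSC_FALSE);CHKERRQ(ierr);

  /* redistribute with one layer of overlap cells */
  ierr = DMPlexDistribute(dm, 1, &sf, &dmDist);CHKERRQ(ierr);
  if (dmDist) {ierr = DMDestroy(&dm);CHKERRQ(ierr); dm = dmDist;}

DMPlexDistribute returns a NULL parallel DM when nothing was actually moved, hence the guard before swapping in dmDist.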
For example the following petsc4py code breaks when run on 3 processes: $ mpiexec -n 3 python bork.py [1] DMPlexGetOrdering() line 133 in /data/lmitche1/src/deps/petsc/src/dm/impls/plex/plexreorder.c [1] DMPlexCreateOrderingClosure_Static() line 41 in /data/lmitche1/src/deps/petsc/src/dm/impls/plex/plexreorder.c [1] Petsc has generated inconsistent data [1] Number of depth 2 faces 34 does not match permuted nubmer 29 : error code 77 [2] DMPlexGetOrdering() line 133 in /data/lmitche1/src/deps/petsc/src/dm/impls/plex/plexreorder.c [2] DMPlexCreateOrderingClosure_Static() line 41 in /data/lmitche1/src/deps/petsc/src/dm/impls/plex/plexreorder.c [2] Petsc has generated inconsistent data [2] Number of depth 2 faces 33 does not match permuted nubmer 28 : error code 77 [0] DMPlexGetOrdering() line 133 in /data/lmitche1/src/deps/petsc/src/dm/impls/plex/plexreorder.c [0] DMPlexCreateOrderingClosure_Static() line 41 in /data/lmitche1/src/deps/petsc/src/dm/impls/plex/plexreorder.c [0] Petsc has generated inconsistent data [0] Number of depth 2 faces 33 does not match permuted nubmer 31 $ cat > bork.py<<\EOF from petsc4py import PETSc import numpy as np Lx = Ly = 1 nx = ny = 4 xcoords = np.linspace(0.0, Lx, nx + 1, dtype=PETSc.RealType) ycoords = np.linspace(0.0, Ly, ny + 1, dtype=PETSc.RealType) coords = np.asarray(np.meshgrid(xcoords, ycoords)).swapaxes(0, 2).reshape(-1, 2) # cell vertices i, j = np.meshgrid(np.arange(nx, dtype=PETSc.IntType), np.arange(ny, dtype=PETSc.IntType)) cells = [i*(ny+1) + j, i*(ny+1) + j+1, (i+1)*(ny+1) + j+1, (i+1)*(ny+1) + j] cells = np.asarray(cells, dtype=PETSc.IntType).swapaxes(0, 2).reshape(-1, 4) idx = [0, 1, 3, 1, 2, 3] cells = cells[:, idx].reshape(-1, 3) comm = PETSc.COMM_WORLD if comm.rank == 0: dm = PETSc.DMPlex().createFromCellList(2, cells, coords, comm=comm) else: dm = PETSc.DMPlex().createFromCellList(2, np.zeros((0, 4), dtype=PETSc.IntType), np.zeros((0, 2), dtype=PETSc.RealType), comm=comm) dm.setAdjacencyUseClosure(False) dm.setAdjacencyUseCone(True) dm.distribute(overlap=1) dm.getOrdering(PETSc.Mat.OrderingType.RCM) dm.view() EOF Am I doing something wrong? Is this not expected to work? Cheers, Lawrence -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 473 bytes Desc: OpenPGP digital signature URL: From hzhang at mcs.anl.gov Thu May 25 09:49:59 2017 From: hzhang at mcs.anl.gov (Hong) Date: Thu, 25 May 2017 09:49:59 -0500 Subject: [petsc-users] PCFactorSetShiftType does not work in code but -pc_factor_set_shift_type works In-Reply-To: <8634589f-d1a5-bf4f-b158-3ddb5a18026b@gmail.com> References: <93217794-9c63-fd52-ab36-4174de8cb9c8@gmail.com> <8634589f-d1a5-bf4f-b158-3ddb5a18026b@gmail.com> Message-ID: Danyang: You must access inner pc, then set shift. See petsc/src/ksp/ksp/examples/tutorials/ex7.c For example, I add following to petsc/src/ksp/ksp/examples/tutorials/ex2.c, line 191: PetscBool isbjacobi; PC pc; ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr); ierr = PetscObjectTypeCompare((PetscObject)pc,PCBJACOBI,&isbjacobi);CHKERRQ(ierr); if (isbjacobi) { PetscInt nlocal; KSP *subksp; PC subpc; ierr = KSPSetUp(ksp);CHKERRQ(ierr); ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr); /* Extract the array of KSP contexts for the local blocks */ ierr = PCBJacobiGetSubKSP(pc,&nlocal,NULL,&subksp);CHKERRQ(ierr); printf("isbjacobi, nlocal %D, set option to subpc...\n",nlocal); for (i=0; i > I have implemented this option in the code, as we also need to use > configuration from file for convenience. 
When I run the code using options, > it works fine, however, when I run the code using configuration file, it > does not work. The code has two set of equations, flow and reactive, with > prefix been set to "flow_" and "react_". When I run the code using > > mpiexec -n 4 ../executable -flow_sub_pc_factor_shift_type nonzero > -react_sub_pc_factor_shift_type nonzero > > it works. However, if I run using > > mpiexec -n 4 ../executable > > and let the executable file read the options from file, it just does not > work at "call PCFactorSetShiftType(pc_flow,MAT_SHIFT_NONZERO, ierr) or > none, positive_definite ...". Do I miss something here? > > Below is the pseudo code I have used for flow equations, similar for > reactive equations. > > call MatCreateAIJ(Petsc_Comm_World,nndof,nndof,nngbldof, & > nngbldof,d_nz,PETSC_NULL_INTEGER,o_nz, & > PETSC_NULL_INTEGER,a_flow,ierr) > CHKERRQ(ierr) > > call MatSetFromOptions(a_flow,ierr) > CHKERRQ(ierr) > > call KSPCreate(Petsc_Comm_World, ksp_flow, ierr) > CHKERRQ(ierr) > > call KSPAppendOptionsPrefix(ksp_flow,"flow_",ierr) > CHKERRQ(ierr) > > call KSPSetInitialGuessNonzero(ksp_flow, & > b_initial_guess_nonzero_flow, ierr) > CHKERRQ(ierr) > > call KSPSetInitialGuessNonzero(ksp_flow, & > b_initial_guess_nonzero_flow, ierr) > CHKERRQ(ierr) > > call KSPSetDM(ksp_flow,dmda_flow%da,ierr) > CHKERRQ(ierr) > call KSPSetDMActive(ksp_flow,PETSC_FALSE,ierr) > CHKERRQ(ierr) > > !!!!*********CHECK IF READ OPTION FROM FILE*********!!!! > if (read_option_from_file) then > > call KSPSetType(ksp_flow, KSPGMRES, ierr) !or KSPBCGS or > others... > CHKERRQ(ierr) > > call KSPGetPC(ksp_flow, pc_flow, ierr) > CHKERRQ(ierr) > > call PCSetType(pc_flow,PCBJACOBI, ierr) !or PCILU or > PCJACOBI or PCHYPRE ... > CHKERRQ(ierr) > > call PCFactorSetShiftType(pc_flow,MAT_SHIFT_NONZERO, ierr) or > none, positive_definite ... > CHKERRQ(ierr) > > end if > > call PCFactorGetMatSolverPackage(pc_flow,solver_pkg_flow,ierr) > CHKERRQ(ierr) > > call compute_jacobian(rank,dmda_flow%da, & > a_flow,a_in,ia_in,ja_in,nngl_in, & > row_idx_l2pg,col_idx_l2pg, & > b_non_interlaced) > call KSPSetFromOptions(ksp_flow,ierr) > CHKERRQ(ierr) > > call KSPSetUp(ksp_flow,ierr) > CHKERRQ(ierr) > > call KSPSetUpOnBlocks(ksp_flow,ierr) > CHKERRQ(ierr) > > call KSPSolve(ksp_flow,b_flow,x_flow,ierr) > CHKERRQ(ierr) > > > Thanks and Regards, > > Danyang > On 17-05-24 06:32 PM, Hong wrote: > > Remove your option '-vecload_block_size 10'. > Hong > > On Wed, May 24, 2017 at 3:06 PM, Danyang Su wrote: > >> Dear Hong, >> >> I just tested with different number of processors for the same matrix. It >> sometimes got "ERROR: Arguments are incompatible" for different number of >> processors. It works fine using 4, 8, or 24 processors, but failed with >> "ERROR: Arguments are incompatible" using 16 or 48 processors. The error >> information is attached. I tested this on my local computer with 6 cores 12 >> threads. Any suggestion on this? >> >> Thanks, >> >> Danyang >> >> On 17-05-24 12:28 PM, Danyang Su wrote: >> >> Hi Hong, >> >> Awesome. Thanks for testing the case. I will try your options for the >> code and get back to you later. >> >> Regards, >> >> Danyang >> >> On 17-05-24 12:21 PM, Hong wrote: >> >> Danyang : >> I tested your data. >> Your matrices encountered zero pivots, e.g. 
>> petsc/src/ksp/ksp/examples/tutorials (master) >> $ mpiexec -n 24 ./ex10 -f0 a_react_in_2.bin -rhs b_react_in_2.bin >> -ksp_monitor -ksp_error_if_not_converged >> >> [15]PETSC ERROR: Zero pivot in LU factorization: >> http://www.mcs.anl.gov/petsc/documentation/faq.html#zeropivot >> [15]PETSC ERROR: Zero pivot row 1249 value 2.05808e-14 tolerance >> 2.22045e-14 >> ... >> >> Adding option '-sub_pc_factor_shift_type nonzero', I got >> mpiexec -n 24 ./ex10 -f0 a_react_in_2.bin -rhs b_react_in_2.bin >> -ksp_monitor -ksp_error_if_not_converged -sub_pc_factor_shift_type nonzero >> -mat_view ascii::ascii_info >> >> Mat Object: 24 MPI processes >> type: mpiaij >> rows=450000, cols=450000 >> total: nonzeros=6991400, allocated nonzeros=6991400 >> total number of mallocs used during MatSetValues calls =0 >> not using I-node (on process 0) routines >> 0 KSP Residual norm 5.849777711755e+01 >> 1 KSP Residual norm 6.824179430230e-01 >> 2 KSP Residual norm 3.994483555787e-02 >> 3 KSP Residual norm 6.085841461433e-03 >> 4 KSP Residual norm 8.876162583511e-04 >> 5 KSP Residual norm 9.407780665278e-05 >> Number of iterations = 5 >> Residual norm 0.00542891 >> >> Hong >> >>> Hi Matt, >>> >>> Yes. The matrix is 450000x450000 sparse. The hypre takes hundreds of >>> iterates, not for all but in most of the timesteps. The matrix is not well >>> conditioned, with nonzero entries range from 1.0e-29 to 1.0e2. I also made >>> double check if there is anything wrong in the parallel version, however, >>> the matrix is the same with sequential version except some round error >>> which is relatively very small. Usually for those not well conditioned >>> matrix, direct solver should be faster than iterative solver, right? But >>> when I use the sequential iterative solver with ILU prec developed almost >>> 20 years go by others, the solver converge fast with appropriate >>> factorization level. In other words, when I use 24 processor using hypre, >>> the speed is almost the same as as the old sequential iterative solver >>> using 1 processor. >>> >>> I use most of the default configuration for the general case with pretty >>> good speedup. And I am not sure if I miss something for this problem. >>> >>> Thanks, >>> >>> Danyang >>> >>> On 17-05-24 11:12 AM, Matthew Knepley wrote: >>> >>> On Wed, May 24, 2017 at 12:50 PM, Danyang Su >>> wrote: >>> >>>> Hi Matthew and Barry, >>>> >>>> Thanks for the quick response. >>>> >>>> I also tried superlu and mumps, both work but it is about four times >>>> slower than ILU(dt) prec through hypre, with 24 processors I have tested. >>>> >>> You mean the total time is 4x? And you are taking hundreds of iterates? >>> That seems hard to believe, unless you are dropping >>> a huge number of elements. >>> >>>> When I look into the convergence information, the method using ILU(dt) >>>> still takes 200 to 3000 linear iterations for each newton iteration. One >>>> reason is this equation is hard to solve. As for the general cases, the >>>> same method works awesome and get very good speedup. >>>> >>> I do not understand what you mean here. >>> >>>> I also doubt if I use hypre correctly for this case. Is there anyway to >>>> check this problem, or is it possible to increase the factorization level >>>> through hypre? >>>> >>> I don't know. 
>>> >>> Matt >>> >>>> Thanks, >>>> >>>> Danyang >>>> >>>> On 17-05-24 04:59 AM, Matthew Knepley wrote: >>>> >>>> On Wed, May 24, 2017 at 2:21 AM, Danyang Su >>>> wrote: >>>> >>>>> Dear All, >>>>> >>>>> I use PCFactorSetLevels for ILU and PCFactorSetFill for other >>>>> preconditioning in my code to help solve the problems that the default >>>>> option is hard to solve. However, I found the latter one, PCFactorSetFill >>>>> does not take effect for my problem. The matrices and rhs as well as the >>>>> solutions are attached from the link below. I obtain the solution using >>>>> hypre preconditioner and it takes 7 and 38 iterations for matrix 1 and >>>>> matrix 2. However, if I use other preconditioner, the solver just failed at >>>>> the first matrix. I have tested this matrix using the native sequential >>>>> solver (not PETSc) with ILU preconditioning. If I set the incomplete >>>>> factorization level to 0, this sequential solver will take more than 100 >>>>> iterations. If I increase the factorization level to 1 or more, it just >>>>> takes several iterations. This remind me that the PC factor for this >>>>> matrices should be increased. However, when I tried it in PETSc, it just >>>>> does not work. >>>>> >>>>> Matrix and rhs can be obtained from the link below. >>>>> >>>>> https://eilinator.eos.ubc.ca:8443/index.php/s/CalUcq9CMeblk4R >>>>> >>>>> Would anyone help to check if you can make this work by increasing the >>>>> PC factor level or fill? >>>>> >>>> >>>> We have ILU(k) supported in serial. However ILU(dt) which takes a >>>> tolerance only works through Hypre >>>> >>>> http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html >>>> >>>> I recommend you try SuperLU or MUMPS, which can both be downloaded >>>> automatically by configure, and >>>> do a full sparse LU. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thanks and regards, >>>>> >>>>> Danyang >>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> http://www.caam.rice.edu/~mk51/ >>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> http://www.caam.rice.edu/~mk51/ >>> >>> >>> >> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu May 25 10:25:30 2017 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 25 May 2017 10:25:30 -0500 Subject: [petsc-users] DMPlex distribution with FVM adjacency In-Reply-To: References: Message-ID: On Thu, May 25, 2017 at 9:27 AM, Lawrence Mitchell < lawrence.mitchell at imperial.ac.uk> wrote: > Dear petsc-users, > > I am trying to distribute a triangle mesh with a cell halo defined by > FVM adjacency (i.e. if I have a facet in the initial (0-overlap) > distribution, I want the cell on the other side of it). > > Reading the documentation, I think I do: > > DMPlexSetAdjacencyUseCone(PETSC_TRUE) > DMPlexSetAdjacencyUseClosure(PETSC_FALSE) > > and then > DMPlexDistribute(..., ovelap=1) > > If I do this for a simple mesh and then try and do anything on it, I > run into all sorts of problems because I have a plex where I have some > facets, but not even one cell in the support of the facet. Is this to > be expected? > Hmm. I don't think so. 
You should have at least one cell in the support of every facet. TS ex11 works exactly this way. When using that adjacency, the overlap cells you get will not have anything but the facet connecting them to that partition. Although, if you have adjacent cells in that overlap layer, you can get ghost faces between those. With the code below, do you get an interpolated mesh when you create it there. That call in C has another argument http://www.mcs.anl.gov/petsc/petsc-master/docs/manualpages/DMPLEX/DMPlexCreateFromCellList.html If its just cells and vertices, you could get some bizarre things like you see. Matt > For example the following petsc4py code breaks when run on 3 processes: > > $ mpiexec -n 3 python bork.py > [1] DMPlexGetOrdering() line 133 in > /data/lmitche1/src/deps/petsc/src/dm/impls/plex/plexreorder.c > [1] DMPlexCreateOrderingClosure_Static() line 41 in > /data/lmitche1/src/deps/petsc/src/dm/impls/plex/plexreorder.c > [1] Petsc has generated inconsistent data > [1] Number of depth 2 faces 34 does not match permuted nubmer 29 > : error code 77 > [2] DMPlexGetOrdering() line 133 in > /data/lmitche1/src/deps/petsc/src/dm/impls/plex/plexreorder.c > [2] DMPlexCreateOrderingClosure_Static() line 41 in > /data/lmitche1/src/deps/petsc/src/dm/impls/plex/plexreorder.c > [2] Petsc has generated inconsistent data > [2] Number of depth 2 faces 33 does not match permuted nubmer 28 > : error code 77 > [0] DMPlexGetOrdering() line 133 in > /data/lmitche1/src/deps/petsc/src/dm/impls/plex/plexreorder.c > [0] DMPlexCreateOrderingClosure_Static() line 41 in > /data/lmitche1/src/deps/petsc/src/dm/impls/plex/plexreorder.c > [0] Petsc has generated inconsistent data > [0] Number of depth 2 faces 33 does not match permuted nubmer 31 > > $ cat > bork.py<<\EOF > from petsc4py import PETSc > import numpy as np > Lx = Ly = 1 > nx = ny = 4 > > xcoords = np.linspace(0.0, Lx, nx + 1, dtype=PETSc.RealType) > ycoords = np.linspace(0.0, Ly, ny + 1, dtype=PETSc.RealType) > coords = np.asarray(np.meshgrid(xcoords, ycoords)).swapaxes(0, > 2).reshape(-1, 2) > > # cell vertices > i, j = np.meshgrid(np.arange(nx, dtype=PETSc.IntType), np.arange(ny, > dtype=PETSc.IntType)) > cells = [i*(ny+1) + j, i*(ny+1) + j+1, (i+1)*(ny+1) + j+1, > (i+1)*(ny+1) + j] > cells = np.asarray(cells, dtype=PETSc.IntType).swapaxes(0, > 2).reshape(-1, 4) > idx = [0, 1, 3, 1, 2, 3] > cells = cells[:, idx].reshape(-1, 3) > > comm = PETSc.COMM_WORLD > if comm.rank == 0: > dm = PETSc.DMPlex().createFromCellList(2, cells, coords, comm=comm) > else: > dm = PETSc.DMPlex().createFromCellList(2, np.zeros((0, 4), > dtype=PETSc.IntType), > np.zeros((0, 2), > dtype=PETSc.RealType), > comm=comm) > > dm.setAdjacencyUseClosure(False) > dm.setAdjacencyUseCone(True) > > dm.distribute(overlap=1) > > dm.getOrdering(PETSc.Mat.OrderingType.RCM) > > dm.view() > EOF > > Am I doing something wrong? Is this not expected to work? > > Cheers, > > Lawrence > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From lawrence.mitchell at imperial.ac.uk Thu May 25 11:27:23 2017 From: lawrence.mitchell at imperial.ac.uk (Lawrence Mitchell) Date: Thu, 25 May 2017 17:27:23 +0100 Subject: [petsc-users] DMPlex distribution with FVM adjacency In-Reply-To: References: Message-ID: <15e465f7-dea1-39c5-7c43-ba447a7a8c09@imperial.ac.uk> On 25/05/17 16:25, Matthew Knepley wrote: > On Thu, May 25, 2017 at 9:27 AM, Lawrence Mitchell > > wrote: > > Dear petsc-users, > > I am trying to distribute a triangle mesh with a cell halo defined by > FVM adjacency (i.e. if I have a facet in the initial (0-overlap) > distribution, I want the cell on the other side of it). > > Reading the documentation, I think I do: > > DMPlexSetAdjacencyUseCone(PETSC_TRUE) > DMPlexSetAdjacencyUseClosure(PETSC_FALSE) > > and then > DMPlexDistribute(..., ovelap=1) > > If I do this for a simple mesh and then try and do anything on it, I > run into all sorts of problems because I have a plex where I have some > facets, but not even one cell in the support of the facet. Is this to > be expected? > > > Hmm. I don't think so. You should have at least one cell in the > support of every facet. > TS ex11 works exactly this way. > > When using that adjacency, the overlap cells you get will not have > anything but the > facet connecting them to that partition. Although, if you have > adjacent cells in that overlap layer, > you can get ghost faces between those. > > With the code below, do you get an interpolated mesh when you create > it there. That call in C > has another argument > > http://www.mcs.anl.gov/petsc/petsc-master/docs/manualpages/DMPLEX/DMPlexCreateFromCellList.html The mesh is interpolated. OK, so let's see if I can understand what the different adjacency relations are: usecone=False, useclosure=False: adj(p) => cone(p) + cone(support(p)) usecone=True, useclosure=False: adj(p) => support(p) + support(cone(p)) usecone=False, useclosure=True adj(p) => closure(star(p)) usecone=True, useclosure=True adj(p) => star(closure(p)) So let's imagine I have a facet f, the adjacent points are the support(cone(f)) so the support of the vertices in 2D, so those are some new facets. So now, following https://arxiv.org/pdf/1506.06194.pdf, I need to complete this new mesh, so I ask for the closure of these new facets. But that might mean I won't ask for cells, right? So I think I would end up with some facets that don't have any support. And empirically I observe that: e.g. the code attached: $ mpiexec -n 3 python bar.py [0] 7 [0] [0] 8 [0] [0] 9 [0 1] [0] 10 [1] [0] 11 [1] [0] 12 [] [1] 10 [0 2] [1] 11 [0 1] [1] 12 [0] [1] 13 [1] [1] 14 [2] [1] 15 [2] [1] 16 [1 3] [1] 17 [3] [1] 18 [3] [2] 7 [0 1] [2] 8 [0] [2] 9 [0] [2] 10 [1] [2] 11 [] [2] 12 [1] What I would like (although I'm not sure if this is supported right now), is the overlap to contain closure(support(facet)) for all shared facets. I think that's equivalent to closure(support(p)) \forall p. That way on any shared facets, I have both cells and their closure. Is that easy to do? 
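As a cross-check, the same inspection the attached script performs (looking at the support of every facet after distribution) reads roughly like this in C; a sketch only, with dm assumed to be the distributed, interpolated DMPlex:

  PetscInt f, fStart, fEnd, suppSize;

  ierr = DMPlexGetHeightStratum(dm, 1, &fStart, &fEnd);CHKERRQ(ierr); /* codimension-1 points, i.e. facets */
  for (f = fStart; f < fEnd; ++f) {
    ierr = DMPlexGetSupportSize(dm, f, &suppSize);CHKERRQ(ierr);
    if (!suppSize) {
      ierr = PetscPrintf(PETSC_COMM_SELF, "facet %D has an empty support\n", f);CHKERRQ(ierr);
    }
  }

Any facet reported by this loop is one of the orphaned facets described above.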
Lawrence import sys, petsc4py petsc4py.init(sys.argv) from petsc4py import PETSc import numpy as np Lx = Ly = 1 nx = 1 ny = 2 xcoords = np.linspace(0.0, Lx, nx + 1, dtype=PETSc.RealType) ycoords = np.linspace(0.0, Ly, ny + 1, dtype=PETSc.RealType) coords = np.asarray(np.meshgrid(xcoords, ycoords)).swapaxes(0, 2).reshape(-1, 2) # cell vertices i, j = np.meshgrid(np.arange(nx, dtype=PETSc.IntType), np.arange(ny, dtype=PETSc.IntType)) cells = [i*(ny+1) + j, i*(ny+1) + j+1, (i+1)*(ny+1) + j+1, (i+1)*(ny+1) + j] cells = np.asarray(cells, dtype=PETSc.IntType).swapaxes(0, 2).reshape(-1, 4) idx = [0, 1, 3, 1, 2, 3] cells = cells[:, idx].reshape(-1, 3) comm = PETSc.COMM_WORLD if comm.rank == 0: dm = PETSc.DMPlex().createFromCellList(2, cells, coords, interpolate=True, comm=comm) else: dm = PETSc.DMPlex().createFromCellList(2, np.zeros((0, cells.shape[1]), dtype=PETSc.IntType), np.zeros((0, 2), dtype=PETSc.RealType), interpolate=True, comm=comm) dm.setAdjacencyUseClosure(False) dm.setAdjacencyUseCone(True) dm.distribute(overlap=1) sf = dm.getPointSF() for p in range(*dm.getDepthStratum(dm.getDepth()-1)): PETSc.Sys.syncPrint("[%d] %d %s" % (comm.rank, p, dm.getSupport(p))) PETSc.Sys.syncFlush() -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 473 bytes Desc: OpenPGP digital signature URL: From knepley at gmail.com Thu May 25 12:05:53 2017 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 25 May 2017 12:05:53 -0500 Subject: [petsc-users] DMPlex distribution with FVM adjacency In-Reply-To: <15e465f7-dea1-39c5-7c43-ba447a7a8c09@imperial.ac.uk> References: <15e465f7-dea1-39c5-7c43-ba447a7a8c09@imperial.ac.uk> Message-ID: On Thu, May 25, 2017 at 11:27 AM, Lawrence Mitchell < lawrence.mitchell at imperial.ac.uk> wrote: > On 25/05/17 16:25, Matthew Knepley wrote: > > On Thu, May 25, 2017 at 9:27 AM, Lawrence Mitchell > > > > wrote: > > > > Dear petsc-users, > > > > I am trying to distribute a triangle mesh with a cell halo defined by > > FVM adjacency (i.e. if I have a facet in the initial (0-overlap) > > distribution, I want the cell on the other side of it). > > > > Reading the documentation, I think I do: > > > > DMPlexSetAdjacencyUseCone(PETSC_TRUE) > > DMPlexSetAdjacencyUseClosure(PETSC_FALSE) > > > > and then > > DMPlexDistribute(..., ovelap=1) > > > > If I do this for a simple mesh and then try and do anything on it, I > > run into all sorts of problems because I have a plex where I have > some > > facets, but not even one cell in the support of the facet. Is this > to > > be expected? > > > > > > Hmm. I don't think so. You should have at least one cell in the > > support of every facet. > > TS ex11 works exactly this way. > > > > When using that adjacency, the overlap cells you get will not have > > anything but the > > facet connecting them to that partition. Although, if you have > > adjacent cells in that overlap layer, > > you can get ghost faces between those. > > > > With the code below, do you get an interpolated mesh when you create > > it there. That call in C > > has another argument > > > > http://www.mcs.anl.gov/petsc/petsc-master/docs/manualpages/DMPLEX/ > DMPlexCreateFromCellList.html > > The mesh is interpolated. 
> > > OK, so let's see if I can understand what the different adjacency > relations are: > > usecone=False, useclosure=False: > > adj(p) => cone(p) + cone(support(p)) > > usecone=True, useclosure=False: > > adj(p) => support(p) + support(cone(p)) > > usecone=False, useclosure=True > > adj(p) => closure(star(p)) > > usecone=True, useclosure=True > > adj(p) => star(closure(p)) > > So let's imagine I have a facet f, the adjacent points are the > support(cone(f)) so the support of the vertices in 2D, so those are > some new facets. > If you want that, is there a reason you cannot use the FEM style FALSE+TRUE? If you already want the closure, usually the star is not really adding anything new. Matt > So now, following https://arxiv.org/pdf/1506.06194.pdf, I need to > complete this new mesh, so I ask for the closure of these new facets. > But that might mean I won't ask for cells, right? So I think I would > end up with some facets that don't have any support. And empirically > I observe that: > > e.g. the code attached: > > $ mpiexec -n 3 python bar.py > [0] 7 [0] > [0] 8 [0] > [0] 9 [0 1] > [0] 10 [1] > [0] 11 [1] > [0] 12 [] > [1] 10 [0 2] > [1] 11 [0 1] > [1] 12 [0] > [1] 13 [1] > [1] 14 [2] > [1] 15 [2] > [1] 16 [1 3] > [1] 17 [3] > [1] 18 [3] > [2] 7 [0 1] > [2] 8 [0] > [2] 9 [0] > [2] 10 [1] > [2] 11 [] > [2] 12 [1] > > > What I would like (although I'm not sure if this is supported right > now), is the overlap to contain closure(support(facet)) for all shared > facets. I think that's equivalent to closure(support(p)) \forall p. > > That way on any shared facets, I have both cells and their closure. > > Is that easy to do? > > Lawrence > > import sys, petsc4py > petsc4py.init(sys.argv) > from petsc4py import PETSc > import numpy as np > Lx = Ly = 1 > nx = 1 > ny = 2 > > xcoords = np.linspace(0.0, Lx, nx + 1, dtype=PETSc.RealType) > ycoords = np.linspace(0.0, Ly, ny + 1, dtype=PETSc.RealType) > coords = np.asarray(np.meshgrid(xcoords, ycoords)).swapaxes(0, > 2).reshape(-1, 2) > > # cell vertices > i, j = np.meshgrid(np.arange(nx, dtype=PETSc.IntType), np.arange(ny, > dtype=PETSc.IntType)) > cells = [i*(ny+1) + j, i*(ny+1) + j+1, (i+1)*(ny+1) + j+1, > (i+1)*(ny+1) + j] > cells = np.asarray(cells, dtype=PETSc.IntType).swapaxes(0, > 2).reshape(-1, 4) > idx = [0, 1, 3, 1, 2, 3] > cells = cells[:, idx].reshape(-1, 3) > > comm = PETSc.COMM_WORLD > if comm.rank == 0: > dm = PETSc.DMPlex().createFromCellList(2, cells, coords, > interpolate=True, comm=comm) > else: > dm = PETSc.DMPlex().createFromCellList(2, np.zeros((0, > cells.shape[1]), dtype=PETSc.IntType), > np.zeros((0, 2), > dtype=PETSc.RealType), > interpolate=True, > comm=comm) > > dm.setAdjacencyUseClosure(False) > dm.setAdjacencyUseCone(True) > > dm.distribute(overlap=1) > sf = dm.getPointSF() > > for p in range(*dm.getDepthStratum(dm.getDepth()-1)): > PETSc.Sys.syncPrint("[%d] %d %s" % (comm.rank, p, dm.getSupport(p))) > > PETSc.Sys.syncFlush() > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From lawrence.mitchell at imperial.ac.uk Thu May 25 13:10:59 2017 From: lawrence.mitchell at imperial.ac.uk (Lawrence Mitchell) Date: Thu, 25 May 2017 19:10:59 +0100 Subject: [petsc-users] DMPlex distribution with FVM adjacency In-Reply-To: References: <15e465f7-dea1-39c5-7c43-ba447a7a8c09@imperial.ac.uk> Message-ID: <54529998-4688-4774-845B-1FDF67A8C20B@imperial.ac.uk> > On 25 May 2017, at 18:05, Matthew Knepley wrote: > > If you want that, is there a reason you cannot use the FEM style FALSE+TRUE? > If you already want the closure, usually the star is not really adding anything new. Ok, let me clarify. Given shared facets, I'd like closure(support(facet)) this is a subset of the fem adjacency. "Add in the cell and its closure from the remote rank". This doesn't include remote cells I can only see through vertices. Without sending data evaluated at facet quad points, I think this is the adjacency I need to compute facet integrals: all the dofs in closure(support(facet)). I thought this was what the fv adjacency was, but I think I was mistaken. That is support(cone(p)) for all p that I have. Now I do a rendezvous to gather everything in the closure of these new points. But I think that means I still don't have some cells? Make sense? Lawrence -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu May 25 13:23:23 2017 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 25 May 2017 13:23:23 -0500 Subject: [petsc-users] DMPlex distribution with FVM adjacency In-Reply-To: <54529998-4688-4774-845B-1FDF67A8C20B@imperial.ac.uk> References: <15e465f7-dea1-39c5-7c43-ba447a7a8c09@imperial.ac.uk> <54529998-4688-4774-845B-1FDF67A8C20B@imperial.ac.uk> Message-ID: On Thu, May 25, 2017 at 1:10 PM, Lawrence Mitchell < lawrence.mitchell at imperial.ac.uk> wrote: > > > On 25 May 2017, at 18:05, Matthew Knepley wrote: > > If you want that, is there a reason you cannot use the FEM style > FALSE+TRUE? > If you already want the closure, usually the star is not really adding > anything new. > > > Ok, let me clarify. > > Given shared facets, I'd like closure(support(facet)) this is a subset of > the fem adjacency. "Add in the cell and its closure from the remote rank". > This doesn't include remote cells I can only see through vertices. Without > sending data evaluated at facet quad points, I think this is the adjacency > I need to compute facet integrals: all the dofs in closure(support(facet)). > This seems incoherent to me. For FV, dofs reside in the cells, so you should only need the cell for adjacency. If you need dofs defined at vertices, then you should also need cells which are only attached by vertices. How could this scheme be consistent without this? Thanks, Matt > I thought this was what the fv adjacency was, but I think I was mistaken. > That is support(cone(p)) for all p that I have. > Now I do a rendezvous to gather everything in the closure of these new > points. But I think that means I still don't have some cells? > > Make sense? > > Lawrence > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From lawrence.mitchell at imperial.ac.uk Thu May 25 13:38:51 2017 From: lawrence.mitchell at imperial.ac.uk (Lawrence Mitchell) Date: Thu, 25 May 2017 19:38:51 +0100 Subject: [petsc-users] DMPlex distribution with FVM adjacency In-Reply-To: References: <15e465f7-dea1-39c5-7c43-ba447a7a8c09@imperial.ac.uk> <54529998-4688-4774-845B-1FDF67A8C20B@imperial.ac.uk> Message-ID: <0BEB36D4-C35B-48E4-8F66-8EE8D38E08B6@imperial.ac.uk> > On 25 May 2017, at 19:23, Matthew Knepley wrote: > > Ok, let me clarify. > > Given shared facets, I'd like closure(support(facet)) this is a subset of the fem adjacency. "Add in the cell and its closure from the remote rank". This doesn't include remote cells I can only see through vertices. Without sending data evaluated at facet quad points, I think this is the adjacency I need to compute facet integrals: all the dofs in closure(support(facet)). > > This seems incoherent to me. For FV, dofs reside in the cells, so you should only need the cell for adjacency. If you > need dofs defined at vertices, then you should also need cells which are only attached by vertices. How could this > scheme be consistent without this? OK, so what I think is this: I need to compute integrals over cells and facets. So I do: GlobalToLocal(INSERT_VALUES) ComputeIntegralsOnOwnedEntities LocalToGlobal(ADD_VALUES) That way, an integration is performed on every entity exactly once, and LocalToGlobal ensures that I get a consistent assembled Vec. OK, so if I only compute cell integrals, then the zero overlap distribution with all the points in the closure of the cell (including some remote points) is sufficient. If I compute facet integrals, I need both cells (and their closure) in the support of the facet. Again, each facet is only integrated by one process, and the LocalToGlobal adds in contributions to remote dofs. This is the same as cell integrals, just I need a bit more data, no? The other option is to notice that what I actually need when I compute a facet integral is the test function and/or any coefficients evaluated at quadrature points on the facet. So if I don't want the extra overlapped halo, then what I need to do is for the remote process to evaluate any coefficients at the quad points, then send the evaluated data to the facet owner. Now the facet owner can compute the integral, and again LocalToGlobal adds in contributions to remote dofs. Lawrence From knepley at gmail.com Thu May 25 13:46:01 2017 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 25 May 2017 13:46:01 -0500 Subject: [petsc-users] DMPlex distribution with FVM adjacency In-Reply-To: <0BEB36D4-C35B-48E4-8F66-8EE8D38E08B6@imperial.ac.uk> References: <15e465f7-dea1-39c5-7c43-ba447a7a8c09@imperial.ac.uk> <54529998-4688-4774-845B-1FDF67A8C20B@imperial.ac.uk> <0BEB36D4-C35B-48E4-8F66-8EE8D38E08B6@imperial.ac.uk> Message-ID: On Thu, May 25, 2017 at 1:38 PM, Lawrence Mitchell < lawrence.mitchell at imperial.ac.uk> wrote: > > > On 25 May 2017, at 19:23, Matthew Knepley wrote: > > > > Ok, let me clarify. > > > > Given shared facets, I'd like closure(support(facet)) this is a subset > of the fem adjacency. "Add in the cell and its closure from the remote > rank". This doesn't include remote cells I can only see through vertices. > Without sending data evaluated at facet quad points, I think this is the > adjacency I need to compute facet integrals: all the dofs in > closure(support(facet)). > > > > This seems incoherent to me. For FV, dofs reside in the cells, so you > should only need the cell for adjacency. 
If you > > need dofs defined at vertices, then you should also need cells which are > only attached by vertices. How could this > > scheme be consistent without this? > > OK, so what I think is this: > > I need to compute integrals over cells and facets. > Sounds like DG. I will get out my dead chicken for the incantation. > So I do: > > GlobalToLocal(INSERT_VALUES) > ComputeIntegralsOnOwnedEntities > LocalToGlobal(ADD_VALUES) > > That way, an integration is performed on every entity exactly once, and > LocalToGlobal ensures that I get a consistent assembled Vec. > > OK, so if I only compute cell integrals, then the zero overlap > distribution with all the points in the closure of the cell (including some > remote points) is sufficient. > Yep. > If I compute facet integrals, I need both cells (and their closure) in the > support of the facet. Again, each facet is only integrated by one process, > and the LocalToGlobal adds in contributions to remote dofs. This is the > same as cell integrals, just I need a bit more data, no? > > The other option is to notice that what I actually need when I compute a > facet integral is the test function and/or any coefficients evaluated at > quadrature points on the facet. So if I don't want the extra overlapped > halo, then what I need to do is for the remote process to evaluate any > coefficients at the quad points, then send the evaluated data to the facet > owner. Now the facet owner can compute the integral, and again > LocalToGlobal adds in contributions to remote dofs. That seems baroque. So this is just another adjacency pattern. You should be able to easily define it, or if you are a patient person, wait for me to do it. Its here https://bitbucket.org/petsc/petsc/src/01c3230e040078628f5e559992965c1c4b6f473d/src/dm/impls/plex/plexdistribute.c?at=master&fileviewer=file-view-default#plexdistribute.c-239 I am more than willing to make this overridable by the user through function composition or another mechanism. Thanks, Matt > > Lawrence -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From lawrence.mitchell at imperial.ac.uk Thu May 25 13:58:02 2017 From: lawrence.mitchell at imperial.ac.uk (Lawrence Mitchell) Date: Thu, 25 May 2017 19:58:02 +0100 Subject: [petsc-users] DMPlex distribution with FVM adjacency In-Reply-To: References: <15e465f7-dea1-39c5-7c43-ba447a7a8c09@imperial.ac.uk> <54529998-4688-4774-845B-1FDF67A8C20B@imperial.ac.uk> <0BEB36D4-C35B-48E4-8F66-8EE8D38E08B6@imperial.ac.uk> Message-ID: > On 25 May 2017, at 19:46, Matthew Knepley wrote: > > Sounds like DG. I will get out my dead chicken for the incantation Actually no! Mixed H(div)-L2 for Stokes. Which has facet integrals for partially discontinuous fields. If you do redundant compute for such terms, you need a depth-2 FEM adjacency, which is just grim. Equally we have some strange users who have jump terms in CG formulations. 
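For concreteness, the owned-entity assembly loop sketched earlier in this thread (GlobalToLocal with INSERT_VALUES, integrate owned cells and facets, LocalToGlobal with ADD_VALUES) maps onto the DM interface roughly as follows. A sketch only: Xglobal and Fglobal are assumed global vectors, and ComputeIntegralsOnOwnedEntities stands for the user's own integration routine from the pseudocode above.

  Vec Xlocal, Flocal;

  ierr = DMGetLocalVector(dm, &Xlocal);CHKERRQ(ierr);
  ierr = DMGetLocalVector(dm, &Flocal);CHKERRQ(ierr);
  ierr = DMGlobalToLocalBegin(dm, Xglobal, INSERT_VALUES, Xlocal);CHKERRQ(ierr);
  ierr = DMGlobalToLocalEnd(dm, Xglobal, INSERT_VALUES, Xlocal);CHKERRQ(ierr);
  ierr = VecSet(Flocal, 0.0);CHKERRQ(ierr);

  /* integrate over owned cells and owned facets only, accumulating into Flocal */
  ierr = ComputeIntegralsOnOwnedEntities(dm, Xlocal, Flocal);CHKERRQ(ierr);

  ierr = VecSet(Fglobal, 0.0);CHKERRQ(ierr);
  ierr = DMLocalToGlobalBegin(dm, Flocal, ADD_VALUES, Fglobal);CHKERRQ(ierr);
  ierr = DMLocalToGlobalEnd(dm, Flocal, ADD_VALUES, Fglobal);CHKERRQ(ierr);
  ierr = DMRestoreLocalVector(dm, &Xlocal);CHKERRQ(ierr);
  ierr = DMRestoreLocalVector(dm, &Flocal);CHKERRQ(ierr);

The ADD_VALUES in the LocalToGlobal step is what lets each entity be integrated exactly once while remote contributions still land on the owning process.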
Lawrence From knepley at gmail.com Thu May 25 14:03:29 2017 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 25 May 2017 14:03:29 -0500 Subject: [petsc-users] DMPlex distribution with FVM adjacency In-Reply-To: References: <15e465f7-dea1-39c5-7c43-ba447a7a8c09@imperial.ac.uk> <54529998-4688-4774-845B-1FDF67A8C20B@imperial.ac.uk> <0BEB36D4-C35B-48E4-8F66-8EE8D38E08B6@imperial.ac.uk> Message-ID: On Thu, May 25, 2017 at 1:58 PM, Lawrence Mitchell < lawrence.mitchell at imperial.ac.uk> wrote: > > > On 25 May 2017, at 19:46, Matthew Knepley wrote: > > > > Sounds like DG. I will get out my dead chicken for the incantation > > Actually no! Mixed H(div)-L2 for Stokes. Which has facet integrals for > partially discontinuous fields. If you do redundant compute for such > terms, you need a depth-2 FEM adjacency, which is just grim. Equally we > have some strange users who have jump terms in CG formulations. Hmm, I thought I made adjacency per field. I have to look. That way, no problem with the Stokes example. DG is still weird. Matt > > Lawrence -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From lawrence.mitchell at imperial.ac.uk Thu May 25 14:22:15 2017 From: lawrence.mitchell at imperial.ac.uk (Lawrence Mitchell) Date: Thu, 25 May 2017 20:22:15 +0100 Subject: [petsc-users] DMPlex distribution with FVM adjacency In-Reply-To: References: <15e465f7-dea1-39c5-7c43-ba447a7a8c09@imperial.ac.uk> <54529998-4688-4774-845B-1FDF67A8C20B@imperial.ac.uk> <0BEB36D4-C35B-48E4-8F66-8EE8D38E08B6@imperial.ac.uk> Message-ID: <6C66D04E-72AD-445B-9DE6-BB0961B9F622@imperial.ac.uk> > On 25 May 2017, at 20:03, Matthew Knepley wrote: > > > Hmm, I thought I made adjacency per field. I have to look. That way, no problem with the Stokes example. DG is still weird. You might, we don't right now. We just make the topological adjacency that is "large enough", and then make fields on that. > > That seems baroque. So this is just another adjacency pattern. You should be able to easily define it, or if you are a patient person, > wait for me to do it. Its here > > https://bitbucket.org/petsc/petsc/src/01c3230e040078628f5e559992965c1c4b6f473d/src/dm/impls/plex/plexdistribute.c?at=master&fileviewer=file-view-default#plexdistribute.c-239 > > I am more than willing to make this overridable by the user through function composition or another mechanism. Hmm, that naive thing of just modifying the XXX_Support_Internal to compute with DMPlexGetTransitiveClosure rather than DMPlexGetCone didn't do what I expected, but I don't understand the way this bootstrapping is done very well. 
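For the adjacency being asked for, closure(support(f)) for a shared facet f, the gather can be written directly against the plex API. This is only an illustration of the desired relation, not the plexdistribute.c internals; dm and the facet f are assumed.

  const PetscInt *supp;
  PetscInt        suppSize, s, c, clSize, *closure = NULL;

  ierr = DMPlexGetSupportSize(dm, f, &suppSize);CHKERRQ(ierr);
  ierr = DMPlexGetSupport(dm, f, &supp);CHKERRQ(ierr);
  for (s = 0; s < suppSize; ++s) {                 /* the cell(s) on either side of the facet */
    ierr = DMPlexGetTransitiveClosure(dm, supp[s], PETSC_TRUE, &clSize, &closure);CHKERRQ(ierr);
    for (c = 0; c < clSize*2; c += 2) {
      /* closure[] holds (point, orientation) pairs; closure[c] is a point adjacent to f */
      /* ... add closure[c] to the adjacency set for f ... */
    }
    ierr = DMPlexRestoreTransitiveClosure(dm, supp[s], PETSC_TRUE, &clSize, &closure);CHKERRQ(ierr);
  }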
Cheers, Lawrence From knepley at gmail.com Thu May 25 15:00:19 2017 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 25 May 2017 15:00:19 -0500 Subject: [petsc-users] DMPlex distribution with FVM adjacency In-Reply-To: <6C66D04E-72AD-445B-9DE6-BB0961B9F622@imperial.ac.uk> References: <15e465f7-dea1-39c5-7c43-ba447a7a8c09@imperial.ac.uk> <54529998-4688-4774-845B-1FDF67A8C20B@imperial.ac.uk> <0BEB36D4-C35B-48E4-8F66-8EE8D38E08B6@imperial.ac.uk> <6C66D04E-72AD-445B-9DE6-BB0961B9F622@imperial.ac.uk> Message-ID: On Thu, May 25, 2017 at 2:22 PM, Lawrence Mitchell < lawrence.mitchell at imperial.ac.uk> wrote: > > > On 25 May 2017, at 20:03, Matthew Knepley wrote: > > > > > > Hmm, I thought I made adjacency per field. I have to look. That way, no > problem with the Stokes example. DG is still weird. > > You might, we don't right now. We just make the topological adjacency > that is "large enough", and then make fields on that. > > > > > That seems baroque. So this is just another adjacency pattern. You > should be able to easily define it, or if you are a patient person, > > wait for me to do it. Its here > > > > https://bitbucket.org/petsc/petsc/src/01c3230e040078628f5e559992965c > 1c4b6f473d/src/dm/impls/plex/plexdistribute.c?at=master& > fileviewer=file-view-default#plexdistribute.c-239 > > > > I am more than willing to make this overridable by the user through > function composition or another mechanism. > > Hmm, that naive thing of just modifying the XXX_Support_Internal to > compute with DMPlexGetTransitiveClosure rather than DMPlexGetCone didn't do > what I expected, but I don't understand the way this bootstrapping is done > very well. > It should do the right thing. Notice that you have to be careful about the arrays that you use since I reuse them for efficiency here. What is going wrong? Matt > Cheers, > > Lawrence > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From danyang.su at gmail.com Thu May 25 15:18:39 2017 From: danyang.su at gmail.com (Danyang Su) Date: Thu, 25 May 2017 13:18:39 -0700 Subject: [petsc-users] PCFactorSetShiftType does not work in code but -pc_factor_set_shift_type works In-Reply-To: References: <93217794-9c63-fd52-ab36-4174de8cb9c8@gmail.com> <8634589f-d1a5-bf4f-b158-3ddb5a18026b@gmail.com> Message-ID: <02bb196f-6243-0a7f-ec1b-5ebb202e4539@gmail.com> Hi Hong, It works like a charm. I really appreciate your help. Regards, Danyang On 17-05-25 07:49 AM, Hong wrote: > Danyang: > You must access inner pc, then set shift. 
See > petsc/src/ksp/ksp/examples/tutorials/ex7.c > > For example, I add following to > petsc/src/ksp/ksp/examples/tutorials/ex2.c, line 191: > PetscBool isbjacobi; > PC pc; > ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr); > ierr = > PetscObjectTypeCompare((PetscObject)pc,PCBJACOBI,&isbjacobi);CHKERRQ(ierr); > if (isbjacobi) { > PetscInt nlocal; > KSP *subksp; > PC subpc; > > ierr = KSPSetUp(ksp);CHKERRQ(ierr); > ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr); > > /* Extract the array of KSP contexts for the local blocks */ > ierr = PCBJacobiGetSubKSP(pc,&nlocal,NULL,&subksp);CHKERRQ(ierr); > printf("isbjacobi, nlocal %D, set option to subpc...\n",nlocal); > for (i=0; i ierr = KSPGetPC(subksp[i],&subpc);CHKERRQ(ierr); > ierr = PCFactorSetShiftType(subpc,MAT_SHIFT_NONZERO);CHKERRQ(ierr); > } > } > > > Dear Hong and Barry, > > I have implemented this option in the code, as we also need to use > configuration from file for convenience. When I run the code using > options, it works fine, however, when I run the code using > configuration file, it does not work. The code has two set of > equations, flow and reactive, with prefix been set to "flow_" and > "react_". When I run the code using > > mpiexec -n 4 ../executable -flow_sub_pc_factor_shift_type nonzero > -react_sub_pc_factor_shift_type nonzero > > it works. However, if I run using > > mpiexec -n 4 ../executable > > and let the executable file read the options from file, it just > does not work at "call > PCFactorSetShiftType(pc_flow,MAT_SHIFT_NONZERO, ierr) or none, > positive_definite ...". Do I miss something here? > > Below is the pseudo code I have used for flow equations, similar > for reactive equations. > > call MatCreateAIJ(Petsc_Comm_World,nndof,nndof,nngbldof, & > nngbldof,d_nz,PETSC_NULL_INTEGER,o_nz, & > PETSC_NULL_INTEGER,a_flow,ierr) > CHKERRQ(ierr) > > call MatSetFromOptions(a_flow,ierr) > CHKERRQ(ierr) > > call KSPCreate(Petsc_Comm_World, ksp_flow, ierr) > CHKERRQ(ierr) > > call KSPAppendOptionsPrefix(ksp_flow,"flow_",ierr) > CHKERRQ(ierr) > > call KSPSetInitialGuessNonzero(ksp_flow, & > b_initial_guess_nonzero_flow, ierr) > CHKERRQ(ierr) > > call KSPSetInitialGuessNonzero(ksp_flow, & > b_initial_guess_nonzero_flow, ierr) > CHKERRQ(ierr) > > call KSPSetDM(ksp_flow,dmda_flow%da,ierr) > CHKERRQ(ierr) > call KSPSetDMActive(ksp_flow,PETSC_FALSE,ierr) > CHKERRQ(ierr) > > !!!!*********CHECK IF READ OPTION FROM FILE*********!!!! > if (read_option_from_file) then > > call KSPSetType(ksp_flow, KSPGMRES, ierr) !or KSPBCGS or > others... > CHKERRQ(ierr) > > call KSPGetPC(ksp_flow, pc_flow, ierr) > CHKERRQ(ierr) > > call PCSetType(pc_flow,PCBJACOBI, ierr) !or PCILU > or PCJACOBI or PCHYPRE ... > CHKERRQ(ierr) > > call PCFactorSetShiftType(pc_flow,MAT_SHIFT_NONZERO, > ierr) or none, positive_definite ... > CHKERRQ(ierr) > > end if > > call > PCFactorGetMatSolverPackage(pc_flow,solver_pkg_flow,ierr) > CHKERRQ(ierr) > > call compute_jacobian(rank,dmda_flow%da, & > a_flow,a_in,ia_in,ja_in,nngl_in, & > row_idx_l2pg,col_idx_l2pg, & > b_non_interlaced) > call KSPSetFromOptions(ksp_flow,ierr) > CHKERRQ(ierr) > > call KSPSetUp(ksp_flow,ierr) > CHKERRQ(ierr) > > call KSPSetUpOnBlocks(ksp_flow,ierr) > CHKERRQ(ierr) > > call KSPSolve(ksp_flow,b_flow,x_flow,ierr) > CHKERRQ(ierr) > > > Thanks and Regards, > > Danyang > > On 17-05-24 06:32 PM, Hong wrote: >> Remove your option '-vecload_block_size 10'. >> Hong >> >> On Wed, May 24, 2017 at 3:06 PM, Danyang Su > > wrote: >> >> Dear Hong, >> >> I just tested with different number of processors for the >> same matrix. 
It sometimes got "ERROR: Arguments are >> incompatible" for different number of processors. It works >> fine using 4, 8, or 24 processors, but failed with "ERROR: >> Arguments are incompatible" using 16 or 48 processors. The >> error information is attached. I tested this on my local >> computer with 6 cores 12 threads. Any suggestion on this? >> >> Thanks, >> >> Danyang >> >> >> On 17-05-24 12:28 PM, Danyang Su wrote: >>> >>> Hi Hong, >>> >>> Awesome. Thanks for testing the case. I will try your >>> options for the code and get back to you later. >>> >>> Regards, >>> >>> Danyang >>> >>> >>> On 17-05-24 12:21 PM, Hong wrote: >>>> Danyang : >>>> I tested your data. >>>> Your matrices encountered zero pivots, e.g. >>>> petsc/src/ksp/ksp/examples/tutorials (master) >>>> $ mpiexec -n 24 ./ex10 -f0 a_react_in_2.bin -rhs >>>> b_react_in_2.bin -ksp_monitor -ksp_error_if_not_converged >>>> >>>> [15]PETSC ERROR: Zero pivot in LU factorization: >>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#zeropivot >>>> >>>> [15]PETSC ERROR: Zero pivot row 1249 value 2.05808e-14 >>>> tolerance 2.22045e-14 >>>> ... >>>> >>>> Adding option '-sub_pc_factor_shift_type nonzero', I got >>>> mpiexec -n 24 ./ex10 -f0 a_react_in_2.bin -rhs >>>> b_react_in_2.bin -ksp_monitor -ksp_error_if_not_converged >>>> -sub_pc_factor_shift_type nonzero -mat_view ascii::ascii_info >>>> >>>> Mat Object: 24 MPI processes >>>> type: mpiaij >>>> rows=450000, cols=450000 >>>> total: nonzeros=6991400, allocated nonzeros=6991400 >>>> total number of mallocs used during MatSetValues calls =0 >>>> not using I-node (on process 0) routines >>>> 0 KSP Residual norm 5.849777711755e+01 >>>> 1 KSP Residual norm 6.824179430230e-01 >>>> 2 KSP Residual norm 3.994483555787e-02 >>>> 3 KSP Residual norm 6.085841461433e-03 >>>> 4 KSP Residual norm 8.876162583511e-04 >>>> 5 KSP Residual norm 9.407780665278e-05 >>>> Number of iterations = 5 >>>> Residual norm 0.00542891 >>>> >>>> Hong >>>> >>>> Hi Matt, >>>> >>>> Yes. The matrix is 450000x450000 sparse. The hypre >>>> takes hundreds of iterates, not for all but in most of >>>> the timesteps. The matrix is not well conditioned, with >>>> nonzero entries range from 1.0e-29 to 1.0e2. I also >>>> made double check if there is anything wrong in the >>>> parallel version, however, the matrix is the same with >>>> sequential version except some round error which is >>>> relatively very small. Usually for those not well >>>> conditioned matrix, direct solver should be faster than >>>> iterative solver, right? But when I use the sequential >>>> iterative solver with ILU prec developed almost 20 >>>> years go by others, the solver converge fast with >>>> appropriate factorization level. In other words, when I >>>> use 24 processor using hypre, the speed is almost the >>>> same as as the old sequential iterative solver using 1 >>>> processor. >>>> >>>> I use most of the default configuration for the general >>>> case with pretty good speedup. And I am not sure if I >>>> miss something for this problem. >>>> >>>> Thanks, >>>> >>>> Danyang >>>> >>>> >>>> On 17-05-24 11:12 AM, Matthew Knepley wrote: >>>>> On Wed, May 24, 2017 at 12:50 PM, Danyang Su >>>>> > >>>>> wrote: >>>>> >>>>> Hi Matthew and Barry, >>>>> >>>>> Thanks for the quick response. >>>>> >>>>> I also tried superlu and mumps, both work but it >>>>> is about four times slower than ILU(dt) prec >>>>> through hypre, with 24 processors I have tested. >>>>> >>>>> You mean the total time is 4x? And you are taking >>>>> hundreds of iterates? 
That seems hard to believe, >>>>> unless you are dropping >>>>> a huge number of elements. >>>>> >>>>> When I look into the convergence information, the >>>>> method using ILU(dt) still takes 200 to 3000 >>>>> linear iterations for each newton iteration. One >>>>> reason is this equation is hard to solve. As for >>>>> the general cases, the same method works awesome >>>>> and get very good speedup. >>>>> >>>>> I do not understand what you mean here. >>>>> >>>>> I also doubt if I use hypre correctly for this >>>>> case. Is there anyway to check this problem, or is >>>>> it possible to increase the factorization level >>>>> through hypre? >>>>> >>>>> I don't know. >>>>> >>>>> Matt >>>>> >>>>> Thanks, >>>>> >>>>> Danyang >>>>> >>>>> >>>>> On 17-05-24 04:59 AM, Matthew Knepley wrote: >>>>>> On Wed, May 24, 2017 at 2:21 AM, Danyang Su >>>>>> >>>>> > wrote: >>>>>> >>>>>> Dear All, >>>>>> >>>>>> I use PCFactorSetLevels for ILU and >>>>>> PCFactorSetFill for other preconditioning in >>>>>> my code to help solve the problems that the >>>>>> default option is hard to solve. However, I >>>>>> found the latter one, PCFactorSetFill does >>>>>> not take effect for my problem. The matrices >>>>>> and rhs as well as the solutions are attached >>>>>> from the link below. I obtain the solution >>>>>> using hypre preconditioner and it takes 7 and >>>>>> 38 iterations for matrix 1 and matrix 2. >>>>>> However, if I use other preconditioner, the >>>>>> solver just failed at the first matrix. I >>>>>> have tested this matrix using the native >>>>>> sequential solver (not PETSc) with ILU >>>>>> preconditioning. If I set the incomplete >>>>>> factorization level to 0, this sequential >>>>>> solver will take more than 100 iterations. If >>>>>> I increase the factorization level to 1 or >>>>>> more, it just takes several iterations. This >>>>>> remind me that the PC factor for this >>>>>> matrices should be increased. However, when I >>>>>> tried it in PETSc, it just does not work. >>>>>> >>>>>> Matrix and rhs can be obtained from the link >>>>>> below. >>>>>> >>>>>> https://eilinator.eos.ubc.ca:8443/index.php/s/CalUcq9CMeblk4R >>>>>> >>>>>> >>>>>> Would anyone help to check if you can make >>>>>> this work by increasing the PC factor level >>>>>> or fill? >>>>>> >>>>>> >>>>>> We have ILU(k) supported in serial. However >>>>>> ILU(dt) which takes a tolerance only works >>>>>> through Hypre >>>>>> >>>>>> http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html >>>>>> >>>>>> >>>>>> I recommend you try SuperLU or MUMPS, which can >>>>>> both be downloaded automatically by configure, and >>>>>> do a full sparse LU. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> Thanks and regards, >>>>>> >>>>>> Danyang >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before >>>>>> they begin their experiments is infinitely more >>>>>> interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> http://www.caam.rice.edu/~mk51/ >>>>>> >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they >>>>> begin their experiments is infinitely more interesting >>>>> than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> http://www.caam.rice.edu/~mk51/ >>>>> >>>> >>>> >>> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
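For readers hitting the same zero-pivot problem, a minimal C sketch of the fix discussed in this thread -- raising the incomplete-factorization level and adding a shift on each block Jacobi sub-PC, roughly the programmatic analogue of -sub_pc_factor_levels 1 -sub_pc_factor_shift_type nonzero -- could look as follows. The function name and the level value 1 are illustrative, and the outer KSP is assumed to already have its operators set.

#include <petscksp.h>

PetscErrorCode SetSubFactorOptions(KSP ksp)
{
  PC             pc,subpc;
  PetscBool      isbjacobi;
  PetscInt       i,nlocal;
  KSP            *subksp;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  ierr = KSPSetUp(ksp);CHKERRQ(ierr);            /* sub-KSPs exist only after setup */
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PetscObjectTypeCompare((PetscObject)pc,PCBJACOBI,&isbjacobi);CHKERRQ(ierr);
  if (isbjacobi) {
    ierr = PCBJacobiGetSubKSP(pc,&nlocal,NULL,&subksp);CHKERRQ(ierr);
    for (i=0; i<nlocal; i++) {                   /* each local block uses ILU by default */
      ierr = KSPGetPC(subksp[i],&subpc);CHKERRQ(ierr);
      ierr = PCFactorSetLevels(subpc,1);CHKERRQ(ierr);
      ierr = PCFactorSetShiftType(subpc,MAT_SHIFT_NONZERO);CHKERRQ(ierr);
    }
  }
  PetscFunctionReturn(0);
}

The same settings should also be reachable purely through the options database as -sub_pc_factor_levels 1 -sub_pc_factor_shift_type nonzero, with the flow_/react_ prefixes prepended when an options prefix is in use.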
URL: From jed at jedbrown.org Thu May 25 23:26:46 2017 From: jed at jedbrown.org (Jed Brown) Date: Thu, 25 May 2017 22:26:46 -0600 Subject: [petsc-users] PETSc User Meeting 2017, June 14-16 in Boulder, Colorado In-Reply-To: <87y3wbtk1i.fsf@jedbrown.org> References: <87y3wbtk1i.fsf@jedbrown.org> Message-ID: <87shjsxmyh.fsf@jedbrown.org> The program is up on the website: https://www.mcs.anl.gov/petsc/meetings/2017/ If you haven't registered yet, we can still accommodate you, but please register soon. If you haven't booked lodging, please do that soon -- the on-campus lodging option will close on *Tuesday, May 30*. https://confreg.colorado.edu/CSM2017 We are looking forward to seeing you in Boulder! Jed Brown writes: > We'd like to invite you to join us at the 2017 PETSc User Meeting held > at the University of Colorado Boulder on June 14-16, 2017. > > http://www.mcs.anl.gov/petsc/meetings/2017/ > > The first day consists of tutorials on various aspects and features of > PETSc. The second and third days will be devoted to exchange, > discussions, and a refinement of strategies for the future with our > users. We encourage you to present work illustrating your own use of > PETSc, for example in applications or in libraries built on top of > PETSc. > > Registration for the PETSc User Meeting 2017 is free for students and > $75 for non-students. We can host a maximum of 150 participants, so > register soon (and by May 15). > > http://www.eventzilla.net/web/e/petsc-user-meeting-2017-2138890185 > > We are also offering low-cost lodging on campus. A lodging registration > site will be available soon and announced here and on the website. > > Thanks to the generosity of Intel, we will be able to offer a limited > number of student travel grants. We are also soliciting additional > sponsors -- please contact us if you are interested. > > > We are looking forward to seeing you in Boulder! > > Please contact us at petsc2017 at mcs.anl.gov if you have any questions or > comments. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 832 bytes Desc: not available URL: From knepley at gmail.com Thu May 25 23:34:21 2017 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 25 May 2017 23:34:21 -0500 Subject: [petsc-users] PETSc User Meeting 2017, June 14-16 in Boulder, Colorado In-Reply-To: <87shjsxmyh.fsf@jedbrown.org> References: <87y3wbtk1i.fsf@jedbrown.org> <87shjsxmyh.fsf@jedbrown.org> Message-ID: On Thu, May 25, 2017 at 11:26 PM, Jed Brown wrote: > The program is up on the website: > > https://www.mcs.anl.gov/petsc/meetings/2017/ Put Toby on the oanel. Matt > > If you haven't registered yet, we can still accommodate you, but please > register soon. If you haven't booked lodging, please do that soon -- > the on-campus lodging option will close on *Tuesday, May 30*. > > https://confreg.colorado.edu/CSM2017 > > We are looking forward to seeing you in Boulder! > > Jed Brown writes: > > > We'd like to invite you to join us at the 2017 PETSc User Meeting held > > at the University of Colorado Boulder on June 14-16, 2017. > > > > http://www.mcs.anl.gov/petsc/meetings/2017/ > > > > The first day consists of tutorials on various aspects and features of > > PETSc. The second and third days will be devoted to exchange, > > discussions, and a refinement of strategies for the future with our > > users. 
We encourage you to present work illustrating your own use of > > PETSc, for example in applications or in libraries built on top of > > PETSc. > > > > Registration for the PETSc User Meeting 2017 is free for students and > > $75 for non-students. We can host a maximum of 150 participants, so > > register soon (and by May 15). > > > > http://www.eventzilla.net/web/e/petsc-user-meeting-2017-2138890185 > > > > We are also offering low-cost lodging on campus. A lodging registration > > site will be available soon and announced here and on the website. > > > > Thanks to the generosity of Intel, we will be able to offer a limited > > number of student travel grants. We are also soliciting additional > > sponsors -- please contact us if you are interested. > > > > > > We are looking forward to seeing you in Boulder! > > > > Please contact us at petsc2017 at mcs.anl.gov if you have any questions or > > comments. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From franck.houssen at inria.fr Fri May 26 04:52:12 2017 From: franck.houssen at inria.fr (Franck Houssen) Date: Fri, 26 May 2017 11:52:12 +0200 (CEST) Subject: [petsc-users] How to VecView with a formatted precision (%10.8f) ? In-Reply-To: <6348799.8316454.1495792325379.JavaMail.zimbra@inria.fr> Message-ID: <1559132119.8316480.1495792332227.JavaMail.zimbra@inria.fr> How to VecView with a formatted precision (%10.8f) ? Not possible ? Franck -------------- next part -------------- An HTML attachment was scrubbed... URL: From Fabian.Jakub at physik.uni-muenchen.de Fri May 26 12:27:25 2017 From: Fabian.Jakub at physik.uni-muenchen.de (Fabian.Jakub) Date: Fri, 26 May 2017 19:27:25 +0200 Subject: [petsc-users] DMPlex export to hdf5/vtk for triangle/prism mesh Message-ID: <28b9b347-6d83-7789-8f13-0409f312db34@physik.uni-muenchen.de> Dear Petsc Team, I am playing around with DMPlex, using it to generate the Mesh for the ICON weather model(http://doi.org/10.1002/2015MS000431), which employs a triangle mesh horizontally and columns, vertically. This results in a grid, looking like prisms, where top and bottom faces are triangles and side faces are rectangles. I was delighted to see that I could export the triangle DMPlex (2d Mesh) to hdf5 and use petsc_gen_xdmf.py to then visualize the mesh in visit/paraview. This is especially nice when exporting petscsections/vectors directly to VTK. I then tried the same approach for the prism grid in 3D. I attached the code for one single cell, as well as the output in hdf5. 
However, trying to convert the hdf5 output, it fails with: make prism.xmf $PETSC_DIR/bin/petsc_gen_xdmf.py prism.h5 Traceback (most recent call last): File "/software/meteo/xenial/x86_64/petsc/master/debug_gcc/..//bin/petsc_gen_xdmf.py", line 241, in generateXdmf(f) File "/software/meteo/xenial/x86_64/petsc/master/debug_gcc/..//bin/petsc_gen_xdmf.py", line 235, in generateXdmf Xdmf(xdmfFilename).write(hdfFilename, topoPath, numCells, numCorners, cellDim, geomPath, numVertices, spaceDim, time, vfields, cfields) File "/software/meteo/xenial/x86_64/petsc/master/debug_gcc/..//bin/petsc_gen_xdmf.py", line 193, in write self.writeSpaceGridHeader(fp, numCells, numCorners, cellDim, spaceDim) File "/software/meteo/xenial/x86_64/petsc/master/debug_gcc/..//bin/petsc_gen_xdmf.py", line 75, in writeSpaceGridHeader ''' % (self.cellMap[cellDim][numCorners], numCells, "XYZ" if spaceDim > 2 else "XY")) KeyError: 6 Also, if I try to export a vector directly to vtk, visit and paraview fail to open it. My question is: Is this a general limitation of these output formats, that I can not mix faces with 3 and 4 vertices or is it a limitation of the petsc_gen_xdmf.py or the VTK Viewer. I'd also welcome any thoughts on the prism mesh in general. Is it that uncommon to use and do you foresee other complications with it? I fear I cannot change the discretization of the host model but maybe it makes sense to use a different grid for my radiative transfer code? Many thanks, Fabian -------------- next part -------------- include ${PETSC_DIR}/lib/petsc/conf/variables include ${PETSC_DIR}/lib/petsc/conf/rules prism.xmf:: prism.h5 ${PETSC_DIR}/bin/petsc_gen_xdmf.py prism.h5 prism.h5:: plex_prism ./plex_prism -show_plex ::ascii_info_detail ./plex_prism -show_plex hdf5:prism.h5 plex_prism:: plex_prism.F90 ${PETSC_FCOMPILE} -c plex_prism.F90 ${FLINKER} plex_prism.o -o plex_prism ${PETSC_LIB} clean:: rm -rf *.o prism.h5 prism.xmf plex_prism -------------- next part -------------- A non-text attachment was scrubbed... Name: plex_prism.F90 Type: text/x-fortran Size: 5515 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: prism.h5 Type: application/x-hdf Size: 25400 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 181 bytes Desc: OpenPGP digital signature URL: From jed at jedbrown.org Fri May 26 12:27:42 2017 From: jed at jedbrown.org (Jed Brown) Date: Fri, 26 May 2017 11:27:42 -0600 Subject: [petsc-users] How to VecView with a formatted precision (%10.8f) ? In-Reply-To: <1559132119.8316480.1495792332227.JavaMail.zimbra@inria.fr> References: <1559132119.8316480.1495792332227.JavaMail.zimbra@inria.fr> Message-ID: <87shjrwmsx.fsf@jedbrown.org> No, but this could be added to the ASCII viewer. Why do you want it? Franck Houssen writes: > How to VecView with a formatted precision (%10.8f) ? Not possible ? > > Franck -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 832 bytes Desc: not available URL: From lvella at gmail.com Fri May 26 16:20:08 2017 From: lvella at gmail.com (Lucas Clemente Vella) Date: Fri, 26 May 2017 18:20:08 -0300 Subject: [petsc-users] How to replace the default global database? 
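The replies that follow suggest options prefixes plus PetscOptionsSetValue() rather than swapping the global database; a rough sketch of hardcoding a solver configuration that way might look like the following. The "schur_" prefix and the particular options are only illustrative, and a real field-split solve would still need its splits defined elsewhere (e.g. with PCFieldSplitSetIS()).

#include <petscksp.h>

PetscErrorCode CreateHardcodedSolver(MPI_Comm comm,Mat A,KSP *newksp)
{
  KSP            ksp;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = KSPCreate(comm,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOptionsPrefix(ksp,"schur_");CHKERRQ(ierr);  /* only this object sees schur_ options */
  ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);
  /* These values land in the global database, but no object without the
     "schur_" prefix will ever look them up. */
  ierr = PetscOptionsSetValue(NULL,"-schur_ksp_type","fgmres");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue(NULL,"-schur_pc_type","fieldsplit");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue(NULL,"-schur_pc_fieldsplit_type","schur");CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  *newksp = ksp;
  PetscFunctionReturn(0);
}

Note that PetscOptionsSetValue() replaces any existing entry for the same key, including one given on the command line, which is usually what a hardcoded configuration wants.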
Message-ID: Here is what I want to do: - Take the global PetscOptions and store it somewhere; - Create my own PetscOptions; - Populate it with my options; - Set my new PetscOptions as the global default; - Create some PETSc objects; - Restore old PetscOptions as default global; - Destroy the PetscOptions I created. I could not find a function to replace global PetscOptions, or to copy one PetscOptions to another. Is it possible to do what I want to do? How? -- Lucas Clemente Vella lvella at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri May 26 17:55:41 2017 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 26 May 2017 17:55:41 -0500 Subject: [petsc-users] How to replace the default global database? In-Reply-To: References: Message-ID: <1DEA680E-6BF8-459E-8AD2-3628E3B56BAF@mcs.anl.gov> I do not think you want to do this. The standard way we handle what it seems you need is to use PetscObjectSetOptionsPrefix() for the different PETSc objects giving them different prefixes and then appending the prefix for the options when you provide them to the options database. For example if you have a KSP for a flow solver and a KSP for a pressure solver you might do KSPCreate(PETSC_COMM_WORLD,&flow); KSPSetOptionsPrefix(flow,"u"); KSPCreate(PETSC_COMM_WORLD,&pressure); KSPSetOptionsPrefix(pressure,"p"); and set options like -u_pc_type jacobi -p_pc_type gamg Will this do what you need? Barry Because the options data base can be accessed by any object at any time (not just when it is created), it doesn't make sense to change the default options database ever because it would be uncertain what objects the change affected or did not affect. > On May 26, 2017, at 4:20 PM, Lucas Clemente Vella wrote: > > Here is what I want to do: > - Take the global PetscOptions and store it somewhere; > - Create my own PetscOptions; > - Populate it with my options; > - Set my new PetscOptions as the global default; > - Create some PETSc objects; > - Restore old PetscOptions as default global; > - Destroy the PetscOptions I created. > > I could not find a function to replace global PetscOptions, or to copy one PetscOptions to another. Is it possible to do what I want to do? How? > > -- > Lucas Clemente Vella > lvella at gmail.com From knepley at gmail.com Fri May 26 22:40:40 2017 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 26 May 2017 22:40:40 -0500 Subject: [petsc-users] DMPlex export to hdf5/vtk for triangle/prism mesh In-Reply-To: <28b9b347-6d83-7789-8f13-0409f312db34@physik.uni-muenchen.de> References: <28b9b347-6d83-7789-8f13-0409f312db34@physik.uni-muenchen.de> Message-ID: On Fri, May 26, 2017 at 12:27 PM, Fabian.Jakub < Fabian.Jakub at physik.uni-muenchen.de> wrote: > Dear Petsc Team, > > I am playing around with DMPlex, using it to generate the Mesh for the > ICON weather model(http://doi.org/10.1002/2015MS000431), which employs a > triangle mesh horizontally and columns, vertically. > > This results in a grid, looking like prisms, where top and bottom faces > are triangles and side faces are rectangles. > > I was delighted to see that I could export the triangle DMPlex (2d Mesh) > to hdf5 and use petsc_gen_xdmf.py to then visualize the mesh in > visit/paraview. > This is especially nice when exporting petscsections/vectors directly to > VTK. > Great. > I then tried the same approach for the prism grid in 3D. > I attached the code for one single cell, as well as the output in hdf5. 
> > However, trying to convert the hdf5 output, it fails with: > > make prism.xmf > > $PETSC_DIR/bin/petsc_gen_xdmf.py prism.h5 > Traceback (most recent call last): > File > "/software/meteo/xenial/x86_64/petsc/master/debug_gcc/..// > bin/petsc_gen_xdmf.py", > line 241, in > generateXdmf(f) > File > "/software/meteo/xenial/x86_64/petsc/master/debug_gcc/..// > bin/petsc_gen_xdmf.py", > line 235, in generateXdmf > Xdmf(xdmfFilename).write(hdfFilename, topoPath, numCells, > numCorners, cellDim, geomPath, numVertices, spaceDim, time, vfields, > cfields) > File > "/software/meteo/xenial/x86_64/petsc/master/debug_gcc/..// > bin/petsc_gen_xdmf.py", > line 193, in write > self.writeSpaceGridHeader(fp, numCells, numCorners, cellDim, spaceDim) > File > "/software/meteo/xenial/x86_64/petsc/master/debug_gcc/..// > bin/petsc_gen_xdmf.py", > line 75, in writeSpaceGridHeader > ''' % (self.cellMap[cellDim][numCorners], numCells, "XYZ" if > spaceDim > 2 else "XY")) > KeyError: 6 > > > Also, if I try to export a vector directly to vtk, visit and paraview > fail to open it. > > My question is: > Is this a general limitation of these output formats, that I can not mix > faces with 3 and 4 vertices or is it a limitation of the > petsc_gen_xdmf.py or the VTK Viewer. > petsc_gen_xdmf. Take a look here https://bitbucket.org/petsc/petsc/src/1731673c3fe570066779d46b51a4aee7a45775ed/bin/petsc_gen_xdmf.py?at=master&fileviewer=file-view-default#petsc_gen_xdmf.py-9 This is what fails. You need to add something like 6: "Wedge" in the dictionary. See http://www.xdmf.org/index.php/XDMF_Model_and_Format > I'd also welcome any thoughts on the prism mesh in general. > Is it that uncommon to use and do you foresee other complications with it? > You need an element that works with prisms, but it seems you already have one. I know there is good work from here: https://arxiv.org/abs/1411.2940 > I fear I cannot change the discretization of the host model but maybe it > makes sense to use a different grid for my radiative transfer code? > I do not really do RT, but would be happy to try and think about it. Thanks, Matt > Many thanks, > > > Fabian > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From leejearl at 126.com Sun May 28 01:31:48 2017 From: leejearl at 126.com (leejearl) Date: Sun, 28 May 2017 14:31:48 +0800 Subject: [petsc-users] a question about PetscSectionCreate Message-ID: <7a1aa1ce-fc8a-7c03-9219-995cea4f74b2@126.com> Hi, PETSc developer: I need to create a PetscSection with a struct. The struct is defined as follow, typedef struct { PetscReal x; PetscInt id; } testStruct; When I run the program, I got a wrong output as follow, Vec Object: 1 MPI processes type: seq 2. 4.94066e-324 2. 4.94066e-324 2. 4.94066e-324 2. 4.94066e-324 2. 4.94066e-324 2. 4.94066e-324 2. 4.94066e-324 2. 4.94066e-324 But when I defined the struct as typedef struct { PetscReal x; PetscReal id; } testStruct; The output is ok. It seems that there is some wrong with the memories when I define the "id" as a PetscInt type. I can not find out the reasons, and any one can help me with it? The source file "test.c" is attached. Thanks, leejearl -------------- next part -------------- A non-text attachment was scrubbed... 
Name: test.c Type: text/x-csrc Size: 2198 bytes Desc: not available URL: From dave.mayhem23 at gmail.com Sun May 28 01:49:25 2017 From: dave.mayhem23 at gmail.com (Dave May) Date: Sun, 28 May 2017 06:49:25 +0000 Subject: [petsc-users] a question about PetscSectionCreate In-Reply-To: <7a1aa1ce-fc8a-7c03-9219-995cea4f74b2@126.com> References: <7a1aa1ce-fc8a-7c03-9219-995cea4f74b2@126.com> Message-ID: On Sun, 28 May 2017 at 08:31, leejearl wrote: > Hi, PETSc developer: > > I need to create a PetscSection with a struct. The struct is > defined as follow, > > typedef struct > { > PetscReal x; > PetscInt id; > } testStruct; > > When I run the program, I got a wrong output as follow, > > Vec Object: 1 MPI processes > type: seq > 2. > 4.94066e-324 > 2. > 4.94066e-324 > 2. > 4.94066e-324 > 2. > 4.94066e-324 > 2. > 4.94066e-324 > 2. > 4.94066e-324 > 2. > 4.94066e-324 > 2. > 4.94066e-324 > > But when I defined the struct as > > typedef struct > { > PetscReal x; > PetscReal id; > } testStruct; > > The output is ok. It seems that there is some wrong with the memories > when I define the "id" as a PetscInt type. Yep. > > I can not find out the reasons, and any one can help me with it? The Vec object can only store quantities of type PetscScalar. It cannot store PetscInt's and it definitely cannot represent a mixture of PetscReal's and PetscInt's. Thanks, Dave The > source file "test.c" is attached. > > > Thanks, > > leejearl > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From leejearl at mail.nwpu.edu.cn Sun May 28 02:30:22 2017 From: leejearl at mail.nwpu.edu.cn (leejearl) Date: Sun, 28 May 2017 15:30:22 +0800 Subject: [petsc-users] a question about PetscSectionCreate Message-ID: <19d62bf5-8c56-0e99-b8c7-0bee39ad01d4@mail.nwpu.edu.cn> Hi, Dave: Thank you for your kind reply. If I want to store a mixture of PetscReal and PetscInt, how can I do it? Thanks, leejearl From dave.mayhem23 at gmail.com Sun May 28 02:44:31 2017 From: dave.mayhem23 at gmail.com (Dave May) Date: Sun, 28 May 2017 07:44:31 +0000 Subject: [petsc-users] a question about PetscSectionCreate In-Reply-To: <19d62bf5-8c56-0e99-b8c7-0bee39ad01d4@mail.nwpu.edu.cn> References: <19d62bf5-8c56-0e99-b8c7-0bee39ad01d4@mail.nwpu.edu.cn> Message-ID: On Sun, 28 May 2017 at 09:30, leejearl wrote: > Hi, Dave: > Thank you for your kind reply. If I want to store a mixture of > PetscReal and PetscInt, how can I do it? What operations do you need to perform with your struct? > > Thanks, > leejearl > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From leejearl at 126.com Sun May 28 03:16:36 2017 From: leejearl at 126.com (leejearl) Date: Sun, 28 May 2017 16:16:36 +0800 Subject: [petsc-users] a question about PetscSectionCreate Message-ID: Hi, Dave: I want to store a PetscInt tag for every cell of the dmplex with the struct. Thanks, leejearl >>/Hi, Dave: />/ > Thank you for your kind reply. If I want to store a mixture of />/PetscReal and PetscInt, how can I do it? / >What operations do you need to perform with your struct? >>//>/ > Thanks, />/ > leejearl />>//>> -------------- next part -------------- An HTML attachment was scrubbed... 
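The replies further down suggest keeping the struct data in an ordinary, user-managed array and using the PetscSection only as an index into it; a rough sketch of that idea, assuming a DMPlex whose cells form the height-0 stratum (the function name and field values are illustrative):

#include <petscdmplex.h>

typedef struct { PetscReal x; PetscInt id; } testStruct;

PetscErrorCode CreateCellData(DM dm,PetscSection *sec,testStruct **data)
{
  PetscInt       c,cStart,cEnd,n,off;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = DMPlexGetHeightStratum(dm,0,&cStart,&cEnd);CHKERRQ(ierr);   /* cells */
  ierr = PetscSectionCreate(PetscObjectComm((PetscObject)dm),sec);CHKERRQ(ierr);
  ierr = PetscSectionSetChart(*sec,cStart,cEnd);CHKERRQ(ierr);
  for (c=cStart; c<cEnd; c++) {ierr = PetscSectionSetDof(*sec,c,1);CHKERRQ(ierr);}
  ierr = PetscSectionSetUp(*sec);CHKERRQ(ierr);
  ierr = PetscSectionGetStorageSize(*sec,&n);CHKERRQ(ierr);
  ierr = PetscMalloc1(n,data);CHKERRQ(ierr);                         /* user-managed storage */
  for (c=cStart; c<cEnd; c++) {
    ierr = PetscSectionGetOffset(*sec,c,&off);CHKERRQ(ierr);
    (*data)[off].x  = 0.0;
    (*data)[off].id = c;                                             /* e.g. tag a cell with its index */
  }
  PetscFunctionReturn(0);
}

Because the payload never goes into a Vec, mixing PetscReal and PetscInt fields is no longer a problem.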
URL: From lawrence.mitchell at imperial.ac.uk Sun May 28 06:02:16 2017 From: lawrence.mitchell at imperial.ac.uk (Lawrence Mitchell) Date: Sun, 28 May 2017 12:02:16 +0100 Subject: [petsc-users] a question about PetscSectionCreate In-Reply-To: References: Message-ID: > On 28 May 2017, at 09:16, leejearl wrote: > > Hi, Dave: I want to store a PetscInt tag for every cell of the dmplex with the struct. Thanks, You probably want to use a DMLabel to store these ids. Unless you have a different I'd for every cell. Lawrence From knepley at gmail.com Sun May 28 06:32:09 2017 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 28 May 2017 06:32:09 -0500 Subject: [petsc-users] a question about PetscSectionCreate In-Reply-To: References: <7a1aa1ce-fc8a-7c03-9219-995cea4f74b2@126.com> Message-ID: On Sun, May 28, 2017 at 1:49 AM, Dave May wrote: > > On Sun, 28 May 2017 at 08:31, leejearl wrote: > >> Hi, PETSc developer: >> >> I need to create a PetscSection with a struct. The struct is >> defined as follow, >> >> typedef struct >> { >> PetscReal x; >> PetscInt id; >> } testStruct; >> >> When I run the program, I got a wrong output as follow, >> >> Vec Object: 1 MPI processes >> type: seq >> 2. >> 4.94066e-324 >> 2. >> 4.94066e-324 >> 2. >> 4.94066e-324 >> 2. >> 4.94066e-324 >> 2. >> 4.94066e-324 >> 2. >> 4.94066e-324 >> 2. >> 4.94066e-324 >> 2. >> 4.94066e-324 >> >> But when I defined the struct as >> >> typedef struct >> { >> PetscReal x; >> PetscReal id; >> } testStruct; >> >> The output is ok. It seems that there is some wrong with the memories >> when I define the "id" as a PetscInt type. > > > Yep. > > >> >> I can not find out the reasons, and any one can help me with it? > > > The Vec object can only store quantities of type PetscScalar. It cannot > store PetscInt's and it definitely cannot represent a mixture of > PetscReal's and PetscInt's. > Dave is correct. However this usage completely misses the point of Section. Section is a device for storing indices into ANY storage, not just Vec and IS. I would manage an array of the structs that I allocate, and use the Section to index into. Matt > > Thanks, > Dave > > The >> source file "test.c" is attached. >> >> >> Thanks, >> >> leejearl >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun May 28 06:35:11 2017 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 28 May 2017 06:35:11 -0500 Subject: [petsc-users] a question about PetscSectionCreate In-Reply-To: References: Message-ID: On Sun, May 28, 2017 at 6:02 AM, Lawrence Mitchell < lawrence.mitchell at imperial.ac.uk> wrote: > > > > On 28 May 2017, at 09:16, leejearl wrote: > > > > Hi, Dave: I want to store a PetscInt tag for every cell of the dmplex > with the struct. Thanks, > > You probably want to use a DMLabel to store these ids. Unless you have a > different I'd for every cell. Several things to think about: 1) If you want to store a tag for EVERY cell, then just use an IS. Cell numberings are guaranteed to be contiguous and start from 0. 2) If you want to tag only SOME cells, then use a DMLabel as Lawrence suggests. This uses hash tables for fast construction, and sorted lists for fast search and retrieval. 
3) If you want to store a VARIABLE number of data items per cell, then use a Section and an array that you allocate. Matt > > Lawrence > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From leejearl at 126.com Sun May 28 21:57:05 2017 From: leejearl at 126.com (leejearl) Date: Mon, 29 May 2017 10:57:05 +0800 Subject: [petsc-users] a question about PetscSectionCreate In-Reply-To: References: <7a1aa1ce-fc8a-7c03-9219-995cea4f74b2@126.com> Message-ID: <2a54add1-7e17-dd50-4a2b-58d905449bc5@126.com> Thanks for your kind replies. I will give a result after the test. On 2017?05?28? 19:32, Matthew Knepley wrote: > On Sun, May 28, 2017 at 1:49 AM, Dave May > wrote: > > > On Sun, 28 May 2017 at 08:31, leejearl > wrote: > > Hi, PETSc developer: > > I need to create a PetscSection with a struct. The struct is > defined as follow, > > typedef struct > { > PetscReal x; > PetscInt id; > } testStruct; > > When I run the program, I got a wrong output as follow, > > Vec Object: 1 MPI processes > type: seq > 2. > 4.94066e-324 > 2. > 4.94066e-324 > 2. > 4.94066e-324 > 2. > 4.94066e-324 > 2. > 4.94066e-324 > 2. > 4.94066e-324 > 2. > 4.94066e-324 > 2. > 4.94066e-324 > > But when I defined the struct as > > typedef struct > { > PetscReal x; > PetscReal id; > } testStruct; > > The output is ok. It seems that there is some wrong with the > memories > when I define the "id" as a PetscInt type. > > > Yep. > > > > I can not find out the reasons, and any one can help me with it? > > > The Vec object can only store quantities of type PetscScalar. It > cannot store PetscInt's and it definitely cannot represent a > mixture of PetscReal's and PetscInt's. > > > Dave is correct. However this usage completely misses the point of > Section. Section is a device for storing indices into > ANY storage, not just Vec and IS. I would manage an array of the > structs that I allocate, and use the Section to index into. > > Matt > > > Thanks, > Dave > > The > source file "test.c" is attached. > > > Thanks, > > leejearl > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ -- ?? ??????????????? Phone: 17792092487 QQ: 188524324 -------------- next part -------------- An HTML attachment was scrubbed... URL: From leejearl at 126.com Mon May 29 01:39:26 2017 From: leejearl at 126.com (leejearl) Date: Mon, 29 May 2017 14:39:26 +0800 Subject: [petsc-users] a question about PetscSectionCreate In-Reply-To: References: Message-ID: <9715fa58-bf80-7aca-d01a-c74cdcde5701@126.com> Hi, all: I have create a IS for every cell in dmplex by the following steps: 1. Creating a integer array which size is matched to the number of cells. 2. Use the routine "ISCreateGeneral" to create a corresponding IS. Is there any routine which can create a IS for every cell in the dmplex directly?, and what is the difference between ISCopy() and ISDuplicate()? Thanks, leejearl On 2017?05?28? 19:35, Matthew Knepley wrote: > On Sun, May 28, 2017 at 6:02 AM, Lawrence Mitchell > > wrote: > > > > > On 28 May 2017, at 09:16, leejearl > wrote: > > > > Hi, Dave: I want to store a PetscInt tag for every cell of the > dmplex with the struct. 
Thanks, > > You probably want to use a DMLabel to store these ids. Unless you > have a different I'd for every cell. > > > Several things to think about: > > 1) If you want to store a tag for EVERY cell, then just use an IS. > Cell numberings are guaranteed to be > contiguous and start from 0. > > 2) If you want to tag only SOME cells, then use a DMLabel as Lawrence > suggests. This uses hash tables > for fast construction, and sorted lists for fast search and retrieval. > > 3) If you want to store a VARIABLE number of data items per cell, then > use a Section and an array that you allocate. > > Matt > > > Lawrence > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ -- ?? ??????????????? Phone: 17792092487 QQ: 188524324 -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Mon May 29 02:47:56 2017 From: dave.mayhem23 at gmail.com (Dave May) Date: Mon, 29 May 2017 07:47:56 +0000 Subject: [petsc-users] a question about PetscSectionCreate In-Reply-To: <9715fa58-bf80-7aca-d01a-c74cdcde5701@126.com> References: <9715fa58-bf80-7aca-d01a-c74cdcde5701@126.com> Message-ID: On Mon, 29 May 2017 at 08:39, leejearl wrote: > Hi, all: > I have create a IS for every cell in dmplex by the following steps: > 1. Creating a integer array which size is matched to the number of cells. > 2. Use the routine "ISCreateGeneral" to create a corresponding IS. > > Is there any routine which can create a IS for every cell in the dmplex > directly?, > I don't think so as Plex would have to somehow know what geom quantity to use to define the size of IS (e.g. vertex, cell, face, edge) and what is the difference between ISCopy() and ISDuplicate()? > ISDuplicate allocates memory for a new with the same comm and layout as the original IS AND copies values from the original IS into the new one. (Note that this is slightly different from other duplicate functions like VecDuplicate which only allocate memory and does not copy values from the orig vec.) ISCopy does not allocate memory for the IS (passed as the second arg), it only performs the copy of values. Thanks Dave > > Thanks, > leejearl > > > On 2017?05?28? 19:35, Matthew Knepley wrote: > > On Sun, May 28, 2017 at 6:02 AM, Lawrence Mitchell < > lawrence.mitchell at imperial.ac.uk> wrote: > >> >> >> > On 28 May 2017, at 09:16, leejearl wrote: >> > >> > Hi, Dave: I want to store a PetscInt tag for every cell of the dmplex >> with the struct. Thanks, >> >> You probably want to use a DMLabel to store these ids. Unless you have a >> different I'd for every cell. > > > Several things to think about: > > 1) If you want to store a tag for EVERY cell, then just use an IS. Cell > numberings are guaranteed to be > contiguous and start from 0. > > 2) If you want to tag only SOME cells, then use a DMLabel as Lawrence > suggests. This uses hash tables > for fast construction, and sorted lists for fast search and retrieval. > > 3) If you want to store a VARIABLE number of data items per cell, then use > a Section and an array that you allocate. > > Matt > > >> >> Lawrence >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ > > > -- > ?? > ??????????????? 
> Phone: 17792092487 > QQ: 188524324 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dnolte at dim.uchile.cl Mon May 29 11:17:30 2017 From: dnolte at dim.uchile.cl (David Nolte) Date: Mon, 29 May 2017 12:17:30 -0400 Subject: [petsc-users] petsc4py and python's logging module Message-ID: <921c7e29-707c-eb57-1f8e-0b12a45aa7e9@dim.uchile.cl> Dear all, is it possible to use python's logging module (https://docs.python.org/2/howto/logging.html) to handle PETSc output in python, such as the residuals during a KSP/SNES solve? I log my solver's activity to a file using the logging module, it would be great to include the PETSc output aswell. Regards, David From xinzhe.wu1990 at gmail.com Mon May 29 11:19:20 2017 From: xinzhe.wu1990 at gmail.com (Xinzhe Wu) Date: Mon, 29 May 2017 18:19:20 +0200 Subject: [petsc-users] Errors about PETSc MPI+GPU Message-ID: Dear all, We have developed the codes with PETSc + SLEPc which works well on CPU version. Now we want to try these codes with GPU + MPI, but get some weird errors shown as below. I have found someone talked about this problem here http://lists.mcs.anl.gov/pipermail/petsc-dev/2016-March/018836.html , but I can hardly understand it. Can anyone help me with these issues? Thank you in advance! [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Error in external library [0]PETSC ERROR: CUBLAS error 1 [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [2]PETSC ERROR: Error in external library [2]PETSC ERROR: CUBLAS error 1 [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [2]PETSC ERROR: Petsc Development GIT revision: v3.7.6-3965-gf375733 GIT Date: 2017-05-28 10:32:02 -0500 [2]PETSC ERROR: ./hyperh on a arch-linux2-c-debug named romeo44 by xinzhewu Mon May 29 18:03:58 2017 [2]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-fblaslapack --with-visibility=0 --with-shared-libraries=0 --with-cuda=1 --with-thrust=1 --with-precision=double --with-clanguage=c --with-pestc-arch=linux-c-no-debug-complex --with-scalar-type=complex [2]PETSC ERROR: #1 PetscInitialize() line 906 in /home/xinzhewu/Petsc-GPUs/petsc/src/sys/objects/pinit.c [2]PETSC ERROR: #2 SlepcInitialize() line 259 in /home/xinzhewu/Petsc-GPUs/slepc/src/sys/slepcinit.c -- Xinzhe WU Ph.D Student of Computer Science Maison de la Simulation, CNRS USR3441 Building 565, CEA Saclay 91191, Gif-sur-Yvette, France Tel: +33 (0) 1 69 08 59 93 -------------- next part -------------- An HTML attachment was scrubbed... URL: From lvella at gmail.com Mon May 29 13:20:33 2017 From: lvella at gmail.com (Lucas Clemente Vella) Date: Mon, 29 May 2017 15:20:33 -0300 Subject: [petsc-users] How to replace the default global database? In-Reply-To: <1DEA680E-6BF8-459E-8AD2-3628E3B56BAF@mcs.anl.gov> References: <1DEA680E-6BF8-459E-8AD2-3628E3B56BAF@mcs.anl.gov> Message-ID: Hi. Not really what I need. Every time I run my program, I need to pass the non-trivial solver setup that works as a command line argument (I am using Schur complement with BCGS and Hypre as internal KSP and PC). I want to hardcode the complex solver setup so that I can use it depending on a runtime switch. 
Like this: if(use_schur) { // change global PETSc options to the settings I know to work. } my_solver_struct *s = create_petsc_solver(); if(use_schur) { // restore original PETSc options. } 2017-05-26 19:55 GMT-03:00 Barry Smith : > > I do not think you want to do this. The standard way we handle what it > seems you need is to use PetscObjectSetOptionsPrefix() for the different > PETSc objects giving them different prefixes and then appending the prefix > for the options when you provide them to the options database. For example > if you have a KSP for a flow solver and a KSP for a pressure solver you > might do > > KSPCreate(PETSC_COMM_WORLD,&flow); > KSPSetOptionsPrefix(flow,"u"); > > KSPCreate(PETSC_COMM_WORLD,&pressure); > KSPSetOptionsPrefix(pressure,"p"); > > and set options like > > -u_pc_type jacobi > > -p_pc_type gamg > > Will this do what you need? > > Barry > > Because the options data base can be accessed by any object at any > time (not just when it is created), it doesn't make sense to change the > default options database ever because it would be uncertain what objects > the change affected or did not affect. > > > > > > > > On May 26, 2017, at 4:20 PM, Lucas Clemente Vella > wrote: > > > > Here is what I want to do: > > - Take the global PetscOptions and store it somewhere; > > - Create my own PetscOptions; > > - Populate it with my options; > > - Set my new PetscOptions as the global default; > > - Create some PETSc objects; > > - Restore old PetscOptions as default global; > > - Destroy the PetscOptions I created. > > > > I could not find a function to replace global PetscOptions, or to copy > one PetscOptions to another. Is it possible to do what I want to do? How? > > > > -- > > Lucas Clemente Vella > > lvella at gmail.com > > -- Lucas Clemente Vella lvella at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon May 29 13:31:13 2017 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 29 May 2017 13:31:13 -0500 Subject: [petsc-users] How to replace the default global database? In-Reply-To: References: <1DEA680E-6BF8-459E-8AD2-3628E3B56BAF@mcs.anl.gov> Message-ID: <131E7428-B018-4429-8834-BB08ADDCF54C@mcs.anl.gov> > On May 29, 2017, at 1:20 PM, Lucas Clemente Vella wrote: > > Hi. Not really what I need. Every time I run my program, I need to pass the non-trivial solver setup that works as a command line argument (I am using Schur complement with BCGS and Hypre as internal KSP and PC). I want to hardcode the complex solver setup so that I can use it depending on a runtime switch. Like this: > > if(use_schur) { > // change global PETSc options to the settings I know to work. This is ok. You can use PetscOptionsSetValue() or PetscOptionsInsert() to put the values in. > } > > my_solver_struct *s = create_petsc_solver(); > > if(use_schur) { > // restore original PETSc options. > } Why do you need to "restore original PETSc options" at this point? What are the options used for that they need to be reset? If they control other solvers, for example, then just give them a different prefix. > > 2017-05-26 19:55 GMT-03:00 Barry Smith : > > I do not think you want to do this. The standard way we handle what it seems you need is to use PetscObjectSetOptionsPrefix() for the different PETSc objects giving them different prefixes and then appending the prefix for the options when you provide them to the options database. 
For example if you have a KSP for a flow solver and a KSP for a pressure solver you might do > > KSPCreate(PETSC_COMM_WORLD,&flow); > KSPSetOptionsPrefix(flow,"u"); > > KSPCreate(PETSC_COMM_WORLD,&pressure); > KSPSetOptionsPrefix(pressure,"p"); > > and set options like > > -u_pc_type jacobi > > -p_pc_type gamg > > Will this do what you need? > > Barry > > Because the options data base can be accessed by any object at any time (not just when it is created), it doesn't make sense to change the default options database ever because it would be uncertain what objects the change affected or did not affect. > > > > > > > > On May 26, 2017, at 4:20 PM, Lucas Clemente Vella wrote: > > > > Here is what I want to do: > > - Take the global PetscOptions and store it somewhere; > > - Create my own PetscOptions; > > - Populate it with my options; > > - Set my new PetscOptions as the global default; > > - Create some PETSc objects; > > - Restore old PetscOptions as default global; > > - Destroy the PetscOptions I created. > > > > I could not find a function to replace global PetscOptions, or to copy one PetscOptions to another. Is it possible to do what I want to do? How? > > > > -- > > Lucas Clemente Vella > > lvella at gmail.com > > > > > -- > Lucas Clemente Vella > lvella at gmail.com From knepley at gmail.com Mon May 29 14:06:14 2017 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 29 May 2017 14:06:14 -0500 Subject: [petsc-users] Errors about PETSc MPI+GPU In-Reply-To: References: Message-ID: On Mon, May 29, 2017 at 11:19 AM, Xinzhe Wu wrote: > Dear all, > > We have developed the codes with PETSc + SLEPc which works well on CPU > version. Now we want to try these codes with GPU + MPI, but get some weird > errors shown as below. > > I have found someone talked about this problem here > http://lists.mcs.anl.gov/pipermail/petsc-dev/2016-March/018836.html , but > I can hardly understand it. Can anyone help me with these issues? > The answer is here: >>>>* I think the error messages you get is pretty descriptive regarding the root cause. You are probably running out of GPU memory. Since you are running on a GTX 285 you can't use MPS [1] therefore each MPI process has its own context on the GPU. Each context needs to initialize some data on the GPU (used for local variables and so on). The required amount needed for this depends on the size of the GPUs (essentially correlates with the maximum number of concurrently active threads). This can easily be 50-100MB. So with only 1GB of GPU memory you are probably using all GPUs memory for context data and nothing is available for your application. Unfortunately there is no good way to debug this with GeForce. On Tesla nvidia-smi does show you all processes that have a context on a GPU together with their memory consumption.* It appears that you are running out of GPU memory. This can happen if you use too many MPI processes for a single GPU. Thanks, Matt > Thank you in advance! > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Error in external library > [0]PETSC ERROR: CUBLAS error 1 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> [0]PETSC ERROR: [2]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [2]PETSC ERROR: Error in external library > [2]PETSC ERROR: CUBLAS error 1 > [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [2]PETSC ERROR: Petsc Development GIT revision: v3.7.6-3965-gf375733 GIT > Date: 2017-05-28 10:32:02 -0500 > [2]PETSC ERROR: ./hyperh on a arch-linux2-c-debug named romeo44 by > xinzhewu Mon May 29 18:03:58 2017 > [2]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-mpich --download-fblaslapack > --with-visibility=0 --with-shared-libraries=0 --with-cuda=1 --with-thrust=1 > --with-precision=double --with-clanguage=c --with-pestc-arch=linux-c-no-debug-complex > --with-scalar-type=complex > [2]PETSC ERROR: #1 PetscInitialize() line 906 in /home/xinzhewu/Petsc-GPUs/ > petsc/src/sys/objects/pinit.c > [2]PETSC ERROR: #2 SlepcInitialize() line 259 in /home/xinzhewu/Petsc-GPUs/ > slepc/src/sys/slepcinit.c > > > -- > Xinzhe WU > Ph.D Student of Computer Science > Maison de la Simulation, CNRS USR3441 > Building 565, CEA Saclay > 91191, Gif-sur-Yvette, France > Tel: +33 (0) 1 69 08 59 93 <+33%201%2069%2008%2059%2093> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 29 14:30:39 2017 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 29 May 2017 14:30:39 -0500 Subject: [petsc-users] petsc4py and python's logging module In-Reply-To: <921c7e29-707c-eb57-1f8e-0b12a45aa7e9@dim.uchile.cl> References: <921c7e29-707c-eb57-1f8e-0b12a45aa7e9@dim.uchile.cl> Message-ID: On Mon, May 29, 2017 at 11:17 AM, David Nolte wrote: > Dear all, > > is it possible to use python's logging module > (https://docs.python.org/2/howto/logging.html) to handle PETSc output in > python, such as the residuals during a KSP/SNES solve? > I log my solver's activity to a file using the logging module, it would > be great to include the PETSc output aswell. > I think the best way to do this is the following: 1) Create a PetscViewer implementation, say PyASCII, that logs to the Python descriptor. This might be as easy as just augmenting the ASCII viewer to grab this descriptor on creation 2) Then you can hook this viewer to the monitor using options -ksp_monitor pyascii Thanks, Matt > Regards, > David > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From dalcinl at gmail.com Mon May 29 14:55:52 2017 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Mon, 29 May 2017 22:55:52 +0300 Subject: [petsc-users] petsc4py and python's logging module In-Reply-To: <921c7e29-707c-eb57-1f8e-0b12a45aa7e9@dim.uchile.cl> References: <921c7e29-707c-eb57-1f8e-0b12a45aa7e9@dim.uchile.cl> Message-ID: On 29 May 2017 at 19:17, David Nolte wrote: > Dear all, > > is it possible to use python's logging module > (https://docs.python.org/2/howto/logging.html) to handle PETSc output in > python, such as the residuals during a KSP/SNES solve? 
> I log my solver's activity to a file using the logging module, it would > be great to include the PETSc output aswell. > Not sure if this is what you really want, but you could... 1) Use {ksp|snes}.setConvergenceHistory() before solve, then {ksp|snes}.getConvergenceHistory() after solve, you will get arrays with the residual history, then you can do whatever you want with them. 2) Implement a KSP/SNES monitor in a Python function and call {ksp|snes}.setMonitor(), then you can use python's logging inside your monitor function. -- Lisandro Dalcin ============ Research Scientist Computer, Electrical and Mathematical Sciences & Engineering (CEMSE) Extreme Computing Research Center (ECRC) King Abdullah University of Science and Technology (KAUST) http://ecrc.kaust.edu.sa/ 4700 King Abdullah University of Science and Technology al-Khawarizmi Bldg (Bldg 1), Office # 0109 Thuwal 23955-6900, Kingdom of Saudi Arabia http://www.kaust.edu.sa Office Phone: +966 12 808-0459 From bsmith at mcs.anl.gov Mon May 29 15:10:39 2017 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 29 May 2017 15:10:39 -0500 Subject: [petsc-users] petsc4py and python's logging module In-Reply-To: <921c7e29-707c-eb57-1f8e-0b12a45aa7e9@dim.uchile.cl> References: <921c7e29-707c-eb57-1f8e-0b12a45aa7e9@dim.uchile.cl> Message-ID: <01274A21-4416-4FA4-AAD5-FE117E9424C6@mcs.anl.gov> If you want to log all PETSc ASCII output you can use http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscVFPrintf.html Just don't use the FILE *fd initial argument, instead pass the output into a Python function that calls the logger as desired. Barry > On May 29, 2017, at 11:17 AM, David Nolte wrote: > > Dear all, > > is it possible to use python's logging module > (https://docs.python.org/2/howto/logging.html) to handle PETSc output in > python, such as the residuals during a KSP/SNES solve? > I log my solver's activity to a file using the logging module, it would > be great to include the PETSc output aswell. > > Regards, > David > From lvella at gmail.com Mon May 29 15:24:03 2017 From: lvella at gmail.com (Lucas Clemente Vella) Date: Mon, 29 May 2017 17:24:03 -0300 Subject: [petsc-users] Can't retrieve inner KSP from Schur complement Message-ID: I want to set a custom convergence test for the inner KSPs of Schur complement method, so I am using PCFieldSplitGetSubKSP() to get the inner KSPs: int n_subksp; KSP *subksp = NULL; PCFieldSplitGetSubKSP(s->pc, &n_subksp, &subksp); assert(n_subksp == 2); But I get a segmentation fault on MatSchurComplementGetKSP(). From file src/ksp/ksp/utils/schurm.c (line 320): PetscErrorCode MatSchurComplementGetKSP(Mat S, KSP *ksp) { Mat_SchurComplement *Na; PetscFunctionBegin; PetscValidHeaderSpecific(S,MAT_CLASSID,1); PetscValidPointer(ksp,2); Na = (Mat_SchurComplement*) S->data; *ksp = Na->ksp; // <<<<< segfaults on this line, 'Na' is an invalid pointer... PetscFunctionReturn(0); } This is the stack trace given by valgrind: ==13559== Invalid read of size 8 ==13559== at 0x56B8780: MatSchurComplementGetKSP (schurm.c:320) ==13559== by 0x55F5B08: PCFieldSplitGetSubKSP_FieldSplit_Schur(_p_PC*, int*, _p_KSP***) (fieldsplit.c:1367) ==13559== by 0x5605187: PCFieldSplitGetSubKSP (fieldsplit.c:1869) ==13559== by 0x166305: set_singular_convergence_test (solver-petsc.c:293) ### irrelevant calls, from inside my program ==13559== Address 0x6c0 is not stack'd, malloc'd or (recently) free'd ==13559== I tried doing this operation both before and after MatAssembly*() calls, and with both I get the same result. 
Petsc version is 3.7.5, installed from Ubuntu repository. Is this a bug? Or I am doing it wrong? -- Lucas Clemente Vella lvella at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon May 29 15:32:22 2017 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 29 May 2017 15:32:22 -0500 Subject: [petsc-users] Can't retrieve inner KSP from Schur complement In-Reply-To: References: Message-ID: <7F5B0AFF-558C-4E06-887F-3838456A3BB1@mcs.anl.gov> Likely the problem is that the inner objects do not yet exist when you are trying to set the options. It is kind of tricky to handle the construction of these multiple nested objects and when inner objects actually get created. Make sure you call KSPSetUp() on the outer KSP before you call this. But this may still not be enough to insure that this inner object has yet been created. Let us know. Barry I will add any error check for Na being null so it prints a useful error message instead of crashing. > On May 29, 2017, at 3:24 PM, Lucas Clemente Vella wrote: > > I want to set a custom convergence test for the inner KSPs of Schur complement method, so I am using PCFieldSplitGetSubKSP() to get the inner KSPs: > > int n_subksp; > KSP *subksp = NULL; > > PCFieldSplitGetSubKSP(s->pc, &n_subksp, &subksp); > assert(n_subksp == 2); > > But I get a segmentation fault on MatSchurComplementGetKSP(). From file src/ksp/ksp/utils/schurm.c (line 320): > > PetscErrorCode MatSchurComplementGetKSP(Mat S, KSP *ksp) > { > Mat_SchurComplement *Na; > > PetscFunctionBegin; > PetscValidHeaderSpecific(S,MAT_CLASSID,1); > PetscValidPointer(ksp,2); > Na = (Mat_SchurComplement*) S->data; > *ksp = Na->ksp; // <<<<< segfaults on this line, 'Na' is an invalid pointer... > PetscFunctionReturn(0); > } > > This is the stack trace given by valgrind: > > ==13559== Invalid read of size 8 > ==13559== at 0x56B8780: MatSchurComplementGetKSP (schurm.c:320) > ==13559== by 0x55F5B08: PCFieldSplitGetSubKSP_FieldSplit_Schur(_p_PC*, int*, _p_KSP***) (fieldsplit.c:1367) > ==13559== by 0x5605187: PCFieldSplitGetSubKSP (fieldsplit.c:1869) > ==13559== by 0x166305: set_singular_convergence_test (solver-petsc.c:293) > ### irrelevant calls, from inside my program > ==13559== Address 0x6c0 is not stack'd, malloc'd or (recently) free'd > ==13559== > > I tried doing this operation both before and after MatAssembly*() calls, and with both I get the same result. Petsc version is 3.7.5, installed from Ubuntu repository. Is this a bug? Or I am doing it wrong? > > -- > Lucas Clemente Vella > lvella at gmail.com From a.croucher at auckland.ac.nz Mon May 29 17:13:31 2017 From: a.croucher at auckland.ac.nz (Adrian Croucher) Date: Tue, 30 May 2017 10:13:31 +1200 Subject: [petsc-users] DMPlex export to hdf5/vtk for triangle/prism mesh In-Reply-To: References: Message-ID: <819c5072-372d-5f2d-49ef-ca654e96f749@auckland.ac.nz> hi, I was asking about support for exactly these 6-node wedge elements in DMPlex back in January. At the time, there was no support for them. Has there been some progress since then? We are going to need them before we can release our software, which we're aiming to do by the end of the year. 
Cheers, Adrian > Message: 4 > Date: Fri, 26 May 2017 22:40:40 -0500 > From: Matthew Knepley > To: "Fabian.Jakub" > Cc: PETSc > Subject: Re: [petsc-users] DMPlex export to hdf5/vtk for > triangle/prism mesh > Message-ID: > > Content-Type: text/plain; charset="utf-8" > > On Fri, May 26, 2017 at 12:27 PM, Fabian.Jakub < > Fabian.Jakub at physik.uni-muenchen.de> wrote: > > > Dear Petsc Team, > > > > I am playing around with DMPlex, using it to generate the Mesh for the > > ICON weather model(http://doi.org/10.1002/2015MS000431), which employs a > > triangle mesh horizontally and columns, vertically. > > > > This results in a grid, looking like prisms, where top and bottom faces > > are triangles and side faces are rectangles. > > > > I was delighted to see that I could export the triangle DMPlex (2d Mesh) > > to hdf5 and use petsc_gen_xdmf.py to then visualize the mesh in > > visit/paraview. > > This is especially nice when exporting petscsections/vectors directly to > > VTK. > > > > Great. > > > > I then tried the same approach for the prism grid in 3D. > > I attached the code for one single cell, as well as the output in hdf5. > > > > However, trying to convert the hdf5 output, it fails with: > > > > make prism.xmf > > > > $PETSC_DIR/bin/petsc_gen_xdmf.py prism.h5 > > Traceback (most recent call last): > > File > > "/software/meteo/xenial/x86_64/petsc/master/debug_gcc/..// > > bin/petsc_gen_xdmf.py", > > line 241, in > > generateXdmf(f) > > File > > "/software/meteo/xenial/x86_64/petsc/master/debug_gcc/..// > > bin/petsc_gen_xdmf.py", > > line 235, in generateXdmf > > Xdmf(xdmfFilename).write(hdfFilename, topoPath, numCells, > > numCorners, cellDim, geomPath, numVertices, spaceDim, time, vfields, > > cfields) > > File > > "/software/meteo/xenial/x86_64/petsc/master/debug_gcc/..// > > bin/petsc_gen_xdmf.py", > > line 193, in write > > self.writeSpaceGridHeader(fp, numCells, numCorners, cellDim, spaceDim) > > File > > "/software/meteo/xenial/x86_64/petsc/master/debug_gcc/..// > > bin/petsc_gen_xdmf.py", > > line 75, in writeSpaceGridHeader > > ''' % (self.cellMap[cellDim][numCorners], numCells, "XYZ" if > > spaceDim > 2 else "XY")) > > KeyError: 6 > > > > > > Also, if I try to export a vector directly to vtk, visit and paraview > > fail to open it. > > > > My question is: > > Is this a general limitation of these output formats, that I can not mix > > faces with 3 and 4 vertices or is it a limitation of the > > petsc_gen_xdmf.py or the VTK Viewer. > > > > petsc_gen_xdmf. Take a look here > > > https://bitbucket.org/petsc/petsc/src/1731673c3fe570066779d46b51a4aee7a45775ed/bin/petsc_gen_xdmf.py?at=master&fileviewer=file-view-default#petsc_gen_xdmf.py-9 > > This is what fails. You need to add something like > > 6: "Wedge" > > in the dictionary. See http://www.xdmf.org/index.php/XDMF_Model_and_Format > > > > I'd also welcome any thoughts on the prism mesh in general. > > Is it that uncommon to use and do you foresee other complications with it? > > > > You need an element that works with prisms, but it seems you already have > one. I know > there is good work from here: https://arxiv.org/abs/1411.2940 > > > > I fear I cannot change the discretization of the host model but maybe it > > makes sense to use a different grid for my radiative transfer code? > > > > I do not really do RT, but would be happy to try and think about it. 
> > Thanks, > > Matt > > > > Many thanks, > > > > > > Fabian > > > -- Dr Adrian Croucher Senior Research Fellow Department of Engineering Science University of Auckland, New Zealand email: a.croucher at auckland.ac.nz tel: +64 (0)9 923 4611 From Fabian.Jakub at physik.uni-muenchen.de Mon May 29 17:43:11 2017 From: Fabian.Jakub at physik.uni-muenchen.de (Fabian Jakub) Date: Tue, 30 May 2017 00:43:11 +0200 Subject: [petsc-users] DMPlex export to hdf5/vtk for triangle/prism mesh In-Reply-To: <819c5072-372d-5f2d-49ef-ca654e96f749@auckland.ac.nz> References: <819c5072-372d-5f2d-49ef-ca654e96f749@auckland.ac.nz> Message-ID: Hi, I did just as Matt suggested which works nicely... thanks by the way! Inserted in the petsc_gen_xdmf.py the "6: 'Wedge' " entry . Calling the example with: -show_plex hdf5:output.h5 -show_vector hdf5:output.h5::append exports the mesh and a vector to hdf5. Then calling $PETSC_DIR/bin/petsc_gen_xdmf.py output.h5 correctly creates the descriptor file and just loads to visit. Many thanks again to you, Matt :) Fab On 30.05.2017 00:13, Adrian Croucher wrote: > hi, > > I was asking about support for exactly these 6-node wedge elements in > DMPlex back in January. > > At the time, there was no support for them. Has there been some > progress since then? > > We are going to need them before we can release our software, which > we're aiming to do by the end of the year. > > Cheers, Adrian > >> Message: 4 >> Date: Fri, 26 May 2017 22:40:40 -0500 >> From: Matthew Knepley >> To: "Fabian.Jakub" >> Cc: PETSc >> Subject: Re: [petsc-users] DMPlex export to hdf5/vtk for >> triangle/prism mesh >> Message-ID: >> >> Content-Type: text/plain; charset="utf-8" >> >> On Fri, May 26, 2017 at 12:27 PM, Fabian.Jakub < >> Fabian.Jakub at physik.uni-muenchen.de> wrote: >> >> > Dear Petsc Team, >> > >> > I am playing around with DMPlex, using it to generate the Mesh for the >> > ICON weather model(http://doi.org/10.1002/2015MS000431), which >> employs a >> > triangle mesh horizontally and columns, vertically. >> > >> > This results in a grid, looking like prisms, where top and bottom >> faces >> > are triangles and side faces are rectangles. >> > >> > I was delighted to see that I could export the triangle DMPlex (2d >> Mesh) >> > to hdf5 and use petsc_gen_xdmf.py to then visualize the mesh in >> > visit/paraview. >> > This is especially nice when exporting petscsections/vectors >> directly to >> > VTK. >> > >> >> Great. >> >> >> > I then tried the same approach for the prism grid in 3D. >> > I attached the code for one single cell, as well as the output in >> hdf5. 
>> > >> > However, trying to convert the hdf5 output, it fails with: >> > >> > make prism.xmf >> > >> > $PETSC_DIR/bin/petsc_gen_xdmf.py prism.h5 >> > Traceback (most recent call last): >> > File >> > "/software/meteo/xenial/x86_64/petsc/master/debug_gcc/..// >> > bin/petsc_gen_xdmf.py", >> > line 241, in >> > generateXdmf(f) >> > File >> > "/software/meteo/xenial/x86_64/petsc/master/debug_gcc/..// >> > bin/petsc_gen_xdmf.py", >> > line 235, in generateXdmf >> > Xdmf(xdmfFilename).write(hdfFilename, topoPath, numCells, >> > numCorners, cellDim, geomPath, numVertices, spaceDim, time, vfields, >> > cfields) >> > File >> > "/software/meteo/xenial/x86_64/petsc/master/debug_gcc/..// >> > bin/petsc_gen_xdmf.py", >> > line 193, in write >> > self.writeSpaceGridHeader(fp, numCells, numCorners, cellDim, >> spaceDim) >> > File >> > "/software/meteo/xenial/x86_64/petsc/master/debug_gcc/..// >> > bin/petsc_gen_xdmf.py", >> > line 75, in writeSpaceGridHeader >> > ''' % (self.cellMap[cellDim][numCorners], numCells, "XYZ" if >> > spaceDim > 2 else "XY")) >> > KeyError: 6 >> > >> > >> > Also, if I try to export a vector directly to vtk, visit and paraview >> > fail to open it. >> > >> > My question is: >> > Is this a general limitation of these output formats, that I can >> not mix >> > faces with 3 and 4 vertices or is it a limitation of the >> > petsc_gen_xdmf.py or the VTK Viewer. >> > >> >> petsc_gen_xdmf. Take a look here >> >> >> https://bitbucket.org/petsc/petsc/src/1731673c3fe570066779d46b51a4aee7a45775ed/bin/petsc_gen_xdmf.py?at=master&fileviewer=file-view-default#petsc_gen_xdmf.py-9 >> >> >> This is what fails. You need to add something like >> >> 6: "Wedge" >> >> in the dictionary. See >> http://www.xdmf.org/index.php/XDMF_Model_and_Format >> >> >> > I'd also welcome any thoughts on the prism mesh in general. >> > Is it that uncommon to use and do you foresee other complications >> with it? >> > >> >> You need an element that works with prisms, but it seems you already >> have >> one. I know >> there is good work from here: https://arxiv.org/abs/1411.2940 >> >> >> > I fear I cannot change the discretization of the host model but >> maybe it >> > makes sense to use a different grid for my radiative transfer code? >> > >> >> I do not really do RT, but would be happy to try and think about it. >> >> Thanks, >> >> Matt >> >> >> > Many thanks, >> > >> > >> > Fabian >> > >> > From knepley at gmail.com Mon May 29 19:25:59 2017 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 29 May 2017 19:25:59 -0500 Subject: [petsc-users] DMPlex export to hdf5/vtk for triangle/prism mesh In-Reply-To: <819c5072-372d-5f2d-49ef-ca654e96f749@auckland.ac.nz> References: <819c5072-372d-5f2d-49ef-ca654e96f749@auckland.ac.nz> Message-ID: On Mon, May 29, 2017 at 5:13 PM, Adrian Croucher wrote: > hi, > > I was asking about support for exactly these 6-node wedge elements in > DMPlex back in January. > > At the time, there was no support for them. Has there been some progress > since then? > > We are going to need them before we can release our software, which we're > aiming to do by the end of the year. > Sorry about not keeping up to date on that. I had not really thought about it working until Fabian suggested it. So, it looks like XDMF output works. I am making a test now. However, other stuff will not, like refinement, interpolation, cell geometry, and other discretization stuff. What do you need working? 
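For reference, a minimal sketch of that export path, assuming a PETSc build with HDF5, a DMPlex dm built from the wedge cells, a Vec v holding the field, and a petsc_gen_xdmf.py that already has the 6: 'Wedge' entry (names here are only illustrative):

  PetscViewer viewer;

  ierr = PetscObjectSetName((PetscObject) v, "solution");CHKERRQ(ierr); /* dataset name inside the HDF5 file */
  ierr = PetscViewerHDF5Open(PETSC_COMM_WORLD, "output.h5", FILE_MODE_WRITE, &viewer);CHKERRQ(ierr);
  ierr = DMView(dm, viewer);CHKERRQ(ierr);  /* writes the mesh topology and coordinates */
  ierr = VecView(v, viewer);CHKERRQ(ierr);  /* appends the field data to the same file */
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);

Running $PETSC_DIR/bin/petsc_gen_xdmf.py output.h5 afterwards should then produce the .xmf descriptor for ParaView/VisIt, as Fabian reported.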
Thanks, Matt > Cheers, Adrian > > Message: 4 >> Date: Fri, 26 May 2017 22:40:40 -0500 >> From: Matthew Knepley >> To: "Fabian.Jakub" >> Cc: PETSc >> Subject: Re: [petsc-users] DMPlex export to hdf5/vtk for >> triangle/prism mesh >> Message-ID: >> > gmail.com> >> Content-Type: text/plain; charset="utf-8" >> >> >> On Fri, May 26, 2017 at 12:27 PM, Fabian.Jakub < >> Fabian.Jakub at physik.uni-muenchen.de> wrote: >> >> > Dear Petsc Team, >> > >> > I am playing around with DMPlex, using it to generate the Mesh for the >> > ICON weather model(http://doi.org/10.1002/2015MS000431), which employs >> a >> > triangle mesh horizontally and columns, vertically. >> > >> > This results in a grid, looking like prisms, where top and bottom faces >> > are triangles and side faces are rectangles. >> > >> > I was delighted to see that I could export the triangle DMPlex (2d Mesh) >> > to hdf5 and use petsc_gen_xdmf.py to then visualize the mesh in >> > visit/paraview. >> > This is especially nice when exporting petscsections/vectors directly to >> > VTK. >> > >> >> Great. >> >> >> > I then tried the same approach for the prism grid in 3D. >> > I attached the code for one single cell, as well as the output in hdf5. >> > >> > However, trying to convert the hdf5 output, it fails with: >> > >> > make prism.xmf >> > >> > $PETSC_DIR/bin/petsc_gen_xdmf.py prism.h5 >> > Traceback (most recent call last): >> > File >> > "/software/meteo/xenial/x86_64/petsc/master/debug_gcc/..// >> > bin/petsc_gen_xdmf.py", >> > line 241, in >> > generateXdmf(f) >> > File >> > "/software/meteo/xenial/x86_64/petsc/master/debug_gcc/..// >> > bin/petsc_gen_xdmf.py", >> > line 235, in generateXdmf >> > Xdmf(xdmfFilename).write(hdfFilename, topoPath, numCells, >> > numCorners, cellDim, geomPath, numVertices, spaceDim, time, vfields, >> > cfields) >> > File >> > "/software/meteo/xenial/x86_64/petsc/master/debug_gcc/..// >> > bin/petsc_gen_xdmf.py", >> > line 193, in write >> > self.writeSpaceGridHeader(fp, numCells, numCorners, cellDim, >> spaceDim) >> > File >> > "/software/meteo/xenial/x86_64/petsc/master/debug_gcc/..// >> > bin/petsc_gen_xdmf.py", >> > line 75, in writeSpaceGridHeader >> > ''' % (self.cellMap[cellDim][numCorners], numCells, "XYZ" if >> > spaceDim > 2 else "XY")) >> > KeyError: 6 >> > >> > >> > Also, if I try to export a vector directly to vtk, visit and paraview >> > fail to open it. >> > >> > My question is: >> > Is this a general limitation of these output formats, that I can not mix >> > faces with 3 and 4 vertices or is it a limitation of the >> > petsc_gen_xdmf.py or the VTK Viewer. >> > >> >> petsc_gen_xdmf. Take a look here >> >> >> https://bitbucket.org/petsc/petsc/src/1731673c3fe570066779d4 >> 6b51a4aee7a45775ed/bin/petsc_gen_xdmf.py?at=master& >> fileviewer=file-view-default#petsc_gen_xdmf.py-9 >> >> This is what fails. You need to add something like >> >> 6: "Wedge" >> >> in the dictionary. See http://www.xdmf.org/index.php/ >> XDMF_Model_and_Format >> >> >> > I'd also welcome any thoughts on the prism mesh in general. >> > Is it that uncommon to use and do you foresee other complications with >> it? >> > >> >> You need an element that works with prisms, but it seems you already have >> one. I know >> there is good work from here: https://arxiv.org/abs/1411.2940 >> >> >> > I fear I cannot change the discretization of the host model but maybe it >> > makes sense to use a different grid for my radiative transfer code? >> > >> >> I do not really do RT, but would be happy to try and think about it. 
>> >> Thanks, >> >> Matt >> >> >> > Many thanks, >> > >> > >> > Fabian >> > >> >> > -- > Dr Adrian Croucher > Senior Research Fellow > Department of Engineering Science > University of Auckland, New Zealand > email: a.croucher at auckland.ac.nz > tel: +64 (0)9 923 4611 > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 29 19:27:49 2017 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 29 May 2017 19:27:49 -0500 Subject: [petsc-users] DMPlex export to hdf5/vtk for triangle/prism mesh In-Reply-To: References: <819c5072-372d-5f2d-49ef-ca654e96f749@auckland.ac.nz> Message-ID: On Mon, May 29, 2017 at 5:43 PM, Fabian Jakub < Fabian.Jakub at physik.uni-muenchen.de> wrote: > Hi, > > I did just as Matt suggested which works nicely... thanks by the way! > > Inserted in the petsc_gen_xdmf.py the "6: 'Wedge' " entry . > > Calling the example with: > > -show_plex hdf5:output.h5 -show_vector > hdf5:output.h5::append > > exports the mesh and a vector to hdf5. > > Then calling > > $PETSC_DIR/bin/petsc_gen_xdmf.py output.h5 > > correctly creates the descriptor file and just loads to visit. > > > Many thanks again to you, Matt :) > Great! I will make a test and push it soon. I'll put you on the ChangeSet. Thanks, Matt > Fab > > > > On 30.05.2017 00:13, Adrian Croucher wrote: > >> hi, >> >> I was asking about support for exactly these 6-node wedge elements in >> DMPlex back in January. >> >> At the time, there was no support for them. Has there been some progress >> since then? >> >> We are going to need them before we can release our software, which we're >> aiming to do by the end of the year. >> >> Cheers, Adrian >> >> Message: 4 >>> Date: Fri, 26 May 2017 22:40:40 -0500 >>> From: Matthew Knepley >>> To: "Fabian.Jakub" >>> Cc: PETSc >>> Subject: Re: [petsc-users] DMPlex export to hdf5/vtk for >>> triangle/prism mesh >>> Message-ID: >>> >>> Content-Type: text/plain; charset="utf-8" >>> >>> On Fri, May 26, 2017 at 12:27 PM, Fabian.Jakub < >>> Fabian.Jakub at physik.uni-muenchen.de> wrote: >>> >>> > Dear Petsc Team, >>> > >>> > I am playing around with DMPlex, using it to generate the Mesh for the >>> > ICON weather model(http://doi.org/10.1002/2015MS000431), which >>> employs a >>> > triangle mesh horizontally and columns, vertically. >>> > >>> > This results in a grid, looking like prisms, where top and bottom faces >>> > are triangles and side faces are rectangles. >>> > >>> > I was delighted to see that I could export the triangle DMPlex (2d >>> Mesh) >>> > to hdf5 and use petsc_gen_xdmf.py to then visualize the mesh in >>> > visit/paraview. >>> > This is especially nice when exporting petscsections/vectors directly >>> to >>> > VTK. >>> > >>> >>> Great. >>> >>> >>> > I then tried the same approach for the prism grid in 3D. >>> > I attached the code for one single cell, as well as the output in hdf5. 
>>> > >>> > However, trying to convert the hdf5 output, it fails with: >>> > >>> > make prism.xmf >>> > >>> > $PETSC_DIR/bin/petsc_gen_xdmf.py prism.h5 >>> > Traceback (most recent call last): >>> > File >>> > "/software/meteo/xenial/x86_64/petsc/master/debug_gcc/..// >>> > bin/petsc_gen_xdmf.py", >>> > line 241, in >>> > generateXdmf(f) >>> > File >>> > "/software/meteo/xenial/x86_64/petsc/master/debug_gcc/..// >>> > bin/petsc_gen_xdmf.py", >>> > line 235, in generateXdmf >>> > Xdmf(xdmfFilename).write(hdfFilename, topoPath, numCells, >>> > numCorners, cellDim, geomPath, numVertices, spaceDim, time, vfields, >>> > cfields) >>> > File >>> > "/software/meteo/xenial/x86_64/petsc/master/debug_gcc/..// >>> > bin/petsc_gen_xdmf.py", >>> > line 193, in write >>> > self.writeSpaceGridHeader(fp, numCells, numCorners, cellDim, >>> spaceDim) >>> > File >>> > "/software/meteo/xenial/x86_64/petsc/master/debug_gcc/..// >>> > bin/petsc_gen_xdmf.py", >>> > line 75, in writeSpaceGridHeader >>> > ''' % (self.cellMap[cellDim][numCorners], numCells, "XYZ" if >>> > spaceDim > 2 else "XY")) >>> > KeyError: 6 >>> > >>> > >>> > Also, if I try to export a vector directly to vtk, visit and paraview >>> > fail to open it. >>> > >>> > My question is: >>> > Is this a general limitation of these output formats, that I can not >>> mix >>> > faces with 3 and 4 vertices or is it a limitation of the >>> > petsc_gen_xdmf.py or the VTK Viewer. >>> > >>> >>> petsc_gen_xdmf. Take a look here >>> >>> >>> https://bitbucket.org/petsc/petsc/src/1731673c3fe570066779d4 >>> 6b51a4aee7a45775ed/bin/petsc_gen_xdmf.py?at=master& >>> fileviewer=file-view-default#petsc_gen_xdmf.py-9 >>> >>> This is what fails. You need to add something like >>> >>> 6: "Wedge" >>> >>> in the dictionary. See http://www.xdmf.org/index.php/ >>> XDMF_Model_and_Format >>> >>> >>> > I'd also welcome any thoughts on the prism mesh in general. >>> > Is it that uncommon to use and do you foresee other complications with >>> it? >>> > >>> >>> You need an element that works with prisms, but it seems you already have >>> one. I know >>> there is good work from here: https://arxiv.org/abs/1411.2940 >>> >>> >>> > I fear I cannot change the discretization of the host model but maybe >>> it >>> > makes sense to use a different grid for my radiative transfer code? >>> > >>> >>> I do not really do RT, but would be happy to try and think about it. >>> >>> Thanks, >>> >>> Matt >>> >>> >>> > Many thanks, >>> > >>> > >>> > Fabian >>> > >>> >>> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From a.croucher at auckland.ac.nz Mon May 29 20:58:20 2017 From: a.croucher at auckland.ac.nz (Adrian Croucher) Date: Tue, 30 May 2017 13:58:20 +1200 Subject: [petsc-users] DMPlex export to hdf5/vtk for triangle/prism mesh In-Reply-To: References: <819c5072-372d-5f2d-49ef-ca654e96f749@auckland.ac.nz> Message-ID: <472c6367-4a84-897e-ab9f-7444d52dbe7d@auckland.ac.nz> On 30/05/17 12:25, Matthew Knepley wrote: > > Sorry about not keeping up to date on that. I had not really thought > about it working until Fabian suggested it. > So, it looks like XDMF output works. I am making a test now. > > However, other stuff will not, like refinement, interpolation, cell > geometry, and other discretization stuff. > > What do you need working? 
We'll definitely need interpolation and cell geometry, but that might be about it. We won't need refinement. - Adrian -- Dr Adrian Croucher Senior Research Fellow Department of Engineering Science University of Auckland, New Zealand email: a.croucher at auckland.ac.nz tel: +64 (0)9 923 4611 -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon May 29 21:45:50 2017 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 29 May 2017 21:45:50 -0500 Subject: [petsc-users] DMPlex export to hdf5/vtk for triangle/prism mesh In-Reply-To: <472c6367-4a84-897e-ab9f-7444d52dbe7d@auckland.ac.nz> References: <819c5072-372d-5f2d-49ef-ca654e96f749@auckland.ac.nz> <472c6367-4a84-897e-ab9f-7444d52dbe7d@auckland.ac.nz> Message-ID: On Mon, May 29, 2017 at 8:58 PM, Adrian Croucher wrote: > On 30/05/17 12:25, Matthew Knepley wrote: > > > Sorry about not keeping up to date on that. I had not really thought about > it working until Fabian suggested it. > So, it looks like XDMF output works. I am making a test now. > > However, other stuff will not, like refinement, interpolation, cell > geometry, and other discretization stuff. > > What do you need working? > > > We'll definitely need interpolation and cell geometry, but that might be > about it. We won't need refinement. > What kind of basis are you expecting? A tensor product? Thanks, Matt > > - Adrian > > -- > Dr Adrian Croucher > Senior Research Fellow > Department of Engineering Science > University of Auckland, New Zealand > email: a.croucher at auckland.ac.nz > tel: +64 (0)9 923 4611 <+64%209-923%204611> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From a.croucher at auckland.ac.nz Mon May 29 21:52:32 2017 From: a.croucher at auckland.ac.nz (Adrian Croucher) Date: Tue, 30 May 2017 14:52:32 +1200 Subject: [petsc-users] DMPlex export to hdf5/vtk for triangle/prism mesh In-Reply-To: References: <819c5072-372d-5f2d-49ef-ca654e96f749@auckland.ac.nz> <472c6367-4a84-897e-ab9f-7444d52dbe7d@auckland.ac.nz> Message-ID: <7b5464bd-9517-f162-cf6c-4821589c19ca@auckland.ac.nz> On 30/05/17 14:45, Matthew Knepley wrote: > > > What kind of basis are you expecting? A tensor product? At present we don't even need basis functions, because we're just doing flow simulation and it's all finite volume. However further down the track we will also be doing rock mechanics on the same mesh, using finite elements. For that, tensor product basis would be fine. - Adrian -- Dr Adrian Croucher Senior Research Fellow Department of Engineering Science University of Auckland, New Zealand email: a.croucher at auckland.ac.nz tel: +64 (0)9 923 4611 -------------- next part -------------- An HTML attachment was scrubbed... 
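A minimal sketch of pulling per-cell centroids and volumes for the finite volume part, assuming dm is the wedge DMPlex and that DMPlexComputeCellGeometryFVM() ends up supporting these cells (variable names below are only for illustration):

  PetscInt  cStart, cEnd, c;
  PetscReal vol, centroid[3], normal[3];

  ierr = DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd);CHKERRQ(ierr); /* height 0 = cells */
  for (c = cStart; c < cEnd; ++c) {
    ierr = DMPlexComputeCellGeometryFVM(dm, c, &vol, centroid, normal);CHKERRQ(ierr);
    /* vol and centroid feed the finite volume discretization */
  }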
URL: From knepley at gmail.com Mon May 29 21:55:44 2017 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 29 May 2017 21:55:44 -0500 Subject: [petsc-users] DMPlex export to hdf5/vtk for triangle/prism mesh In-Reply-To: <7b5464bd-9517-f162-cf6c-4821589c19ca@auckland.ac.nz> References: <819c5072-372d-5f2d-49ef-ca654e96f749@auckland.ac.nz> <472c6367-4a84-897e-ab9f-7444d52dbe7d@auckland.ac.nz> <7b5464bd-9517-f162-cf6c-4821589c19ca@auckland.ac.nz> Message-ID: On Mon, May 29, 2017 at 9:52 PM, Adrian Croucher wrote: > On 30/05/17 14:45, Matthew Knepley wrote: > > > > What kind of basis are you expecting? A tensor product? > > > At present we don't even need basis functions, because we're just doing > flow simulation and it's all finite volume. > Okay good. Now for cell geometry. What kind of deformation do you allow in the wedge? and what do you want to know? For FV, we are providing the centroid and volume. If that is enough, we could be done quickly. Thanks, Matt > However further down the track we will also be doing rock mechanics on the > same mesh, using finite elements. For that, tensor product basis would be > fine. > > - Adrian > > -- > Dr Adrian Croucher > Senior Research Fellow > Department of Engineering Science > University of Auckland, New Zealand > email: a.croucher at auckland.ac.nz > tel: +64 (0)9 923 4611 <+64%209-923%204611> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From leejearl at 126.com Mon May 29 21:59:27 2017 From: leejearl at 126.com (leejearl) Date: Tue, 30 May 2017 10:59:27 +0800 Subject: [petsc-users] a question about PetscSectionCreate In-Reply-To: References: <9715fa58-bf80-7aca-d01a-c74cdcde5701@126.com> Message-ID: <224c1cd2-175f-9912-b692-7264a2dabb7b@126.com> Thanks for your kind reply. It helps me very much. leejearl On 2017?05?29? 15:47, Dave May wrote: > > On Mon, 29 May 2017 at 08:39, leejearl > wrote: > > Hi, all: > I have create a IS for every cell in dmplex by the following steps: > 1. Creating a integer array which size is matched to the number of > cells. > 2. Use the routine "ISCreateGeneral" to create a corresponding IS. > > Is there any routine which can create a IS for every cell in the > dmplex directly?, > > > I don't think so as Plex would have to somehow know what geom quantity > to use to define the size of IS (e.g. vertex, cell, face, edge) > > and what is the difference between ISCopy() and ISDuplicate()? > > > ISDuplicate allocates memory for a new with the same comm and layout > as the original IS AND copies values from the original IS into the new > one. (Note that this is slightly different from other duplicate > functions like VecDuplicate which only allocate memory and does not > copy values from the orig vec.) > > ISCopy does not allocate memory for the IS (passed as the second arg), > it only performs the copy of values. > > Thanks > Dave > > > > Thanks, > leejearl > > > On 2017?05?28? 19:35, Matthew Knepley wrote: >> On Sun, May 28, 2017 at 6:02 AM, Lawrence Mitchell >> > > wrote: >> >> >> >> > On 28 May 2017, at 09:16, leejearl > > wrote: >> > >> > Hi, Dave: I want to store a PetscInt tag for every cell of >> the dmplex with the struct. Thanks, >> >> You probably want to use a DMLabel to store these ids. Unless >> you have a different I'd for every cell. 
>> >> >> Several things to think about: >> >> 1) If you want to store a tag for EVERY cell, then just use an >> IS. Cell numberings are guaranteed to be >> contiguous and start from 0. >> >> 2) If you want to tag only SOME cells, then use a DMLabel as >> Lawrence suggests. This uses hash tables >> for fast construction, and sorted lists for fast search and >> retrieval. >> >> 3) If you want to store a VARIABLE number of data items per cell, >> then use a Section and an array that you allocate. >> >> Matt >> >> >> Lawrence >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which their experiments lead. >> -- Norbert Wiener >> >> http://www.caam.rice.edu/~mk51/ > > -- > ?? > ??????????????? > Phone: 17792092487 > QQ: 188524324 > -- ?? ??????????????? Phone: 17792092487 QQ: 188524324 -------------- next part -------------- An HTML attachment was scrubbed... URL: From a.croucher at auckland.ac.nz Mon May 29 22:06:28 2017 From: a.croucher at auckland.ac.nz (Adrian Croucher) Date: Tue, 30 May 2017 15:06:28 +1200 Subject: [petsc-users] DMPlex export to hdf5/vtk for triangle/prism mesh In-Reply-To: References: <819c5072-372d-5f2d-49ef-ca654e96f749@auckland.ac.nz> <472c6367-4a84-897e-ab9f-7444d52dbe7d@auckland.ac.nz> <7b5464bd-9517-f162-cf6c-4821589c19ca@auckland.ac.nz> Message-ID: <85f0dc24-40c7-738b-c220-d94c8a14a32e@auckland.ac.nz> On 30/05/17 14:55, Matthew Knepley wrote: > On Mon, May 29, 2017 at 9:52 PM, Adrian Croucher > > wrote: > > On 30/05/17 14:45, Matthew Knepley wrote: > >> >> >> What kind of basis are you expecting? A tensor product? > > At present we don't even need basis functions, because we're just > doing flow simulation and it's all finite volume. > > > Okay good. Now for cell geometry. What kind of deformation do you > allow in the wedge? As in Fabian's application, these elements arise from meshes which have a simple layered structure in the vertical, but are unstructured in the horizontal (can be mixtures of quads and triangles in our case- in fact the triangles usually only occur where there is local refinement). So for us these wedges are just horizontal triangles projected downwards in the vertical- not really deformed at all. > and what do you want > to know? For FV, we are providing the centroid and volume. If that is > enough, we could be done quickly. Yes, just centroid and volume would be enough. - Adrian -- Dr Adrian Croucher Senior Research Fellow Department of Engineering Science University of Auckland, New Zealand email: a.croucher at auckland.ac.nz tel: +64 (0)9 923 4611 -------------- next part -------------- An HTML attachment was scrubbed... URL: From franck.houssen at inria.fr Tue May 30 02:14:58 2017 From: franck.houssen at inria.fr (Franck Houssen) Date: Tue, 30 May 2017 09:14:58 +0200 (CEST) Subject: [petsc-users] How to VecView with a formatted precision (%10.8f) ? In-Reply-To: <87shjrwmsx.fsf@jedbrown.org> References: <1559132119.8316480.1495792332227.JavaMail.zimbra@inria.fr> <87shjrwmsx.fsf@jedbrown.org> Message-ID: <1788087205.403799.1496128498132.JavaMail.zimbra@inria.fr> Mainly for debugging purposes: controlling format/precision could be convenient ! Franck ~> mpirun -n 5 ./vecViewPrecision.exe Vec Object: 5 MPI processes type: mpi Process [0] 0. 0. 
Process [1] 1.23457e+06 -8.1e-07 Process [2] 2.46914e-06 -1.62e+06 Process [3] 3.7037e+06 -2.43e-06 Process [4] 4.93827e-06 -3.24e+06 ----- Mail original ----- > De: "Jed Brown" > ?: "Franck Houssen" , "PETSc users list" > Envoy?: Vendredi 26 Mai 2017 19:27:42 > Objet: Re: [petsc-users] How to VecView with a formatted precision (%10.8f) ? > > No, but this could be added to the ASCII viewer. Why do you want it? > > Franck Houssen writes: > > > How to VecView with a formatted precision (%10.8f) ? Not possible ? > > > > Franck > -------------- next part -------------- A non-text attachment was scrubbed... Name: vecViewPrecision.cpp Type: text/x-c++src Size: 772 bytes Desc: not available URL: From franck.houssen at inria.fr Tue May 30 02:21:47 2017 From: franck.houssen at inria.fr (Franck Houssen) Date: Tue, 30 May 2017 09:21:47 +0200 (CEST) Subject: [petsc-users] Must I destroy the local matrix I have (created and) set with MatISSetLocalMat ? In-Reply-To: <872046060.405248.1496128843952.JavaMail.zimbra@inria.fr> Message-ID: <1901003584.405720.1496128907192.JavaMail.zimbra@inria.fr> Must I destroy the local matrix I have (created and) set with MatISSetLocalMat ? Franck -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue May 30 06:10:43 2017 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 30 May 2017 06:10:43 -0500 Subject: [petsc-users] Must I destroy the local matrix I have (created and) set with MatISSetLocalMat ? In-Reply-To: <1901003584.405720.1496128907192.JavaMail.zimbra@inria.fr> References: <872046060.405248.1496128843952.JavaMail.zimbra@inria.fr> <1901003584.405720.1496128907192.JavaMail.zimbra@inria.fr> Message-ID: On Tue, May 30, 2017 at 2:21 AM, Franck Houssen wrote: > Must I destroy the local matrix I have (created and) set with > MatISSetLocalMat ? > Yes. Matt > Franck > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue May 30 10:36:14 2017 From: jed at jedbrown.org (Jed Brown) Date: Tue, 30 May 2017 09:36:14 -0600 Subject: [petsc-users] Must I destroy the local matrix I have (created and) set with MatISSetLocalMat ? In-Reply-To: <1901003584.405720.1496128907192.JavaMail.zimbra@inria.fr> References: <1901003584.405720.1496128907192.JavaMail.zimbra@inria.fr> Message-ID: <87r2z6qrv5.fsf@jedbrown.org> Franck Houssen writes: > Must I destroy the local matrix I have (created and) set with MatISSetLocalMat ? The implementation references the local matrix so you need to destroy your copy. This pattern is always used when setting sub-objects like this. static PetscErrorCode MatISSetLocalMat_IS(Mat mat,Mat local) { Mat_IS *is = (Mat_IS*)mat->data; PetscInt nrows,ncols,orows,ocols; PetscErrorCode ierr; PetscFunctionBegin; if (is->A) { ierr = MatGetSize(is->A,&orows,&ocols);CHKERRQ(ierr); ierr = MatGetSize(local,&nrows,&ncols);CHKERRQ(ierr); if (orows != nrows || ocols != ncols) SETERRQ4(PETSC_COMM_SELF,PETSC_ERR_ARG_SIZ,"Local MATIS matrix should be of size %Dx%D (you passed a %Dx%D matrix)",orows,ocols,nrows,ncols); } ierr = PetscObjectReference((PetscObject)local);CHKERRQ(ierr); ierr = MatDestroy(&is->A);CHKERRQ(ierr); is->A = local; PetscFunctionReturn(0); } -------------- next part -------------- A non-text attachment was scrubbed... 
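A minimal sketch of that pattern from the caller's side, assuming A is a MATIS created with MatCreateIS() and that nloc/nz stand in for the local size and preallocation:

  Mat Aloc;

  ierr = MatCreateSeqAIJ(PETSC_COMM_SELF, nloc, nloc, nz, NULL, &Aloc);CHKERRQ(ierr);
  /* ... MatSetValues() on Aloc, then MatAssemblyBegin/End(Aloc, MAT_FINAL_ASSEMBLY) ... */
  ierr = MatISSetLocalMat(A, Aloc);CHKERRQ(ierr); /* the MATIS takes its own reference */
  ierr = MatDestroy(&Aloc);CHKERRQ(ierr);         /* drops only our reference; A keeps the local matrix alive */

Because the setter bumps the reference count, destroying the caller's handle does not free the local matrix; it is released when the MATIS itself is destroyed.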
Name: signature.asc Type: application/pgp-signature Size: 832 bytes Desc: not available URL: From bsmith at mcs.anl.gov Tue May 30 13:22:16 2017 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 30 May 2017 13:22:16 -0500 Subject: [petsc-users] How to VecView with a formatted precision (%10.8f) ? In-Reply-To: <1788087205.403799.1496128498132.JavaMail.zimbra@inria.fr> References: <1559132119.8316480.1495792332227.JavaMail.zimbra@inria.fr> <87shjrwmsx.fsf@jedbrown.org> <1788087205.403799.1496128498132.JavaMail.zimbra@inria.fr> Message-ID: <1124CE1F-F1DC-4772-8ACB-932244B3518E@mcs.anl.gov> When I want "full precision" for debugging purposes I use PetscViewerPushFormat(viewer,PETSC_VIEWER_ASCII_MATLAB); > On May 30, 2017, at 2:14 AM, Franck Houssen wrote: > > Mainly for debugging purposes: controlling format/precision could be convenient ! > > Franck > > ~> mpirun -n 5 ./vecViewPrecision.exe > Vec Object: 5 MPI processes > type: mpi > Process [0] > 0. > 0. > Process [1] > 1.23457e+06 > -8.1e-07 > Process [2] > 2.46914e-06 > -1.62e+06 > Process [3] > 3.7037e+06 > -2.43e-06 > Process [4] > 4.93827e-06 > -3.24e+06 > > > ----- Mail original ----- >> De: "Jed Brown" >> ?: "Franck Houssen" , "PETSc users list" >> Envoy?: Vendredi 26 Mai 2017 19:27:42 >> Objet: Re: [petsc-users] How to VecView with a formatted precision (%10.8f) ? >> >> No, but this could be added to the ASCII viewer. Why do you want it? >> >> Franck Houssen writes: >> >>> How to VecView with a formatted precision (%10.8f) ? Not possible ? >>> >>> Franck >> > From j.pogacnik at auckland.ac.nz Tue May 30 22:19:34 2017 From: j.pogacnik at auckland.ac.nz (Justin Pogacnik) Date: Wed, 31 May 2017 03:19:34 +0000 Subject: [petsc-users] PetscFECreateDefault in Fortran Message-ID: <1496200773990.42892@auckland.ac.nz> Hello, I'm developing a finite element code in fortran 90. I recently updated my PETSc and am now getting the following error during compile/linking on an existing application: Undefined symbols for architecture x86_64: "_petscfecreatedefault_", referenced from: _MAIN__ in fe_test.o ld: symbol(s) not found for architecture x86_64 collect2: error: ld returned 1 exit status make: *** [dist/fe_test] Error 1 I'm running Mac OS X Yosemite (10.10.5). I've created a "minimum working example" (attached) that re-creates the problem. It's basically just dm/impls/plex/examples/tutorials/ex3f90, but tries to create a PetscFE object. Everything goes fine and the DM looks like what is expected if PetscFECreateDefault is commented out. Any idea what am I missing? Many thanks! Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: fe_test.F90 Type: application/octet-stream Size: 1679 bytes Desc: fe_test.F90 URL: From lirui319 at hnu.edu.cn Wed May 31 03:29:55 2017 From: lirui319 at hnu.edu.cn (=?GBK?B?wO7I8A==?=) Date: Wed, 31 May 2017 16:29:55 +0800 (GMT+08:00) Subject: [petsc-users] Installation Error In-Reply-To: References: <15e1cc1.5bb6.15c38e117d7.Coremail.lirui319@hnu.edu.cn> Message-ID: <1ee9c6f.7d88.15c5da025b7.Coremail.lirui319@hnu.edu.cn> this problem was already approached.Thank you for your help! :) ?2017-05-24 20:57:09,????? > What do you have for: > > which python > echo $PYTHONPATH > > > The following might work.. > > PYTHONPATH='' /usr/bin/python ./configure --with-cc=gcc --with-cxx=0 --with-fc=0 --download-f2cblaslapack --download-mpich > > Satish > > > On Wed, 24 May 2017, ?? 
wrote: > > > > > Dear professor or engineer: > > I meet a problem about installation to petsc. > > When I type the code "./configure --with-cc=gcc --with-cxx=0 --with-fc=0 --download-f2cblaslapack --download-mpich" on my terminal,the answer reveals the following results. > > > > >>>ERROR:root:code for hash md5 was not found. > > Traceback (most recent call last): > > File "/home/zhuizhuluori/lirui/software/vapor-2.5.0-Linux_x86_64/vapor/vapor-2.5.0/lib/python2.7/hashlib.py", line 139, in > globals()[__func_name] = __get_hash(__func_name) > > File "/home/zhuizhuluori/lirui/software/vapor-2.5.0-Linux_x86_64/vapor/vapor-2.5.0/lib/python2.7/hashlib.py", line 91, in __get_builtin_constructor > > raise ValueError('unsupported hash type ' + name) > > ValueError: unsupported hash type md5 > > ERROR:root:code for hash sha1 was not found ..... > > > > I have used petsc for a long time,and never see the this problem.my laptop is installed an old version of petsc and I wanna change it to a new version.How can I fix it?Thanks for your heartful suggestion! > > > > > > > > > > > > From knepley at gmail.com Wed May 31 07:53:16 2017 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 May 2017 07:53:16 -0500 Subject: [petsc-users] PetscFECreateDefault in Fortran In-Reply-To: <1496200773990.42892@auckland.ac.nz> References: <1496200773990.42892@auckland.ac.nz> Message-ID: On Tue, May 30, 2017 at 10:19 PM, Justin Pogacnik wrote: > Hello, > > I'm developing a finite element code in fortran 90. I recently updated my > PETSc and am now getting the following error during compile/linking on an > existing application: > > Undefined symbols for architecture x86_64: > > "_petscfecreatedefault_", referenced from: > > _MAIN__ in fe_test.o > > ld: symbol(s) not found for architecture x86_64 > > collect2: error: ld returned 1 exit status > > make: *** [dist/fe_test] Error 1 > > > I'm running Mac OS X Yosemite (10.10.5). I've created a "minimum working > example" (attached) that re-creates the problem. It's basically > just dm/impls/plex/examples/tutorials/ex3f90, but tries to create a > PetscFE object. Everything goes fine and the DM looks like what is expected > if PetscFECreateDefault is commented out. Any idea what am I missing? > Yes, I had not made a Fortran binding for this function. I will do it now. Thanks, Matt > Many thanks! > > Justin > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed May 31 08:34:22 2017 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 May 2017 08:34:22 -0500 Subject: [petsc-users] PetscFECreateDefault in Fortran In-Reply-To: References: <1496200773990.42892@auckland.ac.nz> Message-ID: On Wed, May 31, 2017 at 7:53 AM, Matthew Knepley wrote: > On Tue, May 30, 2017 at 10:19 PM, Justin Pogacnik < > j.pogacnik at auckland.ac.nz> wrote: > >> Hello, >> >> I'm developing a finite element code in fortran 90. 
I recently updated my >> PETSc and am now getting the following error during compile/linking on an >> existing application: >> >> Undefined symbols for architecture x86_64: >> >> "_petscfecreatedefault_", referenced from: >> >> _MAIN__ in fe_test.o >> >> ld: symbol(s) not found for architecture x86_64 >> >> collect2: error: ld returned 1 exit status >> >> make: *** [dist/fe_test] Error 1 >> >> >> I'm running Mac OS X Yosemite (10.10.5). I've created a "minimum working >> example" (attached) that re-creates the problem. It's basically >> just dm/impls/plex/examples/tutorials/ex3f90, but tries to create a >> PetscFE object. Everything goes fine and the DM looks like what is expected >> if PetscFECreateDefault is commented out. Any idea what am I missing? >> > Yes, I had not made a Fortran binding for this function. I will do it now. > I have merged it to the 'next' branch, and it will be in 'master' soon. Thanks, Matt > Thanks, > > Matt > > >> Many thanks! >> >> Justin >> >> >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > http://www.caam.rice.edu/~mk51/ > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From franck.houssen at inria.fr Wed May 31 10:59:53 2017 From: franck.houssen at inria.fr (Franck Houssen) Date: Wed, 31 May 2017 17:59:53 +0200 (CEST) Subject: [petsc-users] If I use MatISSetLocalMat with a MatCreateSeqAIJ local matrix, do I need to use MatISSetPreallocation for the global matrix ? In-Reply-To: <2045825082.1144082.1496246089403.JavaMail.zimbra@inria.fr> Message-ID: <1636134414.1146088.1496246393744.JavaMail.zimbra@inria.fr> If I use MatISSetLocalMat with a preallocated MatCreateSeqAIJ local matrix, do I need to use MatISSetPreallocation for the global matrix ? Here is the pseudo-code: MatCreateIS(PETSC_COMM_WORLD, ..., &globalMat) MatISSetPreallocation(globalMatrix, ...) // Is this necessary ? MatCreateSeqAIJ(PETSC_COMM_SELF, ..., &localMatrix) // Prealloc done on the fly MatSetValues(localMatrix, ...) MatISSetLocalMat(globalMatrix, localMatrix) Is it necessary to call MatISSetPreallocation for globalMatrix ? (prealloc should have been done locally for each local matrix, no ?) Franck -------------- next part -------------- An HTML attachment was scrubbed... URL: From franck.houssen at inria.fr Wed May 31 11:22:00 2017 From: franck.houssen at inria.fr (Franck Houssen) Date: Wed, 31 May 2017 18:22:00 +0200 (CEST) Subject: [petsc-users] When using MatIS, do I need to call MatAssemblyBegin/End between MatISSetLocalMat (local set) and MatISGetMPIXAIJ (get global assembly) ? In-Reply-To: <1834116955.1151067.1496247616349.JavaMail.zimbra@inria.fr> Message-ID: <1040288737.1151290.1496247720613.JavaMail.zimbra@inria.fr> When using MatIS, do I need to call MatAssemblyBegin/End between MatISSetLocalMat (local set) and MatISGetMPIXAIJ (get global assembly) ? Franck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kannanr at ornl.gov Wed May 31 14:46:18 2017 From: kannanr at ornl.gov (Kannan, Ramakrishnan) Date: Wed, 31 May 2017 19:46:18 +0000 Subject: [petsc-users] slepc on 1D row distributed matrix In-Reply-To: References: Message-ID: <628DF9C9-8C85-4B0E-AE88-CCD2432008C7@ornl.gov> Hello, I have got a sparse 1D row distributed matrix in which every MPI process owns an m/p x n of the global matrix mxn. I am running NHEP with krylovschur on it. It is throwing me some wrong error. For your reference, I have attached the modified ex5.c in which I SetSizes on the matrix to emulate the 1D row distribution and the log file with the error. In the unmodified ex5.c, for m=5, N=15, the local_m and the local_n is 3x3. How is the global 15x15 matrix distributed locally as 3x3 matrices? When I print the global matrix, it doesn?t appear to be diagonal as well. If slepc doesn?t support sparse 1D row distributed matrix, how do I need to redistribute it such that I can run NHEP on this. -- Regards, Ramki -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex5.c Type: application/octet-stream Size: 7780 bytes Desc: ex5.c URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: slepc.o607511 Type: application/octet-stream Size: 26570 bytes Desc: slepc.o607511 URL: From jed at jedbrown.org Wed May 31 15:06:11 2017 From: jed at jedbrown.org (Jed Brown) Date: Wed, 31 May 2017 17:36:11 -0230 Subject: [petsc-users] PETSc User Meeting 2017, June 14-16 in Boulder, Colorado In-Reply-To: <87shjsxmyh.fsf@jedbrown.org> References: <87y3wbtk1i.fsf@jedbrown.org> <87shjsxmyh.fsf@jedbrown.org> Message-ID: <87zidsokp8.fsf@jedbrown.org> Correction: it is still possible to book lodging today (closes at midnight Mountain Time). See you in two short weeks. Thanks! Jed Brown writes: > The program is up on the website: > > https://www.mcs.anl.gov/petsc/meetings/2017/ > > If you haven't registered yet, we can still accommodate you, but please > register soon. If you haven't booked lodging, please do that soon -- > the on-campus lodging option will close on *Tuesday, May 30*. > > https://confreg.colorado.edu/CSM2017 > > We are looking forward to seeing you in Boulder! > > Jed Brown writes: > >> We'd like to invite you to join us at the 2017 PETSc User Meeting held >> at the University of Colorado Boulder on June 14-16, 2017. >> >> http://www.mcs.anl.gov/petsc/meetings/2017/ >> >> The first day consists of tutorials on various aspects and features of >> PETSc. The second and third days will be devoted to exchange, >> discussions, and a refinement of strategies for the future with our >> users. We encourage you to present work illustrating your own use of >> PETSc, for example in applications or in libraries built on top of >> PETSc. >> >> Registration for the PETSc User Meeting 2017 is free for students and >> $75 for non-students. We can host a maximum of 150 participants, so >> register soon (and by May 15). >> >> http://www.eventzilla.net/web/e/petsc-user-meeting-2017-2138890185 >> >> We are also offering low-cost lodging on campus. A lodging registration >> site will be available soon and announced here and on the website. >> >> Thanks to the generosity of Intel, we will be able to offer a limited >> number of student travel grants. We are also soliciting additional >> sponsors -- please contact us if you are interested. >> >> >> We are looking forward to seeing you in Boulder! 
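For comparison, a minimal sketch of the same quadrature query from C, which may help narrow down whether the trouble is in the Fortran interface itself; this assumes the current C signature of PetscQuadratureGetData() (dimension, number of components, number of points, then the two borrowed arrays), an existing PetscFE fe, and the usual ierr/CHKERRQ error handling:

  PetscQuadrature quad;
  PetscInt        qdim, qNc, qnpoints;
  const PetscReal *qpoints, *qweights;

  ierr = PetscFEGetQuadrature(fe, &quad);CHKERRQ(ierr);
  ierr = PetscQuadratureGetData(quad, &qdim, &qNc, &qnpoints, &qpoints, &qweights);CHKERRQ(ierr);
  /* qpoints has qnpoints*qdim entries; the arrays belong to the quadrature object and must not be freed */

From Fortran the array arguments come back as pointers, so the argument ordering and the pointer handling in the binding are the likely places for a mismatch.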
>> >> Please contact us at petsc2017 at mcs.anl.gov if you have any questions or >> comments. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 832 bytes Desc: not available URL: From jroman at dsic.upv.es Wed May 31 15:26:40 2017 From: jroman at dsic.upv.es (Jose E. Roman) Date: Wed, 31 May 2017 22:26:40 +0200 Subject: [petsc-users] slepc on 1D row distributed matrix In-Reply-To: <628DF9C9-8C85-4B0E-AE88-CCD2432008C7@ornl.gov> References: <628DF9C9-8C85-4B0E-AE88-CCD2432008C7@ornl.gov> Message-ID: <57665F1B-33B8-4448-A6C5-BFA3D14AA99C@dsic.upv.es> > El 31 may 2017, a las 21:46, Kannan, Ramakrishnan escribi?: > > Hello, > > I have got a sparse 1D row distributed matrix in which every MPI process owns an m/p x n of the global matrix mxn. I am running NHEP with krylovschur on it. It is throwing me some wrong error. For your reference, I have attached the modified ex5.c in which I SetSizes on the matrix to emulate the 1D row distribution and the log file with the error. > > In the unmodified ex5.c, for m=5, N=15, the local_m and the local_n is 3x3. How is the global 15x15 matrix distributed locally as 3x3 matrices? When I print the global matrix, it doesn?t appear to be diagonal as well. > > If slepc doesn?t support sparse 1D row distributed matrix, how do I need to redistribute it such that I can run NHEP on this. > -- > Regards, > Ramki > > As explained in the manpage, the local columns size n must match the local size of the x vector, so it must also be N/mpisize http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatSetSizes.html But be warned that your code will not work when N is not divisible by mpisize. In that case, global and local dimensions won't match. Setting local sizes is not necessary in your case, since by default PETSc is already doing a 1D block-row distribution. Jose From kannanr at ornl.gov Wed May 31 16:14:55 2017 From: kannanr at ornl.gov (Kannan, Ramakrishnan) Date: Wed, 31 May 2017 21:14:55 +0000 Subject: [petsc-users] slepc on 1D row distributed matrix In-Reply-To: <57665F1B-33B8-4448-A6C5-BFA3D14AA99C@dsic.upv.es> References: <628DF9C9-8C85-4B0E-AE88-CCD2432008C7@ornl.gov> <57665F1B-33B8-4448-A6C5-BFA3D14AA99C@dsic.upv.es> Message-ID: <3A050906-27B2-4C2D-B101-A16CC1EB78CA@ornl.gov> Jose, Thank you for the quick reply. In this specific example, there are 5 mpi processes and each process owns an 1D row distributed matrix of size 3x15. According to the MatSetSizes, I should set local rows, local cols, global rows, global cols which in this case are 3,15,15,15 respectively. Instead why would I set 3,3,15,15. Also in our program, I use global_row_idx, global_col_idx for MatSetValues. If I set 3,3,15,15 instead of 3,15,15,15, my MatSetValues fails with the error ?nnz cannot be greater than row length:?. Also to test the 3,15,15,15 in MatSetSizes to be right, we called a MatCreateVec and MatMult of petsc which seemed to work alright too. Appreciate your kind help. -- Regards, Ramki On 5/31/17, 4:26 PM, "Jose E. Roman" wrote: > El 31 may 2017, a las 21:46, Kannan, Ramakrishnan escribi?: > > Hello, > > I have got a sparse 1D row distributed matrix in which every MPI process owns an m/p x n of the global matrix mxn. I am running NHEP with krylovschur on it. It is throwing me some wrong error. For your reference, I have attached the modified ex5.c in which I SetSizes on the matrix to emulate the 1D row distribution and the log file with the error. 
> > In the unmodified ex5.c, for m=5, N=15, the local_m and the local_n is 3x3. How is the global 15x15 matrix distributed locally as 3x3 matrices? When I print the global matrix, it doesn?t appear to be diagonal as well. > > If slepc doesn?t support sparse 1D row distributed matrix, how do I need to redistribute it such that I can run NHEP on this. > -- > Regards, > Ramki > > As explained in the manpage, the local columns size n must match the local size of the x vector, so it must also be N/mpisize http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatSetSizes.html But be warned that your code will not work when N is not divisible by mpisize. In that case, global and local dimensions won't match. Setting local sizes is not necessary in your case, since by default PETSc is already doing a 1D block-row distribution. Jose From bsmith at mcs.anl.gov Wed May 31 19:13:17 2017 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 31 May 2017 19:13:17 -0500 Subject: [petsc-users] slepc on 1D row distributed matrix In-Reply-To: <3A050906-27B2-4C2D-B101-A16CC1EB78CA@ornl.gov> References: <628DF9C9-8C85-4B0E-AE88-CCD2432008C7@ornl.gov> <57665F1B-33B8-4448-A6C5-BFA3D14AA99C@dsic.upv.es> <3A050906-27B2-4C2D-B101-A16CC1EB78CA@ornl.gov> Message-ID: <8A10B5D7-DDF6-4F69-8A42-290E70CA2596@mcs.anl.gov> > On May 31, 2017, at 4:14 PM, Kannan, Ramakrishnan wrote: > > Jose, > > Thank you for the quick reply. > > In this specific example, there are 5 mpi processes and each process owns an 1D row distributed matrix of size 3x15. According to the MatSetSizes, I should set local rows, local cols, global rows, global cols which in this case are 3,15,15,15 respectively. Instead why would I set 3,3,15,15. You have not read carefully the definition of "local size" for matrices in PETSc. http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatSetSizes.html > > Also in our program, I use global_row_idx, global_col_idx for MatSetValues. If I set 3,3,15,15 instead of 3,15,15,15, my MatSetValues fails with the error ?nnz cannot be greater than row length:?. This is a different problem that may need to be tracked down. > Also to test the 3,15,15,15 in MatSetSizes to be right, we called a MatCreateVec and MatMult of petsc which seemed to work alright too. This will not work under normal circumstances so something else must be different as well. Barry > > Appreciate your kind help. > -- > Regards, > Ramki > > > On 5/31/17, 4:26 PM, "Jose E. Roman" wrote: > > >> El 31 may 2017, a las 21:46, Kannan, Ramakrishnan escribi?: >> >> Hello, >> >> I have got a sparse 1D row distributed matrix in which every MPI process owns an m/p x n of the global matrix mxn. I am running NHEP with krylovschur on it. It is throwing me some wrong error. For your reference, I have attached the modified ex5.c in which I SetSizes on the matrix to emulate the 1D row distribution and the log file with the error. >> >> In the unmodified ex5.c, for m=5, N=15, the local_m and the local_n is 3x3. How is the global 15x15 matrix distributed locally as 3x3 matrices? When I print the global matrix, it doesn?t appear to be diagonal as well. >> >> If slepc doesn?t support sparse 1D row distributed matrix, how do I need to redistribute it such that I can run NHEP on this. 
>> -- >> Regards, >> Ramki >> >> > > As explained in the manpage, the local columns size n must match the local size of the x vector, so it must also be N/mpisize > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatSetSizes.html > > But be warned that your code will not work when N is not divisible by mpisize. In that case, global and local dimensions won't match. > > Setting local sizes is not necessary in your case, since by default PETSc is already doing a 1D block-row distribution. > > Jose > > > > From j.pogacnik at auckland.ac.nz Wed May 31 22:00:09 2017 From: j.pogacnik at auckland.ac.nz (Justin Pogacnik) Date: Thu, 1 Jun 2017 03:00:09 +0000 Subject: [petsc-users] PetscFECreateDefault in Fortran In-Reply-To: References: <1496200773990.42892@auckland.ac.nz> , Message-ID: <1496286009918.20206@auckland.ac.nz> Thanks Matt! That works perfectly now. I have another question regarding accessing the quadrature information. When I use PetscFEGetQuadrature(), then PetscQuadratureView(), I see what I expect regarding point locations, weights. However, when I try to use PetscQuadratureGetData() the pointers seem to point to random memory locations. The exact line from my test problem is: call PetscQuadratureGetData(quad,q_nc,q_dim,q_num,pq_points,pq_weights,ierr); where the pq_* are the pointers giving strange output. The q_nc, q_dim, and q_num are all giving what I would expect to see. Happy to send along the file if that helps. Thanks again, Justin ________________________________ From: Matthew Knepley Sent: Thursday, June 1, 2017 1:34 AM To: Justin Pogacnik Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] PetscFECreateDefault in Fortran On Wed, May 31, 2017 at 7:53 AM, Matthew Knepley > wrote: On Tue, May 30, 2017 at 10:19 PM, Justin Pogacnik > wrote: Hello, I'm developing a finite element code in fortran 90. I recently updated my PETSc and am now getting the following error during compile/linking on an existing application: Undefined symbols for architecture x86_64: "_petscfecreatedefault_", referenced from: _MAIN__ in fe_test.o ld: symbol(s) not found for architecture x86_64 collect2: error: ld returned 1 exit status make: *** [dist/fe_test] Error 1 I'm running Mac OS X Yosemite (10.10.5). I've created a "minimum working example" (attached) that re-creates the problem. It's basically just dm/impls/plex/examples/tutorials/ex3f90, but tries to create a PetscFE object. Everything goes fine and the DM looks like what is expected if PetscFECreateDefault is commented out. Any idea what am I missing? Yes, I had not made a Fortran binding for this function. I will do it now. I have merged it to the 'next' branch, and it will be in 'master' soon. Thanks, Matt Thanks, Matt Many thanks! Justin -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener http://www.caam.rice.edu/~mk51/ -------------- next part -------------- An HTML attachment was scrubbed... URL: