From smithc11 at rpi.edu Mon Mar 1 15:42:49 2021 From: smithc11 at rpi.edu (Cameron Smith) Date: Mon, 1 Mar 2021 16:42:49 -0500 Subject: [petsc-users] creation of parallel dmplex from a partitioned mesh In-Reply-To: References: <1953567c-6c7f-30fb-13e6-ad7017263a92@rpi.edu> <62654977-bdbc-9cd7-5a70-e9fb4951310a@rpi.edu> <3fcf90b7-3abd-1345-bd90-d7d7272816d9@rpi.edu> <87mu2jg57a.fsf@jedbrown.org> <5e245665-61c6-3a48-9b3e-97b38f69829e@rpi.edu> Message-ID: <19f67be5-9db6-6b9c-7ad0-1d5fbf453f85@rpi.edu> Thank you. That makes sense. Using DMPlexPointLocalRef from petsc master worked for me; time to debug. https://www.mcs.anl.gov/petsc/petsc-master/docs/manualpages/DMPLEX/DMPlexPointLocalRef.html -Cameron On 2/26/21 8:32 AM, Matthew Knepley wrote: > On Thu, Feb 25, 2021 at 4:57 PM Cameron Smith > wrote: > > Hello, > > Bringing this thread back from the dead... > > We made progress with creation of a distributed dmplex that matches our > source mesh partition and are in need of help writing values into a > vector created from the dmplex object. > > As discussed previously, we have created a DMPlex instance using: > > DMPlexCreateFromCellListPetsc(...) > DMGetPointSF(...) > PetscSFSetGraph(...) > > which gives us a distribution of mesh vertices and elements in the DM > object that matches the element-based partition of our unstructured > mesh. > > We then mark mesh vertices on the geometric model boundary using > DMSetLabelValue(...) and a map from our mesh vertices to dmplex points > (created during dmplex definition of vtx coordinates). > > Following this, we create a section for vertices: > > >? ? ?DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd); > >? ? ?PetscSectionCreate(PetscObjectComm((PetscObject) dm), &s); > >? ? ?PetscSectionSetNumFields(s, 1); > >? ? ?PetscSectionSetFieldComponents(s, 0, 1); > >? ? ?PetscSectionSetChart(s, vStart, vEnd); > >? ? ?for(PetscInt v = vStart; v < vEnd; ++v) { > >? ? ? ? ?PetscSectionSetDof(s, v, 1); > >? ? ? ? ?PetscSectionSetFieldDof(s, v, 0, 1); > >? ? ?} > >? ? ?PetscSectionSetUp(s); > >? ? ?DMSetLocalSection(dm, s); > >? ? ?PetscSectionDestroy(&s); > >? ? ?DMGetGlobalSection(dm,&s); //update the global section > > We then try to write values into a local Vec for the on-process > vertices > (roots and leaves in sf terms) and hit an ordering problem. > Specifically, we make the following sequence of calls: > > DMGetLocalVector(dm,&bloc); > VecGetArrayWrite(bloc, &bwrite); > //for loop to write values to bwrite > VecRestoreArrayWrite(bloc, &bwrite); > DMLocalToGlobal(dm,bloc,INSERT_VALUES,b); > DMRestoreLocalVector(dm,&bloc); > > > There is an easy way to get diagnostics here. For the local vector > > ? DMGetLocalSection(dm, &s); > ? PetscSectionGetOffset(s, v, &off); > > will give you the offset into the array you got from VecGetArrayWrite() > for that vertex. You can get this wrapped up using DMPlexPointLocalWrite() > which is what I tend to use for this type of stuff. > > For the global vector > > ? DMGetGlobalSection(dm, &gs); > ? PetscSectionGetOffset(gs, v, &off); > > will give you the offset into the portion of the global array that is > stored in this process. > If you do not own the values for this vertex, the number is negative, > and it is actually > -(i+1) if the index i is the valid one on the owning process. > > Visualizing Vec 'b' in paraview, and the > original mesh, tells us that the dmplex topology and geometry (the > vertex coordinates) are correct but that the order we write values is > wrong (not total garbage... but clearly shifted). 
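For concreteness, a minimal sketch of the local-section diagnostic folded into the write loop discussed above. The section is the one-dof-per-vertex local section set up earlier in this thread; myVertexValue() is a placeholder for the application's own value, not something taken from the actual code, and error checking is omitted:

    PetscSection s;
    Vec          bloc;
    PetscScalar *bwrite;
    PetscInt     vStart, vEnd, off;

    DMGetLocalSection(dm, &s);                    /* the section attached with DMSetLocalSection() */
    DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd); /* vertices are the depth-0 points */
    DMGetLocalVector(dm, &bloc);
    VecGetArrayWrite(bloc, &bwrite);
    for (PetscInt v = vStart; v < vEnd; ++v) {
      PetscSectionGetOffset(s, v, &off);          /* offset of this vertex in the local array */
      bwrite[off] = myVertexValue(v);             /* placeholder: the application's value for vertex v */
    }
    VecRestoreArrayWrite(bloc, &bwrite);
    DMLocalToGlobal(dm, bloc, INSERT_VALUES, b);
    DMRestoreLocalVector(dm, &bloc);

Writing at the offset the section reports for each plex point, rather than at a position inferred from the vertex numbering, is exactly what the point-wrapper routines mentioned above automate.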
> > > We do not make any guarantees that global orders match local orders. > However, by default > we number up global unknowns in rank order, leaving out the dofs that we > not owned. > > Does this make sense? > > ? Thanks, > > ? ? ?Matt > > Is there anything obviously wrong in our described approach?? I suspect > the section creation is wrong and/or we don't understand the order of > entries in the array returned by VecGetArrayWrite. > > Please let us know if other info is needed.? We are happy to share the > relevant source code. > > Thank-you, > Cameron > > > On 8/25/20 8:34 AM, Cameron Smith wrote: > > On 8/24/20 4:57 PM, Matthew Knepley wrote: > >> On Mon, Aug 24, 2020 at 4:27 PM Jed Brown > >> >> wrote: > >> > >> ??? Cameron Smith > >> writes: > >> > >> ???? > We made some progress with star forest creation but still > have > >> ??? work to do. > >> ???? > > >> ???? > We revisited DMPlexCreateFromCellListParallelPetsc(...) > and got it > >> ???? > working by sequentially partitioning the vertex > coordinates across > >> ???? > processes to satisfy the 'vertexCoords' argument. > Specifically, > >> ??? rank 0 > >> ???? > has the coordinates for vertices with global id 0:N/P-1, > rank 1 > >> has > >> ???? > N/P:2*(N/P)-1, and so on (N is the total number of global > >> ??? vertices and P > >> ???? > is the number of processes). > >> ???? > > >> ???? > The consequences of the sequential partition of vertex > >> ??? coordinates in > >> ???? > subsequent solver operations is not clear.? Does it make > process i > >> ???? > responsible for computations and communications > associated with > >> ??? global > >> ???? > vertices i*(N/P):(i+1)*(N/P)-1 ?? We assumed it does and > wanted > >> ??? to confirm. > >> > >> ??? Yeah, in the sense that the corners would be owned by the > rank you > >> ??? place them on. > >> > >> ??? But many methods, especially high-order, perform assembly via > >> ??? non-overlapping partition of elements, in which case the > >> ??? "computations" happen where the elements are (with any required > >> ??? vertex data for the closure of those elements being sent to > the rank > >> ??? handling the element). > >> > >> ??? Note that a typical pattern would be to create a parallel DMPlex > >> ??? with a naive distribution, then repartition/distribute it. > >> > >> > >> As Jed says, CreateParallel() just makes the most naive > partition of > >> vertices because we have no other information. Once > >> the mesh is made, you call DMPlexDistribute() again to reduce > the edge > >> cut. > >> > >> ?? Thanks, > >> > >> ?? ? ?Matt > >> > > > > > > Thank you. > > > > This is being used for PIC code with low order 2d elements whose > mesh is > > partitioned to minimize communications during particle > operations.? This > > partition will not be ideal for the field solve using petsc so we're > > exploring alternatives that will require minimal data movement > between > > the two partitions.? Towards that, we'll keep pursuing the SF > creation. > > > > -Cameron > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. 
> -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From knepley at gmail.com Mon Mar 1 19:11:53 2021 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 1 Mar 2021 20:11:53 -0500 Subject: [petsc-users] creation of parallel dmplex from a partitioned mesh In-Reply-To: <19f67be5-9db6-6b9c-7ad0-1d5fbf453f85@rpi.edu> References: <1953567c-6c7f-30fb-13e6-ad7017263a92@rpi.edu> <62654977-bdbc-9cd7-5a70-e9fb4951310a@rpi.edu> <3fcf90b7-3abd-1345-bd90-d7d7272816d9@rpi.edu> <87mu2jg57a.fsf@jedbrown.org> <5e245665-61c6-3a48-9b3e-97b38f69829e@rpi.edu> <19f67be5-9db6-6b9c-7ad0-1d5fbf453f85@rpi.edu> Message-ID: On Mon, Mar 1, 2021 at 4:41 PM Cameron Smith wrote: > Thank you. That makes sense. > > Using DMPlexPointLocalRef from petsc master worked for me; time to debug. > Let me know if it turns out that Plex is making an assumption that is non-intuitive, or if you need something that would make conversion easier. Thanks, Matt > > https://www.mcs.anl.gov/petsc/petsc-master/docs/manualpages/DMPLEX/DMPlexPointLocalRef.html > > -Cameron > > On 2/26/21 8:32 AM, Matthew Knepley wrote: > > On Thu, Feb 25, 2021 at 4:57 PM Cameron Smith > > wrote: > > > > Hello, > > > > Bringing this thread back from the dead... > > > > We made progress with creation of a distributed dmplex that matches > our > > source mesh partition and are in need of help writing values into a > > vector created from the dmplex object. > > > > As discussed previously, we have created a DMPlex instance using: > > > > DMPlexCreateFromCellListPetsc(...) > > DMGetPointSF(...) > > PetscSFSetGraph(...) > > > > which gives us a distribution of mesh vertices and elements in the DM > > object that matches the element-based partition of our unstructured > > mesh. > > > > We then mark mesh vertices on the geometric model boundary using > > DMSetLabelValue(...) and a map from our mesh vertices to dmplex > points > > (created during dmplex definition of vtx coordinates). > > > > Following this, we create a section for vertices: > > > > > DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd); > > > PetscSectionCreate(PetscObjectComm((PetscObject) dm), &s); > > > PetscSectionSetNumFields(s, 1); > > > PetscSectionSetFieldComponents(s, 0, 1); > > > PetscSectionSetChart(s, vStart, vEnd); > > > for(PetscInt v = vStart; v < vEnd; ++v) { > > > PetscSectionSetDof(s, v, 1); > > > PetscSectionSetFieldDof(s, v, 0, 1); > > > } > > > PetscSectionSetUp(s); > > > DMSetLocalSection(dm, s); > > > PetscSectionDestroy(&s); > > > DMGetGlobalSection(dm,&s); //update the global section > > > > We then try to write values into a local Vec for the on-process > > vertices > > (roots and leaves in sf terms) and hit an ordering problem. > > Specifically, we make the following sequence of calls: > > > > DMGetLocalVector(dm,&bloc); > > VecGetArrayWrite(bloc, &bwrite); > > //for loop to write values to bwrite > > VecRestoreArrayWrite(bloc, &bwrite); > > DMLocalToGlobal(dm,bloc,INSERT_VALUES,b); > > DMRestoreLocalVector(dm,&bloc); > > > > > > There is an easy way to get diagnostics here. For the local vector > > > > DMGetLocalSection(dm, &s); > > PetscSectionGetOffset(s, v, &off); > > > > will give you the offset into the array you got from VecGetArrayWrite() > > for that vertex. You can get this wrapped up using > DMPlexPointLocalWrite() > > which is what I tend to use for this type of stuff. 
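A minimal sketch of that wrapped-up form, using DMPlexPointLocalRef() (the routine linked at the top of this message) to get a pointer straight to a vertex's entry in the local array; bloc, vStart/vEnd and myVertexValue() are illustrative names, not taken from the actual code:

    PetscScalar *array, *val;

    VecGetArray(bloc, &array);
    for (PetscInt v = vStart; v < vEnd; ++v) {
      DMPlexPointLocalRef(dm, v, array, &val);  /* val points at this vertex's dof, located via the local section */
      *val = myVertexValue(v);                  /* placeholder value */
    }
    VecRestoreArray(bloc, &array);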
> > > > For the global vector > > > > DMGetGlobalSection(dm, &gs); > > PetscSectionGetOffset(gs, v, &off); > > > > will give you the offset into the portion of the global array that is > > stored in this process. > > If you do not own the values for this vertex, the number is negative, > > and it is actually > > -(i+1) if the index i is the valid one on the owning process. > > > > Visualizing Vec 'b' in paraview, and the > > original mesh, tells us that the dmplex topology and geometry (the > > vertex coordinates) are correct but that the order we write values is > > wrong (not total garbage... but clearly shifted). > > > > > > We do not make any guarantees that global orders match local orders. > > However, by default > > we number up global unknowns in rank order, leaving out the dofs that we > > not owned. > > > > Does this make sense? > > > > Thanks, > > > > Matt > > > > Is there anything obviously wrong in our described approach? I > suspect > > the section creation is wrong and/or we don't understand the order of > > entries in the array returned by VecGetArrayWrite. > > > > Please let us know if other info is needed. We are happy to share > the > > relevant source code. > > > > Thank-you, > > Cameron > > > > > > On 8/25/20 8:34 AM, Cameron Smith wrote: > > > On 8/24/20 4:57 PM, Matthew Knepley wrote: > > >> On Mon, Aug 24, 2020 at 4:27 PM Jed Brown > > > >> >> wrote: > > >> > > >> Cameron Smith > > >> writes: > > >> > > >> > We made some progress with star forest creation but still > > have > > >> work to do. > > >> > > > >> > We revisited DMPlexCreateFromCellListParallelPetsc(...) > > and got it > > >> > working by sequentially partitioning the vertex > > coordinates across > > >> > processes to satisfy the 'vertexCoords' argument. > > Specifically, > > >> rank 0 > > >> > has the coordinates for vertices with global id 0:N/P-1, > > rank 1 > > >> has > > >> > N/P:2*(N/P)-1, and so on (N is the total number of global > > >> vertices and P > > >> > is the number of processes). > > >> > > > >> > The consequences of the sequential partition of vertex > > >> coordinates in > > >> > subsequent solver operations is not clear. Does it make > > process i > > >> > responsible for computations and communications > > associated with > > >> global > > >> > vertices i*(N/P):(i+1)*(N/P)-1 ? We assumed it does and > > wanted > > >> to confirm. > > >> > > >> Yeah, in the sense that the corners would be owned by the > > rank you > > >> place them on. > > >> > > >> But many methods, especially high-order, perform assembly via > > >> non-overlapping partition of elements, in which case the > > >> "computations" happen where the elements are (with any > required > > >> vertex data for the closure of those elements being sent to > > the rank > > >> handling the element). > > >> > > >> Note that a typical pattern would be to create a parallel > DMPlex > > >> with a naive distribution, then repartition/distribute it. > > >> > > >> > > >> As Jed says, CreateParallel() just makes the most naive > > partition of > > >> vertices because we have no other information. Once > > >> the mesh is made, you call DMPlexDistribute() again to reduce > > the edge > > >> cut. > > >> > > >> Thanks, > > >> > > >> Matt > > >> > > > > > > > > > Thank you. > > > > > > This is being used for PIC code with low order 2d elements whose > > mesh is > > > partitioned to minimize communications during particle > > operations. 
This > > > partition will not be ideal for the field solve using petsc so > we're > > > exploring alternatives that will require minimal data movement > > between > > > the two partitions. Towards that, we'll keep pursuing the SF > > creation. > > > > > > -Cameron > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ < > http://www.cse.buffalo.edu/~knepley/> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From degregori at dkrz.de Tue Mar 2 06:48:57 2021 From: degregori at dkrz.de (Enrico) Date: Tue, 2 Mar 2021 13:48:57 +0100 Subject: [petsc-users] PETSC installation on Cray Message-ID: Hi, I'm having some problems installing PETSC with Cray compiler. I use this configuration: ./configure --with-cc=cc --with-cxx=CC --with-fc=0 --with-debugging=1 --with-shared-libraries=1 COPTFLAGS=-O0 CXXOPTFLAGS=-O0 and when I do make all I get the following error because of cmathcalls.h: CC-1043 craycc: ERROR File = /usr/include/bits/cmathcalls.h, Line = 55 _Complex can only be used with floating-point types. __MATHCALL (cacos, (_Mdouble_complex_ __z)); ^ Am I doing something wrong? Regards, Enrico Degregori From knepley at gmail.com Tue Mar 2 07:13:43 2021 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 2 Mar 2021 08:13:43 -0500 Subject: [petsc-users] PETSC installation on Cray In-Reply-To: References: Message-ID: On Tue, Mar 2, 2021 at 7:49 AM Enrico wrote: > Hi, > > I'm having some problems installing PETSC with Cray compiler. > I use this configuration: > > ./configure --with-cc=cc --with-cxx=CC --with-fc=0 --with-debugging=1 > --with-shared-libraries=1 COPTFLAGS=-O0 CXXOPTFLAGS=-O0 > > and when I do > > make all > > I get the following error because of cmathcalls.h: > > CC-1043 craycc: ERROR File = /usr/include/bits/cmathcalls.h, Line = 55 > _Complex can only be used with floating-point types. > __MATHCALL (cacos, (_Mdouble_complex_ __z)); > ^ > > Am I doing something wrong? > This was expended from somewhere. Can you show the entire err log? Thanks, Matt > Regards, > Enrico Degregori > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From thibault.bridelbertomeu at gmail.com Tue Mar 2 07:15:31 2021 From: thibault.bridelbertomeu at gmail.com (Thibault Bridel-Bertomeu) Date: Tue, 2 Mar 2021 14:15:31 +0100 Subject: [petsc-users] DMPlex read partitionned Gmsh In-Reply-To: References: Message-ID: Dear Matthew, Sorry it took me so long to answer. Thank you for the details on the HDF5 reader, I think I can work with that indeed. Gotta find a way to write the *.msh in the custom HDF5 format first but as you say I might just go with a snippet based on PETSc, this way I'm sure it will be consistent. Cheers, Thibault Le mar. 23 f?vr. 2021 ? 
11:50, Matthew Knepley a ?crit : > On Tue, Feb 23, 2021 at 2:37 AM Thibault Bridel-Bertomeu < > thibault.bridelbertomeu at gmail.com> wrote: > >> Dear all, >> >> I was wondering if there was a plan in motion to implement yet another >> possibility for DMPlexCreateGmshFromFile: read a group of foo_*.msh >> generated from a partition done directly in Gmsh ? >> > > What we have implemented now is a system that reads a mesh in parallel > from disk into a naive partition, then repartitions and redistributes. > We have a paper about this strategy: https://arxiv.org/abs/2004.08729 . > Right now it is only implemented in HDF5. This is mainly because: > > 1) Parallel block reads are easy in HDF5. > > 2) We use it for checkpointing as well as load, and it is flexible enough > for this > > 3) Label information can be stored in a scalable way > > It is easy to convert from GMsh to HDF5 (it's a few lines of PETSc). The > GMsh format is not ideal for parallelism, and in fact the GMsh reader > was also using MED, which is an HDF5 format. We originally wrote an MED > reader, but the documentation and support for the library were > not up to snuff, so we went with a custom HDF5 format. > > Is this helpful? > > Matt > > >> Have a great day, >> >> Thibault B.-B. >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From degregori at dkrz.de Tue Mar 2 07:33:23 2021 From: degregori at dkrz.de (Enrico) Date: Tue, 2 Mar 2021 14:33:23 +0100 Subject: [petsc-users] PETSC installation on Cray In-Reply-To: References: Message-ID: <4fd4fc04-9c46-51f5-2720-c682d0011224@dkrz.de> Hi, attached is the configuration and make log files. Enrico On 02/03/2021 14:13, Matthew Knepley wrote: > On Tue, Mar 2, 2021 at 7:49 AM Enrico > wrote: > > Hi, > > I'm having some problems installing PETSC with Cray compiler. > I use this configuration: > > ./configure --with-cc=cc --with-cxx=CC --with-fc=0 --with-debugging=1 > --with-shared-libraries=1 COPTFLAGS=-O0 CXXOPTFLAGS=-O0 > > and when I do > > make all > > I get the following error because of cmathcalls.h: > > CC-1043 craycc: ERROR File = /usr/include/bits/cmathcalls.h, Line = 55 > ? ?_Complex can only be used with floating-point types. > ? ?__MATHCALL (cacos, (_Mdouble_complex_ __z)); > ? ?^ > > Am I doing something wrong? > > > This was expended from somewhere. Can you show the entire err log? > > ? Thanks, > > ? ? ?Matt > > Regards, > Enrico Degregori > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: text/x-log Size: 1111861 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: make.log Type: text/x-log Size: 49843 bytes Desc: not available URL: From knepley at gmail.com Tue Mar 2 07:39:19 2021 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 2 Mar 2021 08:39:19 -0500 Subject: [petsc-users] PETSC installation on Cray In-Reply-To: <4fd4fc04-9c46-51f5-2720-c682d0011224@dkrz.de> References: <4fd4fc04-9c46-51f5-2720-c682d0011224@dkrz.de> Message-ID: On Tue, Mar 2, 2021 at 8:33 AM Enrico wrote: > Hi, > > attached is the configuration and make log files. > cmathcalls.h is a GNU c header. Is the Cray compiler supposed to be using this header, or has something gone wrong with the installation on this machine. This does not seem to be connected to PETSc, but maybe it is in a way I cannot see. Thanks, Matt > Enrico > > On 02/03/2021 14:13, Matthew Knepley wrote: > > On Tue, Mar 2, 2021 at 7:49 AM Enrico > > wrote: > > > > Hi, > > > > I'm having some problems installing PETSC with Cray compiler. > > I use this configuration: > > > > ./configure --with-cc=cc --with-cxx=CC --with-fc=0 --with-debugging=1 > > --with-shared-libraries=1 COPTFLAGS=-O0 CXXOPTFLAGS=-O0 > > > > and when I do > > > > make all > > > > I get the following error because of cmathcalls.h: > > > > CC-1043 craycc: ERROR File = /usr/include/bits/cmathcalls.h, Line = > 55 > > _Complex can only be used with floating-point types. > > __MATHCALL (cacos, (_Mdouble_complex_ __z)); > > ^ > > > > Am I doing something wrong? > > > > > > This was expended from somewhere. Can you show the entire err log? > > > > Thanks, > > > > Matt > > > > Regards, > > Enrico Degregori > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ < > http://www.cse.buffalo.edu/~knepley/> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Mar 2 14:03:06 2021 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 2 Mar 2021 14:03:06 -0600 Subject: [petsc-users] PETSC installation on Cray In-Reply-To: <4fd4fc04-9c46-51f5-2720-c682d0011224@dkrz.de> References: <4fd4fc04-9c46-51f5-2720-c682d0011224@dkrz.de> Message-ID: Please try the following. Make four files as below then compile each with cc -c -o test.o test1.c again for test2.c etc Send all the output. test1.c #include test2.c #define _BSD_SOURCE #include test3.c #define _DEFAULT_SOURCE #include test4.c #define _GNU_SOURCE #include > On Mar 2, 2021, at 7:33 AM, Enrico wrote: > > Hi, > > attached is the configuration and make log files. > > Enrico > > On 02/03/2021 14:13, Matthew Knepley wrote: >> On Tue, Mar 2, 2021 at 7:49 AM Enrico > wrote: >> Hi, >> I'm having some problems installing PETSC with Cray compiler. >> I use this configuration: >> ./configure --with-cc=cc --with-cxx=CC --with-fc=0 --with-debugging=1 >> --with-shared-libraries=1 COPTFLAGS=-O0 CXXOPTFLAGS=-O0 >> and when I do >> make all >> I get the following error because of cmathcalls.h: >> CC-1043 craycc: ERROR File = /usr/include/bits/cmathcalls.h, Line = 55 >> _Complex can only be used with floating-point types. >> __MATHCALL (cacos, (_Mdouble_complex_ __z)); >> ^ >> Am I doing something wrong? >> This was expended from somewhere. 
Can you show the entire err log? >> Thanks, >> Matt >> Regards, >> Enrico Degregori >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> https://www.cse.buffalo.edu/~knepley/ > From thibault.bridelbertomeu at gmail.com Wed Mar 3 01:02:51 2021 From: thibault.bridelbertomeu at gmail.com (Thibault Bridel-Bertomeu) Date: Wed, 3 Mar 2021 08:02:51 +0100 Subject: [petsc-users] What about user-functions containing OpenMP ? Message-ID: Dear all, I am aware that the strategy chosen by PETSc is to rely exclusively on a MPI paradigm with therefore functions, methods and routines that are not necessarily thread-safe in order not to impede the performance too much. I had however one interrogation : what happens if, say, the user passes functions containing OpenMP pragma to wrappers like TSSetRHSFunction, or even writes a new TSAdapt containing OpenMP pragma ? If the threads are started before the TSSolve, would we benefit from some performance increase from the pragmas in the user functions, or would it lead to instability because the PETSc functions calling the user functions are anyways not built for OpenMP and it would fail ? Thank you for your insight, Thibault -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Mar 3 10:25:02 2021 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 3 Mar 2021 10:25:02 -0600 Subject: [petsc-users] What about user-functions containing OpenMP ? In-Reply-To: References: Message-ID: <485A00BB-DEFC-48F3-ACCF-DBFED62D7B17@petsc.dev> So long as 1) you configure PETSc with --with-openmp 2) your pragma loops inside the functions inside RHSFunction do not touch PETSc objects directly (PETSc arrays like from VecGetArray() or DMDAVecGetArray() are fine you can access them) or make PETSc calls then this is fine. But how much speed up you get depends on how much extra memory bandwidth and cores you have available to doe the work after you have already parallelized with MPI. Note you have to set the number of threads you want to use inside these functions which can be done with the horrible environmental variable OMP_NUM_THREADS or the PETSc option -omp_num_threads or by calling omp_set_num_threads() in your code. Barry > On Mar 3, 2021, at 1:02 AM, Thibault Bridel-Bertomeu wrote: > > Dear all, > > I am aware that the strategy chosen by PETSc is to rely exclusively on a MPI paradigm with therefore functions, methods and routines that are not necessarily thread-safe in order not to impede the performance too much. > I had however one interrogation : what happens if, say, the user passes functions containing OpenMP pragma to wrappers like TSSetRHSFunction, or even writes a new TSAdapt containing OpenMP pragma ? > If the threads are started before the TSSolve, would we benefit from some performance increase from the pragmas in the user functions, or would it lead to instability because the PETSc functions calling the user functions are anyways not built for OpenMP and it would fail ? > > Thank you for your insight, > > Thibault From snailsoar at hotmail.com Wed Mar 3 12:14:08 2021 From: snailsoar at hotmail.com (feng wang) Date: Wed, 3 Mar 2021 18:14:08 +0000 Subject: [petsc-users] compile error of CHKERRQ when linking to PETSC Message-ID: Dear All, I am new to PETSC and have exercised on a few simple programs to get myself familiar with PETSC and the simple codes work properly. 
So the environment for PETSC is set up with no problem. Now I am trying to do some work in my big code which is written in C++. Somehow, I get the compile error for "CHKERRQ", and the error message is: ~/cfd/petsc/include/petscerror.h:464:196: error: return-statement with a value, in function returning 'void' [-fpermissive] do {PetscErrorCode ierr__ = (ierr); if (PetscUnlikely(ierr__)) return PetscError(PETSC_COMM_SELF,__LINE__,PETSC_FUNCTION_NAME,__FILE__,ierr__,PETSC_ERROR_REPEAT," ");} while (0) ^ domain/cfd/petsc_nk.cpp:23:7: note: in expansion of macro ?CHKERRQ? CHKERRQ(ierr); Basically, it complains about "CHKERRQ", if I remove "CHKERRQ", the code compiles with no problem. Could someone please shine some light on this? Many thanks, Feng -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Wed Mar 3 12:25:02 2021 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 3 Mar 2021 12:25:02 -0600 Subject: [petsc-users] compile error of CHKERRQ when linking to PETSC In-Reply-To: References: Message-ID: <4e41b687-4129-3d63-6246-d77fede6946@mcs.anl.gov> CHKERRQ() returns error code from the routine. However if the routine is set to return void - use CHKERRABORT() CHKERRABORT(PETSC_COMM_WORLD,ierr) Satish On Wed, 3 Mar 2021, feng wang wrote: > Dear All, > > I am new to PETSC and have exercised on a few simple programs to get myself familiar with PETSC and the simple codes work properly. So the environment for PETSC is set up with no problem. Now I am trying to do some work in my big code which is written in C++. Somehow, I get the compile error for "CHKERRQ", and the error message is: > > ~/cfd/petsc/include/petscerror.h:464:196: error: return-statement with a value, in function returning 'void' [-fpermissive] > do {PetscErrorCode ierr__ = (ierr); if (PetscUnlikely(ierr__)) return PetscError(PETSC_COMM_SELF,__LINE__,PETSC_FUNCTION_NAME,__FILE__,ierr__,PETSC_ERROR_REPEAT," ");} while (0) > ^ > domain/cfd/petsc_nk.cpp:23:7: note: in expansion of macro ?CHKERRQ? > CHKERRQ(ierr); > > Basically, it complains about "CHKERRQ", if I remove "CHKERRQ", the code compiles with no problem. Could someone please shine some light on this? > > Many thanks, > Feng > From snailsoar at hotmail.com Wed Mar 3 12:41:25 2021 From: snailsoar at hotmail.com (feng wang) Date: Wed, 3 Mar 2021 18:41:25 +0000 Subject: [petsc-users] compile error of CHKERRQ when linking to PETSC In-Reply-To: <4e41b687-4129-3d63-6246-d77fede6946@mcs.anl.gov> References: , <4e41b687-4129-3d63-6246-d77fede6946@mcs.anl.gov> Message-ID: Dear Satish, Many thanks for your prompt reply. Yes, you are right! I've changed the return type of my function to "PetscErrorCode", it is working now. silly me..... Thanks, Feng ________________________________ From: Satish Balay Sent: 03 March 2021 18:25 To: feng wang Cc: PETSc Subject: Re: [petsc-users] compile error of CHKERRQ when linking to PETSC CHKERRQ() returns error code from the routine. However if the routine is set to return void - use CHKERRABORT() CHKERRABORT(PETSC_COMM_WORLD,ierr) Satish On Wed, 3 Mar 2021, feng wang wrote: > Dear All, > > I am new to PETSC and have exercised on a few simple programs to get myself familiar with PETSC and the simple codes work properly. So the environment for PETSC is set up with no problem. Now I am trying to do some work in my big code which is written in C++. 
Somehow, I get the compile error for "CHKERRQ", and the error message is: > > ~/cfd/petsc/include/petscerror.h:464:196: error: return-statement with a value, in function returning 'void' [-fpermissive] > do {PetscErrorCode ierr__ = (ierr); if (PetscUnlikely(ierr__)) return PetscError(PETSC_COMM_SELF,__LINE__,PETSC_FUNCTION_NAME,__FILE__,ierr__,PETSC_ERROR_REPEAT," ");} while (0) > ^ > domain/cfd/petsc_nk.cpp:23:7: note: in expansion of macro ?CHKERRQ? > CHKERRQ(ierr); > > Basically, it complains about "CHKERRQ", if I remove "CHKERRQ", the code compiles with no problem. Could someone please shine some light on this? > > Many thanks, > Feng > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thijs.smit at hest.ethz.ch Wed Mar 3 15:23:06 2021 From: thijs.smit at hest.ethz.ch (Smit Thijs) Date: Wed, 3 Mar 2021 21:23:06 +0000 Subject: [petsc-users] Loading external array data into PETSc Message-ID: <1279b545fe9643b39ddefa27e59c2a16@hest.ethz.ch> Hi All, I would like to readin a fairly large external vector into PETSc to be used further. My idea was to write a hdf5 file using Python with the vector data. Then read that hdf5 file into PETSc using PetscViewerHDF5Open and VecLoad. Is this the advised way or is there a better alternative to achieve the same (getting this external array data into PETSc)? Best regards, Thijs Smit PhD Candidate ETH Zurich Institute for Biomechanics -------------- next part -------------- An HTML attachment was scrubbed... URL: From roland.richter at ntnu.no Wed Mar 3 15:27:21 2021 From: roland.richter at ntnu.no (Roland Richter) Date: Wed, 3 Mar 2021 22:27:21 +0100 Subject: [petsc-users] Loading external array data into PETSc In-Reply-To: <1279b545fe9643b39ddefa27e59c2a16@hest.ethz.ch> References: <1279b545fe9643b39ddefa27e59c2a16@hest.ethz.ch> Message-ID: <97ca6f84-40ef-b5db-a1e2-4713c17ee481@ntnu.no> Hi, I think that depends on how you get the external vector (i.e. can you create it internally or externally), and how big it is (1k entries, 1kk entries, 1kkk entries)? Regards, Roland Am 03.03.2021 um 22:23 schrieb Smit Thijs: > > Hi All, > > ? > > I would like to readin a fairly large external vector into PETSc to be > used further. My idea was to write a hdf5 file using Python with the > vector data. Then read that hdf5 file into PETSc using > PetscViewerHDF5Open and VecLoad. Is this the advised way or is there a > better alternative to achieve the same (getting this external array > data into PETSc)? > > ? > > Best regards, > > ? > > Thijs Smit > > ? > > PhD Candidate > > ETH Zurich > > Institute for Biomechanics > > ? > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Mar 3 16:01:57 2021 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 3 Mar 2021 17:01:57 -0500 Subject: [petsc-users] Loading external array data into PETSc In-Reply-To: <1279b545fe9643b39ddefa27e59c2a16@hest.ethz.ch> References: <1279b545fe9643b39ddefa27e59c2a16@hest.ethz.ch> Message-ID: On Wed, Mar 3, 2021 at 4:23 PM Smit Thijs wrote: > Hi All, > > > > I would like to readin a fairly large external vector into PETSc to be > used further. My idea was to write a hdf5 file using Python with the vector > data. Then read that hdf5 file into PETSc using PetscViewerHDF5Open and > VecLoad. Is this the advised way or is there a better alternative to > achieve the same (getting this external array data into PETSc)? > I think there are at least two nice ways to do this: 1) Use HDF5. 
Here you have to name your vector to match the HDF5 object 2) Use raw binary. This is a specific PETSc format, but it is fast and simple. Thanks, Matt > Best regards, > > > > Thijs Smit > > > > PhD Candidate > > ETH Zurich > > Institute for Biomechanics > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Wed Mar 3 16:22:29 2021 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 3 Mar 2021 16:22:29 -0600 Subject: [petsc-users] [petsc-dev] headsup: switch git default branch from 'master' to 'main' In-Reply-To: <4834e30-b876-d6b5-9b8a-1a2396efef7@mcs.anl.gov> References: <55996c7c-a274-5ebb-bba7-e06ac4c3b83a@mcs.anl.gov> <4834e30-b876-d6b5-9b8a-1a2396efef7@mcs.anl.gov> Message-ID: <7a746674-57c9-635f-f896-106d14907aa1@mcs.anl.gov> An update: Looks like we have to do the following in our existing clones: git remote set-head origin -a Satish ----- balay at sb /home/balay/petsc (main=) $ git gc fatal: bad object refs/remotes/origin/HEAD fatal: failed to run repack balay at sb /home/balay/petsc (main=) $ cat .git/refs/remotes/origin/HEAD ref: refs/remotes/origin/master balay at sb /home/balay/petsc (main=) $ git remote set-head origin -a origin/HEAD set to main balay at sb /home/balay/petsc (main=) $ cat .git/refs/remotes/origin/HEAD ref: refs/remotes/origin/main balay at sb /home/balay/petsc (main=) $ git gc --prune=now Enumerating objects: 942787, done. Counting objects: 100% (942787/942787), done. Delta compression using up to 4 threads Compressing objects: 100% (213109/213109), done. Writing objects: 100% (942787/942787), done. Total 942787 (delta 725291), reused 942001 (delta 724664), pack-reused 0 Checking connectivity: 942787, done. Expanding reachable commits in commit graph: 86669, done. balay at sb /home/balay/petsc (main=) $ On Fri, 26 Feb 2021, Satish Balay via petsc-users wrote: > Update: > > the switch (at gitlab.com/petsc/petsc) is done. > > Please delete your local copy of 'master' branch and start using 'main' branch. > > Satish > > On Tue, 23 Feb 2021, Satish Balay via petsc-dev wrote: > > > All, > > > > This is a heads-up, we are to switch the default branch in petsc git > > repo from 'master' to 'main' > > > > [Will plan to do the switch on friday the 26th] > > > > We've previously switched 'maint' branch to 'release' before 3.14 > > release - and this change (to 'main') is the next step in this direction. > > > > Satish > > > From thijs.smit at hest.ethz.ch Thu Mar 4 00:10:09 2021 From: thijs.smit at hest.ethz.ch (Smit Thijs) Date: Thu, 4 Mar 2021 06:10:09 +0000 Subject: [petsc-users] Loading external array data into PETSc In-Reply-To: References: <1279b545fe9643b39ddefa27e59c2a16@hest.ethz.ch> Message-ID: Hi Matt, Roland and others, @ Roland, I can create the Vec in Petsc, but want to load it with external data. I can arrange that data in what ever way with Python. It is about two vectors, each of length 4.6 x10^6 entries. I have created a hdf5 file with Python and try to load it into PETSc which gives me some errors. I added the small test script I am using. The error I am getting is: error: ?PetscViewerHDF5Open? was not declared in this scope; did you mean ?PetscViewerVTKOpen?? Am I using the wrong header? 
Best, Thijs From: Matthew Knepley Sent: 03 March 2021 23:02 To: Smit Thijs Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Loading external array data into PETSc On Wed, Mar 3, 2021 at 4:23 PM Smit Thijs > wrote: Hi All, I would like to readin a fairly large external vector into PETSc to be used further. My idea was to write a hdf5 file using Python with the vector data. Then read that hdf5 file into PETSc using PetscViewerHDF5Open and VecLoad. Is this the advised way or is there a better alternative to achieve the same (getting this external array data into PETSc)? I think there are at least two nice ways to do this: 1) Use HDF5. Here you have to name your vector to match the HDF5 object 2) Use raw binary. This is a specific PETSc format, but it is fast and simple. Thanks, Matt Best regards, Thijs Smit PhD Candidate ETH Zurich Institute for Biomechanics -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: hdf5.cpp URL: From knepley at gmail.com Thu Mar 4 07:52:30 2021 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 4 Mar 2021 08:52:30 -0500 Subject: [petsc-users] Loading external array data into PETSc In-Reply-To: References: <1279b545fe9643b39ddefa27e59c2a16@hest.ethz.ch> Message-ID: On Thu, Mar 4, 2021 at 1:10 AM Smit Thijs wrote: > Hi Matt, Roland and others, > > > > @ Roland, I can create the Vec in Petsc, but want to load it with external > data. I can arrange that data in what ever way with Python. It is about two > vectors, each of length 4.6 x10^6 entries. > > > > I have created a hdf5 file with Python and try to load it into PETSc which > gives me some errors. I added the small test script I am using. The error I > am getting is: > > > > error: ?PetscViewerHDF5Open? was not declared in this scope; did you mean > ?PetscViewerVTKOpen?? > > > > Am I using the wrong header > Did you configure PETSc with HDF5? If so, the configure.log will have an entry for it at the bottom. If not, reconfigure with it: ${PETSC_DIR}/${PETSC_ARCH}/lib/petsc/conf/reconfigure-${PETSC_ARCH}.py --download-hdf5 or you can use --with-hdf5-dir=/path/to/hdf5 if it is already installed. Thanks, Matt > Best, > > > > Thijs > > > > *From:* Matthew Knepley > *Sent:* 03 March 2021 23:02 > *To:* Smit Thijs > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Loading external array data into PETSc > > > > On Wed, Mar 3, 2021 at 4:23 PM Smit Thijs wrote: > > Hi All, > > > > I would like to readin a fairly large external vector into PETSc to be > used further. My idea was to write a hdf5 file using Python with the vector > data. Then read that hdf5 file into PETSc using PetscViewerHDF5Open and > VecLoad. Is this the advised way or is there a better alternative to > achieve the same (getting this external array data into PETSc)? > > > > I think there are at least two nice ways to do this: > > > > 1) Use HDF5. Here you have to name your vector to match the HDF5 object > > > > 2) Use raw binary. This is a specific PETSc format, but it is fast and > simple. 
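For option 1), a minimal sketch of the load side once PETSc is configured with HDF5. The file name "data.h5" and the object name "density" are made-up placeholders; the name given to the Vec has to match the dataset name the Python side wrote into the file (error checking omitted):

    #include <petscvec.h>
    #include <petscviewerhdf5.h>   /* declares PetscViewerHDF5Open() when PETSc is built with HDF5 */

    int main(int argc, char **argv)
    {
      Vec         x;
      PetscViewer viewer;

      PetscInitialize(&argc, &argv, NULL, NULL);
      VecCreate(PETSC_COMM_WORLD, &x);
      PetscObjectSetName((PetscObject)x, "density");  /* must match the HDF5 dataset name */
      PetscViewerHDF5Open(PETSC_COMM_WORLD, "data.h5", FILE_MODE_READ, &viewer);
      VecLoad(x, viewer);                             /* type and sizes are taken from the file if not already set */
      PetscViewerDestroy(&viewer);
      VecDestroy(&x);
      PetscFinalize();
      return 0;
    }

Option 2) is the same VecLoad() call with PetscViewerBinaryOpen() on a file written in PETSc's binary format, e.g. with the PetscBinaryIO python module shipped in $PETSC_DIR/lib/petsc/bin.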
> > > > Thanks, > > > > Matt > > > > Best regards, > > > > Thijs Smit > > > > PhD Candidate > > ETH Zurich > > Institute for Biomechanics > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From thijs.smit at hest.ethz.ch Thu Mar 4 09:19:42 2021 From: thijs.smit at hest.ethz.ch (Smit Thijs) Date: Thu, 4 Mar 2021 15:19:42 +0000 Subject: [petsc-users] Loading external array data into PETSc In-Reply-To: References: <1279b545fe9643b39ddefa27e59c2a16@hest.ethz.ch> Message-ID: <731b0f4b394747889a9f343d8411055a@hest.ethz.ch> Hi Matt, Thank you, I needed to reconfigure with hdf5. The import works well now. Best, Thijs From: Matthew Knepley Sent: 04 March 2021 14:53 To: Smit Thijs Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Loading external array data into PETSc On Thu, Mar 4, 2021 at 1:10 AM Smit Thijs > wrote: Hi Matt, Roland and others, @ Roland, I can create the Vec in Petsc, but want to load it with external data. I can arrange that data in what ever way with Python. It is about two vectors, each of length 4.6 x10^6 entries. I have created a hdf5 file with Python and try to load it into PETSc which gives me some errors. I added the small test script I am using. The error I am getting is: error: ?PetscViewerHDF5Open? was not declared in this scope; did you mean ?PetscViewerVTKOpen?? Am I using the wrong header Did you configure PETSc with HDF5? If so, the configure.log will have an entry for it at the bottom. If not, reconfigure with it: ${PETSC_DIR}/${PETSC_ARCH}/lib/petsc/conf/reconfigure-${PETSC_ARCH}.py --download-hdf5 or you can use --with-hdf5-dir=/path/to/hdf5 if it is already installed. Thanks, Matt Best, Thijs From: Matthew Knepley > Sent: 03 March 2021 23:02 To: Smit Thijs > Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Loading external array data into PETSc On Wed, Mar 3, 2021 at 4:23 PM Smit Thijs > wrote: Hi All, I would like to readin a fairly large external vector into PETSc to be used further. My idea was to write a hdf5 file using Python with the vector data. Then read that hdf5 file into PETSc using PetscViewerHDF5Open and VecLoad. Is this the advised way or is there a better alternative to achieve the same (getting this external array data into PETSc)? I think there are at least two nice ways to do this: 1) Use HDF5. Here you have to name your vector to match the HDF5 object 2) Use raw binary. This is a specific PETSc format, but it is fast and simple. Thanks, Matt Best regards, Thijs Smit PhD Candidate ETH Zurich Institute for Biomechanics -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From balay at mcs.anl.gov Thu Mar 4 11:38:31 2021 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 4 Mar 2021 11:38:31 -0600 Subject: [petsc-users] petsc-3.14.5 now available Message-ID: <8ba4d75b-22c2-9eba-be53-42f16747419e@mcs.anl.gov> Dear PETSc users, The patch release petsc-3.14.5 is now available for download. http://www.mcs.anl.gov/petsc/download/index.html Satish From acolin at isi.edu Fri Mar 5 15:06:22 2021 From: acolin at isi.edu (Alexei Colin) Date: Fri, 5 Mar 2021 16:06:22 -0500 Subject: [petsc-users] DMPlex in Firedrake: scaling of mesh distribution Message-ID: To PETSc DMPlex users, Firedrake users, Dr. Knepley and Dr. Karpeev: Is it expected for mesh distribution step to (A) take a share of 50-99% of total time-to-solution of an FEM problem, and (B) take an amount of time that increases with the number of ranks, and (C) take an amount of memory on rank 0 that does not decrease with the number of ranks ? The attached plots suggest (A), (B), and (C) is happening for Cahn-Hilliard problem (from firedrake-bench repo) on a 2D 8Kx8K unit-square mesh. The implementation is here [1]. Versions are Firedrake, PyOp2: 20200204.0; PETSc 3.13.1; ParMETIS 4.0.3. Two questions, one on (A) and the other on (B)+(C): 1. Is (A) result expected? Given (A), any effort to improve the quality of the compiled assembly kernels (or anything else other than mesh distribution) appears futile since it takes 1% of end-to-end execution time, or am I missing something? 1a. Is mesh distribution fundamentally necessary for any FEM framework, or is it only needed by Firedrake? If latter, then how do other frameworks partition the mesh and execute in parallel with MPI but avoid the non-scalable mesh destribution step? 2. Results (B) and (C) suggest that the mesh distribution step does not scale. Is it a fundamental property of the mesh distribution problem that it has a central bottleneck in the master process, or is it a limitation of the current implementation in PETSc-DMPlex? 2a. Our (B) result seems to agree with Figure 4(left) of [2]. Fig 6 of [2] suggests a way to reduce the time spent on sequential bottleneck by "parallel mesh refinment" that creates high-resolution meshes from an initial coarse mesh. Is this approach implemented in DMPLex? If so, any pointers on how to try it out with Firedrake? If not, any other directions for reducing this bottleneck? 2b. Fig 6 in [3] shows plots for Assembly and Solve steps that scale well up to 96 cores -- is mesh distribution included in those times? Is anyone reading this aware of any other publications with evaluations of Firedrake that measure mesh distribution (or explain how to avoid or exclude it)? Thank you for your time and any info or tips. [1] https://github.com/ISI-apex/firedrake-bench/blob/master/cahn_hilliard/firedrake_cahn_hilliard_problem.py [2] Unstructured Overlapping Mesh Distribution in Parallel, Matthew G. Knepley, Michael Lange, Gerard J. Gorman, 2015. https://arxiv.org/pdf/1506.06194.pdf [3] Efficient mesh management in Firedrake using PETSc-DMPlex, Michael Lange, Lawrence Mitchell, Matthew G. Knepley and Gerard J. Gorman, SISC, 38(5), S143-S155, 2016. http://arxiv.org/abs/1506.07749 -------------- next part -------------- A non-text attachment was scrubbed... Name: ch-mesh-dist.pdf Type: application/pdf Size: 16677 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: ch-mem.pdf Type: application/pdf Size: 16582 bytes Desc: not available URL: From bsmith at petsc.dev Fri Mar 5 18:05:49 2021 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 5 Mar 2021 18:05:49 -0600 Subject: [petsc-users] DMPlex in Firedrake: scaling of mesh distribution In-Reply-To: References: Message-ID: Alexei, Sorry to hear about your difficulties with the mesh distribution step. Based on the figures you included, the extreme difficulties occur on the Oak Ridge Summit system? But on the Argonne Theta system though the distribution time goes up it does not dominate the computation? There is a very short discussion of mesh distribution in https://arxiv.org/abs/2102.13018 section 6.3 Figure 11 that was run on the Summit system. Certainly there is no intention that mesh distribution dominates the entire computation, your particular case and the behavior of DMPLEX would need to understood on Summit to determine the problems. DMPLEX and all its related components are rapidly evolving and hence performance can change quickly with new updates. I urge you to use the main branch of PETSc for HPC and timing studies of DMPLEX performance on large systems, do not just use PETSc releases. You can communicate directly with those working on scaling DMPLEX at gitlab.com/petsc/petsc who can help understand the cause of the performance issues on Summit. Barry > On Mar 5, 2021, at 3:06 PM, Alexei Colin wrote: > > To PETSc DMPlex users, Firedrake users, Dr. Knepley and Dr. Karpeev: > > Is it expected for mesh distribution step to > (A) take a share of 50-99% of total time-to-solution of an FEM problem, and > (B) take an amount of time that increases with the number of ranks, and > (C) take an amount of memory on rank 0 that does not decrease with the > number of ranks > ? > > The attached plots suggest (A), (B), and (C) is happening for > Cahn-Hilliard problem (from firedrake-bench repo) on a 2D 8Kx8K > unit-square mesh. The implementation is here [1]. Versions are > Firedrake, PyOp2: 20200204.0; PETSc 3.13.1; ParMETIS 4.0.3. > > Two questions, one on (A) and the other on (B)+(C): > > 1. Is (A) result expected? Given (A), any effort to improve the quality > of the compiled assembly kernels (or anything else other than mesh > distribution) appears futile since it takes 1% of end-to-end execution > time, or am I missing something? > > 1a. Is mesh distribution fundamentally necessary for any FEM framework, > or is it only needed by Firedrake? If latter, then how do other > frameworks partition the mesh and execute in parallel with MPI but avoid > the non-scalable mesh destribution step? > > 2. Results (B) and (C) suggest that the mesh distribution step does > not scale. Is it a fundamental property of the mesh distribution problem > that it has a central bottleneck in the master process, or is it > a limitation of the current implementation in PETSc-DMPlex? > > 2a. Our (B) result seems to agree with Figure 4(left) of [2]. Fig 6 of [2] > suggests a way to reduce the time spent on sequential bottleneck by > "parallel mesh refinment" that creates high-resolution meshes from an > initial coarse mesh. Is this approach implemented in DMPLex? If so, any > pointers on how to try it out with Firedrake? If not, any other > directions for reducing this bottleneck? > > 2b. Fig 6 in [3] shows plots for Assembly and Solve steps that scale well up > to 96 cores -- is mesh distribution included in those times? 
Is anyone > reading this aware of any other publications with evaluations of > Firedrake that measure mesh distribution (or explain how to avoid or > exclude it)? > > Thank you for your time and any info or tips. > > > [1] https://github.com/ISI-apex/firedrake-bench/blob/master/cahn_hilliard/firedrake_cahn_hilliard_problem.py > > [2] Unstructured Overlapping Mesh Distribution in Parallel, Matthew G. > Knepley, Michael Lange, Gerard J. Gorman, 2015. > https://arxiv.org/pdf/1506.06194.pdf > > [3] Efficient mesh management in Firedrake using PETSc-DMPlex, Michael > Lange, Lawrence Mitchell, Matthew G. Knepley and Gerard J. Gorman, SISC, > 38(5), S143-S155, 2016. http://arxiv.org/abs/1506.07749 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at buffalo.edu Fri Mar 5 21:04:39 2021 From: knepley at buffalo.edu (Matthew Knepley) Date: Fri, 5 Mar 2021 22:04:39 -0500 Subject: [petsc-users] DMPlex in Firedrake: scaling of mesh distribution In-Reply-To: <7dde1a8f447e4880b96f3dd7c4b315ae@MBX-LS5.itorg.ad.buffalo.edu> References: <7dde1a8f447e4880b96f3dd7c4b315ae@MBX-LS5.itorg.ad.buffalo.edu> Message-ID: On Fri, Mar 5, 2021 at 4:06 PM Alexei Colin wrote: > To PETSc DMPlex users, Firedrake users, Dr. Knepley and Dr. Karpeev: > > Is it expected for mesh distribution step to > (A) take a share of 50-99% of total time-to-solution of an FEM problem, and > No > (B) take an amount of time that increases with the number of ranks, and > See below. > (C) take an amount of memory on rank 0 that does not decrease with the > number of ranks > The problem here is that a serial mesh is being partitioned and sent to all processes. This is fundamentally non-scalable, but it is easy and works well for modest clusters < 100 nodes or so. Above this, it will take increasing amounts of time. There are a few techniques for mitigating this. a) For simple domains, you can distribute a coarse grid, then regularly refine that in parallel with DMRefine() or -dm_refine . These steps can be repeated easily, and redistribution in parallel is fast, as shown for example in [1]. b) For complex meshes, you can read them in parallel, and then repeat a). This is done in [1]. It is a little more involved, but not much. c) You can do a multilevel partitioning, as they do in [2]. I cannot find the paper in which they describe this right now. It is feasible, but definitely the most expert approach. Does this make sense? Thanks, Matt [1] Fully Parallel Mesh I/O using PETSc DMPlex with an Application to Waveform Modeling, Hapla et.al. https://arxiv.org/abs/2004.08729 [2] On the robustness and performance of entropy stable discontinuous collocation methods for the compressible Navier-Stokes equations, ROjas . et.al. https://arxiv.org/abs/1911.10966 > ? > > The attached plots suggest (A), (B), and (C) is happening for > Cahn-Hilliard problem (from firedrake-bench repo) on a 2D 8Kx8K > unit-square mesh. The implementation is here [1]. Versions are > Firedrake, PyOp2: 20200204.0; PETSc 3.13.1; ParMETIS 4.0.3. > > Two questions, one on (A) and the other on (B)+(C): > > 1. Is (A) result expected? Given (A), any effort to improve the quality > of the compiled assembly kernels (or anything else other than mesh > distribution) appears futile since it takes 1% of end-to-end execution > time, or am I missing something? > > 1a. Is mesh distribution fundamentally necessary for any FEM framework, > or is it only needed by Firedrake? 
If latter, then how do other > frameworks partition the mesh and execute in parallel with MPI but avoid > the non-scalable mesh destribution step? > > 2. Results (B) and (C) suggest that the mesh distribution step does > not scale. Is it a fundamental property of the mesh distribution problem > that it has a central bottleneck in the master process, or is it > a limitation of the current implementation in PETSc-DMPlex? > > 2a. Our (B) result seems to agree with Figure 4(left) of [2]. Fig 6 of [2] > suggests a way to reduce the time spent on sequential bottleneck by > "parallel mesh refinment" that creates high-resolution meshes from an > initial coarse mesh. Is this approach implemented in DMPLex? If so, any > pointers on how to try it out with Firedrake? If not, any other > directions for reducing this bottleneck? > > 2b. Fig 6 in [3] shows plots for Assembly and Solve steps that scale well > up > to 96 cores -- is mesh distribution included in those times? Is anyone > reading this aware of any other publications with evaluations of > Firedrake that measure mesh distribution (or explain how to avoid or > exclude it)? > > Thank you for your time and any info or tips. > > > [1] > https://github.com/ISI-apex/firedrake-bench/blob/master/cahn_hilliard/firedrake_cahn_hilliard_problem.py > > [2] Unstructured Overlapping Mesh Distribution in Parallel, Matthew G. > Knepley, Michael Lange, Gerard J. Gorman, 2015. > https://arxiv.org/pdf/1506.06194.pdf > > [3] Efficient mesh management in Firedrake using PETSc-DMPlex, Michael > Lange, Lawrence Mitchell, Matthew G. Knepley and Gerard J. Gorman, SISC, > 38(5), S143-S155, 2016. http://arxiv.org/abs/1506.07749 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Sat Mar 6 20:48:22 2021 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Sat, 6 Mar 2021 20:48:22 -0600 Subject: [petsc-users] DMPlex in Firedrake: scaling of mesh distribution In-Reply-To: References: <7dde1a8f447e4880b96f3dd7c4b315ae@MBX-LS5.itorg.ad.buffalo.edu> Message-ID: On Sat, Mar 6, 2021 at 12:27 PM Matthew Knepley wrote: > On Fri, Mar 5, 2021 at 4:06 PM Alexei Colin wrote: > >> To PETSc DMPlex users, Firedrake users, Dr. Knepley and Dr. Karpeev: >> >> Is it expected for mesh distribution step to >> (A) take a share of 50-99% of total time-to-solution of an FEM problem, >> and >> > > No > > >> (B) take an amount of time that increases with the number of ranks, and >> > > See below. > > >> (C) take an amount of memory on rank 0 that does not decrease with the >> number of ranks >> > > The problem here is that a serial mesh is being partitioned and sent to > all processes. This is fundamentally > non-scalable, but it is easy and works well for modest clusters < 100 > nodes or so. Above this, it will take > increasing amounts of time. There are a few techniques for mitigating this. > Is this one-to-all communication only done once? If yes, one MPI_Scatterv() is enough and should not cost much. a) For simple domains, you can distribute a coarse grid, then regularly > refine that in parallel with DMRefine() or -dm_refine . > These steps can be repeated easily, and redistribution in parallel is > fast, as shown for example in [1]. > > b) For complex meshes, you can read them in parallel, and then repeat a). > This is done in [1]. It is a little more involved, > but not much. > > c) You can do a multilevel partitioning, as they do in [2]. I cannot find > the paper in which they describe this right now. 
It is feasible, > but definitely the most expert approach. > > Does this make sense? > > Thanks, > > Matt > > [1] Fully Parallel Mesh I/O using PETSc DMPlex with an Application to > Waveform Modeling, Hapla et.al. > https://arxiv.org/abs/2004.08729 > [2] On the robustness and performance of entropy stable discontinuous > collocation methods for the compressible Navier-Stokes equations, ROjas . > et.al. > https://arxiv.org/abs/1911.10966 > > >> ? >> >> The attached plots suggest (A), (B), and (C) is happening for >> Cahn-Hilliard problem (from firedrake-bench repo) on a 2D 8Kx8K >> unit-square mesh. The implementation is here [1]. Versions are >> Firedrake, PyOp2: 20200204.0; PETSc 3.13.1; ParMETIS 4.0.3. >> >> Two questions, one on (A) and the other on (B)+(C): >> >> 1. Is (A) result expected? Given (A), any effort to improve the quality >> of the compiled assembly kernels (or anything else other than mesh >> distribution) appears futile since it takes 1% of end-to-end execution >> time, or am I missing something? >> >> 1a. Is mesh distribution fundamentally necessary for any FEM framework, >> or is it only needed by Firedrake? If latter, then how do other >> frameworks partition the mesh and execute in parallel with MPI but avoid >> the non-scalable mesh destribution step? >> >> 2. Results (B) and (C) suggest that the mesh distribution step does >> not scale. Is it a fundamental property of the mesh distribution problem >> that it has a central bottleneck in the master process, or is it >> a limitation of the current implementation in PETSc-DMPlex? >> >> 2a. Our (B) result seems to agree with Figure 4(left) of [2]. Fig 6 of [2] >> suggests a way to reduce the time spent on sequential bottleneck by >> "parallel mesh refinment" that creates high-resolution meshes from an >> initial coarse mesh. Is this approach implemented in DMPLex? If so, any >> pointers on how to try it out with Firedrake? If not, any other >> directions for reducing this bottleneck? >> >> 2b. Fig 6 in [3] shows plots for Assembly and Solve steps that scale well >> up >> to 96 cores -- is mesh distribution included in those times? Is anyone >> reading this aware of any other publications with evaluations of >> Firedrake that measure mesh distribution (or explain how to avoid or >> exclude it)? >> >> Thank you for your time and any info or tips. >> >> >> [1] >> https://github.com/ISI-apex/firedrake-bench/blob/master/cahn_hilliard/firedrake_cahn_hilliard_problem.py >> >> [2] Unstructured Overlapping Mesh Distribution in Parallel, Matthew G. >> Knepley, Michael Lange, Gerard J. Gorman, 2015. >> https://arxiv.org/pdf/1506.06194.pdf >> >> [3] Efficient mesh management in Firedrake using PETSc-DMPlex, Michael >> Lange, Lawrence Mitchell, Matthew G. Knepley and Gerard J. Gorman, SISC, >> 38(5), S143-S155, 2016. http://arxiv.org/abs/1506.07749 >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Sat Mar 6 21:46:34 2021 From: mfadams at lbl.gov (Mark Adams) Date: Sat, 6 Mar 2021 22:46:34 -0500 Subject: [petsc-users] DMPlex in Firedrake: scaling of mesh distribution In-Reply-To: References: <7dde1a8f447e4880b96f3dd7c4b315ae@MBX-LS5.itorg.ad.buffalo.edu> Message-ID: I observed poor scaling with mat/tests/ex13 on Fugaku recently. I was running this test as is (eg, no threads and 4 MPI processes per node/chip, which seems recomended). I did not dig into this. A test with about 10% of the machine took about 45 minutes to run. 
Mark On Sat, Mar 6, 2021 at 9:49 PM Junchao Zhang wrote: > > > > On Sat, Mar 6, 2021 at 12:27 PM Matthew Knepley > wrote: > >> On Fri, Mar 5, 2021 at 4:06 PM Alexei Colin wrote: >> >>> To PETSc DMPlex users, Firedrake users, Dr. Knepley and Dr. Karpeev: >>> >>> Is it expected for mesh distribution step to >>> (A) take a share of 50-99% of total time-to-solution of an FEM problem, >>> and >>> >> >> No >> >> >>> (B) take an amount of time that increases with the number of ranks, and >>> >> >> See below. >> >> >>> (C) take an amount of memory on rank 0 that does not decrease with the >>> number of ranks >>> >> >> The problem here is that a serial mesh is being partitioned and sent to >> all processes. This is fundamentally >> non-scalable, but it is easy and works well for modest clusters < 100 >> nodes or so. Above this, it will take >> increasing amounts of time. There are a few techniques for mitigating >> this. >> > Is this one-to-all communication only done once? If yes, one > MPI_Scatterv() is enough and should not cost much. > > a) For simple domains, you can distribute a coarse grid, then regularly >> refine that in parallel with DMRefine() or -dm_refine . >> These steps can be repeated easily, and redistribution in parallel is >> fast, as shown for example in [1]. >> >> b) For complex meshes, you can read them in parallel, and then repeat a). >> This is done in [1]. It is a little more involved, >> but not much. >> >> c) You can do a multilevel partitioning, as they do in [2]. I cannot find >> the paper in which they describe this right now. It is feasible, >> but definitely the most expert approach. >> >> Does this make sense? >> >> Thanks, >> >> Matt >> >> [1] Fully Parallel Mesh I/O using PETSc DMPlex with an Application to >> Waveform Modeling, Hapla et.al. >> https://arxiv.org/abs/2004.08729 >> [2] On the robustness and performance of entropy stable discontinuous >> collocation methods for the compressible Navier-Stokes equations, ROjas . >> et.al. >> https://arxiv.org/abs/1911.10966 >> >> >>> ? >>> >>> The attached plots suggest (A), (B), and (C) is happening for >>> Cahn-Hilliard problem (from firedrake-bench repo) on a 2D 8Kx8K >>> unit-square mesh. The implementation is here [1]. Versions are >>> Firedrake, PyOp2: 20200204.0; PETSc 3.13.1; ParMETIS 4.0.3. >>> >>> Two questions, one on (A) and the other on (B)+(C): >>> >>> 1. Is (A) result expected? Given (A), any effort to improve the quality >>> of the compiled assembly kernels (or anything else other than mesh >>> distribution) appears futile since it takes 1% of end-to-end execution >>> time, or am I missing something? >>> >>> 1a. Is mesh distribution fundamentally necessary for any FEM framework, >>> or is it only needed by Firedrake? If latter, then how do other >>> frameworks partition the mesh and execute in parallel with MPI but avoid >>> the non-scalable mesh destribution step? >>> >>> 2. Results (B) and (C) suggest that the mesh distribution step does >>> not scale. Is it a fundamental property of the mesh distribution problem >>> that it has a central bottleneck in the master process, or is it >>> a limitation of the current implementation in PETSc-DMPlex? >>> >>> 2a. Our (B) result seems to agree with Figure 4(left) of [2]. Fig 6 of >>> [2] >>> suggests a way to reduce the time spent on sequential bottleneck by >>> "parallel mesh refinment" that creates high-resolution meshes from an >>> initial coarse mesh. Is this approach implemented in DMPLex? If so, any >>> pointers on how to try it out with Firedrake? 
If not, any other >>> directions for reducing this bottleneck? >>> >>> 2b. Fig 6 in [3] shows plots for Assembly and Solve steps that scale >>> well up >>> to 96 cores -- is mesh distribution included in those times? Is anyone >>> reading this aware of any other publications with evaluations of >>> Firedrake that measure mesh distribution (or explain how to avoid or >>> exclude it)? >>> >>> Thank you for your time and any info or tips. >>> >>> >>> [1] >>> https://github.com/ISI-apex/firedrake-bench/blob/master/cahn_hilliard/firedrake_cahn_hilliard_problem.py >>> >>> [2] Unstructured Overlapping Mesh Distribution in Parallel, Matthew G. >>> Knepley, Michael Lange, Gerard J. Gorman, 2015. >>> https://arxiv.org/pdf/1506.06194.pdf >>> >>> [3] Efficient mesh management in Firedrake using PETSc-DMPlex, Michael >>> Lange, Lawrence Mitchell, Matthew G. Knepley and Gerard J. Gorman, SISC, >>> 38(5), S143-S155, 2016. http://arxiv.org/abs/1506.07749 >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at buffalo.edu Sat Mar 6 22:29:20 2021 From: knepley at buffalo.edu (Matthew Knepley) Date: Sat, 6 Mar 2021 23:29:20 -0500 Subject: [petsc-users] DMPlex in Firedrake: scaling of mesh distribution In-Reply-To: References: <7dde1a8f447e4880b96f3dd7c4b315ae@MBX-LS5.itorg.ad.buffalo.edu> Message-ID: On Sat, Mar 6, 2021 at 9:48 PM Junchao Zhang wrote: > On Sat, Mar 6, 2021 at 12:27 PM Matthew Knepley > wrote: > >> On Fri, Mar 5, 2021 at 4:06 PM Alexei Colin wrote: >> >>> To PETSc DMPlex users, Firedrake users, Dr. Knepley and Dr. Karpeev: >>> >>> Is it expected for mesh distribution step to >>> (A) take a share of 50-99% of total time-to-solution of an FEM problem, >>> and >>> >> >> No >> >> >>> (B) take an amount of time that increases with the number of ranks, and >>> >> >> See below. >> >> >>> (C) take an amount of memory on rank 0 that does not decrease with the >>> number of ranks >>> >> >> The problem here is that a serial mesh is being partitioned and sent to >> all processes. This is fundamentally >> non-scalable, but it is easy and works well for modest clusters < 100 >> nodes or so. Above this, it will take >> increasing amounts of time. There are a few techniques for mitigating >> this. >> > Is this one-to-all communication only done once? If yes, one > MPI_Scatterv() is enough and should not cost much. > No, there are several rounds of communication. This is definitely the bottleneck. We have measured it many times. THanks, Matt > a) For simple domains, you can distribute a coarse grid, then regularly >> refine that in parallel with DMRefine() or -dm_refine . >> These steps can be repeated easily, and redistribution in parallel is >> fast, as shown for example in [1]. >> >> b) For complex meshes, you can read them in parallel, and then repeat a). >> This is done in [1]. It is a little more involved, >> but not much. >> >> c) You can do a multilevel partitioning, as they do in [2]. I cannot find >> the paper in which they describe this right now. It is feasible, >> but definitely the most expert approach. >> >> Does this make sense? >> >> Thanks, >> >> Matt >> >> [1] Fully Parallel Mesh I/O using PETSc DMPlex with an Application to >> Waveform Modeling, Hapla et.al. >> https://arxiv.org/abs/2004.08729 >> [2] On the robustness and performance of entropy stable discontinuous >> collocation methods for the compressible Navier-Stokes equations, ROjas . >> et.al. >> https://arxiv.org/abs/1911.10966 >> >> >>> ? 
>>> >>> The attached plots suggest (A), (B), and (C) is happening for >>> Cahn-Hilliard problem (from firedrake-bench repo) on a 2D 8Kx8K >>> unit-square mesh. The implementation is here [1]. Versions are >>> Firedrake, PyOp2: 20200204.0; PETSc 3.13.1; ParMETIS 4.0.3. >>> >>> Two questions, one on (A) and the other on (B)+(C): >>> >>> 1. Is (A) result expected? Given (A), any effort to improve the quality >>> of the compiled assembly kernels (or anything else other than mesh >>> distribution) appears futile since it takes 1% of end-to-end execution >>> time, or am I missing something? >>> >>> 1a. Is mesh distribution fundamentally necessary for any FEM framework, >>> or is it only needed by Firedrake? If latter, then how do other >>> frameworks partition the mesh and execute in parallel with MPI but avoid >>> the non-scalable mesh destribution step? >>> >>> 2. Results (B) and (C) suggest that the mesh distribution step does >>> not scale. Is it a fundamental property of the mesh distribution problem >>> that it has a central bottleneck in the master process, or is it >>> a limitation of the current implementation in PETSc-DMPlex? >>> >>> 2a. Our (B) result seems to agree with Figure 4(left) of [2]. Fig 6 of >>> [2] >>> suggests a way to reduce the time spent on sequential bottleneck by >>> "parallel mesh refinment" that creates high-resolution meshes from an >>> initial coarse mesh. Is this approach implemented in DMPLex? If so, any >>> pointers on how to try it out with Firedrake? If not, any other >>> directions for reducing this bottleneck? >>> >>> 2b. Fig 6 in [3] shows plots for Assembly and Solve steps that scale >>> well up >>> to 96 cores -- is mesh distribution included in those times? Is anyone >>> reading this aware of any other publications with evaluations of >>> Firedrake that measure mesh distribution (or explain how to avoid or >>> exclude it)? >>> >>> Thank you for your time and any info or tips. >>> >>> >>> [1] >>> https://github.com/ISI-apex/firedrake-bench/blob/master/cahn_hilliard/firedrake_cahn_hilliard_problem.py >>> >>> [2] Unstructured Overlapping Mesh Distribution in Parallel, Matthew G. >>> Knepley, Michael Lange, Gerard J. Gorman, 2015. >>> https://arxiv.org/pdf/1506.06194.pdf >>> >>> [3] Efficient mesh management in Firedrake using PETSc-DMPlex, Michael >>> Lange, Lawrence Mitchell, Matthew G. Knepley and Gerard J. Gorman, SISC, >>> 38(5), S143-S155, 2016. http://arxiv.org/abs/1506.07749 >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Sun Mar 7 05:12:45 2021 From: bsmith at petsc.dev (Barry Smith) Date: Sun, 7 Mar 2021 05:12:45 -0600 Subject: [petsc-users] DMPlex in Firedrake: scaling of mesh distribution In-Reply-To: References: <7dde1a8f447e4880b96f3dd7c4b315ae@MBX-LS5.itorg.ad.buffalo.edu> Message-ID: <42DA83E8-8BC8-4A18-9C7F-9F9F3F1AA220@petsc.dev> mat/tests/ex13.c creates a sequential AIJ matrix, converts it to the same format, reorders it and then prints it and the reordering in ASCII. Each of these steps is sequential and takes place on each rank. The prints are ASCII stdout on the ranks. 
   ierr = MatCreateSeqAIJ(PETSC_COMM_SELF,m*n,m*n,5,NULL,&C);CHKERRQ(ierr);  /* create the matrix for the five point stencil, YET AGAIN*/
   for (i=0; i<m; i++) {
     for (j=0; j<n; j++) {
       v = -1.0; Ii = j + n*i;
       if (i>0)   {J = Ii - n; ierr = MatSetValues(C,1,&Ii,1,&J,&v,INSERT_VALUES);CHKERRQ(ierr);}
       if (i<m-1) {J = Ii + n; ierr = MatSetValues(C,1,&Ii,1,&J,&v,INSERT_VALUES);CHKERRQ(ierr);}
       if (j>0)   {J = Ii - 1; ierr = MatSetValues(C,1,&Ii,1,&J,&v,INSERT_VALUES);CHKERRQ(ierr);}
       if (j<n-1) {J = Ii + 1; ierr = MatSetValues(C,1,&Ii,1,&J,&v,INSERT_VALUES);CHKERRQ(ierr);}
       v = 4.0; ierr = MatSetValues(C,1,&Ii,1,&Ii,&v,INSERT_VALUES);CHKERRQ(ierr);
     }
   }
   ierr = MatAssemblyBegin(C,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
   ierr = MatAssemblyEnd(C,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

   ierr = MatConvert(C,MATSAME,MAT_INITIAL_MATRIX,&A);CHKERRQ(ierr);

   ierr = MatGetOrdering(A,MATORDERINGND,&perm,&iperm);CHKERRQ(ierr);
   ierr = ISView(perm,PETSC_VIEWER_STDOUT_SELF);CHKERRQ(ierr);
   ierr = ISView(iperm,PETSC_VIEWER_STDOUT_SELF);CHKERRQ(ierr);
   ierr = MatView(A,PETSC_VIEWER_STDOUT_SELF);CHKERRQ(ierr);

I think each rank would simply be running the same code and dumping everything to its own stdout.

At some point within the system/MPI executor there is code that merges and prints out the stdout of each rank. If the test truly takes 45 minutes, then Fugaku has a classic bug of not being able to efficiently merge stdout from each of the ranks. Nothing really to do with PETSc, just neglect by the Fugaku developers to respect all aspects of developing an HPC system. Heck, they only had a billion dollars, can't expect them to do what other scalable systems do :-).

One should be able to reproduce this with a simple MPI program that prints a moderate amount of data to stdout on each rank.

  Barry


> On Mar 6, 2021, at 9:46 PM, Mark Adams wrote: > > I observed poor scaling with mat/tests/ex13 on Fugaku recently. > I was running this test as is (eg, no threads and 4 MPI processes per node/chip, which seems recommended). I did not dig into this. > A test with about 10% of the machine took about 45 minutes to run. > Mark > > On Sat, Mar 6, 2021 at 9:49 PM Junchao Zhang > wrote: > > > > On Sat, Mar 6, 2021 at 12:27 PM Matthew Knepley > wrote: > On Fri, Mar 5, 2021 at 4:06 PM Alexei Colin > wrote: > To PETSc DMPlex users, Firedrake users, Dr. Knepley and Dr. Karpeev: > > Is it expected for mesh distribution step to > (A) take a share of 50-99% of total time-to-solution of an FEM problem, and > > No > > (B) take an amount of time that increases with the number of ranks, and > > See below. > > (C) take an amount of memory on rank 0 that does not decrease with the > number of ranks > > The problem here is that a serial mesh is being partitioned and sent to all processes. This is fundamentally > non-scalable, but it is easy and works well for modest clusters < 100 nodes or so. Above this, it will take > increasing amounts of time. There are a few techniques for mitigating this. > Is this one-to-all communication only done once? If yes, one MPI_Scatterv() is enough and should not cost much. > > a) For simple domains, you can distribute a coarse grid, then regularly refine that in parallel with DMRefine() or -dm_refine . > These steps can be repeated easily, and redistribution in parallel is fast, as shown for example in [1]. > > b) For complex meshes, you can read them in parallel, and then repeat a). This is done in [1]. It is a little more involved, > but not much. > > c) You can do a multilevel partitioning, as they do in [2]. I cannot find the paper in which they describe this right now. It is feasible, > but definitely the most expert approach. > > Does this make sense? > > Thanks, > > Matt > > [1] Fully Parallel Mesh I/O using PETSc DMPlex with an Application to Waveform Modeling, Hapla et al. > https://arxiv.org/abs/2004.08729 > [2] On the robustness and performance of entropy stable discontinuous collocation methods for the compressible Navier-Stokes equations, Rojas et al. > https://arxiv.org/abs/1911.10966 > > ? > > The attached plots suggest (A), (B), and (C) is happening for > Cahn-Hilliard problem (from firedrake-bench repo) on a 2D 8Kx8K > unit-square mesh. The implementation is here [1]. Versions are > Firedrake, PyOp2: 20200204.0; PETSc 3.13.1; ParMETIS 4.0.3. > > Two questions, one on (A) and the other on (B)+(C): > > 1. Is (A) result expected? Given (A), any effort to improve the quality > of the compiled assembly kernels (or anything else other than mesh > distribution) appears futile since it takes 1% of end-to-end execution > time, or am I missing something? > > 1a. Is mesh distribution fundamentally necessary for any FEM framework, > or is it only needed by Firedrake? If the latter, then how do other > frameworks partition the mesh and execute in parallel with MPI but avoid > the non-scalable mesh distribution step? > > 2. Results (B) and (C) suggest that the mesh distribution step does > not scale.
Is it a fundamental property of the mesh distribution problem > that it has a central bottleneck in the master process, or is it > a limitation of the current implementation in PETSc-DMPlex? > > 2a. Our (B) result seems to agree with Figure 4(left) of [2]. Fig 6 of [2] > suggests a way to reduce the time spent on sequential bottleneck by > "parallel mesh refinment" that creates high-resolution meshes from an > initial coarse mesh. Is this approach implemented in DMPLex? If so, any > pointers on how to try it out with Firedrake? If not, any other > directions for reducing this bottleneck? > > 2b. Fig 6 in [3] shows plots for Assembly and Solve steps that scale well up > to 96 cores -- is mesh distribution included in those times? Is anyone > reading this aware of any other publications with evaluations of > Firedrake that measure mesh distribution (or explain how to avoid or > exclude it)? > > Thank you for your time and any info or tips. > > > [1] https://github.com/ISI-apex/firedrake-bench/blob/master/cahn_hilliard/firedrake_cahn_hilliard_problem.py > > [2] Unstructured Overlapping Mesh Distribution in Parallel, Matthew G. > Knepley, Michael Lange, Gerard J. Gorman, 2015. > https://arxiv.org/pdf/1506.06194.pdf > > [3] Efficient mesh management in Firedrake using PETSc-DMPlex, Michael > Lange, Lawrence Mitchell, Matthew G. Knepley and Gerard J. Gorman, SISC, > 38(5), S143-S155, 2016. http://arxiv.org/abs/1506.07749 -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Sun Mar 7 06:27:31 2021 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Sun, 7 Mar 2021 15:27:31 +0300 Subject: [petsc-users] DMPlex in Firedrake: scaling of mesh distribution In-Reply-To: References: <7dde1a8f447e4880b96f3dd7c4b315ae@MBX-LS5.itorg.ad.buffalo.edu> Message-ID: [2] On the robustness and performance of entropy stable discontinuous > collocation methods for the compressible Navier-Stokes equations, ROjas . > et.al. > https://arxiv.org/abs/1911.10966 > This is not the proper reference, here is the correct one https://www.sciencedirect.com/science/article/pii/S0021999120306185?dgcid=rss_sd_all However, there the algorithm is only outlined, and performances related to the mesh distribution are not really reported. We observed a large gain for large core counts and one to all distributions (from minutes to seconds) by splitting the several communication rounds needed by DMPlex into stages: from rank 0 to 1 rank per node, and then decomposing independently within the node. Attached the total time for one-to-all DMPlexDistrbute for a 128^3 mesh > > >> ? >> >> The attached plots suggest (A), (B), and (C) is happening for >> Cahn-Hilliard problem (from firedrake-bench repo) on a 2D 8Kx8K >> unit-square mesh. The implementation is here [1]. Versions are >> Firedrake, PyOp2: 20200204.0; PETSc 3.13.1; ParMETIS 4.0.3. >> >> Two questions, one on (A) and the other on (B)+(C): >> >> 1. Is (A) result expected? Given (A), any effort to improve the quality >> of the compiled assembly kernels (or anything else other than mesh >> distribution) appears futile since it takes 1% of end-to-end execution >> time, or am I missing something? >> >> 1a. Is mesh distribution fundamentally necessary for any FEM framework, >> or is it only needed by Firedrake? If latter, then how do other >> frameworks partition the mesh and execute in parallel with MPI but avoid >> the non-scalable mesh destribution step? >> >> 2. 
Results (B) and (C) suggest that the mesh distribution step does >> not scale. Is it a fundamental property of the mesh distribution problem >> that it has a central bottleneck in the master process, or is it >> a limitation of the current implementation in PETSc-DMPlex? >> >> 2a. Our (B) result seems to agree with Figure 4(left) of [2]. Fig 6 of [2] >> suggests a way to reduce the time spent on sequential bottleneck by >> "parallel mesh refinment" that creates high-resolution meshes from an >> initial coarse mesh. Is this approach implemented in DMPLex? If so, any >> pointers on how to try it out with Firedrake? If not, any other >> directions for reducing this bottleneck? >> >> 2b. Fig 6 in [3] shows plots for Assembly and Solve steps that scale well >> up >> to 96 cores -- is mesh distribution included in those times? Is anyone >> reading this aware of any other publications with evaluations of >> Firedrake that measure mesh distribution (or explain how to avoid or >> exclude it)? >> >> Thank you for your time and any info or tips. >> >> >> [1] >> https://github.com/ISI-apex/firedrake-bench/blob/master/cahn_hilliard/firedrake_cahn_hilliard_problem.py >> >> [2] Unstructured Overlapping Mesh Distribution in Parallel, Matthew G. >> Knepley, Michael Lange, Gerard J. Gorman, 2015. >> https://arxiv.org/pdf/1506.06194.pdf >> >> [3] Efficient mesh management in Firedrake using PETSc-DMPlex, Michael >> Lange, Lawrence Mitchell, Matthew G. Knepley and Gerard J. Gorman, SISC, >> 38(5), S143-S155, 2016. http://arxiv.org/abs/1506.07749 >> > -- Stefano -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: totl_seq.png Type: image/png Size: 25242 bytes Desc: not available URL: From mfadams at lbl.gov Sun Mar 7 07:06:49 2021 From: mfadams at lbl.gov (Mark Adams) Date: Sun, 7 Mar 2021 08:06:49 -0500 Subject: [petsc-users] DMPlex in Firedrake: scaling of mesh distribution In-Reply-To: <42DA83E8-8BC8-4A18-9C7F-9F9F3F1AA220@petsc.dev> References: <7dde1a8f447e4880b96f3dd7c4b315ae@MBX-LS5.itorg.ad.buffalo.edu> <42DA83E8-8BC8-4A18-9C7F-9F9F3F1AA220@petsc.dev> Message-ID: Whoop, snes/tests/ex13.c. This is what I used for the Summit runs that I presented a while ago. On Sun, Mar 7, 2021 at 6:12 AM Barry Smith wrote: > > mat/tests/ex13.c creates a sequential AIJ matrix, converts it to the > same format, reorders it and then prints it and the reordering in ASCII. > Each of these steps is sequential and takes place on each rank. The prints > are ASCII stdout on the ranks. 
> > ierr = MatCreateSeqAIJ(PETSC_COMM_SELF,m*n,m*n,5,NULL,&C);CHKERRQ(ierr); > /* create the matrix for the five point stencil, YET AGAIN*/ > for (i=0; i for (j=0; j v = -1.0; Ii = j + n*i; > if (i>0) {J = Ii - n; ierr = > MatSetValues(C,1,&Ii,1,&J,&v,INSERT_VALUES);CHKERRQ(ierr);} > if (i MatSetValues(C,1,&Ii,1,&J,&v,INSERT_VALUES);CHKERRQ(ierr);} > if (j>0) {J = Ii - 1; ierr = > MatSetValues(C,1,&Ii,1,&J,&v,INSERT_VALUES);CHKERRQ(ierr);} > if (j MatSetValues(C,1,&Ii,1,&J,&v,INSERT_VALUES);CHKERRQ(ierr);} > v = 4.0; ierr = > MatSetValues(C,1,&Ii,1,&Ii,&v,INSERT_VALUES);CHKERRQ(ierr); > } > } > ierr = MatAssemblyBegin(C,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = MatAssemblyEnd(C,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > > ierr = MatConvert(C,MATSAME,MAT_INITIAL_MATRIX,&A);CHKERRQ(ierr); > > ierr = MatGetOrdering(A,MATORDERINGND,&perm,&iperm);CHKERRQ(ierr); > ierr = ISView(perm,PETSC_VIEWER_STDOUT_SELF);CHKERRQ(ierr); > ierr = ISView(iperm,PETSC_VIEWER_STDOUT_SELF);CHKERRQ(ierr); > ierr = MatView(A,PETSC_VIEWER_STDOUT_SELF);CHKERRQ(ierr); > > I think each rank would simply be running the same code and dumping > everything to its own stdout. > > At some point within the system/MPI executor there is code that merges and > print outs the stdout of each rank. If the test does truly take 45 minutes > than Fugaku has a classic bug of not being able to efficiently merge stdout > from each of the ranks. Nothing really to do with PETSc, just neglect of > Fugaku developers to respect all aspects of developing a HPC system. Heck, > they only had a billion dollars, can't expect them to do what other > scalable systems do :-). > > One should be able to reproduce this with a simple MPI program that prints > a moderate amount of data to stdout on each rank. > > Barry > > > > > On Mar 6, 2021, at 9:46 PM, Mark Adams wrote: > > I observed poor scaling with mat/tests/ex13 on Fugaku recently. > I was running this test as is (eg, no threads and 4 MPI processes per > node/chip, which seems recomended). I did not dig into this. > A test with about 10% of the machine took about 45 minutes to run. > Mark > > On Sat, Mar 6, 2021 at 9:49 PM Junchao Zhang > wrote: > >> >> >> >> On Sat, Mar 6, 2021 at 12:27 PM Matthew Knepley >> wrote: >> >>> On Fri, Mar 5, 2021 at 4:06 PM Alexei Colin wrote: >>> >>>> To PETSc DMPlex users, Firedrake users, Dr. Knepley and Dr. Karpeev: >>>> >>>> Is it expected for mesh distribution step to >>>> (A) take a share of 50-99% of total time-to-solution of an FEM problem, >>>> and >>>> >>> >>> No >>> >>> >>>> (B) take an amount of time that increases with the number of ranks, and >>>> >>> >>> See below. >>> >>> >>>> (C) take an amount of memory on rank 0 that does not decrease with the >>>> number of ranks >>>> >>> >>> The problem here is that a serial mesh is being partitioned and sent to >>> all processes. This is fundamentally >>> non-scalable, but it is easy and works well for modest clusters < 100 >>> nodes or so. Above this, it will take >>> increasing amounts of time. There are a few techniques for mitigating >>> this. >>> >> Is this one-to-all communication only done once? If yes, one >> MPI_Scatterv() is enough and should not cost much. >> >> a) For simple domains, you can distribute a coarse grid, then regularly >>> refine that in parallel with DMRefine() or -dm_refine . >>> These steps can be repeated easily, and redistribution in parallel >>> is fast, as shown for example in [1]. >>> >>> b) For complex meshes, you can read them in parallel, and then repeat >>> a). 
This is done in [1]. It is a little more involved, >>> but not much. >>> >>> c) You can do a multilevel partitioning, as they do in [2]. I cannot >>> find the paper in which they describe this right now. It is feasible, >>> but definitely the most expert approach. >>> >>> Does this make sense? >>> >>> Thanks, >>> >>> Matt >>> >>> [1] Fully Parallel Mesh I/O using PETSc DMPlex with an Application to >>> Waveform Modeling, Hapla et.al. >>> https://arxiv.org/abs/2004.08729 >>> [2] On the robustness and performance of entropy stable discontinuous >>> collocation methods for the compressible Navier-Stokes equations, ROjas . >>> et.al. >>> https://arxiv.org/abs/1911.10966 >>> >>> >>>> ? >>>> >>>> The attached plots suggest (A), (B), and (C) is happening for >>>> Cahn-Hilliard problem (from firedrake-bench repo) on a 2D 8Kx8K >>>> unit-square mesh. The implementation is here [1]. Versions are >>>> Firedrake, PyOp2: 20200204.0; PETSc 3.13.1; ParMETIS 4.0.3. >>>> >>>> Two questions, one on (A) and the other on (B)+(C): >>>> >>>> 1. Is (A) result expected? Given (A), any effort to improve the quality >>>> of the compiled assembly kernels (or anything else other than mesh >>>> distribution) appears futile since it takes 1% of end-to-end execution >>>> time, or am I missing something? >>>> >>>> 1a. Is mesh distribution fundamentally necessary for any FEM framework, >>>> or is it only needed by Firedrake? If latter, then how do other >>>> frameworks partition the mesh and execute in parallel with MPI but avoid >>>> the non-scalable mesh destribution step? >>>> >>>> 2. Results (B) and (C) suggest that the mesh distribution step does >>>> not scale. Is it a fundamental property of the mesh distribution problem >>>> that it has a central bottleneck in the master process, or is it >>>> a limitation of the current implementation in PETSc-DMPlex? >>>> >>>> 2a. Our (B) result seems to agree with Figure 4(left) of [2]. Fig 6 of >>>> [2] >>>> suggests a way to reduce the time spent on sequential bottleneck by >>>> "parallel mesh refinment" that creates high-resolution meshes from an >>>> initial coarse mesh. Is this approach implemented in DMPLex? If so, any >>>> pointers on how to try it out with Firedrake? If not, any other >>>> directions for reducing this bottleneck? >>>> >>>> 2b. Fig 6 in [3] shows plots for Assembly and Solve steps that scale >>>> well up >>>> to 96 cores -- is mesh distribution included in those times? Is anyone >>>> reading this aware of any other publications with evaluations of >>>> Firedrake that measure mesh distribution (or explain how to avoid or >>>> exclude it)? >>>> >>>> Thank you for your time and any info or tips. >>>> >>>> >>>> [1] >>>> https://github.com/ISI-apex/firedrake-bench/blob/master/cahn_hilliard/firedrake_cahn_hilliard_problem.py >>>> >>>> [2] Unstructured Overlapping Mesh Distribution in Parallel, Matthew G. >>>> Knepley, Michael Lange, Gerard J. Gorman, 2015. >>>> https://arxiv.org/pdf/1506.06194.pdf >>>> >>>> [3] Efficient mesh management in Firedrake using PETSc-DMPlex, Michael >>>> Lange, Lawrence Mitchell, Matthew G. Knepley and Gerard J. Gorman, SISC, >>>> 38(5), S143-S155, 2016. http://arxiv.org/abs/1506.07749 >>>> >>> > -------------- next part -------------- An HTML attachment was scrubbed... 
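(Returning to Barry's suggestion above that the suspected stdout-merging slowdown should be reproducible with a simple MPI program that prints a moderate amount of data per rank: a minimal sketch follows. It is not from this thread, and the 1000 lines per rank is an arbitrary stand-in for "a moderate amount".)

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
      int rank, size, i, nlines = 1000;  /* arbitrary per-rank output volume */

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);
      for (i = 0; i < nlines; i++) {
        printf("rank %d of %d: line %d of moderately long output\n", rank, size, i);
      }
      fflush(stdout);
      MPI_Finalize();
      return 0;
    }

If the wall-clock time of this program grows sharply with the number of ranks on the same machine, the bottleneck is in the system's stdout forwarding and merging layer, not in PETSc.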
URL: From mfadams at lbl.gov Sun Mar 7 07:19:57 2021 From: mfadams at lbl.gov (Mark Adams) Date: Sun, 7 Mar 2021 08:19:57 -0500 Subject: [petsc-users] DMPlex in Firedrake: scaling of mesh distribution In-Reply-To: References: <7dde1a8f447e4880b96f3dd7c4b315ae@MBX-LS5.itorg.ad.buffalo.edu> Message-ID: Is phase 1 the old method and 2 the new? Is this 128^3 mesh per process? On Sun, Mar 7, 2021 at 7:27 AM Stefano Zampini wrote: > > > [2] On the robustness and performance of entropy stable discontinuous >> collocation methods for the compressible Navier-Stokes equations, ROjas . >> et.al. >> https://arxiv.org/abs/1911.10966 >> > > This is not the proper reference, here is the correct one > https://www.sciencedirect.com/science/article/pii/S0021999120306185?dgcid=rss_sd_all > However, there the algorithm is only outlined, and performances related to > the mesh distribution are not really reported. > We observed a large gain for large core counts and one to all > distributions (from minutes to seconds) by splitting the several > communication rounds needed by DMPlex into stages: from rank 0 to 1 rank > per node, and then decomposing independently within the node. > Attached the total time for one-to-all DMPlexDistrbute for a 128^3 mesh > > >> >> >>> ? >>> >>> The attached plots suggest (A), (B), and (C) is happening for >>> Cahn-Hilliard problem (from firedrake-bench repo) on a 2D 8Kx8K >>> unit-square mesh. The implementation is here [1]. Versions are >>> Firedrake, PyOp2: 20200204.0; PETSc 3.13.1; ParMETIS 4.0.3. >>> >>> Two questions, one on (A) and the other on (B)+(C): >>> >>> 1. Is (A) result expected? Given (A), any effort to improve the quality >>> of the compiled assembly kernels (or anything else other than mesh >>> distribution) appears futile since it takes 1% of end-to-end execution >>> time, or am I missing something? >>> >>> 1a. Is mesh distribution fundamentally necessary for any FEM framework, >>> or is it only needed by Firedrake? If latter, then how do other >>> frameworks partition the mesh and execute in parallel with MPI but avoid >>> the non-scalable mesh destribution step? >>> >>> 2. Results (B) and (C) suggest that the mesh distribution step does >>> not scale. Is it a fundamental property of the mesh distribution problem >>> that it has a central bottleneck in the master process, or is it >>> a limitation of the current implementation in PETSc-DMPlex? >>> >>> 2a. Our (B) result seems to agree with Figure 4(left) of [2]. Fig 6 of >>> [2] >>> suggests a way to reduce the time spent on sequential bottleneck by >>> "parallel mesh refinment" that creates high-resolution meshes from an >>> initial coarse mesh. Is this approach implemented in DMPLex? If so, any >>> pointers on how to try it out with Firedrake? If not, any other >>> directions for reducing this bottleneck? >>> >>> 2b. Fig 6 in [3] shows plots for Assembly and Solve steps that scale >>> well up >>> to 96 cores -- is mesh distribution included in those times? Is anyone >>> reading this aware of any other publications with evaluations of >>> Firedrake that measure mesh distribution (or explain how to avoid or >>> exclude it)? >>> >>> Thank you for your time and any info or tips. >>> >>> >>> [1] >>> https://github.com/ISI-apex/firedrake-bench/blob/master/cahn_hilliard/firedrake_cahn_hilliard_problem.py >>> >>> [2] Unstructured Overlapping Mesh Distribution in Parallel, Matthew G. >>> Knepley, Michael Lange, Gerard J. Gorman, 2015. 
>>> https://arxiv.org/pdf/1506.06194.pdf >>> >>> [3] Efficient mesh management in Firedrake using PETSc-DMPlex, Michael >>> Lange, Lawrence Mitchell, Matthew G. Knepley and Gerard J. Gorman, SISC, >>> 38(5), S143-S155, 2016. http://arxiv.org/abs/1506.07749 >>> >> > > -- > Stefano > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Sun Mar 7 07:23:05 2021 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Sun, 7 Mar 2021 16:23:05 +0300 Subject: [petsc-users] DMPlex in Firedrake: scaling of mesh distribution In-Reply-To: References: <7dde1a8f447e4880b96f3dd7c4b315ae@MBX-LS5.itorg.ad.buffalo.edu> Message-ID: 128^3 is the entire mesh. The blue line (1 phase) is with dmplexdistribute, the red line, with a two-stage approach. Il Dom 7 Mar 2021, 16:20 Mark Adams ha scritto: > Is phase 1 the old method and 2 the new? > Is this 128^3 mesh per process? > > On Sun, Mar 7, 2021 at 7:27 AM Stefano Zampini > wrote: > >> >> >> [2] On the robustness and performance of entropy stable discontinuous >>> collocation methods for the compressible Navier-Stokes equations, ROjas . >>> et.al. >>> https://arxiv.org/abs/1911.10966 >>> >> >> This is not the proper reference, here is the correct one >> https://www.sciencedirect.com/science/article/pii/S0021999120306185?dgcid=rss_sd_all >> However, there the algorithm is only outlined, and performances related >> to the mesh distribution are not really reported. >> We observed a large gain for large core counts and one to all >> distributions (from minutes to seconds) by splitting the several >> communication rounds needed by DMPlex into stages: from rank 0 to 1 rank >> per node, and then decomposing independently within the node. >> Attached the total time for one-to-all DMPlexDistrbute for a 128^3 mesh >> >> >>> >>> >>>> ? >>>> >>>> The attached plots suggest (A), (B), and (C) is happening for >>>> Cahn-Hilliard problem (from firedrake-bench repo) on a 2D 8Kx8K >>>> unit-square mesh. The implementation is here [1]. Versions are >>>> Firedrake, PyOp2: 20200204.0; PETSc 3.13.1; ParMETIS 4.0.3. >>>> >>>> Two questions, one on (A) and the other on (B)+(C): >>>> >>>> 1. Is (A) result expected? Given (A), any effort to improve the quality >>>> of the compiled assembly kernels (or anything else other than mesh >>>> distribution) appears futile since it takes 1% of end-to-end execution >>>> time, or am I missing something? >>>> >>>> 1a. Is mesh distribution fundamentally necessary for any FEM framework, >>>> or is it only needed by Firedrake? If latter, then how do other >>>> frameworks partition the mesh and execute in parallel with MPI but avoid >>>> the non-scalable mesh destribution step? >>>> >>>> 2. Results (B) and (C) suggest that the mesh distribution step does >>>> not scale. Is it a fundamental property of the mesh distribution problem >>>> that it has a central bottleneck in the master process, or is it >>>> a limitation of the current implementation in PETSc-DMPlex? >>>> >>>> 2a. Our (B) result seems to agree with Figure 4(left) of [2]. Fig 6 of >>>> [2] >>>> suggests a way to reduce the time spent on sequential bottleneck by >>>> "parallel mesh refinment" that creates high-resolution meshes from an >>>> initial coarse mesh. Is this approach implemented in DMPLex? If so, any >>>> pointers on how to try it out with Firedrake? If not, any other >>>> directions for reducing this bottleneck? >>>> >>>> 2b. 
Fig 6 in [3] shows plots for Assembly and Solve steps that scale >>>> well up >>>> to 96 cores -- is mesh distribution included in those times? Is anyone >>>> reading this aware of any other publications with evaluations of >>>> Firedrake that measure mesh distribution (or explain how to avoid or >>>> exclude it)? >>>> >>>> Thank you for your time and any info or tips. >>>> >>>> >>>> [1] >>>> https://github.com/ISI-apex/firedrake-bench/blob/master/cahn_hilliard/firedrake_cahn_hilliard_problem.py >>>> >>>> [2] Unstructured Overlapping Mesh Distribution in Parallel, Matthew G. >>>> Knepley, Michael Lange, Gerard J. Gorman, 2015. >>>> https://arxiv.org/pdf/1506.06194.pdf >>>> >>>> [3] Efficient mesh management in Firedrake using PETSc-DMPlex, Michael >>>> Lange, Lawrence Mitchell, Matthew G. Knepley and Gerard J. Gorman, SISC, >>>> 38(5), S143-S155, 2016. http://arxiv.org/abs/1506.07749 >>>> >>> >> >> -- >> Stefano >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at buffalo.edu Sun Mar 7 07:23:27 2021 From: knepley at buffalo.edu (Matthew Knepley) Date: Sun, 7 Mar 2021 08:23:27 -0500 Subject: [petsc-users] DMPlex in Firedrake: scaling of mesh distribution In-Reply-To: <17e285bf9ea94d959acf3a870a8fe6c2@MBX-LS5.itorg.ad.buffalo.edu> References: <7dde1a8f447e4880b96f3dd7c4b315ae@MBX-LS5.itorg.ad.buffalo.edu> <17e285bf9ea94d959acf3a870a8fe6c2@MBX-LS5.itorg.ad.buffalo.edu> Message-ID: On Sun, Mar 7, 2021 at 8:20 AM Mark Adams wrote: > Is phase 1 the old method and 2 the new? > Is this 128^3 mesh per process? > Both phases are the same method. My interpretation is that there is severe message congestion. Sending fewer bigger messages in a phase 1, and then rebalancing locally in a phase 2, both using the same communication method, is much faster than just directly communicating the data in a single phase. Matt > On Sun, Mar 7, 2021 at 7:27 AM Stefano Zampini > wrote: > >> >> >> [2] On the robustness and performance of entropy stable discontinuous >>> collocation methods for the compressible Navier-Stokes equations, ROjas . >>> et.al. >>> https://arxiv.org/abs/1911.10966 >>> >> >> This is not the proper reference, here is the correct one >> https://www.sciencedirect.com/science/article/pii/S0021999120306185?dgcid=rss_sd_all >> However, there the algorithm is only outlined, and performances related >> to the mesh distribution are not really reported. >> We observed a large gain for large core counts and one to all >> distributions (from minutes to seconds) by splitting the several >> communication rounds needed by DMPlex into stages: from rank 0 to 1 rank >> per node, and then decomposing independently within the node. >> Attached the total time for one-to-all DMPlexDistrbute for a 128^3 mesh >> >> >>> >>> >>>> ? >>>> >>>> The attached plots suggest (A), (B), and (C) is happening for >>>> Cahn-Hilliard problem (from firedrake-bench repo) on a 2D 8Kx8K >>>> unit-square mesh. The implementation is here [1]. Versions are >>>> Firedrake, PyOp2: 20200204.0; PETSc 3.13.1; ParMETIS 4.0.3. >>>> >>>> Two questions, one on (A) and the other on (B)+(C): >>>> >>>> 1. Is (A) result expected? Given (A), any effort to improve the quality >>>> of the compiled assembly kernels (or anything else other than mesh >>>> distribution) appears futile since it takes 1% of end-to-end execution >>>> time, or am I missing something? >>>> >>>> 1a. Is mesh distribution fundamentally necessary for any FEM framework, >>>> or is it only needed by Firedrake? 
If latter, then how do other >>>> frameworks partition the mesh and execute in parallel with MPI but avoid >>>> the non-scalable mesh destribution step? >>>> >>>> 2. Results (B) and (C) suggest that the mesh distribution step does >>>> not scale. Is it a fundamental property of the mesh distribution problem >>>> that it has a central bottleneck in the master process, or is it >>>> a limitation of the current implementation in PETSc-DMPlex? >>>> >>>> 2a. Our (B) result seems to agree with Figure 4(left) of [2]. Fig 6 of >>>> [2] >>>> suggests a way to reduce the time spent on sequential bottleneck by >>>> "parallel mesh refinment" that creates high-resolution meshes from an >>>> initial coarse mesh. Is this approach implemented in DMPLex? If so, any >>>> pointers on how to try it out with Firedrake? If not, any other >>>> directions for reducing this bottleneck? >>>> >>>> 2b. Fig 6 in [3] shows plots for Assembly and Solve steps that scale >>>> well up >>>> to 96 cores -- is mesh distribution included in those times? Is anyone >>>> reading this aware of any other publications with evaluations of >>>> Firedrake that measure mesh distribution (or explain how to avoid or >>>> exclude it)? >>>> >>>> Thank you for your time and any info or tips. >>>> >>>> >>>> [1] >>>> https://github.com/ISI-apex/firedrake-bench/blob/master/cahn_hilliard/firedrake_cahn_hilliard_problem.py >>>> >>>> [2] Unstructured Overlapping Mesh Distribution in Parallel, Matthew G. >>>> Knepley, Michael Lange, Gerard J. Gorman, 2015. >>>> https://arxiv.org/pdf/1506.06194.pdf >>>> >>>> [3] Efficient mesh management in Firedrake using PETSc-DMPlex, Michael >>>> Lange, Lawrence Mitchell, Matthew G. Knepley and Gerard J. Gorman, SISC, >>>> 38(5), S143-S155, 2016. http://arxiv.org/abs/1506.07749 >>>> >>> >> >> -- >> Stefano >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Sun Mar 7 07:27:58 2021 From: mfadams at lbl.gov (Mark Adams) Date: Sun, 7 Mar 2021 08:27:58 -0500 Subject: [petsc-users] DMPlex in Firedrake: scaling of mesh distribution In-Reply-To: References: <7dde1a8f447e4880b96f3dd7c4b315ae@MBX-LS5.itorg.ad.buffalo.edu> Message-ID: FWIW, Here is the output from ex13 on 32K processes (8K Fugaku nodes/sockets, 4 MPI/node, which seems recommended) with 128^3 vertex mesh (64^3 Q2 3D Laplacian). Almost an hour. Attached is solver scaling. 0 SNES Function norm 3.658334849208e+00 Linear solve converged due to CONVERGED_RTOL iterations 22 1 SNES Function norm 1.609000373074e-12 Nonlinear solve converged due to CONVERGED_ITS iterations 1 Linear solve converged due to CONVERGED_RTOL iterations 22 Linear solve converged due to CONVERGED_RTOL iterations 22 Linear solve converged due to CONVERGED_RTOL iterations 22 Linear solve converged due to CONVERGED_RTOL iterations 22 Linear solve converged due to CONVERGED_RTOL iterations 22 Linear solve converged due to CONVERGED_RTOL iterations 22 Linear solve converged due to CONVERGED_RTOL iterations 22 Linear solve converged due to CONVERGED_RTOL iterations 22 Linear solve converged due to CONVERGED_RTOL iterations 22 Linear solve converged due to CONVERGED_RTOL iterations 22 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ../ex13 on a named i07-4008c with 32768 processors, by a04199 Fri Feb 12 23:27:13 2021 Using Petsc Development GIT revision: v3.14.4-579-g4cb72fa GIT Date: 2021-02-05 15:19:40 +0000 Max Max/Min Avg Total Time (sec): 3.373e+03 1.000 3.373e+03 Objects: 1.055e+05 14.797 7.144e+03 Flop: 5.376e+10 1.176 4.885e+10 1.601e+15 Flop/sec: 1.594e+07 1.176 1.448e+07 4.745e+11 MPI Messages: 6.048e+05 30.010 8.833e+04 2.894e+09 MPI Message Lengths: 1.127e+09 4.132 6.660e+03 1.928e+13 MPI Reductions: 1.824e+03 1.000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flop and VecAXPY() for complex vectors of length N --> 8N flop Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total Count %Total Avg %Total Count %Total 0: Main Stage: 3.2903e+03 97.5% 2.4753e+14 15.5% 3.538e+08 12.2% 1.779e+04 32.7% 9.870e+02 54.1% 1: PCSetUp: 4.3062e+01 1.3% 1.8160e+13 1.1% 1.902e+07 0.7% 3.714e+04 3.7% 1.590e+02 8.7% 2: KSP Solve only: 3.9685e+01 1.2% 1.3349e+15 83.4% 2.522e+09 87.1% 4.868e+03 63.7% 6.700e+02 36.7% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent AvgLen: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage ---- Total Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage PetscBarrier 5 1.0 1.9907e+00 2.2 0.00e+00 0.0 3.8e+06 7.7e+01 2.0e+01 0 0 0 0 1 0 0 1 0 2 0 BuildTwoSided 62 1.0 7.3272e+0214.1 0.00e+00 0.0 6.7e+06 8.0e+00 0.0e+00 5 0 0 0 0 5 0 2 0 0 0 BuildTwoSidedF 59 1.0 3.1132e+01 7.4 0.00e+00 0.0 4.8e+06 2.5e+05 0.0e+00 0 0 0 6 0 0 0 1 19 0 0 SNESSolve 1 1.0 1.7468e+02 1.0 7.83e+09 1.3 3.4e+08 1.3e+04 8.8e+02 5 13 12 23 48 5 85 96 70 89 1205779 SNESSetUp 1 1.0 2.4195e+01 1.0 0.00e+00 0.0 3.7e+06 3.7e+05 1.3e+01 1 0 0 7 1 1 0 1 22 1 0 SNESFunctionEval 3 1.0 1.1359e+01 1.2 1.17e+09 1.0 1.6e+06 1.4e+04 2.0e+00 0 2 0 0 0 0 15 0 0 0 3344744 SNESJacobianEval 2 1.0 1.6829e+02 1.0 1.52e+09 1.0 1.1e+06 8.3e+05 0.0e+00 5 3 0 5 0 5 20 0 14 0 293588 DMCreateMat 1 1.0 2.4107e+01 1.0 0.00e+00 0.0 3.7e+06 3.7e+05 1.3e+01 1 0 0 7 1 1 0 1 22 1 0 Mesh Partition 1 1.0 5.0133e+02 1.0 0.00e+00 0.0 1.3e+05 2.7e+02 6.0e+00 15 0 0 0 0 15 0 0 0 1 0 Mesh Migration 1 1.0 1.5494e+03 1.0 0.00e+00 0.0 7.3e+05 1.9e+02 2.4e+01 45 0 0 0 1 46 0 0 0 2 0 DMPlexPartSelf 1 1.0 1.1498e+002367.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexPartLblInv 1 1.0 3.6698e+00 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexPartLblSF 1 1.0 2.8522e-01 1.7 0.00e+00 0.0 4.9e+04 1.5e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexPartStrtSF 1 1.0 4.9474e+023520.8 0.00e+00 0.0 3.3e+04 4.3e+02 0.0e+00 14 0 0 0 0 15 0 0 0 0 0 DMPlexPointSF 1 1.0 9.8750e+021264.8 0.00e+00 0.0 6.6e+04 5.4e+02 0.0e+00 28 0 0 0 0 29 0 0 0 0 0 DMPlexInterp 84 1.0 4.3219e-0158.6 0.00e+00 0.0 0.0e+00 0.0e+00 5.0e+00 0 0 0 0 0 0 0 0 0 1 0 DMPlexDistribute 1 1.0 3.0000e+03 1.5 0.00e+00 0.0 9.3e+05 2.3e+02 3.0e+01 88 0 0 0 2 90 0 0 0 3 0 DMPlexDistCones 1 1.0 1.0688e+03 2.6 0.00e+00 0.0 1.8e+05 3.1e+02 1.0e+00 31 0 0 0 0 31 0 0 0 0 0 DMPlexDistLabels 1 1.0 2.9172e+02 1.0 0.00e+00 0.0 3.1e+05 1.9e+02 2.1e+01 9 0 0 0 1 9 0 0 0 2 0 DMPlexDistField 1 1.0 1.8688e+02 1.2 0.00e+00 0.0 2.1e+05 9.3e+01 1.0e+00 5 0 0 0 0 5 0 0 0 0 0 DMPlexStratify 118 1.0 6.2852e+023280.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.6e+01 1 0 0 0 1 1 0 0 0 2 0 DMPlexSymmetrize 118 1.0 6.7634e-02 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexPrealloc 1 1.0 2.3741e+01 1.0 0.00e+00 0.0 3.7e+06 3.7e+05 1.1e+01 1 0 0 7 1 1 0 1 22 1 0 DMPlexResidualFE 3 1.0 1.0634e+01 1.2 1.16e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 15 0 0 0 3569848 DMPlexJacobianFE 2 1.0 1.6809e+02 1.0 1.51e+09 1.0 6.5e+05 1.4e+06 0.0e+00 5 3 0 5 0 5 20 0 14 0 293801 SFSetGraph 87 1.0 2.7673e-03 3.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetUp 62 1.0 7.3283e+0213.6 0.00e+00 0.0 2.0e+07 2.7e+04 0.0e+00 5 0 1 3 0 5 0 6 9 0 0 SFBcastOpBegin 107 1.0 1.5770e+00452.5 0.00e+00 0.0 2.1e+07 1.8e+04 0.0e+00 0 0 1 2 0 0 0 6 6 0 0 SFBcastOpEnd 107 1.0 2.9430e+03 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 80 0 0 0 0 82 0 0 0 0 0 SFReduceBegin 12 1.0 2.4825e-01172.8 
0.00e+00 0.0 2.4e+06 2.0e+05 0.0e+00 0 0 0 2 0 0 0 1 8 0 0 SFReduceEnd 12 1.0 3.8286e+014865.8 3.74e+04 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 31 SFFetchOpBegin 2 1.0 2.4497e-0390.2 0.00e+00 0.0 4.3e+05 3.5e+05 0.0e+00 0 0 0 1 0 0 0 0 2 0 0 SFFetchOpEnd 2 1.0 6.1349e-0210.9 0.00e+00 0.0 4.3e+05 3.5e+05 0.0e+00 0 0 0 1 0 0 0 0 2 0 0 SFCreateEmbed 3 1.0 3.6800e+013261.5 0.00e+00 0.0 4.7e+05 1.7e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFDistSection 9 1.0 4.4325e+02 1.5 0.00e+00 0.0 2.8e+06 1.1e+04 9.0e+00 11 0 0 0 0 11 0 1 1 1 0 SFSectionSF 11 1.0 2.3898e+02 4.7 0.00e+00 0.0 9.2e+05 1.7e+05 0.0e+00 5 0 0 1 0 5 0 0 2 0 0 SFRemoteOff 2 1.0 3.2868e-0143.1 0.00e+00 0.0 8.7e+05 8.2e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFPack 1023 1.0 2.5215e-0176.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFUnpack 1025 1.0 5.1600e-0216.8 5.62e+0521.3 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 54693 MatMult 1549525.4 3.4810e+00 1.3 4.35e+09 1.1 2.2e+08 6.1e+03 0.0e+00 0 8 8 7 0 0 54 62 21 0 38319208 MatMultAdd 132 1.0 6.9168e-01 3.0 7.97e+07 1.2 2.8e+07 4.6e+02 0.0e+00 0 0 1 0 0 0 1 8 0 0 3478717 MatMultTranspose 132 1.0 5.9967e-01 1.6 8.00e+07 1.2 3.0e+07 4.5e+02 0.0e+00 0 0 1 0 0 0 1 9 0 0 4015214 MatSolve 22 0.0 6.8431e-04 0.0 7.41e+05 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1082 MatLUFactorSym 1 1.0 5.9569e-0433.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 1 1.0 1.6236e-03773.2 1.46e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 897 MatConvert 6 1.0 1.4290e-01 1.2 0.00e+00 0.0 3.0e+06 3.7e+03 0.0e+00 0 0 0 0 0 0 0 1 0 0 0 MatScale 18 1.0 3.7962e-01 1.3 4.11e+07 1.2 2.0e+06 5.5e+03 0.0e+00 0 0 0 0 0 0 0 1 0 0 3253392 MatResidual 132 1.0 6.8256e-01 1.4 8.27e+08 1.2 4.4e+07 5.5e+03 0.0e+00 0 2 2 1 0 0 10 13 4 0 36282014 MatAssemblyBegin 244 1.0 3.1181e+01 6.6 0.00e+00 0.0 4.8e+06 2.5e+05 0.0e+00 0 0 0 6 0 0 0 1 19 0 0 MatAssemblyEnd 244 1.0 6.3232e+00 1.9 3.17e+06 6.9 0.0e+00 0.0e+00 1.4e+02 0 0 0 0 8 0 0 0 0 15 7655 MatGetRowIJ 1 0.0 2.5780e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCreateSubMat 10 1.0 1.5162e+00 1.0 0.00e+00 0.0 1.6e+05 3.4e+05 1.3e+02 0 0 0 0 7 0 0 0 1 13 0 MatGetOrdering 1 0.0 1.0899e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCoarsen 6 1.0 3.5837e-01 1.3 0.00e+00 0.0 1.6e+07 1.2e+04 3.9e+01 0 0 1 1 2 0 0 5 3 4 0 MatZeroEntries 8 1.0 5.3730e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 6 1.0 2.6245e-01 1.1 2.66e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 33035 MatTranspose 12 1.0 3.0731e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatMatMultSym 18 1.0 2.1398e+00 1.4 0.00e+00 0.0 6.1e+06 5.5e+03 4.8e+01 0 0 0 0 3 0 0 2 1 5 0 MatMatMultNum 6 1.0 1.1243e+00 1.0 3.76e+07 1.2 2.0e+06 5.5e+03 0.0e+00 0 0 0 0 0 0 0 1 0 0 1001203 MatPtAPSymbolic 6 1.0 1.7280e+01 1.0 0.00e+00 0.0 1.2e+07 3.2e+04 4.2e+01 1 0 0 2 2 1 0 3 6 4 0 MatPtAPNumeric 6 1.0 1.8047e+01 1.0 1.49e+09 5.1 2.8e+06 1.1e+05 2.4e+01 1 1 0 2 1 1 5 1 5 2 663675 MatTrnMatMultSym 1 1.0 3.0221e+01 1.0 0.00e+00 0.0 2.4e+06 5.8e+05 1.1e+01 1 0 0 7 1 1 0 1 22 1 0 MatGetLocalMat 19 1.0 1.3904e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 18 1.0 1.9926e-01 5.0 0.00e+00 0.0 1.4e+07 2.3e+04 0.0e+00 0 0 0 2 0 0 0 4 5 0 0 MatGetSymTrans 2 1.0 1.8996e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecTDot 176 1.0 7.0632e-01 4.5 3.48e+07 1.0 0.0e+00 0.0e+00 1.8e+02 0 0 0 0 10 0 0 0 0 18 1608728 VecNorm 60 1.0 1.4074e+0012.2 
1.58e+07 1.0 0.0e+00 0.0e+00 6.0e+01 0 0 0 0 3 0 0 0 0 6 366467 VecCopy 422 1.0 5.1259e-02 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 653 1.0 2.3974e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 165 1.0 6.5622e-03 1.3 3.42e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 170485467 VecAYPX 861 1.0 7.8529e-02 1.2 6.21e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 25785252 VecAXPBYCZ 264 1.0 4.1343e-02 1.5 5.85e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 46135592 VecAssemblyBegin 21 1.0 2.3463e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAssemblyEnd 21 1.0 1.4457e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 600 1.0 5.7510e-02 1.2 2.66e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 15075754 VecScatterBegin 902 1.0 5.1188e-01 1.2 0.00e+00 0.0 2.9e+08 5.3e+03 0.0e+00 0 0 10 8 0 0 0 82 25 0 0 VecScatterEnd 902 1.0 1.2143e+00 3.2 5.50e+0537.9 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1347 VecSetRandom 6 1.0 2.6354e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DualSpaceSetUp 7 1.0 5.3467e-0112.0 4.26e+03 1.0 0.0e+00 0.0e+00 1.3e+01 0 0 0 0 1 0 0 0 0 1 261 FESetUp 7 1.0 1.7541e-01128.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSetUp 15 1.0 2.7470e-01 1.1 2.04e+08 1.2 1.0e+07 5.5e+03 1.3e+02 0 0 0 0 7 0 2 3 1 13 22477233 KSPSolve 1 1.0 4.3257e+00 1.0 4.33e+09 1.1 2.5e+08 4.8e+03 6.6e+01 0 8 9 6 4 0 54 72 20 7 30855976 PCGAMGGraph_AGG 6 1.0 5.0969e+00 1.0 3.76e+07 1.2 5.1e+06 4.4e+03 4.8e+01 0 0 0 0 3 0 0 1 0 5 220852 PCGAMGCoarse_AGG 6 1.0 3.1121e+01 1.0 0.00e+00 0.0 2.5e+07 6.9e+04 5.5e+01 1 0 1 9 3 1 0 7 27 6 0 PCGAMGProl_AGG 6 1.0 5.8196e-01 1.0 0.00e+00 0.0 6.6e+06 9.3e+03 7.2e+01 0 0 0 0 4 0 0 2 1 7 0 PCGAMGPOpt_AGG 6 1.0 3.2414e+00 1.0 2.42e+08 1.2 2.1e+07 5.3e+03 1.6e+02 0 0 1 1 9 0 3 6 2 17 2256493 GAMG: createProl 6 1.0 4.0042e+01 1.0 2.80e+08 1.2 5.8e+07 3.3e+04 3.4e+02 1 1 2 10 19 1 3 16 31 34 210778 Graph 12 1.0 5.0926e+00 1.0 3.76e+07 1.2 5.1e+06 4.4e+03 4.8e+01 0 0 0 0 3 0 0 1 0 5 221038 MIS/Agg 6 1.0 3.5850e-01 1.3 0.00e+00 0.0 1.6e+07 1.2e+04 3.9e+01 0 0 1 1 2 0 0 5 3 4 0 SA: col data 6 1.0 3.0509e-01 1.0 0.00e+00 0.0 5.4e+06 9.2e+03 2.4e+01 0 0 0 0 1 0 0 2 1 2 0 SA: frmProl0 6 1.0 2.3467e-01 1.1 0.00e+00 0.0 1.3e+06 9.5e+03 2.4e+01 0 0 0 0 1 0 0 0 0 2 0 SA: smooth 6 1.0 2.7855e+00 1.0 4.14e+07 1.2 8.1e+06 5.5e+03 6.3e+01 0 0 0 0 3 0 1 2 1 6 446491 GAMG: partLevel 6 1.0 3.7266e+01 1.0 1.49e+09 5.1 1.5e+07 4.9e+04 3.2e+02 1 1 1 4 17 1 5 4 12 32 321395 repartition 5 1.0 2.0343e+00 1.1 0.00e+00 0.0 4.0e+05 1.4e+05 2.5e+02 0 0 0 0 14 0 0 0 1 25 0 Invert-Sort 5 1.0 1.5021e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+01 0 0 0 0 2 0 0 0 0 3 0 Move A 5 1.0 1.1548e+00 1.0 0.00e+00 0.0 1.6e+05 3.4e+05 7.0e+01 0 0 0 0 4 0 0 0 1 7 0 Move P 5 1.0 4.2799e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 7.5e+01 0 0 0 0 4 0 0 0 0 8 0 PCGAMG Squ l00 1 1.0 3.0221e+01 1.0 0.00e+00 0.0 2.4e+06 5.8e+05 1.1e+01 1 0 0 7 1 1 0 1 22 1 0 PCGAMG Gal l00 1 1.0 8.7411e+00 1.0 2.93e+08 1.1 5.4e+06 4.5e+04 1.2e+01 0 1 0 1 1 0 4 2 4 1 1092355 PCGAMG Opt l00 1 1.0 1.9734e+00 1.0 3.36e+07 1.1 3.2e+06 1.2e+04 9.0e+00 0 0 0 0 0 0 0 1 1 1 555327 PCGAMG Gal l01 1 1.0 1.0153e+00 1.0 3.50e+07 1.4 5.9e+06 3.9e+04 1.2e+01 0 0 0 1 1 0 0 2 4 1 1079887 PCGAMG Opt l01 1 1.0 7.4812e-02 1.0 5.35e+05 1.2 3.2e+06 1.1e+03 9.0e+00 0 0 0 0 0 0 0 1 0 1 232542 PCGAMG Gal l02 1 1.0 1.8063e+00 1.0 7.43e+07 0.0 3.0e+06 5.9e+04 1.2e+01 0 0 0 1 1 0 0 1 3 1 593392 PCGAMG Opt 
l02 1 1.0 1.1580e-01 1.1 6.93e+05 0.0 1.6e+06 1.3e+03 9.0e+00 0 0 0 0 0 0 0 0 0 1 93213 PCGAMG Gal l03 1 1.0 6.1075e+00 1.0 2.72e+08 0.0 2.6e+05 9.2e+04 1.1e+01 0 0 0 0 1 0 0 0 0 1 36155 PCGAMG Opt l03 1 1.0 8.0836e-02 1.0 1.55e+06 0.0 1.4e+05 1.4e+03 8.0e+00 0 0 0 0 0 0 0 0 0 1 18229 PCGAMG Gal l04 1 1.0 1.6203e+01 1.0 9.44e+08 0.0 1.4e+04 3.0e+05 1.1e+01 0 0 0 0 1 0 0 0 0 1 2366 PCGAMG Opt l04 1 1.0 1.2663e-01 1.0 2.01e+06 0.0 6.9e+03 2.2e+03 8.0e+00 0 0 0 0 0 0 0 0 0 1 817 PCGAMG Gal l05 1 1.0 1.4800e+00 1.0 3.16e+08 0.0 9.0e+01 1.6e+05 1.1e+01 0 0 0 0 1 0 0 0 0 1 796 PCGAMG Opt l05 1 1.0 8.1763e-02 1.1 2.50e+06 0.0 4.8e+01 4.6e+03 8.0e+00 0 0 0 0 0 0 0 0 0 1 114 PCSetUp 2 1.0 7.7969e+01 1.0 1.97e+09 2.8 8.3e+07 3.3e+04 8.1e+02 2 2 3 14 44 2 11 23 43 82 341051 PCSetUpOnBlocks 22 1.0 2.4609e-0317.2 1.46e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 592 PCApply 22 1.0 3.6455e+00 1.1 3.57e+09 1.2 2.4e+08 4.3e+03 0.0e+00 0 7 8 5 0 0 43 67 16 0 29434967 --- Event Stage 1: PCSetUp BuildTwoSided 4 1.0 1.5980e-01 2.7 0.00e+00 0.0 2.1e+05 8.0e+00 0.0e+00 0 0 0 0 0 0 0 1 0 0 0 BuildTwoSidedF 6 1.0 1.3169e+01 5.5 0.00e+00 0.0 1.9e+06 1.9e+05 0.0e+00 0 0 0 2 0 28 0 10 51 0 0 SFSetGraph 5 1.0 4.9640e-0519.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetUp 4 1.0 1.6038e-01 2.3 0.00e+00 0.0 6.4e+05 9.1e+02 0.0e+00 0 0 0 0 0 0 0 3 0 0 0 SFPack 30 1.0 3.3376e-04 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFUnpack 30 1.0 1.2101e-05 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatMult 30 1.0 1.5544e-01 1.5 1.87e+08 1.2 1.0e+07 5.5e+03 0.0e+00 0 0 0 0 0 0 31 53 8 0 35930640 MatAssemblyBegin 43 1.0 1.3201e+01 4.7 0.00e+00 0.0 1.9e+06 1.9e+05 0.0e+00 0 0 0 2 0 28 0 10 51 0 0 MatAssemblyEnd 43 1.0 1.1159e+01 1.0 2.77e+07705.7 0.0e+00 0.0e+00 2.0e+01 0 0 0 0 1 26 0 0 0 13 1036 MatZeroEntries 6 1.0 4.7315e-0410.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatTranspose 12 1.0 2.5142e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatMatMultSym 10 1.0 5.8783e-0117.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatPtAPSymbolic 5 1.0 1.4489e+01 1.0 0.00e+00 0.0 6.2e+06 3.6e+04 3.5e+01 0 0 0 1 2 34 0 32 31 22 0 MatPtAPNumeric 6 1.0 2.8457e+01 1.0 1.50e+09 5.1 2.7e+06 1.6e+05 2.0e+01 1 1 0 2 1 66 66 14 61 13 421190 MatGetLocalMat 6 1.0 9.8574e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 6 1.0 3.7669e-01 2.3 0.00e+00 0.0 5.1e+06 3.8e+04 0.0e+00 0 0 0 1 0 0 0 27 28 0 0 VecTDot 66 1.0 6.5271e-02 4.1 5.85e+06 1.0 0.0e+00 0.0e+00 6.6e+01 0 0 0 0 4 0 1 0 0 42 2922260 VecNorm 36 1.0 1.1226e-02 3.2 3.19e+06 1.0 0.0e+00 0.0e+00 3.6e+01 0 0 0 0 2 0 1 0 0 23 9268067 VecCopy 12 1.0 1.2805e-03 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 11 1.0 6.6620e-05 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 60 1.0 1.0763e-03 1.5 5.32e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 161104914 VecAYPX 24 1.0 2.0581e-03 1.3 2.13e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 33701038 VecPointwiseMult 36 1.0 3.5709e-03 1.3 1.60e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 14567861 VecScatterBegin 30 1.0 2.9079e-03 7.8 0.00e+00 0.0 1.0e+07 5.5e+03 0.0e+00 0 0 0 0 0 0 0 53 8 0 0 VecScatterEnd 30 1.0 3.7015e-0263.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSetUp 7 1.0 2.3165e-01 1.0 2.04e+08 1.2 1.0e+07 5.5e+03 1.0e+02 0 0 0 0 6 1 34 53 8 64 26654598 PCGAMG Gal l00 1 1.0 4.7415e+00 1.0 2.94e+08 1.1 1.8e+06 7.8e+04 
0.0e+00 0 1 0 1 0 11 53 9 20 0 2015623 PCGAMG Gal l01 1 1.0 1.2103e+00 1.0 3.50e+07 1.4 4.8e+06 6.2e+04 1.2e+01 0 0 0 2 1 3 6 25 41 8 905938 PCGAMG Gal l02 1 1.0 3.4334e+00 1.0 7.41e+07 0.0 2.2e+06 8.7e+04 1.2e+01 0 0 0 1 1 8 6 11 27 8 312184 PCGAMG Gal l03 1 1.0 9.6062e+00 1.0 2.71e+08 0.0 1.9e+05 1.3e+05 1.1e+01 0 0 0 0 1 22 1 1 4 7 22987 PCGAMG Gal l04 1 1.0 2.2482e+01 1.0 9.43e+08 0.0 8.7e+03 4.8e+05 1.1e+01 1 0 0 0 1 52 0 0 1 7 1705 PCGAMG Gal l05 1 1.0 1.5961e+00 1.1 3.16e+08 0.0 6.8e+01 2.2e+05 1.1e+01 0 0 0 0 1 4 0 0 0 7 738 PCSetUp 1 1.0 4.3191e+01 1.0 1.70e+09 3.6 1.9e+07 3.7e+04 1.6e+02 1 1 1 4 9 100100100100100 420463 --- Event Stage 2: KSP Solve only SFPack 8140 1.0 7.4247e-02 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFUnpack 8140 1.0 1.2905e-02 5.2 5.50e+0637.9 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1267207 MatMult 5500 1.0 2.9994e+01 1.2 3.98e+10 1.1 2.0e+09 6.1e+03 0.0e+00 1 76 68 62 0 70 92 78 98 0 40747181 MatMultAdd 1320 1.0 6.2192e+00 2.7 7.97e+08 1.2 2.8e+08 4.6e+02 0.0e+00 0 2 10 1 0 14 2 11 1 0 3868976 MatMultTranspose 1320 1.0 4.0304e+00 1.7 8.00e+08 1.2 2.8e+08 4.6e+02 0.0e+00 0 2 10 1 0 7 2 11 1 0 5974153 MatSolve 220 0.0 6.7366e-03 0.0 7.41e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1100 MatLUFactorSym 1 1.0 5.8691e-0435.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 1 1.0 1.5955e-03756.2 1.46e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 913 MatResidual 1320 1.0 6.4920e+00 1.3 8.27e+09 1.2 4.4e+08 5.5e+03 0.0e+00 0 15 15 13 0 14 19 18 20 0 38146350 MatGetRowIJ 1 0.0 2.7820e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 1 0.0 9.6940e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecTDot 440 1.0 4.6162e+00 6.9 2.31e+08 1.0 0.0e+00 0.0e+00 4.4e+02 0 0 0 0 24 5 1 0 0 66 1635124 VecNorm 230 1.0 3.9605e-02 1.6 1.21e+08 1.0 0.0e+00 0.0e+00 2.3e+02 0 0 0 0 13 0 0 0 0 34 99622387 VecCopy 3980 1.0 5.4166e-01 4.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 4640 1.0 1.4216e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 440 1.0 4.2829e-02 1.3 2.31e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 176236363 VecAYPX 8130 1.0 7.3998e-01 1.2 5.78e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 2 1 0 0 0 25489392 VecAXPBYCZ 2640 1.0 3.9974e-01 1.5 5.85e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 1 1 0 0 0 47716315 VecPointwiseMult 5280 1.0 5.9845e-01 1.5 2.34e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 1 0 0 0 12748927 VecScatterBegin 8140 1.0 4.9231e-01 5.9 0.00e+00 0.0 2.5e+09 4.9e+03 0.0e+00 0 0 87 64 0 1 0100100 0 0 VecScatterEnd 8140 1.0 1.0172e+01 3.6 5.50e+0637.9 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 13 0 0 0 0 1608 KSPSetUp 1 1.0 9.5996e-07 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 10 1.0 3.9685e+01 1.0 4.33e+10 1.1 2.5e+09 4.9e+03 6.7e+02 1 83 87 64 37 100100100100100 33637495 PCSetUp 1 1.0 2.4149e-0318.1 1.46e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 603 PCSetUpOnBlocks 220 1.0 2.6945e-03 8.9 1.46e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 540 PCApply 220 1.0 3.2921e+01 1.1 3.57e+10 1.2 2.3e+09 4.3e+03 0.0e+00 1 67 81 53 0 81 80 93 82 0 32595360 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 112 112 69888 0. 
SNES 1 1 1532 0. DMSNES 1 1 720 0. Distributed Mesh 449 449 30060888 0. DM Label 790 790 549840 0. Quadrature 579 579 379824 0. Index Set 100215 100210 361926232 0. IS L to G Mapping 8 13 4356552 0. Section 771 771 598296 0. Star Forest Graph 897 897 1053640 0. Discrete System 521 521 533512 0. GraphPartitioner 118 118 91568 0. Matrix 432 462 2441805304 0. Matrix Coarsen 6 6 4032 0. Vector 354 354 65492968 0. Linear Space 7 7 5208 0. Dual Space 111 111 113664 0. FE Space 7 7 5992 0. Field over DM 6 6 4560 0. Krylov Solver 21 21 37560 0. DMKSP interface 1 1 704 0. Preconditioner 21 21 21632 0. Viewer 2 1 896 0. PetscRandom 12 12 8520 0. --- Event Stage 1: PCSetUp Index Set 10 15 85367336 0. IS L to G Mapping 5 0 0 0. Star Forest Graph 5 5 6600 0. Matrix 50 20 73134024 0. Vector 28 28 6235096 0. --- Event Stage 2: KSP Solve only Index Set 5 5 8296 0. Matrix 1 1 273856 0. ======================================================================================================================== Average time to get PetscTime(): 6.40051e-08 Average time for MPI_Barrier(): 8.506e-06 Average time for zero size MPI_Send(): 6.6027e-06 #PETSc Option Table entries: -benchmark_it 10 -dm_distribute -dm_plex_box_dim 3 -dm_plex_box_faces 32,32,32 -dm_plex_box_lower 0,0,0 -dm_plex_box_simplex 0 -dm_plex_box_upper 1,1,1 -dm_refine 5 -ksp_converged_reason -ksp_max_it 150 -ksp_norm_type unpreconditioned -ksp_rtol 1.e-12 -ksp_type cg -log_view -matptap_via scalable -mg_levels_esteig_ksp_max_it 5 -mg_levels_esteig_ksp_type cg -mg_levels_ksp_max_it 2 -mg_levels_ksp_type chebyshev -mg_levels_pc_type jacobi -pc_gamg_agg_nsmooths 1 -pc_gamg_coarse_eq_limit 2000 -pc_gamg_coarse_grid_layout_type spread -pc_gamg_esteig_ksp_max_it 5 -pc_gamg_esteig_ksp_type cg -pc_gamg_process_eq_limit 500 -pc_gamg_repartition false -pc_gamg_reuse_interpolation true -pc_gamg_square_graph 1 -pc_gamg_threshold 0.01 -pc_gamg_threshold_scale .5 -pc_gamg_type agg -pc_type gamg -petscpartitioner_simple_node_grid 8,8,8 -petscpartitioner_simple_process_grid 4,4,4 -petscpartitioner_type simple -potential_petscspace_degree 2 -snes_converged_reason -snes_max_it 1 -snes_monitor -snes_rtol 1.e-8 -snes_type ksponly #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with 64 bit PetscInt Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8 Configure options: CC=mpifccpx CXX=mpiFCCpx CFLAGS="-L /opt/FJSVxtclanga/tcsds-1.2.29/lib64 -lfjlapack" CXXFLAGS="-L /opt/FJSVxtclanga/tcsds-1.2.29/lib64 -lfjlapack" COPTFLAGS=-Kfast CXXOPTFLAGS=-Kfast --with-fc=0 --package-prefix-hash=/home/ra010009/a04199/petsc-hash-pkgs --with-batch=1 --with-shared-libraries=yes --with-debugging=no --with-64-bit-indices=1 PETSC_ARCH=arch-fugaku-fujitsu ----------------------------------------- Libraries compiled on 2021-02-12 02:27:41 on fn01sv08 Machine characteristics: Linux-3.10.0-957.27.2.el7.x86_64-x86_64-with-redhat-7.6-Maipo Using PETSc directory: /home/ra010009/a04199/petsc Using PETSc arch: ----------------------------------------- Using C compiler: mpifccpx -L /opt/FJSVxtclanga/tcsds-1.2.29/lib64 -lfjlapack -fPIC -Kfast ----------------------------------------- Using include paths: -I/home/ra010009/a04199/petsc/include -I/home/ra010009/a04199/petsc/arch-fugaku-fujitsu/include ----------------------------------------- Using C linker: mpifccpx Using libraries: -Wl,-rpath,/home/ra010009/a04199/petsc/lib -L/home/ra010009/a04199/petsc/lib -lpetsc 
-Wl,-rpath,/opt/FJSVxos/devkit/aarch64/lib/gcc/aarch64-linux-gnu/8 -L/opt/FJSVxos/devkit/aarch64/lib/gcc/aarch64-linux-gnu/8 -Wl,-rpath,/opt/FJSVxtclanga/tcsds-1.2.29/lib64 -L/opt/FJSVxtclanga/tcsds-1.2.29/lib64 -Wl,-rpath,/opt/FJSVxtclanga/.common/MELI022/lib64 -L/opt/FJSVxtclanga/.common/MELI022/lib64 -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/aarch64-linux-gnu/lib64 -L/opt/FJSVxos/devkit/aarch64/aarch64-linux-gnu/lib64 -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/rfs/usr/lib64 -L/opt/FJSVxos/devkit/aarch64/rfs/usr/lib64 -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/rfs/opt/FJSVxos/mmm/lib64 -L/opt/FJSVxos/devkit/aarch64/rfs/opt/FJSVxos/mmm/lib64 -Wl,-rpath,/opt/FJSVxtclanga/tcsds-1.2.29/lib64/nofjobj -L/opt/FJSVxtclanga/tcsds-1.2.29/lib64/nofjobj -lX11 -lfjprofmpi -lfjlapack -ldl -lmpi_cxx -lmpi -lfjstring_internal -lfj90i -lfj90fmt_sve -lfj90f -lfjsrcinfo -lfjcrt -lfjprofcore -lfjprofomp -lfjc++ -lfjc++abi -lfjdemgl -lmpg -lm -lrt -lpthread -lelf -lz -lgcc_s -ldl ----------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: weak_scaling_fugaku.png Type: image/png Size: 54249 bytes Desc: not available URL: From mfadams at lbl.gov Sun Mar 7 07:35:30 2021 From: mfadams at lbl.gov (Mark Adams) Date: Sun, 7 Mar 2021 08:35:30 -0500 Subject: [petsc-users] DMPlex in Firedrake: scaling of mesh distribution In-Reply-To: References: <7dde1a8f447e4880b96f3dd7c4b315ae@MBX-LS5.itorg.ad.buffalo.edu> Message-ID: And this data puts one cell per process, distributes, and then refines 5 (or 2,3,4 in plot) times. On Sun, Mar 7, 2021 at 8:27 AM Mark Adams wrote: > FWIW, Here is the output from ex13 on 32K processes (8K Fugaku > nodes/sockets, 4 MPI/node, which seems recommended) with 128^3 vertex mesh > (64^3 Q2 3D Laplacian). > Almost an hour. > Attached is solver scaling. > > > 0 SNES Function norm 3.658334849208e+00 > Linear solve converged due to CONVERGED_RTOL iterations 22 > 1 SNES Function norm 1.609000373074e-12 > Nonlinear solve converged due to CONVERGED_ITS iterations 1 > Linear solve converged due to CONVERGED_RTOL iterations 22 > Linear solve converged due to CONVERGED_RTOL iterations 22 > Linear solve converged due to CONVERGED_RTOL iterations 22 > Linear solve converged due to CONVERGED_RTOL iterations 22 > Linear solve converged due to CONVERGED_RTOL iterations 22 > Linear solve converged due to CONVERGED_RTOL iterations 22 > Linear solve converged due to CONVERGED_RTOL iterations 22 > Linear solve converged due to CONVERGED_RTOL iterations 22 > Linear solve converged due to CONVERGED_RTOL iterations 22 > Linear solve converged due to CONVERGED_RTOL iterations 22 > > ************************************************************************************************************************ > *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
> [The remainder of the quoted -log_view output -- the full event timings, memory usage summary, option table, and build/configure information -- duplicates the listing above verbatim and is trimmed here.]
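As a rough illustration of the distribute-then-refine pattern Mark describes above (build a coarse DMPlex with roughly one cell per process, distribute it while it is still small, then refine in parallel), a sketch in PETSc C might look like the following. Error checking is omitted and the coarse-mesh creation is left abstract; the actual ex13 run is driven by the command-line options shown in the quoted output, so this is only an outline of the call pattern, not the benchmark's code.

    DM       dm, dmDist, dmRefined;
    PetscInt r, nrefine = 5;

    /* ... create a small coarse DMPlex in 'dm', e.g. about one cell per rank ... */

    /* distribute once, while the mesh is still coarse */
    DMPlexDistribute(dm, 0, NULL, &dmDist);
    if (dmDist) { DMDestroy(&dm); dm = dmDist; }

    /* then refine in parallel on each rank */
    for (r = 0; r < nrefine; ++r) {
      DMRefine(dm, PetscObjectComm((PetscObject) dm), &dmRefined);
      DMDestroy(&dm);
      dm = dmRefined;
    }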
From nicolas.barral at math.u-bordeaux.fr  Sun Mar  7 07:51:57 2021
From: nicolas.barral at math.u-bordeaux.fr (Nicolas Barral)
Date: Sun, 7 Mar 2021 14:51:57 +0100
Subject: [petsc-users] DMPlex tetrahedra facets orientation
In-Reply-To:
References:
Message-ID:

Matt,

Thanks for your answer.

However, DMPlexComputeCellGeometryFVM does not compute what I need
(normals of height-1 entities). I can't find any function doing that -- is
there one?

So far I've been doing it by hand, and after a lot of experimenting over
the past weeks it seems that, if I call the tetrahedron P0P1P2P3 and write
x for the cross product,

P3P2 x P3P1 is the outward normal to face P1P2P3
P0P2 x P0P3 is the outward normal to face P0P2P3
P3P1 x P3P0 is the outward normal to face P0P1P3
P0P1 x P0P2 is the outward normal to face P0P1P2

Have I just been lucky, or can I rely on this always being true?

(Alternatively, there is a link between the normals and the element
Jacobian, but I don't know the formula and can't find it.)

Thanks,

--
Nicolas

On 08/02/2021 15:19, Matthew Knepley wrote:
> On Mon, Feb 8, 2021 at 6:01 AM Nicolas Barral wrote:
>
>     Hi all,
>
>     Can I make any assumption on the orientation of triangular facets in a
>     tetrahedral plex ? I need the inward facet normals. Do I need to use
>     DMPlexGetOrientedFace or can I rely on either the tet vertices ordering,
>     or the faces ordering ? Could DMPlexGetRawFaces_Internal be enough ?
>
> You can do it by hand, but you have to account for the face orientation
> relative to the cell. That is what DMPlexGetOrientedFace() does. I think
> it would be easier to use the function below.
>
>     Alternatively, is there a function that computes the normals - without
>     bringing out the big guns ?
>
> This will compute the normals
>
> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexComputeCellGeometryFVM.html
>
> Should not be too heavy weight.
>
>    Thanks,
>
>       Matt
>
>     Thanks
>
>     --
>     Nicolas
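The by-hand construction Nicolas describes is just an edge cross product per face. A minimal sketch for the face (P1,P2,P3) opposite P0 could look like this; whether the result points outward for every cell still depends on the vertex ordering of the mesh, which is exactly the question raised above, so a robust code would check the sign against the direction from the cell centroid to the face centroid and flip it if needed.

    /* Sketch only: candidate normal for tet face (P1,P2,P3), i.e. P3P2 x P3P1. */
    static void CrossProduct(const double a[3], const double b[3], double n[3])
    {
      n[0] = a[1]*b[2] - a[2]*b[1];
      n[1] = a[2]*b[0] - a[0]*b[2];
      n[2] = a[0]*b[1] - a[1]*b[0];
    }

    static void FaceNormalP1P2P3(const double P1[3], const double P2[3],
                                 const double P3[3], double n[3])
    {
      double u[3], v[3];
      for (int d = 0; d < 3; ++d) {
        u[d] = P2[d] - P3[d];   /* edge vector P3P2 */
        v[d] = P1[d] - P3[d];   /* edge vector P3P1 */
      }
      CrossProduct(u, v, n);    /* P3P2 x P3P1 */
    }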
From knepley at gmail.com  Sun Mar  7 09:54:08 2021
From: knepley at gmail.com (Matthew Knepley)
Date: Sun, 7 Mar 2021 10:54:08 -0500
Subject: [petsc-users] DMPlex tetrahedra facets orientation
In-Reply-To:
References:
Message-ID:

On Sun, Mar 7, 2021 at 8:52 AM Nicolas Barral wrote:

> However, DMPlexComputeCellGeometryFVM does not compute what I need
> (normals of height-1 entities). I can't find any function doing that, is
> there one ?

The normal[] in DMPlexComputeCellGeometryFVM() is exactly what you want.
What does not look right to you?

  Thanks,

     Matt

[The rest of the quoted message is trimmed.]

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/
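A minimal usage sketch of the call Matt points to might look like the following. It assumes a 3D interpolated DMPlex; the loop over the cell's cone and the centroid-based orientation check are illustrative choices, not prescribed usage, and error checking is omitted.

    const PetscInt *faces;
    PetscInt        numFaces, f;
    PetscReal       cvol, ccent[3], cnorm[3];   /* cell volume and centroid */
    PetscReal       area, fcent[3], fnorm[3];   /* face area, centroid, normal */

    DMPlexComputeCellGeometryFVM(dm, cell, &cvol, ccent, cnorm);
    DMPlexGetConeSize(dm, cell, &numFaces);
    DMPlexGetCone(dm, cell, &faces);
    for (f = 0; f < numFaces; ++f) {
      /* the same routine accepts height-1 points (faces): for a face,
         the 'vol' argument is its area and normal[] is its normal */
      DMPlexComputeCellGeometryFVM(dm, faces[f], &area, fcent, fnorm);
      /* fnorm[] has one fixed sign per face; to use it as the outward normal
         of this particular cell, flip it when it points toward ccent */
    }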
From nicolas.barral at math.u-bordeaux.fr  Sun Mar  7 12:35:29 2021
From: nicolas.barral at math.u-bordeaux.fr (Nicolas Barral)
Date: Sun, 7 Mar 2021 19:35:29 +0100
Subject: [petsc-users] DMPlex tetrahedra facets orientation
In-Reply-To:
References:
Message-ID:

On 3/7/21 4:54 PM, Matthew Knepley wrote:
> The normal[] in DMPlexComputeCellGeometryFVM() is exactly what you want.
> What does not look right to you?

I got confused by the parameter names: it wasn't clear that 'cell' could be
a point of lower depth in the DAG :) I'll try that now.

Thanks

--
Nicolas

[The rest of the quoted message is trimmed.]

From bsmith at petsc.dev  Sun Mar  7 13:01:40 2021
From: bsmith at petsc.dev (Barry Smith)
Date: Sun, 7 Mar 2021 13:01:40 -0600
Subject: [petsc-users] DMPlex in Firedrake: scaling of mesh distribution
In-Reply-To:
References: <7dde1a8f447e4880b96f3dd7c4b315ae@MBX-LS5.itorg.ad.buffalo.edu>
Message-ID:

  Mark,

  Thanks for the numbers.

  Extremely problematic. DMPlexDistribute takes 88 percent of the total run
time, and SFBcastOpEnd takes 80 percent. Probably Matt is right: PetscSF is
flooding the network, which it cannot handle.
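For illustration only, the kind of throttled, windowed posting of sends suggested just below ("messaging in appropriate waves") could be sketched roughly as follows. This is not PetscSF code, and the buffer/rank arrays and the window size are hypothetical placeholders.

    #include <petscsys.h>

    /* Hypothetical sketch: post at most 'window' nonblocking sends per wave
       instead of posting all of them at once. */
    static PetscErrorCode SendInWaves(MPI_Comm comm, PetscMPIInt nsends, char **bufs,
                                      const PetscMPIInt *counts, const PetscMPIInt *ranks,
                                      PetscMPIInt tag, PetscMPIInt window, MPI_Request *reqs)
    {
      for (PetscMPIInt start = 0; start < nsends; start += window) {
        PetscMPIInt batch = (PetscMPIInt)PetscMin(window, nsends - start);
        for (PetscMPIInt i = 0; i < batch; ++i) {
          MPI_Isend(bufs[start+i], counts[start+i], MPI_BYTE, ranks[start+i], tag, comm, &reqs[i]);
        }
        MPI_Waitall(batch, reqs, MPI_STATUSES_IGNORE); /* complete this wave before starting the next */
      }
      return 0;
    }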
IMHO fixing PetscSF would be a far better route than writing all kinds of fancy DMPLEX hierarchical distributors. PetscSF needs to detect that it is sending too many messages together and do the messaging in appropriate waves; at the moment PetscSF is as dumb as stone it just shoves everything out as fast as it can. Junchao needs access to this machine. If everything in PETSc will depend on PetscSF then it simply has to scale on systems where you cannot just flood the network with MPI. Barry Mesh Partition 1 1.0 5.0133e+02 1.0 0.00e+00 0.0 1.3e+05 2.7e+02 6.0e+00 15 0 0 0 0 15 0 0 0 1 0 Mesh Migration 1 1.0 1.5494e+03 1.0 0.00e+00 0.0 7.3e+05 1.9e+02 2.4e+01 45 0 0 0 1 46 0 0 0 2 0 DMPlexPartStrtSF 1 1.0 4.9474e+023520.8 0.00e+00 0.0 3.3e+04 4.3e+00.0e+00 14 0 0 0 0 15 0 0 0 0 0 DMPlexPointSF 1 1.0 9.8750e+021264.8 0.00e+00 0.0 6.6e+04 5.4e+00.0e+00 28 0 0 0 0 29 0 0 0 0 0 DMPlexDistribute 1 1.0 3.0000e+03 1.5 0.00e+00 0.0 9.3e+05 2.3e+02 3.0e+01 88 0 0 0 2 90 0 0 0 3 0 DMPlexDistCones 1 1.0 1.0688e+03 2.6 0.00e+00 0.0 1.8e+05 3.1e+02 1.0e+00 31 0 0 0 0 31 0 0 0 0 0 DMPlexDistLabels 1 1.0 2.9172e+02 1.0 0.00e+00 0.0 3.1e+05 1.9e+02 2.1e+01 9 0 0 0 1 9 0 0 0 2 0 DMPlexDistField 1 1.0 1.8688e+02 1.2 0.00e+00 0.0 2.1e+05 9.3e+01 1.0e+00 5 0 0 0 0 5 0 0 0 0 0 SFSetUp 62 1.0 7.3283e+0213.6 0.00e+00 0.0 2.0e+07 2.7e+04 0.0e+00 5 0 1 3 0 5 0 6 9 0 0 SFBcastOpBegin 107 1.0 1.5770e+00452.5 0.00e+00 0.0 2.1e+07 1.8e+04 0.0e+00 0 0 1 2 0 0 0 6 6 0 0 SFBcastOpEnd 107 1.0 2.9430e+03 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 80 0 0 0 0 82 0 0 0 0 0 SFDistSection 9 1.0 4.4325e+02 1.5 0.00e+00 0.0 2.8e+06 1.1e+04 9.0e+00 11 0 0 0 0 11 0 1 1 1 0 SFSectionSF 11 1.0 2.3898e+02 4.7 0.00e+00 0.0 9.2e+05 1.7e+05 0.0e+00 5 0 0 1 0 5 0 0 2 0 0 > On Mar 7, 2021, at 7:35 AM, Mark Adams wrote: > > And this data puts one cell per process, distributes, and then refines 5 (or 2,3,4 in plot) times. > > On Sun, Mar 7, 2021 at 8:27 AM Mark Adams > wrote: > FWIW, Here is the output from ex13 on 32K processes (8K Fugaku nodes/sockets, 4 MPI/node, which seems recommended) with 128^3 vertex mesh (64^3 Q2 3D Laplacian). > Almost an hour. > Attached is solver scaling. > > > 0 SNES Function norm 3.658334849208e+00 > Linear solve converged due to CONVERGED_RTOL iterations 22 > 1 SNES Function norm 1.609000373074e-12 > Nonlinear solve converged due to CONVERGED_ITS iterations 1 > Linear solve converged due to CONVERGED_RTOL iterations 22 > Linear solve converged due to CONVERGED_RTOL iterations 22 > Linear solve converged due to CONVERGED_RTOL iterations 22 > Linear solve converged due to CONVERGED_RTOL iterations 22 > Linear solve converged due to CONVERGED_RTOL iterations 22 > Linear solve converged due to CONVERGED_RTOL iterations 22 > Linear solve converged due to CONVERGED_RTOL iterations 22 > Linear solve converged due to CONVERGED_RTOL iterations 22 > Linear solve converged due to CONVERGED_RTOL iterations 22 > Linear solve converged due to CONVERGED_RTOL iterations 22 > ************************************************************************************************************************ > *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** > ************************************************************************************************************************ > > ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- > > ../ex13 on a named i07-4008c with 32768 processors, by a04199 Fri Feb 12 23:27:13 2021 > Using Petsc Development GIT revision: v3.14.4-579-g4cb72fa GIT Date: 2021-02-05 15:19:40 +0000 > > Max Max/Min Avg Total > Time (sec): 3.373e+03 1.000 3.373e+03 > Objects: 1.055e+05 14.797 7.144e+03 > Flop: 5.376e+10 1.176 4.885e+10 1.601e+15 > Flop/sec: 1.594e+07 1.176 1.448e+07 4.745e+11 > MPI Messages: 6.048e+05 30.010 8.833e+04 2.894e+09 > MPI Message Lengths: 1.127e+09 4.132 6.660e+03 1.928e+13 > MPI Reductions: 1.824e+03 1.000 > > Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N --> 2N flop > and VecAXPY() for complex vectors of length N --> 8N flop > > Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total Count %Total Avg %Total Count %Total > 0: Main Stage: 3.2903e+03 97.5% 2.4753e+14 15.5% 3.538e+08 12.2% 1.779e+04 32.7% 9.870e+02 54.1% > 1: PCSetUp: 4.3062e+01 1.3% 1.8160e+13 1.1% 1.902e+07 0.7% 3.714e+04 3.7% 1.590e+02 8.7% > 2: KSP Solve only: 3.9685e+01 1.2% 1.3349e+15 83.4% 2.522e+09 87.1% 4.868e+03 63.7% 6.700e+02 36.7% > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flop: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > AvgLen: average message length (bytes) > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
> %T - percent time in this phase %F - percent flop in this phase > %M - percent messages in this phase %L - percent message lengths in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flop --- Global --- --- Stage ---- Total > Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > PetscBarrier 5 1.0 1.9907e+00 2.2 0.00e+00 0.0 3.8e+06 7.7e+01 2.0e+01 0 0 0 0 1 0 0 1 0 2 0 > BuildTwoSided 62 1.0 7.3272e+0214.1 0.00e+00 0.0 6.7e+06 8.0e+00 0.0e+00 5 0 0 0 0 5 0 2 0 0 0 > BuildTwoSidedF 59 1.0 3.1132e+01 7.4 0.00e+00 0.0 4.8e+06 2.5e+05 0.0e+00 0 0 0 6 0 0 0 1 19 0 0 > SNESSolve 1 1.0 1.7468e+02 1.0 7.83e+09 1.3 3.4e+08 1.3e+04 8.8e+02 5 13 12 23 48 5 85 96 70 89 1205779 > SNESSetUp 1 1.0 2.4195e+01 1.0 0.00e+00 0.0 3.7e+06 3.7e+05 1.3e+01 1 0 0 7 1 1 0 1 22 1 0 > SNESFunctionEval 3 1.0 1.1359e+01 1.2 1.17e+09 1.0 1.6e+06 1.4e+04 2.0e+00 0 2 0 0 0 0 15 0 0 0 3344744 > SNESJacobianEval 2 1.0 1.6829e+02 1.0 1.52e+09 1.0 1.1e+06 8.3e+05 0.0e+00 5 3 0 5 0 5 20 0 14 0 293588 > DMCreateMat 1 1.0 2.4107e+01 1.0 0.00e+00 0.0 3.7e+06 3.7e+05 1.3e+01 1 0 0 7 1 1 0 1 22 1 0 > Mesh Partition 1 1.0 5.0133e+02 1.0 0.00e+00 0.0 1.3e+05 2.7e+02 6.0e+00 15 0 0 0 0 15 0 0 0 1 0 > Mesh Migration 1 1.0 1.5494e+03 1.0 0.00e+00 0.0 7.3e+05 1.9e+02 2.4e+01 45 0 0 0 1 46 0 0 0 2 0 > DMPlexPartSelf 1 1.0 1.1498e+002367.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > DMPlexPartLblInv 1 1.0 3.6698e+00 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 0 0 > DMPlexPartLblSF 1 1.0 2.8522e-01 1.7 0.00e+00 0.0 4.9e+04 1.5e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > DMPlexPartStrtSF 1 1.0 4.9474e+023520.8 0.00e+00 0.0 3.3e+04 4.3e+02 0.0e+00 14 0 0 0 0 15 0 0 0 0 0 > DMPlexPointSF 1 1.0 9.8750e+021264.8 0.00e+00 0.0 6.6e+04 5.4e+02 0.0e+00 28 0 0 0 0 29 0 0 0 0 0 > DMPlexInterp 84 1.0 4.3219e-0158.6 0.00e+00 0.0 0.0e+00 0.0e+00 5.0e+00 0 0 0 0 0 0 0 0 0 1 0 > DMPlexDistribute 1 1.0 3.0000e+03 1.5 0.00e+00 0.0 9.3e+05 2.3e+02 3.0e+01 88 0 0 0 2 90 0 0 0 3 0 > DMPlexDistCones 1 1.0 1.0688e+03 2.6 0.00e+00 0.0 1.8e+05 3.1e+02 1.0e+00 31 0 0 0 0 31 0 0 0 0 0 > DMPlexDistLabels 1 1.0 2.9172e+02 1.0 0.00e+00 0.0 3.1e+05 1.9e+02 2.1e+01 9 0 0 0 1 9 0 0 0 2 0 > DMPlexDistField 1 1.0 1.8688e+02 1.2 0.00e+00 0.0 2.1e+05 9.3e+01 1.0e+00 5 0 0 0 0 5 0 0 0 0 0 > DMPlexStratify 118 1.0 6.2852e+023280.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.6e+01 1 0 0 0 1 1 0 0 0 2 0 > DMPlexSymmetrize 118 1.0 6.7634e-02 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > DMPlexPrealloc 1 1.0 2.3741e+01 1.0 0.00e+00 0.0 3.7e+06 3.7e+05 1.1e+01 1 0 0 7 1 1 0 1 22 1 0 > DMPlexResidualFE 3 1.0 1.0634e+01 1.2 1.16e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 15 0 0 0 3569848 > DMPlexJacobianFE 2 1.0 1.6809e+02 1.0 1.51e+09 1.0 6.5e+05 1.4e+06 0.0e+00 5 3 0 5 0 5 20 0 14 0 293801 > SFSetGraph 87 1.0 2.7673e-03 3.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > SFSetUp 62 1.0 7.3283e+0213.6 0.00e+00 0.0 2.0e+07 2.7e+04 0.0e+00 5 0 1 3 0 5 0 6 9 0 0 > SFBcastOpBegin 107 1.0 1.5770e+00452.5 0.00e+00 0.0 2.1e+07 1.8e+04 0.0e+00 0 0 1 2 0 0 0 6 6 0 0 > SFBcastOpEnd 107 1.0 2.9430e+03 4.8 0.00e+00 0.0 
0.0e+00 0.0e+00 0.0e+00 80 0 0 0 0 82 0 0 0 0 0 > SFReduceBegin 12 1.0 2.4825e-01172.8 0.00e+00 0.0 2.4e+06 2.0e+05 0.0e+00 0 0 0 2 0 0 0 1 8 0 0 > SFReduceEnd 12 1.0 3.8286e+014865.8 3.74e+04 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 31 > SFFetchOpBegin 2 1.0 2.4497e-0390.2 0.00e+00 0.0 4.3e+05 3.5e+05 0.0e+00 0 0 0 1 0 0 0 0 2 0 0 > SFFetchOpEnd 2 1.0 6.1349e-0210.9 0.00e+00 0.0 4.3e+05 3.5e+05 0.0e+00 0 0 0 1 0 0 0 0 2 0 0 > SFCreateEmbed 3 1.0 3.6800e+013261.5 0.00e+00 0.0 4.7e+05 1.7e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > SFDistSection 9 1.0 4.4325e+02 1.5 0.00e+00 0.0 2.8e+06 1.1e+04 9.0e+00 11 0 0 0 0 11 0 1 1 1 0 > SFSectionSF 11 1.0 2.3898e+02 4.7 0.00e+00 0.0 9.2e+05 1.7e+05 0.0e+00 5 0 0 1 0 5 0 0 2 0 0 > SFRemoteOff 2 1.0 3.2868e-0143.1 0.00e+00 0.0 8.7e+05 8.2e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > SFPack 1023 1.0 2.5215e-0176.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > SFUnpack 1025 1.0 5.1600e-0216.8 5.62e+0521.3 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 54693 > MatMult 1549525.4 3.4810e+00 1.3 4.35e+09 1.1 2.2e+08 6.1e+03 0.0e+00 0 8 8 7 0 0 54 62 21 0 38319208 > MatMultAdd 132 1.0 6.9168e-01 3.0 7.97e+07 1.2 2.8e+07 4.6e+02 0.0e+00 0 0 1 0 0 0 1 8 0 0 3478717 > MatMultTranspose 132 1.0 5.9967e-01 1.6 8.00e+07 1.2 3.0e+07 4.5e+02 0.0e+00 0 0 1 0 0 0 1 9 0 0 4015214 > MatSolve 22 0.0 6.8431e-04 0.0 7.41e+05 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1082 > MatLUFactorSym 1 1.0 5.9569e-0433.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatLUFactorNum 1 1.0 1.6236e-03773.2 1.46e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 897 > MatConvert 6 1.0 1.4290e-01 1.2 0.00e+00 0.0 3.0e+06 3.7e+03 0.0e+00 0 0 0 0 0 0 0 1 0 0 0 > MatScale 18 1.0 3.7962e-01 1.3 4.11e+07 1.2 2.0e+06 5.5e+03 0.0e+00 0 0 0 0 0 0 0 1 0 0 3253392 > MatResidual 132 1.0 6.8256e-01 1.4 8.27e+08 1.2 4.4e+07 5.5e+03 0.0e+00 0 2 2 1 0 0 10 13 4 0 36282014 > MatAssemblyBegin 244 1.0 3.1181e+01 6.6 0.00e+00 0.0 4.8e+06 2.5e+05 0.0e+00 0 0 0 6 0 0 0 1 19 0 0 > MatAssemblyEnd 244 1.0 6.3232e+00 1.9 3.17e+06 6.9 0.0e+00 0.0e+00 1.4e+02 0 0 0 0 8 0 0 0 0 15 7655 > MatGetRowIJ 1 0.0 2.5780e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatCreateSubMat 10 1.0 1.5162e+00 1.0 0.00e+00 0.0 1.6e+05 3.4e+05 1.3e+02 0 0 0 0 7 0 0 0 1 13 0 > MatGetOrdering 1 0.0 1.0899e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatCoarsen 6 1.0 3.5837e-01 1.3 0.00e+00 0.0 1.6e+07 1.2e+04 3.9e+01 0 0 1 1 2 0 0 5 3 4 0 > MatZeroEntries 8 1.0 5.3730e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatAXPY 6 1.0 2.6245e-01 1.1 2.66e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 33035 > MatTranspose 12 1.0 3.0731e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatMatMultSym 18 1.0 2.1398e+00 1.4 0.00e+00 0.0 6.1e+06 5.5e+03 4.8e+01 0 0 0 0 3 0 0 2 1 5 0 > MatMatMultNum 6 1.0 1.1243e+00 1.0 3.76e+07 1.2 2.0e+06 5.5e+03 0.0e+00 0 0 0 0 0 0 0 1 0 0 1001203 > MatPtAPSymbolic 6 1.0 1.7280e+01 1.0 0.00e+00 0.0 1.2e+07 3.2e+04 4.2e+01 1 0 0 2 2 1 0 3 6 4 0 > MatPtAPNumeric 6 1.0 1.8047e+01 1.0 1.49e+09 5.1 2.8e+06 1.1e+05 2.4e+01 1 1 0 2 1 1 5 1 5 2 663675 > MatTrnMatMultSym 1 1.0 3.0221e+01 1.0 0.00e+00 0.0 2.4e+06 5.8e+05 1.1e+01 1 0 0 7 1 1 0 1 22 1 0 > MatGetLocalMat 19 1.0 1.3904e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetBrAoCol 18 1.0 1.9926e-01 5.0 0.00e+00 0.0 1.4e+07 2.3e+04 0.0e+00 0 0 0 2 0 0 0 4 5 0 0 > MatGetSymTrans 2 1.0 1.8996e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecTDot 176 1.0 7.0632e-01 4.5 3.48e+07 1.0 0.0e+00 0.0e+00 1.8e+02 0 0 0 0 10 0 0 0 0 18 1608728 > VecNorm 60 1.0 1.4074e+0012.2 1.58e+07 1.0 0.0e+00 0.0e+00 6.0e+01 0 0 0 0 3 0 0 0 0 6 366467 > VecCopy 422 1.0 5.1259e-02 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 653 1.0 2.3974e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 165 1.0 6.5622e-03 1.3 3.42e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 170485467 > VecAYPX 861 1.0 7.8529e-02 1.2 6.21e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 25785252 > VecAXPBYCZ 264 1.0 4.1343e-02 1.5 5.85e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 46135592 > VecAssemblyBegin 21 1.0 2.3463e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAssemblyEnd 21 1.0 1.4457e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecPointwiseMult 600 1.0 5.7510e-02 1.2 2.66e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 15075754 > VecScatterBegin 902 1.0 5.1188e-01 1.2 0.00e+00 0.0 2.9e+08 5.3e+03 0.0e+00 0 0 10 8 0 0 0 82 25 0 0 > VecScatterEnd 902 1.0 1.2143e+00 3.2 5.50e+0537.9 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1347 > VecSetRandom 6 1.0 2.6354e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > DualSpaceSetUp 7 1.0 5.3467e-0112.0 4.26e+03 1.0 0.0e+00 0.0e+00 1.3e+01 0 0 0 0 1 0 0 0 0 1 261 > FESetUp 7 1.0 1.7541e-01128.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSetUp 15 1.0 2.7470e-01 1.1 2.04e+08 1.2 1.0e+07 5.5e+03 1.3e+02 0 0 0 0 7 0 2 3 1 13 22477233 > KSPSolve 1 1.0 4.3257e+00 1.0 4.33e+09 1.1 2.5e+08 4.8e+03 6.6e+01 0 8 9 6 4 0 54 72 20 7 30855976 > PCGAMGGraph_AGG 6 1.0 5.0969e+00 1.0 3.76e+07 1.2 5.1e+06 4.4e+03 4.8e+01 0 0 0 0 3 0 0 1 0 5 220852 > PCGAMGCoarse_AGG 6 1.0 3.1121e+01 1.0 0.00e+00 0.0 2.5e+07 6.9e+04 5.5e+01 1 0 1 9 3 1 0 7 27 6 0 > PCGAMGProl_AGG 6 1.0 5.8196e-01 1.0 0.00e+00 0.0 6.6e+06 9.3e+03 7.2e+01 0 0 0 0 4 0 0 2 1 7 0 > PCGAMGPOpt_AGG 6 1.0 3.2414e+00 1.0 2.42e+08 1.2 2.1e+07 5.3e+03 1.6e+02 0 0 1 1 9 0 3 6 2 17 2256493 > GAMG: createProl 6 1.0 4.0042e+01 1.0 2.80e+08 1.2 5.8e+07 3.3e+04 3.4e+02 1 1 2 10 19 1 3 16 31 34 210778 > Graph 12 1.0 5.0926e+00 1.0 3.76e+07 1.2 5.1e+06 4.4e+03 4.8e+01 0 0 0 0 3 0 0 1 0 5 221038 > MIS/Agg 6 1.0 3.5850e-01 1.3 0.00e+00 0.0 1.6e+07 1.2e+04 3.9e+01 0 0 1 1 2 0 0 5 3 4 0 > SA: col data 6 1.0 3.0509e-01 1.0 0.00e+00 0.0 5.4e+06 9.2e+03 2.4e+01 0 0 0 0 1 0 0 2 1 2 0 > SA: frmProl0 6 1.0 2.3467e-01 1.1 0.00e+00 0.0 1.3e+06 9.5e+03 2.4e+01 0 0 0 0 1 0 0 0 0 2 0 > SA: smooth 6 1.0 2.7855e+00 1.0 4.14e+07 1.2 8.1e+06 5.5e+03 6.3e+01 0 0 0 0 3 0 1 2 1 6 446491 > GAMG: partLevel 6 1.0 3.7266e+01 1.0 1.49e+09 5.1 1.5e+07 4.9e+04 3.2e+02 1 1 1 4 17 1 5 4 12 32 321395 > repartition 5 1.0 2.0343e+00 1.1 0.00e+00 0.0 4.0e+05 1.4e+05 2.5e+02 0 0 0 0 14 0 0 0 1 25 0 > Invert-Sort 5 1.0 1.5021e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+01 0 0 0 0 2 0 0 0 0 3 0 > Move A 5 1.0 1.1548e+00 1.0 0.00e+00 0.0 1.6e+05 3.4e+05 7.0e+01 0 0 0 0 4 0 0 0 1 7 0 > Move P 5 1.0 4.2799e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 7.5e+01 0 0 0 0 4 0 0 0 0 8 0 > PCGAMG Squ l00 1 1.0 3.0221e+01 1.0 0.00e+00 0.0 2.4e+06 5.8e+05 1.1e+01 1 0 0 7 1 1 0 1 22 1 0 > PCGAMG Gal l00 1 1.0 8.7411e+00 1.0 2.93e+08 1.1 5.4e+06 4.5e+04 1.2e+01 0 1 0 1 1 0 4 2 4 1 1092355 > PCGAMG Opt l00 1 1.0 1.9734e+00 1.0 3.36e+07 1.1 3.2e+06 1.2e+04 9.0e+00 0 0 0 0 0 0 0 1 1 1 555327 > PCGAMG Gal l01 1 1.0 1.0153e+00 1.0 3.50e+07 1.4 5.9e+06 3.9e+04 1.2e+01 0 0 0 1 1 0 
0 2 4 1 1079887 > PCGAMG Opt l01 1 1.0 7.4812e-02 1.0 5.35e+05 1.2 3.2e+06 1.1e+03 9.0e+00 0 0 0 0 0 0 0 1 0 1 232542 > PCGAMG Gal l02 1 1.0 1.8063e+00 1.0 7.43e+07 0.0 3.0e+06 5.9e+04 1.2e+01 0 0 0 1 1 0 0 1 3 1 593392 > PCGAMG Opt l02 1 1.0 1.1580e-01 1.1 6.93e+05 0.0 1.6e+06 1.3e+03 9.0e+00 0 0 0 0 0 0 0 0 0 1 93213 > PCGAMG Gal l03 1 1.0 6.1075e+00 1.0 2.72e+08 0.0 2.6e+05 9.2e+04 1.1e+01 0 0 0 0 1 0 0 0 0 1 36155 > PCGAMG Opt l03 1 1.0 8.0836e-02 1.0 1.55e+06 0.0 1.4e+05 1.4e+03 8.0e+00 0 0 0 0 0 0 0 0 0 1 18229 > PCGAMG Gal l04 1 1.0 1.6203e+01 1.0 9.44e+08 0.0 1.4e+04 3.0e+05 1.1e+01 0 0 0 0 1 0 0 0 0 1 2366 > PCGAMG Opt l04 1 1.0 1.2663e-01 1.0 2.01e+06 0.0 6.9e+03 2.2e+03 8.0e+00 0 0 0 0 0 0 0 0 0 1 817 > PCGAMG Gal l05 1 1.0 1.4800e+00 1.0 3.16e+08 0.0 9.0e+01 1.6e+05 1.1e+01 0 0 0 0 1 0 0 0 0 1 796 > PCGAMG Opt l05 1 1.0 8.1763e-02 1.1 2.50e+06 0.0 4.8e+01 4.6e+03 8.0e+00 0 0 0 0 0 0 0 0 0 1 114 > PCSetUp 2 1.0 7.7969e+01 1.0 1.97e+09 2.8 8.3e+07 3.3e+04 8.1e+02 2 2 3 14 44 2 11 23 43 82 341051 > PCSetUpOnBlocks 22 1.0 2.4609e-0317.2 1.46e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 592 > PCApply 22 1.0 3.6455e+00 1.1 3.57e+09 1.2 2.4e+08 4.3e+03 0.0e+00 0 7 8 5 0 0 43 67 16 0 29434967 > > --- Event Stage 1: PCSetUp > > BuildTwoSided 4 1.0 1.5980e-01 2.7 0.00e+00 0.0 2.1e+05 8.0e+00 0.0e+00 0 0 0 0 0 0 0 1 0 0 0 > BuildTwoSidedF 6 1.0 1.3169e+01 5.5 0.00e+00 0.0 1.9e+06 1.9e+05 0.0e+00 0 0 0 2 0 28 0 10 51 0 0 > SFSetGraph 5 1.0 4.9640e-0519.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > SFSetUp 4 1.0 1.6038e-01 2.3 0.00e+00 0.0 6.4e+05 9.1e+02 0.0e+00 0 0 0 0 0 0 0 3 0 0 0 > SFPack 30 1.0 3.3376e-04 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > SFUnpack 30 1.0 1.2101e-05 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatMult 30 1.0 1.5544e-01 1.5 1.87e+08 1.2 1.0e+07 5.5e+03 0.0e+00 0 0 0 0 0 0 31 53 8 0 35930640 > MatAssemblyBegin 43 1.0 1.3201e+01 4.7 0.00e+00 0.0 1.9e+06 1.9e+05 0.0e+00 0 0 0 2 0 28 0 10 51 0 0 > MatAssemblyEnd 43 1.0 1.1159e+01 1.0 2.77e+07705.7 0.0e+00 0.0e+00 2.0e+01 0 0 0 0 1 26 0 0 0 13 1036 > MatZeroEntries 6 1.0 4.7315e-0410.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatTranspose 12 1.0 2.5142e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatMatMultSym 10 1.0 5.8783e-0117.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatPtAPSymbolic 5 1.0 1.4489e+01 1.0 0.00e+00 0.0 6.2e+06 3.6e+04 3.5e+01 0 0 0 1 2 34 0 32 31 22 0 > MatPtAPNumeric 6 1.0 2.8457e+01 1.0 1.50e+09 5.1 2.7e+06 1.6e+05 2.0e+01 1 1 0 2 1 66 66 14 61 13 421190 > MatGetLocalMat 6 1.0 9.8574e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetBrAoCol 6 1.0 3.7669e-01 2.3 0.00e+00 0.0 5.1e+06 3.8e+04 0.0e+00 0 0 0 1 0 0 0 27 28 0 0 > VecTDot 66 1.0 6.5271e-02 4.1 5.85e+06 1.0 0.0e+00 0.0e+00 6.6e+01 0 0 0 0 4 0 1 0 0 42 2922260 > VecNorm 36 1.0 1.1226e-02 3.2 3.19e+06 1.0 0.0e+00 0.0e+00 3.6e+01 0 0 0 0 2 0 1 0 0 23 9268067 > VecCopy 12 1.0 1.2805e-03 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 11 1.0 6.6620e-05 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 60 1.0 1.0763e-03 1.5 5.32e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 161104914 > VecAYPX 24 1.0 2.0581e-03 1.3 2.13e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 33701038 > VecPointwiseMult 36 1.0 3.5709e-03 1.3 1.60e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 14567861 > VecScatterBegin 30 1.0 2.9079e-03 7.8 0.00e+00 0.0 
1.0e+07 5.5e+03 0.0e+00 0 0 0 0 0 0 0 53 8 0 0 > VecScatterEnd 30 1.0 3.7015e-0263.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSetUp 7 1.0 2.3165e-01 1.0 2.04e+08 1.2 1.0e+07 5.5e+03 1.0e+02 0 0 0 0 6 1 34 53 8 64 26654598 > PCGAMG Gal l00 1 1.0 4.7415e+00 1.0 2.94e+08 1.1 1.8e+06 7.8e+04 0.0e+00 0 1 0 1 0 11 53 9 20 0 2015623 > PCGAMG Gal l01 1 1.0 1.2103e+00 1.0 3.50e+07 1.4 4.8e+06 6.2e+04 1.2e+01 0 0 0 2 1 3 6 25 41 8 905938 > PCGAMG Gal l02 1 1.0 3.4334e+00 1.0 7.41e+07 0.0 2.2e+06 8.7e+04 1.2e+01 0 0 0 1 1 8 6 11 27 8 312184 > PCGAMG Gal l03 1 1.0 9.6062e+00 1.0 2.71e+08 0.0 1.9e+05 1.3e+05 1.1e+01 0 0 0 0 1 22 1 1 4 7 22987 > PCGAMG Gal l04 1 1.0 2.2482e+01 1.0 9.43e+08 0.0 8.7e+03 4.8e+05 1.1e+01 1 0 0 0 1 52 0 0 1 7 1705 > PCGAMG Gal l05 1 1.0 1.5961e+00 1.1 3.16e+08 0.0 6.8e+01 2.2e+05 1.1e+01 0 0 0 0 1 4 0 0 0 7 738 > PCSetUp 1 1.0 4.3191e+01 1.0 1.70e+09 3.6 1.9e+07 3.7e+04 1.6e+02 1 1 1 4 9 100100100100100 420463 > > --- Event Stage 2: KSP Solve only > > SFPack 8140 1.0 7.4247e-02 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > SFUnpack 8140 1.0 1.2905e-02 5.2 5.50e+0637.9 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1267207 > MatMult 5500 1.0 2.9994e+01 1.2 3.98e+10 1.1 2.0e+09 6.1e+03 0.0e+00 1 76 68 62 0 70 92 78 98 0 40747181 > MatMultAdd 1320 1.0 6.2192e+00 2.7 7.97e+08 1.2 2.8e+08 4.6e+02 0.0e+00 0 2 10 1 0 14 2 11 1 0 3868976 > MatMultTranspose 1320 1.0 4.0304e+00 1.7 8.00e+08 1.2 2.8e+08 4.6e+02 0.0e+00 0 2 10 1 0 7 2 11 1 0 5974153 > MatSolve 220 0.0 6.7366e-03 0.0 7.41e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1100 > MatLUFactorSym 1 1.0 5.8691e-0435.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatLUFactorNum 1 1.0 1.5955e-03756.2 1.46e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 913 > MatResidual 1320 1.0 6.4920e+00 1.3 8.27e+09 1.2 4.4e+08 5.5e+03 0.0e+00 0 15 15 13 0 14 19 18 20 0 38146350 > MatGetRowIJ 1 0.0 2.7820e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetOrdering 1 0.0 9.6940e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecTDot 440 1.0 4.6162e+00 6.9 2.31e+08 1.0 0.0e+00 0.0e+00 4.4e+02 0 0 0 0 24 5 1 0 0 66 1635124 > VecNorm 230 1.0 3.9605e-02 1.6 1.21e+08 1.0 0.0e+00 0.0e+00 2.3e+02 0 0 0 0 13 0 0 0 0 34 99622387 > VecCopy 3980 1.0 5.4166e-01 4.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 4640 1.0 1.4216e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 440 1.0 4.2829e-02 1.3 2.31e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 176236363 > VecAYPX 8130 1.0 7.3998e-01 1.2 5.78e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 2 1 0 0 0 25489392 > VecAXPBYCZ 2640 1.0 3.9974e-01 1.5 5.85e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 1 1 0 0 0 47716315 > VecPointwiseMult 5280 1.0 5.9845e-01 1.5 2.34e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 1 0 0 0 12748927 > VecScatterBegin 8140 1.0 4.9231e-01 5.9 0.00e+00 0.0 2.5e+09 4.9e+03 0.0e+00 0 0 87 64 0 1 0100100 0 0 > VecScatterEnd 8140 1.0 1.0172e+01 3.6 5.50e+0637.9 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 13 0 0 0 0 1608 > KSPSetUp 1 1.0 9.5996e-07 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 10 1.0 3.9685e+01 1.0 4.33e+10 1.1 2.5e+09 4.9e+03 6.7e+02 1 83 87 64 37 100100100100100 33637495 > PCSetUp 1 1.0 2.4149e-0318.1 1.46e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 603 > PCSetUpOnBlocks 220 1.0 2.6945e-03 8.9 1.46e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 540 > PCApply 220 1.0 3.2921e+01 1.1 3.57e+10 
1.2 2.3e+09 4.3e+03 0.0e+00 1 67 81 53 0 81 80 93 82 0 32595360 > ------------------------------------------------------------------------------------------------------------------------ > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. > > --- Event Stage 0: Main Stage > > Container 112 112 69888 0. > SNES 1 1 1532 0. > DMSNES 1 1 720 0. > Distributed Mesh 449 449 30060888 0. > DM Label 790 790 549840 0. > Quadrature 579 579 379824 0. > Index Set 100215 100210 361926232 0. > IS L to G Mapping 8 13 4356552 0. > Section 771 771 598296 0. > Star Forest Graph 897 897 1053640 0. > Discrete System 521 521 533512 0. > GraphPartitioner 118 118 91568 0. > Matrix 432 462 2441805304 0. > Matrix Coarsen 6 6 4032 0. > Vector 354 354 65492968 0. > Linear Space 7 7 5208 0. > Dual Space 111 111 113664 0. > FE Space 7 7 5992 0. > Field over DM 6 6 4560 0. > Krylov Solver 21 21 37560 0. > DMKSP interface 1 1 704 0. > Preconditioner 21 21 21632 0. > Viewer 2 1 896 0. > PetscRandom 12 12 8520 0. > > --- Event Stage 1: PCSetUp > > Index Set 10 15 85367336 0. > IS L to G Mapping 5 0 0 0. > Star Forest Graph 5 5 6600 0. > Matrix 50 20 73134024 0. > Vector 28 28 6235096 0. > > --- Event Stage 2: KSP Solve only > > Index Set 5 5 8296 0. > Matrix 1 1 273856 0. > ======================================================================================================================== > Average time to get PetscTime(): 6.40051e-08 > Average time for MPI_Barrier(): 8.506e-06 > Average time for zero size MPI_Send(): 6.6027e-06 > #PETSc Option Table entries: > -benchmark_it 10 > -dm_distribute > -dm_plex_box_dim 3 > -dm_plex_box_faces 32,32,32 > -dm_plex_box_lower 0,0,0 > -dm_plex_box_simplex 0 > -dm_plex_box_upper 1,1,1 > -dm_refine 5 > -ksp_converged_reason > -ksp_max_it 150 > -ksp_norm_type unpreconditioned > -ksp_rtol 1.e-12 > -ksp_type cg > -log_view > -matptap_via scalable > -mg_levels_esteig_ksp_max_it 5 > -mg_levels_esteig_ksp_type cg > -mg_levels_ksp_max_it 2 > -mg_levels_ksp_type chebyshev > -mg_levels_pc_type jacobi > -pc_gamg_agg_nsmooths 1 > -pc_gamg_coarse_eq_limit 2000 > -pc_gamg_coarse_grid_layout_type spread > -pc_gamg_esteig_ksp_max_it 5 > -pc_gamg_esteig_ksp_type cg > -pc_gamg_process_eq_limit 500 > -pc_gamg_repartition false > -pc_gamg_reuse_interpolation true > -pc_gamg_square_graph 1 > -pc_gamg_threshold 0.01 > -pc_gamg_threshold_scale .5 > -pc_gamg_type agg > -pc_type gamg > -petscpartitioner_simple_node_grid 8,8,8 > -petscpartitioner_simple_process_grid 4,4,4 > -petscpartitioner_type simple > -potential_petscspace_degree 2 > -snes_converged_reason > -snes_max_it 1 > -snes_monitor > -snes_rtol 1.e-8 > -snes_type ksponly > #End of PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with 64 bit PetscInt > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8 > Configure options: CC=mpifccpx CXX=mpiFCCpx CFLAGS="-L /opt/FJSVxtclanga/tcsds-1.2.29/lib64 -lfjlapack" CXXFLAGS="-L /opt/FJSVxtclanga/tcsds-1.2.29/lib64 -lfjlapack" COPTFLAGS=-Kfast CXXOPTFLAGS=-Kfast --with-fc=0 --package-prefix-hash=/home/ra010009/a04199/petsc-hash-pkgs --with-batch=1 --with-shared-libraries=yes --with-debugging=no --with-64-bit-indices=1 PETSC_ARCH=arch-fugaku-fujitsu > ----------------------------------------- > Libraries compiled on 2021-02-12 02:27:41 on fn01sv08 > Machine characteristics: 
Linux-3.10.0-957.27.2.el7.x86_64-x86_64-with-redhat-7.6-Maipo > Using PETSc directory: /home/ra010009/a04199/petsc > Using PETSc arch: > ----------------------------------------- > > Using C compiler: mpifccpx -L /opt/FJSVxtclanga/tcsds-1.2.29/lib64 -lfjlapack -fPIC -Kfast > ----------------------------------------- > > Using include paths: -I/home/ra010009/a04199/petsc/include -I/home/ra010009/a04199/petsc/arch-fugaku-fujitsu/include > ----------------------------------------- > > Using C linker: mpifccpx > Using libraries: -Wl,-rpath,/home/ra010009/a04199/petsc/lib -L/home/ra010009/a04199/petsc/lib -lpetsc -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/lib/gcc/aarch64-linux-gnu/8 -L/opt/FJSVxos/devkit/aarch64/lib/gcc/aarch64-linux-gnu/8 -Wl,-rpath,/opt/FJSVxtclanga/tcsds-1.2.29/lib64 -L/opt/FJSVxtclanga/tcsds-1.2.29/lib64 -Wl,-rpath,/opt/FJSVxtclanga/.common/MELI022/lib64 -L/opt/FJSVxtclanga/.common/MELI022/lib64 -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/aarch64-linux-gnu/lib64 -L/opt/FJSVxos/devkit/aarch64/aarch64-linux-gnu/lib64 -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/rfs/usr/lib64 -L/opt/FJSVxos/devkit/aarch64/rfs/usr/lib64 -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/rfs/opt/FJSVxos/mmm/lib64 -L/opt/FJSVxos/devkit/aarch64/rfs/opt/FJSVxos/mmm/lib64 -Wl,-rpath,/opt/FJSVxtclanga/tcsds-1.2.29/lib64/nofjobj -L/opt/FJSVxtclanga/tcsds-1.2.29/lib64/nofjobj -lX11 -lfjprofmpi -lfjlapack -ldl -lmpi_cxx -lmpi -lfjstring_internal -lfj90i -lfj90fmt_sve -lfj90f -lfjsrcinfo -lfjcrt -lfjprofcore -lfjprofomp -lfjc++ -lfjc++abi -lfjdemgl -lmpg -lm -lrt -lpthread -lelf -lz -lgcc_s -ldl > ----------------------------------------- > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Sun Mar 7 14:27:24 2021 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Sun, 7 Mar 2021 23:27:24 +0300 Subject: [petsc-users] DMPlex in Firedrake: scaling of mesh distribution In-Reply-To: References: <7dde1a8f447e4880b96f3dd7c4b315ae@MBX-LS5.itorg.ad.buffalo.edu> Message-ID: <4129BF7E-1092-4593-A7C7-275531EB3557@gmail.com> Mark Being an MPI issue, you should run with -log_sync From your log the problem seems with SFSetup that is called many times (62), with timings associated mostly to the SF revealing ranks phase. DMPlex abuses of the embedded SF, that can be optimized further I presume. It should run (someone has to write the code) a cheaper operation, since the communication graph of the embedded SF is a subgraph of the original . > On Mar 7, 2021, at 10:01 PM, Barry Smith wrote: > > > Mark, > > Thanks for the numbers. > > Extremely problematic. DMPlexDistribute takes 88 percent of the total run time, SFBcastOpEnd takes 80 percent. > > Probably Matt is right, PetscSF is flooding the network which it cannot handle. IMHO fixing PetscSF would be a far better route than writing all kinds of fancy DMPLEX hierarchical distributors. PetscSF needs to detect that it is sending too many messages together and do the messaging in appropriate waves; at the moment PetscSF is as dumb as stone it just shoves everything out as fast as it can. Junchao needs access to this machine. If everything in PETSc will depend on PetscSF then it simply has to scale on systems where you cannot just flood the network with MPI. 
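A minimal sketch of the "messaging in appropriate waves" idea described above -- this is plain MPI, not PetscSF's actual code; the wave size, function name, and buffer layout are made-up placeholders, and matching receives are assumed to be pre-posted or handled elsewhere:

  /* Post at most WAVE_SIZE nonblocking sends at a time instead of
     shoving every message out at once; each wave is drained before
     the next one is posted. */
  #include <mpi.h>
  #include <stdlib.h>

  #define WAVE_SIZE 64   /* max outstanding sends per wave (tuning knob) */

  static int SendInWaves(int nmsg, const int *dest, char **buf, const int *len, MPI_Comm comm)
  {
    MPI_Request *req = (MPI_Request *) malloc(WAVE_SIZE * sizeof(MPI_Request));
    int i, j;
    for (i = 0; i < nmsg; i += WAVE_SIZE) {
      int nwave = (nmsg - i < WAVE_SIZE) ? nmsg - i : WAVE_SIZE;
      for (j = 0; j < nwave; j++) {
        MPI_Isend(buf[i + j], len[i + j], MPI_BYTE, dest[i + j], 0, comm, &req[j]);
      }
      MPI_Waitall(nwave, req, MPI_STATUSES_IGNORE);  /* complete this wave before flooding the network with the next */
    }
    free(req);
    return 0;
  }

(Re the -log_sync suggestion above: adding -log_sync alongside -log_view synchronizes before each logged event, so the reported times separate waiting/load imbalance from actual communication, which should make it clearer whether SFSetUp and SFBcastOpEnd are genuinely the bottleneck.)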
> > Barry > > > Mesh Partition 1 1.0 5.0133e+02 1.0 0.00e+00 0.0 1.3e+05 2.7e+02 6.0e+00 15 0 0 0 0 15 0 0 0 1 0 > Mesh Migration 1 1.0 1.5494e+03 1.0 0.00e+00 0.0 7.3e+05 1.9e+02 2.4e+01 45 0 0 0 1 46 0 0 0 2 0 > DMPlexPartStrtSF 1 1.0 4.9474e+023520.8 0.00e+00 0.0 3.3e+04 4.3e+00.0e+00 14 0 0 0 0 15 0 0 0 0 0 > DMPlexPointSF 1 1.0 9.8750e+021264.8 0.00e+00 0.0 6.6e+04 5.4e+00.0e+00 28 0 0 0 0 29 0 0 0 0 0 > DMPlexDistribute 1 1.0 3.0000e+03 1.5 0.00e+00 0.0 9.3e+05 2.3e+02 3.0e+01 88 0 0 0 2 90 0 0 0 3 0 > DMPlexDistCones 1 1.0 1.0688e+03 2.6 0.00e+00 0.0 1.8e+05 3.1e+02 1.0e+00 31 0 0 0 0 31 0 0 0 0 0 > DMPlexDistLabels 1 1.0 2.9172e+02 1.0 0.00e+00 0.0 3.1e+05 1.9e+02 2.1e+01 9 0 0 0 1 9 0 0 0 2 0 > DMPlexDistField 1 1.0 1.8688e+02 1.2 0.00e+00 0.0 2.1e+05 9.3e+01 1.0e+00 5 0 0 0 0 5 0 0 0 0 0 > SFSetUp 62 1.0 7.3283e+0213.6 0.00e+00 0.0 2.0e+07 2.7e+04 0.0e+00 5 0 1 3 0 5 0 6 9 0 0 > SFBcastOpBegin 107 1.0 1.5770e+00452.5 0.00e+00 0.0 2.1e+07 1.8e+04 0.0e+00 0 0 1 2 0 0 0 6 6 0 0 > SFBcastOpEnd 107 1.0 2.9430e+03 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 80 0 0 0 0 82 0 0 0 0 0 > SFDistSection 9 1.0 4.4325e+02 1.5 0.00e+00 0.0 2.8e+06 1.1e+04 9.0e+00 11 0 0 0 0 11 0 1 1 1 0 > SFSectionSF 11 1.0 2.3898e+02 4.7 0.00e+00 0.0 9.2e+05 1.7e+05 0.0e+00 5 0 0 1 0 5 0 0 2 0 0 > >> On Mar 7, 2021, at 7:35 AM, Mark Adams > wrote: >> >> And this data puts one cell per process, distributes, and then refines 5 (or 2,3,4 in plot) times. >> >> On Sun, Mar 7, 2021 at 8:27 AM Mark Adams > wrote: >> FWIW, Here is the output from ex13 on 32K processes (8K Fugaku nodes/sockets, 4 MPI/node, which seems recommended) with 128^3 vertex mesh (64^3 Q2 3D Laplacian). >> Almost an hour. >> Attached is solver scaling. >> >> >> 0 SNES Function norm 3.658334849208e+00 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> 1 SNES Function norm 1.609000373074e-12 >> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> ************************************************************************************************************************ >> *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** >> ************************************************************************************************************************ >> >> ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- >> >> ../ex13 on a named i07-4008c with 32768 processors, by a04199 Fri Feb 12 23:27:13 2021 >> Using Petsc Development GIT revision: v3.14.4-579-g4cb72fa GIT Date: 2021-02-05 15:19:40 +0000 >> >> Max Max/Min Avg Total >> Time (sec): 3.373e+03 1.000 3.373e+03 >> Objects: 1.055e+05 14.797 7.144e+03 >> Flop: 5.376e+10 1.176 4.885e+10 1.601e+15 >> Flop/sec: 1.594e+07 1.176 1.448e+07 4.745e+11 >> MPI Messages: 6.048e+05 30.010 8.833e+04 2.894e+09 >> MPI Message Lengths: 1.127e+09 4.132 6.660e+03 1.928e+13 >> MPI Reductions: 1.824e+03 1.000 >> >> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) >> e.g., VecAXPY() for real vectors of length N --> 2N flop >> and VecAXPY() for complex vectors of length N --> 8N flop >> >> Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- >> Avg %Total Avg %Total Count %Total Avg %Total Count %Total >> 0: Main Stage: 3.2903e+03 97.5% 2.4753e+14 15.5% 3.538e+08 12.2% 1.779e+04 32.7% 9.870e+02 54.1% >> 1: PCSetUp: 4.3062e+01 1.3% 1.8160e+13 1.1% 1.902e+07 0.7% 3.714e+04 3.7% 1.590e+02 8.7% >> 2: KSP Solve only: 3.9685e+01 1.2% 1.3349e+15 83.4% 2.522e+09 87.1% 4.868e+03 63.7% 6.700e+02 36.7% >> >> ------------------------------------------------------------------------------------------------------------------------ >> See the 'Profiling' chapter of the users' manual for details on interpreting output. >> Phase summary info: >> Count: number of times phase was executed >> Time and Flop: Max - maximum over all processors >> Ratio - ratio of maximum to minimum over all processors >> Mess: number of messages sent >> AvgLen: average message length (bytes) >> Reduct: number of global reductions >> Global: entire computation >> Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
>> %T - percent time in this phase %F - percent flop in this phase >> %M - percent messages in this phase %L - percent message lengths in this phase >> %R - percent reductions in this phase >> Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) >> ------------------------------------------------------------------------------------------------------------------------ >> Event Count Time (sec) Flop --- Global --- --- Stage ---- Total >> Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> ------------------------------------------------------------------------------------------------------------------------ >> >> --- Event Stage 0: Main Stage >> >> PetscBarrier 5 1.0 1.9907e+00 2.2 0.00e+00 0.0 3.8e+06 7.7e+01 2.0e+01 0 0 0 0 1 0 0 1 0 2 0 >> BuildTwoSided 62 1.0 7.3272e+0214.1 0.00e+00 0.0 6.7e+06 8.0e+00 0.0e+00 5 0 0 0 0 5 0 2 0 0 0 >> BuildTwoSidedF 59 1.0 3.1132e+01 7.4 0.00e+00 0.0 4.8e+06 2.5e+05 0.0e+00 0 0 0 6 0 0 0 1 19 0 0 >> SNESSolve 1 1.0 1.7468e+02 1.0 7.83e+09 1.3 3.4e+08 1.3e+04 8.8e+02 5 13 12 23 48 5 85 96 70 89 1205779 >> SNESSetUp 1 1.0 2.4195e+01 1.0 0.00e+00 0.0 3.7e+06 3.7e+05 1.3e+01 1 0 0 7 1 1 0 1 22 1 0 >> SNESFunctionEval 3 1.0 1.1359e+01 1.2 1.17e+09 1.0 1.6e+06 1.4e+04 2.0e+00 0 2 0 0 0 0 15 0 0 0 3344744 >> SNESJacobianEval 2 1.0 1.6829e+02 1.0 1.52e+09 1.0 1.1e+06 8.3e+05 0.0e+00 5 3 0 5 0 5 20 0 14 0 293588 >> DMCreateMat 1 1.0 2.4107e+01 1.0 0.00e+00 0.0 3.7e+06 3.7e+05 1.3e+01 1 0 0 7 1 1 0 1 22 1 0 >> Mesh Partition 1 1.0 5.0133e+02 1.0 0.00e+00 0.0 1.3e+05 2.7e+02 6.0e+00 15 0 0 0 0 15 0 0 0 1 0 >> Mesh Migration 1 1.0 1.5494e+03 1.0 0.00e+00 0.0 7.3e+05 1.9e+02 2.4e+01 45 0 0 0 1 46 0 0 0 2 0 >> DMPlexPartSelf 1 1.0 1.1498e+002367.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> DMPlexPartLblInv 1 1.0 3.6698e+00 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> DMPlexPartLblSF 1 1.0 2.8522e-01 1.7 0.00e+00 0.0 4.9e+04 1.5e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> DMPlexPartStrtSF 1 1.0 4.9474e+023520.8 0.00e+00 0.0 3.3e+04 4.3e+02 0.0e+00 14 0 0 0 0 15 0 0 0 0 0 >> DMPlexPointSF 1 1.0 9.8750e+021264.8 0.00e+00 0.0 6.6e+04 5.4e+02 0.0e+00 28 0 0 0 0 29 0 0 0 0 0 >> DMPlexInterp 84 1.0 4.3219e-0158.6 0.00e+00 0.0 0.0e+00 0.0e+00 5.0e+00 0 0 0 0 0 0 0 0 0 1 0 >> DMPlexDistribute 1 1.0 3.0000e+03 1.5 0.00e+00 0.0 9.3e+05 2.3e+02 3.0e+01 88 0 0 0 2 90 0 0 0 3 0 >> DMPlexDistCones 1 1.0 1.0688e+03 2.6 0.00e+00 0.0 1.8e+05 3.1e+02 1.0e+00 31 0 0 0 0 31 0 0 0 0 0 >> DMPlexDistLabels 1 1.0 2.9172e+02 1.0 0.00e+00 0.0 3.1e+05 1.9e+02 2.1e+01 9 0 0 0 1 9 0 0 0 2 0 >> DMPlexDistField 1 1.0 1.8688e+02 1.2 0.00e+00 0.0 2.1e+05 9.3e+01 1.0e+00 5 0 0 0 0 5 0 0 0 0 0 >> DMPlexStratify 118 1.0 6.2852e+023280.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.6e+01 1 0 0 0 1 1 0 0 0 2 0 >> DMPlexSymmetrize 118 1.0 6.7634e-02 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> DMPlexPrealloc 1 1.0 2.3741e+01 1.0 0.00e+00 0.0 3.7e+06 3.7e+05 1.1e+01 1 0 0 7 1 1 0 1 22 1 0 >> DMPlexResidualFE 3 1.0 1.0634e+01 1.2 1.16e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 15 0 0 0 3569848 >> DMPlexJacobianFE 2 1.0 1.6809e+02 1.0 1.51e+09 1.0 6.5e+05 1.4e+06 0.0e+00 5 3 0 5 0 5 20 0 14 0 293801 >> SFSetGraph 87 1.0 2.7673e-03 3.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFSetUp 62 1.0 7.3283e+0213.6 0.00e+00 0.0 2.0e+07 2.7e+04 0.0e+00 5 0 1 3 0 5 0 6 9 0 0 >> SFBcastOpBegin 107 1.0 1.5770e+00452.5 0.00e+00 0.0 2.1e+07 1.8e+04 0.0e+00 0 0 1 2 0 0 0 6 6 0 0 >> SFBcastOpEnd 
107 1.0 2.9430e+03 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 80 0 0 0 0 82 0 0 0 0 0 >> SFReduceBegin 12 1.0 2.4825e-01172.8 0.00e+00 0.0 2.4e+06 2.0e+05 0.0e+00 0 0 0 2 0 0 0 1 8 0 0 >> SFReduceEnd 12 1.0 3.8286e+014865.8 3.74e+04 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 31 >> SFFetchOpBegin 2 1.0 2.4497e-0390.2 0.00e+00 0.0 4.3e+05 3.5e+05 0.0e+00 0 0 0 1 0 0 0 0 2 0 0 >> SFFetchOpEnd 2 1.0 6.1349e-0210.9 0.00e+00 0.0 4.3e+05 3.5e+05 0.0e+00 0 0 0 1 0 0 0 0 2 0 0 >> SFCreateEmbed 3 1.0 3.6800e+013261.5 0.00e+00 0.0 4.7e+05 1.7e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFDistSection 9 1.0 4.4325e+02 1.5 0.00e+00 0.0 2.8e+06 1.1e+04 9.0e+00 11 0 0 0 0 11 0 1 1 1 0 >> SFSectionSF 11 1.0 2.3898e+02 4.7 0.00e+00 0.0 9.2e+05 1.7e+05 0.0e+00 5 0 0 1 0 5 0 0 2 0 0 >> SFRemoteOff 2 1.0 3.2868e-0143.1 0.00e+00 0.0 8.7e+05 8.2e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFPack 1023 1.0 2.5215e-0176.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFUnpack 1025 1.0 5.1600e-0216.8 5.62e+0521.3 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 54693 >> MatMult 1549525.4 3.4810e+00 1.3 4.35e+09 1.1 2.2e+08 6.1e+03 0.0e+00 0 8 8 7 0 0 54 62 21 0 38319208 >> MatMultAdd 132 1.0 6.9168e-01 3.0 7.97e+07 1.2 2.8e+07 4.6e+02 0.0e+00 0 0 1 0 0 0 1 8 0 0 3478717 >> MatMultTranspose 132 1.0 5.9967e-01 1.6 8.00e+07 1.2 3.0e+07 4.5e+02 0.0e+00 0 0 1 0 0 0 1 9 0 0 4015214 >> MatSolve 22 0.0 6.8431e-04 0.0 7.41e+05 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1082 >> MatLUFactorSym 1 1.0 5.9569e-0433.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatLUFactorNum 1 1.0 1.6236e-03773.2 1.46e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 897 >> MatConvert 6 1.0 1.4290e-01 1.2 0.00e+00 0.0 3.0e+06 3.7e+03 0.0e+00 0 0 0 0 0 0 0 1 0 0 0 >> MatScale 18 1.0 3.7962e-01 1.3 4.11e+07 1.2 2.0e+06 5.5e+03 0.0e+00 0 0 0 0 0 0 0 1 0 0 3253392 >> MatResidual 132 1.0 6.8256e-01 1.4 8.27e+08 1.2 4.4e+07 5.5e+03 0.0e+00 0 2 2 1 0 0 10 13 4 0 36282014 >> MatAssemblyBegin 244 1.0 3.1181e+01 6.6 0.00e+00 0.0 4.8e+06 2.5e+05 0.0e+00 0 0 0 6 0 0 0 1 19 0 0 >> MatAssemblyEnd 244 1.0 6.3232e+00 1.9 3.17e+06 6.9 0.0e+00 0.0e+00 1.4e+02 0 0 0 0 8 0 0 0 0 15 7655 >> MatGetRowIJ 1 0.0 2.5780e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatCreateSubMat 10 1.0 1.5162e+00 1.0 0.00e+00 0.0 1.6e+05 3.4e+05 1.3e+02 0 0 0 0 7 0 0 0 1 13 0 >> MatGetOrdering 1 0.0 1.0899e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatCoarsen 6 1.0 3.5837e-01 1.3 0.00e+00 0.0 1.6e+07 1.2e+04 3.9e+01 0 0 1 1 2 0 0 5 3 4 0 >> MatZeroEntries 8 1.0 5.3730e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatAXPY 6 1.0 2.6245e-01 1.1 2.66e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 33035 >> MatTranspose 12 1.0 3.0731e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatMatMultSym 18 1.0 2.1398e+00 1.4 0.00e+00 0.0 6.1e+06 5.5e+03 4.8e+01 0 0 0 0 3 0 0 2 1 5 0 >> MatMatMultNum 6 1.0 1.1243e+00 1.0 3.76e+07 1.2 2.0e+06 5.5e+03 0.0e+00 0 0 0 0 0 0 0 1 0 0 1001203 >> MatPtAPSymbolic 6 1.0 1.7280e+01 1.0 0.00e+00 0.0 1.2e+07 3.2e+04 4.2e+01 1 0 0 2 2 1 0 3 6 4 0 >> MatPtAPNumeric 6 1.0 1.8047e+01 1.0 1.49e+09 5.1 2.8e+06 1.1e+05 2.4e+01 1 1 0 2 1 1 5 1 5 2 663675 >> MatTrnMatMultSym 1 1.0 3.0221e+01 1.0 0.00e+00 0.0 2.4e+06 5.8e+05 1.1e+01 1 0 0 7 1 1 0 1 22 1 0 >> MatGetLocalMat 19 1.0 1.3904e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatGetBrAoCol 18 1.0 1.9926e-01 5.0 0.00e+00 0.0 1.4e+07 2.3e+04 0.0e+00 0 0 0 2 0 0 0 4 5 0 0 >> 
MatGetSymTrans 2 1.0 1.8996e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecTDot 176 1.0 7.0632e-01 4.5 3.48e+07 1.0 0.0e+00 0.0e+00 1.8e+02 0 0 0 0 10 0 0 0 0 18 1608728 >> VecNorm 60 1.0 1.4074e+0012.2 1.58e+07 1.0 0.0e+00 0.0e+00 6.0e+01 0 0 0 0 3 0 0 0 0 6 366467 >> VecCopy 422 1.0 5.1259e-02 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecSet 653 1.0 2.3974e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecAXPY 165 1.0 6.5622e-03 1.3 3.42e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 170485467 >> VecAYPX 861 1.0 7.8529e-02 1.2 6.21e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 25785252 >> VecAXPBYCZ 264 1.0 4.1343e-02 1.5 5.85e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 46135592 >> VecAssemblyBegin 21 1.0 2.3463e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecAssemblyEnd 21 1.0 1.4457e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecPointwiseMult 600 1.0 5.7510e-02 1.2 2.66e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 15075754 >> VecScatterBegin 902 1.0 5.1188e-01 1.2 0.00e+00 0.0 2.9e+08 5.3e+03 0.0e+00 0 0 10 8 0 0 0 82 25 0 0 >> VecScatterEnd 902 1.0 1.2143e+00 3.2 5.50e+0537.9 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1347 >> VecSetRandom 6 1.0 2.6354e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> DualSpaceSetUp 7 1.0 5.3467e-0112.0 4.26e+03 1.0 0.0e+00 0.0e+00 1.3e+01 0 0 0 0 1 0 0 0 0 1 261 >> FESetUp 7 1.0 1.7541e-01128.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPSetUp 15 1.0 2.7470e-01 1.1 2.04e+08 1.2 1.0e+07 5.5e+03 1.3e+02 0 0 0 0 7 0 2 3 1 13 22477233 >> KSPSolve 1 1.0 4.3257e+00 1.0 4.33e+09 1.1 2.5e+08 4.8e+03 6.6e+01 0 8 9 6 4 0 54 72 20 7 30855976 >> PCGAMGGraph_AGG 6 1.0 5.0969e+00 1.0 3.76e+07 1.2 5.1e+06 4.4e+03 4.8e+01 0 0 0 0 3 0 0 1 0 5 220852 >> PCGAMGCoarse_AGG 6 1.0 3.1121e+01 1.0 0.00e+00 0.0 2.5e+07 6.9e+04 5.5e+01 1 0 1 9 3 1 0 7 27 6 0 >> PCGAMGProl_AGG 6 1.0 5.8196e-01 1.0 0.00e+00 0.0 6.6e+06 9.3e+03 7.2e+01 0 0 0 0 4 0 0 2 1 7 0 >> PCGAMGPOpt_AGG 6 1.0 3.2414e+00 1.0 2.42e+08 1.2 2.1e+07 5.3e+03 1.6e+02 0 0 1 1 9 0 3 6 2 17 2256493 >> GAMG: createProl 6 1.0 4.0042e+01 1.0 2.80e+08 1.2 5.8e+07 3.3e+04 3.4e+02 1 1 2 10 19 1 3 16 31 34 210778 >> Graph 12 1.0 5.0926e+00 1.0 3.76e+07 1.2 5.1e+06 4.4e+03 4.8e+01 0 0 0 0 3 0 0 1 0 5 221038 >> MIS/Agg 6 1.0 3.5850e-01 1.3 0.00e+00 0.0 1.6e+07 1.2e+04 3.9e+01 0 0 1 1 2 0 0 5 3 4 0 >> SA: col data 6 1.0 3.0509e-01 1.0 0.00e+00 0.0 5.4e+06 9.2e+03 2.4e+01 0 0 0 0 1 0 0 2 1 2 0 >> SA: frmProl0 6 1.0 2.3467e-01 1.1 0.00e+00 0.0 1.3e+06 9.5e+03 2.4e+01 0 0 0 0 1 0 0 0 0 2 0 >> SA: smooth 6 1.0 2.7855e+00 1.0 4.14e+07 1.2 8.1e+06 5.5e+03 6.3e+01 0 0 0 0 3 0 1 2 1 6 446491 >> GAMG: partLevel 6 1.0 3.7266e+01 1.0 1.49e+09 5.1 1.5e+07 4.9e+04 3.2e+02 1 1 1 4 17 1 5 4 12 32 321395 >> repartition 5 1.0 2.0343e+00 1.1 0.00e+00 0.0 4.0e+05 1.4e+05 2.5e+02 0 0 0 0 14 0 0 0 1 25 0 >> Invert-Sort 5 1.0 1.5021e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+01 0 0 0 0 2 0 0 0 0 3 0 >> Move A 5 1.0 1.1548e+00 1.0 0.00e+00 0.0 1.6e+05 3.4e+05 7.0e+01 0 0 0 0 4 0 0 0 1 7 0 >> Move P 5 1.0 4.2799e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 7.5e+01 0 0 0 0 4 0 0 0 0 8 0 >> PCGAMG Squ l00 1 1.0 3.0221e+01 1.0 0.00e+00 0.0 2.4e+06 5.8e+05 1.1e+01 1 0 0 7 1 1 0 1 22 1 0 >> PCGAMG Gal l00 1 1.0 8.7411e+00 1.0 2.93e+08 1.1 5.4e+06 4.5e+04 1.2e+01 0 1 0 1 1 0 4 2 4 1 1092355 >> PCGAMG Opt l00 1 1.0 1.9734e+00 1.0 3.36e+07 1.1 3.2e+06 1.2e+04 9.0e+00 0 0 0 0 0 0 0 
1 1 1 555327 >> PCGAMG Gal l01 1 1.0 1.0153e+00 1.0 3.50e+07 1.4 5.9e+06 3.9e+04 1.2e+01 0 0 0 1 1 0 0 2 4 1 1079887 >> PCGAMG Opt l01 1 1.0 7.4812e-02 1.0 5.35e+05 1.2 3.2e+06 1.1e+03 9.0e+00 0 0 0 0 0 0 0 1 0 1 232542 >> PCGAMG Gal l02 1 1.0 1.8063e+00 1.0 7.43e+07 0.0 3.0e+06 5.9e+04 1.2e+01 0 0 0 1 1 0 0 1 3 1 593392 >> PCGAMG Opt l02 1 1.0 1.1580e-01 1.1 6.93e+05 0.0 1.6e+06 1.3e+03 9.0e+00 0 0 0 0 0 0 0 0 0 1 93213 >> PCGAMG Gal l03 1 1.0 6.1075e+00 1.0 2.72e+08 0.0 2.6e+05 9.2e+04 1.1e+01 0 0 0 0 1 0 0 0 0 1 36155 >> PCGAMG Opt l03 1 1.0 8.0836e-02 1.0 1.55e+06 0.0 1.4e+05 1.4e+03 8.0e+00 0 0 0 0 0 0 0 0 0 1 18229 >> PCGAMG Gal l04 1 1.0 1.6203e+01 1.0 9.44e+08 0.0 1.4e+04 3.0e+05 1.1e+01 0 0 0 0 1 0 0 0 0 1 2366 >> PCGAMG Opt l04 1 1.0 1.2663e-01 1.0 2.01e+06 0.0 6.9e+03 2.2e+03 8.0e+00 0 0 0 0 0 0 0 0 0 1 817 >> PCGAMG Gal l05 1 1.0 1.4800e+00 1.0 3.16e+08 0.0 9.0e+01 1.6e+05 1.1e+01 0 0 0 0 1 0 0 0 0 1 796 >> PCGAMG Opt l05 1 1.0 8.1763e-02 1.1 2.50e+06 0.0 4.8e+01 4.6e+03 8.0e+00 0 0 0 0 0 0 0 0 0 1 114 >> PCSetUp 2 1.0 7.7969e+01 1.0 1.97e+09 2.8 8.3e+07 3.3e+04 8.1e+02 2 2 3 14 44 2 11 23 43 82 341051 >> PCSetUpOnBlocks 22 1.0 2.4609e-0317.2 1.46e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 592 >> PCApply 22 1.0 3.6455e+00 1.1 3.57e+09 1.2 2.4e+08 4.3e+03 0.0e+00 0 7 8 5 0 0 43 67 16 0 29434967 >> >> --- Event Stage 1: PCSetUp >> >> BuildTwoSided 4 1.0 1.5980e-01 2.7 0.00e+00 0.0 2.1e+05 8.0e+00 0.0e+00 0 0 0 0 0 0 0 1 0 0 0 >> BuildTwoSidedF 6 1.0 1.3169e+01 5.5 0.00e+00 0.0 1.9e+06 1.9e+05 0.0e+00 0 0 0 2 0 28 0 10 51 0 0 >> SFSetGraph 5 1.0 4.9640e-0519.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFSetUp 4 1.0 1.6038e-01 2.3 0.00e+00 0.0 6.4e+05 9.1e+02 0.0e+00 0 0 0 0 0 0 0 3 0 0 0 >> SFPack 30 1.0 3.3376e-04 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFUnpack 30 1.0 1.2101e-05 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatMult 30 1.0 1.5544e-01 1.5 1.87e+08 1.2 1.0e+07 5.5e+03 0.0e+00 0 0 0 0 0 0 31 53 8 0 35930640 >> MatAssemblyBegin 43 1.0 1.3201e+01 4.7 0.00e+00 0.0 1.9e+06 1.9e+05 0.0e+00 0 0 0 2 0 28 0 10 51 0 0 >> MatAssemblyEnd 43 1.0 1.1159e+01 1.0 2.77e+07705.7 0.0e+00 0.0e+00 2.0e+01 0 0 0 0 1 26 0 0 0 13 1036 >> MatZeroEntries 6 1.0 4.7315e-0410.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatTranspose 12 1.0 2.5142e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatMatMultSym 10 1.0 5.8783e-0117.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatPtAPSymbolic 5 1.0 1.4489e+01 1.0 0.00e+00 0.0 6.2e+06 3.6e+04 3.5e+01 0 0 0 1 2 34 0 32 31 22 0 >> MatPtAPNumeric 6 1.0 2.8457e+01 1.0 1.50e+09 5.1 2.7e+06 1.6e+05 2.0e+01 1 1 0 2 1 66 66 14 61 13 421190 >> MatGetLocalMat 6 1.0 9.8574e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatGetBrAoCol 6 1.0 3.7669e-01 2.3 0.00e+00 0.0 5.1e+06 3.8e+04 0.0e+00 0 0 0 1 0 0 0 27 28 0 0 >> VecTDot 66 1.0 6.5271e-02 4.1 5.85e+06 1.0 0.0e+00 0.0e+00 6.6e+01 0 0 0 0 4 0 1 0 0 42 2922260 >> VecNorm 36 1.0 1.1226e-02 3.2 3.19e+06 1.0 0.0e+00 0.0e+00 3.6e+01 0 0 0 0 2 0 1 0 0 23 9268067 >> VecCopy 12 1.0 1.2805e-03 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecSet 11 1.0 6.6620e-05 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecAXPY 60 1.0 1.0763e-03 1.5 5.32e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 161104914 >> VecAYPX 24 1.0 2.0581e-03 1.3 2.13e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 33701038 >> VecPointwiseMult 36 
1.0 3.5709e-03 1.3 1.60e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 14567861 >> VecScatterBegin 30 1.0 2.9079e-03 7.8 0.00e+00 0.0 1.0e+07 5.5e+03 0.0e+00 0 0 0 0 0 0 0 53 8 0 0 >> VecScatterEnd 30 1.0 3.7015e-0263.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPSetUp 7 1.0 2.3165e-01 1.0 2.04e+08 1.2 1.0e+07 5.5e+03 1.0e+02 0 0 0 0 6 1 34 53 8 64 26654598 >> PCGAMG Gal l00 1 1.0 4.7415e+00 1.0 2.94e+08 1.1 1.8e+06 7.8e+04 0.0e+00 0 1 0 1 0 11 53 9 20 0 2015623 >> PCGAMG Gal l01 1 1.0 1.2103e+00 1.0 3.50e+07 1.4 4.8e+06 6.2e+04 1.2e+01 0 0 0 2 1 3 6 25 41 8 905938 >> PCGAMG Gal l02 1 1.0 3.4334e+00 1.0 7.41e+07 0.0 2.2e+06 8.7e+04 1.2e+01 0 0 0 1 1 8 6 11 27 8 312184 >> PCGAMG Gal l03 1 1.0 9.6062e+00 1.0 2.71e+08 0.0 1.9e+05 1.3e+05 1.1e+01 0 0 0 0 1 22 1 1 4 7 22987 >> PCGAMG Gal l04 1 1.0 2.2482e+01 1.0 9.43e+08 0.0 8.7e+03 4.8e+05 1.1e+01 1 0 0 0 1 52 0 0 1 7 1705 >> PCGAMG Gal l05 1 1.0 1.5961e+00 1.1 3.16e+08 0.0 6.8e+01 2.2e+05 1.1e+01 0 0 0 0 1 4 0 0 0 7 738 >> PCSetUp 1 1.0 4.3191e+01 1.0 1.70e+09 3.6 1.9e+07 3.7e+04 1.6e+02 1 1 1 4 9 100100100100100 420463 >> >> --- Event Stage 2: KSP Solve only >> >> SFPack 8140 1.0 7.4247e-02 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFUnpack 8140 1.0 1.2905e-02 5.2 5.50e+0637.9 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1267207 >> MatMult 5500 1.0 2.9994e+01 1.2 3.98e+10 1.1 2.0e+09 6.1e+03 0.0e+00 1 76 68 62 0 70 92 78 98 0 40747181 >> MatMultAdd 1320 1.0 6.2192e+00 2.7 7.97e+08 1.2 2.8e+08 4.6e+02 0.0e+00 0 2 10 1 0 14 2 11 1 0 3868976 >> MatMultTranspose 1320 1.0 4.0304e+00 1.7 8.00e+08 1.2 2.8e+08 4.6e+02 0.0e+00 0 2 10 1 0 7 2 11 1 0 5974153 >> MatSolve 220 0.0 6.7366e-03 0.0 7.41e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1100 >> MatLUFactorSym 1 1.0 5.8691e-0435.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatLUFactorNum 1 1.0 1.5955e-03756.2 1.46e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 913 >> MatResidual 1320 1.0 6.4920e+00 1.3 8.27e+09 1.2 4.4e+08 5.5e+03 0.0e+00 0 15 15 13 0 14 19 18 20 0 38146350 >> MatGetRowIJ 1 0.0 2.7820e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatGetOrdering 1 0.0 9.6940e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecTDot 440 1.0 4.6162e+00 6.9 2.31e+08 1.0 0.0e+00 0.0e+00 4.4e+02 0 0 0 0 24 5 1 0 0 66 1635124 >> VecNorm 230 1.0 3.9605e-02 1.6 1.21e+08 1.0 0.0e+00 0.0e+00 2.3e+02 0 0 0 0 13 0 0 0 0 34 99622387 >> VecCopy 3980 1.0 5.4166e-01 4.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecSet 4640 1.0 1.4216e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecAXPY 440 1.0 4.2829e-02 1.3 2.31e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 176236363 >> VecAYPX 8130 1.0 7.3998e-01 1.2 5.78e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 2 1 0 0 0 25489392 >> VecAXPBYCZ 2640 1.0 3.9974e-01 1.5 5.85e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 1 1 0 0 0 47716315 >> VecPointwiseMult 5280 1.0 5.9845e-01 1.5 2.34e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 1 0 0 0 12748927 >> VecScatterBegin 8140 1.0 4.9231e-01 5.9 0.00e+00 0.0 2.5e+09 4.9e+03 0.0e+00 0 0 87 64 0 1 0100100 0 0 >> VecScatterEnd 8140 1.0 1.0172e+01 3.6 5.50e+0637.9 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 13 0 0 0 0 1608 >> KSPSetUp 1 1.0 9.5996e-07 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPSolve 10 1.0 3.9685e+01 1.0 4.33e+10 1.1 2.5e+09 4.9e+03 6.7e+02 1 83 87 64 37 100100100100100 33637495 >> PCSetUp 1 1.0 2.4149e-0318.1 1.46e+06 0.0 0.0e+00 0.0e+00 
0.0e+00 0 0 0 0 0 0 0 0 0 0 603 >> PCSetUpOnBlocks 220 1.0 2.6945e-03 8.9 1.46e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 540 >> PCApply 220 1.0 3.2921e+01 1.1 3.57e+10 1.2 2.3e+09 4.3e+03 0.0e+00 1 67 81 53 0 81 80 93 82 0 32595360 >> ------------------------------------------------------------------------------------------------------------------------ >> >> Memory usage is given in bytes: >> >> Object Type Creations Destructions Memory Descendants' Mem. >> Reports information only for process 0. >> >> --- Event Stage 0: Main Stage >> >> Container 112 112 69888 0. >> SNES 1 1 1532 0. >> DMSNES 1 1 720 0. >> Distributed Mesh 449 449 30060888 0. >> DM Label 790 790 549840 0. >> Quadrature 579 579 379824 0. >> Index Set 100215 100210 361926232 0. >> IS L to G Mapping 8 13 4356552 0. >> Section 771 771 598296 0. >> Star Forest Graph 897 897 1053640 0. >> Discrete System 521 521 533512 0. >> GraphPartitioner 118 118 91568 0. >> Matrix 432 462 2441805304 0. >> Matrix Coarsen 6 6 4032 0. >> Vector 354 354 65492968 0. >> Linear Space 7 7 5208 0. >> Dual Space 111 111 113664 0. >> FE Space 7 7 5992 0. >> Field over DM 6 6 4560 0. >> Krylov Solver 21 21 37560 0. >> DMKSP interface 1 1 704 0. >> Preconditioner 21 21 21632 0. >> Viewer 2 1 896 0. >> PetscRandom 12 12 8520 0. >> >> --- Event Stage 1: PCSetUp >> >> Index Set 10 15 85367336 0. >> IS L to G Mapping 5 0 0 0. >> Star Forest Graph 5 5 6600 0. >> Matrix 50 20 73134024 0. >> Vector 28 28 6235096 0. >> >> --- Event Stage 2: KSP Solve only >> >> Index Set 5 5 8296 0. >> Matrix 1 1 273856 0. >> ======================================================================================================================== >> Average time to get PetscTime(): 6.40051e-08 >> Average time for MPI_Barrier(): 8.506e-06 >> Average time for zero size MPI_Send(): 6.6027e-06 >> #PETSc Option Table entries: >> -benchmark_it 10 >> -dm_distribute >> -dm_plex_box_dim 3 >> -dm_plex_box_faces 32,32,32 >> -dm_plex_box_lower 0,0,0 >> -dm_plex_box_simplex 0 >> -dm_plex_box_upper 1,1,1 >> -dm_refine 5 >> -ksp_converged_reason >> -ksp_max_it 150 >> -ksp_norm_type unpreconditioned >> -ksp_rtol 1.e-12 >> -ksp_type cg >> -log_view >> -matptap_via scalable >> -mg_levels_esteig_ksp_max_it 5 >> -mg_levels_esteig_ksp_type cg >> -mg_levels_ksp_max_it 2 >> -mg_levels_ksp_type chebyshev >> -mg_levels_pc_type jacobi >> -pc_gamg_agg_nsmooths 1 >> -pc_gamg_coarse_eq_limit 2000 >> -pc_gamg_coarse_grid_layout_type spread >> -pc_gamg_esteig_ksp_max_it 5 >> -pc_gamg_esteig_ksp_type cg >> -pc_gamg_process_eq_limit 500 >> -pc_gamg_repartition false >> -pc_gamg_reuse_interpolation true >> -pc_gamg_square_graph 1 >> -pc_gamg_threshold 0.01 >> -pc_gamg_threshold_scale .5 >> -pc_gamg_type agg >> -pc_type gamg >> -petscpartitioner_simple_node_grid 8,8,8 >> -petscpartitioner_simple_process_grid 4,4,4 >> -petscpartitioner_type simple >> -potential_petscspace_degree 2 >> -snes_converged_reason >> -snes_max_it 1 >> -snes_monitor >> -snes_rtol 1.e-8 >> -snes_type ksponly >> #End of PETSc Option Table entries >> Compiled without FORTRAN kernels >> Compiled with 64 bit PetscInt >> Compiled with full precision matrices (default) >> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8 >> Configure options: CC=mpifccpx CXX=mpiFCCpx CFLAGS="-L /opt/FJSVxtclanga/tcsds-1.2.29/lib64 -lfjlapack" CXXFLAGS="-L /opt/FJSVxtclanga/tcsds-1.2.29/lib64 -lfjlapack" COPTFLAGS=-Kfast CXXOPTFLAGS=-Kfast --with-fc=0 
--package-prefix-hash=/home/ra010009/a04199/petsc-hash-pkgs --with-batch=1 --with-shared-libraries=yes --with-debugging=no --with-64-bit-indices=1 PETSC_ARCH=arch-fugaku-fujitsu >> ----------------------------------------- >> Libraries compiled on 2021-02-12 02:27:41 on fn01sv08 >> Machine characteristics: Linux-3.10.0-957.27.2.el7.x86_64-x86_64-with-redhat-7.6-Maipo >> Using PETSc directory: /home/ra010009/a04199/petsc >> Using PETSc arch: >> ----------------------------------------- >> >> Using C compiler: mpifccpx -L /opt/FJSVxtclanga/tcsds-1.2.29/lib64 -lfjlapack -fPIC -Kfast >> ----------------------------------------- >> >> Using include paths: -I/home/ra010009/a04199/petsc/include -I/home/ra010009/a04199/petsc/arch-fugaku-fujitsu/include >> ----------------------------------------- >> >> Using C linker: mpifccpx >> Using libraries: -Wl,-rpath,/home/ra010009/a04199/petsc/lib -L/home/ra010009/a04199/petsc/lib -lpetsc -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/lib/gcc/aarch64-linux-gnu/8 -L/opt/FJSVxos/devkit/aarch64/lib/gcc/aarch64-linux-gnu/8 -Wl,-rpath,/opt/FJSVxtclanga/tcsds-1.2.29/lib64 -L/opt/FJSVxtclanga/tcsds-1.2.29/lib64 -Wl,-rpath,/opt/FJSVxtclanga/.common/MELI022/lib64 -L/opt/FJSVxtclanga/.common/MELI022/lib64 -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/aarch64-linux-gnu/lib64 -L/opt/FJSVxos/devkit/aarch64/aarch64-linux-gnu/lib64 -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/rfs/usr/lib64 -L/opt/FJSVxos/devkit/aarch64/rfs/usr/lib64 -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/rfs/opt/FJSVxos/mmm/lib64 -L/opt/FJSVxos/devkit/aarch64/rfs/opt/FJSVxos/mmm/lib64 -Wl,-rpath,/opt/FJSVxtclanga/tcsds-1.2.29/lib64/nofjobj -L/opt/FJSVxtclanga/tcsds-1.2.29/lib64/nofjobj -lX11 -lfjprofmpi -lfjlapack -ldl -lmpi_cxx -lmpi -lfjstring_internal -lfj90i -lfj90fmt_sve -lfj90f -lfjsrcinfo -lfjcrt -lfjprofcore -lfjprofomp -lfjc++ -lfjc++abi -lfjdemgl -lmpg -lm -lrt -lpthread -lelf -lz -lgcc_s -ldl >> ----------------------------------------- >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.barral at math.u-bordeaux.fr Sun Mar 7 15:13:38 2021 From: nicolas.barral at math.u-bordeaux.fr (Nicolas Barral) Date: Sun, 7 Mar 2021 22:13:38 +0100 Subject: [petsc-users] DMPlex tetrahedra facets orientation In-Reply-To: References: Message-ID: <8f788afe-01fd-98db-957c-450cd43a18a8@math.u-bordeaux.fr> On 07/03/2021 16:54, Matthew Knepley wrote: > On Sun, Mar 7, 2021 at 8:52 AM Nicolas Barral > > wrote: > > Matt, > > Thanks for your answer. > > However, DMPlexComputeCellGeometryFVM does not compute what I need > (normals of height 1 entities). I can't find any function doing > that, is > there one ? > > > The normal[] in?DMPlexComputeCellGeometryFVM() is exactly what you want. > What does not look right to you? So it turns out it's not what I want because I need non-normalized normals. It doesn't seem like I can easily retrieve the norm, can I? If not, I'll fallback to computing them by hand for now. Is the following assumption safe or do I have to use DMPlexGetOrientedFace? > if I call P0P1P2P3 a tet and note x the cross product, > P3P2xP3P1 is the outward normal to face P1P2P3 > P0P2xP0P3 " P0P2P3 > P3P1xP3P0 " P0P1P3 > P0P1xP0P2 " P0P1P2 Thanks -- Nicolas > > ? Thanks, > > ? ? Matt > > So far I've been doing it by hand, and after a lot of experimenting the > past weeks, it seems that if I call P0P1P2P3 a tetrahedron and note x > the cross product, > P3P2xP3P1 is the outward normal to face P1P2P3 > P0P2xP0P3? ? ? ? ? ? ? "? ? ? ? ? ? ? ? P0P2P3 > P3P1xP3P0? ? ? ? ? ? ? "? ? ? ? ? ? 
? ? P0P1P3 > P0P1xP0P2? ? ? ? ? ? ? "? ? ? ? ? ? ? ? P0P1P2 > Have I been lucky but can't expect it to be true ? > > (Alternatively, there is a link between the normals and the element > Jacobian, but I don't know the formula and can? find them) > > > Thanks, > > -- > Nicolas > > On 08/02/2021 15:19, Matthew Knepley wrote: > > On Mon, Feb 8, 2021 at 6:01 AM Nicolas Barral > > > > >> wrote: > > > >? ? ?Hi all, > > > >? ? ?Can I make any assumption on the orientation of triangular > facets in a > >? ? ?tetrahedral plex ? I need the inward facet normals. Do I need > to use > >? ? ?DMPlexGetOrientedFace or can I rely on either the tet vertices > >? ? ?ordering, > >? ? ?or the faces ordering ? Could DMPlexGetRawFaces_Internal be > enough ? > > > > > > You can do it by hand, but you have to account for the face > orientation > > relative to the cell. That is what > > DMPlexGetOrientedFace() does. I think it would be easier to use the > > function below. > > > >? ? ?Alternatively, is there a function that computes the normals > - without > >? ? ?bringing out the big guns ? > > > > > > This will compute the normals > > > > > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexComputeCellGeometryFVM.html > > Should not be too heavy weight. > > > >? ? THanks, > > > >? ? ? Matt > > > >? ? ?Thanks > > > >? ? ?-- > >? ? ?Nicolas > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From knepley at gmail.com Sun Mar 7 15:30:41 2021 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 7 Mar 2021 16:30:41 -0500 Subject: [petsc-users] DMPlex tetrahedra facets orientation In-Reply-To: <8f788afe-01fd-98db-957c-450cd43a18a8@math.u-bordeaux.fr> References: <8f788afe-01fd-98db-957c-450cd43a18a8@math.u-bordeaux.fr> Message-ID: On Sun, Mar 7, 2021 at 4:13 PM Nicolas Barral < nicolas.barral at math.u-bordeaux.fr> wrote: > On 07/03/2021 16:54, Matthew Knepley wrote: > > On Sun, Mar 7, 2021 at 8:52 AM Nicolas Barral > > > > wrote: > > > > Matt, > > > > Thanks for your answer. > > > > However, DMPlexComputeCellGeometryFVM does not compute what I need > > (normals of height 1 entities). I can't find any function doing > > that, is > > there one ? > > > > > > The normal[] in DMPlexComputeCellGeometryFVM() is exactly what you want. > > What does not look right to you? > > > So it turns out it's not what I want because I need non-normalized > normals. It doesn't seem like I can easily retrieve the norm, can I? > You just want area-weighted normals I think, which means that you just multiply by the area, which comes back in the same function. Thanks, Matt > If not, I'll fallback to computing them by hand for now. Is the > following assumption safe or do I have to use DMPlexGetOrientedFace? 
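A minimal sketch of the route suggested above, before the hand-rolled cross products quoted below: for a height-1 (face) point, DMPlexComputeCellGeometryFVM() returns the face area and the unit normal, so scaling one by the other recovers a non-normalized, area-weighted normal. The function name and loop here are just an illustration; note the sign of the returned normal follows the face's own orientation in the Plex, not any particular cell, so making it inward or outward with respect to a given tet is still up to the caller:

  #include <petscdmplex.h>

  /* Sketch: area-weighted normals for all faces of a 3D plex */
  static PetscErrorCode ComputeAreaWeightedFaceNormals(DM dm)
  {
    PetscInt       f, fStart, fEnd;
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = DMPlexGetHeightStratum(dm, 1, &fStart, &fEnd);CHKERRQ(ierr); /* height-1 points = faces */
    for (f = fStart; f < fEnd; ++f) {
      PetscReal area, centroid[3], normal[3], weighted[3];
      PetscInt  d;

      ierr = DMPlexComputeCellGeometryFVM(dm, f, &area, centroid, normal);CHKERRQ(ierr);
      for (d = 0; d < 3; ++d) weighted[d] = area * normal[d];
      /* weighted[] is the non-normalized face normal; flip its sign if an
         inward/outward direction relative to a specific cell is needed */
    }
    PetscFunctionReturn(0);
  }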
> > if I call P0P1P2P3 a tet and note x the cross product, > > P3P2xP3P1 is the outward normal to face P1P2P3 > > P0P2xP0P3 " P0P2P3 > > P3P1xP3P0 " P0P1P3 > > P0P1xP0P2 " P0P1P2 > > Thanks > > -- > Nicolas > > > > Thanks, > > > > Matt > > > > So far I've been doing it by hand, and after a lot of experimenting > the > > past weeks, it seems that if I call P0P1P2P3 a tetrahedron and note x > > the cross product, > > P3P2xP3P1 is the outward normal to face P1P2P3 > > P0P2xP0P3 " P0P2P3 > > P3P1xP3P0 " P0P1P3 > > P0P1xP0P2 " P0P1P2 > > Have I been lucky but can't expect it to be true ? > > > > (Alternatively, there is a link between the normals and the element > > Jacobian, but I don't know the formula and can find them) > > > > > > Thanks, > > > > -- > > Nicolas > > > > On 08/02/2021 15:19, Matthew Knepley wrote: > > > On Mon, Feb 8, 2021 at 6:01 AM Nicolas Barral > > > > > > > > >> wrote: > > > > > > Hi all, > > > > > > Can I make any assumption on the orientation of triangular > > facets in a > > > tetrahedral plex ? I need the inward facet normals. Do I need > > to use > > > DMPlexGetOrientedFace or can I rely on either the tet vertices > > > ordering, > > > or the faces ordering ? Could DMPlexGetRawFaces_Internal be > > enough ? > > > > > > > > > You can do it by hand, but you have to account for the face > > orientation > > > relative to the cell. That is what > > > DMPlexGetOrientedFace() does. I think it would be easier to use > the > > > function below. > > > > > > Alternatively, is there a function that computes the normals > > - without > > > bringing out the big guns ? > > > > > > > > > This will compute the normals > > > > > > > > > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexComputeCellGeometryFVM.html > > > Should not be too heavy weight. > > > > > > THanks, > > > > > > Matt > > > > > > Thanks > > > > > > -- > > > Nicolas > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > > > experiments is infinitely more interesting than any results to > which > > > their experiments lead. > > > -- Norbert Wiener > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ < > http://www.cse.buffalo.edu/~knepley/> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at buffalo.edu Sun Mar 7 15:44:39 2021 From: knepley at buffalo.edu (Matthew Knepley) Date: Sun, 7 Mar 2021 16:44:39 -0500 Subject: [petsc-users] DMPlex in Firedrake: scaling of mesh distribution In-Reply-To: <54542fc5f5164ed8a08e796881a41073@MBX-LS5.itorg.ad.buffalo.edu> References: <7dde1a8f447e4880b96f3dd7c4b315ae@MBX-LS5.itorg.ad.buffalo.edu> <54542fc5f5164ed8a08e796881a41073@MBX-LS5.itorg.ad.buffalo.edu> Message-ID: On Sun, Mar 7, 2021 at 3:27 PM Stefano Zampini wrote: > Mark > > Being an MPI issue, you should run with -log_sync > From your log the problem seems with SFSetup that is called many times > (62), with timings associated mostly to the SF revealing ranks phase. > DMPlex abuses of the embedded SF, that can be optimized further I > presume. 
It should run (someone has to write the code) a cheaper operation, > since the communication graph of the embedded SF is a subgraph of the > original . > I want understand why calling CreateEmbeddedRootSF() would be an abuse. Right now, it does one SFBcast() and purely local stuff to create a smaller SF. All the Plex calls are contiguous, so we could make it send less data by sending only the updated root bounds, but I don't think that is the problem here. Is it that creating a new SF is expensive and we should rewrite SF so that CreateEmbeddedRootSF() does not call SFSetup()? Matt > On Mar 7, 2021, at 10:01 PM, Barry Smith wrote: > > > Mark, > > Thanks for the numbers. > > Extremely problematic. DMPlexDistribute takes 88 percent of the total > run time, SFBcastOpEnd takes 80 percent. > > Probably Matt is right, PetscSF is flooding the network which it cannot > handle. IMHO fixing PetscSF would be a far better route than writing all > kinds of fancy DMPLEX hierarchical distributors. PetscSF needs to detect > that it is sending too many messages together and do the messaging in > appropriate waves; at the moment PetscSF is as dumb as stone it just shoves > everything out as fast as it can. Junchao needs access to this machine. If > everything in PETSc will depend on PetscSF then it simply has to scale on > systems where you cannot just flood the network with MPI. > > Barry > > > Mesh Partition 1 1.0 5.0133e+02 1.0 0.00e+00 0.0 1.3e+05 2.7e+02 > 6.0e+00 15 0 0 0 0 15 0 0 0 1 0 > Mesh Migration 1 1.0 1.5494e+03 1.0 0.00e+00 0.0 7.3e+05 1.9e+02 > 2.4e+01 45 0 0 0 1 46 0 0 0 2 0 > DMPlexPartStrtSF 1 1.0 4.9474e+023520.8 0.00e+00 0.0 3.3e+04 > 4.3e+00.0e+00 14 0 0 0 0 15 0 0 0 0 0 > DMPlexPointSF 1 1.0 9.8750e+021264.8 0.00e+00 0.0 6.6e+04 > 5.4e+00.0e+00 28 0 0 0 0 29 0 0 0 0 0 > DMPlexDistribute 1 1.0 3.0000e+03 1.5 0.00e+00 0.0 9.3e+05 2.3e+02 > 3.0e+01 88 0 0 0 2 90 0 0 0 3 0 > DMPlexDistCones 1 1.0 1.0688e+03 2.6 0.00e+00 0.0 1.8e+05 3.1e+02 > 1.0e+00 31 0 0 0 0 31 0 0 0 0 0 > DMPlexDistLabels 1 1.0 2.9172e+02 1.0 0.00e+00 0.0 3.1e+05 1.9e+02 > 2.1e+01 9 0 0 0 1 9 0 0 0 2 0 > DMPlexDistField 1 1.0 1.8688e+02 1.2 0.00e+00 0.0 2.1e+05 9.3e+01 > 1.0e+00 5 0 0 0 0 5 0 0 0 0 0 > SFSetUp 62 1.0 7.3283e+0213.6 0.00e+00 0.0 2.0e+07 2.7e+04 > 0.0e+00 5 0 1 3 0 5 0 6 9 0 0 > SFBcastOpBegin 107 1.0 1.5770e+00452.5 0.00e+00 0.0 2.1e+07 1.8e+04 > 0.0e+00 0 0 1 2 0 0 0 6 6 0 0 > SFBcastOpEnd 107 1.0 2.9430e+03 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 80 0 0 0 0 82 0 0 0 0 0 > SFDistSection 9 1.0 4.4325e+02 1.5 0.00e+00 0.0 2.8e+06 1.1e+04 > 9.0e+00 11 0 0 0 0 11 0 1 1 1 0 > SFSectionSF 11 1.0 2.3898e+02 4.7 0.00e+00 0.0 9.2e+05 1.7e+05 > 0.0e+00 5 0 0 1 0 5 0 0 2 0 0 > > On Mar 7, 2021, at 7:35 AM, Mark Adams wrote: > > And this data puts one cell per process, distributes, and then refines 5 > (or 2,3,4 in plot) times. > > On Sun, Mar 7, 2021 at 8:27 AM Mark Adams wrote: > >> FWIW, Here is the output from ex13 on 32K processes (8K Fugaku >> nodes/sockets, 4 MPI/node, which seems recommended) with 128^3 vertex mesh >> (64^3 Q2 3D Laplacian). >> Almost an hour. >> Attached is solver scaling. 
>> >> >> 0 SNES Function norm 3.658334849208e+00 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> 1 SNES Function norm 1.609000373074e-12 >> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> >> ************************************************************************************************************************ >> *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r >> -fCourier9' to print this document *** >> >> ************************************************************************************************************************ >> >> ---------------------------------------------- PETSc Performance Summary: >> ---------------------------------------------- >> >> ../ex13 on a named i07-4008c with 32768 processors, by a04199 Fri Feb 12 >> 23:27:13 2021 >> Using Petsc Development GIT revision: v3.14.4-579-g4cb72fa GIT Date: >> 2021-02-05 15:19:40 +0000 >> >> Max Max/Min Avg Total >> Time (sec): 3.373e+03 1.000 3.373e+03 >> Objects: 1.055e+05 14.797 7.144e+03 >> Flop: 5.376e+10 1.176 4.885e+10 1.601e+15 >> Flop/sec: 1.594e+07 1.176 1.448e+07 4.745e+11 >> MPI Messages: 6.048e+05 30.010 8.833e+04 2.894e+09 >> MPI Message Lengths: 1.127e+09 4.132 6.660e+03 1.928e+13 >> MPI Reductions: 1.824e+03 1.000 >> >> Flop counting convention: 1 flop = 1 real number operation of type >> (multiply/divide/add/subtract) >> e.g., VecAXPY() for real vectors of length N >> --> 2N flop >> and VecAXPY() for complex vectors of length N >> --> 8N flop >> >> Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages >> --- -- Message Lengths -- -- Reductions -- >> Avg %Total Avg %Total Count >> %Total Avg %Total Count %Total >> 0: Main Stage: 3.2903e+03 97.5% 2.4753e+14 15.5% 3.538e+08 >> 12.2% 1.779e+04 32.7% 9.870e+02 54.1% >> 1: PCSetUp: 4.3062e+01 1.3% 1.8160e+13 1.1% 1.902e+07 >> 0.7% 3.714e+04 3.7% 1.590e+02 8.7% >> 2: KSP Solve only: 3.9685e+01 1.2% 1.3349e+15 83.4% 2.522e+09 >> 87.1% 4.868e+03 63.7% 6.700e+02 36.7% >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> See the 'Profiling' chapter of the users' manual for details on >> interpreting output. >> Phase summary info: >> Count: number of times phase was executed >> Time and Flop: Max - maximum over all processors >> Ratio - ratio of maximum to minimum over all processors >> Mess: number of messages sent >> AvgLen: average message length (bytes) >> Reduct: number of global reductions >> Global: entire computation >> Stage: stages of a computation. Set stages with PetscLogStagePush() >> and PetscLogStagePop(). 
>> %T - percent time in this phase %F - percent flop in this >> phase >> %M - percent messages in this phase %L - percent message >> lengths in this phase >> %R - percent reductions in this phase >> Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time >> over all processors) >> >> ------------------------------------------------------------------------------------------------------------------------ >> Event Count Time (sec) Flop >> --- Global --- --- Stage ---- Total >> Max Ratio Max Ratio Max Ratio Mess AvgLen >> Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> >> ------------------------------------------------------------------------------------------------------------------------ >> >> --- Event Stage 0: Main Stage >> >> PetscBarrier 5 1.0 1.9907e+00 2.2 0.00e+00 0.0 3.8e+06 7.7e+01 >> 2.0e+01 0 0 0 0 1 0 0 1 0 2 0 >> BuildTwoSided 62 1.0 7.3272e+0214.1 0.00e+00 0.0 6.7e+06 8.0e+00 >> 0.0e+00 5 0 0 0 0 5 0 2 0 0 0 >> BuildTwoSidedF 59 1.0 3.1132e+01 7.4 0.00e+00 0.0 4.8e+06 2.5e+05 >> 0.0e+00 0 0 0 6 0 0 0 1 19 0 0 >> SNESSolve 1 1.0 1.7468e+02 1.0 7.83e+09 1.3 3.4e+08 1.3e+04 >> 8.8e+02 5 13 12 23 48 5 85 96 70 89 1205779 >> SNESSetUp 1 1.0 2.4195e+01 1.0 0.00e+00 0.0 3.7e+06 3.7e+05 >> 1.3e+01 1 0 0 7 1 1 0 1 22 1 0 >> SNESFunctionEval 3 1.0 1.1359e+01 1.2 1.17e+09 1.0 1.6e+06 1.4e+04 >> 2.0e+00 0 2 0 0 0 0 15 0 0 0 3344744 >> SNESJacobianEval 2 1.0 1.6829e+02 1.0 1.52e+09 1.0 1.1e+06 8.3e+05 >> 0.0e+00 5 3 0 5 0 5 20 0 14 0 293588 >> DMCreateMat 1 1.0 2.4107e+01 1.0 0.00e+00 0.0 3.7e+06 3.7e+05 >> 1.3e+01 1 0 0 7 1 1 0 1 22 1 0 >> Mesh Partition 1 1.0 5.0133e+02 1.0 0.00e+00 0.0 1.3e+05 2.7e+02 >> 6.0e+00 15 0 0 0 0 15 0 0 0 1 0 >> Mesh Migration 1 1.0 1.5494e+03 1.0 0.00e+00 0.0 7.3e+05 1.9e+02 >> 2.4e+01 45 0 0 0 1 46 0 0 0 2 0 >> DMPlexPartSelf 1 1.0 1.1498e+002367.3 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> DMPlexPartLblInv 1 1.0 3.6698e+00 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 >> 3.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> DMPlexPartLblSF 1 1.0 2.8522e-01 1.7 0.00e+00 0.0 4.9e+04 1.5e+02 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> DMPlexPartStrtSF 1 1.0 4.9474e+023520.8 0.00e+00 0.0 3.3e+04 >> 4.3e+02 0.0e+00 14 0 0 0 0 15 0 0 0 0 0 >> DMPlexPointSF 1 1.0 9.8750e+021264.8 0.00e+00 0.0 6.6e+04 >> 5.4e+02 0.0e+00 28 0 0 0 0 29 0 0 0 0 0 >> DMPlexInterp 84 1.0 4.3219e-0158.6 0.00e+00 0.0 0.0e+00 0.0e+00 >> 5.0e+00 0 0 0 0 0 0 0 0 0 1 0 >> DMPlexDistribute 1 1.0 3.0000e+03 1.5 0.00e+00 0.0 9.3e+05 2.3e+02 >> 3.0e+01 88 0 0 0 2 90 0 0 0 3 0 >> DMPlexDistCones 1 1.0 1.0688e+03 2.6 0.00e+00 0.0 1.8e+05 3.1e+02 >> 1.0e+00 31 0 0 0 0 31 0 0 0 0 0 >> DMPlexDistLabels 1 1.0 2.9172e+02 1.0 0.00e+00 0.0 3.1e+05 1.9e+02 >> 2.1e+01 9 0 0 0 1 9 0 0 0 2 0 >> DMPlexDistField 1 1.0 1.8688e+02 1.2 0.00e+00 0.0 2.1e+05 9.3e+01 >> 1.0e+00 5 0 0 0 0 5 0 0 0 0 0 >> DMPlexStratify 118 1.0 6.2852e+023280.9 0.00e+00 0.0 0.0e+00 >> 0.0e+00 1.6e+01 1 0 0 0 1 1 0 0 0 2 0 >> DMPlexSymmetrize 118 1.0 6.7634e-02 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> DMPlexPrealloc 1 1.0 2.3741e+01 1.0 0.00e+00 0.0 3.7e+06 3.7e+05 >> 1.1e+01 1 0 0 7 1 1 0 1 22 1 0 >> DMPlexResidualFE 3 1.0 1.0634e+01 1.2 1.16e+09 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 2 0 0 0 0 15 0 0 0 3569848 >> DMPlexJacobianFE 2 1.0 1.6809e+02 1.0 1.51e+09 1.0 6.5e+05 1.4e+06 >> 0.0e+00 5 3 0 5 0 5 20 0 14 0 293801 >> SFSetGraph 87 1.0 2.7673e-03 3.2 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFSetUp 62 1.0 7.3283e+0213.6 0.00e+00 0.0 2.0e+07 2.7e+04 >> 0.0e+00 5 0 1 3 0 5 0 6 9 0 0 >> 
SFBcastOpBegin 107 1.0 1.5770e+00452.5 0.00e+00 0.0 2.1e+07 1.8e+04 >> 0.0e+00 0 0 1 2 0 0 0 6 6 0 0 >> SFBcastOpEnd 107 1.0 2.9430e+03 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 80 0 0 0 0 82 0 0 0 0 0 >> SFReduceBegin 12 1.0 2.4825e-01172.8 0.00e+00 0.0 2.4e+06 2.0e+05 >> 0.0e+00 0 0 0 2 0 0 0 1 8 0 0 >> SFReduceEnd 12 1.0 3.8286e+014865.8 3.74e+04 0.0 0.0e+00 >> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 31 >> SFFetchOpBegin 2 1.0 2.4497e-0390.2 0.00e+00 0.0 4.3e+05 3.5e+05 >> 0.0e+00 0 0 0 1 0 0 0 0 2 0 0 >> SFFetchOpEnd 2 1.0 6.1349e-0210.9 0.00e+00 0.0 4.3e+05 3.5e+05 >> 0.0e+00 0 0 0 1 0 0 0 0 2 0 0 >> SFCreateEmbed 3 1.0 3.6800e+013261.5 0.00e+00 0.0 4.7e+05 >> 1.7e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFDistSection 9 1.0 4.4325e+02 1.5 0.00e+00 0.0 2.8e+06 1.1e+04 >> 9.0e+00 11 0 0 0 0 11 0 1 1 1 0 >> SFSectionSF 11 1.0 2.3898e+02 4.7 0.00e+00 0.0 9.2e+05 1.7e+05 >> 0.0e+00 5 0 0 1 0 5 0 0 2 0 0 >> SFRemoteOff 2 1.0 3.2868e-0143.1 0.00e+00 0.0 8.7e+05 8.2e+03 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFPack 1023 1.0 2.5215e-0176.6 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFUnpack 1025 1.0 5.1600e-0216.8 5.62e+0521.3 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 54693 >> MatMult 1549525.4 3.4810e+00 1.3 4.35e+09 1.1 2.2e+08 6.1e+03 >> 0.0e+00 0 8 8 7 0 0 54 62 21 0 38319208 >> MatMultAdd 132 1.0 6.9168e-01 3.0 7.97e+07 1.2 2.8e+07 4.6e+02 >> 0.0e+00 0 0 1 0 0 0 1 8 0 0 3478717 >> MatMultTranspose 132 1.0 5.9967e-01 1.6 8.00e+07 1.2 3.0e+07 4.5e+02 >> 0.0e+00 0 0 1 0 0 0 1 9 0 0 4015214 >> MatSolve 22 0.0 6.8431e-04 0.0 7.41e+05 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 1082 >> MatLUFactorSym 1 1.0 5.9569e-0433.3 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatLUFactorNum 1 1.0 1.6236e-03773.2 1.46e+06 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 897 >> MatConvert 6 1.0 1.4290e-01 1.2 0.00e+00 0.0 3.0e+06 3.7e+03 >> 0.0e+00 0 0 0 0 0 0 0 1 0 0 0 >> MatScale 18 1.0 3.7962e-01 1.3 4.11e+07 1.2 2.0e+06 5.5e+03 >> 0.0e+00 0 0 0 0 0 0 0 1 0 0 3253392 >> MatResidual 132 1.0 6.8256e-01 1.4 8.27e+08 1.2 4.4e+07 5.5e+03 >> 0.0e+00 0 2 2 1 0 0 10 13 4 0 36282014 >> MatAssemblyBegin 244 1.0 3.1181e+01 6.6 0.00e+00 0.0 4.8e+06 2.5e+05 >> 0.0e+00 0 0 0 6 0 0 0 1 19 0 0 >> MatAssemblyEnd 244 1.0 6.3232e+00 1.9 3.17e+06 6.9 0.0e+00 0.0e+00 >> 1.4e+02 0 0 0 0 8 0 0 0 0 15 7655 >> MatGetRowIJ 1 0.0 2.5780e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatCreateSubMat 10 1.0 1.5162e+00 1.0 0.00e+00 0.0 1.6e+05 3.4e+05 >> 1.3e+02 0 0 0 0 7 0 0 0 1 13 0 >> MatGetOrdering 1 0.0 1.0899e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatCoarsen 6 1.0 3.5837e-01 1.3 0.00e+00 0.0 1.6e+07 1.2e+04 >> 3.9e+01 0 0 1 1 2 0 0 5 3 4 0 >> MatZeroEntries 8 1.0 5.3730e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatAXPY 6 1.0 2.6245e-01 1.1 2.66e+05 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 33035 >> MatTranspose 12 1.0 3.0731e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatMatMultSym 18 1.0 2.1398e+00 1.4 0.00e+00 0.0 6.1e+06 5.5e+03 >> 4.8e+01 0 0 0 0 3 0 0 2 1 5 0 >> MatMatMultNum 6 1.0 1.1243e+00 1.0 3.76e+07 1.2 2.0e+06 5.5e+03 >> 0.0e+00 0 0 0 0 0 0 0 1 0 0 1001203 >> MatPtAPSymbolic 6 1.0 1.7280e+01 1.0 0.00e+00 0.0 1.2e+07 3.2e+04 >> 4.2e+01 1 0 0 2 2 1 0 3 6 4 0 >> MatPtAPNumeric 6 1.0 1.8047e+01 1.0 1.49e+09 5.1 2.8e+06 1.1e+05 >> 2.4e+01 1 1 0 2 1 1 5 1 5 2 663675 >> MatTrnMatMultSym 1 1.0 3.0221e+01 1.0 0.00e+00 0.0 2.4e+06 5.8e+05 >> 1.1e+01 1 0 
0 7 1 1 0 1 22 1 0 >> MatGetLocalMat 19 1.0 1.3904e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatGetBrAoCol 18 1.0 1.9926e-01 5.0 0.00e+00 0.0 1.4e+07 2.3e+04 >> 0.0e+00 0 0 0 2 0 0 0 4 5 0 0 >> MatGetSymTrans 2 1.0 1.8996e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecTDot 176 1.0 7.0632e-01 4.5 3.48e+07 1.0 0.0e+00 0.0e+00 >> 1.8e+02 0 0 0 0 10 0 0 0 0 18 1608728 >> VecNorm 60 1.0 1.4074e+0012.2 1.58e+07 1.0 0.0e+00 0.0e+00 >> 6.0e+01 0 0 0 0 3 0 0 0 0 6 366467 >> VecCopy 422 1.0 5.1259e-02 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecSet 653 1.0 2.3974e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecAXPY 165 1.0 6.5622e-03 1.3 3.42e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 170485467 >> VecAYPX 861 1.0 7.8529e-02 1.2 6.21e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 1 0 0 0 25785252 >> VecAXPBYCZ 264 1.0 4.1343e-02 1.5 5.85e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 1 0 0 0 46135592 >> VecAssemblyBegin 21 1.0 2.3463e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecAssemblyEnd 21 1.0 1.4457e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecPointwiseMult 600 1.0 5.7510e-02 1.2 2.66e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 15075754 >> VecScatterBegin 902 1.0 5.1188e-01 1.2 0.00e+00 0.0 2.9e+08 5.3e+03 >> 0.0e+00 0 0 10 8 0 0 0 82 25 0 0 >> VecScatterEnd 902 1.0 1.2143e+00 3.2 5.50e+0537.9 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 1347 >> VecSetRandom 6 1.0 2.6354e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> DualSpaceSetUp 7 1.0 5.3467e-0112.0 4.26e+03 1.0 0.0e+00 0.0e+00 >> 1.3e+01 0 0 0 0 1 0 0 0 0 1 261 >> FESetUp 7 1.0 1.7541e-01128.5 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPSetUp 15 1.0 2.7470e-01 1.1 2.04e+08 1.2 1.0e+07 5.5e+03 >> 1.3e+02 0 0 0 0 7 0 2 3 1 13 22477233 >> KSPSolve 1 1.0 4.3257e+00 1.0 4.33e+09 1.1 2.5e+08 4.8e+03 >> 6.6e+01 0 8 9 6 4 0 54 72 20 7 30855976 >> PCGAMGGraph_AGG 6 1.0 5.0969e+00 1.0 3.76e+07 1.2 5.1e+06 4.4e+03 >> 4.8e+01 0 0 0 0 3 0 0 1 0 5 220852 >> PCGAMGCoarse_AGG 6 1.0 3.1121e+01 1.0 0.00e+00 0.0 2.5e+07 6.9e+04 >> 5.5e+01 1 0 1 9 3 1 0 7 27 6 0 >> PCGAMGProl_AGG 6 1.0 5.8196e-01 1.0 0.00e+00 0.0 6.6e+06 9.3e+03 >> 7.2e+01 0 0 0 0 4 0 0 2 1 7 0 >> PCGAMGPOpt_AGG 6 1.0 3.2414e+00 1.0 2.42e+08 1.2 2.1e+07 5.3e+03 >> 1.6e+02 0 0 1 1 9 0 3 6 2 17 2256493 >> GAMG: createProl 6 1.0 4.0042e+01 1.0 2.80e+08 1.2 5.8e+07 3.3e+04 >> 3.4e+02 1 1 2 10 19 1 3 16 31 34 210778 >> Graph 12 1.0 5.0926e+00 1.0 3.76e+07 1.2 5.1e+06 4.4e+03 >> 4.8e+01 0 0 0 0 3 0 0 1 0 5 221038 >> MIS/Agg 6 1.0 3.5850e-01 1.3 0.00e+00 0.0 1.6e+07 1.2e+04 >> 3.9e+01 0 0 1 1 2 0 0 5 3 4 0 >> SA: col data 6 1.0 3.0509e-01 1.0 0.00e+00 0.0 5.4e+06 9.2e+03 >> 2.4e+01 0 0 0 0 1 0 0 2 1 2 0 >> SA: frmProl0 6 1.0 2.3467e-01 1.1 0.00e+00 0.0 1.3e+06 9.5e+03 >> 2.4e+01 0 0 0 0 1 0 0 0 0 2 0 >> SA: smooth 6 1.0 2.7855e+00 1.0 4.14e+07 1.2 8.1e+06 5.5e+03 >> 6.3e+01 0 0 0 0 3 0 1 2 1 6 446491 >> GAMG: partLevel 6 1.0 3.7266e+01 1.0 1.49e+09 5.1 1.5e+07 4.9e+04 >> 3.2e+02 1 1 1 4 17 1 5 4 12 32 321395 >> repartition 5 1.0 2.0343e+00 1.1 0.00e+00 0.0 4.0e+05 1.4e+05 >> 2.5e+02 0 0 0 0 14 0 0 0 1 25 0 >> Invert-Sort 5 1.0 1.5021e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 >> 3.0e+01 0 0 0 0 2 0 0 0 0 3 0 >> Move A 5 1.0 1.1548e+00 1.0 0.00e+00 0.0 1.6e+05 3.4e+05 >> 7.0e+01 0 0 0 0 4 0 0 0 1 7 0 >> Move P 5 1.0 4.2799e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 
>> 7.5e+01 0 0 0 0 4 0 0 0 0 8 0 >> PCGAMG Squ l00 1 1.0 3.0221e+01 1.0 0.00e+00 0.0 2.4e+06 5.8e+05 >> 1.1e+01 1 0 0 7 1 1 0 1 22 1 0 >> PCGAMG Gal l00 1 1.0 8.7411e+00 1.0 2.93e+08 1.1 5.4e+06 4.5e+04 >> 1.2e+01 0 1 0 1 1 0 4 2 4 1 1092355 >> PCGAMG Opt l00 1 1.0 1.9734e+00 1.0 3.36e+07 1.1 3.2e+06 1.2e+04 >> 9.0e+00 0 0 0 0 0 0 0 1 1 1 555327 >> PCGAMG Gal l01 1 1.0 1.0153e+00 1.0 3.50e+07 1.4 5.9e+06 3.9e+04 >> 1.2e+01 0 0 0 1 1 0 0 2 4 1 1079887 >> PCGAMG Opt l01 1 1.0 7.4812e-02 1.0 5.35e+05 1.2 3.2e+06 1.1e+03 >> 9.0e+00 0 0 0 0 0 0 0 1 0 1 232542 >> PCGAMG Gal l02 1 1.0 1.8063e+00 1.0 7.43e+07 0.0 3.0e+06 5.9e+04 >> 1.2e+01 0 0 0 1 1 0 0 1 3 1 593392 >> PCGAMG Opt l02 1 1.0 1.1580e-01 1.1 6.93e+05 0.0 1.6e+06 1.3e+03 >> 9.0e+00 0 0 0 0 0 0 0 0 0 1 93213 >> PCGAMG Gal l03 1 1.0 6.1075e+00 1.0 2.72e+08 0.0 2.6e+05 9.2e+04 >> 1.1e+01 0 0 0 0 1 0 0 0 0 1 36155 >> PCGAMG Opt l03 1 1.0 8.0836e-02 1.0 1.55e+06 0.0 1.4e+05 1.4e+03 >> 8.0e+00 0 0 0 0 0 0 0 0 0 1 18229 >> PCGAMG Gal l04 1 1.0 1.6203e+01 1.0 9.44e+08 0.0 1.4e+04 3.0e+05 >> 1.1e+01 0 0 0 0 1 0 0 0 0 1 2366 >> PCGAMG Opt l04 1 1.0 1.2663e-01 1.0 2.01e+06 0.0 6.9e+03 2.2e+03 >> 8.0e+00 0 0 0 0 0 0 0 0 0 1 817 >> PCGAMG Gal l05 1 1.0 1.4800e+00 1.0 3.16e+08 0.0 9.0e+01 1.6e+05 >> 1.1e+01 0 0 0 0 1 0 0 0 0 1 796 >> PCGAMG Opt l05 1 1.0 8.1763e-02 1.1 2.50e+06 0.0 4.8e+01 4.6e+03 >> 8.0e+00 0 0 0 0 0 0 0 0 0 1 114 >> PCSetUp 2 1.0 7.7969e+01 1.0 1.97e+09 2.8 8.3e+07 3.3e+04 >> 8.1e+02 2 2 3 14 44 2 11 23 43 82 341051 >> PCSetUpOnBlocks 22 1.0 2.4609e-0317.2 1.46e+06 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 592 >> PCApply 22 1.0 3.6455e+00 1.1 3.57e+09 1.2 2.4e+08 4.3e+03 >> 0.0e+00 0 7 8 5 0 0 43 67 16 0 29434967 >> >> --- Event Stage 1: PCSetUp >> >> BuildTwoSided 4 1.0 1.5980e-01 2.7 0.00e+00 0.0 2.1e+05 8.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 1 0 0 0 >> BuildTwoSidedF 6 1.0 1.3169e+01 5.5 0.00e+00 0.0 1.9e+06 1.9e+05 >> 0.0e+00 0 0 0 2 0 28 0 10 51 0 0 >> SFSetGraph 5 1.0 4.9640e-0519.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFSetUp 4 1.0 1.6038e-01 2.3 0.00e+00 0.0 6.4e+05 9.1e+02 >> 0.0e+00 0 0 0 0 0 0 0 3 0 0 0 >> SFPack 30 1.0 3.3376e-04 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFUnpack 30 1.0 1.2101e-05 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatMult 30 1.0 1.5544e-01 1.5 1.87e+08 1.2 1.0e+07 5.5e+03 >> 0.0e+00 0 0 0 0 0 0 31 53 8 0 35930640 >> MatAssemblyBegin 43 1.0 1.3201e+01 4.7 0.00e+00 0.0 1.9e+06 1.9e+05 >> 0.0e+00 0 0 0 2 0 28 0 10 51 0 0 >> MatAssemblyEnd 43 1.0 1.1159e+01 1.0 2.77e+07705.7 0.0e+00 0.0e+00 >> 2.0e+01 0 0 0 0 1 26 0 0 0 13 1036 >> MatZeroEntries 6 1.0 4.7315e-0410.7 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatTranspose 12 1.0 2.5142e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatMatMultSym 10 1.0 5.8783e-0117.4 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatPtAPSymbolic 5 1.0 1.4489e+01 1.0 0.00e+00 0.0 6.2e+06 3.6e+04 >> 3.5e+01 0 0 0 1 2 34 0 32 31 22 0 >> MatPtAPNumeric 6 1.0 2.8457e+01 1.0 1.50e+09 5.1 2.7e+06 1.6e+05 >> 2.0e+01 1 1 0 2 1 66 66 14 61 13 421190 >> MatGetLocalMat 6 1.0 9.8574e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatGetBrAoCol 6 1.0 3.7669e-01 2.3 0.00e+00 0.0 5.1e+06 3.8e+04 >> 0.0e+00 0 0 0 1 0 0 0 27 28 0 0 >> VecTDot 66 1.0 6.5271e-02 4.1 5.85e+06 1.0 0.0e+00 0.0e+00 >> 6.6e+01 0 0 0 0 4 0 1 0 0 42 2922260 >> VecNorm 36 1.0 1.1226e-02 3.2 3.19e+06 1.0 0.0e+00 0.0e+00 >> 3.6e+01 0 0 0 0 
2 0 1 0 0 23 9268067 >> VecCopy 12 1.0 1.2805e-03 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecSet 11 1.0 6.6620e-05 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecAXPY 60 1.0 1.0763e-03 1.5 5.32e+06 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 1 0 0 0 161104914 >> VecAYPX 24 1.0 2.0581e-03 1.3 2.13e+06 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 33701038 >> VecPointwiseMult 36 1.0 3.5709e-03 1.3 1.60e+06 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 14567861 >> VecScatterBegin 30 1.0 2.9079e-03 7.8 0.00e+00 0.0 1.0e+07 5.5e+03 >> 0.0e+00 0 0 0 0 0 0 0 53 8 0 0 >> VecScatterEnd 30 1.0 3.7015e-0263.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPSetUp 7 1.0 2.3165e-01 1.0 2.04e+08 1.2 1.0e+07 5.5e+03 >> 1.0e+02 0 0 0 0 6 1 34 53 8 64 26654598 >> PCGAMG Gal l00 1 1.0 4.7415e+00 1.0 2.94e+08 1.1 1.8e+06 7.8e+04 >> 0.0e+00 0 1 0 1 0 11 53 9 20 0 2015623 >> PCGAMG Gal l01 1 1.0 1.2103e+00 1.0 3.50e+07 1.4 4.8e+06 6.2e+04 >> 1.2e+01 0 0 0 2 1 3 6 25 41 8 905938 >> PCGAMG Gal l02 1 1.0 3.4334e+00 1.0 7.41e+07 0.0 2.2e+06 8.7e+04 >> 1.2e+01 0 0 0 1 1 8 6 11 27 8 312184 >> PCGAMG Gal l03 1 1.0 9.6062e+00 1.0 2.71e+08 0.0 1.9e+05 1.3e+05 >> 1.1e+01 0 0 0 0 1 22 1 1 4 7 22987 >> PCGAMG Gal l04 1 1.0 2.2482e+01 1.0 9.43e+08 0.0 8.7e+03 4.8e+05 >> 1.1e+01 1 0 0 0 1 52 0 0 1 7 1705 >> PCGAMG Gal l05 1 1.0 1.5961e+00 1.1 3.16e+08 0.0 6.8e+01 2.2e+05 >> 1.1e+01 0 0 0 0 1 4 0 0 0 7 738 >> PCSetUp 1 1.0 4.3191e+01 1.0 1.70e+09 3.6 1.9e+07 3.7e+04 >> 1.6e+02 1 1 1 4 9 100100100100100 420463 >> >> --- Event Stage 2: KSP Solve only >> >> SFPack 8140 1.0 7.4247e-02 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFUnpack 8140 1.0 1.2905e-02 5.2 5.50e+0637.9 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 1267207 >> MatMult 5500 1.0 2.9994e+01 1.2 3.98e+10 1.1 2.0e+09 6.1e+03 >> 0.0e+00 1 76 68 62 0 70 92 78 98 0 40747181 >> MatMultAdd 1320 1.0 6.2192e+00 2.7 7.97e+08 1.2 2.8e+08 4.6e+02 >> 0.0e+00 0 2 10 1 0 14 2 11 1 0 3868976 >> MatMultTranspose 1320 1.0 4.0304e+00 1.7 8.00e+08 1.2 2.8e+08 4.6e+02 >> 0.0e+00 0 2 10 1 0 7 2 11 1 0 5974153 >> MatSolve 220 0.0 6.7366e-03 0.0 7.41e+06 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 1100 >> MatLUFactorSym 1 1.0 5.8691e-0435.5 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatLUFactorNum 1 1.0 1.5955e-03756.2 1.46e+06 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 913 >> MatResidual 1320 1.0 6.4920e+00 1.3 8.27e+09 1.2 4.4e+08 5.5e+03 >> 0.0e+00 0 15 15 13 0 14 19 18 20 0 38146350 >> MatGetRowIJ 1 0.0 2.7820e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatGetOrdering 1 0.0 9.6940e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecTDot 440 1.0 4.6162e+00 6.9 2.31e+08 1.0 0.0e+00 0.0e+00 >> 4.4e+02 0 0 0 0 24 5 1 0 0 66 1635124 >> VecNorm 230 1.0 3.9605e-02 1.6 1.21e+08 1.0 0.0e+00 0.0e+00 >> 2.3e+02 0 0 0 0 13 0 0 0 0 34 99622387 >> VecCopy 3980 1.0 5.4166e-01 4.3 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecSet 4640 1.0 1.4216e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecAXPY 440 1.0 4.2829e-02 1.3 2.31e+08 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 1 0 0 0 176236363 >> VecAYPX 8130 1.0 7.3998e-01 1.2 5.78e+08 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 1 0 0 0 2 1 0 0 0 25489392 >> VecAXPBYCZ 2640 1.0 3.9974e-01 1.5 5.85e+08 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 1 0 0 0 1 1 0 0 0 47716315 >> VecPointwiseMult 5280 1.0 5.9845e-01 1.5 2.34e+08 1.0 
0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 1 1 0 0 0 12748927 >> VecScatterBegin 8140 1.0 4.9231e-01 5.9 0.00e+00 0.0 2.5e+09 4.9e+03 >> 0.0e+00 0 0 87 64 0 1 0100100 0 0 >> VecScatterEnd 8140 1.0 1.0172e+01 3.6 5.50e+0637.9 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 13 0 0 0 0 1608 >> KSPSetUp 1 1.0 9.5996e-07 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPSolve 10 1.0 3.9685e+01 1.0 4.33e+10 1.1 2.5e+09 4.9e+03 >> 6.7e+02 1 83 87 64 37 100100100100100 33637495 >> PCSetUp 1 1.0 2.4149e-0318.1 1.46e+06 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 603 >> PCSetUpOnBlocks 220 1.0 2.6945e-03 8.9 1.46e+06 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 540 >> PCApply 220 1.0 3.2921e+01 1.1 3.57e+10 1.2 2.3e+09 4.3e+03 >> 0.0e+00 1 67 81 53 0 81 80 93 82 0 32595360 >> >> ------------------------------------------------------------------------------------------------------------------------ >> >> Memory usage is given in bytes: >> >> Object Type Creations Destructions Memory Descendants' >> Mem. >> Reports information only for process 0. >> >> --- Event Stage 0: Main Stage >> >> Container 112 112 69888 0. >> SNES 1 1 1532 0. >> DMSNES 1 1 720 0. >> Distributed Mesh 449 449 30060888 0. >> DM Label 790 790 549840 0. >> Quadrature 579 579 379824 0. >> Index Set 100215 100210 361926232 0. >> IS L to G Mapping 8 13 4356552 0. >> Section 771 771 598296 0. >> Star Forest Graph 897 897 1053640 0. >> Discrete System 521 521 533512 0. >> GraphPartitioner 118 118 91568 0. >> Matrix 432 462 2441805304 0. >> Matrix Coarsen 6 6 4032 0. >> Vector 354 354 65492968 0. >> Linear Space 7 7 5208 0. >> Dual Space 111 111 113664 0. >> FE Space 7 7 5992 0. >> Field over DM 6 6 4560 0. >> Krylov Solver 21 21 37560 0. >> DMKSP interface 1 1 704 0. >> Preconditioner 21 21 21632 0. >> Viewer 2 1 896 0. >> PetscRandom 12 12 8520 0. >> >> --- Event Stage 1: PCSetUp >> >> Index Set 10 15 85367336 0. >> IS L to G Mapping 5 0 0 0. >> Star Forest Graph 5 5 6600 0. >> Matrix 50 20 73134024 0. >> Vector 28 28 6235096 0. >> >> --- Event Stage 2: KSP Solve only >> >> Index Set 5 5 8296 0. >> Matrix 1 1 273856 0. 
>> >> ======================================================================================================================== >> Average time to get PetscTime(): 6.40051e-08 >> Average time for MPI_Barrier(): 8.506e-06 >> Average time for zero size MPI_Send(): 6.6027e-06 >> #PETSc Option Table entries: >> -benchmark_it 10 >> -dm_distribute >> -dm_plex_box_dim 3 >> -dm_plex_box_faces 32,32,32 >> -dm_plex_box_lower 0,0,0 >> -dm_plex_box_simplex 0 >> -dm_plex_box_upper 1,1,1 >> -dm_refine 5 >> -ksp_converged_reason >> -ksp_max_it 150 >> -ksp_norm_type unpreconditioned >> -ksp_rtol 1.e-12 >> -ksp_type cg >> -log_view >> -matptap_via scalable >> -mg_levels_esteig_ksp_max_it 5 >> -mg_levels_esteig_ksp_type cg >> -mg_levels_ksp_max_it 2 >> -mg_levels_ksp_type chebyshev >> -mg_levels_pc_type jacobi >> -pc_gamg_agg_nsmooths 1 >> -pc_gamg_coarse_eq_limit 2000 >> -pc_gamg_coarse_grid_layout_type spread >> -pc_gamg_esteig_ksp_max_it 5 >> -pc_gamg_esteig_ksp_type cg >> -pc_gamg_process_eq_limit 500 >> -pc_gamg_repartition false >> -pc_gamg_reuse_interpolation true >> -pc_gamg_square_graph 1 >> -pc_gamg_threshold 0.01 >> -pc_gamg_threshold_scale .5 >> -pc_gamg_type agg >> -pc_type gamg >> -petscpartitioner_simple_node_grid 8,8,8 >> -petscpartitioner_simple_process_grid 4,4,4 >> -petscpartitioner_type simple >> -potential_petscspace_degree 2 >> -snes_converged_reason >> -snes_max_it 1 >> -snes_monitor >> -snes_rtol 1.e-8 >> -snes_type ksponly >> #End of PETSc Option Table entries >> Compiled without FORTRAN kernels >> Compiled with 64 bit PetscInt >> Compiled with full precision matrices (default) >> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 >> sizeof(PetscScalar) 8 sizeof(PetscInt) 8 >> Configure options: CC=mpifccpx CXX=mpiFCCpx CFLAGS="-L >> /opt/FJSVxtclanga/tcsds-1.2.29/lib64 -lfjlapack" CXXFLAGS="-L >> /opt/FJSVxtclanga/tcsds-1.2.29/lib64 -lfjlapack" COPTFLAGS=-Kfast >> CXXOPTFLAGS=-Kfast --with-fc=0 >> --package-prefix-hash=/home/ra010009/a04199/petsc-hash-pkgs --with-batch=1 >> --with-shared-libraries=yes --with-debugging=no --with-64-bit-indices=1 >> PETSC_ARCH=arch-fugaku-fujitsu >> ----------------------------------------- >> Libraries compiled on 2021-02-12 02:27:41 on fn01sv08 >> Machine characteristics: >> Linux-3.10.0-957.27.2.el7.x86_64-x86_64-with-redhat-7.6-Maipo >> Using PETSc directory: /home/ra010009/a04199/petsc >> Using PETSc arch: >> ----------------------------------------- >> >> Using C compiler: mpifccpx -L /opt/FJSVxtclanga/tcsds-1.2.29/lib64 >> -lfjlapack -fPIC -Kfast >> ----------------------------------------- >> >> Using include paths: -I/home/ra010009/a04199/petsc/include >> -I/home/ra010009/a04199/petsc/arch-fugaku-fujitsu/include >> ----------------------------------------- >> >> Using C linker: mpifccpx >> Using libraries: -Wl,-rpath,/home/ra010009/a04199/petsc/lib >> -L/home/ra010009/a04199/petsc/lib -lpetsc >> -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/lib/gcc/aarch64-linux-gnu/8 >> -L/opt/FJSVxos/devkit/aarch64/lib/gcc/aarch64-linux-gnu/8 >> -Wl,-rpath,/opt/FJSVxtclanga/tcsds-1.2.29/lib64 >> -L/opt/FJSVxtclanga/tcsds-1.2.29/lib64 >> -Wl,-rpath,/opt/FJSVxtclanga/.common/MELI022/lib64 >> -L/opt/FJSVxtclanga/.common/MELI022/lib64 >> -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/aarch64-linux-gnu/lib64 >> -L/opt/FJSVxos/devkit/aarch64/aarch64-linux-gnu/lib64 >> -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/rfs/usr/lib64 >> -L/opt/FJSVxos/devkit/aarch64/rfs/usr/lib64 >> -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/rfs/opt/FJSVxos/mmm/lib64 >> 
-L/opt/FJSVxos/devkit/aarch64/rfs/opt/FJSVxos/mmm/lib64 >> -Wl,-rpath,/opt/FJSVxtclanga/tcsds-1.2.29/lib64/nofjobj >> -L/opt/FJSVxtclanga/tcsds-1.2.29/lib64/nofjobj -lX11 -lfjprofmpi -lfjlapack >> -ldl -lmpi_cxx -lmpi -lfjstring_internal -lfj90i -lfj90fmt_sve -lfj90f >> -lfjsrcinfo -lfjcrt -lfjprofcore -lfjprofomp -lfjc++ -lfjc++abi -lfjdemgl >> -lmpg -lm -lrt -lpthread -lelf -lz -lgcc_s -ldl >> ----------------------------------------- >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.barral at math.u-bordeaux.fr Sun Mar 7 15:51:31 2021 From: nicolas.barral at math.u-bordeaux.fr (Nicolas Barral) Date: Sun, 7 Mar 2021 22:51:31 +0100 Subject: [petsc-users] DMPlex tetrahedra facets orientation In-Reply-To: References: <8f788afe-01fd-98db-957c-450cd43a18a8@math.u-bordeaux.fr> Message-ID: <1c4a69aa-ff26-848c-770d-89756a87be66@math.u-bordeaux.fr> On 07/03/2021 22:30, Matthew Knepley wrote: > On Sun, Mar 7, 2021 at 4:13 PM Nicolas Barral > > wrote: > > On 07/03/2021 16:54, Matthew Knepley wrote: > > On Sun, Mar 7, 2021 at 8:52 AM Nicolas Barral > > > > >> wrote: > > > >? ? ?Matt, > > > >? ? ?Thanks for your answer. > > > >? ? ?However, DMPlexComputeCellGeometryFVM does not compute what I > need > >? ? ?(normals of height 1 entities). I can't find any function doing > >? ? ?that, is > >? ? ?there one ? > > > > > > The normal[] in?DMPlexComputeCellGeometryFVM() is exactly what > you want. > > What does not look right to you? > > > So it turns out it's not what I want because I need non-normalized > normals. It doesn't seem like I can easily retrieve the norm, can I? > > > You just want area-weighted normals I think, which?means that you just > multiply by the area, > which comes back in the same function. > Ah by the area times 2, of course, my bad. Do you order height-1 elements in a certain way ? I need to access the facet (resp. edge) opposite to a vertex in a tet (resp. triangle). Thanks -- Nicolas > ? Thanks, > > ? ? Matt > > If not, I'll fallback to computing them by hand for now. Is the > following assumption safe or do I have to use DMPlexGetOrientedFace? > ?>? if I call P0P1P2P3 a tet and note x the cross product, > ?>? P3P2xP3P1 is the outward normal to face P1P2P3 > ?>? P0P2xP0P3? ? ? ? ? ? ? "? ? ? ? ? ? ? ? P0P2P3 > ?>? P3P1xP3P0? ? ? ? ? ? ? "? ? ? ? ? ? ? ? P0P1P3 > ?>? P0P1xP0P2? ? ? ? ? ? ? "? ? ? ? ? ? ? ? P0P1P2 > > Thanks > > -- > Nicolas > > > >? ? Thanks, > > > >? ? ? Matt > > > >? ? ?So far I've been doing it by hand, and after a lot of > experimenting the > >? ? ?past weeks, it seems that if I call P0P1P2P3 a tetrahedron > and note x > >? ? ?the cross product, > >? ? ?P3P2xP3P1 is the outward normal to face P1P2P3 > >? ? ?P0P2xP0P3? ? ? ? ? ? ? "? ? ? ? ? ? ? ? P0P2P3 > >? ? ?P3P1xP3P0? ? ? ? ? ? ? "? ? ? ? ? ? ? ? P0P1P3 > >? ? ?P0P1xP0P2? ? ? ? ? ? ? "? ? ? ? ? ? ? ? P0P1P2 > >? ? ?Have I been lucky but can't expect it to be true ? > > > >? ? ?(Alternatively, there is a link between the normals and the > element > >? ? ?Jacobian, but I don't know the formula and can? find them) > > > > > >? ? ?Thanks, > > > >? ? ?-- > >? ? ?Nicolas > > > >? ? ?On 08/02/2021 15:19, Matthew Knepley wrote: > >? ? ? > On Mon, Feb 8, 2021 at 6:01 AM Nicolas Barral > >? ? ? > > >? ? ? > > >? ? ? > > >? ? ? >>> wrote: > >? ? ? > > >? ? ? >? ? ?Hi all, > >? ? ? > > >? ? ? >? ? ?Can I make any assumption on the orientation of triangular > >? ? ?facets in a > >? ? ? >? ? ?tetrahedral plex ? I need the inward facet normals. Do > I need > >? ? 
?to use > >? ? ? >? ? ?DMPlexGetOrientedFace or can I rely on either the tet > vertices > >? ? ? >? ? ?ordering, > >? ? ? >? ? ?or the faces ordering ? Could > DMPlexGetRawFaces_Internal be > >? ? ?enough ? > >? ? ? > > >? ? ? > > >? ? ? > You can do it by hand, but you have to account for the face > >? ? ?orientation > >? ? ? > relative to the cell. That is what > >? ? ? > DMPlexGetOrientedFace() does. I think it would be easier > to use the > >? ? ? > function below. > >? ? ? > > >? ? ? >? ? ?Alternatively, is there a function that computes the > normals > >? ? ?- without > >? ? ? >? ? ?bringing out the big guns ? > >? ? ? > > >? ? ? > > >? ? ? > This will compute the normals > >? ? ? > > >? ? ? > > > > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexComputeCellGeometryFVM.html > >? ? ? > Should not be too heavy weight. > >? ? ? > > >? ? ? >? ? THanks, > >? ? ? > > >? ? ? >? ? ? Matt > >? ? ? > > >? ? ? >? ? ?Thanks > >? ? ? > > >? ? ? >? ? ?-- > >? ? ? >? ? ?Nicolas > >? ? ? > > >? ? ? > > >? ? ? > > >? ? ? > -- > >? ? ? > What most experimenters take for granted before they begin > their > >? ? ? > experiments is infinitely more interesting than any > results to which > >? ? ? > their experiments lead. > >? ? ? > -- Norbert Wiener > >? ? ? > > >? ? ? > https://www.cse.buffalo.edu/~knepley/ > >? ? ? > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From knepley at gmail.com Sun Mar 7 15:56:23 2021 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 7 Mar 2021 16:56:23 -0500 Subject: [petsc-users] DMPlex tetrahedra facets orientation In-Reply-To: <1c4a69aa-ff26-848c-770d-89756a87be66@math.u-bordeaux.fr> References: <8f788afe-01fd-98db-957c-450cd43a18a8@math.u-bordeaux.fr> <1c4a69aa-ff26-848c-770d-89756a87be66@math.u-bordeaux.fr> Message-ID: On Sun, Mar 7, 2021 at 4:51 PM Nicolas Barral < nicolas.barral at math.u-bordeaux.fr> wrote: > > On 07/03/2021 22:30, Matthew Knepley wrote: > > On Sun, Mar 7, 2021 at 4:13 PM Nicolas Barral > > > > wrote: > > > > On 07/03/2021 16:54, Matthew Knepley wrote: > > > On Sun, Mar 7, 2021 at 8:52 AM Nicolas Barral > > > > > > > > >> wrote: > > > > > > Matt, > > > > > > Thanks for your answer. > > > > > > However, DMPlexComputeCellGeometryFVM does not compute what I > > need > > > (normals of height 1 entities). I can't find any function > doing > > > that, is > > > there one ? > > > > > > > > > The normal[] in DMPlexComputeCellGeometryFVM() is exactly what > > you want. > > > What does not look right to you? > > > > > > So it turns out it's not what I want because I need non-normalized > > normals. It doesn't seem like I can easily retrieve the norm, can I? > > > > > > You just want area-weighted normals I think, which means that you just > > multiply by the area, > > which comes back in the same function. > > > > Ah by the area times 2, of course, my bad. > Do you order height-1 elements in a certain way ? I need to access the > facet (resp. edge) opposite to a vertex in a tet (resp. triangle). > Yes. Now that I have pretty much settled on it, I will put it in the manual. 
It is currently here: https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/plexinterpolate.c#L56 All normals are outward facing, but hopefully the ordering in the sourse file makes sense. Thanks, Matt > Thanks > > -- > Nicolas > > > > Thanks, > > > > Matt > > > > If not, I'll fallback to computing them by hand for now. Is the > > following assumption safe or do I have to use DMPlexGetOrientedFace? > > > if I call P0P1P2P3 a tet and note x the cross product, > > > P3P2xP3P1 is the outward normal to face P1P2P3 > > > P0P2xP0P3 " P0P2P3 > > > P3P1xP3P0 " P0P1P3 > > > P0P1xP0P2 " P0P1P2 > > > > Thanks > > > > -- > > Nicolas > > > > > > Thanks, > > > > > > Matt > > > > > > So far I've been doing it by hand, and after a lot of > > experimenting the > > > past weeks, it seems that if I call P0P1P2P3 a tetrahedron > > and note x > > > the cross product, > > > P3P2xP3P1 is the outward normal to face P1P2P3 > > > P0P2xP0P3 " P0P2P3 > > > P3P1xP3P0 " P0P1P3 > > > P0P1xP0P2 " P0P1P2 > > > Have I been lucky but can't expect it to be true ? > > > > > > (Alternatively, there is a link between the normals and the > > element > > > Jacobian, but I don't know the formula and can find them) > > > > > > > > > Thanks, > > > > > > -- > > > Nicolas > > > > > > On 08/02/2021 15:19, Matthew Knepley wrote: > > > > On Mon, Feb 8, 2021 at 6:01 AM Nicolas Barral > > > > > > > > > > > > > > > > > > > >>> wrote: > > > > > > > > Hi all, > > > > > > > > Can I make any assumption on the orientation of > triangular > > > facets in a > > > > tetrahedral plex ? I need the inward facet normals. Do > > I need > > > to use > > > > DMPlexGetOrientedFace or can I rely on either the tet > > vertices > > > > ordering, > > > > or the faces ordering ? Could > > DMPlexGetRawFaces_Internal be > > > enough ? > > > > > > > > > > > > You can do it by hand, but you have to account for the face > > > orientation > > > > relative to the cell. That is what > > > > DMPlexGetOrientedFace() does. I think it would be easier > > to use the > > > > function below. > > > > > > > > Alternatively, is there a function that computes the > > normals > > > - without > > > > bringing out the big guns ? > > > > > > > > > > > > This will compute the normals > > > > > > > > > > > > > > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexComputeCellGeometryFVM.html > > > > Should not be too heavy weight. > > > > > > > > THanks, > > > > > > > > Matt > > > > > > > > Thanks > > > > > > > > -- > > > > Nicolas > > > > > > > > > > > > > > > > -- > > > > What most experimenters take for granted before they begin > > their > > > > experiments is infinitely more interesting than any > > results to which > > > > their experiments lead. > > > > -- Norbert Wiener > > > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > > > experiments is infinitely more interesting than any results to > which > > > their experiments lead. > > > -- Norbert Wiener > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. 
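For concreteness, a minimal sketch of the calls discussed above (an illustration only, not code from this thread; it assumes an interpolated 3D plex "dm", a cell point "cell", one of its vertex points "vertex", and a height-1 facet point "face"):

  /* Facet of "cell" opposite a given vertex, found by scanning facet closures
     rather than relying on a memorized cone ordering. */
  static PetscErrorCode GetOppositeFace(DM dm, PetscInt cell, PetscInt vertex, PetscInt *oppFace)
  {
    const PetscInt *faces;
    PetscInt        Nf, f;
    PetscErrorCode  ierr;

    PetscFunctionBeginUser;
    *oppFace = -1;
    ierr = DMPlexGetConeSize(dm, cell, &Nf);CHKERRQ(ierr);
    ierr = DMPlexGetCone(dm, cell, &faces);CHKERRQ(ierr);
    for (f = 0; f < Nf; ++f) {
      PetscInt *closure = NULL, Ncl, cl;
      PetscBool found = PETSC_FALSE;

      ierr = DMPlexGetTransitiveClosure(dm, faces[f], PETSC_TRUE, &Ncl, &closure);CHKERRQ(ierr);
      for (cl = 0; cl < 2*Ncl; cl += 2) if (closure[cl] == vertex) {found = PETSC_TRUE; break;}
      ierr = DMPlexRestoreTransitiveClosure(dm, faces[f], PETSC_TRUE, &Ncl, &closure);CHKERRQ(ierr);
      if (!found) {*oppFace = faces[f]; break;}
    }
    PetscFunctionReturn(0);
  }

  /* Non-normalized (area-weighted) facet normal: DMPlexComputeCellGeometryFVM()
     returns the facet area and a unit normal for a height-1 point, so scaling
     the unit normal by the area gives |N| = area.  (The by-hand cross product
     of two edge vectors gives 2*area*nhat, hence the factor of 2 mentioned
     above.)  The sign follows the facet's own orientation; to make it outward
     with respect to a particular support cell, DMPlexGetOrientedFace() is
     still the thing to use, as discussed earlier in the thread. */
  static PetscErrorCode GetAreaWeightedNormal(DM dm, PetscInt face, PetscReal N[])
  {
    PetscReal      area, centroid[3], n[3];
    PetscInt       d;
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = DMPlexComputeCellGeometryFVM(dm, face, &area, centroid, n);CHKERRQ(ierr);
    for (d = 0; d < 3; ++d) N[d] = area * n[d];
    PetscFunctionReturn(0);
  }
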
> > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ < > http://www.cse.buffalo.edu/~knepley/> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Sun Mar 7 18:40:00 2021 From: jed at jedbrown.org (Jed Brown) Date: Sun, 07 Mar 2021 17:40:00 -0700 Subject: [petsc-users] DMPlex in Firedrake: scaling of mesh distribution In-Reply-To: References: <7dde1a8f447e4880b96f3dd7c4b315ae@MBX-LS5.itorg.ad.buffalo.edu> Message-ID: <877dmijx5r.fsf@jedbrown.org> There is some use of Iscatterv in SF implementations (though it looks like perhaps not PetscSFBcast where the root nodes are consolidated on a root rank). We should perhaps have a function that analyzes the graph to set the type rather than requiring the caller to PetscSFSetType. Barry Smith writes: > Mark, > > Thanks for the numbers. > > Extremely problematic. DMPlexDistribute takes 88 percent of the total run time, SFBcastOpEnd takes 80 percent. > > Probably Matt is right, PetscSF is flooding the network which it cannot handle. IMHO fixing PetscSF would be a far better route than writing all kinds of fancy DMPLEX hierarchical distributors. PetscSF needs to detect that it is sending too many messages together and do the messaging in appropriate waves; at the moment PetscSF is as dumb as stone it just shoves everything out as fast as it can. Junchao needs access to this machine. If everything in PETSc will depend on PetscSF then it simply has to scale on systems where you cannot just flood the network with MPI. > > Barry > > > Mesh Partition 1 1.0 5.0133e+02 1.0 0.00e+00 0.0 1.3e+05 2.7e+02 6.0e+00 15 0 0 0 0 15 0 0 0 1 0 > Mesh Migration 1 1.0 1.5494e+03 1.0 0.00e+00 0.0 7.3e+05 1.9e+02 2.4e+01 45 0 0 0 1 46 0 0 0 2 0 > DMPlexPartStrtSF 1 1.0 4.9474e+023520.8 0.00e+00 0.0 3.3e+04 4.3e+00.0e+00 14 0 0 0 0 15 0 0 0 0 0 > DMPlexPointSF 1 1.0 9.8750e+021264.8 0.00e+00 0.0 6.6e+04 5.4e+00.0e+00 28 0 0 0 0 29 0 0 0 0 0 > DMPlexDistribute 1 1.0 3.0000e+03 1.5 0.00e+00 0.0 9.3e+05 2.3e+02 3.0e+01 88 0 0 0 2 90 0 0 0 3 0 > DMPlexDistCones 1 1.0 1.0688e+03 2.6 0.00e+00 0.0 1.8e+05 3.1e+02 1.0e+00 31 0 0 0 0 31 0 0 0 0 0 > DMPlexDistLabels 1 1.0 2.9172e+02 1.0 0.00e+00 0.0 3.1e+05 1.9e+02 2.1e+01 9 0 0 0 1 9 0 0 0 2 0 > DMPlexDistField 1 1.0 1.8688e+02 1.2 0.00e+00 0.0 2.1e+05 9.3e+01 1.0e+00 5 0 0 0 0 5 0 0 0 0 0 > SFSetUp 62 1.0 7.3283e+0213.6 0.00e+00 0.0 2.0e+07 2.7e+04 0.0e+00 5 0 1 3 0 5 0 6 9 0 0 > SFBcastOpBegin 107 1.0 1.5770e+00452.5 0.00e+00 0.0 2.1e+07 1.8e+04 0.0e+00 0 0 1 2 0 0 0 6 6 0 0 > SFBcastOpEnd 107 1.0 2.9430e+03 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 80 0 0 0 0 82 0 0 0 0 0 > SFDistSection 9 1.0 4.4325e+02 1.5 0.00e+00 0.0 2.8e+06 1.1e+04 9.0e+00 11 0 0 0 0 11 0 1 1 1 0 > SFSectionSF 11 1.0 2.3898e+02 4.7 0.00e+00 0.0 9.2e+05 1.7e+05 0.0e+00 5 0 0 1 0 5 0 0 2 0 0 > >> On Mar 7, 2021, at 7:35 AM, Mark Adams wrote: >> >> And this data puts one cell per process, distributes, and then refines 5 (or 2,3,4 in plot) times. >> >> On Sun, Mar 7, 2021 at 8:27 AM Mark Adams > wrote: >> FWIW, Here is the output from ex13 on 32K processes (8K Fugaku nodes/sockets, 4 MPI/node, which seems recommended) with 128^3 vertex mesh (64^3 Q2 3D Laplacian). >> Almost an hour. >> Attached is solver scaling. 
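For concreteness, the manual selection referred to above looks like the sketch below for an SF the caller can actually reach, e.g. a DM's point SF (the type names are existing PetscSF implementations; the SFs built internally by DMPlexDistribute() are not reachable this way, which is part of the argument for picking the type automatically from the graph):

  PetscSF        sf;
  PetscErrorCode ierr;

  ierr = DMGetPointSF(dm, &sf);CHKERRQ(ierr);                /* SF describing shared mesh points */
  ierr = PetscSFSetType(sf, PETSCSFNEIGHBOR);CHKERRQ(ierr);  /* or PETSCSFWINDOW, PETSCSFALLGATHERV, ... */
  ierr = PetscSFSetFromOptions(sf);CHKERRQ(ierr);            /* allows an -sf_type run-time override */

Whether any of these back-ends behaves better than the default at 32k ranks is exactly the open question here.
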
>> >> >> 0 SNES Function norm 3.658334849208e+00 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> 1 SNES Function norm 1.609000373074e-12 >> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> Linear solve converged due to CONVERGED_RTOL iterations 22 >> ************************************************************************************************************************ >> *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** >> ************************************************************************************************************************ >> >> ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- >> >> ../ex13 on a named i07-4008c with 32768 processors, by a04199 Fri Feb 12 23:27:13 2021 >> Using Petsc Development GIT revision: v3.14.4-579-g4cb72fa GIT Date: 2021-02-05 15:19:40 +0000 >> >> Max Max/Min Avg Total >> Time (sec): 3.373e+03 1.000 3.373e+03 >> Objects: 1.055e+05 14.797 7.144e+03 >> Flop: 5.376e+10 1.176 4.885e+10 1.601e+15 >> Flop/sec: 1.594e+07 1.176 1.448e+07 4.745e+11 >> MPI Messages: 6.048e+05 30.010 8.833e+04 2.894e+09 >> MPI Message Lengths: 1.127e+09 4.132 6.660e+03 1.928e+13 >> MPI Reductions: 1.824e+03 1.000 >> >> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) >> e.g., VecAXPY() for real vectors of length N --> 2N flop >> and VecAXPY() for complex vectors of length N --> 8N flop >> >> Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- >> Avg %Total Avg %Total Count %Total Avg %Total Count %Total >> 0: Main Stage: 3.2903e+03 97.5% 2.4753e+14 15.5% 3.538e+08 12.2% 1.779e+04 32.7% 9.870e+02 54.1% >> 1: PCSetUp: 4.3062e+01 1.3% 1.8160e+13 1.1% 1.902e+07 0.7% 3.714e+04 3.7% 1.590e+02 8.7% >> 2: KSP Solve only: 3.9685e+01 1.2% 1.3349e+15 83.4% 2.522e+09 87.1% 4.868e+03 63.7% 6.700e+02 36.7% >> >> ------------------------------------------------------------------------------------------------------------------------ >> See the 'Profiling' chapter of the users' manual for details on interpreting output. >> Phase summary info: >> Count: number of times phase was executed >> Time and Flop: Max - maximum over all processors >> Ratio - ratio of maximum to minimum over all processors >> Mess: number of messages sent >> AvgLen: average message length (bytes) >> Reduct: number of global reductions >> Global: entire computation >> Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
>> %T - percent time in this phase %F - percent flop in this phase >> %M - percent messages in this phase %L - percent message lengths in this phase >> %R - percent reductions in this phase >> Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) >> ------------------------------------------------------------------------------------------------------------------------ >> Event Count Time (sec) Flop --- Global --- --- Stage ---- Total >> Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> ------------------------------------------------------------------------------------------------------------------------ >> >> --- Event Stage 0: Main Stage >> >> PetscBarrier 5 1.0 1.9907e+00 2.2 0.00e+00 0.0 3.8e+06 7.7e+01 2.0e+01 0 0 0 0 1 0 0 1 0 2 0 >> BuildTwoSided 62 1.0 7.3272e+0214.1 0.00e+00 0.0 6.7e+06 8.0e+00 0.0e+00 5 0 0 0 0 5 0 2 0 0 0 >> BuildTwoSidedF 59 1.0 3.1132e+01 7.4 0.00e+00 0.0 4.8e+06 2.5e+05 0.0e+00 0 0 0 6 0 0 0 1 19 0 0 >> SNESSolve 1 1.0 1.7468e+02 1.0 7.83e+09 1.3 3.4e+08 1.3e+04 8.8e+02 5 13 12 23 48 5 85 96 70 89 1205779 >> SNESSetUp 1 1.0 2.4195e+01 1.0 0.00e+00 0.0 3.7e+06 3.7e+05 1.3e+01 1 0 0 7 1 1 0 1 22 1 0 >> SNESFunctionEval 3 1.0 1.1359e+01 1.2 1.17e+09 1.0 1.6e+06 1.4e+04 2.0e+00 0 2 0 0 0 0 15 0 0 0 3344744 >> SNESJacobianEval 2 1.0 1.6829e+02 1.0 1.52e+09 1.0 1.1e+06 8.3e+05 0.0e+00 5 3 0 5 0 5 20 0 14 0 293588 >> DMCreateMat 1 1.0 2.4107e+01 1.0 0.00e+00 0.0 3.7e+06 3.7e+05 1.3e+01 1 0 0 7 1 1 0 1 22 1 0 >> Mesh Partition 1 1.0 5.0133e+02 1.0 0.00e+00 0.0 1.3e+05 2.7e+02 6.0e+00 15 0 0 0 0 15 0 0 0 1 0 >> Mesh Migration 1 1.0 1.5494e+03 1.0 0.00e+00 0.0 7.3e+05 1.9e+02 2.4e+01 45 0 0 0 1 46 0 0 0 2 0 >> DMPlexPartSelf 1 1.0 1.1498e+002367.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> DMPlexPartLblInv 1 1.0 3.6698e+00 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> DMPlexPartLblSF 1 1.0 2.8522e-01 1.7 0.00e+00 0.0 4.9e+04 1.5e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> DMPlexPartStrtSF 1 1.0 4.9474e+023520.8 0.00e+00 0.0 3.3e+04 4.3e+02 0.0e+00 14 0 0 0 0 15 0 0 0 0 0 >> DMPlexPointSF 1 1.0 9.8750e+021264.8 0.00e+00 0.0 6.6e+04 5.4e+02 0.0e+00 28 0 0 0 0 29 0 0 0 0 0 >> DMPlexInterp 84 1.0 4.3219e-0158.6 0.00e+00 0.0 0.0e+00 0.0e+00 5.0e+00 0 0 0 0 0 0 0 0 0 1 0 >> DMPlexDistribute 1 1.0 3.0000e+03 1.5 0.00e+00 0.0 9.3e+05 2.3e+02 3.0e+01 88 0 0 0 2 90 0 0 0 3 0 >> DMPlexDistCones 1 1.0 1.0688e+03 2.6 0.00e+00 0.0 1.8e+05 3.1e+02 1.0e+00 31 0 0 0 0 31 0 0 0 0 0 >> DMPlexDistLabels 1 1.0 2.9172e+02 1.0 0.00e+00 0.0 3.1e+05 1.9e+02 2.1e+01 9 0 0 0 1 9 0 0 0 2 0 >> DMPlexDistField 1 1.0 1.8688e+02 1.2 0.00e+00 0.0 2.1e+05 9.3e+01 1.0e+00 5 0 0 0 0 5 0 0 0 0 0 >> DMPlexStratify 118 1.0 6.2852e+023280.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.6e+01 1 0 0 0 1 1 0 0 0 2 0 >> DMPlexSymmetrize 118 1.0 6.7634e-02 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> DMPlexPrealloc 1 1.0 2.3741e+01 1.0 0.00e+00 0.0 3.7e+06 3.7e+05 1.1e+01 1 0 0 7 1 1 0 1 22 1 0 >> DMPlexResidualFE 3 1.0 1.0634e+01 1.2 1.16e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 15 0 0 0 3569848 >> DMPlexJacobianFE 2 1.0 1.6809e+02 1.0 1.51e+09 1.0 6.5e+05 1.4e+06 0.0e+00 5 3 0 5 0 5 20 0 14 0 293801 >> SFSetGraph 87 1.0 2.7673e-03 3.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFSetUp 62 1.0 7.3283e+0213.6 0.00e+00 0.0 2.0e+07 2.7e+04 0.0e+00 5 0 1 3 0 5 0 6 9 0 0 >> SFBcastOpBegin 107 1.0 1.5770e+00452.5 0.00e+00 0.0 2.1e+07 1.8e+04 0.0e+00 0 0 1 2 0 0 0 6 6 0 0 >> SFBcastOpEnd 
107 1.0 2.9430e+03 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 80 0 0 0 0 82 0 0 0 0 0 >> SFReduceBegin 12 1.0 2.4825e-01172.8 0.00e+00 0.0 2.4e+06 2.0e+05 0.0e+00 0 0 0 2 0 0 0 1 8 0 0 >> SFReduceEnd 12 1.0 3.8286e+014865.8 3.74e+04 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 31 >> SFFetchOpBegin 2 1.0 2.4497e-0390.2 0.00e+00 0.0 4.3e+05 3.5e+05 0.0e+00 0 0 0 1 0 0 0 0 2 0 0 >> SFFetchOpEnd 2 1.0 6.1349e-0210.9 0.00e+00 0.0 4.3e+05 3.5e+05 0.0e+00 0 0 0 1 0 0 0 0 2 0 0 >> SFCreateEmbed 3 1.0 3.6800e+013261.5 0.00e+00 0.0 4.7e+05 1.7e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFDistSection 9 1.0 4.4325e+02 1.5 0.00e+00 0.0 2.8e+06 1.1e+04 9.0e+00 11 0 0 0 0 11 0 1 1 1 0 >> SFSectionSF 11 1.0 2.3898e+02 4.7 0.00e+00 0.0 9.2e+05 1.7e+05 0.0e+00 5 0 0 1 0 5 0 0 2 0 0 >> SFRemoteOff 2 1.0 3.2868e-0143.1 0.00e+00 0.0 8.7e+05 8.2e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFPack 1023 1.0 2.5215e-0176.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFUnpack 1025 1.0 5.1600e-0216.8 5.62e+0521.3 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 54693 >> MatMult 1549525.4 3.4810e+00 1.3 4.35e+09 1.1 2.2e+08 6.1e+03 0.0e+00 0 8 8 7 0 0 54 62 21 0 38319208 >> MatMultAdd 132 1.0 6.9168e-01 3.0 7.97e+07 1.2 2.8e+07 4.6e+02 0.0e+00 0 0 1 0 0 0 1 8 0 0 3478717 >> MatMultTranspose 132 1.0 5.9967e-01 1.6 8.00e+07 1.2 3.0e+07 4.5e+02 0.0e+00 0 0 1 0 0 0 1 9 0 0 4015214 >> MatSolve 22 0.0 6.8431e-04 0.0 7.41e+05 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1082 >> MatLUFactorSym 1 1.0 5.9569e-0433.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatLUFactorNum 1 1.0 1.6236e-03773.2 1.46e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 897 >> MatConvert 6 1.0 1.4290e-01 1.2 0.00e+00 0.0 3.0e+06 3.7e+03 0.0e+00 0 0 0 0 0 0 0 1 0 0 0 >> MatScale 18 1.0 3.7962e-01 1.3 4.11e+07 1.2 2.0e+06 5.5e+03 0.0e+00 0 0 0 0 0 0 0 1 0 0 3253392 >> MatResidual 132 1.0 6.8256e-01 1.4 8.27e+08 1.2 4.4e+07 5.5e+03 0.0e+00 0 2 2 1 0 0 10 13 4 0 36282014 >> MatAssemblyBegin 244 1.0 3.1181e+01 6.6 0.00e+00 0.0 4.8e+06 2.5e+05 0.0e+00 0 0 0 6 0 0 0 1 19 0 0 >> MatAssemblyEnd 244 1.0 6.3232e+00 1.9 3.17e+06 6.9 0.0e+00 0.0e+00 1.4e+02 0 0 0 0 8 0 0 0 0 15 7655 >> MatGetRowIJ 1 0.0 2.5780e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatCreateSubMat 10 1.0 1.5162e+00 1.0 0.00e+00 0.0 1.6e+05 3.4e+05 1.3e+02 0 0 0 0 7 0 0 0 1 13 0 >> MatGetOrdering 1 0.0 1.0899e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatCoarsen 6 1.0 3.5837e-01 1.3 0.00e+00 0.0 1.6e+07 1.2e+04 3.9e+01 0 0 1 1 2 0 0 5 3 4 0 >> MatZeroEntries 8 1.0 5.3730e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatAXPY 6 1.0 2.6245e-01 1.1 2.66e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 33035 >> MatTranspose 12 1.0 3.0731e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatMatMultSym 18 1.0 2.1398e+00 1.4 0.00e+00 0.0 6.1e+06 5.5e+03 4.8e+01 0 0 0 0 3 0 0 2 1 5 0 >> MatMatMultNum 6 1.0 1.1243e+00 1.0 3.76e+07 1.2 2.0e+06 5.5e+03 0.0e+00 0 0 0 0 0 0 0 1 0 0 1001203 >> MatPtAPSymbolic 6 1.0 1.7280e+01 1.0 0.00e+00 0.0 1.2e+07 3.2e+04 4.2e+01 1 0 0 2 2 1 0 3 6 4 0 >> MatPtAPNumeric 6 1.0 1.8047e+01 1.0 1.49e+09 5.1 2.8e+06 1.1e+05 2.4e+01 1 1 0 2 1 1 5 1 5 2 663675 >> MatTrnMatMultSym 1 1.0 3.0221e+01 1.0 0.00e+00 0.0 2.4e+06 5.8e+05 1.1e+01 1 0 0 7 1 1 0 1 22 1 0 >> MatGetLocalMat 19 1.0 1.3904e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatGetBrAoCol 18 1.0 1.9926e-01 5.0 0.00e+00 0.0 1.4e+07 2.3e+04 0.0e+00 0 0 0 2 0 0 0 4 5 0 0 >> 
MatGetSymTrans 2 1.0 1.8996e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecTDot 176 1.0 7.0632e-01 4.5 3.48e+07 1.0 0.0e+00 0.0e+00 1.8e+02 0 0 0 0 10 0 0 0 0 18 1608728 >> VecNorm 60 1.0 1.4074e+0012.2 1.58e+07 1.0 0.0e+00 0.0e+00 6.0e+01 0 0 0 0 3 0 0 0 0 6 366467 >> VecCopy 422 1.0 5.1259e-02 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecSet 653 1.0 2.3974e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecAXPY 165 1.0 6.5622e-03 1.3 3.42e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 170485467 >> VecAYPX 861 1.0 7.8529e-02 1.2 6.21e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 25785252 >> VecAXPBYCZ 264 1.0 4.1343e-02 1.5 5.85e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 46135592 >> VecAssemblyBegin 21 1.0 2.3463e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecAssemblyEnd 21 1.0 1.4457e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecPointwiseMult 600 1.0 5.7510e-02 1.2 2.66e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 15075754 >> VecScatterBegin 902 1.0 5.1188e-01 1.2 0.00e+00 0.0 2.9e+08 5.3e+03 0.0e+00 0 0 10 8 0 0 0 82 25 0 0 >> VecScatterEnd 902 1.0 1.2143e+00 3.2 5.50e+0537.9 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1347 >> VecSetRandom 6 1.0 2.6354e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> DualSpaceSetUp 7 1.0 5.3467e-0112.0 4.26e+03 1.0 0.0e+00 0.0e+00 1.3e+01 0 0 0 0 1 0 0 0 0 1 261 >> FESetUp 7 1.0 1.7541e-01128.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPSetUp 15 1.0 2.7470e-01 1.1 2.04e+08 1.2 1.0e+07 5.5e+03 1.3e+02 0 0 0 0 7 0 2 3 1 13 22477233 >> KSPSolve 1 1.0 4.3257e+00 1.0 4.33e+09 1.1 2.5e+08 4.8e+03 6.6e+01 0 8 9 6 4 0 54 72 20 7 30855976 >> PCGAMGGraph_AGG 6 1.0 5.0969e+00 1.0 3.76e+07 1.2 5.1e+06 4.4e+03 4.8e+01 0 0 0 0 3 0 0 1 0 5 220852 >> PCGAMGCoarse_AGG 6 1.0 3.1121e+01 1.0 0.00e+00 0.0 2.5e+07 6.9e+04 5.5e+01 1 0 1 9 3 1 0 7 27 6 0 >> PCGAMGProl_AGG 6 1.0 5.8196e-01 1.0 0.00e+00 0.0 6.6e+06 9.3e+03 7.2e+01 0 0 0 0 4 0 0 2 1 7 0 >> PCGAMGPOpt_AGG 6 1.0 3.2414e+00 1.0 2.42e+08 1.2 2.1e+07 5.3e+03 1.6e+02 0 0 1 1 9 0 3 6 2 17 2256493 >> GAMG: createProl 6 1.0 4.0042e+01 1.0 2.80e+08 1.2 5.8e+07 3.3e+04 3.4e+02 1 1 2 10 19 1 3 16 31 34 210778 >> Graph 12 1.0 5.0926e+00 1.0 3.76e+07 1.2 5.1e+06 4.4e+03 4.8e+01 0 0 0 0 3 0 0 1 0 5 221038 >> MIS/Agg 6 1.0 3.5850e-01 1.3 0.00e+00 0.0 1.6e+07 1.2e+04 3.9e+01 0 0 1 1 2 0 0 5 3 4 0 >> SA: col data 6 1.0 3.0509e-01 1.0 0.00e+00 0.0 5.4e+06 9.2e+03 2.4e+01 0 0 0 0 1 0 0 2 1 2 0 >> SA: frmProl0 6 1.0 2.3467e-01 1.1 0.00e+00 0.0 1.3e+06 9.5e+03 2.4e+01 0 0 0 0 1 0 0 0 0 2 0 >> SA: smooth 6 1.0 2.7855e+00 1.0 4.14e+07 1.2 8.1e+06 5.5e+03 6.3e+01 0 0 0 0 3 0 1 2 1 6 446491 >> GAMG: partLevel 6 1.0 3.7266e+01 1.0 1.49e+09 5.1 1.5e+07 4.9e+04 3.2e+02 1 1 1 4 17 1 5 4 12 32 321395 >> repartition 5 1.0 2.0343e+00 1.1 0.00e+00 0.0 4.0e+05 1.4e+05 2.5e+02 0 0 0 0 14 0 0 0 1 25 0 >> Invert-Sort 5 1.0 1.5021e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+01 0 0 0 0 2 0 0 0 0 3 0 >> Move A 5 1.0 1.1548e+00 1.0 0.00e+00 0.0 1.6e+05 3.4e+05 7.0e+01 0 0 0 0 4 0 0 0 1 7 0 >> Move P 5 1.0 4.2799e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 7.5e+01 0 0 0 0 4 0 0 0 0 8 0 >> PCGAMG Squ l00 1 1.0 3.0221e+01 1.0 0.00e+00 0.0 2.4e+06 5.8e+05 1.1e+01 1 0 0 7 1 1 0 1 22 1 0 >> PCGAMG Gal l00 1 1.0 8.7411e+00 1.0 2.93e+08 1.1 5.4e+06 4.5e+04 1.2e+01 0 1 0 1 1 0 4 2 4 1 1092355 >> PCGAMG Opt l00 1 1.0 1.9734e+00 1.0 3.36e+07 1.1 3.2e+06 1.2e+04 9.0e+00 0 0 0 0 0 0 0 
1 1 1 555327 >> PCGAMG Gal l01 1 1.0 1.0153e+00 1.0 3.50e+07 1.4 5.9e+06 3.9e+04 1.2e+01 0 0 0 1 1 0 0 2 4 1 1079887 >> PCGAMG Opt l01 1 1.0 7.4812e-02 1.0 5.35e+05 1.2 3.2e+06 1.1e+03 9.0e+00 0 0 0 0 0 0 0 1 0 1 232542 >> PCGAMG Gal l02 1 1.0 1.8063e+00 1.0 7.43e+07 0.0 3.0e+06 5.9e+04 1.2e+01 0 0 0 1 1 0 0 1 3 1 593392 >> PCGAMG Opt l02 1 1.0 1.1580e-01 1.1 6.93e+05 0.0 1.6e+06 1.3e+03 9.0e+00 0 0 0 0 0 0 0 0 0 1 93213 >> PCGAMG Gal l03 1 1.0 6.1075e+00 1.0 2.72e+08 0.0 2.6e+05 9.2e+04 1.1e+01 0 0 0 0 1 0 0 0 0 1 36155 >> PCGAMG Opt l03 1 1.0 8.0836e-02 1.0 1.55e+06 0.0 1.4e+05 1.4e+03 8.0e+00 0 0 0 0 0 0 0 0 0 1 18229 >> PCGAMG Gal l04 1 1.0 1.6203e+01 1.0 9.44e+08 0.0 1.4e+04 3.0e+05 1.1e+01 0 0 0 0 1 0 0 0 0 1 2366 >> PCGAMG Opt l04 1 1.0 1.2663e-01 1.0 2.01e+06 0.0 6.9e+03 2.2e+03 8.0e+00 0 0 0 0 0 0 0 0 0 1 817 >> PCGAMG Gal l05 1 1.0 1.4800e+00 1.0 3.16e+08 0.0 9.0e+01 1.6e+05 1.1e+01 0 0 0 0 1 0 0 0 0 1 796 >> PCGAMG Opt l05 1 1.0 8.1763e-02 1.1 2.50e+06 0.0 4.8e+01 4.6e+03 8.0e+00 0 0 0 0 0 0 0 0 0 1 114 >> PCSetUp 2 1.0 7.7969e+01 1.0 1.97e+09 2.8 8.3e+07 3.3e+04 8.1e+02 2 2 3 14 44 2 11 23 43 82 341051 >> PCSetUpOnBlocks 22 1.0 2.4609e-0317.2 1.46e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 592 >> PCApply 22 1.0 3.6455e+00 1.1 3.57e+09 1.2 2.4e+08 4.3e+03 0.0e+00 0 7 8 5 0 0 43 67 16 0 29434967 >> >> --- Event Stage 1: PCSetUp >> >> BuildTwoSided 4 1.0 1.5980e-01 2.7 0.00e+00 0.0 2.1e+05 8.0e+00 0.0e+00 0 0 0 0 0 0 0 1 0 0 0 >> BuildTwoSidedF 6 1.0 1.3169e+01 5.5 0.00e+00 0.0 1.9e+06 1.9e+05 0.0e+00 0 0 0 2 0 28 0 10 51 0 0 >> SFSetGraph 5 1.0 4.9640e-0519.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFSetUp 4 1.0 1.6038e-01 2.3 0.00e+00 0.0 6.4e+05 9.1e+02 0.0e+00 0 0 0 0 0 0 0 3 0 0 0 >> SFPack 30 1.0 3.3376e-04 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFUnpack 30 1.0 1.2101e-05 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatMult 30 1.0 1.5544e-01 1.5 1.87e+08 1.2 1.0e+07 5.5e+03 0.0e+00 0 0 0 0 0 0 31 53 8 0 35930640 >> MatAssemblyBegin 43 1.0 1.3201e+01 4.7 0.00e+00 0.0 1.9e+06 1.9e+05 0.0e+00 0 0 0 2 0 28 0 10 51 0 0 >> MatAssemblyEnd 43 1.0 1.1159e+01 1.0 2.77e+07705.7 0.0e+00 0.0e+00 2.0e+01 0 0 0 0 1 26 0 0 0 13 1036 >> MatZeroEntries 6 1.0 4.7315e-0410.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatTranspose 12 1.0 2.5142e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatMatMultSym 10 1.0 5.8783e-0117.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatPtAPSymbolic 5 1.0 1.4489e+01 1.0 0.00e+00 0.0 6.2e+06 3.6e+04 3.5e+01 0 0 0 1 2 34 0 32 31 22 0 >> MatPtAPNumeric 6 1.0 2.8457e+01 1.0 1.50e+09 5.1 2.7e+06 1.6e+05 2.0e+01 1 1 0 2 1 66 66 14 61 13 421190 >> MatGetLocalMat 6 1.0 9.8574e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatGetBrAoCol 6 1.0 3.7669e-01 2.3 0.00e+00 0.0 5.1e+06 3.8e+04 0.0e+00 0 0 0 1 0 0 0 27 28 0 0 >> VecTDot 66 1.0 6.5271e-02 4.1 5.85e+06 1.0 0.0e+00 0.0e+00 6.6e+01 0 0 0 0 4 0 1 0 0 42 2922260 >> VecNorm 36 1.0 1.1226e-02 3.2 3.19e+06 1.0 0.0e+00 0.0e+00 3.6e+01 0 0 0 0 2 0 1 0 0 23 9268067 >> VecCopy 12 1.0 1.2805e-03 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecSet 11 1.0 6.6620e-05 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecAXPY 60 1.0 1.0763e-03 1.5 5.32e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 161104914 >> VecAYPX 24 1.0 2.0581e-03 1.3 2.13e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 33701038 >> VecPointwiseMult 36 
1.0 3.5709e-03 1.3 1.60e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 14567861 >> VecScatterBegin 30 1.0 2.9079e-03 7.8 0.00e+00 0.0 1.0e+07 5.5e+03 0.0e+00 0 0 0 0 0 0 0 53 8 0 0 >> VecScatterEnd 30 1.0 3.7015e-0263.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPSetUp 7 1.0 2.3165e-01 1.0 2.04e+08 1.2 1.0e+07 5.5e+03 1.0e+02 0 0 0 0 6 1 34 53 8 64 26654598 >> PCGAMG Gal l00 1 1.0 4.7415e+00 1.0 2.94e+08 1.1 1.8e+06 7.8e+04 0.0e+00 0 1 0 1 0 11 53 9 20 0 2015623 >> PCGAMG Gal l01 1 1.0 1.2103e+00 1.0 3.50e+07 1.4 4.8e+06 6.2e+04 1.2e+01 0 0 0 2 1 3 6 25 41 8 905938 >> PCGAMG Gal l02 1 1.0 3.4334e+00 1.0 7.41e+07 0.0 2.2e+06 8.7e+04 1.2e+01 0 0 0 1 1 8 6 11 27 8 312184 >> PCGAMG Gal l03 1 1.0 9.6062e+00 1.0 2.71e+08 0.0 1.9e+05 1.3e+05 1.1e+01 0 0 0 0 1 22 1 1 4 7 22987 >> PCGAMG Gal l04 1 1.0 2.2482e+01 1.0 9.43e+08 0.0 8.7e+03 4.8e+05 1.1e+01 1 0 0 0 1 52 0 0 1 7 1705 >> PCGAMG Gal l05 1 1.0 1.5961e+00 1.1 3.16e+08 0.0 6.8e+01 2.2e+05 1.1e+01 0 0 0 0 1 4 0 0 0 7 738 >> PCSetUp 1 1.0 4.3191e+01 1.0 1.70e+09 3.6 1.9e+07 3.7e+04 1.6e+02 1 1 1 4 9 100100100100100 420463 >> >> --- Event Stage 2: KSP Solve only >> >> SFPack 8140 1.0 7.4247e-02 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFUnpack 8140 1.0 1.2905e-02 5.2 5.50e+0637.9 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1267207 >> MatMult 5500 1.0 2.9994e+01 1.2 3.98e+10 1.1 2.0e+09 6.1e+03 0.0e+00 1 76 68 62 0 70 92 78 98 0 40747181 >> MatMultAdd 1320 1.0 6.2192e+00 2.7 7.97e+08 1.2 2.8e+08 4.6e+02 0.0e+00 0 2 10 1 0 14 2 11 1 0 3868976 >> MatMultTranspose 1320 1.0 4.0304e+00 1.7 8.00e+08 1.2 2.8e+08 4.6e+02 0.0e+00 0 2 10 1 0 7 2 11 1 0 5974153 >> MatSolve 220 0.0 6.7366e-03 0.0 7.41e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1100 >> MatLUFactorSym 1 1.0 5.8691e-0435.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatLUFactorNum 1 1.0 1.5955e-03756.2 1.46e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 913 >> MatResidual 1320 1.0 6.4920e+00 1.3 8.27e+09 1.2 4.4e+08 5.5e+03 0.0e+00 0 15 15 13 0 14 19 18 20 0 38146350 >> MatGetRowIJ 1 0.0 2.7820e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatGetOrdering 1 0.0 9.6940e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecTDot 440 1.0 4.6162e+00 6.9 2.31e+08 1.0 0.0e+00 0.0e+00 4.4e+02 0 0 0 0 24 5 1 0 0 66 1635124 >> VecNorm 230 1.0 3.9605e-02 1.6 1.21e+08 1.0 0.0e+00 0.0e+00 2.3e+02 0 0 0 0 13 0 0 0 0 34 99622387 >> VecCopy 3980 1.0 5.4166e-01 4.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecSet 4640 1.0 1.4216e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecAXPY 440 1.0 4.2829e-02 1.3 2.31e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 176236363 >> VecAYPX 8130 1.0 7.3998e-01 1.2 5.78e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 2 1 0 0 0 25489392 >> VecAXPBYCZ 2640 1.0 3.9974e-01 1.5 5.85e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 1 1 0 0 0 47716315 >> VecPointwiseMult 5280 1.0 5.9845e-01 1.5 2.34e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 1 0 0 0 12748927 >> VecScatterBegin 8140 1.0 4.9231e-01 5.9 0.00e+00 0.0 2.5e+09 4.9e+03 0.0e+00 0 0 87 64 0 1 0100100 0 0 >> VecScatterEnd 8140 1.0 1.0172e+01 3.6 5.50e+0637.9 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 13 0 0 0 0 1608 >> KSPSetUp 1 1.0 9.5996e-07 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPSolve 10 1.0 3.9685e+01 1.0 4.33e+10 1.1 2.5e+09 4.9e+03 6.7e+02 1 83 87 64 37 100100100100100 33637495 >> PCSetUp 1 1.0 2.4149e-0318.1 1.46e+06 0.0 0.0e+00 0.0e+00 
0.0e+00 0 0 0 0 0 0 0 0 0 0 603 >> PCSetUpOnBlocks 220 1.0 2.6945e-03 8.9 1.46e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 540 >> PCApply 220 1.0 3.2921e+01 1.1 3.57e+10 1.2 2.3e+09 4.3e+03 0.0e+00 1 67 81 53 0 81 80 93 82 0 32595360 >> ------------------------------------------------------------------------------------------------------------------------ >> >> Memory usage is given in bytes: >> >> Object Type Creations Destructions Memory Descendants' Mem. >> Reports information only for process 0. >> >> --- Event Stage 0: Main Stage >> >> Container 112 112 69888 0. >> SNES 1 1 1532 0. >> DMSNES 1 1 720 0. >> Distributed Mesh 449 449 30060888 0. >> DM Label 790 790 549840 0. >> Quadrature 579 579 379824 0. >> Index Set 100215 100210 361926232 0. >> IS L to G Mapping 8 13 4356552 0. >> Section 771 771 598296 0. >> Star Forest Graph 897 897 1053640 0. >> Discrete System 521 521 533512 0. >> GraphPartitioner 118 118 91568 0. >> Matrix 432 462 2441805304 0. >> Matrix Coarsen 6 6 4032 0. >> Vector 354 354 65492968 0. >> Linear Space 7 7 5208 0. >> Dual Space 111 111 113664 0. >> FE Space 7 7 5992 0. >> Field over DM 6 6 4560 0. >> Krylov Solver 21 21 37560 0. >> DMKSP interface 1 1 704 0. >> Preconditioner 21 21 21632 0. >> Viewer 2 1 896 0. >> PetscRandom 12 12 8520 0. >> >> --- Event Stage 1: PCSetUp >> >> Index Set 10 15 85367336 0. >> IS L to G Mapping 5 0 0 0. >> Star Forest Graph 5 5 6600 0. >> Matrix 50 20 73134024 0. >> Vector 28 28 6235096 0. >> >> --- Event Stage 2: KSP Solve only >> >> Index Set 5 5 8296 0. >> Matrix 1 1 273856 0. >> ======================================================================================================================== >> Average time to get PetscTime(): 6.40051e-08 >> Average time for MPI_Barrier(): 8.506e-06 >> Average time for zero size MPI_Send(): 6.6027e-06 >> #PETSc Option Table entries: >> -benchmark_it 10 >> -dm_distribute >> -dm_plex_box_dim 3 >> -dm_plex_box_faces 32,32,32 >> -dm_plex_box_lower 0,0,0 >> -dm_plex_box_simplex 0 >> -dm_plex_box_upper 1,1,1 >> -dm_refine 5 >> -ksp_converged_reason >> -ksp_max_it 150 >> -ksp_norm_type unpreconditioned >> -ksp_rtol 1.e-12 >> -ksp_type cg >> -log_view >> -matptap_via scalable >> -mg_levels_esteig_ksp_max_it 5 >> -mg_levels_esteig_ksp_type cg >> -mg_levels_ksp_max_it 2 >> -mg_levels_ksp_type chebyshev >> -mg_levels_pc_type jacobi >> -pc_gamg_agg_nsmooths 1 >> -pc_gamg_coarse_eq_limit 2000 >> -pc_gamg_coarse_grid_layout_type spread >> -pc_gamg_esteig_ksp_max_it 5 >> -pc_gamg_esteig_ksp_type cg >> -pc_gamg_process_eq_limit 500 >> -pc_gamg_repartition false >> -pc_gamg_reuse_interpolation true >> -pc_gamg_square_graph 1 >> -pc_gamg_threshold 0.01 >> -pc_gamg_threshold_scale .5 >> -pc_gamg_type agg >> -pc_type gamg >> -petscpartitioner_simple_node_grid 8,8,8 >> -petscpartitioner_simple_process_grid 4,4,4 >> -petscpartitioner_type simple >> -potential_petscspace_degree 2 >> -snes_converged_reason >> -snes_max_it 1 >> -snes_monitor >> -snes_rtol 1.e-8 >> -snes_type ksponly >> #End of PETSc Option Table entries >> Compiled without FORTRAN kernels >> Compiled with 64 bit PetscInt >> Compiled with full precision matrices (default) >> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8 >> Configure options: CC=mpifccpx CXX=mpiFCCpx CFLAGS="-L /opt/FJSVxtclanga/tcsds-1.2.29/lib64 -lfjlapack" CXXFLAGS="-L /opt/FJSVxtclanga/tcsds-1.2.29/lib64 -lfjlapack" COPTFLAGS=-Kfast CXXOPTFLAGS=-Kfast --with-fc=0 
--package-prefix-hash=/home/ra010009/a04199/petsc-hash-pkgs --with-batch=1 --with-shared-libraries=yes --with-debugging=no --with-64-bit-indices=1 PETSC_ARCH=arch-fugaku-fujitsu >> ----------------------------------------- >> Libraries compiled on 2021-02-12 02:27:41 on fn01sv08 >> Machine characteristics: Linux-3.10.0-957.27.2.el7.x86_64-x86_64-with-redhat-7.6-Maipo >> Using PETSc directory: /home/ra010009/a04199/petsc >> Using PETSc arch: >> ----------------------------------------- >> >> Using C compiler: mpifccpx -L /opt/FJSVxtclanga/tcsds-1.2.29/lib64 -lfjlapack -fPIC -Kfast >> ----------------------------------------- >> >> Using include paths: -I/home/ra010009/a04199/petsc/include -I/home/ra010009/a04199/petsc/arch-fugaku-fujitsu/include >> ----------------------------------------- >> >> Using C linker: mpifccpx >> Using libraries: -Wl,-rpath,/home/ra010009/a04199/petsc/lib -L/home/ra010009/a04199/petsc/lib -lpetsc -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/lib/gcc/aarch64-linux-gnu/8 -L/opt/FJSVxos/devkit/aarch64/lib/gcc/aarch64-linux-gnu/8 -Wl,-rpath,/opt/FJSVxtclanga/tcsds-1.2.29/lib64 -L/opt/FJSVxtclanga/tcsds-1.2.29/lib64 -Wl,-rpath,/opt/FJSVxtclanga/.common/MELI022/lib64 -L/opt/FJSVxtclanga/.common/MELI022/lib64 -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/aarch64-linux-gnu/lib64 -L/opt/FJSVxos/devkit/aarch64/aarch64-linux-gnu/lib64 -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/rfs/usr/lib64 -L/opt/FJSVxos/devkit/aarch64/rfs/usr/lib64 -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/rfs/opt/FJSVxos/mmm/lib64 -L/opt/FJSVxos/devkit/aarch64/rfs/opt/FJSVxos/mmm/lib64 -Wl,-rpath,/opt/FJSVxtclanga/tcsds-1.2.29/lib64/nofjobj -L/opt/FJSVxtclanga/tcsds-1.2.29/lib64/nofjobj -lX11 -lfjprofmpi -lfjlapack -ldl -lmpi_cxx -lmpi -lfjstring_internal -lfj90i -lfj90fmt_sve -lfj90f -lfjsrcinfo -lfjcrt -lfjprofcore -lfjprofomp -lfjc++ -lfjc++abi -lfjdemgl -lmpg -lm -lrt -lpthread -lelf -lz -lgcc_s -ldl >> ----------------------------------------- >> From junchao.zhang at gmail.com Sun Mar 7 20:30:57 2021 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Sun, 7 Mar 2021 20:30:57 -0600 Subject: [petsc-users] DMPlex in Firedrake: scaling of mesh distribution In-Reply-To: <877dmijx5r.fsf@jedbrown.org> References: <7dde1a8f447e4880b96f3dd7c4b315ae@MBX-LS5.itorg.ad.buffalo.edu> <877dmijx5r.fsf@jedbrown.org> Message-ID: Yes, we should investigate whether DMPlex used PetscSF in a wrong way or PetscSF needs to detect this pattern. I need to learn more about DMPlex. --Junchao Zhang On Sun, Mar 7, 2021 at 6:40 PM Jed Brown wrote: > There is some use of Iscatterv in SF implementations (though it looks like > perhaps not PetscSFBcast where the root nodes are consolidated on a root > rank). > > We should perhaps have a function that analyzes the graph to set the type > rather than requiring the caller to PetscSFSetType. > > Barry Smith writes: > > > Mark, > > > > Thanks for the numbers. > > > > Extremely problematic. DMPlexDistribute takes 88 percent of the total > run time, SFBcastOpEnd takes 80 percent. > > > > Probably Matt is right, PetscSF is flooding the network which it > cannot handle. IMHO fixing PetscSF would be a far better route than writing > all kinds of fancy DMPLEX hierarchical distributors. PetscSF needs to > detect that it is sending too many messages together and do the messaging > in appropriate waves; at the moment PetscSF is as dumb as stone it just > shoves everything out as fast as it can. Junchao needs access to this > machine. 
If everything in PETSc will depend on PetscSF then it simply has > to scale on systems where you cannot just flood the network with MPI. > > > > Barry > > > > > > Mesh Partition 1 1.0 5.0133e+02 1.0 0.00e+00 0.0 1.3e+05 2.7e+02 > 6.0e+00 15 0 0 0 0 15 0 0 0 1 0 > > Mesh Migration 1 1.0 1.5494e+03 1.0 0.00e+00 0.0 7.3e+05 1.9e+02 > 2.4e+01 45 0 0 0 1 46 0 0 0 2 0 > > DMPlexPartStrtSF 1 1.0 4.9474e+023520.8 0.00e+00 0.0 3.3e+04 > 4.3e+00.0e+00 14 0 0 0 0 15 0 0 0 0 0 > > DMPlexPointSF 1 1.0 9.8750e+021264.8 0.00e+00 0.0 6.6e+04 > 5.4e+00.0e+00 28 0 0 0 0 29 0 0 0 0 0 > > DMPlexDistribute 1 1.0 3.0000e+03 1.5 0.00e+00 0.0 9.3e+05 2.3e+02 > 3.0e+01 88 0 0 0 2 90 0 0 0 3 0 > > DMPlexDistCones 1 1.0 1.0688e+03 2.6 0.00e+00 0.0 1.8e+05 3.1e+02 > 1.0e+00 31 0 0 0 0 31 0 0 0 0 0 > > DMPlexDistLabels 1 1.0 2.9172e+02 1.0 0.00e+00 0.0 3.1e+05 1.9e+02 > 2.1e+01 9 0 0 0 1 9 0 0 0 2 0 > > DMPlexDistField 1 1.0 1.8688e+02 1.2 0.00e+00 0.0 2.1e+05 9.3e+01 > 1.0e+00 5 0 0 0 0 5 0 0 0 0 0 > > SFSetUp 62 1.0 7.3283e+0213.6 0.00e+00 0.0 2.0e+07 2.7e+04 > 0.0e+00 5 0 1 3 0 5 0 6 9 0 0 > > SFBcastOpBegin 107 1.0 1.5770e+00452.5 0.00e+00 0.0 2.1e+07 > 1.8e+04 0.0e+00 0 0 1 2 0 0 0 6 6 0 0 > > SFBcastOpEnd 107 1.0 2.9430e+03 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 80 0 0 0 0 82 0 0 0 0 0 > > SFDistSection 9 1.0 4.4325e+02 1.5 0.00e+00 0.0 2.8e+06 1.1e+04 > 9.0e+00 11 0 0 0 0 11 0 1 1 1 0 > > SFSectionSF 11 1.0 2.3898e+02 4.7 0.00e+00 0.0 9.2e+05 1.7e+05 > 0.0e+00 5 0 0 1 0 5 0 0 2 0 0 > > > >> On Mar 7, 2021, at 7:35 AM, Mark Adams wrote: > >> > >> And this data puts one cell per process, distributes, and then refines > 5 (or 2,3,4 in plot) times. > >> > >> On Sun, Mar 7, 2021 at 8:27 AM Mark Adams mfadams at lbl.gov>> wrote: > >> FWIW, Here is the output from ex13 on 32K processes (8K Fugaku > nodes/sockets, 4 MPI/node, which seems recommended) with 128^3 vertex mesh > (64^3 Q2 3D Laplacian). > >> Almost an hour. > >> Attached is solver scaling. > >> > >> > >> 0 SNES Function norm 3.658334849208e+00 > >> Linear solve converged due to CONVERGED_RTOL iterations 22 > >> 1 SNES Function norm 1.609000373074e-12 > >> Nonlinear solve converged due to CONVERGED_ITS iterations 1 > >> Linear solve converged due to CONVERGED_RTOL iterations 22 > >> Linear solve converged due to CONVERGED_RTOL iterations 22 > >> Linear solve converged due to CONVERGED_RTOL iterations 22 > >> Linear solve converged due to CONVERGED_RTOL iterations 22 > >> Linear solve converged due to CONVERGED_RTOL iterations 22 > >> Linear solve converged due to CONVERGED_RTOL iterations 22 > >> Linear solve converged due to CONVERGED_RTOL iterations 22 > >> Linear solve converged due to CONVERGED_RTOL iterations 22 > >> Linear solve converged due to CONVERGED_RTOL iterations 22 > >> Linear solve converged due to CONVERGED_RTOL iterations 22 > >> > ************************************************************************************************************************ > >> *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r > -fCourier9' to print this document *** > >> > ************************************************************************************************************************ > >> > >> ---------------------------------------------- PETSc Performance > Summary: ---------------------------------------------- > >> > >> ../ex13 on a named i07-4008c with 32768 processors, by a04199 Fri Feb > 12 23:27:13 2021 > >> Using Petsc Development GIT revision: v3.14.4-579-g4cb72fa GIT Date: > 2021-02-05 15:19:40 +0000 > >> > >> Max Max/Min Avg Total > >> Time (sec): 3.373e+03 1.000 3.373e+03 > >> Objects: 1.055e+05 14.797 7.144e+03 > >> Flop: 5.376e+10 1.176 4.885e+10 1.601e+15 > >> Flop/sec: 1.594e+07 1.176 1.448e+07 4.745e+11 > >> MPI Messages: 6.048e+05 30.010 8.833e+04 2.894e+09 > >> MPI Message Lengths: 1.127e+09 4.132 6.660e+03 1.928e+13 > >> MPI Reductions: 1.824e+03 1.000 > >> > >> Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > >> e.g., VecAXPY() for real vectors of length > N --> 2N flop > >> and VecAXPY() for complex vectors of length > N --> 8N flop > >> > >> Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages > --- -- Message Lengths -- -- Reductions -- > >> Avg %Total Avg %Total Count > %Total Avg %Total Count %Total > >> 0: Main Stage: 3.2903e+03 97.5% 2.4753e+14 15.5% 3.538e+08 > 12.2% 1.779e+04 32.7% 9.870e+02 54.1% > >> 1: PCSetUp: 4.3062e+01 1.3% 1.8160e+13 1.1% 1.902e+07 > 0.7% 3.714e+04 3.7% 1.590e+02 8.7% > >> 2: KSP Solve only: 3.9685e+01 1.2% 1.3349e+15 83.4% 2.522e+09 > 87.1% 4.868e+03 63.7% 6.700e+02 36.7% > >> > >> > ------------------------------------------------------------------------------------------------------------------------ > >> See the 'Profiling' chapter of the users' manual for details on > interpreting output. > >> Phase summary info: > >> Count: number of times phase was executed > >> Time and Flop: Max - maximum over all processors > >> Ratio - ratio of maximum to minimum over all > processors > >> Mess: number of messages sent > >> AvgLen: average message length (bytes) > >> Reduct: number of global reductions > >> Global: entire computation > >> Stage: stages of a computation. Set stages with PetscLogStagePush() > and PetscLogStagePop(). 
> >> %T - percent time in this phase %F - percent flop in this > phase > >> %M - percent messages in this phase %L - percent message > lengths in this phase > >> %R - percent reductions in this phase > >> Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time > over all processors) > >> > ------------------------------------------------------------------------------------------------------------------------ > >> Event Count Time (sec) Flop > --- Global --- --- Stage ---- Total > >> Max Ratio Max Ratio Max Ratio Mess > AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > >> > ------------------------------------------------------------------------------------------------------------------------ > >> > >> --- Event Stage 0: Main Stage > >> > >> PetscBarrier 5 1.0 1.9907e+00 2.2 0.00e+00 0.0 3.8e+06 > 7.7e+01 2.0e+01 0 0 0 0 1 0 0 1 0 2 0 > >> BuildTwoSided 62 1.0 7.3272e+0214.1 0.00e+00 0.0 6.7e+06 > 8.0e+00 0.0e+00 5 0 0 0 0 5 0 2 0 0 0 > >> BuildTwoSidedF 59 1.0 3.1132e+01 7.4 0.00e+00 0.0 4.8e+06 > 2.5e+05 0.0e+00 0 0 0 6 0 0 0 1 19 0 0 > >> SNESSolve 1 1.0 1.7468e+02 1.0 7.83e+09 1.3 3.4e+08 > 1.3e+04 8.8e+02 5 13 12 23 48 5 85 96 70 89 1205779 > >> SNESSetUp 1 1.0 2.4195e+01 1.0 0.00e+00 0.0 3.7e+06 > 3.7e+05 1.3e+01 1 0 0 7 1 1 0 1 22 1 0 > >> SNESFunctionEval 3 1.0 1.1359e+01 1.2 1.17e+09 1.0 1.6e+06 > 1.4e+04 2.0e+00 0 2 0 0 0 0 15 0 0 0 3344744 > >> SNESJacobianEval 2 1.0 1.6829e+02 1.0 1.52e+09 1.0 1.1e+06 > 8.3e+05 0.0e+00 5 3 0 5 0 5 20 0 14 0 293588 > >> DMCreateMat 1 1.0 2.4107e+01 1.0 0.00e+00 0.0 3.7e+06 > 3.7e+05 1.3e+01 1 0 0 7 1 1 0 1 22 1 0 > >> Mesh Partition 1 1.0 5.0133e+02 1.0 0.00e+00 0.0 1.3e+05 > 2.7e+02 6.0e+00 15 0 0 0 0 15 0 0 0 1 0 > >> Mesh Migration 1 1.0 1.5494e+03 1.0 0.00e+00 0.0 7.3e+05 > 1.9e+02 2.4e+01 45 0 0 0 1 46 0 0 0 2 0 > >> DMPlexPartSelf 1 1.0 1.1498e+002367.3 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> DMPlexPartLblInv 1 1.0 3.6698e+00 1.5 0.00e+00 0.0 0.0e+00 > 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> DMPlexPartLblSF 1 1.0 2.8522e-01 1.7 0.00e+00 0.0 4.9e+04 > 1.5e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> DMPlexPartStrtSF 1 1.0 4.9474e+023520.8 0.00e+00 0.0 3.3e+04 > 4.3e+02 0.0e+00 14 0 0 0 0 15 0 0 0 0 0 > >> DMPlexPointSF 1 1.0 9.8750e+021264.8 0.00e+00 0.0 6.6e+04 > 5.4e+02 0.0e+00 28 0 0 0 0 29 0 0 0 0 0 > >> DMPlexInterp 84 1.0 4.3219e-0158.6 0.00e+00 0.0 0.0e+00 > 0.0e+00 5.0e+00 0 0 0 0 0 0 0 0 0 1 0 > >> DMPlexDistribute 1 1.0 3.0000e+03 1.5 0.00e+00 0.0 9.3e+05 > 2.3e+02 3.0e+01 88 0 0 0 2 90 0 0 0 3 0 > >> DMPlexDistCones 1 1.0 1.0688e+03 2.6 0.00e+00 0.0 1.8e+05 > 3.1e+02 1.0e+00 31 0 0 0 0 31 0 0 0 0 0 > >> DMPlexDistLabels 1 1.0 2.9172e+02 1.0 0.00e+00 0.0 3.1e+05 > 1.9e+02 2.1e+01 9 0 0 0 1 9 0 0 0 2 0 > >> DMPlexDistField 1 1.0 1.8688e+02 1.2 0.00e+00 0.0 2.1e+05 > 9.3e+01 1.0e+00 5 0 0 0 0 5 0 0 0 0 0 > >> DMPlexStratify 118 1.0 6.2852e+023280.9 0.00e+00 0.0 0.0e+00 > 0.0e+00 1.6e+01 1 0 0 0 1 1 0 0 0 2 0 > >> DMPlexSymmetrize 118 1.0 6.7634e-02 2.3 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> DMPlexPrealloc 1 1.0 2.3741e+01 1.0 0.00e+00 0.0 3.7e+06 > 3.7e+05 1.1e+01 1 0 0 7 1 1 0 1 22 1 0 > >> DMPlexResidualFE 3 1.0 1.0634e+01 1.2 1.16e+09 1.0 0.0e+00 > 0.0e+00 0.0e+00 0 2 0 0 0 0 15 0 0 0 3569848 > >> DMPlexJacobianFE 2 1.0 1.6809e+02 1.0 1.51e+09 1.0 6.5e+05 > 1.4e+06 0.0e+00 5 3 0 5 0 5 20 0 14 0 293801 > >> SFSetGraph 87 1.0 2.7673e-03 3.2 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> SFSetUp 62 1.0 7.3283e+0213.6 0.00e+00 0.0 2.0e+07 > 2.7e+04 
0.0e+00 5 0 1 3 0 5 0 6 9 0 0 > >> SFBcastOpBegin 107 1.0 1.5770e+00452.5 0.00e+00 0.0 2.1e+07 > 1.8e+04 0.0e+00 0 0 1 2 0 0 0 6 6 0 0 > >> SFBcastOpEnd 107 1.0 2.9430e+03 4.8 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 80 0 0 0 0 82 0 0 0 0 0 > >> SFReduceBegin 12 1.0 2.4825e-01172.8 0.00e+00 0.0 2.4e+06 > 2.0e+05 0.0e+00 0 0 0 2 0 0 0 1 8 0 0 > >> SFReduceEnd 12 1.0 3.8286e+014865.8 3.74e+04 0.0 0.0e+00 > 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 31 > >> SFFetchOpBegin 2 1.0 2.4497e-0390.2 0.00e+00 0.0 4.3e+05 > 3.5e+05 0.0e+00 0 0 0 1 0 0 0 0 2 0 0 > >> SFFetchOpEnd 2 1.0 6.1349e-0210.9 0.00e+00 0.0 4.3e+05 > 3.5e+05 0.0e+00 0 0 0 1 0 0 0 0 2 0 0 > >> SFCreateEmbed 3 1.0 3.6800e+013261.5 0.00e+00 0.0 4.7e+05 > 1.7e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> SFDistSection 9 1.0 4.4325e+02 1.5 0.00e+00 0.0 2.8e+06 > 1.1e+04 9.0e+00 11 0 0 0 0 11 0 1 1 1 0 > >> SFSectionSF 11 1.0 2.3898e+02 4.7 0.00e+00 0.0 9.2e+05 > 1.7e+05 0.0e+00 5 0 0 1 0 5 0 0 2 0 0 > >> SFRemoteOff 2 1.0 3.2868e-0143.1 0.00e+00 0.0 8.7e+05 > 8.2e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> SFPack 1023 1.0 2.5215e-0176.6 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> SFUnpack 1025 1.0 5.1600e-0216.8 5.62e+0521.3 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 54693 > >> MatMult 1549525.4 3.4810e+00 1.3 4.35e+09 1.1 2.2e+08 > 6.1e+03 0.0e+00 0 8 8 7 0 0 54 62 21 0 38319208 > >> MatMultAdd 132 1.0 6.9168e-01 3.0 7.97e+07 1.2 2.8e+07 > 4.6e+02 0.0e+00 0 0 1 0 0 0 1 8 0 0 3478717 > >> MatMultTranspose 132 1.0 5.9967e-01 1.6 8.00e+07 1.2 3.0e+07 > 4.5e+02 0.0e+00 0 0 1 0 0 0 1 9 0 0 4015214 > >> MatSolve 22 0.0 6.8431e-04 0.0 7.41e+05 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1082 > >> MatLUFactorSym 1 1.0 5.9569e-0433.3 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> MatLUFactorNum 1 1.0 1.6236e-03773.2 1.46e+06 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 897 > >> MatConvert 6 1.0 1.4290e-01 1.2 0.00e+00 0.0 3.0e+06 > 3.7e+03 0.0e+00 0 0 0 0 0 0 0 1 0 0 0 > >> MatScale 18 1.0 3.7962e-01 1.3 4.11e+07 1.2 2.0e+06 > 5.5e+03 0.0e+00 0 0 0 0 0 0 0 1 0 0 3253392 > >> MatResidual 132 1.0 6.8256e-01 1.4 8.27e+08 1.2 4.4e+07 > 5.5e+03 0.0e+00 0 2 2 1 0 0 10 13 4 0 36282014 > >> MatAssemblyBegin 244 1.0 3.1181e+01 6.6 0.00e+00 0.0 4.8e+06 > 2.5e+05 0.0e+00 0 0 0 6 0 0 0 1 19 0 0 > >> MatAssemblyEnd 244 1.0 6.3232e+00 1.9 3.17e+06 6.9 0.0e+00 > 0.0e+00 1.4e+02 0 0 0 0 8 0 0 0 0 15 7655 > >> MatGetRowIJ 1 0.0 2.5780e-05 0.0 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> MatCreateSubMat 10 1.0 1.5162e+00 1.0 0.00e+00 0.0 1.6e+05 > 3.4e+05 1.3e+02 0 0 0 0 7 0 0 0 1 13 0 > >> MatGetOrdering 1 0.0 1.0899e-04 0.0 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> MatCoarsen 6 1.0 3.5837e-01 1.3 0.00e+00 0.0 1.6e+07 > 1.2e+04 3.9e+01 0 0 1 1 2 0 0 5 3 4 0 > >> MatZeroEntries 8 1.0 5.3730e-03 1.2 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> MatAXPY 6 1.0 2.6245e-01 1.1 2.66e+05 1.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 33035 > >> MatTranspose 12 1.0 3.0731e-02 1.3 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> MatMatMultSym 18 1.0 2.1398e+00 1.4 0.00e+00 0.0 6.1e+06 > 5.5e+03 4.8e+01 0 0 0 0 3 0 0 2 1 5 0 > >> MatMatMultNum 6 1.0 1.1243e+00 1.0 3.76e+07 1.2 2.0e+06 > 5.5e+03 0.0e+00 0 0 0 0 0 0 0 1 0 0 1001203 > >> MatPtAPSymbolic 6 1.0 1.7280e+01 1.0 0.00e+00 0.0 1.2e+07 > 3.2e+04 4.2e+01 1 0 0 2 2 1 0 3 6 4 0 > >> MatPtAPNumeric 6 1.0 1.8047e+01 1.0 1.49e+09 5.1 2.8e+06 > 1.1e+05 2.4e+01 1 1 0 2 1 1 5 1 5 2 663675 > >> 
MatTrnMatMultSym 1 1.0 3.0221e+01 1.0 0.00e+00 0.0 2.4e+06 > 5.8e+05 1.1e+01 1 0 0 7 1 1 0 1 22 1 0 > >> MatGetLocalMat 19 1.0 1.3904e-01 1.2 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> MatGetBrAoCol 18 1.0 1.9926e-01 5.0 0.00e+00 0.0 1.4e+07 > 2.3e+04 0.0e+00 0 0 0 2 0 0 0 4 5 0 0 > >> MatGetSymTrans 2 1.0 1.8996e-01 1.2 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> VecTDot 176 1.0 7.0632e-01 4.5 3.48e+07 1.0 0.0e+00 > 0.0e+00 1.8e+02 0 0 0 0 10 0 0 0 0 18 1608728 > >> VecNorm 60 1.0 1.4074e+0012.2 1.58e+07 1.0 0.0e+00 > 0.0e+00 6.0e+01 0 0 0 0 3 0 0 0 0 6 366467 > >> VecCopy 422 1.0 5.1259e-02 3.8 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> VecSet 653 1.0 2.3974e-03 1.2 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> VecAXPY 165 1.0 6.5622e-03 1.3 3.42e+07 1.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 170485467 > >> VecAYPX 861 1.0 7.8529e-02 1.2 6.21e+07 1.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 25785252 > >> VecAXPBYCZ 264 1.0 4.1343e-02 1.5 5.85e+07 1.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 46135592 > >> VecAssemblyBegin 21 1.0 2.3463e-01 1.5 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> VecAssemblyEnd 21 1.0 1.4457e-04 1.6 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> VecPointwiseMult 600 1.0 5.7510e-02 1.2 2.66e+07 1.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 15075754 > >> VecScatterBegin 902 1.0 5.1188e-01 1.2 0.00e+00 0.0 2.9e+08 > 5.3e+03 0.0e+00 0 0 10 8 0 0 0 82 25 0 0 > >> VecScatterEnd 902 1.0 1.2143e+00 3.2 5.50e+0537.9 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1347 > >> VecSetRandom 6 1.0 2.6354e-02 1.4 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> DualSpaceSetUp 7 1.0 5.3467e-0112.0 4.26e+03 1.0 0.0e+00 > 0.0e+00 1.3e+01 0 0 0 0 1 0 0 0 0 1 261 > >> FESetUp 7 1.0 1.7541e-01128.5 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> KSPSetUp 15 1.0 2.7470e-01 1.1 2.04e+08 1.2 1.0e+07 > 5.5e+03 1.3e+02 0 0 0 0 7 0 2 3 1 13 22477233 > >> KSPSolve 1 1.0 4.3257e+00 1.0 4.33e+09 1.1 2.5e+08 > 4.8e+03 6.6e+01 0 8 9 6 4 0 54 72 20 7 30855976 > >> PCGAMGGraph_AGG 6 1.0 5.0969e+00 1.0 3.76e+07 1.2 5.1e+06 > 4.4e+03 4.8e+01 0 0 0 0 3 0 0 1 0 5 220852 > >> PCGAMGCoarse_AGG 6 1.0 3.1121e+01 1.0 0.00e+00 0.0 2.5e+07 > 6.9e+04 5.5e+01 1 0 1 9 3 1 0 7 27 6 0 > >> PCGAMGProl_AGG 6 1.0 5.8196e-01 1.0 0.00e+00 0.0 6.6e+06 > 9.3e+03 7.2e+01 0 0 0 0 4 0 0 2 1 7 0 > >> PCGAMGPOpt_AGG 6 1.0 3.2414e+00 1.0 2.42e+08 1.2 2.1e+07 > 5.3e+03 1.6e+02 0 0 1 1 9 0 3 6 2 17 2256493 > >> GAMG: createProl 6 1.0 4.0042e+01 1.0 2.80e+08 1.2 5.8e+07 > 3.3e+04 3.4e+02 1 1 2 10 19 1 3 16 31 34 210778 > >> Graph 12 1.0 5.0926e+00 1.0 3.76e+07 1.2 5.1e+06 > 4.4e+03 4.8e+01 0 0 0 0 3 0 0 1 0 5 221038 > >> MIS/Agg 6 1.0 3.5850e-01 1.3 0.00e+00 0.0 1.6e+07 > 1.2e+04 3.9e+01 0 0 1 1 2 0 0 5 3 4 0 > >> SA: col data 6 1.0 3.0509e-01 1.0 0.00e+00 0.0 5.4e+06 > 9.2e+03 2.4e+01 0 0 0 0 1 0 0 2 1 2 0 > >> SA: frmProl0 6 1.0 2.3467e-01 1.1 0.00e+00 0.0 1.3e+06 > 9.5e+03 2.4e+01 0 0 0 0 1 0 0 0 0 2 0 > >> SA: smooth 6 1.0 2.7855e+00 1.0 4.14e+07 1.2 8.1e+06 > 5.5e+03 6.3e+01 0 0 0 0 3 0 1 2 1 6 446491 > >> GAMG: partLevel 6 1.0 3.7266e+01 1.0 1.49e+09 5.1 1.5e+07 > 4.9e+04 3.2e+02 1 1 1 4 17 1 5 4 12 32 321395 > >> repartition 5 1.0 2.0343e+00 1.1 0.00e+00 0.0 4.0e+05 > 1.4e+05 2.5e+02 0 0 0 0 14 0 0 0 1 25 0 > >> Invert-Sort 5 1.0 1.5021e-01 1.1 0.00e+00 0.0 0.0e+00 > 0.0e+00 3.0e+01 0 0 0 0 2 0 0 0 0 3 0 > >> Move A 5 1.0 1.1548e+00 1.0 
0.00e+00 0.0 1.6e+05 > 3.4e+05 7.0e+01 0 0 0 0 4 0 0 0 1 7 0 > >> Move P 5 1.0 4.2799e-01 1.1 0.00e+00 0.0 0.0e+00 > 0.0e+00 7.5e+01 0 0 0 0 4 0 0 0 0 8 0 > >> PCGAMG Squ l00 1 1.0 3.0221e+01 1.0 0.00e+00 0.0 2.4e+06 > 5.8e+05 1.1e+01 1 0 0 7 1 1 0 1 22 1 0 > >> PCGAMG Gal l00 1 1.0 8.7411e+00 1.0 2.93e+08 1.1 5.4e+06 > 4.5e+04 1.2e+01 0 1 0 1 1 0 4 2 4 1 1092355 > >> PCGAMG Opt l00 1 1.0 1.9734e+00 1.0 3.36e+07 1.1 3.2e+06 > 1.2e+04 9.0e+00 0 0 0 0 0 0 0 1 1 1 555327 > >> PCGAMG Gal l01 1 1.0 1.0153e+00 1.0 3.50e+07 1.4 5.9e+06 > 3.9e+04 1.2e+01 0 0 0 1 1 0 0 2 4 1 1079887 > >> PCGAMG Opt l01 1 1.0 7.4812e-02 1.0 5.35e+05 1.2 3.2e+06 > 1.1e+03 9.0e+00 0 0 0 0 0 0 0 1 0 1 232542 > >> PCGAMG Gal l02 1 1.0 1.8063e+00 1.0 7.43e+07 0.0 3.0e+06 > 5.9e+04 1.2e+01 0 0 0 1 1 0 0 1 3 1 593392 > >> PCGAMG Opt l02 1 1.0 1.1580e-01 1.1 6.93e+05 0.0 1.6e+06 > 1.3e+03 9.0e+00 0 0 0 0 0 0 0 0 0 1 93213 > >> PCGAMG Gal l03 1 1.0 6.1075e+00 1.0 2.72e+08 0.0 2.6e+05 > 9.2e+04 1.1e+01 0 0 0 0 1 0 0 0 0 1 36155 > >> PCGAMG Opt l03 1 1.0 8.0836e-02 1.0 1.55e+06 0.0 1.4e+05 > 1.4e+03 8.0e+00 0 0 0 0 0 0 0 0 0 1 18229 > >> PCGAMG Gal l04 1 1.0 1.6203e+01 1.0 9.44e+08 0.0 1.4e+04 > 3.0e+05 1.1e+01 0 0 0 0 1 0 0 0 0 1 2366 > >> PCGAMG Opt l04 1 1.0 1.2663e-01 1.0 2.01e+06 0.0 6.9e+03 > 2.2e+03 8.0e+00 0 0 0 0 0 0 0 0 0 1 817 > >> PCGAMG Gal l05 1 1.0 1.4800e+00 1.0 3.16e+08 0.0 9.0e+01 > 1.6e+05 1.1e+01 0 0 0 0 1 0 0 0 0 1 796 > >> PCGAMG Opt l05 1 1.0 8.1763e-02 1.1 2.50e+06 0.0 4.8e+01 > 4.6e+03 8.0e+00 0 0 0 0 0 0 0 0 0 1 114 > >> PCSetUp 2 1.0 7.7969e+01 1.0 1.97e+09 2.8 8.3e+07 > 3.3e+04 8.1e+02 2 2 3 14 44 2 11 23 43 82 341051 > >> PCSetUpOnBlocks 22 1.0 2.4609e-0317.2 1.46e+06 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 592 > >> PCApply 22 1.0 3.6455e+00 1.1 3.57e+09 1.2 2.4e+08 > 4.3e+03 0.0e+00 0 7 8 5 0 0 43 67 16 0 29434967 > >> > >> --- Event Stage 1: PCSetUp > >> > >> BuildTwoSided 4 1.0 1.5980e-01 2.7 0.00e+00 0.0 2.1e+05 > 8.0e+00 0.0e+00 0 0 0 0 0 0 0 1 0 0 0 > >> BuildTwoSidedF 6 1.0 1.3169e+01 5.5 0.00e+00 0.0 1.9e+06 > 1.9e+05 0.0e+00 0 0 0 2 0 28 0 10 51 0 0 > >> SFSetGraph 5 1.0 4.9640e-0519.0 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> SFSetUp 4 1.0 1.6038e-01 2.3 0.00e+00 0.0 6.4e+05 > 9.1e+02 0.0e+00 0 0 0 0 0 0 0 3 0 0 0 > >> SFPack 30 1.0 3.3376e-04 4.7 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> SFUnpack 30 1.0 1.2101e-05 1.7 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> MatMult 30 1.0 1.5544e-01 1.5 1.87e+08 1.2 1.0e+07 > 5.5e+03 0.0e+00 0 0 0 0 0 0 31 53 8 0 35930640 > >> MatAssemblyBegin 43 1.0 1.3201e+01 4.7 0.00e+00 0.0 1.9e+06 > 1.9e+05 0.0e+00 0 0 0 2 0 28 0 10 51 0 0 > >> MatAssemblyEnd 43 1.0 1.1159e+01 1.0 2.77e+07705.7 0.0e+00 > 0.0e+00 2.0e+01 0 0 0 0 1 26 0 0 0 13 1036 > >> MatZeroEntries 6 1.0 4.7315e-0410.7 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> MatTranspose 12 1.0 2.5142e-02 1.4 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> MatMatMultSym 10 1.0 5.8783e-0117.4 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> MatPtAPSymbolic 5 1.0 1.4489e+01 1.0 0.00e+00 0.0 6.2e+06 > 3.6e+04 3.5e+01 0 0 0 1 2 34 0 32 31 22 0 > >> MatPtAPNumeric 6 1.0 2.8457e+01 1.0 1.50e+09 5.1 2.7e+06 > 1.6e+05 2.0e+01 1 1 0 2 1 66 66 14 61 13 421190 > >> MatGetLocalMat 6 1.0 9.8574e-03 1.3 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> MatGetBrAoCol 6 1.0 3.7669e-01 2.3 0.00e+00 0.0 5.1e+06 > 3.8e+04 0.0e+00 0 0 0 1 0 0 0 27 28 0 0 > >> VecTDot 66 1.0 
6.5271e-02 4.1 5.85e+06 1.0 0.0e+00 > 0.0e+00 6.6e+01 0 0 0 0 4 0 1 0 0 42 2922260 > >> VecNorm 36 1.0 1.1226e-02 3.2 3.19e+06 1.0 0.0e+00 > 0.0e+00 3.6e+01 0 0 0 0 2 0 1 0 0 23 9268067 > >> VecCopy 12 1.0 1.2805e-03 3.4 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> VecSet 11 1.0 6.6620e-05 1.4 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> VecAXPY 60 1.0 1.0763e-03 1.5 5.32e+06 1.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 161104914 > >> VecAYPX 24 1.0 2.0581e-03 1.3 2.13e+06 1.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 33701038 > >> VecPointwiseMult 36 1.0 3.5709e-03 1.3 1.60e+06 1.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 14567861 > >> VecScatterBegin 30 1.0 2.9079e-03 7.8 0.00e+00 0.0 1.0e+07 > 5.5e+03 0.0e+00 0 0 0 0 0 0 0 53 8 0 0 > >> VecScatterEnd 30 1.0 3.7015e-0263.0 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> KSPSetUp 7 1.0 2.3165e-01 1.0 2.04e+08 1.2 1.0e+07 > 5.5e+03 1.0e+02 0 0 0 0 6 1 34 53 8 64 26654598 > >> PCGAMG Gal l00 1 1.0 4.7415e+00 1.0 2.94e+08 1.1 1.8e+06 > 7.8e+04 0.0e+00 0 1 0 1 0 11 53 9 20 0 2015623 > >> PCGAMG Gal l01 1 1.0 1.2103e+00 1.0 3.50e+07 1.4 4.8e+06 > 6.2e+04 1.2e+01 0 0 0 2 1 3 6 25 41 8 905938 > >> PCGAMG Gal l02 1 1.0 3.4334e+00 1.0 7.41e+07 0.0 2.2e+06 > 8.7e+04 1.2e+01 0 0 0 1 1 8 6 11 27 8 312184 > >> PCGAMG Gal l03 1 1.0 9.6062e+00 1.0 2.71e+08 0.0 1.9e+05 > 1.3e+05 1.1e+01 0 0 0 0 1 22 1 1 4 7 22987 > >> PCGAMG Gal l04 1 1.0 2.2482e+01 1.0 9.43e+08 0.0 8.7e+03 > 4.8e+05 1.1e+01 1 0 0 0 1 52 0 0 1 7 1705 > >> PCGAMG Gal l05 1 1.0 1.5961e+00 1.1 3.16e+08 0.0 6.8e+01 > 2.2e+05 1.1e+01 0 0 0 0 1 4 0 0 0 7 738 > >> PCSetUp 1 1.0 4.3191e+01 1.0 1.70e+09 3.6 1.9e+07 > 3.7e+04 1.6e+02 1 1 1 4 9 100100100100100 420463 > >> > >> --- Event Stage 2: KSP Solve only > >> > >> SFPack 8140 1.0 7.4247e-02 4.8 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> SFUnpack 8140 1.0 1.2905e-02 5.2 5.50e+0637.9 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1267207 > >> MatMult 5500 1.0 2.9994e+01 1.2 3.98e+10 1.1 2.0e+09 > 6.1e+03 0.0e+00 1 76 68 62 0 70 92 78 98 0 40747181 > >> MatMultAdd 1320 1.0 6.2192e+00 2.7 7.97e+08 1.2 2.8e+08 > 4.6e+02 0.0e+00 0 2 10 1 0 14 2 11 1 0 3868976 > >> MatMultTranspose 1320 1.0 4.0304e+00 1.7 8.00e+08 1.2 2.8e+08 > 4.6e+02 0.0e+00 0 2 10 1 0 7 2 11 1 0 5974153 > >> MatSolve 220 0.0 6.7366e-03 0.0 7.41e+06 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1100 > >> MatLUFactorSym 1 1.0 5.8691e-0435.5 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> MatLUFactorNum 1 1.0 1.5955e-03756.2 1.46e+06 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 913 > >> MatResidual 1320 1.0 6.4920e+00 1.3 8.27e+09 1.2 4.4e+08 > 5.5e+03 0.0e+00 0 15 15 13 0 14 19 18 20 0 38146350 > >> MatGetRowIJ 1 0.0 2.7820e-05 0.0 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> MatGetOrdering 1 0.0 9.6940e-05 0.0 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> VecTDot 440 1.0 4.6162e+00 6.9 2.31e+08 1.0 0.0e+00 > 0.0e+00 4.4e+02 0 0 0 0 24 5 1 0 0 66 1635124 > >> VecNorm 230 1.0 3.9605e-02 1.6 1.21e+08 1.0 0.0e+00 > 0.0e+00 2.3e+02 0 0 0 0 13 0 0 0 0 34 99622387 > >> VecCopy 3980 1.0 5.4166e-01 4.3 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> VecSet 4640 1.0 1.4216e-02 1.2 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> VecAXPY 440 1.0 4.2829e-02 1.3 2.31e+08 1.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 176236363 > >> VecAYPX 8130 1.0 7.3998e-01 1.2 5.78e+08 1.0 0.0e+00 > 0.0e+00 
0.0e+00 0 1 0 0 0 2 1 0 0 0 25489392 > >> VecAXPBYCZ 2640 1.0 3.9974e-01 1.5 5.85e+08 1.0 0.0e+00 > 0.0e+00 0.0e+00 0 1 0 0 0 1 1 0 0 0 47716315 > >> VecPointwiseMult 5280 1.0 5.9845e-01 1.5 2.34e+08 1.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 1 1 0 0 0 12748927 > >> VecScatterBegin 8140 1.0 4.9231e-01 5.9 0.00e+00 0.0 2.5e+09 > 4.9e+03 0.0e+00 0 0 87 64 0 1 0100100 0 0 > >> VecScatterEnd 8140 1.0 1.0172e+01 3.6 5.50e+0637.9 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 13 0 0 0 0 1608 > >> KSPSetUp 1 1.0 9.5996e-07 3.1 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >> KSPSolve 10 1.0 3.9685e+01 1.0 4.33e+10 1.1 2.5e+09 > 4.9e+03 6.7e+02 1 83 87 64 37 100100100100100 33637495 > >> PCSetUp 1 1.0 2.4149e-0318.1 1.46e+06 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 603 > >> PCSetUpOnBlocks 220 1.0 2.6945e-03 8.9 1.46e+06 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 540 > >> PCApply 220 1.0 3.2921e+01 1.1 3.57e+10 1.2 2.3e+09 > 4.3e+03 0.0e+00 1 67 81 53 0 81 80 93 82 0 32595360 > >> > ------------------------------------------------------------------------------------------------------------------------ > >> > >> Memory usage is given in bytes: > >> > >> Object Type Creations Destructions Memory Descendants' > Mem. > >> Reports information only for process 0. > >> > >> --- Event Stage 0: Main Stage > >> > >> Container 112 112 69888 0. > >> SNES 1 1 1532 0. > >> DMSNES 1 1 720 0. > >> Distributed Mesh 449 449 30060888 0. > >> DM Label 790 790 549840 0. > >> Quadrature 579 579 379824 0. > >> Index Set 100215 100210 361926232 0. > >> IS L to G Mapping 8 13 4356552 0. > >> Section 771 771 598296 0. > >> Star Forest Graph 897 897 1053640 0. > >> Discrete System 521 521 533512 0. > >> GraphPartitioner 118 118 91568 0. > >> Matrix 432 462 2441805304 0. > >> Matrix Coarsen 6 6 4032 0. > >> Vector 354 354 65492968 0. > >> Linear Space 7 7 5208 0. > >> Dual Space 111 111 113664 0. > >> FE Space 7 7 5992 0. > >> Field over DM 6 6 4560 0. > >> Krylov Solver 21 21 37560 0. > >> DMKSP interface 1 1 704 0. > >> Preconditioner 21 21 21632 0. > >> Viewer 2 1 896 0. > >> PetscRandom 12 12 8520 0. > >> > >> --- Event Stage 1: PCSetUp > >> > >> Index Set 10 15 85367336 0. > >> IS L to G Mapping 5 0 0 0. > >> Star Forest Graph 5 5 6600 0. > >> Matrix 50 20 73134024 0. > >> Vector 28 28 6235096 0. > >> > >> --- Event Stage 2: KSP Solve only > >> > >> Index Set 5 5 8296 0. > >> Matrix 1 1 273856 0. 
> >> > ======================================================================================================================== > >> Average time to get PetscTime(): 6.40051e-08 > >> Average time for MPI_Barrier(): 8.506e-06 > >> Average time for zero size MPI_Send(): 6.6027e-06 > >> #PETSc Option Table entries: > >> -benchmark_it 10 > >> -dm_distribute > >> -dm_plex_box_dim 3 > >> -dm_plex_box_faces 32,32,32 > >> -dm_plex_box_lower 0,0,0 > >> -dm_plex_box_simplex 0 > >> -dm_plex_box_upper 1,1,1 > >> -dm_refine 5 > >> -ksp_converged_reason > >> -ksp_max_it 150 > >> -ksp_norm_type unpreconditioned > >> -ksp_rtol 1.e-12 > >> -ksp_type cg > >> -log_view > >> -matptap_via scalable > >> -mg_levels_esteig_ksp_max_it 5 > >> -mg_levels_esteig_ksp_type cg > >> -mg_levels_ksp_max_it 2 > >> -mg_levels_ksp_type chebyshev > >> -mg_levels_pc_type jacobi > >> -pc_gamg_agg_nsmooths 1 > >> -pc_gamg_coarse_eq_limit 2000 > >> -pc_gamg_coarse_grid_layout_type spread > >> -pc_gamg_esteig_ksp_max_it 5 > >> -pc_gamg_esteig_ksp_type cg > >> -pc_gamg_process_eq_limit 500 > >> -pc_gamg_repartition false > >> -pc_gamg_reuse_interpolation true > >> -pc_gamg_square_graph 1 > >> -pc_gamg_threshold 0.01 > >> -pc_gamg_threshold_scale .5 > >> -pc_gamg_type agg > >> -pc_type gamg > >> -petscpartitioner_simple_node_grid 8,8,8 > >> -petscpartitioner_simple_process_grid 4,4,4 > >> -petscpartitioner_type simple > >> -potential_petscspace_degree 2 > >> -snes_converged_reason > >> -snes_max_it 1 > >> -snes_monitor > >> -snes_rtol 1.e-8 > >> -snes_type ksponly > >> #End of PETSc Option Table entries > >> Compiled without FORTRAN kernels > >> Compiled with 64 bit PetscInt > >> Compiled with full precision matrices (default) > >> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 sizeof(PetscInt) 8 > >> Configure options: CC=mpifccpx CXX=mpiFCCpx CFLAGS="-L > /opt/FJSVxtclanga/tcsds-1.2.29/lib64 -lfjlapack" CXXFLAGS="-L > /opt/FJSVxtclanga/tcsds-1.2.29/lib64 -lfjlapack" COPTFLAGS=-Kfast > CXXOPTFLAGS=-Kfast --with-fc=0 > --package-prefix-hash=/home/ra010009/a04199/petsc-hash-pkgs --with-batch=1 > --with-shared-libraries=yes --with-debugging=no --with-64-bit-indices=1 > PETSC_ARCH=arch-fugaku-fujitsu > >> ----------------------------------------- > >> Libraries compiled on 2021-02-12 02:27:41 on fn01sv08 > >> Machine characteristics: > Linux-3.10.0-957.27.2.el7.x86_64-x86_64-with-redhat-7.6-Maipo > >> Using PETSc directory: /home/ra010009/a04199/petsc > >> Using PETSc arch: > >> ----------------------------------------- > >> > >> Using C compiler: mpifccpx -L /opt/FJSVxtclanga/tcsds-1.2.29/lib64 > -lfjlapack -fPIC -Kfast > >> ----------------------------------------- > >> > >> Using include paths: -I/home/ra010009/a04199/petsc/include > -I/home/ra010009/a04199/petsc/arch-fugaku-fujitsu/include > >> ----------------------------------------- > >> > >> Using C linker: mpifccpx > >> Using libraries: -Wl,-rpath,/home/ra010009/a04199/petsc/lib > -L/home/ra010009/a04199/petsc/lib -lpetsc > -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/lib/gcc/aarch64-linux-gnu/8 > -L/opt/FJSVxos/devkit/aarch64/lib/gcc/aarch64-linux-gnu/8 > -Wl,-rpath,/opt/FJSVxtclanga/tcsds-1.2.29/lib64 > -L/opt/FJSVxtclanga/tcsds-1.2.29/lib64 > -Wl,-rpath,/opt/FJSVxtclanga/.common/MELI022/lib64 > -L/opt/FJSVxtclanga/.common/MELI022/lib64 > -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/aarch64-linux-gnu/lib64 > -L/opt/FJSVxos/devkit/aarch64/aarch64-linux-gnu/lib64 > -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/rfs/usr/lib64 > 
-L/opt/FJSVxos/devkit/aarch64/rfs/usr/lib64 > -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/rfs/opt/FJSVxos/mmm/lib64 > -L/opt/FJSVxos/devkit/aarch64/rfs/opt/FJSVxos/mmm/lib64 > -Wl,-rpath,/opt/FJSVxtclanga/tcsds-1.2.29/lib64/nofjobj > -L/opt/FJSVxtclanga/tcsds-1.2.29/lib64/nofjobj -lX11 -lfjprofmpi -lfjlapack > -ldl -lmpi_cxx -lmpi -lfjstring_internal -lfj90i -lfj90fmt_sve -lfj90f > -lfjsrcinfo -lfjcrt -lfjprofcore -lfjprofomp -lfjc++ -lfjc++abi -lfjdemgl > -lmpg -lm -lrt -lpthread -lelf -lz -lgcc_s -ldl > >> -----------------------------------------

From stefano.zampini at gmail.com Sun Mar 7 23:06:05 2021
From: stefano.zampini at gmail.com (Stefano Zampini)
Date: Mon, 8 Mar 2021 08:06:05 +0300
Subject: [petsc-users] DMPlex in Firedrake: scaling of mesh distribution
In-Reply-To: 
References: <7dde1a8f447e4880b96f3dd7c4b315ae@MBX-LS5.itorg.ad.buffalo.edu> <54542fc5f5164ed8a08e796881a41073@MBX-LS5.itorg.ad.buffalo.edu>
Message-ID: 

> I want to understand why calling CreateEmbeddedRootSF() would be an abuse.

It was just sarcasm to emphasize the number of new SFs created. Being a very general code, DMPlex does the right thing and uses the proper calls.

> On Mar 7, 2021, at 10:01 PM, Barry Smith wrote:
>>
>> Mark,
>>
>> Thanks for the numbers.
>>
>> Extremely problematic. DMPlexDistribute takes 88 percent of the total run time, SFBcastOpEnd takes 80 percent.
>>
>> Probably Matt is right, PetscSF is flooding the network which it cannot handle. IMHO fixing PetscSF would be a far better route than writing all kinds of fancy DMPLEX hierarchical distributors. PetscSF needs to detect that it is sending too many messages together and do the messaging in appropriate waves; at the moment PetscSF is as dumb as stone it just shoves everything out as fast as it can. Junchao needs access to this machine.
>> >> Barry >> >> >> Mesh Partition 1 1.0 5.0133e+02 1.0 0.00e+00 0.0 1.3e+05 2.7e+02 >> 6.0e+00 15 0 0 0 0 15 0 0 0 1 0 >> Mesh Migration 1 1.0 1.5494e+03 1.0 0.00e+00 0.0 7.3e+05 1.9e+02 >> 2.4e+01 45 0 0 0 1 46 0 0 0 2 0 >> DMPlexPartStrtSF 1 1.0 4.9474e+023520.8 0.00e+00 0.0 3.3e+04 >> 4.3e+00.0e+00 14 0 0 0 0 15 0 0 0 0 0 >> DMPlexPointSF 1 1.0 9.8750e+021264.8 0.00e+00 0.0 6.6e+04 >> 5.4e+00.0e+00 28 0 0 0 0 29 0 0 0 0 0 >> DMPlexDistribute 1 1.0 3.0000e+03 1.5 0.00e+00 0.0 9.3e+05 2.3e+02 >> 3.0e+01 88 0 0 0 2 90 0 0 0 3 0 >> DMPlexDistCones 1 1.0 1.0688e+03 2.6 0.00e+00 0.0 1.8e+05 3.1e+02 >> 1.0e+00 31 0 0 0 0 31 0 0 0 0 0 >> DMPlexDistLabels 1 1.0 2.9172e+02 1.0 0.00e+00 0.0 3.1e+05 1.9e+02 >> 2.1e+01 9 0 0 0 1 9 0 0 0 2 0 >> DMPlexDistField 1 1.0 1.8688e+02 1.2 0.00e+00 0.0 2.1e+05 9.3e+01 >> 1.0e+00 5 0 0 0 0 5 0 0 0 0 0 >> SFSetUp 62 1.0 7.3283e+0213.6 0.00e+00 0.0 2.0e+07 2.7e+04 >> 0.0e+00 5 0 1 3 0 5 0 6 9 0 0 >> SFBcastOpBegin 107 1.0 1.5770e+00452.5 0.00e+00 0.0 2.1e+07 1.8e+04 >> 0.0e+00 0 0 1 2 0 0 0 6 6 0 0 >> SFBcastOpEnd 107 1.0 2.9430e+03 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 80 0 0 0 0 82 0 0 0 0 0 >> SFDistSection 9 1.0 4.4325e+02 1.5 0.00e+00 0.0 2.8e+06 1.1e+04 >> 9.0e+00 11 0 0 0 0 11 0 1 1 1 0 >> SFSectionSF 11 1.0 2.3898e+02 4.7 0.00e+00 0.0 9.2e+05 1.7e+05 >> 0.0e+00 5 0 0 1 0 5 0 0 2 0 0 >> >> On Mar 7, 2021, at 7:35 AM, Mark Adams wrote: >> >> And this data puts one cell per process, distributes, and then refines 5 >> (or 2,3,4 in plot) times. >> >> On Sun, Mar 7, 2021 at 8:27 AM Mark Adams wrote: >> >>> FWIW, Here is the output from ex13 on 32K processes (8K Fugaku >>> nodes/sockets, 4 MPI/node, which seems recommended) with 128^3 vertex mesh >>> (64^3 Q2 3D Laplacian). >>> Almost an hour. >>> Attached is solver scaling. >>> >>> >>> 0 SNES Function norm 3.658334849208e+00 >>> Linear solve converged due to CONVERGED_RTOL iterations 22 >>> 1 SNES Function norm 1.609000373074e-12 >>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>> Linear solve converged due to CONVERGED_RTOL iterations 22 >>> Linear solve converged due to CONVERGED_RTOL iterations 22 >>> Linear solve converged due to CONVERGED_RTOL iterations 22 >>> Linear solve converged due to CONVERGED_RTOL iterations 22 >>> Linear solve converged due to CONVERGED_RTOL iterations 22 >>> Linear solve converged due to CONVERGED_RTOL iterations 22 >>> Linear solve converged due to CONVERGED_RTOL iterations 22 >>> Linear solve converged due to CONVERGED_RTOL iterations 22 >>> Linear solve converged due to CONVERGED_RTOL iterations 22 >>> Linear solve converged due to CONVERGED_RTOL iterations 22 >>> >>> ************************************************************************************************************************ >>> *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r >>> -fCourier9' to print this document *** >>> >>> ************************************************************************************************************************ >>> >>> ---------------------------------------------- PETSc Performance >>> Summary: ---------------------------------------------- >>> >>> ../ex13 on a named i07-4008c with 32768 processors, by a04199 Fri Feb >>> 12 23:27:13 2021 >>> Using Petsc Development GIT revision: v3.14.4-579-g4cb72fa GIT Date: >>> 2021-02-05 15:19:40 +0000 >>> >>> Max Max/Min Avg Total >>> Time (sec): 3.373e+03 1.000 3.373e+03 >>> Objects: 1.055e+05 14.797 7.144e+03 >>> Flop: 5.376e+10 1.176 4.885e+10 1.601e+15 >>> Flop/sec: 1.594e+07 1.176 1.448e+07 4.745e+11 >>> MPI Messages: 6.048e+05 30.010 8.833e+04 2.894e+09 >>> MPI Message Lengths: 1.127e+09 4.132 6.660e+03 1.928e+13 >>> MPI Reductions: 1.824e+03 1.000 >>> >>> Flop counting convention: 1 flop = 1 real number operation of type >>> (multiply/divide/add/subtract) >>> e.g., VecAXPY() for real vectors of length N >>> --> 2N flop >>> and VecAXPY() for complex vectors of length >>> N --> 8N flop >>> >>> Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages >>> --- -- Message Lengths -- -- Reductions -- >>> Avg %Total Avg %Total Count >>> %Total Avg %Total Count %Total >>> 0: Main Stage: 3.2903e+03 97.5% 2.4753e+14 15.5% 3.538e+08 >>> 12.2% 1.779e+04 32.7% 9.870e+02 54.1% >>> 1: PCSetUp: 4.3062e+01 1.3% 1.8160e+13 1.1% 1.902e+07 >>> 0.7% 3.714e+04 3.7% 1.590e+02 8.7% >>> 2: KSP Solve only: 3.9685e+01 1.2% 1.3349e+15 83.4% 2.522e+09 >>> 87.1% 4.868e+03 63.7% 6.700e+02 36.7% >>> >>> >>> ------------------------------------------------------------------------------------------------------------------------ >>> See the 'Profiling' chapter of the users' manual for details on >>> interpreting output. >>> Phase summary info: >>> Count: number of times phase was executed >>> Time and Flop: Max - maximum over all processors >>> Ratio - ratio of maximum to minimum over all processors >>> Mess: number of messages sent >>> AvgLen: average message length (bytes) >>> Reduct: number of global reductions >>> Global: entire computation >>> Stage: stages of a computation. Set stages with PetscLogStagePush() >>> and PetscLogStagePop(). 
>>> %T - percent time in this phase %F - percent flop in this >>> phase >>> %M - percent messages in this phase %L - percent message >>> lengths in this phase >>> %R - percent reductions in this phase >>> Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time >>> over all processors) >>> >>> ------------------------------------------------------------------------------------------------------------------------ >>> Event Count Time (sec) Flop >>> --- Global --- --- Stage ---- Total >>> Max Ratio Max Ratio Max Ratio Mess AvgLen >>> Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >>> >>> ------------------------------------------------------------------------------------------------------------------------ >>> >>> --- Event Stage 0: Main Stage >>> >>> PetscBarrier 5 1.0 1.9907e+00 2.2 0.00e+00 0.0 3.8e+06 7.7e+01 >>> 2.0e+01 0 0 0 0 1 0 0 1 0 2 0 >>> BuildTwoSided 62 1.0 7.3272e+0214.1 0.00e+00 0.0 6.7e+06 8.0e+00 >>> 0.0e+00 5 0 0 0 0 5 0 2 0 0 0 >>> BuildTwoSidedF 59 1.0 3.1132e+01 7.4 0.00e+00 0.0 4.8e+06 2.5e+05 >>> 0.0e+00 0 0 0 6 0 0 0 1 19 0 0 >>> SNESSolve 1 1.0 1.7468e+02 1.0 7.83e+09 1.3 3.4e+08 1.3e+04 >>> 8.8e+02 5 13 12 23 48 5 85 96 70 89 1205779 >>> SNESSetUp 1 1.0 2.4195e+01 1.0 0.00e+00 0.0 3.7e+06 3.7e+05 >>> 1.3e+01 1 0 0 7 1 1 0 1 22 1 0 >>> SNESFunctionEval 3 1.0 1.1359e+01 1.2 1.17e+09 1.0 1.6e+06 1.4e+04 >>> 2.0e+00 0 2 0 0 0 0 15 0 0 0 3344744 >>> SNESJacobianEval 2 1.0 1.6829e+02 1.0 1.52e+09 1.0 1.1e+06 8.3e+05 >>> 0.0e+00 5 3 0 5 0 5 20 0 14 0 293588 >>> DMCreateMat 1 1.0 2.4107e+01 1.0 0.00e+00 0.0 3.7e+06 3.7e+05 >>> 1.3e+01 1 0 0 7 1 1 0 1 22 1 0 >>> Mesh Partition 1 1.0 5.0133e+02 1.0 0.00e+00 0.0 1.3e+05 2.7e+02 >>> 6.0e+00 15 0 0 0 0 15 0 0 0 1 0 >>> Mesh Migration 1 1.0 1.5494e+03 1.0 0.00e+00 0.0 7.3e+05 1.9e+02 >>> 2.4e+01 45 0 0 0 1 46 0 0 0 2 0 >>> DMPlexPartSelf 1 1.0 1.1498e+002367.3 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> DMPlexPartLblInv 1 1.0 3.6698e+00 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 3.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> DMPlexPartLblSF 1 1.0 2.8522e-01 1.7 0.00e+00 0.0 4.9e+04 1.5e+02 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> DMPlexPartStrtSF 1 1.0 4.9474e+023520.8 0.00e+00 0.0 3.3e+04 >>> 4.3e+02 0.0e+00 14 0 0 0 0 15 0 0 0 0 0 >>> DMPlexPointSF 1 1.0 9.8750e+021264.8 0.00e+00 0.0 6.6e+04 >>> 5.4e+02 0.0e+00 28 0 0 0 0 29 0 0 0 0 0 >>> DMPlexInterp 84 1.0 4.3219e-0158.6 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 5.0e+00 0 0 0 0 0 0 0 0 0 1 0 >>> DMPlexDistribute 1 1.0 3.0000e+03 1.5 0.00e+00 0.0 9.3e+05 2.3e+02 >>> 3.0e+01 88 0 0 0 2 90 0 0 0 3 0 >>> DMPlexDistCones 1 1.0 1.0688e+03 2.6 0.00e+00 0.0 1.8e+05 3.1e+02 >>> 1.0e+00 31 0 0 0 0 31 0 0 0 0 0 >>> DMPlexDistLabels 1 1.0 2.9172e+02 1.0 0.00e+00 0.0 3.1e+05 1.9e+02 >>> 2.1e+01 9 0 0 0 1 9 0 0 0 2 0 >>> DMPlexDistField 1 1.0 1.8688e+02 1.2 0.00e+00 0.0 2.1e+05 9.3e+01 >>> 1.0e+00 5 0 0 0 0 5 0 0 0 0 0 >>> DMPlexStratify 118 1.0 6.2852e+023280.9 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 1.6e+01 1 0 0 0 1 1 0 0 0 2 0 >>> DMPlexSymmetrize 118 1.0 6.7634e-02 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> DMPlexPrealloc 1 1.0 2.3741e+01 1.0 0.00e+00 0.0 3.7e+06 3.7e+05 >>> 1.1e+01 1 0 0 7 1 1 0 1 22 1 0 >>> DMPlexResidualFE 3 1.0 1.0634e+01 1.2 1.16e+09 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 2 0 0 0 0 15 0 0 0 3569848 >>> DMPlexJacobianFE 2 1.0 1.6809e+02 1.0 1.51e+09 1.0 6.5e+05 1.4e+06 >>> 0.0e+00 5 3 0 5 0 5 20 0 14 0 293801 >>> SFSetGraph 87 1.0 2.7673e-03 3.2 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> SFSetUp 62 1.0 7.3283e+0213.6 
0.00e+00 0.0 2.0e+07 2.7e+04 >>> 0.0e+00 5 0 1 3 0 5 0 6 9 0 0 >>> SFBcastOpBegin 107 1.0 1.5770e+00452.5 0.00e+00 0.0 2.1e+07 >>> 1.8e+04 0.0e+00 0 0 1 2 0 0 0 6 6 0 0 >>> SFBcastOpEnd 107 1.0 2.9430e+03 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 80 0 0 0 0 82 0 0 0 0 0 >>> SFReduceBegin 12 1.0 2.4825e-01172.8 0.00e+00 0.0 2.4e+06 >>> 2.0e+05 0.0e+00 0 0 0 2 0 0 0 1 8 0 0 >>> SFReduceEnd 12 1.0 3.8286e+014865.8 3.74e+04 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 31 >>> SFFetchOpBegin 2 1.0 2.4497e-0390.2 0.00e+00 0.0 4.3e+05 3.5e+05 >>> 0.0e+00 0 0 0 1 0 0 0 0 2 0 0 >>> SFFetchOpEnd 2 1.0 6.1349e-0210.9 0.00e+00 0.0 4.3e+05 3.5e+05 >>> 0.0e+00 0 0 0 1 0 0 0 0 2 0 0 >>> SFCreateEmbed 3 1.0 3.6800e+013261.5 0.00e+00 0.0 4.7e+05 >>> 1.7e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> SFDistSection 9 1.0 4.4325e+02 1.5 0.00e+00 0.0 2.8e+06 1.1e+04 >>> 9.0e+00 11 0 0 0 0 11 0 1 1 1 0 >>> SFSectionSF 11 1.0 2.3898e+02 4.7 0.00e+00 0.0 9.2e+05 1.7e+05 >>> 0.0e+00 5 0 0 1 0 5 0 0 2 0 0 >>> SFRemoteOff 2 1.0 3.2868e-0143.1 0.00e+00 0.0 8.7e+05 8.2e+03 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> SFPack 1023 1.0 2.5215e-0176.6 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> SFUnpack 1025 1.0 5.1600e-0216.8 5.62e+0521.3 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 54693 >>> MatMult 1549525.4 3.4810e+00 1.3 4.35e+09 1.1 2.2e+08 6.1e+03 >>> 0.0e+00 0 8 8 7 0 0 54 62 21 0 38319208 >>> MatMultAdd 132 1.0 6.9168e-01 3.0 7.97e+07 1.2 2.8e+07 4.6e+02 >>> 0.0e+00 0 0 1 0 0 0 1 8 0 0 3478717 >>> MatMultTranspose 132 1.0 5.9967e-01 1.6 8.00e+07 1.2 3.0e+07 4.5e+02 >>> 0.0e+00 0 0 1 0 0 0 1 9 0 0 4015214 >>> MatSolve 22 0.0 6.8431e-04 0.0 7.41e+05 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 1082 >>> MatLUFactorSym 1 1.0 5.9569e-0433.3 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatLUFactorNum 1 1.0 1.6236e-03773.2 1.46e+06 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 897 >>> MatConvert 6 1.0 1.4290e-01 1.2 0.00e+00 0.0 3.0e+06 3.7e+03 >>> 0.0e+00 0 0 0 0 0 0 0 1 0 0 0 >>> MatScale 18 1.0 3.7962e-01 1.3 4.11e+07 1.2 2.0e+06 5.5e+03 >>> 0.0e+00 0 0 0 0 0 0 0 1 0 0 3253392 >>> MatResidual 132 1.0 6.8256e-01 1.4 8.27e+08 1.2 4.4e+07 5.5e+03 >>> 0.0e+00 0 2 2 1 0 0 10 13 4 0 36282014 >>> MatAssemblyBegin 244 1.0 3.1181e+01 6.6 0.00e+00 0.0 4.8e+06 2.5e+05 >>> 0.0e+00 0 0 0 6 0 0 0 1 19 0 0 >>> MatAssemblyEnd 244 1.0 6.3232e+00 1.9 3.17e+06 6.9 0.0e+00 0.0e+00 >>> 1.4e+02 0 0 0 0 8 0 0 0 0 15 7655 >>> MatGetRowIJ 1 0.0 2.5780e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatCreateSubMat 10 1.0 1.5162e+00 1.0 0.00e+00 0.0 1.6e+05 3.4e+05 >>> 1.3e+02 0 0 0 0 7 0 0 0 1 13 0 >>> MatGetOrdering 1 0.0 1.0899e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatCoarsen 6 1.0 3.5837e-01 1.3 0.00e+00 0.0 1.6e+07 1.2e+04 >>> 3.9e+01 0 0 1 1 2 0 0 5 3 4 0 >>> MatZeroEntries 8 1.0 5.3730e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatAXPY 6 1.0 2.6245e-01 1.1 2.66e+05 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 33035 >>> MatTranspose 12 1.0 3.0731e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatMatMultSym 18 1.0 2.1398e+00 1.4 0.00e+00 0.0 6.1e+06 5.5e+03 >>> 4.8e+01 0 0 0 0 3 0 0 2 1 5 0 >>> MatMatMultNum 6 1.0 1.1243e+00 1.0 3.76e+07 1.2 2.0e+06 5.5e+03 >>> 0.0e+00 0 0 0 0 0 0 0 1 0 0 1001203 >>> MatPtAPSymbolic 6 1.0 1.7280e+01 1.0 0.00e+00 0.0 1.2e+07 3.2e+04 >>> 4.2e+01 1 0 0 2 2 1 0 3 6 4 0 >>> MatPtAPNumeric 6 1.0 1.8047e+01 1.0 1.49e+09 5.1 
2.8e+06 1.1e+05 >>> 2.4e+01 1 1 0 2 1 1 5 1 5 2 663675 >>> MatTrnMatMultSym 1 1.0 3.0221e+01 1.0 0.00e+00 0.0 2.4e+06 5.8e+05 >>> 1.1e+01 1 0 0 7 1 1 0 1 22 1 0 >>> MatGetLocalMat 19 1.0 1.3904e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatGetBrAoCol 18 1.0 1.9926e-01 5.0 0.00e+00 0.0 1.4e+07 2.3e+04 >>> 0.0e+00 0 0 0 2 0 0 0 4 5 0 0 >>> MatGetSymTrans 2 1.0 1.8996e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecTDot 176 1.0 7.0632e-01 4.5 3.48e+07 1.0 0.0e+00 0.0e+00 >>> 1.8e+02 0 0 0 0 10 0 0 0 0 18 1608728 >>> VecNorm 60 1.0 1.4074e+0012.2 1.58e+07 1.0 0.0e+00 0.0e+00 >>> 6.0e+01 0 0 0 0 3 0 0 0 0 6 366467 >>> VecCopy 422 1.0 5.1259e-02 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecSet 653 1.0 2.3974e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecAXPY 165 1.0 6.5622e-03 1.3 3.42e+07 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 170485467 >>> VecAYPX 861 1.0 7.8529e-02 1.2 6.21e+07 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 1 0 0 0 25785252 >>> VecAXPBYCZ 264 1.0 4.1343e-02 1.5 5.85e+07 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 1 0 0 0 46135592 >>> VecAssemblyBegin 21 1.0 2.3463e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecAssemblyEnd 21 1.0 1.4457e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecPointwiseMult 600 1.0 5.7510e-02 1.2 2.66e+07 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 15075754 >>> VecScatterBegin 902 1.0 5.1188e-01 1.2 0.00e+00 0.0 2.9e+08 5.3e+03 >>> 0.0e+00 0 0 10 8 0 0 0 82 25 0 0 >>> VecScatterEnd 902 1.0 1.2143e+00 3.2 5.50e+0537.9 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 1347 >>> VecSetRandom 6 1.0 2.6354e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> DualSpaceSetUp 7 1.0 5.3467e-0112.0 4.26e+03 1.0 0.0e+00 0.0e+00 >>> 1.3e+01 0 0 0 0 1 0 0 0 0 1 261 >>> FESetUp 7 1.0 1.7541e-01128.5 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> KSPSetUp 15 1.0 2.7470e-01 1.1 2.04e+08 1.2 1.0e+07 5.5e+03 >>> 1.3e+02 0 0 0 0 7 0 2 3 1 13 22477233 >>> KSPSolve 1 1.0 4.3257e+00 1.0 4.33e+09 1.1 2.5e+08 4.8e+03 >>> 6.6e+01 0 8 9 6 4 0 54 72 20 7 30855976 >>> PCGAMGGraph_AGG 6 1.0 5.0969e+00 1.0 3.76e+07 1.2 5.1e+06 4.4e+03 >>> 4.8e+01 0 0 0 0 3 0 0 1 0 5 220852 >>> PCGAMGCoarse_AGG 6 1.0 3.1121e+01 1.0 0.00e+00 0.0 2.5e+07 6.9e+04 >>> 5.5e+01 1 0 1 9 3 1 0 7 27 6 0 >>> PCGAMGProl_AGG 6 1.0 5.8196e-01 1.0 0.00e+00 0.0 6.6e+06 9.3e+03 >>> 7.2e+01 0 0 0 0 4 0 0 2 1 7 0 >>> PCGAMGPOpt_AGG 6 1.0 3.2414e+00 1.0 2.42e+08 1.2 2.1e+07 5.3e+03 >>> 1.6e+02 0 0 1 1 9 0 3 6 2 17 2256493 >>> GAMG: createProl 6 1.0 4.0042e+01 1.0 2.80e+08 1.2 5.8e+07 3.3e+04 >>> 3.4e+02 1 1 2 10 19 1 3 16 31 34 210778 >>> Graph 12 1.0 5.0926e+00 1.0 3.76e+07 1.2 5.1e+06 4.4e+03 >>> 4.8e+01 0 0 0 0 3 0 0 1 0 5 221038 >>> MIS/Agg 6 1.0 3.5850e-01 1.3 0.00e+00 0.0 1.6e+07 1.2e+04 >>> 3.9e+01 0 0 1 1 2 0 0 5 3 4 0 >>> SA: col data 6 1.0 3.0509e-01 1.0 0.00e+00 0.0 5.4e+06 9.2e+03 >>> 2.4e+01 0 0 0 0 1 0 0 2 1 2 0 >>> SA: frmProl0 6 1.0 2.3467e-01 1.1 0.00e+00 0.0 1.3e+06 9.5e+03 >>> 2.4e+01 0 0 0 0 1 0 0 0 0 2 0 >>> SA: smooth 6 1.0 2.7855e+00 1.0 4.14e+07 1.2 8.1e+06 5.5e+03 >>> 6.3e+01 0 0 0 0 3 0 1 2 1 6 446491 >>> GAMG: partLevel 6 1.0 3.7266e+01 1.0 1.49e+09 5.1 1.5e+07 4.9e+04 >>> 3.2e+02 1 1 1 4 17 1 5 4 12 32 321395 >>> repartition 5 1.0 2.0343e+00 1.1 0.00e+00 0.0 4.0e+05 1.4e+05 >>> 2.5e+02 0 0 0 0 14 0 0 0 1 25 0 >>> Invert-Sort 5 1.0 1.5021e-01 1.1 0.00e+00 
0.0 0.0e+00 0.0e+00 >>> 3.0e+01 0 0 0 0 2 0 0 0 0 3 0 >>> Move A 5 1.0 1.1548e+00 1.0 0.00e+00 0.0 1.6e+05 3.4e+05 >>> 7.0e+01 0 0 0 0 4 0 0 0 1 7 0 >>> Move P 5 1.0 4.2799e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 7.5e+01 0 0 0 0 4 0 0 0 0 8 0 >>> PCGAMG Squ l00 1 1.0 3.0221e+01 1.0 0.00e+00 0.0 2.4e+06 5.8e+05 >>> 1.1e+01 1 0 0 7 1 1 0 1 22 1 0 >>> PCGAMG Gal l00 1 1.0 8.7411e+00 1.0 2.93e+08 1.1 5.4e+06 4.5e+04 >>> 1.2e+01 0 1 0 1 1 0 4 2 4 1 1092355 >>> PCGAMG Opt l00 1 1.0 1.9734e+00 1.0 3.36e+07 1.1 3.2e+06 1.2e+04 >>> 9.0e+00 0 0 0 0 0 0 0 1 1 1 555327 >>> PCGAMG Gal l01 1 1.0 1.0153e+00 1.0 3.50e+07 1.4 5.9e+06 3.9e+04 >>> 1.2e+01 0 0 0 1 1 0 0 2 4 1 1079887 >>> PCGAMG Opt l01 1 1.0 7.4812e-02 1.0 5.35e+05 1.2 3.2e+06 1.1e+03 >>> 9.0e+00 0 0 0 0 0 0 0 1 0 1 232542 >>> PCGAMG Gal l02 1 1.0 1.8063e+00 1.0 7.43e+07 0.0 3.0e+06 5.9e+04 >>> 1.2e+01 0 0 0 1 1 0 0 1 3 1 593392 >>> PCGAMG Opt l02 1 1.0 1.1580e-01 1.1 6.93e+05 0.0 1.6e+06 1.3e+03 >>> 9.0e+00 0 0 0 0 0 0 0 0 0 1 93213 >>> PCGAMG Gal l03 1 1.0 6.1075e+00 1.0 2.72e+08 0.0 2.6e+05 9.2e+04 >>> 1.1e+01 0 0 0 0 1 0 0 0 0 1 36155 >>> PCGAMG Opt l03 1 1.0 8.0836e-02 1.0 1.55e+06 0.0 1.4e+05 1.4e+03 >>> 8.0e+00 0 0 0 0 0 0 0 0 0 1 18229 >>> PCGAMG Gal l04 1 1.0 1.6203e+01 1.0 9.44e+08 0.0 1.4e+04 3.0e+05 >>> 1.1e+01 0 0 0 0 1 0 0 0 0 1 2366 >>> PCGAMG Opt l04 1 1.0 1.2663e-01 1.0 2.01e+06 0.0 6.9e+03 2.2e+03 >>> 8.0e+00 0 0 0 0 0 0 0 0 0 1 817 >>> PCGAMG Gal l05 1 1.0 1.4800e+00 1.0 3.16e+08 0.0 9.0e+01 1.6e+05 >>> 1.1e+01 0 0 0 0 1 0 0 0 0 1 796 >>> PCGAMG Opt l05 1 1.0 8.1763e-02 1.1 2.50e+06 0.0 4.8e+01 4.6e+03 >>> 8.0e+00 0 0 0 0 0 0 0 0 0 1 114 >>> PCSetUp 2 1.0 7.7969e+01 1.0 1.97e+09 2.8 8.3e+07 3.3e+04 >>> 8.1e+02 2 2 3 14 44 2 11 23 43 82 341051 >>> PCSetUpOnBlocks 22 1.0 2.4609e-0317.2 1.46e+06 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 592 >>> PCApply 22 1.0 3.6455e+00 1.1 3.57e+09 1.2 2.4e+08 4.3e+03 >>> 0.0e+00 0 7 8 5 0 0 43 67 16 0 29434967 >>> >>> --- Event Stage 1: PCSetUp >>> >>> BuildTwoSided 4 1.0 1.5980e-01 2.7 0.00e+00 0.0 2.1e+05 8.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 1 0 0 0 >>> BuildTwoSidedF 6 1.0 1.3169e+01 5.5 0.00e+00 0.0 1.9e+06 1.9e+05 >>> 0.0e+00 0 0 0 2 0 28 0 10 51 0 0 >>> SFSetGraph 5 1.0 4.9640e-0519.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> SFSetUp 4 1.0 1.6038e-01 2.3 0.00e+00 0.0 6.4e+05 9.1e+02 >>> 0.0e+00 0 0 0 0 0 0 0 3 0 0 0 >>> SFPack 30 1.0 3.3376e-04 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> SFUnpack 30 1.0 1.2101e-05 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatMult 30 1.0 1.5544e-01 1.5 1.87e+08 1.2 1.0e+07 5.5e+03 >>> 0.0e+00 0 0 0 0 0 0 31 53 8 0 35930640 >>> MatAssemblyBegin 43 1.0 1.3201e+01 4.7 0.00e+00 0.0 1.9e+06 1.9e+05 >>> 0.0e+00 0 0 0 2 0 28 0 10 51 0 0 >>> MatAssemblyEnd 43 1.0 1.1159e+01 1.0 2.77e+07705.7 0.0e+00 >>> 0.0e+00 2.0e+01 0 0 0 0 1 26 0 0 0 13 1036 >>> MatZeroEntries 6 1.0 4.7315e-0410.7 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatTranspose 12 1.0 2.5142e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatMatMultSym 10 1.0 5.8783e-0117.4 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatPtAPSymbolic 5 1.0 1.4489e+01 1.0 0.00e+00 0.0 6.2e+06 3.6e+04 >>> 3.5e+01 0 0 0 1 2 34 0 32 31 22 0 >>> MatPtAPNumeric 6 1.0 2.8457e+01 1.0 1.50e+09 5.1 2.7e+06 1.6e+05 >>> 2.0e+01 1 1 0 2 1 66 66 14 61 13 421190 >>> MatGetLocalMat 6 1.0 9.8574e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> 
MatGetBrAoCol 6 1.0 3.7669e-01 2.3 0.00e+00 0.0 5.1e+06 3.8e+04 >>> 0.0e+00 0 0 0 1 0 0 0 27 28 0 0 >>> VecTDot 66 1.0 6.5271e-02 4.1 5.85e+06 1.0 0.0e+00 0.0e+00 >>> 6.6e+01 0 0 0 0 4 0 1 0 0 42 2922260 >>> VecNorm 36 1.0 1.1226e-02 3.2 3.19e+06 1.0 0.0e+00 0.0e+00 >>> 3.6e+01 0 0 0 0 2 0 1 0 0 23 9268067 >>> VecCopy 12 1.0 1.2805e-03 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecSet 11 1.0 6.6620e-05 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecAXPY 60 1.0 1.0763e-03 1.5 5.32e+06 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 1 0 0 0 161104914 >>> VecAYPX 24 1.0 2.0581e-03 1.3 2.13e+06 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 33701038 >>> VecPointwiseMult 36 1.0 3.5709e-03 1.3 1.60e+06 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 14567861 >>> VecScatterBegin 30 1.0 2.9079e-03 7.8 0.00e+00 0.0 1.0e+07 5.5e+03 >>> 0.0e+00 0 0 0 0 0 0 0 53 8 0 0 >>> VecScatterEnd 30 1.0 3.7015e-0263.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> KSPSetUp 7 1.0 2.3165e-01 1.0 2.04e+08 1.2 1.0e+07 5.5e+03 >>> 1.0e+02 0 0 0 0 6 1 34 53 8 64 26654598 >>> PCGAMG Gal l00 1 1.0 4.7415e+00 1.0 2.94e+08 1.1 1.8e+06 7.8e+04 >>> 0.0e+00 0 1 0 1 0 11 53 9 20 0 2015623 >>> PCGAMG Gal l01 1 1.0 1.2103e+00 1.0 3.50e+07 1.4 4.8e+06 6.2e+04 >>> 1.2e+01 0 0 0 2 1 3 6 25 41 8 905938 >>> PCGAMG Gal l02 1 1.0 3.4334e+00 1.0 7.41e+07 0.0 2.2e+06 8.7e+04 >>> 1.2e+01 0 0 0 1 1 8 6 11 27 8 312184 >>> PCGAMG Gal l03 1 1.0 9.6062e+00 1.0 2.71e+08 0.0 1.9e+05 1.3e+05 >>> 1.1e+01 0 0 0 0 1 22 1 1 4 7 22987 >>> PCGAMG Gal l04 1 1.0 2.2482e+01 1.0 9.43e+08 0.0 8.7e+03 4.8e+05 >>> 1.1e+01 1 0 0 0 1 52 0 0 1 7 1705 >>> PCGAMG Gal l05 1 1.0 1.5961e+00 1.1 3.16e+08 0.0 6.8e+01 2.2e+05 >>> 1.1e+01 0 0 0 0 1 4 0 0 0 7 738 >>> PCSetUp 1 1.0 4.3191e+01 1.0 1.70e+09 3.6 1.9e+07 3.7e+04 >>> 1.6e+02 1 1 1 4 9 100100100100100 420463 >>> >>> --- Event Stage 2: KSP Solve only >>> >>> SFPack 8140 1.0 7.4247e-02 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> SFUnpack 8140 1.0 1.2905e-02 5.2 5.50e+0637.9 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 1267207 >>> MatMult 5500 1.0 2.9994e+01 1.2 3.98e+10 1.1 2.0e+09 6.1e+03 >>> 0.0e+00 1 76 68 62 0 70 92 78 98 0 40747181 >>> MatMultAdd 1320 1.0 6.2192e+00 2.7 7.97e+08 1.2 2.8e+08 4.6e+02 >>> 0.0e+00 0 2 10 1 0 14 2 11 1 0 3868976 >>> MatMultTranspose 1320 1.0 4.0304e+00 1.7 8.00e+08 1.2 2.8e+08 4.6e+02 >>> 0.0e+00 0 2 10 1 0 7 2 11 1 0 5974153 >>> MatSolve 220 0.0 6.7366e-03 0.0 7.41e+06 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 1100 >>> MatLUFactorSym 1 1.0 5.8691e-0435.5 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatLUFactorNum 1 1.0 1.5955e-03756.2 1.46e+06 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 913 >>> MatResidual 1320 1.0 6.4920e+00 1.3 8.27e+09 1.2 4.4e+08 5.5e+03 >>> 0.0e+00 0 15 15 13 0 14 19 18 20 0 38146350 >>> MatGetRowIJ 1 0.0 2.7820e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatGetOrdering 1 0.0 9.6940e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecTDot 440 1.0 4.6162e+00 6.9 2.31e+08 1.0 0.0e+00 0.0e+00 >>> 4.4e+02 0 0 0 0 24 5 1 0 0 66 1635124 >>> VecNorm 230 1.0 3.9605e-02 1.6 1.21e+08 1.0 0.0e+00 0.0e+00 >>> 2.3e+02 0 0 0 0 13 0 0 0 0 34 99622387 >>> VecCopy 3980 1.0 5.4166e-01 4.3 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecSet 4640 1.0 1.4216e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecAXPY 440 1.0 
4.2829e-02 1.3 2.31e+08 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 1 0 0 0 176236363 >>> VecAYPX 8130 1.0 7.3998e-01 1.2 5.78e+08 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 1 0 0 0 2 1 0 0 0 25489392 >>> VecAXPBYCZ 2640 1.0 3.9974e-01 1.5 5.85e+08 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 1 0 0 0 1 1 0 0 0 47716315 >>> VecPointwiseMult 5280 1.0 5.9845e-01 1.5 2.34e+08 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 1 1 0 0 0 12748927 >>> VecScatterBegin 8140 1.0 4.9231e-01 5.9 0.00e+00 0.0 2.5e+09 4.9e+03 >>> 0.0e+00 0 0 87 64 0 1 0100100 0 0 >>> VecScatterEnd 8140 1.0 1.0172e+01 3.6 5.50e+0637.9 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 13 0 0 0 0 1608 >>> KSPSetUp 1 1.0 9.5996e-07 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> KSPSolve 10 1.0 3.9685e+01 1.0 4.33e+10 1.1 2.5e+09 4.9e+03 >>> 6.7e+02 1 83 87 64 37 100100100100100 33637495 >>> PCSetUp 1 1.0 2.4149e-0318.1 1.46e+06 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 603 >>> PCSetUpOnBlocks 220 1.0 2.6945e-03 8.9 1.46e+06 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 540 >>> PCApply 220 1.0 3.2921e+01 1.1 3.57e+10 1.2 2.3e+09 4.3e+03 >>> 0.0e+00 1 67 81 53 0 81 80 93 82 0 32595360 >>> >>> ------------------------------------------------------------------------------------------------------------------------ >>> >>> Memory usage is given in bytes: >>> >>> Object Type Creations Destructions Memory Descendants' >>> Mem. >>> Reports information only for process 0. >>> >>> --- Event Stage 0: Main Stage >>> >>> Container 112 112 69888 0. >>> SNES 1 1 1532 0. >>> DMSNES 1 1 720 0. >>> Distributed Mesh 449 449 30060888 0. >>> DM Label 790 790 549840 0. >>> Quadrature 579 579 379824 0. >>> Index Set 100215 100210 361926232 0. >>> IS L to G Mapping 8 13 4356552 0. >>> Section 771 771 598296 0. >>> Star Forest Graph 897 897 1053640 0. >>> Discrete System 521 521 533512 0. >>> GraphPartitioner 118 118 91568 0. >>> Matrix 432 462 2441805304 0. >>> Matrix Coarsen 6 6 4032 0. >>> Vector 354 354 65492968 0. >>> Linear Space 7 7 5208 0. >>> Dual Space 111 111 113664 0. >>> FE Space 7 7 5992 0. >>> Field over DM 6 6 4560 0. >>> Krylov Solver 21 21 37560 0. >>> DMKSP interface 1 1 704 0. >>> Preconditioner 21 21 21632 0. >>> Viewer 2 1 896 0. >>> PetscRandom 12 12 8520 0. >>> >>> --- Event Stage 1: PCSetUp >>> >>> Index Set 10 15 85367336 0. >>> IS L to G Mapping 5 0 0 0. >>> Star Forest Graph 5 5 6600 0. >>> Matrix 50 20 73134024 0. >>> Vector 28 28 6235096 0. >>> >>> --- Event Stage 2: KSP Solve only >>> >>> Index Set 5 5 8296 0. >>> Matrix 1 1 273856 0. 
>>> >>> ======================================================================================================================== >>> Average time to get PetscTime(): 6.40051e-08 >>> Average time for MPI_Barrier(): 8.506e-06 >>> Average time for zero size MPI_Send(): 6.6027e-06 >>> #PETSc Option Table entries: >>> -benchmark_it 10 >>> -dm_distribute >>> -dm_plex_box_dim 3 >>> -dm_plex_box_faces 32,32,32 >>> -dm_plex_box_lower 0,0,0 >>> -dm_plex_box_simplex 0 >>> -dm_plex_box_upper 1,1,1 >>> -dm_refine 5 >>> -ksp_converged_reason >>> -ksp_max_it 150 >>> -ksp_norm_type unpreconditioned >>> -ksp_rtol 1.e-12 >>> -ksp_type cg >>> -log_view >>> -matptap_via scalable >>> -mg_levels_esteig_ksp_max_it 5 >>> -mg_levels_esteig_ksp_type cg >>> -mg_levels_ksp_max_it 2 >>> -mg_levels_ksp_type chebyshev >>> -mg_levels_pc_type jacobi >>> -pc_gamg_agg_nsmooths 1 >>> -pc_gamg_coarse_eq_limit 2000 >>> -pc_gamg_coarse_grid_layout_type spread >>> -pc_gamg_esteig_ksp_max_it 5 >>> -pc_gamg_esteig_ksp_type cg >>> -pc_gamg_process_eq_limit 500 >>> -pc_gamg_repartition false >>> -pc_gamg_reuse_interpolation true >>> -pc_gamg_square_graph 1 >>> -pc_gamg_threshold 0.01 >>> -pc_gamg_threshold_scale .5 >>> -pc_gamg_type agg >>> -pc_type gamg >>> -petscpartitioner_simple_node_grid 8,8,8 >>> -petscpartitioner_simple_process_grid 4,4,4 >>> -petscpartitioner_type simple >>> -potential_petscspace_degree 2 >>> -snes_converged_reason >>> -snes_max_it 1 >>> -snes_monitor >>> -snes_rtol 1.e-8 >>> -snes_type ksponly >>> #End of PETSc Option Table entries >>> Compiled without FORTRAN kernels >>> Compiled with 64 bit PetscInt >>> Compiled with full precision matrices (default) >>> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 >>> sizeof(PetscScalar) 8 sizeof(PetscInt) 8 >>> Configure options: CC=mpifccpx CXX=mpiFCCpx CFLAGS="-L >>> /opt/FJSVxtclanga/tcsds-1.2.29/lib64 -lfjlapack" CXXFLAGS="-L >>> /opt/FJSVxtclanga/tcsds-1.2.29/lib64 -lfjlapack" COPTFLAGS=-Kfast >>> CXXOPTFLAGS=-Kfast --with-fc=0 >>> --package-prefix-hash=/home/ra010009/a04199/petsc-hash-pkgs --with-batch=1 >>> --with-shared-libraries=yes --with-debugging=no --with-64-bit-indices=1 >>> PETSC_ARCH=arch-fugaku-fujitsu >>> ----------------------------------------- >>> Libraries compiled on 2021-02-12 02:27:41 on fn01sv08 >>> Machine characteristics: >>> Linux-3.10.0-957.27.2.el7.x86_64-x86_64-with-redhat-7.6-Maipo >>> Using PETSc directory: /home/ra010009/a04199/petsc >>> Using PETSc arch: >>> ----------------------------------------- >>> >>> Using C compiler: mpifccpx -L /opt/FJSVxtclanga/tcsds-1.2.29/lib64 >>> -lfjlapack -fPIC -Kfast >>> ----------------------------------------- >>> >>> Using include paths: -I/home/ra010009/a04199/petsc/include >>> -I/home/ra010009/a04199/petsc/arch-fugaku-fujitsu/include >>> ----------------------------------------- >>> >>> Using C linker: mpifccpx >>> Using libraries: -Wl,-rpath,/home/ra010009/a04199/petsc/lib >>> -L/home/ra010009/a04199/petsc/lib -lpetsc >>> -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/lib/gcc/aarch64-linux-gnu/8 >>> -L/opt/FJSVxos/devkit/aarch64/lib/gcc/aarch64-linux-gnu/8 >>> -Wl,-rpath,/opt/FJSVxtclanga/tcsds-1.2.29/lib64 >>> -L/opt/FJSVxtclanga/tcsds-1.2.29/lib64 >>> -Wl,-rpath,/opt/FJSVxtclanga/.common/MELI022/lib64 >>> -L/opt/FJSVxtclanga/.common/MELI022/lib64 >>> -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/aarch64-linux-gnu/lib64 >>> -L/opt/FJSVxos/devkit/aarch64/aarch64-linux-gnu/lib64 >>> -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/rfs/usr/lib64 >>> -L/opt/FJSVxos/devkit/aarch64/rfs/usr/lib64 >>> 
-Wl,-rpath,/opt/FJSVxos/devkit/aarch64/rfs/opt/FJSVxos/mmm/lib64 >>> -L/opt/FJSVxos/devkit/aarch64/rfs/opt/FJSVxos/mmm/lib64 >>> -Wl,-rpath,/opt/FJSVxtclanga/tcsds-1.2.29/lib64/nofjobj >>> -L/opt/FJSVxtclanga/tcsds-1.2.29/lib64/nofjobj -lX11 -lfjprofmpi -lfjlapack >>> -ldl -lmpi_cxx -lmpi -lfjstring_internal -lfj90i -lfj90fmt_sve -lfj90f >>> -lfjsrcinfo -lfjcrt -lfjprofcore -lfjprofomp -lfjc++ -lfjc++abi -lfjdemgl >>> -lmpg -lm -lrt -lpthread -lelf -lz -lgcc_s -ldl >>> ----------------------------------------- >>> >>> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.barral at math.u-bordeaux.fr Mon Mar 8 03:02:31 2021 From: nicolas.barral at math.u-bordeaux.fr (Nicolas Barral) Date: Mon, 8 Mar 2021 10:02:31 +0100 Subject: [petsc-users] DMPlex tetrahedra facets orientation In-Reply-To: References: <8f788afe-01fd-98db-957c-450cd43a18a8@math.u-bordeaux.fr> <1c4a69aa-ff26-848c-770d-89756a87be66@math.u-bordeaux.fr> Message-ID: <42072e8c-24de-ec4a-ee8e-7a9356f224bc@math.u-bordeaux.fr> On 07/03/2021 22:56, Matthew Knepley wrote: > On Sun, Mar 7, 2021 at 4:51 PM Nicolas Barral > > wrote: > > > On 07/03/2021 22:30, Matthew Knepley wrote: > > On Sun, Mar 7, 2021 at 4:13 PM Nicolas Barral > > > > >> wrote: > > > >? ? ?On 07/03/2021 16:54, Matthew Knepley wrote: > >? ? ? > On Sun, Mar 7, 2021 at 8:52 AM Nicolas Barral > >? ? ? > > >? ? ? > > >? ? ? > > >? ? ? >>> wrote: > >? ? ? > > >? ? ? >? ? ?Matt, > >? ? ? > > >? ? ? >? ? ?Thanks for your answer. > >? ? ? > > >? ? ? >? ? ?However, DMPlexComputeCellGeometryFVM does not compute > what I > >? ? ?need > >? ? ? >? ? ?(normals of height 1 entities). I can't find any > function doing > >? ? ? >? ? ?that, is > >? ? ? >? ? ?there one ? > >? ? ? > > >? ? ? > > >? ? ? > The normal[] in?DMPlexComputeCellGeometryFVM() is exactly what > >? ? ?you want. > >? ? ? > What does not look right to you? > > > > > >? ? ?So it turns out it's not what I want because I need > non-normalized > >? ? ?normals. It doesn't seem like I can easily retrieve the norm, > can I? > > > > > > You just want area-weighted normals I think, which?means that you > just > > multiply by the area, > > which comes back in the same function. > > > > Ah by the area times 2, of course, my bad. > Do you order height-1 elements in a certain way ? I need to access the > facet (resp. edge) opposite to a vertex in a tet (resp. triangle). > > > Yes. Now that I have pretty much settled on it, I will put it in the > manual. It is currently here: > > https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/plexinterpolate.c#L56 > > All normals are outward facing, but hopefully the ordering in the sourse > file makes sense. Thanks Matt, but I'm not sure I understand well. What I do so far is: ierr = DMPlexGetCone(dm, c, &cone);CHKERRQ(ierr); for (i=0; i > ? Thanks, > > ? ? Matt > > Thanks > > -- > Nicolas > > > >? ? Thanks, > > > >? ? ? Matt > > > >? ? ?If not, I'll fallback to computing them by hand for now. Is the > >? ? ?following assumption safe or do I have to use > DMPlexGetOrientedFace? > >? ? ? ?>? if I call P0P1P2P3 a tet and note x the cross product, > >? ? ? ?>? P3P2xP3P1 is the outward normal to face P1P2P3 > >? ? ? ?>? P0P2xP0P3? ? ? ? ? ? ? "? ? ? ? ? ? ? ? P0P2P3 > >? ? ? ?>? P3P1xP3P0? ? ? ? ? ? ? "? ? ? ? ? ? ? ? P0P1P3 > >? ? ? ?>? P0P1xP0P2? ? ? ? ? ? ? "? ? ? ? ? ? ? ? P0P1P2 > > > >? ? ?Thanks > > > >? ? ?-- > >? ? ?Nicolas > >? ? ? > > >? ? ? >? ? Thanks, > >? ? ? > > >? ? ? >? ? ? Matt > >? ? ? > > >? ? ? >? ? 
?So far I've been doing it by hand, and after a lot of > >? ? ?experimenting the > >? ? ? >? ? ?past weeks, it seems that if I call P0P1P2P3 a tetrahedron > >? ? ?and note x > >? ? ? >? ? ?the cross product, > >? ? ? >? ? ?P3P2xP3P1 is the outward normal to face P1P2P3 > >? ? ? >? ? ?P0P2xP0P3? ? ? ? ? ? ? "? ? ? ? ? ? ? ? P0P2P3 > >? ? ? >? ? ?P3P1xP3P0? ? ? ? ? ? ? "? ? ? ? ? ? ? ? P0P1P3 > >? ? ? >? ? ?P0P1xP0P2? ? ? ? ? ? ? "? ? ? ? ? ? ? ? P0P1P2 > >? ? ? >? ? ?Have I been lucky but can't expect it to be true ? > >? ? ? > > >? ? ? >? ? ?(Alternatively, there is a link between the normals > and the > >? ? ?element > >? ? ? >? ? ?Jacobian, but I don't know the formula and can? find them) > >? ? ? > > >? ? ? > > >? ? ? >? ? ?Thanks, > >? ? ? > > >? ? ? >? ? ?-- > >? ? ? >? ? ?Nicolas > >? ? ? > > >? ? ? >? ? ?On 08/02/2021 15:19, Matthew Knepley wrote: > >? ? ? >? ? ? > On Mon, Feb 8, 2021 at 6:01 AM Nicolas Barral > >? ? ? >? ? ? > > >? ? ? > > >? ? ? >? ? ? > >? ? ? >> > >? ? ? >? ? ? > > >? ? ? > > >? ? ? >? ? ? > >? ? ? >>>> wrote: > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?Hi all, > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?Can I make any assumption on the orientation of > triangular > >? ? ? >? ? ?facets in a > >? ? ? >? ? ? >? ? ?tetrahedral plex ? I need the inward facet > normals. Do > >? ? ?I need > >? ? ? >? ? ?to use > >? ? ? >? ? ? >? ? ?DMPlexGetOrientedFace or can I rely on either > the tet > >? ? ?vertices > >? ? ? >? ? ? >? ? ?ordering, > >? ? ? >? ? ? >? ? ?or the faces ordering ? Could > >? ? ?DMPlexGetRawFaces_Internal be > >? ? ? >? ? ?enough ? > >? ? ? >? ? ? > > >? ? ? >? ? ? > > >? ? ? >? ? ? > You can do it by hand, but you have to account for > the face > >? ? ? >? ? ?orientation > >? ? ? >? ? ? > relative to the cell. That is what > >? ? ? >? ? ? > DMPlexGetOrientedFace() does. I think it would be > easier > >? ? ?to use the > >? ? ? >? ? ? > function below. > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?Alternatively, is there a function that > computes the > >? ? ?normals > >? ? ? >? ? ?- without > >? ? ? >? ? ? >? ? ?bringing out the big guns ? > >? ? ? >? ? ? > > >? ? ? >? ? ? > > >? ? ? >? ? ? > This will compute the normals > >? ? ? >? ? ? > > >? ? ? >? ? ? > > >? ? ? > > > > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexComputeCellGeometryFVM.html > >? ? ? >? ? ? > Should not be too heavy weight. > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? THanks, > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ? Matt > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?Thanks > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?-- > >? ? ? >? ? ? >? ? ?Nicolas > >? ? ? >? ? ? > > >? ? ? >? ? ? > > >? ? ? >? ? ? > > >? ? ? >? ? ? > -- > >? ? ? >? ? ? > What most experimenters take for granted before > they begin > >? ? ?their > >? ? ? >? ? ? > experiments is infinitely more interesting than any > >? ? ?results to which > >? ? ? >? ? ? > their experiments lead. > >? ? ? >? ? ? > -- Norbert Wiener > >? ? ? >? ? ? > > >? ? ? >? ? ? > https://www.cse.buffalo.edu/~knepley/ > >? ? ? >? ? ? > >? ? ? > > >? ? ? > > >? ? ? > > >? ? ? > -- > >? ? ? > What most experimenters take for granted before they begin > their > >? ? ? > experiments is infinitely more interesting than any > results to which > >? ? ? > their experiments lead. > >? ? ? > -- Norbert Wiener > >? ? ? > > >? ? ? > https://www.cse.buffalo.edu/~knepley/ > >? ? ? 
> > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From knepley at gmail.com Mon Mar 8 08:55:38 2021 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 8 Mar 2021 09:55:38 -0500 Subject: [petsc-users] DMPlex tetrahedra facets orientation In-Reply-To: <42072e8c-24de-ec4a-ee8e-7a9356f224bc@math.u-bordeaux.fr> References: <8f788afe-01fd-98db-957c-450cd43a18a8@math.u-bordeaux.fr> <1c4a69aa-ff26-848c-770d-89756a87be66@math.u-bordeaux.fr> <42072e8c-24de-ec4a-ee8e-7a9356f224bc@math.u-bordeaux.fr> Message-ID: On Mon, Mar 8, 2021 at 4:02 AM Nicolas Barral < nicolas.barral at math.u-bordeaux.fr> wrote: > On 07/03/2021 22:56, Matthew Knepley wrote: > > On Sun, Mar 7, 2021 at 4:51 PM Nicolas Barral > > > > wrote: > > > > > > On 07/03/2021 22:30, Matthew Knepley wrote: > > > On Sun, Mar 7, 2021 at 4:13 PM Nicolas Barral > > > > > > > > >> wrote: > > > > > > On 07/03/2021 16:54, Matthew Knepley wrote: > > > > On Sun, Mar 7, 2021 at 8:52 AM Nicolas Barral > > > > > > > > > > > > > > > > > > > >>> wrote: > > > > > > > > Matt, > > > > > > > > Thanks for your answer. > > > > > > > > However, DMPlexComputeCellGeometryFVM does not compute > > what I > > > need > > > > (normals of height 1 entities). I can't find any > > function doing > > > > that, is > > > > there one ? > > > > > > > > > > > > The normal[] in DMPlexComputeCellGeometryFVM() is exactly > what > > > you want. > > > > What does not look right to you? > > > > > > > > > So it turns out it's not what I want because I need > > non-normalized > > > normals. It doesn't seem like I can easily retrieve the norm, > > can I? > > > > > > > > > You just want area-weighted normals I think, which means that you > > just > > > multiply by the area, > > > which comes back in the same function. > > > > > > > Ah by the area times 2, of course, my bad. > > Do you order height-1 elements in a certain way ? I need to access > the > > facet (resp. edge) opposite to a vertex in a tet (resp. triangle). > > > > > > Yes. Now that I have pretty much settled on it, I will put it in the > > manual. It is currently here: > > > > > https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/plexinterpolate.c#L56 > > > > All normals are outward facing, but hopefully the ordering in the sourse > > file makes sense. > > Thanks Matt, but I'm not sure I understand well. What I do so far is: > > ierr = DMPlexGetCone(dm, c, &cone);CHKERRQ(ierr); > for (i=0; i f = cone[i]; > ierr = DMPlexComputeCellGeometryFVM(dm, f, &area, NULL, > &vn[i*dim]);CHKERRQ(ierr); > if (dim == 3) { area *= 2; } > for (j=0; j > So in 3D, it seems that: > (vn[9],vn[10],vn[11]) is the inward normal to the facet opposing vertex0 > (vn[6],vn[7],vn[8]) " " 1 > (vn[3],vn[4],vn[5]) " " 2 > (vn[0],vn[1],vn[2]) " " 3 > > in 2D: > (vn[2],vn[3]) is a normal to the edge opposing vertex 0 > (vn[4],vn[5]) " " 1 > (vn[0],vn[1]) " " 2 > Yet in 2D, whether the normals are inward or outward does not seem > consistent across elements. > > What am I wrongly assuming ? > Ah, I see the problem. I probably need another function. You can tell that not many people use Plex this way yet. 
The logic for what you want is embedded my traversal, but it simple: ierr = DMPlexGetConeSize(dm, c, &coneSize);CHKERRQ(ierr); ierr = DMPlexGetCone(dm, c, &cone);CHKERRQ(ierr); ierr = DMPlexGetConeOrientation(dm, c, &ornt);CHKERRQ(ierr); for (i=0; i= 0 ? 1 : -1; ierr = DMPlexComputeCellGeometryFVM(dm, f, &area, NULL, &vn[i*dim]);CHKERRQ(ierr); if (dim == 3) { area *= 2; } for (j=0; j > -- > Nicolas > > > > > Thanks, > > > > Matt > > > > Thanks > > > > -- > > Nicolas > > > > > > > Thanks, > > > > > > Matt > > > > > > If not, I'll fallback to computing them by hand for now. Is > the > > > following assumption safe or do I have to use > > DMPlexGetOrientedFace? > > > > if I call P0P1P2P3 a tet and note x the cross product, > > > > P3P2xP3P1 is the outward normal to face P1P2P3 > > > > P0P2xP0P3 " P0P2P3 > > > > P3P1xP3P0 " P0P1P3 > > > > P0P1xP0P2 " P0P1P2 > > > > > > Thanks > > > > > > -- > > > Nicolas > > > > > > > > Thanks, > > > > > > > > Matt > > > > > > > > So far I've been doing it by hand, and after a lot of > > > experimenting the > > > > past weeks, it seems that if I call P0P1P2P3 a > tetrahedron > > > and note x > > > > the cross product, > > > > P3P2xP3P1 is the outward normal to face P1P2P3 > > > > P0P2xP0P3 " P0P2P3 > > > > P3P1xP3P0 " P0P1P3 > > > > P0P1xP0P2 " P0P1P2 > > > > Have I been lucky but can't expect it to be true ? > > > > > > > > (Alternatively, there is a link between the normals > > and the > > > element > > > > Jacobian, but I don't know the formula and can find > them) > > > > > > > > > > > > Thanks, > > > > > > > > -- > > > > Nicolas > > > > > > > > On 08/02/2021 15:19, Matthew Knepley wrote: > > > > > On Mon, Feb 8, 2021 at 6:01 AM Nicolas Barral > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > >>>> wrote: > > > > > > > > > > Hi all, > > > > > > > > > > Can I make any assumption on the orientation of > > triangular > > > > facets in a > > > > > tetrahedral plex ? I need the inward facet > > normals. Do > > > I need > > > > to use > > > > > DMPlexGetOrientedFace or can I rely on either > > the tet > > > vertices > > > > > ordering, > > > > > or the faces ordering ? Could > > > DMPlexGetRawFaces_Internal be > > > > enough ? > > > > > > > > > > > > > > > You can do it by hand, but you have to account for > > the face > > > > orientation > > > > > relative to the cell. That is what > > > > > DMPlexGetOrientedFace() does. I think it would be > > easier > > > to use the > > > > > function below. > > > > > > > > > > Alternatively, is there a function that > > computes the > > > normals > > > > - without > > > > > bringing out the big guns ? > > > > > > > > > > > > > > > This will compute the normals > > > > > > > > > > > > > > > > > > > > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexComputeCellGeometryFVM.html > > > > > Should not be too heavy weight. > > > > > > > > > > THanks, > > > > > > > > > > Matt > > > > > > > > > > Thanks > > > > > > > > > > -- > > > > > Nicolas > > > > > > > > > > > > > > > > > > > > -- > > > > > What most experimenters take for granted before > > they begin > > > their > > > > > experiments is infinitely more interesting than any > > > results to which > > > > > their experiments lead. 
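For readers collecting the pieces: the loop sketched in the message above, written out as one self-contained routine. This is not a PETSc API, only an illustration; the cell is assumed simplicial, vn is assumed to have room for coneSize*dim entries, and the final scaling (unit normal times facet measure, flipped by the cone orientation, with the factor of 2 in 3D) is inferred from the discussion rather than quoted verbatim.

    #include <petscdmplex.h>

    /* Sketch: oriented, scaled facet normals of one cell of a simplicial DMPlex.
       Assumptions (not from the thread verbatim): vn holds coneSize*dim reals,
       and the inner scaling below is the intended one. */
    static PetscErrorCode CellFacetNormals(DM dm, PetscInt c, PetscReal vn[])
    {
      const PetscInt *cone, *ornt;
      PetscInt        coneSize, dim, i, j;
      PetscErrorCode  ierr;

      PetscFunctionBeginUser;
      ierr = DMGetDimension(dm, &dim);CHKERRQ(ierr);
      ierr = DMPlexGetConeSize(dm, c, &coneSize);CHKERRQ(ierr);
      ierr = DMPlexGetCone(dm, c, &cone);CHKERRQ(ierr);
      ierr = DMPlexGetConeOrientation(dm, c, &ornt);CHKERRQ(ierr);
      for (i = 0; i < coneSize; ++i) {
        PetscReal area, flip;

        /* Unit normal and measure (area in 3D, length in 2D) of facet cone[i] */
        ierr = DMPlexComputeCellGeometryFVM(dm, cone[i], &area, NULL, &vn[i*dim]);CHKERRQ(ierr);
        /* Flip according to how this cell orients the facet in its cone */
        flip = (ornt[i] >= 0) ? 1.0 : -1.0;
        /* Factor 2 in 3D recovers the raw edge-cross-product magnitude, as discussed */
        if (dim == 3) area *= 2.0;
        for (j = 0; j < dim; ++j) vn[i*dim + j] *= flip*area;
      }
      PetscFunctionReturn(0);
    }

Called once per cell, for instance in a loop over the cells returned by DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd), it fills vn with one scaled normal per facet of that cell.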
> > > > > -- Norbert Wiener > > > > > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > > > > > > > > > > -- > > > > What most experimenters take for granted before they begin > > their > > > > experiments is infinitely more interesting than any > > results to which > > > > their experiments lead. > > > > -- Norbert Wiener > > > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > > > experiments is infinitely more interesting than any results to > which > > > their experiments lead. > > > -- Norbert Wiener > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ < > http://www.cse.buffalo.edu/~knepley/> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From tisaac at cc.gatech.edu Mon Mar 8 09:08:20 2021 From: tisaac at cc.gatech.edu (Isaac, Tobin G) Date: Mon, 8 Mar 2021 15:08:20 +0000 Subject: [petsc-users] DMPlex tetrahedra facets orientation In-Reply-To: References: <8f788afe-01fd-98db-957c-450cd43a18a8@math.u-bordeaux.fr> <1c4a69aa-ff26-848c-770d-89756a87be66@math.u-bordeaux.fr> <42072e8c-24de-ec4a-ee8e-7a9356f224bc@math.u-bordeaux.fr>, Message-ID: What Nicolas wants is pretty common in DG, and the quantity is available as just the cross product of the two vectors of the facet Jacobian. Computing it the way you suggest is kind of a backward reconstruction. Toby Isaac, Assistant Professor, GTCSE ________________________________ From: petsc-users on behalf of Matthew Knepley Sent: Monday, March 8, 2021, 09:56 To: Nicolas Barral Cc: PETSc Subject: Re: [petsc-users] DMPlex tetrahedra facets orientation On Mon, Mar 8, 2021 at 4:02 AM Nicolas Barral > wrote: On 07/03/2021 22:56, Matthew Knepley wrote: > On Sun, Mar 7, 2021 at 4:51 PM Nicolas Barral > > >> wrote: > > > On 07/03/2021 22:30, Matthew Knepley wrote: > > On Sun, Mar 7, 2021 at 4:13 PM Nicolas Barral > > > > > > > >>> wrote: > > > > On 07/03/2021 16:54, Matthew Knepley wrote: > > > On Sun, Mar 7, 2021 at 8:52 AM Nicolas Barral > > > > > > > > >> > > > > > > > > >>>> wrote: > > > > > > Matt, > > > > > > Thanks for your answer. > > > > > > However, DMPlexComputeCellGeometryFVM does not compute > what I > > need > > > (normals of height 1 entities). I can't find any > function doing > > > that, is > > > there one ? > > > > > > > > > The normal[] in DMPlexComputeCellGeometryFVM() is exactly what > > you want. > > > What does not look right to you? > > > > > > So it turns out it's not what I want because I need > non-normalized > > normals. It doesn't seem like I can easily retrieve the norm, > can I? > > > > > > You just want area-weighted normals I think, which means that you > just > > multiply by the area, > > which comes back in the same function. > > > > Ah by the area times 2, of course, my bad. > Do you order height-1 elements in a certain way ? I need to access the > facet (resp. edge) opposite to a vertex in a tet (resp. triangle). > > > Yes. 
Now that I have pretty much settled on it, I will put it in the > manual. It is currently here: > > https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/plexinterpolate.c#L56 > > All normals are outward facing, but hopefully the ordering in the sourse > file makes sense. Thanks Matt, but I'm not sure I understand well. What I do so far is: ierr = DMPlexGetCone(dm, c, &cone);CHKERRQ(ierr); for (i=0; i= 0 ? 1 : -1; ierr = DMPlexComputeCellGeometryFVM(dm, f, &area, NULL, &vn[i*dim]);CHKERRQ(ierr); if (dim == 3) { area *= 2; } for (j=0; j > Thanks, > > Matt > > Thanks > > -- > Nicolas > > > > Thanks, > > > > Matt > > > > If not, I'll fallback to computing them by hand for now. Is the > > following assumption safe or do I have to use > DMPlexGetOrientedFace? > > > if I call P0P1P2P3 a tet and note x the cross product, > > > P3P2xP3P1 is the outward normal to face P1P2P3 > > > P0P2xP0P3 " P0P2P3 > > > P3P1xP3P0 " P0P1P3 > > > P0P1xP0P2 " P0P1P2 > > > > Thanks > > > > -- > > Nicolas > > > > > > Thanks, > > > > > > Matt > > > > > > So far I've been doing it by hand, and after a lot of > > experimenting the > > > past weeks, it seems that if I call P0P1P2P3 a tetrahedron > > and note x > > > the cross product, > > > P3P2xP3P1 is the outward normal to face P1P2P3 > > > P0P2xP0P3 " P0P2P3 > > > P3P1xP3P0 " P0P1P3 > > > P0P1xP0P2 " P0P1P2 > > > Have I been lucky but can't expect it to be true ? > > > > > > (Alternatively, there is a link between the normals > and the > > element > > > Jacobian, but I don't know the formula and can find them) > > > > > > > > > Thanks, > > > > > > -- > > > Nicolas > > > > > > On 08/02/2021 15:19, Matthew Knepley wrote: > > > > On Mon, Feb 8, 2021 at 6:01 AM Nicolas Barral > > > > > > > > > >> > > > > > > > > >>> > > > > > > > > > >> > > > > > > > > >>>>> wrote: > > > > > > > > Hi all, > > > > > > > > Can I make any assumption on the orientation of > triangular > > > facets in a > > > > tetrahedral plex ? I need the inward facet > normals. Do > > I need > > > to use > > > > DMPlexGetOrientedFace or can I rely on either > the tet > > vertices > > > > ordering, > > > > or the faces ordering ? Could > > DMPlexGetRawFaces_Internal be > > > enough ? > > > > > > > > > > > > You can do it by hand, but you have to account for > the face > > > orientation > > > > relative to the cell. That is what > > > > DMPlexGetOrientedFace() does. I think it would be > easier > > to use the > > > > function below. > > > > > > > > Alternatively, is there a function that > computes the > > normals > > > - without > > > > bringing out the big guns ? > > > > > > > > > > > > This will compute the normals > > > > > > > > > > > > > > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexComputeCellGeometryFVM.html > > > > Should not be too heavy weight. > > > > > > > > THanks, > > > > > > > > Matt > > > > > > > > Thanks > > > > > > > > -- > > > > Nicolas > > > > > > > > > > > > > > > > -- > > > > What most experimenters take for granted before > they begin > > their > > > > experiments is infinitely more interesting than any > > results to which > > > > their experiments lead. > > > > -- Norbert Wiener > > > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin > their > > > experiments is infinitely more interesting than any > results to which > > > their experiments lead. 
> > > -- Norbert Wiener > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Mar 8 09:11:01 2021 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 8 Mar 2021 10:11:01 -0500 Subject: [petsc-users] DMPlex tetrahedra facets orientation In-Reply-To: References: <8f788afe-01fd-98db-957c-450cd43a18a8@math.u-bordeaux.fr> <1c4a69aa-ff26-848c-770d-89756a87be66@math.u-bordeaux.fr> <42072e8c-24de-ec4a-ee8e-7a9356f224bc@math.u-bordeaux.fr> Message-ID: On Mon, Mar 8, 2021 at 10:08 AM Isaac, Tobin G wrote: > What Nicolas wants is pretty common in DG, and the quantity is available > as just the cross product of the two vectors of the facet Jacobian. > Computing it the way you suggest is kind of a backward reconstruction. > I do not quite understand. The facet Jacobian will not know what the orientation with respect to the cell should be, and that is what he wants. Thanks, Matt > Toby Isaac, Assistant Professor, GTCSE > > ------------------------------ > *From:* petsc-users on behalf of > Matthew Knepley > *Sent:* Monday, March 8, 2021, 09:56 > *To:* Nicolas Barral > *Cc:* PETSc > *Subject:* Re: [petsc-users] DMPlex tetrahedra facets orientation > > On Mon, Mar 8, 2021 at 4:02 AM Nicolas Barral < > nicolas.barral at math.u-bordeaux.fr> wrote: > >> On 07/03/2021 22:56, Matthew Knepley wrote: >> > On Sun, Mar 7, 2021 at 4:51 PM Nicolas Barral >> > > > > wrote: >> > >> > >> > On 07/03/2021 22:30, Matthew Knepley wrote: >> > > On Sun, Mar 7, 2021 at 4:13 PM Nicolas Barral >> > > > > >> > > > > >> wrote: >> > > >> > > On 07/03/2021 16:54, Matthew Knepley wrote: >> > > > On Sun, Mar 7, 2021 at 8:52 AM Nicolas Barral >> > > > > > >> > > > > > >> > > > > > >> > > > > >>> wrote: >> > > > >> > > > Matt, >> > > > >> > > > Thanks for your answer. >> > > > >> > > > However, DMPlexComputeCellGeometryFVM does not compute >> > what I >> > > need >> > > > (normals of height 1 entities). I can't find any >> > function doing >> > > > that, is >> > > > there one ? >> > > > >> > > > >> > > > The normal[] in DMPlexComputeCellGeometryFVM() is exactly >> what >> > > you want. >> > > > What does not look right to you? >> > > >> > > >> > > So it turns out it's not what I want because I need >> > non-normalized >> > > normals. It doesn't seem like I can easily retrieve the norm, >> > can I? >> > > >> > > >> > > You just want area-weighted normals I think, which means that you >> > just >> > > multiply by the area, >> > > which comes back in the same function. >> > > >> > >> > Ah by the area times 2, of course, my bad. >> > Do you order height-1 elements in a certain way ? I need to access >> the >> > facet (resp. edge) opposite to a vertex in a tet (resp. triangle). >> > >> > >> > Yes. 
Now that I have pretty much settled on it, I will put it in the >> > manual. It is currently here: >> > >> > >> https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/plexinterpolate.c#L56 >> >> > >> > All normals are outward facing, but hopefully the ordering in the >> sourse >> > file makes sense. >> >> Thanks Matt, but I'm not sure I understand well. What I do so far is: >> >> ierr = DMPlexGetCone(dm, c, &cone);CHKERRQ(ierr); >> for (i=0; i> f = cone[i]; >> ierr = DMPlexComputeCellGeometryFVM(dm, f, &area, NULL, >> &vn[i*dim]);CHKERRQ(ierr); >> if (dim == 3) { area *= 2; } >> for (j=0; j> >> So in 3D, it seems that: >> (vn[9],vn[10],vn[11]) is the inward normal to the facet opposing vertex0 >> (vn[6],vn[7],vn[8]) " " 1 >> (vn[3],vn[4],vn[5]) " " 2 >> (vn[0],vn[1],vn[2]) " " 3 >> >> in 2D: >> (vn[2],vn[3]) is a normal to the edge opposing vertex 0 >> (vn[4],vn[5]) " " 1 >> (vn[0],vn[1]) " " 2 >> Yet in 2D, whether the normals are inward or outward does not seem >> consistent across elements. >> >> What am I wrongly assuming ? >> > > Ah, I see the problem. I probably need another function. You can tell that > not many people use Plex this way yet. > The logic for what you want is embedded my traversal, but it simple: > > ierr = DMPlexGetConeSize(dm, c, &coneSize);CHKERRQ(ierr); > ierr = DMPlexGetCone(dm, c, &cone);CHKERRQ(ierr); > ierr = DMPlexGetConeOrientation(dm, c, &ornt);CHKERRQ(ierr); > for (i=0; i f = cone[i]; > flip = ornt[i] >= 0 ? 1 : -1; > ierr = DMPlexComputeCellGeometryFVM(dm, f, &area, NULL, > &vn[i*dim]);CHKERRQ(ierr); > if (dim == 3) { area *= 2; } > for (j=0; j > I could make a function that returns all normals, properly oriented. It > would just do this. > > Thanks, > > Matt > > Thanks, >> >> -- >> Nicolas >> >> > >> > Thanks, >> > >> > Matt >> > >> > Thanks >> > >> > -- >> > Nicolas >> > >> > >> > > Thanks, >> > > >> > > Matt >> > > >> > > If not, I'll fallback to computing them by hand for now. Is >> the >> > > following assumption safe or do I have to use >> > DMPlexGetOrientedFace? >> > > > if I call P0P1P2P3 a tet and note x the cross product, >> > > > P3P2xP3P1 is the outward normal to face P1P2P3 >> > > > P0P2xP0P3 " P0P2P3 >> > > > P3P1xP3P0 " P0P1P3 >> > > > P0P1xP0P2 " P0P1P2 >> > > >> > > Thanks >> > > >> > > -- >> > > Nicolas >> > > > >> > > > Thanks, >> > > > >> > > > Matt >> > > > >> > > > So far I've been doing it by hand, and after a lot of >> > > experimenting the >> > > > past weeks, it seems that if I call P0P1P2P3 a >> tetrahedron >> > > and note x >> > > > the cross product, >> > > > P3P2xP3P1 is the outward normal to face P1P2P3 >> > > > P0P2xP0P3 " P0P2P3 >> > > > P3P1xP3P0 " P0P1P3 >> > > > P0P1xP0P2 " P0P1P2 >> > > > Have I been lucky but can't expect it to be true ? >> > > > >> > > > (Alternatively, there is a link between the normals >> > and the >> > > element >> > > > Jacobian, but I don't know the formula and can find >> them) >> > > > >> > > > >> > > > Thanks, >> > > > >> > > > -- >> > > > Nicolas >> > > > >> > > > On 08/02/2021 15:19, Matthew Knepley wrote: >> > > > > On Mon, Feb 8, 2021 at 6:01 AM Nicolas Barral >> > > > > > > >> > > > > > >> > > > > > >> > > > > >> >> > > > > > > >> > > > > > >> > > > > > >> > > > > >>>> wrote: >> > > > > >> > > > > Hi all, >> > > > > >> > > > > Can I make any assumption on the orientation of >> > triangular >> > > > facets in a >> > > > > tetrahedral plex ? I need the inward facet >> > normals. 
Do >> > > I need >> > > > to use >> > > > > DMPlexGetOrientedFace or can I rely on either >> > the tet >> > > vertices >> > > > > ordering, >> > > > > or the faces ordering ? Could >> > > DMPlexGetRawFaces_Internal be >> > > > enough ? >> > > > > >> > > > > >> > > > > You can do it by hand, but you have to account for >> > the face >> > > > orientation >> > > > > relative to the cell. That is what >> > > > > DMPlexGetOrientedFace() does. I think it would be >> > easier >> > > to use the >> > > > > function below. >> > > > > >> > > > > Alternatively, is there a function that >> > computes the >> > > normals >> > > > - without >> > > > > bringing out the big guns ? >> > > > > >> > > > > >> > > > > This will compute the normals >> > > > > >> > > > > >> > > > >> > > >> > >> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexComputeCellGeometryFVM.html >> >> > > > > Should not be too heavy weight. >> > > > > >> > > > > THanks, >> > > > > >> > > > > Matt >> > > > > >> > > > > Thanks >> > > > > >> > > > > -- >> > > > > Nicolas >> > > > > >> > > > > >> > > > > >> > > > > -- >> > > > > What most experimenters take for granted before >> > they begin >> > > their >> > > > > experiments is infinitely more interesting than any >> > > results to which >> > > > > their experiments lead. >> > > > > -- Norbert Wiener >> > > > > >> > > > > https://www.cse.buffalo.edu/~knepley/ >> >> > > > > >> > >> > > > >> > > > >> > > > >> > > > -- >> > > > What most experimenters take for granted before they begin >> > their >> > > > experiments is infinitely more interesting than any >> > results to which >> > > > their experiments lead. >> > > > -- Norbert Wiener >> > > > >> > > > https://www.cse.buffalo.edu/~knepley/ >> >> > > > >> > >> > > >> > > >> > > >> > > -- >> > > What most experimenters take for granted before they begin their >> > > experiments is infinitely more interesting than any results to >> which >> > > their experiments lead. >> > > -- Norbert Wiener >> > > >> > > https://www.cse.buffalo.edu/~knepley/ >> >> > > >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> > experiments is infinitely more interesting than any results to which >> > their experiments lead. >> > -- Norbert Wiener >> > >> > https://www.cse.buffalo.edu/~knepley/ >> >> > >> > >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From tisaac at cc.gatech.edu Mon Mar 8 09:21:33 2021 From: tisaac at cc.gatech.edu (Isaac, Tobin G) Date: Mon, 8 Mar 2021 15:21:33 +0000 Subject: [petsc-users] DMPlex tetrahedra facets orientation In-Reply-To: References: <8f788afe-01fd-98db-957c-450cd43a18a8@math.u-bordeaux.fr> <1c4a69aa-ff26-848c-770d-89756a87be66@math.u-bordeaux.fr> <42072e8c-24de-ec4a-ee8e-7a9356f224bc@math.u-bordeaux.fr> , Message-ID: Right that is what is needed from each side, but taking area and vn and scaling them back together is just undoing a computation that happens inside of cellgeometryfvm. 
I'm saying that long term, in the same way that for FEM we provide different callbacks that are intended for contraction with v and with grad v, we should have callbacks that separate scalar and vector fluxes so that unit vs scaled normal is a hidden quadrature detail. Toby Isaac, Assistant Professor, GTCSE ________________________________ From: Matthew Knepley Sent: Monday, March 8, 2021, 10:11 To: Isaac, Tobin G Cc: Nicolas Barral; PETSc Subject: Re: [petsc-users] DMPlex tetrahedra facets orientation On Mon, Mar 8, 2021 at 10:08 AM Isaac, Tobin G > wrote: What Nicolas wants is pretty common in DG, and the quantity is available as just the cross product of the two vectors of the facet Jacobian. Computing it the way you suggest is kind of a backward reconstruction. I do not quite understand. The facet Jacobian will not know what the orientation with respect to the cell should be, and that is what he wants. Thanks, Matt Toby Isaac, Assistant Professor, GTCSE ________________________________ From: petsc-users > on behalf of Matthew Knepley > Sent: Monday, March 8, 2021, 09:56 To: Nicolas Barral Cc: PETSc Subject: Re: [petsc-users] DMPlex tetrahedra facets orientation On Mon, Mar 8, 2021 at 4:02 AM Nicolas Barral > wrote: On 07/03/2021 22:56, Matthew Knepley wrote: > On Sun, Mar 7, 2021 at 4:51 PM Nicolas Barral > > >> wrote: > > > On 07/03/2021 22:30, Matthew Knepley wrote: > > On Sun, Mar 7, 2021 at 4:13 PM Nicolas Barral > > > > > > > >>> wrote: > > > > On 07/03/2021 16:54, Matthew Knepley wrote: > > > On Sun, Mar 7, 2021 at 8:52 AM Nicolas Barral > > > > > > > > >> > > > > > > > > >>>> wrote: > > > > > > Matt, > > > > > > Thanks for your answer. > > > > > > However, DMPlexComputeCellGeometryFVM does not compute > what I > > need > > > (normals of height 1 entities). I can't find any > function doing > > > that, is > > > there one ? > > > > > > > > > The normal[] in DMPlexComputeCellGeometryFVM() is exactly what > > you want. > > > What does not look right to you? > > > > > > So it turns out it's not what I want because I need > non-normalized > > normals. It doesn't seem like I can easily retrieve the norm, > can I? > > > > > > You just want area-weighted normals I think, which means that you > just > > multiply by the area, > > which comes back in the same function. > > > > Ah by the area times 2, of course, my bad. > Do you order height-1 elements in a certain way ? I need to access the > facet (resp. edge) opposite to a vertex in a tet (resp. triangle). > > > Yes. Now that I have pretty much settled on it, I will put it in the > manual. It is currently here: > > https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/plexinterpolate.c#L56 > > All normals are outward facing, but hopefully the ordering in the sourse > file makes sense. Thanks Matt, but I'm not sure I understand well. What I do so far is: ierr = DMPlexGetCone(dm, c, &cone);CHKERRQ(ierr); for (i=0; i= 0 ? 1 : -1; ierr = DMPlexComputeCellGeometryFVM(dm, f, &area, NULL, &vn[i*dim]);CHKERRQ(ierr); if (dim == 3) { area *= 2; } for (j=0; j > Thanks, > > Matt > > Thanks > > -- > Nicolas > > > > Thanks, > > > > Matt > > > > If not, I'll fallback to computing them by hand for now. Is the > > following assumption safe or do I have to use > DMPlexGetOrientedFace? 
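To make the remark above concrete: for an affine triangular facet, the two columns of the facet Jacobian are (up to the reference-element convention) edge vectors of the triangle, so the scaled, non-unit normal is directly the cross product of two edges; its magnitude is twice the facet area, which is where the factor of 2 seen earlier in the thread comes from. A minimal sketch, assuming the three vertex coordinates have already been gathered (for instance from the coordinate field with DMPlexVecGetClosure); the sign still depends on the vertex order, which is what the cone orientation handles:

    #include <petscsys.h>

    /* Sketch only: scaled normal of a triangle with vertices p0, p1, p2.
       |n| = 2 * area(p0, p1, p2); the sign follows the vertex ordering. */
    static void FacetScaledNormal(const PetscReal p0[3], const PetscReal p1[3],
                                  const PetscReal p2[3], PetscReal n[3])
    {
      const PetscReal e1[3] = {p1[0]-p0[0], p1[1]-p0[1], p1[2]-p0[2]};
      const PetscReal e2[3] = {p2[0]-p0[0], p2[1]-p0[1], p2[2]-p0[2]};

      n[0] = e1[1]*e2[2] - e1[2]*e2[1];
      n[1] = e1[2]*e2[0] - e1[0]*e2[2];
      n[2] = e1[0]*e2[1] - e1[1]*e2[0];
    }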
> > > if I call P0P1P2P3 a tet and note x the cross product, > > > P3P2xP3P1 is the outward normal to face P1P2P3 > > > P0P2xP0P3 " P0P2P3 > > > P3P1xP3P0 " P0P1P3 > > > P0P1xP0P2 " P0P1P2 > > > > Thanks > > > > -- > > Nicolas > > > > > > Thanks, > > > > > > Matt > > > > > > So far I've been doing it by hand, and after a lot of > > experimenting the > > > past weeks, it seems that if I call P0P1P2P3 a tetrahedron > > and note x > > > the cross product, > > > P3P2xP3P1 is the outward normal to face P1P2P3 > > > P0P2xP0P3 " P0P2P3 > > > P3P1xP3P0 " P0P1P3 > > > P0P1xP0P2 " P0P1P2 > > > Have I been lucky but can't expect it to be true ? > > > > > > (Alternatively, there is a link between the normals > and the > > element > > > Jacobian, but I don't know the formula and can find them) > > > > > > > > > Thanks, > > > > > > -- > > > Nicolas > > > > > > On 08/02/2021 15:19, Matthew Knepley wrote: > > > > On Mon, Feb 8, 2021 at 6:01 AM Nicolas Barral > > > > > > > > > >> > > > > > > > > >>> > > > > > > > > > >> > > > > > > > > >>>>> wrote: > > > > > > > > Hi all, > > > > > > > > Can I make any assumption on the orientation of > triangular > > > facets in a > > > > tetrahedral plex ? I need the inward facet > normals. Do > > I need > > > to use > > > > DMPlexGetOrientedFace or can I rely on either > the tet > > vertices > > > > ordering, > > > > or the faces ordering ? Could > > DMPlexGetRawFaces_Internal be > > > enough ? > > > > > > > > > > > > You can do it by hand, but you have to account for > the face > > > orientation > > > > relative to the cell. That is what > > > > DMPlexGetOrientedFace() does. I think it would be > easier > > to use the > > > > function below. > > > > > > > > Alternatively, is there a function that > computes the > > normals > > > - without > > > > bringing out the big guns ? > > > > > > > > > > > > This will compute the normals > > > > > > > > > > > > > > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexComputeCellGeometryFVM.html > > > > Should not be too heavy weight. > > > > > > > > THanks, > > > > > > > > Matt > > > > > > > > Thanks > > > > > > > > -- > > > > Nicolas > > > > > > > > > > > > > > > > -- > > > > What most experimenters take for granted before > they begin > > their > > > > experiments is infinitely more interesting than any > > results to which > > > > their experiments lead. > > > > -- Norbert Wiener > > > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin > their > > > experiments is infinitely more interesting than any > results to which > > > their experiments lead. > > > -- Norbert Wiener > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Mar 8 09:27:09 2021 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 8 Mar 2021 10:27:09 -0500 Subject: [petsc-users] DMPlex tetrahedra facets orientation In-Reply-To: References: <8f788afe-01fd-98db-957c-450cd43a18a8@math.u-bordeaux.fr> <1c4a69aa-ff26-848c-770d-89756a87be66@math.u-bordeaux.fr> <42072e8c-24de-ec4a-ee8e-7a9356f224bc@math.u-bordeaux.fr> Message-ID: On Mon, Mar 8, 2021 at 10:21 AM Isaac, Tobin G wrote: > Right that is what is needed from each side, but taking area and vn and > scaling them back together is just undoing a computation that happens > inside of cellgeometryfvm. I'm saying that long term, in the same way that > for FEM we provide different callbacks that are intended for contraction > with v and with grad v, we should have callbacks that separate scalar and > vector fluxes so that unit vs scaled normal is a hidden quadrature detail. > Oh, yes I agree. The scaling was just a stopgap. Matt > Toby Isaac, Assistant Professor, GTCSE > > ------------------------------ > *From:* Matthew Knepley > *Sent:* Monday, March 8, 2021, 10:11 > *To:* Isaac, Tobin G > *Cc:* Nicolas Barral; PETSc > *Subject:* Re: [petsc-users] DMPlex tetrahedra facets orientation > > On Mon, Mar 8, 2021 at 10:08 AM Isaac, Tobin G > wrote: > >> What Nicolas wants is pretty common in DG, and the quantity is available >> as just the cross product of the two vectors of the facet Jacobian. >> Computing it the way you suggest is kind of a backward reconstruction. >> > > I do not quite understand. The facet Jacobian will not know what the > orientation with respect to the cell should be, and that is what he wants. > > Thanks, > > Matt > > >> Toby Isaac, Assistant Professor, GTCSE >> >> ------------------------------ >> *From:* petsc-users on behalf of >> Matthew Knepley >> *Sent:* Monday, March 8, 2021, 09:56 >> *To:* Nicolas Barral >> *Cc:* PETSc >> *Subject:* Re: [petsc-users] DMPlex tetrahedra facets orientation >> >> On Mon, Mar 8, 2021 at 4:02 AM Nicolas Barral < >> nicolas.barral at math.u-bordeaux.fr> wrote: >> >>> On 07/03/2021 22:56, Matthew Knepley wrote: >>> > On Sun, Mar 7, 2021 at 4:51 PM Nicolas Barral >>> > >> > > wrote: >>> > >>> > >>> > On 07/03/2021 22:30, Matthew Knepley wrote: >>> > > On Sun, Mar 7, 2021 at 4:13 PM Nicolas Barral >>> > > >> > >>> > > >> > >> wrote: >>> > > >>> > > On 07/03/2021 16:54, Matthew Knepley wrote: >>> > > > On Sun, Mar 7, 2021 at 8:52 AM Nicolas Barral >>> > > > >> > >>> > > >> > > >>> > > > >> > >>> > > >> > >>> wrote: >>> > > > >>> > > > Matt, >>> > > > >>> > > > Thanks for your answer. >>> > > > >>> > > > However, DMPlexComputeCellGeometryFVM does not >>> compute >>> > what I >>> > > need >>> > > > (normals of height 1 entities). I can't find any >>> > function doing >>> > > > that, is >>> > > > there one ? >>> > > > >>> > > > >>> > > > The normal[] in DMPlexComputeCellGeometryFVM() is >>> exactly what >>> > > you want. >>> > > > What does not look right to you? >>> > > >>> > > >>> > > So it turns out it's not what I want because I need >>> > non-normalized >>> > > normals. It doesn't seem like I can easily retrieve the >>> norm, >>> > can I? 
>>> > > >>> > > >>> > > You just want area-weighted normals I think, which means that >>> you >>> > just >>> > > multiply by the area, >>> > > which comes back in the same function. >>> > > >>> > >>> > Ah by the area times 2, of course, my bad. >>> > Do you order height-1 elements in a certain way ? I need to access >>> the >>> > facet (resp. edge) opposite to a vertex in a tet (resp. triangle). >>> > >>> > >>> > Yes. Now that I have pretty much settled on it, I will put it in the >>> > manual. It is currently here: >>> > >>> > >>> https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/plexinterpolate.c#L56 >>> >>> > >>> > All normals are outward facing, but hopefully the ordering in the >>> sourse >>> > file makes sense. >>> >>> Thanks Matt, but I'm not sure I understand well. What I do so far is: >>> >>> ierr = DMPlexGetCone(dm, c, &cone);CHKERRQ(ierr); >>> for (i=0; i>> f = cone[i]; >>> ierr = DMPlexComputeCellGeometryFVM(dm, f, &area, NULL, >>> &vn[i*dim]);CHKERRQ(ierr); >>> if (dim == 3) { area *= 2; } >>> for (j=0; j>> >>> So in 3D, it seems that: >>> (vn[9],vn[10],vn[11]) is the inward normal to the facet opposing vertex0 >>> (vn[6],vn[7],vn[8]) " " 1 >>> (vn[3],vn[4],vn[5]) " " 2 >>> (vn[0],vn[1],vn[2]) " " 3 >>> >>> in 2D: >>> (vn[2],vn[3]) is a normal to the edge opposing vertex 0 >>> (vn[4],vn[5]) " " 1 >>> (vn[0],vn[1]) " " 2 >>> Yet in 2D, whether the normals are inward or outward does not seem >>> consistent across elements. >>> >>> What am I wrongly assuming ? >>> >> >> Ah, I see the problem. I probably need another function. You can tell >> that not many people use Plex this way yet. >> The logic for what you want is embedded my traversal, but it simple: >> >> ierr = DMPlexGetConeSize(dm, c, &coneSize);CHKERRQ(ierr); >> ierr = DMPlexGetCone(dm, c, &cone);CHKERRQ(ierr); >> ierr = DMPlexGetConeOrientation(dm, c, &ornt);CHKERRQ(ierr); >> for (i=0; i> f = cone[i]; >> flip = ornt[i] >= 0 ? 1 : -1; >> ierr = DMPlexComputeCellGeometryFVM(dm, f, &area, NULL, >> &vn[i*dim]);CHKERRQ(ierr); >> if (dim == 3) { area *= 2; } >> for (j=0; j> >> I could make a function that returns all normals, properly oriented. It >> would just do this. >> >> Thanks, >> >> Matt >> >> Thanks, >>> >>> -- >>> Nicolas >>> >>> > >>> > Thanks, >>> > >>> > Matt >>> > >>> > Thanks >>> > >>> > -- >>> > Nicolas >>> > >>> > >>> > > Thanks, >>> > > >>> > > Matt >>> > > >>> > > If not, I'll fallback to computing them by hand for now. Is >>> the >>> > > following assumption safe or do I have to use >>> > DMPlexGetOrientedFace? >>> > > > if I call P0P1P2P3 a tet and note x the cross product, >>> > > > P3P2xP3P1 is the outward normal to face P1P2P3 >>> > > > P0P2xP0P3 " P0P2P3 >>> > > > P3P1xP3P0 " P0P1P3 >>> > > > P0P1xP0P2 " P0P1P2 >>> > > >>> > > Thanks >>> > > >>> > > -- >>> > > Nicolas >>> > > > >>> > > > Thanks, >>> > > > >>> > > > Matt >>> > > > >>> > > > So far I've been doing it by hand, and after a lot of >>> > > experimenting the >>> > > > past weeks, it seems that if I call P0P1P2P3 a >>> tetrahedron >>> > > and note x >>> > > > the cross product, >>> > > > P3P2xP3P1 is the outward normal to face P1P2P3 >>> > > > P0P2xP0P3 " P0P2P3 >>> > > > P3P1xP3P0 " P0P1P3 >>> > > > P0P1xP0P2 " P0P1P2 >>> > > > Have I been lucky but can't expect it to be true ? 
>>> > > > >>> > > > (Alternatively, there is a link between the normals >>> > and the >>> > > element >>> > > > Jacobian, but I don't know the formula and can find >>> them) >>> > > > >>> > > > >>> > > > Thanks, >>> > > > >>> > > > -- >>> > > > Nicolas >>> > > > >>> > > > On 08/02/2021 15:19, Matthew Knepley wrote: >>> > > > > On Mon, Feb 8, 2021 at 6:01 AM Nicolas Barral >>> > > > > >> > >>> > > >> > > >>> > > > >> > >>> > > >> > >> >>> > > > > >> > >>> > > >> > > >>> > > > >> > >>> > > >> > >>>> wrote: >>> > > > > >>> > > > > Hi all, >>> > > > > >>> > > > > Can I make any assumption on the orientation >>> of >>> > triangular >>> > > > facets in a >>> > > > > tetrahedral plex ? I need the inward facet >>> > normals. Do >>> > > I need >>> > > > to use >>> > > > > DMPlexGetOrientedFace or can I rely on either >>> > the tet >>> > > vertices >>> > > > > ordering, >>> > > > > or the faces ordering ? Could >>> > > DMPlexGetRawFaces_Internal be >>> > > > enough ? >>> > > > > >>> > > > > >>> > > > > You can do it by hand, but you have to account for >>> > the face >>> > > > orientation >>> > > > > relative to the cell. That is what >>> > > > > DMPlexGetOrientedFace() does. I think it would be >>> > easier >>> > > to use the >>> > > > > function below. >>> > > > > >>> > > > > Alternatively, is there a function that >>> > computes the >>> > > normals >>> > > > - without >>> > > > > bringing out the big guns ? >>> > > > > >>> > > > > >>> > > > > This will compute the normals >>> > > > > >>> > > > > >>> > > > >>> > > >>> > >>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexComputeCellGeometryFVM.html >>> >>> > > > > Should not be too heavy weight. >>> > > > > >>> > > > > THanks, >>> > > > > >>> > > > > Matt >>> > > > > >>> > > > > Thanks >>> > > > > >>> > > > > -- >>> > > > > Nicolas >>> > > > > >>> > > > > >>> > > > > >>> > > > > -- >>> > > > > What most experimenters take for granted before >>> > they begin >>> > > their >>> > > > > experiments is infinitely more interesting than >>> any >>> > > results to which >>> > > > > their experiments lead. >>> > > > > -- Norbert Wiener >>> > > > > >>> > > > > https://www.cse.buffalo.edu/~knepley/ >>> >>> > > > >> >>> > >>> > > > >>> > > > >>> > > > >>> > > > -- >>> > > > What most experimenters take for granted before they >>> begin >>> > their >>> > > > experiments is infinitely more interesting than any >>> > results to which >>> > > > their experiments lead. >>> > > > -- Norbert Wiener >>> > > > >>> > > > https://www.cse.buffalo.edu/~knepley/ >>> >>> > > >> >>> > >>> > > >>> > > >>> > > >>> > > -- >>> > > What most experimenters take for granted before they begin their >>> > > experiments is infinitely more interesting than any results to >>> which >>> > > their experiments lead. >>> > > -- Norbert Wiener >>> > > >>> > > https://www.cse.buffalo.edu/~knepley/ >>> >>> > >> >>> > >>> > >>> > >>> > >>> > -- >>> > What most experimenters take for granted before they begin their >>> > experiments is infinitely more interesting than any results to which >>> > their experiments lead. >>> > -- Norbert Wiener >>> > >>> > https://www.cse.buffalo.edu/~knepley/ >>> >>> >> >>> > >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.barral at math.u-bordeaux.fr Mon Mar 8 10:18:22 2021 From: nicolas.barral at math.u-bordeaux.fr (Nicolas Barral) Date: Mon, 8 Mar 2021 17:18:22 +0100 Subject: [petsc-users] DMPlex tetrahedra facets orientation In-Reply-To: References: <8f788afe-01fd-98db-957c-450cd43a18a8@math.u-bordeaux.fr> <1c4a69aa-ff26-848c-770d-89756a87be66@math.u-bordeaux.fr> <42072e8c-24de-ec4a-ee8e-7a9356f224bc@math.u-bordeaux.fr> Message-ID: <3b1b02d9-bf81-74b0-879a-eba7f6965574@math.u-bordeaux.fr> On 08/03/2021 15:55, Matthew Knepley wrote: > On Mon, Mar 8, 2021 at 4:02 AM Nicolas Barral > > wrote: > > On 07/03/2021 22:56, Matthew Knepley wrote: > > On Sun, Mar 7, 2021 at 4:51 PM Nicolas Barral > > > > >> wrote: > > > > > >? ? ?On 07/03/2021 22:30, Matthew Knepley wrote: > >? ? ? > On Sun, Mar 7, 2021 at 4:13 PM Nicolas Barral > >? ? ? > > >? ? ? > > >? ? ? > > >? ? ? >>> wrote: > >? ? ? > > >? ? ? >? ? ?On 07/03/2021 16:54, Matthew Knepley wrote: > >? ? ? >? ? ? > On Sun, Mar 7, 2021 at 8:52 AM Nicolas Barral > >? ? ? >? ? ? > > >? ? ? > > >? ? ? >? ? ? > >? ? ? >> > >? ? ? >? ? ? > > >? ? ? > > >? ? ? >? ? ? > >? ? ? >>>> wrote: > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?Matt, > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?Thanks for your answer. > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?However, DMPlexComputeCellGeometryFVM does not > compute > >? ? ?what I > >? ? ? >? ? ?need > >? ? ? >? ? ? >? ? ?(normals of height 1 entities). I can't find any > >? ? ?function doing > >? ? ? >? ? ? >? ? ?that, is > >? ? ? >? ? ? >? ? ?there one ? > >? ? ? >? ? ? > > >? ? ? >? ? ? > > >? ? ? >? ? ? > The normal[] in?DMPlexComputeCellGeometryFVM() is > exactly what > >? ? ? >? ? ?you want. > >? ? ? >? ? ? > What does not look right to you? > >? ? ? > > >? ? ? > > >? ? ? >? ? ?So it turns out it's not what I want because I need > >? ? ?non-normalized > >? ? ? >? ? ?normals. It doesn't seem like I can easily retrieve > the norm, > >? ? ?can I? > >? ? ? > > >? ? ? > > >? ? ? > You just want area-weighted normals I think, which?means > that you > >? ? ?just > >? ? ? > multiply by the area, > >? ? ? > which comes back in the same function. > >? ? ? > > > > >? ? ?Ah by the area times 2, of course, my bad. > >? ? ?Do you order height-1 elements in a certain way ? I need to > access the > >? ? ?facet (resp. edge) opposite to a vertex in a tet (resp. > triangle). > > > > > > Yes. Now that I have pretty much settled on it, I will put it in the > > manual. It is currently here: > > > > > https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/plexinterpolate.c#L56 > > > > All normals are outward facing, but hopefully the ordering in the > sourse > > file makes sense. > > Thanks Matt, but I'm not sure I understand well. What I do so far is: > > ierr = DMPlexGetCone(dm, c, &cone);CHKERRQ(ierr); > ? ?for (i=0; i ? ? ?f = cone[i]; > ? ? ?ierr = DMPlexComputeCellGeometryFVM(dm, f, &area, NULL, > &vn[i*dim]);CHKERRQ(ierr); > ? ? 
?if (dim == 3) { area *= 2; } > ? ? ?for (j=0; j > So in 3D, it seems that: > (vn[9],vn[10],vn[11]) is the inward normal to the facet opposing vertex0 > (vn[6],vn[7],vn[8])? ? ? ? ? ? ?"? ? ? ? ? ? ? ? ? ? "? ? ? ? ? ? ? ? ?1 > (vn[3],vn[4],vn[5])? ? ? ? ? ? ?"? ? ? ? ? ? ? ? ? ? "? ? ? ? ? ? ? ? ?2 > (vn[0],vn[1],vn[2])? ? ? ? ? ? ?"? ? ? ? ? ? ? ? ? ? "? ? ? ? ? ? ? ? ?3 > > in 2D: > (vn[2],vn[3]) is a normal to the edge opposing vertex 0 > (vn[4],vn[5])? ? ? ? ? "? ? ? ? ? ? ? ? ? "? ? ? ? ? ?1 > (vn[0],vn[1])? ? ? ? ? "? ? ? ? ? ? ? ? ? "? ? ? ? ? ?2 > Yet in 2D, whether the normals are inward or outward does not seem > consistent across elements. > > What am I wrongly assuming ? > > > Ah, I see the problem. I probably need another function. You can tell > that not many people use Plex this way yet. > The logic for what you want is embedded my traversal, but it simple: > > ierr = DMPlexGetConeSize(dm, c, &coneSize);CHKERRQ(ierr); > ierr = DMPlexGetCone(dm, c, &cone);CHKERRQ(ierr); > ierr = DMPlexGetConeOrientation(dm, c, &ornt);CHKERRQ(ierr); > ? ?for (i=0; i ? ? ?f = cone[i]; > ? ? ?flip = ornt[i] >= 0 ? 1 : -1; > ? ? ?ierr = DMPlexComputeCellGeometryFVM(dm, f, &area, NULL, > &vn[i*dim]);CHKERRQ(ierr); > ? ? ?if (dim == 3) { area *= 2; } > ? ? ?for (j=0; j I could make a function that returns all normals, properly oriented. It > would just do this. Ah this works now, thanks Matt. Toby is correct, it is ultimately related to Jacobians, and what I need can be done differently, not sure it's clearer though. Out of curiosity, what is the logic in the facet ordering ? Thanks -- Nicolas > > ? Thanks, > > ? ? ?Matt > > Thanks, > > -- > Nicolas > > > > >? ? Thanks, > > > >? ? ? Matt > > > >? ? ?Thanks > > > >? ? ?-- > >? ? ?Nicolas > > > > > >? ? ? >? ? Thanks, > >? ? ? > > >? ? ? >? ? ? Matt > >? ? ? > > >? ? ? >? ? ?If not, I'll fallback to computing them by hand for > now. Is the > >? ? ? >? ? ?following assumption safe or do I have to use > >? ? ?DMPlexGetOrientedFace? > >? ? ? >? ? ? ?>? if I call P0P1P2P3 a tet and note x the cross > product, > >? ? ? >? ? ? ?>? P3P2xP3P1 is the outward normal to face P1P2P3 > >? ? ? >? ? ? ?>? P0P2xP0P3? ? ? ? ? ? ? "? ? ? ? ? ? ? ? P0P2P3 > >? ? ? >? ? ? ?>? P3P1xP3P0? ? ? ? ? ? ? "? ? ? ? ? ? ? ? P0P1P3 > >? ? ? >? ? ? ?>? P0P1xP0P2? ? ? ? ? ? ? "? ? ? ? ? ? ? ? P0P1P2 > >? ? ? > > >? ? ? >? ? ?Thanks > >? ? ? > > >? ? ? >? ? ?-- > >? ? ? >? ? ?Nicolas > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? Thanks, > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ? Matt > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?So far I've been doing it by hand, and after a > lot of > >? ? ? >? ? ?experimenting the > >? ? ? >? ? ? >? ? ?past weeks, it seems that if I call P0P1P2P3 a > tetrahedron > >? ? ? >? ? ?and note x > >? ? ? >? ? ? >? ? ?the cross product, > >? ? ? >? ? ? >? ? ?P3P2xP3P1 is the outward normal to face P1P2P3 > >? ? ? >? ? ? >? ? ?P0P2xP0P3? ? ? ? ? ? ? "? ? ? ? ? ? ? ? P0P2P3 > >? ? ? >? ? ? >? ? ?P3P1xP3P0? ? ? ? ? ? ? "? ? ? ? ? ? ? ? P0P1P3 > >? ? ? >? ? ? >? ? ?P0P1xP0P2? ? ? ? ? ? ? "? ? ? ? ? ? ? ? P0P1P2 > >? ? ? >? ? ? >? ? ?Have I been lucky but can't expect it to be true ? > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?(Alternatively, there is a link between the normals > >? ? ?and the > >? ? ? >? ? ?element > >? ? ? >? ? ? >? ? ?Jacobian, but I don't know the formula and can > find them) > >? ? ? >? ? ? > > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?Thanks, > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?-- > >? ? ? >? ? ? >? ? ?Nicolas > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? 
> https://www.cse.buffalo.edu/~knepley/ > >? ? ? > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From knepley at gmail.com Mon Mar 8 12:22:39 2021 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 8 Mar 2021 13:22:39 -0500 Subject: [petsc-users] DMPlex tetrahedra facets orientation In-Reply-To: <3b1b02d9-bf81-74b0-879a-eba7f6965574@math.u-bordeaux.fr> References: <8f788afe-01fd-98db-957c-450cd43a18a8@math.u-bordeaux.fr> <1c4a69aa-ff26-848c-770d-89756a87be66@math.u-bordeaux.fr> <42072e8c-24de-ec4a-ee8e-7a9356f224bc@math.u-bordeaux.fr> <3b1b02d9-bf81-74b0-879a-eba7f6965574@math.u-bordeaux.fr> Message-ID: On Mon, Mar 8, 2021 at 11:18 AM Nicolas Barral < nicolas.barral at math.u-bordeaux.fr> wrote: > On 08/03/2021 15:55, Matthew Knepley wrote: > > On Mon, Mar 8, 2021 at 4:02 AM Nicolas Barral > > > > wrote: > > > > On 07/03/2021 22:56, Matthew Knepley wrote: > > > On Sun, Mar 7, 2021 at 4:51 PM Nicolas Barral > > > > > > > > >> wrote: > > > > > > > > > On 07/03/2021 22:30, Matthew Knepley wrote: > > > > On Sun, Mar 7, 2021 at 4:13 PM Nicolas Barral > > > > > > > > > > > > > > > > > > > >>> wrote: > > > > > > > > On 07/03/2021 16:54, Matthew Knepley wrote: > > > > > On Sun, Mar 7, 2021 at 8:52 AM Nicolas Barral > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > >>>> wrote: > > > > > > > > > > Matt, > > > > > > > > > > Thanks for your answer. > > > > > > > > > > However, DMPlexComputeCellGeometryFVM does not > > compute > > > what I > > > > need > > > > > (normals of height 1 entities). I can't find any > > > function doing > > > > > that, is > > > > > there one ? > > > > > > > > > > > > > > > The normal[] in DMPlexComputeCellGeometryFVM() is > > exactly what > > > > you want. > > > > > What does not look right to you? > > > > > > > > > > > > So it turns out it's not what I want because I need > > > non-normalized > > > > normals. It doesn't seem like I can easily retrieve > > the norm, > > > can I? > > > > > > > > > > > > You just want area-weighted normals I think, which means > > that you > > > just > > > > multiply by the area, > > > > which comes back in the same function. > > > > > > > > > > Ah by the area times 2, of course, my bad. > > > Do you order height-1 elements in a certain way ? I need to > > access the > > > facet (resp. edge) opposite to a vertex in a tet (resp. > > triangle). > > > > > > > > > Yes. Now that I have pretty much settled on it, I will put it in > the > > > manual. It is currently here: > > > > > > > > > https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/plexinterpolate.c#L56 > > > > > > All normals are outward facing, but hopefully the ordering in the > > sourse > > > file makes sense. > > > > Thanks Matt, but I'm not sure I understand well. 
What I do so far is: > > > > ierr = DMPlexGetCone(dm, c, &cone);CHKERRQ(ierr); > > for (i=0; i > f = cone[i]; > > ierr = DMPlexComputeCellGeometryFVM(dm, f, &area, NULL, > > &vn[i*dim]);CHKERRQ(ierr); > > if (dim == 3) { area *= 2; } > > for (j=0; j > > > So in 3D, it seems that: > > (vn[9],vn[10],vn[11]) is the inward normal to the facet opposing > vertex0 > > (vn[6],vn[7],vn[8]) " " > 1 > > (vn[3],vn[4],vn[5]) " " > 2 > > (vn[0],vn[1],vn[2]) " " > 3 > > > > in 2D: > > (vn[2],vn[3]) is a normal to the edge opposing vertex 0 > > (vn[4],vn[5]) " " 1 > > (vn[0],vn[1]) " " 2 > > Yet in 2D, whether the normals are inward or outward does not seem > > consistent across elements. > > > > What am I wrongly assuming ? > > > > > > Ah, I see the problem. I probably need another function. You can tell > > that not many people use Plex this way yet. > > The logic for what you want is embedded my traversal, but it simple: > > > > ierr = DMPlexGetConeSize(dm, c, &coneSize);CHKERRQ(ierr); > > ierr = DMPlexGetCone(dm, c, &cone);CHKERRQ(ierr); > > ierr = DMPlexGetConeOrientation(dm, c, &ornt);CHKERRQ(ierr); > > for (i=0; i > f = cone[i]; > > flip = ornt[i] >= 0 ? 1 : -1; > > ierr = DMPlexComputeCellGeometryFVM(dm, f, &area, NULL, > > &vn[i*dim]);CHKERRQ(ierr); > > if (dim == 3) { area *= 2; } > > for (j=0; j > I could make a function that returns all normals, properly oriented. It > > would just do this. > > Ah this works now, thanks Matt. Toby is correct, it is ultimately > related to Jacobians, and what I need can be done differently, not sure > it's clearer though. > > Out of curiosity, what is the logic in the facet ordering ? > The order of faces in a cell was somewhat arbitrary. However, I wanted that select vertices from closure(cell) = vertices before interpolation of cell so the canonical orientation of face should have the vertices such that they give me the order of vertices I expect in cell-vertex meshes. This way Uninterpolate(Interpolate(dm)) is idempotent. Thanks, Matt > Thanks > > -- > Nicolas > > > > > Thanks, > > > > Matt > > > > Thanks, > > > > -- > > Nicolas > > > > > > > > Thanks, > > > > > > Matt > > > > > > Thanks > > > > > > -- > > > Nicolas > > > > > > > > > > Thanks, > > > > > > > > Matt > > > > > > > > If not, I'll fallback to computing them by hand for > > now. Is the > > > > following assumption safe or do I have to use > > > DMPlexGetOrientedFace? > > > > > if I call P0P1P2P3 a tet and note x the cross > > product, > > > > > P3P2xP3P1 is the outward normal to face P1P2P3 > > > > > P0P2xP0P3 " P0P2P3 > > > > > P3P1xP3P0 " P0P1P3 > > > > > P0P1xP0P2 " P0P1P2 > > > > > > > > Thanks > > > > > > > > -- > > > > Nicolas > > > > > > > > > > Thanks, > > > > > > > > > > Matt > > > > > > > > > > So far I've been doing it by hand, and after a > > lot of > > > > experimenting the > > > > > past weeks, it seems that if I call P0P1P2P3 a > > tetrahedron > > > > and note x > > > > > the cross product, > > > > > P3P2xP3P1 is the outward normal to face P1P2P3 > > > > > P0P2xP0P3 " P0P2P3 > > > > > P3P1xP3P0 " P0P1P3 > > > > > P0P1xP0P2 " P0P1P2 > > > > > Have I been lucky but can't expect it to be > true ? 
> > > > > > > > > > (Alternatively, there is a link between the > normals > > > and the > > > > element > > > > > Jacobian, but I don't know the formula and can > > find them) > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > -- > > > > > Nicolas > > > > > > > > > > On 08/02/2021 15:19, Matthew Knepley wrote: > > > > > > On Mon, Feb 8, 2021 at 6:01 AM Nicolas Barral > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > >>>>> wrote: > > > > > > > > > > > > Hi all, > > > > > > > > > > > > Can I make any assumption on the > > orientation of > > > triangular > > > > > facets in a > > > > > > tetrahedral plex ? I need the inward > facet > > > normals. Do > > > > I need > > > > > to use > > > > > > DMPlexGetOrientedFace or can I rely on > > either > > > the tet > > > > vertices > > > > > > ordering, > > > > > > or the faces ordering ? Could > > > > DMPlexGetRawFaces_Internal be > > > > > enough ? > > > > > > > > > > > > > > > > > > You can do it by hand, but you have to > > account for > > > the face > > > > > orientation > > > > > > relative to the cell. That is what > > > > > > DMPlexGetOrientedFace() does. I think it > > would be > > > easier > > > > to use the > > > > > > function below. > > > > > > > > > > > > Alternatively, is there a function that > > > computes the > > > > normals > > > > > - without > > > > > > bringing out the big guns ? > > > > > > > > > > > > > > > > > > This will compute the normals > > > > > > > > > > > > > > > > > > > > > > > > > > > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexComputeCellGeometryFVM.html > > > > > > Should not be too heavy weight. > > > > > > > > > > > > THanks, > > > > > > > > > > > > Matt > > > > > > > > > > > > Thanks > > > > > > > > > > > > -- > > > > > > Nicolas > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > What most experimenters take for granted > before > > > they begin > > > > their > > > > > > experiments is infinitely more interesting > > than any > > > > results to which > > > > > > their experiments lead. > > > > > > -- Norbert Wiener > > > > > > > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > What most experimenters take for granted before > > they begin > > > their > > > > > experiments is infinitely more interesting than any > > > results to which > > > > > their experiments lead. > > > > > -- Norbert Wiener > > > > > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > > > > > > > > > > -- > > > > What most experimenters take for granted before they begin > > their > > > > experiments is infinitely more interesting than any > > results to which > > > > their experiments lead. > > > > -- Norbert Wiener > > > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > > > experiments is infinitely more interesting than any results to > which > > > their experiments lead. > > > -- Norbert Wiener > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. 
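Putting the two snippets above back together (the plain-text archive dropped the "<" in the loop bounds), a self-contained sketch of the per-cell facet-normal computation discussed in this thread could look as follows. The orientation flip and the factor of 2 in 3D follow the remarks above; routine names are the standard DMPlex ones already used in the snippets, and "outward" is meant under the convention described above (negate for inward normals).

  #include <petscdmplex.h>

  /* Sketch: area-weighted facet normals of cell c, oriented consistently with
     respect to c. Assumes an interpolated simplex mesh and vn[] of length
     coneSize*dim. */
  static PetscErrorCode ComputeCellFacetNormals(DM dm, PetscInt c, PetscReal vn[])
  {
    const PetscInt *cone, *ornt;
    PetscInt        coneSize, dim, i, j;
    PetscErrorCode  ierr;

    PetscFunctionBeginUser;
    ierr = DMGetDimension(dm, &dim);CHKERRQ(ierr);
    ierr = DMPlexGetConeSize(dm, c, &coneSize);CHKERRQ(ierr);
    ierr = DMPlexGetCone(dm, c, &cone);CHKERRQ(ierr);
    ierr = DMPlexGetConeOrientation(dm, c, &ornt);CHKERRQ(ierr);
    for (i = 0; i < coneSize; ++i) {
      PetscReal area, flip = (ornt[i] >= 0) ? 1.0 : -1.0;

      /* normal[] comes back normalized; scaling by the facet measure gives the
         area-weighted normal, and the extra factor of 2 in 3D reproduces the
         cross-product normal, |P3P2 x P3P1| = 2*area, mentioned above */
      ierr = DMPlexComputeCellGeometryFVM(dm, cone[i], &area, NULL, &vn[i*dim]);CHKERRQ(ierr);
      if (dim == 3) area *= 2.0;
      for (j = 0; j < dim; ++j) vn[i*dim+j] *= flip*area;
    }
    PetscFunctionReturn(0);
  }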
> > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ < > http://www.cse.buffalo.edu/~knepley/> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Mon Mar 8 18:54:54 2021 From: fdkong.jd at gmail.com (Fande Kong) Date: Mon, 8 Mar 2021 17:54:54 -0700 Subject: [petsc-users] Exhausted all shared linker guesses. Could not determine how to create a shared library! Message-ID: Hi All, mpicc rejected "-fPIC". Anyone has a clue how to work around this issue? The log was attached. Thanks so much, Fande -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 221111 bytes Desc: not available URL: From knepley at gmail.com Mon Mar 8 19:07:42 2021 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 8 Mar 2021 20:07:42 -0500 Subject: [petsc-users] Exhausted all shared linker guesses. Could not determine how to create a shared library! In-Reply-To: References: Message-ID: On Mon, Mar 8, 2021 at 7:55 PM Fande Kong wrote: > Hi All, > > mpicc rejected "-fPIC". Anyone has a clue how to work around this issue? > The failure is at the last step Executing: mpicc -o /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest -fPIC /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest.o -L/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers -lconftest Possible ERROR while running linker: exit code 1 stderr: ld: can't link with a main executable file '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' for architecture x86_64 clang-11: error: linker command failed with exit code 1 (use -v to see invocation) but you have some flags stuck in which may or may not affect this. I would try shutting them off: LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs -rpath /Users/kongf/miniconda3/envs/moose/lib -L/Users/kongf/miniconda3/envs/moose/lib I cannot tell exactly why clang is failing because it does not report a specific error. Thanks, Matt The log was attached. > > Thanks so much, > > Fande > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Mon Mar 8 19:23:22 2021 From: fdkong.jd at gmail.com (Fande Kong) Date: Mon, 8 Mar 2021 18:23:22 -0700 Subject: [petsc-users] Exhausted all shared linker guesses. Could not determine how to create a shared library! In-Reply-To: References: Message-ID: Thanks Matthew, Hmm, we still have the same issue after shutting off all unknown flags. Thanks, Fande On Mon, Mar 8, 2021 at 6:07 PM Matthew Knepley wrote: > On Mon, Mar 8, 2021 at 7:55 PM Fande Kong wrote: > >> Hi All, >> >> mpicc rejected "-fPIC". Anyone has a clue how to work around this issue? 
>> > > The failure is at the last step > > Executing: mpicc -o > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest > -fPIC > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest.o > -L/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers > -lconftest > > Possible ERROR while running linker: exit code 1 > > stderr: > > ld: can't link with a main executable file > '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' > for architecture x86_64 > > clang-11: error: linker command failed with exit code 1 (use -v to see > invocation) > > but you have some flags stuck in which may or may not affect this. I would > try shutting them off: > > LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs -rpath > /Users/kongf/miniconda3/envs/moose/lib > -L/Users/kongf/miniconda3/envs/moose/lib > > I cannot tell exactly why clang is failing because it does not report a > specific error. > > Thanks, > > Matt > > The log was attached. >> >> Thanks so much, >> >> Fande >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 216645 bytes Desc: not available URL: From knepley at gmail.com Mon Mar 8 19:31:42 2021 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 8 Mar 2021 20:31:42 -0500 Subject: [petsc-users] Exhausted all shared linker guesses. Could not determine how to create a shared library! In-Reply-To: References: Message-ID: On Mon, Mar 8, 2021 at 8:23 PM Fande Kong wrote: > Thanks Matthew, > > Hmm, we still have the same issue after shutting off all unknown flags. > Oh, I was misinterpreting the error message: ld: can't link with a main executable file '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' So clang did not _actually_ make a shared library, it made an executable. Did clang-11 change the options it uses to build a shared library? Satish, do we test with clang-11? Thanks, Matt Thanks, > > Fande > > On Mon, Mar 8, 2021 at 6:07 PM Matthew Knepley wrote: > >> On Mon, Mar 8, 2021 at 7:55 PM Fande Kong wrote: >> >>> Hi All, >>> >>> mpicc rejected "-fPIC". Anyone has a clue how to work around this issue? >>> >> >> The failure is at the last step >> >> Executing: mpicc -o >> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest >> -fPIC >> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest.o >> -L/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers >> -lconftest >> >> Possible ERROR while running linker: exit code 1 >> >> stderr: >> >> ld: can't link with a main executable file >> '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' >> for architecture x86_64 >> >> clang-11: error: linker command failed with exit code 1 (use -v to see >> invocation) >> >> but you have some flags stuck in which may or may not affect this. 
I >> would try shutting them off: >> >> LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs -rpath >> /Users/kongf/miniconda3/envs/moose/lib >> -L/Users/kongf/miniconda3/envs/moose/lib >> >> I cannot tell exactly why clang is failing because it does not report a >> specific error. >> >> Thanks, >> >> Matt >> >> The log was attached. >>> >>> Thanks so much, >>> >>> Fande >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Mar 8 20:28:04 2021 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 8 Mar 2021 20:28:04 -0600 Subject: [petsc-users] Exhausted all shared linker guesses. Could not determine how to create a shared library! In-Reply-To: References: Message-ID: <91DF0A8C-547A-42BC-81BA-ADFF7CA948C8@petsc.dev> Fande, I see you are using CONDA, this can cause issues since it sticks all kinds of things into the environment. PETSc tries to remove some of them but perhaps not enough. If you run printenv you will see all the mess it is dumping in. Can you trying the same build without CONDA environment? Barry > On Mar 8, 2021, at 7:31 PM, Matthew Knepley wrote: > > On Mon, Mar 8, 2021 at 8:23 PM Fande Kong > wrote: > Thanks Matthew, > > Hmm, we still have the same issue after shutting off all unknown flags. > > Oh, I was misinterpreting the error message: > > ld: can't link with a main executable file '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' > > So clang did not _actually_ make a shared library, it made an executable. Did clang-11 change the options it uses to build a shared library? > > Satish, do we test with clang-11? > > Thanks, > > Matt > > Thanks, > > Fande > > On Mon, Mar 8, 2021 at 6:07 PM Matthew Knepley > wrote: > On Mon, Mar 8, 2021 at 7:55 PM Fande Kong > wrote: > Hi All, > > mpicc rejected "-fPIC". Anyone has a clue how to work around this issue? > > The failure is at the last step > > Executing: mpicc -o /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest -fPIC /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest.o -L/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers -lconftest > Possible ERROR while running linker: exit code 1 > stderr: > ld: can't link with a main executable file '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' for architecture x86_64 > clang-11: error: linker command failed with exit code 1 (use -v to see invocation) > > but you have some flags stuck in which may or may not affect this. I would try shutting them off: > > LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs -rpath /Users/kongf/miniconda3/envs/moose/lib -L/Users/kongf/miniconda3/envs/moose/lib > > I cannot tell exactly why clang is failing because it does not report a specific error. > > Thanks, > > Matt > > The log was attached. 
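For anyone hitting the same configure failure, the check that is failing can be reproduced by hand, which makes it easier to see whether the flags injected by the conda environment are what break the shared-library step. A rough sketch (file names and flags here are illustrative, not taken from configure.log):

  # what configure is essentially doing: build a tiny shared library and
  # link a main program against it
  printf 'int foo(void) { return 7; }\n' > conftest.c
  printf 'int foo(void);\nint main(void) { return foo(); }\n' > main.c

  mpicc -c -fPIC conftest.c main.c
  # on macOS a shared library is produced with -dynamiclib (or -shared);
  # the error above says the libconftest.dylib that configure produced was
  # in fact a main executable, so it is worth checking what this step yields
  mpicc -dynamiclib -o libconftest.dylib conftest.o
  file libconftest.dylib        # expect "dynamically linked shared library"
  mpicc -o conftest main.o -L. -lconftest

  # Barry's suggestion: see what the conda environment is injecting
  printenv | grep -iE 'flags|^ld'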
> > Thanks so much, > > Fande > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From e0425375 at gmail.com Tue Mar 9 02:07:16 2021 From: e0425375 at gmail.com (Florian Bruckner) Date: Tue, 9 Mar 2021 09:07:16 +0100 Subject: [petsc-users] using preconditioner with SLEPc In-Reply-To: References: <7C5B30FE-C539-4A14-B442-B1C91618E4AC@petsc.dev> <119944FD-4F1E-4B2F-A39D-65ADDB12BB5F@petsc.dev> <6EF7889D-DC17-46FC-82A5-9409C41E231D@petsc.dev> <46C744D7-4376-46B3-B5C4-211A4C8C2291@dsic.upv.es> <80BCEEDC-4C1E-4512-AAF5-7B6E718C7D1D@dsic.upv.es> Message-ID: Dear Jose, I asked Lawrence Mitchell from the firedrake people to help me with the slepc update (I think they are applying some modifications for petsc, which is why simply updating petsc within my docker container did not work). Now the latest slepc version runs and I already get some results of the eigenmode solver. The good thing is that the solver runs significantly faster. The bad thing is that the results are still wrong :-) Could you have a short look at the code: es = SLEPc.EPS().create(comm=fd.COMM_WORLD) es.setDimensions(k) es.setType(SLEPc.EPS.Type.KRYLOVSCHUR) es.setProblemType(SLEPc.EPS.ProblemType.PGNHEP) # Generalized Non-Hermitian eigenproblem with positive definite B es.setWhichEigenpairs(SLEPc.EPS.Which.LARGEST_MAGNITUDE) #es.setTrueResidual(True) es.setTolerances(1e-10) es.setOperators(self.B0, self.A0) es.setFromOptions() st = es.getST() st.setPreconditionerMat(self.P0) You wrote that when using shift-and-invert with target=0 the solver internally uses A0^{-1}*B0. Nevertheless I think the precond P0 mat should be an approximation of A0, right? This is because the solver uses the B0-inner product to preserve symmetry. Or is the B0 missing in my code? As I mentioned before, convergence of the method is extremely fast. I thought that maybe the tolerance is set too low, but increasing it did not change the situation. With using setTrueResidual, there is no convergence at all. Figures show the different results for the original scipy method (which has been compared to published results) as well as the new slepc method. For some strange reason I get nearly the same (wrong) results if i replace A0 with P0 in the original scipy code. In my case A0 is a non-local field operator and P0 only contains local and next-neighbour interaction. Is it possible that the wrong operator (P0 instead of A0) is used internally? best wishes Florian On Thu, Feb 18, 2021 at 1:00 PM Florian Bruckner wrote: > Dear Jose, > thanks for your work. I just looked over the code, but I didn't have time > to implement our solver, yet. > If I understand the code correctly, it allows to set a precond-matrix > which should approximate A-sigma*B. > > I will try to get our code running in the next few weeks. From user > perspective it would maybe simplify things if approximations for A as well > as B are given, since this would hide the internal ST transformations. > > best wishes > Florian > > On Tue, Feb 16, 2021 at 8:54 PM Jose E. 
Roman wrote: > >> Florian: I have created a MR >> https://gitlab.com/slepc/slepc/-/merge_requests/149 >> Let me know if it fits your needs. >> >> Jose >> >> >> > El 15 feb 2021, a las 18:44, Jose E. Roman >> escribi?: >> > >> > >> > >> >> El 15 feb 2021, a las 14:53, Matthew Knepley >> escribi?: >> >> >> >> On Mon, Feb 15, 2021 at 7:27 AM Jose E. Roman >> wrote: >> >> I will think about the viability of adding an interface function to >> pass the preconditioner matrix. >> >> >> >> Regarding the question about the B-orthogonality of computed vectors, >> in the symmetric solver the B-orthogonality is enforced during the >> computation, so you have guarantee that the computed vectors satisfy it. >> But if solved as non-symetric, the computed vectors may depart from >> B-orthogonality, unless the tolerance is very small. >> >> >> >> Yes, the vectors I generate are not B-orthogonal. >> >> >> >> Jose, do you think there is a way to reformulate what I am doing to >> use the symmetric solver, even if we only have the action of B? >> > >> > Yes, you can do the following: >> > >> > ierr = EPSSetOperators(eps,S,NULL);CHKERRQ(ierr); // S is your shell >> matrix A^{-1}*B >> > ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRQ(ierr); // symmetric >> problem though S is not symmetric >> > ierr = EPSSetFromOptions(eps);CHKERRQ(ierr); >> > ierr = EPSSetUp(eps);CHKERRQ(ierr); // note explicitly calling setup >> here >> > ierr = EPSGetBV(eps,&bv);CHKERRQ(ierr); >> > ierr = BVSetMatrix(bv,B,PETSC_FALSE);CHKERRQ(ierr); // replace >> solver's inner product >> > ierr = EPSSolve(eps);CHKERRQ(ierr); >> > >> > I have tried this with test1.c and it works. The computed eigenvectors >> should be B-orthogonal in this case. >> > >> > Jose >> > >> > >> >> >> >> Thanks, >> >> >> >> Matt >> >> >> >> Jose >> >> >> >> >> >>> El 14 feb 2021, a las 21:41, Barry Smith escribi?: >> >>> >> >>> >> >>> Florian, >> >>> >> >>> I'm sorry I don't know the answers; I can only speculate. There is >> a STGetShift(). >> >>> >> >>> All I was saying is theoretically there could/should be such >> support in SLEPc. >> >>> >> >>> Barry >> >>> >> >>> >> >>>> On Feb 13, 2021, at 6:43 PM, Florian Bruckner >> wrote: >> >>>> >> >>>> Dear Barry, >> >>>> thank you for your clarification. What I wanted to say is that even >> if I could reset the KSP operators directly I would require to know which >> transformation ST applies in order to provide the preconditioning matrix >> for the correct operator. >> >>>> The more general solution would be that SLEPc provides the interface >> to pass the preconditioning matrix for A0 and ST applies the same >> transformations as for the operator. >> >>>> >> >>>> If you write "SLEPc could provide an interface", do you mean someone >> should implement it, or should it already be possible and I am not using it >> correctly? >> >>>> I wrote a small standalone example based on ex9.py from slepc4py, >> where i tried to use an operator. >> >>>> >> >>>> best wishes >> >>>> Florian >> >>>> >> >>>> On Sat, Feb 13, 2021 at 7:15 PM Barry Smith >> wrote: >> >>>> >> >>>> >> >>>>> On Feb 13, 2021, at 2:47 AM, Pierre Jolivet >> wrote: >> >>>>> >> >>>>> >> >>>>> >> >>>>>> On 13 Feb 2021, at 7:25 AM, Florian Bruckner >> wrote: >> >>>>>> >> >>>>>> Dear Jose, Dear Barry, >> >>>>>> thanks again for your reply. One final question about the B0 >> orthogonality. Do you mean that eigenvectors are not B0 orthogonal, but >> they are i*B0 orthogonal? or is there an issue with Matt's approach? 
>> >>>>>> For my problem I can show that eigenvalues fulfill an >> orthogonality relation (phi_i, A0 phi_j ) = omega_i (phi_i, B0 phi_j) = >> delta_ij. This should be independent of the solving method, right? >> >>>>>> >> >>>>>> Regarding Barry's advice this is what I first tried: >> >>>>>> es = SLEPc.EPS().create(comm=fd.COMM_WORLD) >> >>>>>> st = es.getST() >> >>>>>> ksp = st.getKSP() >> >>>>>> ksp.setOperators(self.A0, self.P0) >> >>>>>> >> >>>>>> But it seems that the provided P0 is not used. Furthermore the >> interface is maybe a bit confusing if ST performs some transformation. In >> this case P0 needs to approximate A0^{-1}*B0 and not A0, right? >> >>>>> >> >>>>> No, you need to approximate (A0-sigma B0)^-1. If you have a null >> shift, which looks like it is the case, you end up with A0^-1. >> >>>> >> >>>> Just trying to provide more clarity with the terms. >> >>>> >> >>>> If ST transforms the operator in the KSP to (A0-sigma B0) and you >> are providing the "sparse matrix from which the preconditioner is to be >> built" then you need to provide something that approximates (A0-sigma B0). >> Since the PC will use your matrix to construct a preconditioner that >> approximates the inverse of (A0-sigma B0), you don't need to directly >> provide something that approximates (A0-sigma B0)^-1 >> >>>> >> >>>> Yes, I would think SLEPc could provide an interface where it manages >> "the matrix from which to construct the preconditioner" and transforms that >> matrix just like the true matrix. To do it by hand you simply need to know >> what A0 and B0 are and which sigma ST has selected and then you can >> construct your modA0 - sigma modB0 and pass it to the KSP. Where modA0 and >> modB0 are your "sparser approximations". >> >>>> >> >>>> Barry >> >>>> >> >>>> >> >>>>> >> >>>>>> Nevertheless I think it would be the best solution if one could >> provide P0 (approx A0) and SLEPc derives the preconditioner from this. >> Would this be hard to implement? >> >>>>> >> >>>>> This is what Barry?s suggestion is implementing. Don?t know why it >> doesn?t work with your Python operator though. >> >>>>> >> >>>>> Thanks, >> >>>>> Pierre >> >>>>> >> >>>>>> best wishes >> >>>>>> Florian >> >>>>>> >> >>>>>> >> >>>>>> On Sat, Feb 13, 2021 at 4:19 AM Barry Smith >> wrote: >> >>>>>> >> >>>>>> >> >>>>>>> On Feb 12, 2021, at 2:32 AM, Florian Bruckner >> wrote: >> >>>>>>> >> >>>>>>> Dear Jose, Dear Matt, >> >>>>>>> >> >>>>>>> I needed some time to think about your answers. >> >>>>>>> If I understand correctly, the eigenmode solver internally uses >> A0^{-1}*B0, which is normally handled by the ST object, which creates a KSP >> solver and a corresponding preconditioner. >> >>>>>>> What I would need is an interface to provide not only the system >> Matrix A0 (which is an operator), but also a preconditioning matrix (sparse >> approximation of the operator). >> >>>>>>> Unfortunately this interface is not available, right? >> >>>>>> >> >>>>>> If SLEPc does not provide this directly it is still intended to >> be trivial to provide the "preconditioner matrix" (that is matrix from >> which the preconditioner is built). Just get the KSP from the ST object and >> use KSPSetOperators() to provide the "preconditioner matrix" . >> >>>>>> >> >>>>>> Barry >> >>>>>> >> >>>>>>> >> >>>>>>> Matt directly creates A0^{-1}*B0 as a matshell operator. The >> operator uses a KSP with a proper PC internally. SLEPc would directly get >> A0^{-1}*B0 and solve a standard eigenvalue problem with this modified >> operator. 
Did I understand this correctly? >> >>>>>>> >> >>>>>>> I have two further points, which I did not mention yet: the >> matrix B0 is Hermitian, but it is (purely) imaginary (B0.real=0). Right >> now, I am using Firedrake to set up the PETSc system matrices A0, i*B0 >> (which is real). Then I convert them into ScipyLinearOperators and use >> scipy.sparse.eigsh(B0, b=A0, Minv=Minv) to calculate the eigenvalues. >> Minv=A0^-1 is also solving within scipy using a preconditioned gmres. >> Advantage of this setup is that the imaginary B0 can be handled efficiently >> and also the post-processing of the eigenvectors (which requires complex >> arithmetics) is simplified. >> >>>>>>> >> >>>>>>> Nevertheless I think that the mixing of PETSc and Scipy looks too >> complicated and is not very flexible. >> >>>>>>> If I would use Matt's approach, could I then simply switch >> between multiple standard eigenvalue methods (e.g. LOBPCG)? or is it >> limited due to the use of matshell? >> >>>>>>> Is there a solution for the imaginary B0, or do I have to use the >> non-hermitian methods? Is this a large performance drawback? >> >>>>>>> >> >>>>>>> thanks again, >> >>>>>>> and best wishes >> >>>>>>> Florian >> >>>>>>> >> >>>>>>> On Mon, Feb 8, 2021 at 3:37 PM Jose E. Roman >> wrote: >> >>>>>>> The problem can be written as A0*v=omega*B0*v and you want the >> eigenvalues omega closest to zero. If the matrices were explicitly >> available, you would do shift-and-invert with target=0, that is >> >>>>>>> >> >>>>>>> (A0-sigma*B0)^{-1}*B0*v=theta*v for sigma=0, that is >> >>>>>>> >> >>>>>>> A0^{-1}*B0*v=theta*v >> >>>>>>> >> >>>>>>> and you compute EPS_LARGEST_MAGNITUDE eigenvalues theta=1/omega. >> >>>>>>> >> >>>>>>> Matt: I guess you should have EPS_LARGEST_MAGNITUDE instead of >> EPS_SMALLEST_REAL in your code. Are you getting the eigenvalues you need? >> EPS_SMALLEST_REAL will give slow convergence. >> >>>>>>> >> >>>>>>> Florian: I would not recommend setting the KSP matrices directly, >> it may produce strange side-effects. We should have an interface function >> to pass this matrix. Currently there is STPrecondSetMatForPC() but it has >> two problems: (1) it is intended for STPRECOND, so cannot be used with >> Krylov-Schur, and (2) it is not currently available in the python interface. >> >>>>>>> >> >>>>>>> The approach used by Matt is a workaround that does not use ST, >> so you can handle linear solves with a KSP of your own. >> >>>>>>> >> >>>>>>> As an alternative, since your problem is symmetric, you could try >> LOBPCG, assuming that the leftmost eigenvalues are those that you want >> (e.g. if all eigenvalues are non-negative). In that case you could use >> STPrecondSetMatForPC(), but the remaining issue is calling it from python. >> >>>>>>> >> >>>>>>> If you are using the git repo, I could add the relevant code. >> >>>>>>> >> >>>>>>> Jose >> >>>>>>> >> >>>>>>> >> >>>>>>> >> >>>>>>>> El 8 feb 2021, a las 14:22, Matthew Knepley >> escribi?: >> >>>>>>>> >> >>>>>>>> On Mon, Feb 8, 2021 at 7:04 AM Florian Bruckner < >> e0425375 at gmail.com> wrote: >> >>>>>>>> Dear PETSc / SLEPc Users, >> >>>>>>>> >> >>>>>>>> my question is very similar to the one posted here: >> >>>>>>>> >> https://lists.mcs.anl.gov/pipermail/petsc-users/2018-August/035878.html >> >>>>>>>> >> >>>>>>>> The eigensystem I would like to solve looks like: >> >>>>>>>> B0 v = 1/omega A0 v >> >>>>>>>> B0 and A0 are both hermitian, A0 is positive definite, but only >> given as a linear operator (matshell). 
I am looking for the largest >> eigenvalues (=smallest omega). >> >>>>>>>> >> >>>>>>>> I also have a sparse approximation P0 of the A0 operator, which >> i would like to use as precondtioner, using something like this: >> >>>>>>>> >> >>>>>>>> es = SLEPc.EPS().create(comm=fd.COMM_WORLD) >> >>>>>>>> st = es.getST() >> >>>>>>>> ksp = st.getKSP() >> >>>>>>>> ksp.setOperators(self.A0, self.P0) >> >>>>>>>> >> >>>>>>>> Unfortunately PETSc still complains that it cannot create a >> preconditioner for a type 'python' matrix although P0.type == 'seqaij' (but >> A0.type == 'python'). >> >>>>>>>> By the way, should P0 be an approximation of A0 or does it have >> to include B0? >> >>>>>>>> >> >>>>>>>> Right now I am using the krylov-schur method. Are there any >> alternatives if A0 is only given as an operator? >> >>>>>>>> >> >>>>>>>> Jose can correct me if I say something wrong. >> >>>>>>>> >> >>>>>>>> When I did this, I made a shell operator for the action of >> A0^{-1} B0 which has a KSPSolve() in it, so you can use your P0 >> preconditioning matrix, and >> >>>>>>>> then handed that to EPS. You can see me do it here: >> >>>>>>>> >> >>>>>>>> >> https://gitlab.com/knepley/bamg/-/blob/master/src/coarse/bamgCoarseSpace.c#L123 >> >>>>>>>> >> >>>>>>>> I had a hard time getting the embedded solver to work the way I >> wanted, but maybe that is the better way. >> >>>>>>>> >> >>>>>>>> Thanks, >> >>>>>>>> >> >>>>>>>> Matt >> >>>>>>>> >> >>>>>>>> thanks for any advice >> >>>>>>>> best wishes >> >>>>>>>> Florian >> >>>>>>>> >> >>>>>>>> >> >>>>>>>> -- >> >>>>>>>> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> >>>>>>>> -- Norbert Wiener >> >>>>>>>> >> >>>>>>>> https://www.cse.buffalo.edu/~knepley/ >> >>>>>>> >> >>>>>> >> >>>>> >> >>>> >> >>>> >> >>> >> >> >> >> >> >> >> >> -- >> >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> >> -- Norbert Wiener >> >> >> >> https://www.cse.buffalo.edu/~knepley/ >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: frequencies_scipy_P0_instead_A0.png Type: image/png Size: 50300 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: frequencies_slepc.png Type: image/png Size: 49785 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: frequencies_scipy.png Type: image/png Size: 46246 bytes Desc: not available URL: From jroman at dsic.upv.es Tue Mar 9 03:48:52 2021 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 9 Mar 2021 10:48:52 +0100 Subject: [petsc-users] using preconditioner with SLEPc In-Reply-To: References: <7C5B30FE-C539-4A14-B442-B1C91618E4AC@petsc.dev> <119944FD-4F1E-4B2F-A39D-65ADDB12BB5F@petsc.dev> <6EF7889D-DC17-46FC-82A5-9409C41E231D@petsc.dev> <46C744D7-4376-46B3-B5C4-211A4C8C2291@dsic.upv.es> <80BCEEDC-4C1E-4512-AAF5-7B6E718C7D1D@dsic.upv.es> Message-ID: The reason may be that it is using a direct solver instead of an iterative solver. What do you get for -eps_view ? Does the code work correctly if you comment out st.setPreconditionerMat(self.P0) ? 
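For reference, the information requested with -eps_view can also be obtained directly from a slepc4py script, which makes it easy to check which spectral transformation and which inner KSP/PC were actually set up (the names es/st follow the snippets quoted above):

  st  = es.getST()
  ksp = st.getKSP()
  es.view()    # EPS settings (problem type, which, dimensions, ...)
  st.view()    # spectral transformation: type, shift/target, how the inverse is applied
  ksp.view()   # inner linear solver and preconditioner actually used
               # (e.g. preonly plus a factorization, i.e. a direct solve, versus gmres on P0)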
Your approach should work, but I would first try as is done in the example https://slepc.upv.es/slepc-main/src/eps/tutorials/ex46.c.html that is, shift-and-invert with target=0 and target_magnitude. Jose > El 9 mar 2021, a las 9:07, Florian Bruckner escribi?: > > Dear Jose, > I asked Lawrence Mitchell from the firedrake people to help me with the slepc update (I think they are applying some modifications for petsc, which is why simply updating petsc within my docker container did not work). > Now the latest slepc version runs and I already get some results of the eigenmode solver. The good thing is that the solver runs significantly faster. The bad thing is that the results are still wrong :-) > > Could you have a short look at the code: > es = SLEPc.EPS().create(comm=fd.COMM_WORLD) > es.setDimensions(k) > es.setType(SLEPc.EPS.Type.KRYLOVSCHUR) > es.setProblemType(SLEPc.EPS.ProblemType.PGNHEP) # Generalized Non-Hermitian eigenproblem with positive definite B > es.setWhichEigenpairs(SLEPc.EPS.Which.LARGEST_MAGNITUDE) > #es.setTrueResidual(True) > es.setTolerances(1e-10) > es.setOperators(self.B0, self.A0) > es.setFromOptions() > > st = es.getST() > st.setPreconditionerMat(self.P0) > > You wrote that when using shift-and-invert with target=0 the solver internally uses A0^{-1}*B0. > Nevertheless I think the precond P0 mat should be an approximation of A0, right? > This is because the solver uses the B0-inner product to preserve symmetry. > Or is the B0 missing in my code? > > As I mentioned before, convergence of the method is extremely fast. I thought that maybe the tolerance is set too low, but increasing it did not change the situation. > With using setTrueResidual, there is no convergence at all. > > Figures show the different results for the original scipy method (which has been compared to published results) as well as the new slepc method. > For some strange reason I get nearly the same (wrong) results if i replace A0 with P0 in the original scipy code. > In my case A0 is a non-local field operator and P0 only contains local and next-neighbour interaction. > Is it possible that the wrong operator (P0 instead of A0) is used internally? > > best wishes > Florian > > On Thu, Feb 18, 2021 at 1:00 PM Florian Bruckner wrote: > Dear Jose, > thanks for your work. I just looked over the code, but I didn't have time to implement our solver, yet. > If I understand the code correctly, it allows to set a precond-matrix which should approximate A-sigma*B. > > I will try to get our code running in the next few weeks. From user perspective it would maybe simplify things if approximations for A as well as B are given, since this would hide the internal ST transformations. > > best wishes > Florian > > On Tue, Feb 16, 2021 at 8:54 PM Jose E. Roman wrote: > Florian: I have created a MR https://gitlab.com/slepc/slepc/-/merge_requests/149 > Let me know if it fits your needs. > > Jose > > > > El 15 feb 2021, a las 18:44, Jose E. Roman escribi?: > > > > > > > >> El 15 feb 2021, a las 14:53, Matthew Knepley escribi?: > >> > >> On Mon, Feb 15, 2021 at 7:27 AM Jose E. Roman wrote: > >> I will think about the viability of adding an interface function to pass the preconditioner matrix. > >> > >> Regarding the question about the B-orthogonality of computed vectors, in the symmetric solver the B-orthogonality is enforced during the computation, so you have guarantee that the computed vectors satisfy it. 
But if solved as non-symetric, the computed vectors may depart from B-orthogonality, unless the tolerance is very small. > >> > >> Yes, the vectors I generate are not B-orthogonal. > >> > >> Jose, do you think there is a way to reformulate what I am doing to use the symmetric solver, even if we only have the action of B? > > > > Yes, you can do the following: > > > > ierr = EPSSetOperators(eps,S,NULL);CHKERRQ(ierr); // S is your shell matrix A^{-1}*B > > ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRQ(ierr); // symmetric problem though S is not symmetric > > ierr = EPSSetFromOptions(eps);CHKERRQ(ierr); > > ierr = EPSSetUp(eps);CHKERRQ(ierr); // note explicitly calling setup here > > ierr = EPSGetBV(eps,&bv);CHKERRQ(ierr); > > ierr = BVSetMatrix(bv,B,PETSC_FALSE);CHKERRQ(ierr); // replace solver's inner product > > ierr = EPSSolve(eps);CHKERRQ(ierr); > > > > I have tried this with test1.c and it works. The computed eigenvectors should be B-orthogonal in this case. > > > > Jose > > > > > >> > >> Thanks, > >> > >> Matt > >> > >> Jose > >> > >> > >>> El 14 feb 2021, a las 21:41, Barry Smith escribi?: > >>> > >>> > >>> Florian, > >>> > >>> I'm sorry I don't know the answers; I can only speculate. There is a STGetShift(). > >>> > >>> All I was saying is theoretically there could/should be such support in SLEPc. > >>> > >>> Barry > >>> > >>> > >>>> On Feb 13, 2021, at 6:43 PM, Florian Bruckner wrote: > >>>> > >>>> Dear Barry, > >>>> thank you for your clarification. What I wanted to say is that even if I could reset the KSP operators directly I would require to know which transformation ST applies in order to provide the preconditioning matrix for the correct operator. > >>>> The more general solution would be that SLEPc provides the interface to pass the preconditioning matrix for A0 and ST applies the same transformations as for the operator. > >>>> > >>>> If you write "SLEPc could provide an interface", do you mean someone should implement it, or should it already be possible and I am not using it correctly? > >>>> I wrote a small standalone example based on ex9.py from slepc4py, where i tried to use an operator. > >>>> > >>>> best wishes > >>>> Florian > >>>> > >>>> On Sat, Feb 13, 2021 at 7:15 PM Barry Smith wrote: > >>>> > >>>> > >>>>> On Feb 13, 2021, at 2:47 AM, Pierre Jolivet wrote: > >>>>> > >>>>> > >>>>> > >>>>>> On 13 Feb 2021, at 7:25 AM, Florian Bruckner wrote: > >>>>>> > >>>>>> Dear Jose, Dear Barry, > >>>>>> thanks again for your reply. One final question about the B0 orthogonality. Do you mean that eigenvectors are not B0 orthogonal, but they are i*B0 orthogonal? or is there an issue with Matt's approach? > >>>>>> For my problem I can show that eigenvalues fulfill an orthogonality relation (phi_i, A0 phi_j ) = omega_i (phi_i, B0 phi_j) = delta_ij. This should be independent of the solving method, right? > >>>>>> > >>>>>> Regarding Barry's advice this is what I first tried: > >>>>>> es = SLEPc.EPS().create(comm=fd.COMM_WORLD) > >>>>>> st = es.getST() > >>>>>> ksp = st.getKSP() > >>>>>> ksp.setOperators(self.A0, self.P0) > >>>>>> > >>>>>> But it seems that the provided P0 is not used. Furthermore the interface is maybe a bit confusing if ST performs some transformation. In this case P0 needs to approximate A0^{-1}*B0 and not A0, right? > >>>>> > >>>>> No, you need to approximate (A0-sigma B0)^-1. If you have a null shift, which looks like it is the case, you end up with A0^-1. > >>>> > >>>> Just trying to provide more clarity with the terms. 
> >>>> > >>>> If ST transforms the operator in the KSP to (A0-sigma B0) and you are providing the "sparse matrix from which the preconditioner is to be built" then you need to provide something that approximates (A0-sigma B0). Since the PC will use your matrix to construct a preconditioner that approximates the inverse of (A0-sigma B0), you don't need to directly provide something that approximates (A0-sigma B0)^-1 > >>>> > >>>> Yes, I would think SLEPc could provide an interface where it manages "the matrix from which to construct the preconditioner" and transforms that matrix just like the true matrix. To do it by hand you simply need to know what A0 and B0 are and which sigma ST has selected and then you can construct your modA0 - sigma modB0 and pass it to the KSP. Where modA0 and modB0 are your "sparser approximations". > >>>> > >>>> Barry > >>>> > >>>> > >>>>> > >>>>>> Nevertheless I think it would be the best solution if one could provide P0 (approx A0) and SLEPc derives the preconditioner from this. Would this be hard to implement? > >>>>> > >>>>> This is what Barry?s suggestion is implementing. Don?t know why it doesn?t work with your Python operator though. > >>>>> > >>>>> Thanks, > >>>>> Pierre > >>>>> > >>>>>> best wishes > >>>>>> Florian > >>>>>> > >>>>>> > >>>>>> On Sat, Feb 13, 2021 at 4:19 AM Barry Smith wrote: > >>>>>> > >>>>>> > >>>>>>> On Feb 12, 2021, at 2:32 AM, Florian Bruckner wrote: > >>>>>>> > >>>>>>> Dear Jose, Dear Matt, > >>>>>>> > >>>>>>> I needed some time to think about your answers. > >>>>>>> If I understand correctly, the eigenmode solver internally uses A0^{-1}*B0, which is normally handled by the ST object, which creates a KSP solver and a corresponding preconditioner. > >>>>>>> What I would need is an interface to provide not only the system Matrix A0 (which is an operator), but also a preconditioning matrix (sparse approximation of the operator). > >>>>>>> Unfortunately this interface is not available, right? > >>>>>> > >>>>>> If SLEPc does not provide this directly it is still intended to be trivial to provide the "preconditioner matrix" (that is matrix from which the preconditioner is built). Just get the KSP from the ST object and use KSPSetOperators() to provide the "preconditioner matrix" . > >>>>>> > >>>>>> Barry > >>>>>> > >>>>>>> > >>>>>>> Matt directly creates A0^{-1}*B0 as a matshell operator. The operator uses a KSP with a proper PC internally. SLEPc would directly get A0^{-1}*B0 and solve a standard eigenvalue problem with this modified operator. Did I understand this correctly? > >>>>>>> > >>>>>>> I have two further points, which I did not mention yet: the matrix B0 is Hermitian, but it is (purely) imaginary (B0.real=0). Right now, I am using Firedrake to set up the PETSc system matrices A0, i*B0 (which is real). Then I convert them into ScipyLinearOperators and use scipy.sparse.eigsh(B0, b=A0, Minv=Minv) to calculate the eigenvalues. Minv=A0^-1 is also solving within scipy using a preconditioned gmres. Advantage of this setup is that the imaginary B0 can be handled efficiently and also the post-processing of the eigenvectors (which requires complex arithmetics) is simplified. > >>>>>>> > >>>>>>> Nevertheless I think that the mixing of PETSc and Scipy looks too complicated and is not very flexible. > >>>>>>> If I would use Matt's approach, could I then simply switch between multiple standard eigenvalue methods (e.g. LOBPCG)? or is it limited due to the use of matshell? 
> >>>>>>> Is there a solution for the imaginary B0, or do I have to use the non-hermitian methods? Is this a large performance drawback? > >>>>>>> > >>>>>>> thanks again, > >>>>>>> and best wishes > >>>>>>> Florian > >>>>>>> > >>>>>>> On Mon, Feb 8, 2021 at 3:37 PM Jose E. Roman wrote: > >>>>>>> The problem can be written as A0*v=omega*B0*v and you want the eigenvalues omega closest to zero. If the matrices were explicitly available, you would do shift-and-invert with target=0, that is > >>>>>>> > >>>>>>> (A0-sigma*B0)^{-1}*B0*v=theta*v for sigma=0, that is > >>>>>>> > >>>>>>> A0^{-1}*B0*v=theta*v > >>>>>>> > >>>>>>> and you compute EPS_LARGEST_MAGNITUDE eigenvalues theta=1/omega. > >>>>>>> > >>>>>>> Matt: I guess you should have EPS_LARGEST_MAGNITUDE instead of EPS_SMALLEST_REAL in your code. Are you getting the eigenvalues you need? EPS_SMALLEST_REAL will give slow convergence. > >>>>>>> > >>>>>>> Florian: I would not recommend setting the KSP matrices directly, it may produce strange side-effects. We should have an interface function to pass this matrix. Currently there is STPrecondSetMatForPC() but it has two problems: (1) it is intended for STPRECOND, so cannot be used with Krylov-Schur, and (2) it is not currently available in the python interface. > >>>>>>> > >>>>>>> The approach used by Matt is a workaround that does not use ST, so you can handle linear solves with a KSP of your own. > >>>>>>> > >>>>>>> As an alternative, since your problem is symmetric, you could try LOBPCG, assuming that the leftmost eigenvalues are those that you want (e.g. if all eigenvalues are non-negative). In that case you could use STPrecondSetMatForPC(), but the remaining issue is calling it from python. > >>>>>>> > >>>>>>> If you are using the git repo, I could add the relevant code. > >>>>>>> > >>>>>>> Jose > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>>> El 8 feb 2021, a las 14:22, Matthew Knepley escribi?: > >>>>>>>> > >>>>>>>> On Mon, Feb 8, 2021 at 7:04 AM Florian Bruckner wrote: > >>>>>>>> Dear PETSc / SLEPc Users, > >>>>>>>> > >>>>>>>> my question is very similar to the one posted here: > >>>>>>>> https://lists.mcs.anl.gov/pipermail/petsc-users/2018-August/035878.html > >>>>>>>> > >>>>>>>> The eigensystem I would like to solve looks like: > >>>>>>>> B0 v = 1/omega A0 v > >>>>>>>> B0 and A0 are both hermitian, A0 is positive definite, but only given as a linear operator (matshell). I am looking for the largest eigenvalues (=smallest omega). > >>>>>>>> > >>>>>>>> I also have a sparse approximation P0 of the A0 operator, which i would like to use as precondtioner, using something like this: > >>>>>>>> > >>>>>>>> es = SLEPc.EPS().create(comm=fd.COMM_WORLD) > >>>>>>>> st = es.getST() > >>>>>>>> ksp = st.getKSP() > >>>>>>>> ksp.setOperators(self.A0, self.P0) > >>>>>>>> > >>>>>>>> Unfortunately PETSc still complains that it cannot create a preconditioner for a type 'python' matrix although P0.type == 'seqaij' (but A0.type == 'python'). > >>>>>>>> By the way, should P0 be an approximation of A0 or does it have to include B0? > >>>>>>>> > >>>>>>>> Right now I am using the krylov-schur method. Are there any alternatives if A0 is only given as an operator? > >>>>>>>> > >>>>>>>> Jose can correct me if I say something wrong. > >>>>>>>> > >>>>>>>> When I did this, I made a shell operator for the action of A0^{-1} B0 which has a KSPSolve() in it, so you can use your P0 preconditioning matrix, and > >>>>>>>> then handed that to EPS. 
You can see me do it here: > >>>>>>>> > >>>>>>>> https://gitlab.com/knepley/bamg/-/blob/master/src/coarse/bamgCoarseSpace.c#L123 > >>>>>>>> > >>>>>>>> I had a hard time getting the embedded solver to work the way I wanted, but maybe that is the better way. > >>>>>>>> > >>>>>>>> Thanks, > >>>>>>>> > >>>>>>>> Matt > >>>>>>>> > >>>>>>>> thanks for any advice > >>>>>>>> best wishes > >>>>>>>> Florian > >>>>>>>> > >>>>>>>> > >>>>>>>> -- > >>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > >>>>>>>> -- Norbert Wiener > >>>>>>>> > >>>>>>>> https://www.cse.buffalo.edu/~knepley/ > >>>>>>> > >>>>>> > >>>>> > >>>> > >>>> > >>> > >> > >> > >> > >> -- > >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > >> -- Norbert Wiener > >> > >> https://www.cse.buffalo.edu/~knepley/ > > From e0425375 at gmail.com Tue Mar 9 06:07:29 2021 From: e0425375 at gmail.com (Florian Bruckner) Date: Tue, 9 Mar 2021 13:07:29 +0100 Subject: [petsc-users] using preconditioner with SLEPc In-Reply-To: References: <7C5B30FE-C539-4A14-B442-B1C91618E4AC@petsc.dev> <119944FD-4F1E-4B2F-A39D-65ADDB12BB5F@petsc.dev> <6EF7889D-DC17-46FC-82A5-9409C41E231D@petsc.dev> <46C744D7-4376-46B3-B5C4-211A4C8C2291@dsic.upv.es> <80BCEEDC-4C1E-4512-AAF5-7B6E718C7D1D@dsic.upv.es> Message-ID: Dear Jose, I appended the output of eps-view for the original method and for the method you proposed. Unfortunately the new method does not converge. This I what i did: es = SLEPc.EPS().create(comm=fd.COMM_WORLD) es.setDimensions(k) es.setType(SLEPc.EPS.Type.KRYLOVSCHUR) es.setProblemType(SLEPc.EPS.ProblemType.PGNHEP) # Generalized Non-Hermitian eigenproblem with positive definite B es.setTarget(0.0) es.setWhichEigenpairs(SLEPc.EPS.Which.TARGET_MAGNITUDE) es.setTolerances(1e-10) es.setOperators(self.B0, self.A0) es.setFromOptions() st = es.getST() st.setPreconditionerMat(self.P0) Is TARGET_MAGNITUDE correct? If I change it back to LARGEST_MAGNITUDE i can reproduce the (wrong) results from before. Why do I need the shift-invert mode at all? I thought this is only necessary if I would like to solve for the smallest eigenmodes? Without st.setPreconditionerMat(self.P0) the code does not work, because A0 is a matshell and the preconditioner cannot be set up. If I use pc_type = None the method converges, but results are totally wrong (1e-12 GHz instead of 7GHz). What confuses me most, is that the slepc results (without target=0 and with the precond matrix) produces nearly correct results, which perfectly fit the results from scipy when using P0 instead of A0. This is a strange coincidence, and looks like P0 is used somewhere instead of A0. thanks for your help Florian On Tue, Mar 9, 2021 at 10:48 AM Jose E. Roman wrote: > The reason may be that it is using a direct solver instead of an iterative > solver. What do you get for -eps_view ? > > Does the code work correctly if you comment out > st.setPreconditionerMat(self.P0) ? > > Your approach should work, but I would first try as is done in the example > https://slepc.upv.es/slepc-main/src/eps/tutorials/ex46.c.html > that is, shift-and-invert with target=0 and target_magnitude. 
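In slepc4py terms, the ex46-style configuration recommended here (shift-and-invert with a zero target) might look roughly as follows; A0, B0 and P0 are assumed to be existing Mat objects, and ST.setPreconditionerMat() requires a SLEPc new enough to contain the merge request discussed earlier in the thread:

    from slepc4py import SLEPc

    def setup_sinvert_eps(A0, B0, P0, nev=50, comm=None):
        # Eigenvalues of (A0 - sigma*B0)^{-1} B0 with sigma = 0, i.e. A0^{-1} B0;
        # the omega closest to zero become the theta of largest magnitude.
        es = SLEPc.EPS().create(comm=comm)
        es.setOperators(A0, B0)                    # (A, B) in this order
        es.setDimensions(nev)
        es.setTarget(0.0)
        es.setWhichEigenpairs(SLEPc.EPS.Which.TARGET_MAGNITUDE)
        st = es.getST()
        st.setType(SLEPc.ST.Type.SINVERT)          # shift-and-invert
        st.setPreconditionerMat(P0)                # sparse matrix used to build the PC
        es.setFromOptions()                        # e.g. -st_ksp_type gmres -st_pc_type bjacobi
        return es

Calling es.solve() on the returned object then runs Krylov-Schur (the default solver) on the transformed problem.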
> > Jose > > > > El 9 mar 2021, a las 9:07, Florian Bruckner > escribi?: > > > > Dear Jose, > > I asked Lawrence Mitchell from the firedrake people to help me with the > slepc update (I think they are applying some modifications for petsc, which > is why simply updating petsc within my docker container did not work). > > Now the latest slepc version runs and I already get some results of the > eigenmode solver. The good thing is that the solver runs significantly > faster. The bad thing is that the results are still wrong :-) > > > > Could you have a short look at the code: > > es = SLEPc.EPS().create(comm=fd.COMM_WORLD) > > es.setDimensions(k) > > es.setType(SLEPc.EPS.Type.KRYLOVSCHUR) > > es.setProblemType(SLEPc.EPS.ProblemType.PGNHEP) # Generalized > Non-Hermitian eigenproblem with positive definite B > > es.setWhichEigenpairs(SLEPc.EPS.Which.LARGEST_MAGNITUDE) > > #es.setTrueResidual(True) > > es.setTolerances(1e-10) > > es.setOperators(self.B0, self.A0) > > es.setFromOptions() > > > > st = es.getST() > > st.setPreconditionerMat(self.P0) > > > > You wrote that when using shift-and-invert with target=0 the solver > internally uses A0^{-1}*B0. > > Nevertheless I think the precond P0 mat should be an approximation of > A0, right? > > This is because the solver uses the B0-inner product to preserve > symmetry. > > Or is the B0 missing in my code? > > > > As I mentioned before, convergence of the method is extremely fast. I > thought that maybe the tolerance is set too low, but increasing it did not > change the situation. > > With using setTrueResidual, there is no convergence at all. > > > > Figures show the different results for the original scipy method (which > has been compared to published results) as well as the new slepc method. > > For some strange reason I get nearly the same (wrong) results if i > replace A0 with P0 in the original scipy code. > > In my case A0 is a non-local field operator and P0 only contains local > and next-neighbour interaction. > > Is it possible that the wrong operator (P0 instead of A0) is used > internally? > > > > best wishes > > Florian > > > > On Thu, Feb 18, 2021 at 1:00 PM Florian Bruckner > wrote: > > Dear Jose, > > thanks for your work. I just looked over the code, but I didn't have > time to implement our solver, yet. > > If I understand the code correctly, it allows to set a precond-matrix > which should approximate A-sigma*B. > > > > I will try to get our code running in the next few weeks. From user > perspective it would maybe simplify things if approximations for A as well > as B are given, since this would hide the internal ST transformations. > > > > best wishes > > Florian > > > > On Tue, Feb 16, 2021 at 8:54 PM Jose E. Roman > wrote: > > Florian: I have created a MR > https://gitlab.com/slepc/slepc/-/merge_requests/149 > > Let me know if it fits your needs. > > > > Jose > > > > > > > El 15 feb 2021, a las 18:44, Jose E. Roman > escribi?: > > > > > > > > > > > >> El 15 feb 2021, a las 14:53, Matthew Knepley > escribi?: > > >> > > >> On Mon, Feb 15, 2021 at 7:27 AM Jose E. Roman > wrote: > > >> I will think about the viability of adding an interface function to > pass the preconditioner matrix. > > >> > > >> Regarding the question about the B-orthogonality of computed vectors, > in the symmetric solver the B-orthogonality is enforced during the > computation, so you have guarantee that the computed vectors satisfy it. 
> But if solved as non-symetric, the computed vectors may depart from > B-orthogonality, unless the tolerance is very small. > > >> > > >> Yes, the vectors I generate are not B-orthogonal. > > >> > > >> Jose, do you think there is a way to reformulate what I am doing to > use the symmetric solver, even if we only have the action of B? > > > > > > Yes, you can do the following: > > > > > > ierr = EPSSetOperators(eps,S,NULL);CHKERRQ(ierr); // S is your > shell matrix A^{-1}*B > > > ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRQ(ierr); // symmetric > problem though S is not symmetric > > > ierr = EPSSetFromOptions(eps);CHKERRQ(ierr); > > > ierr = EPSSetUp(eps);CHKERRQ(ierr); // note explicitly calling > setup here > > > ierr = EPSGetBV(eps,&bv);CHKERRQ(ierr); > > > ierr = BVSetMatrix(bv,B,PETSC_FALSE);CHKERRQ(ierr); // replace > solver's inner product > > > ierr = EPSSolve(eps);CHKERRQ(ierr); > > > > > > I have tried this with test1.c and it works. The computed eigenvectors > should be B-orthogonal in this case. > > > > > > Jose > > > > > > > > >> > > >> Thanks, > > >> > > >> Matt > > >> > > >> Jose > > >> > > >> > > >>> El 14 feb 2021, a las 21:41, Barry Smith > escribi?: > > >>> > > >>> > > >>> Florian, > > >>> > > >>> I'm sorry I don't know the answers; I can only speculate. There is > a STGetShift(). > > >>> > > >>> All I was saying is theoretically there could/should be such > support in SLEPc. > > >>> > > >>> Barry > > >>> > > >>> > > >>>> On Feb 13, 2021, at 6:43 PM, Florian Bruckner > wrote: > > >>>> > > >>>> Dear Barry, > > >>>> thank you for your clarification. What I wanted to say is that even > if I could reset the KSP operators directly I would require to know which > transformation ST applies in order to provide the preconditioning matrix > for the correct operator. > > >>>> The more general solution would be that SLEPc provides the > interface to pass the preconditioning matrix for A0 and ST applies the same > transformations as for the operator. > > >>>> > > >>>> If you write "SLEPc could provide an interface", do you mean > someone should implement it, or should it already be possible and I am not > using it correctly? > > >>>> I wrote a small standalone example based on ex9.py from slepc4py, > where i tried to use an operator. > > >>>> > > >>>> best wishes > > >>>> Florian > > >>>> > > >>>> On Sat, Feb 13, 2021 at 7:15 PM Barry Smith > wrote: > > >>>> > > >>>> > > >>>>> On Feb 13, 2021, at 2:47 AM, Pierre Jolivet > wrote: > > >>>>> > > >>>>> > > >>>>> > > >>>>>> On 13 Feb 2021, at 7:25 AM, Florian Bruckner > wrote: > > >>>>>> > > >>>>>> Dear Jose, Dear Barry, > > >>>>>> thanks again for your reply. One final question about the B0 > orthogonality. Do you mean that eigenvectors are not B0 orthogonal, but > they are i*B0 orthogonal? or is there an issue with Matt's approach? > > >>>>>> For my problem I can show that eigenvalues fulfill an > orthogonality relation (phi_i, A0 phi_j ) = omega_i (phi_i, B0 phi_j) = > delta_ij. This should be independent of the solving method, right? > > >>>>>> > > >>>>>> Regarding Barry's advice this is what I first tried: > > >>>>>> es = SLEPc.EPS().create(comm=fd.COMM_WORLD) > > >>>>>> st = es.getST() > > >>>>>> ksp = st.getKSP() > > >>>>>> ksp.setOperators(self.A0, self.P0) > > >>>>>> > > >>>>>> But it seems that the provided P0 is not used. Furthermore the > interface is maybe a bit confusing if ST performs some transformation. In > this case P0 needs to approximate A0^{-1}*B0 and not A0, right? 
> > >>>>> > > >>>>> No, you need to approximate (A0-sigma B0)^-1. If you have a null > shift, which looks like it is the case, you end up with A0^-1. > > >>>> > > >>>> Just trying to provide more clarity with the terms. > > >>>> > > >>>> If ST transforms the operator in the KSP to (A0-sigma B0) and you > are providing the "sparse matrix from which the preconditioner is to be > built" then you need to provide something that approximates (A0-sigma B0). > Since the PC will use your matrix to construct a preconditioner that > approximates the inverse of (A0-sigma B0), you don't need to directly > provide something that approximates (A0-sigma B0)^-1 > > >>>> > > >>>> Yes, I would think SLEPc could provide an interface where it > manages "the matrix from which to construct the preconditioner" and > transforms that matrix just like the true matrix. To do it by hand you > simply need to know what A0 and B0 are and which sigma ST has selected and > then you can construct your modA0 - sigma modB0 and pass it to the KSP. > Where modA0 and modB0 are your "sparser approximations". > > >>>> > > >>>> Barry > > >>>> > > >>>> > > >>>>> > > >>>>>> Nevertheless I think it would be the best solution if one could > provide P0 (approx A0) and SLEPc derives the preconditioner from this. > Would this be hard to implement? > > >>>>> > > >>>>> This is what Barry?s suggestion is implementing. Don?t know why it > doesn?t work with your Python operator though. > > >>>>> > > >>>>> Thanks, > > >>>>> Pierre > > >>>>> > > >>>>>> best wishes > > >>>>>> Florian > > >>>>>> > > >>>>>> > > >>>>>> On Sat, Feb 13, 2021 at 4:19 AM Barry Smith > wrote: > > >>>>>> > > >>>>>> > > >>>>>>> On Feb 12, 2021, at 2:32 AM, Florian Bruckner < > e0425375 at gmail.com> wrote: > > >>>>>>> > > >>>>>>> Dear Jose, Dear Matt, > > >>>>>>> > > >>>>>>> I needed some time to think about your answers. > > >>>>>>> If I understand correctly, the eigenmode solver internally uses > A0^{-1}*B0, which is normally handled by the ST object, which creates a KSP > solver and a corresponding preconditioner. > > >>>>>>> What I would need is an interface to provide not only the system > Matrix A0 (which is an operator), but also a preconditioning matrix (sparse > approximation of the operator). > > >>>>>>> Unfortunately this interface is not available, right? > > >>>>>> > > >>>>>> If SLEPc does not provide this directly it is still intended to > be trivial to provide the "preconditioner matrix" (that is matrix from > which the preconditioner is built). Just get the KSP from the ST object and > use KSPSetOperators() to provide the "preconditioner matrix" . > > >>>>>> > > >>>>>> Barry > > >>>>>> > > >>>>>>> > > >>>>>>> Matt directly creates A0^{-1}*B0 as a matshell operator. The > operator uses a KSP with a proper PC internally. SLEPc would directly get > A0^{-1}*B0 and solve a standard eigenvalue problem with this modified > operator. Did I understand this correctly? > > >>>>>>> > > >>>>>>> I have two further points, which I did not mention yet: the > matrix B0 is Hermitian, but it is (purely) imaginary (B0.real=0). Right > now, I am using Firedrake to set up the PETSc system matrices A0, i*B0 > (which is real). Then I convert them into ScipyLinearOperators and use > scipy.sparse.eigsh(B0, b=A0, Minv=Minv) to calculate the eigenvalues. > Minv=A0^-1 is also solving within scipy using a preconditioned gmres. 
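For comparison, a rough SciPy sketch of that kind of setup (not the thread's actual code; the ILU preconditioner, the helper name and the convergence check are all assumptions) could be:

    import scipy.sparse as sp
    import scipy.sparse.linalg as spla

    def eigsh_with_preconditioned_minv(A0, B0, P0, k=10):
        # B0 v = (1/omega) A0 v, solved as eigsh(B0, M=A0, Minv=Minv), where
        # Minv ~ A0^{-1} is realised by preconditioned GMRES solves and the
        # SciPy sparse matrix P0 only seeds the ILU preconditioner.
        ilu = spla.spilu(sp.csc_matrix(P0))
        M = spla.LinearOperator(A0.shape, matvec=ilu.solve)

        def apply_A0inv(b):
            x, info = spla.gmres(A0, b, M=M)
            assert info == 0, "GMRES did not converge"
            return x

        Minv = spla.LinearOperator(A0.shape, matvec=apply_A0inv)
        return spla.eigsh(B0, k=k, M=A0, Minv=Minv, which='LM')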
> Advantage of this setup is that the imaginary B0 can be handled efficiently > and also the post-processing of the eigenvectors (which requires complex > arithmetics) is simplified. > > >>>>>>> > > >>>>>>> Nevertheless I think that the mixing of PETSc and Scipy looks > too complicated and is not very flexible. > > >>>>>>> If I would use Matt's approach, could I then simply switch > between multiple standard eigenvalue methods (e.g. LOBPCG)? or is it > limited due to the use of matshell? > > >>>>>>> Is there a solution for the imaginary B0, or do I have to use > the non-hermitian methods? Is this a large performance drawback? > > >>>>>>> > > >>>>>>> thanks again, > > >>>>>>> and best wishes > > >>>>>>> Florian > > >>>>>>> > > >>>>>>> On Mon, Feb 8, 2021 at 3:37 PM Jose E. Roman > wrote: > > >>>>>>> The problem can be written as A0*v=omega*B0*v and you want the > eigenvalues omega closest to zero. If the matrices were explicitly > available, you would do shift-and-invert with target=0, that is > > >>>>>>> > > >>>>>>> (A0-sigma*B0)^{-1}*B0*v=theta*v for sigma=0, that is > > >>>>>>> > > >>>>>>> A0^{-1}*B0*v=theta*v > > >>>>>>> > > >>>>>>> and you compute EPS_LARGEST_MAGNITUDE eigenvalues theta=1/omega. > > >>>>>>> > > >>>>>>> Matt: I guess you should have EPS_LARGEST_MAGNITUDE instead of > EPS_SMALLEST_REAL in your code. Are you getting the eigenvalues you need? > EPS_SMALLEST_REAL will give slow convergence. > > >>>>>>> > > >>>>>>> Florian: I would not recommend setting the KSP matrices > directly, it may produce strange side-effects. We should have an interface > function to pass this matrix. Currently there is STPrecondSetMatForPC() but > it has two problems: (1) it is intended for STPRECOND, so cannot be used > with Krylov-Schur, and (2) it is not currently available in the python > interface. > > >>>>>>> > > >>>>>>> The approach used by Matt is a workaround that does not use ST, > so you can handle linear solves with a KSP of your own. > > >>>>>>> > > >>>>>>> As an alternative, since your problem is symmetric, you could > try LOBPCG, assuming that the leftmost eigenvalues are those that you want > (e.g. if all eigenvalues are non-negative). In that case you could use > STPrecondSetMatForPC(), but the remaining issue is calling it from python. > > >>>>>>> > > >>>>>>> If you are using the git repo, I could add the relevant code. > > >>>>>>> > > >>>>>>> Jose > > >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>>> El 8 feb 2021, a las 14:22, Matthew Knepley > escribi?: > > >>>>>>>> > > >>>>>>>> On Mon, Feb 8, 2021 at 7:04 AM Florian Bruckner < > e0425375 at gmail.com> wrote: > > >>>>>>>> Dear PETSc / SLEPc Users, > > >>>>>>>> > > >>>>>>>> my question is very similar to the one posted here: > > >>>>>>>> > https://lists.mcs.anl.gov/pipermail/petsc-users/2018-August/035878.html > > >>>>>>>> > > >>>>>>>> The eigensystem I would like to solve looks like: > > >>>>>>>> B0 v = 1/omega A0 v > > >>>>>>>> B0 and A0 are both hermitian, A0 is positive definite, but only > given as a linear operator (matshell). I am looking for the largest > eigenvalues (=smallest omega). 
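For readers who have not wrapped such an operator before, a minimal petsc4py sketch of a matrix-free ('python'-type) Mat, the kind that later shows up as "type: python" in the -eps_view output, might look like this; the class name and the trivial apply_action stand-in are illustrative only:

    from petsc4py import PETSc

    class MatFreeA0:
        # Context object: only the action y = A0*x is available.
        def __init__(self, apply_action):
            self.apply_action = apply_action

        def mult(self, mat, x, y):
            y.setArray(self.apply_action(x.getArray(readonly=True)))

    apply_action = lambda v: 2.0 * v    # stand-in for the real non-local field operator
    n = 3076
    A0 = PETSc.Mat().createPython([n, n], context=MatFreeA0(apply_action),
                                  comm=PETSc.COMM_WORLD)
    A0.setUp()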
> > >>>>>>>> > > >>>>>>>> I also have a sparse approximation P0 of the A0 operator, which > i would like to use as precondtioner, using something like this: > > >>>>>>>> > > >>>>>>>> es = SLEPc.EPS().create(comm=fd.COMM_WORLD) > > >>>>>>>> st = es.getST() > > >>>>>>>> ksp = st.getKSP() > > >>>>>>>> ksp.setOperators(self.A0, self.P0) > > >>>>>>>> > > >>>>>>>> Unfortunately PETSc still complains that it cannot create a > preconditioner for a type 'python' matrix although P0.type == 'seqaij' (but > A0.type == 'python'). > > >>>>>>>> By the way, should P0 be an approximation of A0 or does it have > to include B0? > > >>>>>>>> > > >>>>>>>> Right now I am using the krylov-schur method. Are there any > alternatives if A0 is only given as an operator? > > >>>>>>>> > > >>>>>>>> Jose can correct me if I say something wrong. > > >>>>>>>> > > >>>>>>>> When I did this, I made a shell operator for the action of > A0^{-1} B0 which has a KSPSolve() in it, so you can use your P0 > preconditioning matrix, and > > >>>>>>>> then handed that to EPS. You can see me do it here: > > >>>>>>>> > > >>>>>>>> > https://gitlab.com/knepley/bamg/-/blob/master/src/coarse/bamgCoarseSpace.c#L123 > > >>>>>>>> > > >>>>>>>> I had a hard time getting the embedded solver to work the way I > wanted, but maybe that is the better way. > > >>>>>>>> > > >>>>>>>> Thanks, > > >>>>>>>> > > >>>>>>>> Matt > > >>>>>>>> > > >>>>>>>> thanks for any advice > > >>>>>>>> best wishes > > >>>>>>>> Florian > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> -- > > >>>>>>>> What most experimenters take for granted before they begin > their experiments is infinitely more interesting than any results to which > their experiments lead. > > >>>>>>>> -- Norbert Wiener > > >>>>>>>> > > >>>>>>>> https://www.cse.buffalo.edu/~knepley/ > > >>>>>>> > > >>>>>> > > >>>>> > > >>>> > > >>>> > > >>> > > >> > > >> > > >> > > >> -- > > >> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > >> -- Norbert Wiener > > >> > > >> https://www.cse.buffalo.edu/~knepley/ > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- EPS Object: 1 MPI processes type: krylovschur 50% of basis vectors kept after restart using the locking variant problem type: generalized non-symmetric eigenvalue problem with symmetric positive definite B selected portion of the spectrum: largest eigenvalues in magnitude postprocessing eigenvectors with purification number of eigenvalues (nev): 50 number of column vectors (ncv): 100 maximum dimension of projected problem (mpd): 100 maximum number of iterations: 100 tolerance: 1e-10 convergence test: relative to the eigenvalue BV Object: 1 MPI processes type: svec 101 columns of global length 3076 vector orthogonalization method: classical Gram-Schmidt orthogonalization refinement: if needed (eta: 0.7071) block orthogonalization method: GS non-standard inner product tolerance for definite inner product: 2.22045e-15 inner product matrix: Mat Object: 1 MPI processes type: python rows=3076, cols=3076, bs=3076 Python: magnumpi.field_terms.field_term.A0Operator doing matmult as a single matrix-matrix product DS Object: 1 MPI processes type: nhep ST Object: 1 MPI processes type: shift shift: 0. 
number of matrices: 2 all matrices have unknown nonzero pattern KSP Object: (st_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-08, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (st_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 4.36739 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=3076, cols=3076, bs=2 package used to perform factorization: petsc total: nonzeros=1905232, allocated nonzeros=1905232 using I-node routines: found 963 nodes, limit used is 5 linear system matrix followed by preconditioner matrix: Mat Object: 1 MPI processes type: python rows=3076, cols=3076, bs=3076 Python: magnumpi.field_terms.field_term.A0Operator Mat Object: (st_) 1 MPI processes type: seqaij rows=3076, cols=3076, bs=2 total: nonzeros=436240, allocated nonzeros=436240 total number of mallocs used during MatSetValues calls=0 using I-node routines: found 1417 nodes, limit used is 5 EPS Object: 1 MPI processes type: krylovschur 50% of basis vectors kept after restart using the locking variant problem type: generalized non-symmetric eigenvalue problem with symmetric positive definite B selected portion of the spectrum: largest eigenvalues in magnitude postprocessing eigenvectors with purification number of eigenvalues (nev): 50 number of column vectors (ncv): 100 maximum dimension of projected problem (mpd): 100 maximum number of iterations: 100 tolerance: 1e-10 convergence test: relative to the eigenvalue BV Object: 1 MPI processes type: svec 101 columns of global length 3076 vector orthogonalization method: classical Gram-Schmidt orthogonalization refinement: if needed (eta: 0.7071) block orthogonalization method: GS non-standard inner product tolerance for definite inner product: 2.22045e-15 inner product matrix: Mat Object: 1 MPI processes type: python rows=3076, cols=3076, bs=3076 Python: magnumpi.field_terms.field_term.A0Operator doing matmult as a single matrix-matrix product DS Object: 1 MPI processes type: nhep ST Object: 1 MPI processes type: shift shift: 0. number of matrices: 2 all matrices have unknown nonzero pattern KSP Object: (st_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-08, absolute=1e-50, divergence=10000. 
left preconditioning using NONE norm type for convergence test PC Object: (st_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 4.36739 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=3076, cols=3076, bs=2 package used to perform factorization: petsc total: nonzeros=1905232, allocated nonzeros=1905232 using I-node routines: found 963 nodes, limit used is 5 linear system matrix followed by preconditioner matrix: Mat Object: 1 MPI processes type: python rows=3076, cols=3076, bs=3076 Python: magnumpi.field_terms.field_term.A0Operator Mat Object: (st_) 1 MPI processes type: seqaij rows=3076, cols=3076, bs=2 total: nonzeros=436240, allocated nonzeros=436240 total number of mallocs used during MatSetValues calls=0 using I-node routines: found 1417 nodes, limit used is 5 nconv: 58 -------------- next part -------------- EPS Object: 1 MPI processes type: krylovschur 50% of basis vectors kept after restart using the locking variant problem type: generalized non-symmetric eigenvalue problem with symmetric positive definite B selected portion of the spectrum: closest to target: 0. (in magnitude) postprocessing eigenvectors with purification number of eigenvalues (nev): 50 number of column vectors (ncv): 100 maximum dimension of projected problem (mpd): 100 maximum number of iterations: 100 tolerance: 1e-10 convergence test: relative to the eigenvalue BV Object: 1 MPI processes type: svec 101 columns of global length 3076 vector orthogonalization method: classical Gram-Schmidt orthogonalization refinement: if needed (eta: 0.7071) block orthogonalization method: GS non-standard inner product tolerance for definite inner product: 2.22045e-15 inner product matrix: Mat Object: 1 MPI processes type: python rows=3076, cols=3076, bs=3076 Python: magnumpi.field_terms.field_term.A0Operator doing matmult as a single matrix-matrix product DS Object: 1 MPI processes type: nhep ST Object: 1 MPI processes type: shift shift: 0. number of matrices: 2 all matrices have unknown nonzero pattern KSP Object: (st_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-08, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (st_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 4.36739 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=3076, cols=3076, bs=2 package used to perform factorization: petsc total: nonzeros=1905232, allocated nonzeros=1905232 using I-node routines: found 963 nodes, limit used is 5 linear system matrix followed by preconditioner matrix: Mat Object: 1 MPI processes type: python rows=3076, cols=3076, bs=3076 Python: magnumpi.field_terms.field_term.A0Operator Mat Object: (st_) 1 MPI processes type: seqaij rows=3076, cols=3076, bs=2 total: nonzeros=436240, allocated nonzeros=436240 total number of mallocs used during MatSetValues calls=0 using I-node routines: found 1417 nodes, limit used is 5 EPS Object: 1 MPI processes type: krylovschur 50% of basis vectors kept after restart using the locking variant problem type: generalized non-symmetric eigenvalue problem with symmetric positive definite B selected portion of the spectrum: closest to target: 0. 
(in magnitude) postprocessing eigenvectors with purification number of eigenvalues (nev): 50 number of column vectors (ncv): 100 maximum dimension of projected problem (mpd): 100 maximum number of iterations: 100 tolerance: 1e-10 convergence test: relative to the eigenvalue BV Object: 1 MPI processes type: svec 101 columns of global length 3076 vector orthogonalization method: classical Gram-Schmidt orthogonalization refinement: if needed (eta: 0.7071) block orthogonalization method: GS non-standard inner product tolerance for definite inner product: 2.22045e-15 inner product matrix: Mat Object: 1 MPI processes type: python rows=3076, cols=3076, bs=3076 Python: magnumpi.field_terms.field_term.A0Operator doing matmult as a single matrix-matrix product DS Object: 1 MPI processes type: nhep ST Object: 1 MPI processes type: shift shift: 0. number of matrices: 2 all matrices have unknown nonzero pattern KSP Object: (st_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-08, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (st_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 4.36739 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=3076, cols=3076, bs=2 package used to perform factorization: petsc total: nonzeros=1905232, allocated nonzeros=1905232 using I-node routines: found 963 nodes, limit used is 5 linear system matrix followed by preconditioner matrix: Mat Object: 1 MPI processes type: python rows=3076, cols=3076, bs=3076 Python: magnumpi.field_terms.field_term.A0Operator Mat Object: (st_) 1 MPI processes type: seqaij rows=3076, cols=3076, bs=2 total: nonzeros=436240, allocated nonzeros=436240 total number of mallocs used during MatSetValues calls=0 using I-node routines: found 1417 nodes, limit used is 5 nconv: 0 From jroman at dsic.upv.es Tue Mar 9 06:20:48 2021 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 9 Mar 2021 13:20:48 +0100 Subject: [petsc-users] using preconditioner with SLEPc In-Reply-To: References: <7C5B30FE-C539-4A14-B442-B1C91618E4AC@petsc.dev> <119944FD-4F1E-4B2F-A39D-65ADDB12BB5F@petsc.dev> <6EF7889D-DC17-46FC-82A5-9409C41E231D@petsc.dev> <46C744D7-4376-46B3-B5C4-211A4C8C2291@dsic.upv.es> <80BCEEDC-4C1E-4512-AAF5-7B6E718C7D1D@dsic.upv.es> Message-ID: <1DCC9C38-D49E-4A02-8EE3-B3701618914A@dsic.upv.es> > El 9 mar 2021, a las 13:07, Florian Bruckner escribi?: > > Dear Jose, > I appended the output of eps-view for the original method and for the method you proposed. > Unfortunately the new method does not converge. This I what i did: > > es = SLEPc.EPS().create(comm=fd.COMM_WORLD) > es.setDimensions(k) > es.setType(SLEPc.EPS.Type.KRYLOVSCHUR) > es.setProblemType(SLEPc.EPS.ProblemType.PGNHEP) # Generalized Non-Hermitian eigenproblem with positive definite B > es.setTarget(0.0) > es.setWhichEigenpairs(SLEPc.EPS.Which.TARGET_MAGNITUDE) > es.setTolerances(1e-10) > es.setOperators(self.B0, self.A0) > es.setFromOptions() > st = es.getST() > st.setPreconditionerMat(self.P0) No, you did not set SINVERT. Also, you should set (A,B) and not (B,A). And forget about PGNHEP for the moment. > > Is TARGET_MAGNITUDE correct? If I change it back to LARGEST_MAGNITUDE i can reproduce the (wrong) results from before. > Why do I need the shift-invert mode at all? I thought this is only necessary if I would like to solve for the smallest eigenmodes? 
> > Without st.setPreconditionerMat(self.P0) the code does not work, because A0 is a matshell and the preconditioner cannot be set up. > If I use pc_type = None the method converges, but results are totally wrong (1e-12 GHz instead of 7GHz). > > What confuses me most, is that the slepc results (without target=0 and with the precond matrix) produces nearly correct results, > which perfectly fit the results from scipy when using P0 instead of A0. This is a strange coincidence, and looks like P0 is used somewhere instead of A0. As I said, the problem is that it is using PREONLY+LU when it should use GMRES+BJACOBI by default. Are you setting PREONLY+LU somehow? Try running with -st_ksp_type gmres -st_pc_type bjacobi Jose > > thanks for your help > Florian > > On Tue, Mar 9, 2021 at 10:48 AM Jose E. Roman wrote: > The reason may be that it is using a direct solver instead of an iterative solver. What do you get for -eps_view ? > > Does the code work correctly if you comment out st.setPreconditionerMat(self.P0) ? > > Your approach should work, but I would first try as is done in the example https://slepc.upv.es/slepc-main/src/eps/tutorials/ex46.c.html > that is, shift-and-invert with target=0 and target_magnitude. > > Jose > > > > El 9 mar 2021, a las 9:07, Florian Bruckner escribi?: > > > > Dear Jose, > > I asked Lawrence Mitchell from the firedrake people to help me with the slepc update (I think they are applying some modifications for petsc, which is why simply updating petsc within my docker container did not work). > > Now the latest slepc version runs and I already get some results of the eigenmode solver. The good thing is that the solver runs significantly faster. The bad thing is that the results are still wrong :-) > > > > Could you have a short look at the code: > > es = SLEPc.EPS().create(comm=fd.COMM_WORLD) > > es.setDimensions(k) > > es.setType(SLEPc.EPS.Type.KRYLOVSCHUR) > > es.setProblemType(SLEPc.EPS.ProblemType.PGNHEP) # Generalized Non-Hermitian eigenproblem with positive definite B > > es.setWhichEigenpairs(SLEPc.EPS.Which.LARGEST_MAGNITUDE) > > #es.setTrueResidual(True) > > es.setTolerances(1e-10) > > es.setOperators(self.B0, self.A0) > > es.setFromOptions() > > > > st = es.getST() > > st.setPreconditionerMat(self.P0) > > > > You wrote that when using shift-and-invert with target=0 the solver internally uses A0^{-1}*B0. > > Nevertheless I think the precond P0 mat should be an approximation of A0, right? > > This is because the solver uses the B0-inner product to preserve symmetry. > > Or is the B0 missing in my code? > > > > As I mentioned before, convergence of the method is extremely fast. I thought that maybe the tolerance is set too low, but increasing it did not change the situation. > > With using setTrueResidual, there is no convergence at all. > > > > Figures show the different results for the original scipy method (which has been compared to published results) as well as the new slepc method. > > For some strange reason I get nearly the same (wrong) results if i replace A0 with P0 in the original scipy code. > > In my case A0 is a non-local field operator and P0 only contains local and next-neighbour interaction. > > Is it possible that the wrong operator (P0 instead of A0) is used internally? > > > > best wishes > > Florian > > > > On Thu, Feb 18, 2021 at 1:00 PM Florian Bruckner wrote: > > Dear Jose, > > thanks for your work. I just looked over the code, but I didn't have time to implement our solver, yet. 
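Jose's "-st_ksp_type gmres -st_pc_type bjacobi" suggestion quoted above can also be applied from code rather than the command line; a small sketch, assuming es is the already configured slepc4py EPS object:

    def use_iterative_st_solver(es):
        # Replace the PREONLY+LU combination seen in the -eps_view output with
        # GMRES preconditioned by block Jacobi, as suggested above.
        ksp = es.getST().getKSP()
        ksp.setType('gmres')
        ksp.getPC().setType('bjacobi')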
> > If I understand the code correctly, it allows to set a precond-matrix which should approximate A-sigma*B. > > > > I will try to get our code running in the next few weeks. From user perspective it would maybe simplify things if approximations for A as well as B are given, since this would hide the internal ST transformations. > > > > best wishes > > Florian > > > > On Tue, Feb 16, 2021 at 8:54 PM Jose E. Roman wrote: > > Florian: I have created a MR https://gitlab.com/slepc/slepc/-/merge_requests/149 > > Let me know if it fits your needs. > > > > Jose > > > > > > > El 15 feb 2021, a las 18:44, Jose E. Roman escribi?: > > > > > > > > > > > >> El 15 feb 2021, a las 14:53, Matthew Knepley escribi?: > > >> > > >> On Mon, Feb 15, 2021 at 7:27 AM Jose E. Roman wrote: > > >> I will think about the viability of adding an interface function to pass the preconditioner matrix. > > >> > > >> Regarding the question about the B-orthogonality of computed vectors, in the symmetric solver the B-orthogonality is enforced during the computation, so you have guarantee that the computed vectors satisfy it. But if solved as non-symetric, the computed vectors may depart from B-orthogonality, unless the tolerance is very small. > > >> > > >> Yes, the vectors I generate are not B-orthogonal. > > >> > > >> Jose, do you think there is a way to reformulate what I am doing to use the symmetric solver, even if we only have the action of B? > > > > > > Yes, you can do the following: > > > > > > ierr = EPSSetOperators(eps,S,NULL);CHKERRQ(ierr); // S is your shell matrix A^{-1}*B > > > ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRQ(ierr); // symmetric problem though S is not symmetric > > > ierr = EPSSetFromOptions(eps);CHKERRQ(ierr); > > > ierr = EPSSetUp(eps);CHKERRQ(ierr); // note explicitly calling setup here > > > ierr = EPSGetBV(eps,&bv);CHKERRQ(ierr); > > > ierr = BVSetMatrix(bv,B,PETSC_FALSE);CHKERRQ(ierr); // replace solver's inner product > > > ierr = EPSSolve(eps);CHKERRQ(ierr); > > > > > > I have tried this with test1.c and it works. The computed eigenvectors should be B-orthogonal in this case. > > > > > > Jose > > > > > > > > >> > > >> Thanks, > > >> > > >> Matt > > >> > > >> Jose > > >> > > >> > > >>> El 14 feb 2021, a las 21:41, Barry Smith escribi?: > > >>> > > >>> > > >>> Florian, > > >>> > > >>> I'm sorry I don't know the answers; I can only speculate. There is a STGetShift(). > > >>> > > >>> All I was saying is theoretically there could/should be such support in SLEPc. > > >>> > > >>> Barry > > >>> > > >>> > > >>>> On Feb 13, 2021, at 6:43 PM, Florian Bruckner wrote: > > >>>> > > >>>> Dear Barry, > > >>>> thank you for your clarification. What I wanted to say is that even if I could reset the KSP operators directly I would require to know which transformation ST applies in order to provide the preconditioning matrix for the correct operator. > > >>>> The more general solution would be that SLEPc provides the interface to pass the preconditioning matrix for A0 and ST applies the same transformations as for the operator. > > >>>> > > >>>> If you write "SLEPc could provide an interface", do you mean someone should implement it, or should it already be possible and I am not using it correctly? > > >>>> I wrote a small standalone example based on ex9.py from slepc4py, where i tried to use an operator. 
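Since that standalone example was slepc4py-based, a rough slepc4py transcription of Jose's C snippet quoted above might read as follows; S (the shell matrix applying A0^{-1}*B0) and B0 are assumed to exist already:

    from slepc4py import SLEPc

    def solve_with_b_inner_product(S, B0, nev=10):
        # Treat the non-symmetric shell operator S = A0^{-1} B0 as a Hermitian
        # problem, but swap the solver's inner product for the B0 one.
        es = SLEPc.EPS().create()
        es.setOperators(S)                             # standard problem, no B
        es.setProblemType(SLEPc.EPS.ProblemType.HEP)   # EPS_HEP
        es.setDimensions(nev)
        es.setFromOptions()
        es.setUp()                                     # explicit setup before touching the BV
        es.getBV().setMatrix(B0, False)                # BVSetMatrix(bv, B0, PETSC_FALSE)
        es.solve()
        return es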
> > >>>> > > >>>> best wishes > > >>>> Florian > > >>>> > > >>>> On Sat, Feb 13, 2021 at 7:15 PM Barry Smith wrote: > > >>>> > > >>>> > > >>>>> On Feb 13, 2021, at 2:47 AM, Pierre Jolivet wrote: > > >>>>> > > >>>>> > > >>>>> > > >>>>>> On 13 Feb 2021, at 7:25 AM, Florian Bruckner wrote: > > >>>>>> > > >>>>>> Dear Jose, Dear Barry, > > >>>>>> thanks again for your reply. One final question about the B0 orthogonality. Do you mean that eigenvectors are not B0 orthogonal, but they are i*B0 orthogonal? or is there an issue with Matt's approach? > > >>>>>> For my problem I can show that eigenvalues fulfill an orthogonality relation (phi_i, A0 phi_j ) = omega_i (phi_i, B0 phi_j) = delta_ij. This should be independent of the solving method, right? > > >>>>>> > > >>>>>> Regarding Barry's advice this is what I first tried: > > >>>>>> es = SLEPc.EPS().create(comm=fd.COMM_WORLD) > > >>>>>> st = es.getST() > > >>>>>> ksp = st.getKSP() > > >>>>>> ksp.setOperators(self.A0, self.P0) > > >>>>>> > > >>>>>> But it seems that the provided P0 is not used. Furthermore the interface is maybe a bit confusing if ST performs some transformation. In this case P0 needs to approximate A0^{-1}*B0 and not A0, right? > > >>>>> > > >>>>> No, you need to approximate (A0-sigma B0)^-1. If you have a null shift, which looks like it is the case, you end up with A0^-1. > > >>>> > > >>>> Just trying to provide more clarity with the terms. > > >>>> > > >>>> If ST transforms the operator in the KSP to (A0-sigma B0) and you are providing the "sparse matrix from which the preconditioner is to be built" then you need to provide something that approximates (A0-sigma B0). Since the PC will use your matrix to construct a preconditioner that approximates the inverse of (A0-sigma B0), you don't need to directly provide something that approximates (A0-sigma B0)^-1 > > >>>> > > >>>> Yes, I would think SLEPc could provide an interface where it manages "the matrix from which to construct the preconditioner" and transforms that matrix just like the true matrix. To do it by hand you simply need to know what A0 and B0 are and which sigma ST has selected and then you can construct your modA0 - sigma modB0 and pass it to the KSP. Where modA0 and modB0 are your "sparser approximations". > > >>>> > > >>>> Barry > > >>>> > > >>>> > > >>>>> > > >>>>>> Nevertheless I think it would be the best solution if one could provide P0 (approx A0) and SLEPc derives the preconditioner from this. Would this be hard to implement? > > >>>>> > > >>>>> This is what Barry?s suggestion is implementing. Don?t know why it doesn?t work with your Python operator though. > > >>>>> > > >>>>> Thanks, > > >>>>> Pierre > > >>>>> > > >>>>>> best wishes > > >>>>>> Florian > > >>>>>> > > >>>>>> > > >>>>>> On Sat, Feb 13, 2021 at 4:19 AM Barry Smith wrote: > > >>>>>> > > >>>>>> > > >>>>>>> On Feb 12, 2021, at 2:32 AM, Florian Bruckner wrote: > > >>>>>>> > > >>>>>>> Dear Jose, Dear Matt, > > >>>>>>> > > >>>>>>> I needed some time to think about your answers. > > >>>>>>> If I understand correctly, the eigenmode solver internally uses A0^{-1}*B0, which is normally handled by the ST object, which creates a KSP solver and a corresponding preconditioner. > > >>>>>>> What I would need is an interface to provide not only the system Matrix A0 (which is an operator), but also a preconditioning matrix (sparse approximation of the operator). > > >>>>>>> Unfortunately this interface is not available, right? 
> > >>>>>> > > >>>>>> If SLEPc does not provide this directly it is still intended to be trivial to provide the "preconditioner matrix" (that is matrix from which the preconditioner is built). Just get the KSP from the ST object and use KSPSetOperators() to provide the "preconditioner matrix" . > > >>>>>> > > >>>>>> Barry > > >>>>>> > > >>>>>>> > > >>>>>>> Matt directly creates A0^{-1}*B0 as a matshell operator. The operator uses a KSP with a proper PC internally. SLEPc would directly get A0^{-1}*B0 and solve a standard eigenvalue problem with this modified operator. Did I understand this correctly? > > >>>>>>> > > >>>>>>> I have two further points, which I did not mention yet: the matrix B0 is Hermitian, but it is (purely) imaginary (B0.real=0). Right now, I am using Firedrake to set up the PETSc system matrices A0, i*B0 (which is real). Then I convert them into ScipyLinearOperators and use scipy.sparse.eigsh(B0, b=A0, Minv=Minv) to calculate the eigenvalues. Minv=A0^-1 is also solving within scipy using a preconditioned gmres. Advantage of this setup is that the imaginary B0 can be handled efficiently and also the post-processing of the eigenvectors (which requires complex arithmetics) is simplified. > > >>>>>>> > > >>>>>>> Nevertheless I think that the mixing of PETSc and Scipy looks too complicated and is not very flexible. > > >>>>>>> If I would use Matt's approach, could I then simply switch between multiple standard eigenvalue methods (e.g. LOBPCG)? or is it limited due to the use of matshell? > > >>>>>>> Is there a solution for the imaginary B0, or do I have to use the non-hermitian methods? Is this a large performance drawback? > > >>>>>>> > > >>>>>>> thanks again, > > >>>>>>> and best wishes > > >>>>>>> Florian > > >>>>>>> > > >>>>>>> On Mon, Feb 8, 2021 at 3:37 PM Jose E. Roman wrote: > > >>>>>>> The problem can be written as A0*v=omega*B0*v and you want the eigenvalues omega closest to zero. If the matrices were explicitly available, you would do shift-and-invert with target=0, that is > > >>>>>>> > > >>>>>>> (A0-sigma*B0)^{-1}*B0*v=theta*v for sigma=0, that is > > >>>>>>> > > >>>>>>> A0^{-1}*B0*v=theta*v > > >>>>>>> > > >>>>>>> and you compute EPS_LARGEST_MAGNITUDE eigenvalues theta=1/omega. > > >>>>>>> > > >>>>>>> Matt: I guess you should have EPS_LARGEST_MAGNITUDE instead of EPS_SMALLEST_REAL in your code. Are you getting the eigenvalues you need? EPS_SMALLEST_REAL will give slow convergence. > > >>>>>>> > > >>>>>>> Florian: I would not recommend setting the KSP matrices directly, it may produce strange side-effects. We should have an interface function to pass this matrix. Currently there is STPrecondSetMatForPC() but it has two problems: (1) it is intended for STPRECOND, so cannot be used with Krylov-Schur, and (2) it is not currently available in the python interface. > > >>>>>>> > > >>>>>>> The approach used by Matt is a workaround that does not use ST, so you can handle linear solves with a KSP of your own. > > >>>>>>> > > >>>>>>> As an alternative, since your problem is symmetric, you could try LOBPCG, assuming that the leftmost eigenvalues are those that you want (e.g. if all eigenvalues are non-negative). In that case you could use STPrecondSetMatForPC(), but the remaining issue is calling it from python. > > >>>>>>> > > >>>>>>> If you are using the git repo, I could add the relevant code. 
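To make the LOBPCG alternative concrete, here is a sketch of what it might look like with the newer preconditioner interface instead of STPrecondSetMatForPC(); whether the GHEP formulation applies to the actual (purely imaginary) B0 of this thread is left open, and A0, B0, P0 are assumed to exist:

    from slepc4py import SLEPc

    def setup_lobpcg(A0, B0, P0, nev=10):
        # Leftmost eigenvalues of the Hermitian pencil (A0, B0), with the
        # sparse P0 seeding the preconditioner through STPRECOND.
        es = SLEPc.EPS().create()
        es.setOperators(A0, B0)
        es.setProblemType(SLEPc.EPS.ProblemType.GHEP)
        es.setType(SLEPc.EPS.Type.LOBPCG)
        es.setWhichEigenpairs(SLEPc.EPS.Which.SMALLEST_REAL)
        es.setDimensions(nev)
        st = es.getST()
        st.setType(SLEPc.ST.Type.PRECOND)
        st.setPreconditionerMat(P0)
        es.setFromOptions()
        return es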
> > >>>>>>> > > >>>>>>> Jose > > >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>>> El 8 feb 2021, a las 14:22, Matthew Knepley escribi?: > > >>>>>>>> > > >>>>>>>> On Mon, Feb 8, 2021 at 7:04 AM Florian Bruckner wrote: > > >>>>>>>> Dear PETSc / SLEPc Users, > > >>>>>>>> > > >>>>>>>> my question is very similar to the one posted here: > > >>>>>>>> https://lists.mcs.anl.gov/pipermail/petsc-users/2018-August/035878.html > > >>>>>>>> > > >>>>>>>> The eigensystem I would like to solve looks like: > > >>>>>>>> B0 v = 1/omega A0 v > > >>>>>>>> B0 and A0 are both hermitian, A0 is positive definite, but only given as a linear operator (matshell). I am looking for the largest eigenvalues (=smallest omega). > > >>>>>>>> > > >>>>>>>> I also have a sparse approximation P0 of the A0 operator, which i would like to use as precondtioner, using something like this: > > >>>>>>>> > > >>>>>>>> es = SLEPc.EPS().create(comm=fd.COMM_WORLD) > > >>>>>>>> st = es.getST() > > >>>>>>>> ksp = st.getKSP() > > >>>>>>>> ksp.setOperators(self.A0, self.P0) > > >>>>>>>> > > >>>>>>>> Unfortunately PETSc still complains that it cannot create a preconditioner for a type 'python' matrix although P0.type == 'seqaij' (but A0.type == 'python'). > > >>>>>>>> By the way, should P0 be an approximation of A0 or does it have to include B0? > > >>>>>>>> > > >>>>>>>> Right now I am using the krylov-schur method. Are there any alternatives if A0 is only given as an operator? > > >>>>>>>> > > >>>>>>>> Jose can correct me if I say something wrong. > > >>>>>>>> > > >>>>>>>> When I did this, I made a shell operator for the action of A0^{-1} B0 which has a KSPSolve() in it, so you can use your P0 preconditioning matrix, and > > >>>>>>>> then handed that to EPS. You can see me do it here: > > >>>>>>>> > > >>>>>>>> https://gitlab.com/knepley/bamg/-/blob/master/src/coarse/bamgCoarseSpace.c#L123 > > >>>>>>>> > > >>>>>>>> I had a hard time getting the embedded solver to work the way I wanted, but maybe that is the better way. > > >>>>>>>> > > >>>>>>>> Thanks, > > >>>>>>>> > > >>>>>>>> Matt > > >>>>>>>> > > >>>>>>>> thanks for any advice > > >>>>>>>> best wishes > > >>>>>>>> Florian > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> -- > > >>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > >>>>>>>> -- Norbert Wiener > > >>>>>>>> > > >>>>>>>> https://www.cse.buffalo.edu/~knepley/ > > >>>>>>> > > >>>>>> > > >>>>> > > >>>> > > >>>> > > >>> > > >> > > >> > > >> > > >> -- > > >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > >> -- Norbert Wiener > > >> > > >> https://www.cse.buffalo.edu/~knepley/ > > > > > > From e0425375 at gmail.com Tue Mar 9 07:23:23 2021 From: e0425375 at gmail.com (Florian Bruckner) Date: Tue, 9 Mar 2021 14:23:23 +0100 Subject: [petsc-users] using preconditioner with SLEPc In-Reply-To: <1DCC9C38-D49E-4A02-8EE3-B3701618914A@dsic.upv.es> References: <7C5B30FE-C539-4A14-B442-B1C91618E4AC@petsc.dev> <119944FD-4F1E-4B2F-A39D-65ADDB12BB5F@petsc.dev> <6EF7889D-DC17-46FC-82A5-9409C41E231D@petsc.dev> <46C744D7-4376-46B3-B5C4-211A4C8C2291@dsic.upv.es> <80BCEEDC-4C1E-4512-AAF5-7B6E718C7D1D@dsic.upv.es> <1DCC9C38-D49E-4A02-8EE3-B3701618914A@dsic.upv.es> Message-ID: Dear Jose, good news: something is running. 
I have to admit that I don't understand the difference of what you suggested, but if I run es = SLEPc.EPS().create(comm=fd.COMM_WORLD) es.setDimensions(k) es.setType(SLEPc.EPS.Type.KRYLOVSCHUR) es.setProblemType(SLEPc.EPS.ProblemType.GNHEP) # Generalized Non-Hermitian eigenproblem with positive definite B es.setTarget(0.0) es.setWhichEigenpairs(SLEPc.EPS.Which.TARGET_MAGNITUDE) es.setTolerances(1e-10) es.setOperators(self.A0, self.B0) es.setFromOptions() st = es.getST() st.setType(SLEPc.ST.Type.SINVERT) st.setPreconditionerMat(self.P0) es.solve() es.view() python run_slepc.py -st_ksp_type gmres -st_pc_type bjacobi i get the correct eigenvalues if as -1j*es.getEigenvalue(i) (the -1j is because B0 should be purely imaginary). Omitting the preconditioning matrix gives the same results, but as expected is significantly slower. If running it with the -st_ksp_type preonly -st_pc_type lu it still converges against totally wrong values. But this could be due to the preonly option. If only the PC is applied only, it is clear that A0 is not used at all, right? I didn't set LU manually. Perhaps firedrake changes the defaults somehow? But this should not be a problem. So, the code seems to work. Just to be sure, solving with (A0, B0) for SMALLEST_MAGNITUDE should be similar to target=0 and TARGET_MAGNITUDE, or is there any difference? Finally, also solving (B0, A0) with LARGEST_MAGNITUDE should be similar, but one gets 1/omega as eigenvalues. In all cases the provided precond matrix should approximate A0, right? Is this correct, or are the differences in the implementation when using the different formulations of the problem? again, many many thanks. best wishes Florian On Tue, Mar 9, 2021 at 1:20 PM Jose E. Roman wrote: > > > > El 9 mar 2021, a las 13:07, Florian Bruckner > escribi?: > > > > Dear Jose, > > I appended the output of eps-view for the original method and for the > method you proposed. > > Unfortunately the new method does not converge. This I what i did: > > > > es = SLEPc.EPS().create(comm=fd.COMM_WORLD) > > es.setDimensions(k) > > es.setType(SLEPc.EPS.Type.KRYLOVSCHUR) > > es.setProblemType(SLEPc.EPS.ProblemType.PGNHEP) # Generalized > Non-Hermitian eigenproblem with positive definite B > > es.setTarget(0.0) > > es.setWhichEigenpairs(SLEPc.EPS.Which.TARGET_MAGNITUDE) > > es.setTolerances(1e-10) > > es.setOperators(self.B0, self.A0) > > es.setFromOptions() > > st = es.getST() > > st.setPreconditionerMat(self.P0) > > No, you did not set SINVERT. Also, you should set (A,B) and not (B,A). And > forget about PGNHEP for the moment. > > > > > Is TARGET_MAGNITUDE correct? If I change it back to LARGEST_MAGNITUDE i > can reproduce the (wrong) results from before. > > Why do I need the shift-invert mode at all? I thought this is only > necessary if I would like to solve for the smallest eigenmodes? > > > > Without st.setPreconditionerMat(self.P0) the code does not work, because > A0 is a matshell and the preconditioner cannot be set up. > > If I use pc_type = None the method converges, but results are totally > wrong (1e-12 GHz instead of 7GHz). > > > > What confuses me most, is that the slepc results (without target=0 and > with the precond matrix) produces nearly correct results, > > which perfectly fit the results from scipy when using P0 instead of A0. > This is a strange coincidence, and looks like P0 is used somewhere instead > of A0. > > As I said, the problem is that it is using PREONLY+LU when it should use > GMRES+BJACOBI by default. Are you setting PREONLY+LU somehow? 
Try running > with -st_ksp_type gmres -st_pc_type bjacobi > > Jose > > > > > thanks for your help > > Florian > > > > On Tue, Mar 9, 2021 at 10:48 AM Jose E. Roman > wrote: > > The reason may be that it is using a direct solver instead of an > iterative solver. What do you get for -eps_view ? > > > > Does the code work correctly if you comment out > st.setPreconditionerMat(self.P0) ? > > > > Your approach should work, but I would first try as is done in the > example https://slepc.upv.es/slepc-main/src/eps/tutorials/ex46.c.html > > that is, shift-and-invert with target=0 and target_magnitude. > > > > Jose > > > > > > > El 9 mar 2021, a las 9:07, Florian Bruckner > escribi?: > > > > > > Dear Jose, > > > I asked Lawrence Mitchell from the firedrake people to help me with > the slepc update (I think they are applying some modifications for petsc, > which is why simply updating petsc within my docker container did not work). > > > Now the latest slepc version runs and I already get some results of > the eigenmode solver. The good thing is that the solver runs significantly > faster. The bad thing is that the results are still wrong :-) > > > > > > Could you have a short look at the code: > > > es = SLEPc.EPS().create(comm=fd.COMM_WORLD) > > > es.setDimensions(k) > > > es.setType(SLEPc.EPS.Type.KRYLOVSCHUR) > > > es.setProblemType(SLEPc.EPS.ProblemType.PGNHEP) # Generalized > Non-Hermitian eigenproblem with positive definite B > > > es.setWhichEigenpairs(SLEPc.EPS.Which.LARGEST_MAGNITUDE) > > > #es.setTrueResidual(True) > > > es.setTolerances(1e-10) > > > es.setOperators(self.B0, self.A0) > > > es.setFromOptions() > > > > > > st = es.getST() > > > st.setPreconditionerMat(self.P0) > > > > > > You wrote that when using shift-and-invert with target=0 the solver > internally uses A0^{-1}*B0. > > > Nevertheless I think the precond P0 mat should be an approximation of > A0, right? > > > This is because the solver uses the B0-inner product to preserve > symmetry. > > > Or is the B0 missing in my code? > > > > > > As I mentioned before, convergence of the method is extremely fast. I > thought that maybe the tolerance is set too low, but increasing it did not > change the situation. > > > With using setTrueResidual, there is no convergence at all. > > > > > > Figures show the different results for the original scipy method > (which has been compared to published results) as well as the new slepc > method. > > > For some strange reason I get nearly the same (wrong) results if i > replace A0 with P0 in the original scipy code. > > > In my case A0 is a non-local field operator and P0 only contains local > and next-neighbour interaction. > > > Is it possible that the wrong operator (P0 instead of A0) is used > internally? > > > > > > best wishes > > > Florian > > > > > > On Thu, Feb 18, 2021 at 1:00 PM Florian Bruckner > wrote: > > > Dear Jose, > > > thanks for your work. I just looked over the code, but I didn't have > time to implement our solver, yet. > > > If I understand the code correctly, it allows to set a precond-matrix > which should approximate A-sigma*B. > > > > > > I will try to get our code running in the next few weeks. From user > perspective it would maybe simplify things if approximations for A as well > as B are given, since this would hide the internal ST transformations. > > > > > > best wishes > > > Florian > > > > > > On Tue, Feb 16, 2021 at 8:54 PM Jose E. 
Roman > wrote: > > > Florian: I have created a MR > https://gitlab.com/slepc/slepc/-/merge_requests/149 > > > Let me know if it fits your needs. > > > > > > Jose > > > > > > > > > > El 15 feb 2021, a las 18:44, Jose E. Roman > escribi?: > > > > > > > > > > > > > > > >> El 15 feb 2021, a las 14:53, Matthew Knepley > escribi?: > > > >> > > > >> On Mon, Feb 15, 2021 at 7:27 AM Jose E. Roman > wrote: > > > >> I will think about the viability of adding an interface function to > pass the preconditioner matrix. > > > >> > > > >> Regarding the question about the B-orthogonality of computed > vectors, in the symmetric solver the B-orthogonality is enforced during the > computation, so you have guarantee that the computed vectors satisfy it. > But if solved as non-symetric, the computed vectors may depart from > B-orthogonality, unless the tolerance is very small. > > > >> > > > >> Yes, the vectors I generate are not B-orthogonal. > > > >> > > > >> Jose, do you think there is a way to reformulate what I am doing to > use the symmetric solver, even if we only have the action of B? > > > > > > > > Yes, you can do the following: > > > > > > > > ierr = EPSSetOperators(eps,S,NULL);CHKERRQ(ierr); // S is your > shell matrix A^{-1}*B > > > > ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRQ(ierr); // symmetric > problem though S is not symmetric > > > > ierr = EPSSetFromOptions(eps);CHKERRQ(ierr); > > > > ierr = EPSSetUp(eps);CHKERRQ(ierr); // note explicitly calling > setup here > > > > ierr = EPSGetBV(eps,&bv);CHKERRQ(ierr); > > > > ierr = BVSetMatrix(bv,B,PETSC_FALSE);CHKERRQ(ierr); // replace > solver's inner product > > > > ierr = EPSSolve(eps);CHKERRQ(ierr); > > > > > > > > I have tried this with test1.c and it works. The computed > eigenvectors should be B-orthogonal in this case. > > > > > > > > Jose > > > > > > > > > > > >> > > > >> Thanks, > > > >> > > > >> Matt > > > >> > > > >> Jose > > > >> > > > >> > > > >>> El 14 feb 2021, a las 21:41, Barry Smith > escribi?: > > > >>> > > > >>> > > > >>> Florian, > > > >>> > > > >>> I'm sorry I don't know the answers; I can only speculate. There > is a STGetShift(). > > > >>> > > > >>> All I was saying is theoretically there could/should be such > support in SLEPc. > > > >>> > > > >>> Barry > > > >>> > > > >>> > > > >>>> On Feb 13, 2021, at 6:43 PM, Florian Bruckner > wrote: > > > >>>> > > > >>>> Dear Barry, > > > >>>> thank you for your clarification. What I wanted to say is that > even if I could reset the KSP operators directly I would require to know > which transformation ST applies in order to provide the preconditioning > matrix for the correct operator. > > > >>>> The more general solution would be that SLEPc provides the > interface to pass the preconditioning matrix for A0 and ST applies the same > transformations as for the operator. > > > >>>> > > > >>>> If you write "SLEPc could provide an interface", do you mean > someone should implement it, or should it already be possible and I am not > using it correctly? > > > >>>> I wrote a small standalone example based on ex9.py from slepc4py, > where i tried to use an operator. > > > >>>> > > > >>>> best wishes > > > >>>> Florian > > > >>>> > > > >>>> On Sat, Feb 13, 2021 at 7:15 PM Barry Smith > wrote: > > > >>>> > > > >>>> > > > >>>>> On Feb 13, 2021, at 2:47 AM, Pierre Jolivet > wrote: > > > >>>>> > > > >>>>> > > > >>>>> > > > >>>>>> On 13 Feb 2021, at 7:25 AM, Florian Bruckner < > e0425375 at gmail.com> wrote: > > > >>>>>> > > > >>>>>> Dear Jose, Dear Barry, > > > >>>>>> thanks again for your reply. 
One final question about the B0 > orthogonality. Do you mean that eigenvectors are not B0 orthogonal, but > they are i*B0 orthogonal? or is there an issue with Matt's approach? > > > >>>>>> For my problem I can show that eigenvalues fulfill an > orthogonality relation (phi_i, A0 phi_j ) = omega_i (phi_i, B0 phi_j) = > delta_ij. This should be independent of the solving method, right? > > > >>>>>> > > > >>>>>> Regarding Barry's advice this is what I first tried: > > > >>>>>> es = SLEPc.EPS().create(comm=fd.COMM_WORLD) > > > >>>>>> st = es.getST() > > > >>>>>> ksp = st.getKSP() > > > >>>>>> ksp.setOperators(self.A0, self.P0) > > > >>>>>> > > > >>>>>> But it seems that the provided P0 is not used. Furthermore the > interface is maybe a bit confusing if ST performs some transformation. In > this case P0 needs to approximate A0^{-1}*B0 and not A0, right? > > > >>>>> > > > >>>>> No, you need to approximate (A0-sigma B0)^-1. If you have a null > shift, which looks like it is the case, you end up with A0^-1. > > > >>>> > > > >>>> Just trying to provide more clarity with the terms. > > > >>>> > > > >>>> If ST transforms the operator in the KSP to (A0-sigma B0) and you > are providing the "sparse matrix from which the preconditioner is to be > built" then you need to provide something that approximates (A0-sigma B0). > Since the PC will use your matrix to construct a preconditioner that > approximates the inverse of (A0-sigma B0), you don't need to directly > provide something that approximates (A0-sigma B0)^-1 > > > >>>> > > > >>>> Yes, I would think SLEPc could provide an interface where it > manages "the matrix from which to construct the preconditioner" and > transforms that matrix just like the true matrix. To do it by hand you > simply need to know what A0 and B0 are and which sigma ST has selected and > then you can construct your modA0 - sigma modB0 and pass it to the KSP. > Where modA0 and modB0 are your "sparser approximations". > > > >>>> > > > >>>> Barry > > > >>>> > > > >>>> > > > >>>>> > > > >>>>>> Nevertheless I think it would be the best solution if one could > provide P0 (approx A0) and SLEPc derives the preconditioner from this. > Would this be hard to implement? > > > >>>>> > > > >>>>> This is what Barry?s suggestion is implementing. Don?t know why > it doesn?t work with your Python operator though. > > > >>>>> > > > >>>>> Thanks, > > > >>>>> Pierre > > > >>>>> > > > >>>>>> best wishes > > > >>>>>> Florian > > > >>>>>> > > > >>>>>> > > > >>>>>> On Sat, Feb 13, 2021 at 4:19 AM Barry Smith > wrote: > > > >>>>>> > > > >>>>>> > > > >>>>>>> On Feb 12, 2021, at 2:32 AM, Florian Bruckner < > e0425375 at gmail.com> wrote: > > > >>>>>>> > > > >>>>>>> Dear Jose, Dear Matt, > > > >>>>>>> > > > >>>>>>> I needed some time to think about your answers. > > > >>>>>>> If I understand correctly, the eigenmode solver internally > uses A0^{-1}*B0, which is normally handled by the ST object, which creates > a KSP solver and a corresponding preconditioner. > > > >>>>>>> What I would need is an interface to provide not only the > system Matrix A0 (which is an operator), but also a preconditioning matrix > (sparse approximation of the operator). > > > >>>>>>> Unfortunately this interface is not available, right? > > > >>>>>> > > > >>>>>> If SLEPc does not provide this directly it is still intended > to be trivial to provide the "preconditioner matrix" (that is matrix from > which the preconditioner is built). 
Just get the KSP from the ST object and > use KSPSetOperators() to provide the "preconditioner matrix" . > > > >>>>>> > > > >>>>>> Barry > > > >>>>>> > > > >>>>>>> > > > >>>>>>> Matt directly creates A0^{-1}*B0 as a matshell operator. The > operator uses a KSP with a proper PC internally. SLEPc would directly get > A0^{-1}*B0 and solve a standard eigenvalue problem with this modified > operator. Did I understand this correctly? > > > >>>>>>> > > > >>>>>>> I have two further points, which I did not mention yet: the > matrix B0 is Hermitian, but it is (purely) imaginary (B0.real=0). Right > now, I am using Firedrake to set up the PETSc system matrices A0, i*B0 > (which is real). Then I convert them into ScipyLinearOperators and use > scipy.sparse.eigsh(B0, b=A0, Minv=Minv) to calculate the eigenvalues. > Minv=A0^-1 is also solving within scipy using a preconditioned gmres. > Advantage of this setup is that the imaginary B0 can be handled efficiently > and also the post-processing of the eigenvectors (which requires complex > arithmetics) is simplified. > > > >>>>>>> > > > >>>>>>> Nevertheless I think that the mixing of PETSc and Scipy looks > too complicated and is not very flexible. > > > >>>>>>> If I would use Matt's approach, could I then simply switch > between multiple standard eigenvalue methods (e.g. LOBPCG)? or is it > limited due to the use of matshell? > > > >>>>>>> Is there a solution for the imaginary B0, or do I have to use > the non-hermitian methods? Is this a large performance drawback? > > > >>>>>>> > > > >>>>>>> thanks again, > > > >>>>>>> and best wishes > > > >>>>>>> Florian > > > >>>>>>> > > > >>>>>>> On Mon, Feb 8, 2021 at 3:37 PM Jose E. Roman < > jroman at dsic.upv.es> wrote: > > > >>>>>>> The problem can be written as A0*v=omega*B0*v and you want the > eigenvalues omega closest to zero. If the matrices were explicitly > available, you would do shift-and-invert with target=0, that is > > > >>>>>>> > > > >>>>>>> (A0-sigma*B0)^{-1}*B0*v=theta*v for sigma=0, that is > > > >>>>>>> > > > >>>>>>> A0^{-1}*B0*v=theta*v > > > >>>>>>> > > > >>>>>>> and you compute EPS_LARGEST_MAGNITUDE eigenvalues > theta=1/omega. > > > >>>>>>> > > > >>>>>>> Matt: I guess you should have EPS_LARGEST_MAGNITUDE instead of > EPS_SMALLEST_REAL in your code. Are you getting the eigenvalues you need? > EPS_SMALLEST_REAL will give slow convergence. > > > >>>>>>> > > > >>>>>>> Florian: I would not recommend setting the KSP matrices > directly, it may produce strange side-effects. We should have an interface > function to pass this matrix. Currently there is STPrecondSetMatForPC() but > it has two problems: (1) it is intended for STPRECOND, so cannot be used > with Krylov-Schur, and (2) it is not currently available in the python > interface. > > > >>>>>>> > > > >>>>>>> The approach used by Matt is a workaround that does not use > ST, so you can handle linear solves with a KSP of your own. > > > >>>>>>> > > > >>>>>>> As an alternative, since your problem is symmetric, you could > try LOBPCG, assuming that the leftmost eigenvalues are those that you want > (e.g. if all eigenvalues are non-negative). In that case you could use > STPrecondSetMatForPC(), but the remaining issue is calling it from python. > > > >>>>>>> > > > >>>>>>> If you are using the git repo, I could add the relevant code. 
> > > >>>>>>> > > > >>>>>>> Jose > > > >>>>>>> > > > >>>>>>> > > > >>>>>>> > > > >>>>>>>> El 8 feb 2021, a las 14:22, Matthew Knepley < > knepley at gmail.com> escribi?: > > > >>>>>>>> > > > >>>>>>>> On Mon, Feb 8, 2021 at 7:04 AM Florian Bruckner < > e0425375 at gmail.com> wrote: > > > >>>>>>>> Dear PETSc / SLEPc Users, > > > >>>>>>>> > > > >>>>>>>> my question is very similar to the one posted here: > > > >>>>>>>> > https://lists.mcs.anl.gov/pipermail/petsc-users/2018-August/035878.html > > > >>>>>>>> > > > >>>>>>>> The eigensystem I would like to solve looks like: > > > >>>>>>>> B0 v = 1/omega A0 v > > > >>>>>>>> B0 and A0 are both hermitian, A0 is positive definite, but > only given as a linear operator (matshell). I am looking for the largest > eigenvalues (=smallest omega). > > > >>>>>>>> > > > >>>>>>>> I also have a sparse approximation P0 of the A0 operator, > which i would like to use as precondtioner, using something like this: > > > >>>>>>>> > > > >>>>>>>> es = SLEPc.EPS().create(comm=fd.COMM_WORLD) > > > >>>>>>>> st = es.getST() > > > >>>>>>>> ksp = st.getKSP() > > > >>>>>>>> ksp.setOperators(self.A0, self.P0) > > > >>>>>>>> > > > >>>>>>>> Unfortunately PETSc still complains that it cannot create a > preconditioner for a type 'python' matrix although P0.type == 'seqaij' (but > A0.type == 'python'). > > > >>>>>>>> By the way, should P0 be an approximation of A0 or does it > have to include B0? > > > >>>>>>>> > > > >>>>>>>> Right now I am using the krylov-schur method. Are there any > alternatives if A0 is only given as an operator? > > > >>>>>>>> > > > >>>>>>>> Jose can correct me if I say something wrong. > > > >>>>>>>> > > > >>>>>>>> When I did this, I made a shell operator for the action of > A0^{-1} B0 which has a KSPSolve() in it, so you can use your P0 > preconditioning matrix, and > > > >>>>>>>> then handed that to EPS. You can see me do it here: > > > >>>>>>>> > > > >>>>>>>> > https://gitlab.com/knepley/bamg/-/blob/master/src/coarse/bamgCoarseSpace.c#L123 > > > >>>>>>>> > > > >>>>>>>> I had a hard time getting the embedded solver to work the way > I wanted, but maybe that is the better way. > > > >>>>>>>> > > > >>>>>>>> Thanks, > > > >>>>>>>> > > > >>>>>>>> Matt > > > >>>>>>>> > > > >>>>>>>> thanks for any advice > > > >>>>>>>> best wishes > > > >>>>>>>> Florian > > > >>>>>>>> > > > >>>>>>>> > > > >>>>>>>> -- > > > >>>>>>>> What most experimenters take for granted before they begin > their experiments is infinitely more interesting than any results to which > their experiments lead. > > > >>>>>>>> -- Norbert Wiener > > > >>>>>>>> > > > >>>>>>>> https://www.cse.buffalo.edu/~knepley/ > > > >>>>>>> > > > >>>>>> > > > >>>>> > > > >>>> > > > >>>> > > > >>> > > > >> > > > >> > > > >> > > > >> -- > > > >> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > > >> -- Norbert Wiener > > > >> > > > >> https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Tue Mar 9 07:29:29 2021 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 9 Mar 2021 08:29:29 -0500 Subject: [petsc-users] using preconditioner with SLEPc In-Reply-To: References: <7C5B30FE-C539-4A14-B442-B1C91618E4AC@petsc.dev> <119944FD-4F1E-4B2F-A39D-65ADDB12BB5F@petsc.dev> <6EF7889D-DC17-46FC-82A5-9409C41E231D@petsc.dev> <46C744D7-4376-46B3-B5C4-211A4C8C2291@dsic.upv.es> <80BCEEDC-4C1E-4512-AAF5-7B6E718C7D1D@dsic.upv.es> <1DCC9C38-D49E-4A02-8EE3-B3701618914A@dsic.upv.es> Message-ID: On Tue, Mar 9, 2021 at 8:23 AM Florian Bruckner wrote: > Dear Jose, > > good news: something is running. I have to admit that I don't understand > the difference of what you suggested, but if I run > es = SLEPc.EPS().create(comm=fd.COMM_WORLD) > > es.setDimensions(k) > es.setType(SLEPc.EPS.Type.KRYLOVSCHUR) > > es.setProblemType(SLEPc.EPS.ProblemType.GNHEP) # Generalized > Non-Hermitian eigenproblem with positive definite B > es.setTarget(0.0) > es.setWhichEigenpairs(SLEPc.EPS.Which.TARGET_MAGNITUDE) > > es.setTolerances(1e-10) > es.setOperators(self.A0, self.B0) > > es.setFromOptions() > st = es.getST() > st.setType(SLEPc.ST.Type.SINVERT) > st.setPreconditionerMat(self.P0) > es.solve() > > es.view() > python run_slepc.py -st_ksp_type gmres -st_pc_type bjacobi > > i get the correct eigenvalues if as -1j*es.getEigenvalue(i) (the -1j is > because B0 should be purely imaginary). > Omitting the preconditioning matrix gives the same results, but as > expected is significantly slower. > > If running it with the -st_ksp_type preonly -st_pc_type lu it still > converges against totally wrong values. > But this could be due to the preonly option. If only the PC is applied > only, it is clear that A0 is not used at all, right? > > I didn't set LU manually. Perhaps firedrake changes the defaults somehow? > But this should not be a problem. > Hi Florian, This is a mismatch of assumptions by you and Firedrake. The solver (preonly, LU) factors the _preconditioning_ matrix and solves it directly. Usually this is very robust, but it assumes that the system matrix and preconditioning matrix are the same, so you get the solution to your actual problem. Here you have an approximate preconditioning matrix, so (preonly, LU) solves only that approximate problem. The solver (gmres, LU) will use GMRES to solve the system matrix, preconditioned by the LU solve of the preconditioning matrix, which gives you the right result. Does that make sense? Thanks, Matt > So, the code seems to work. Just to be sure, solving with (A0, B0) for > SMALLEST_MAGNITUDE should be similar to target=0 and TARGET_MAGNITUDE, or > is there any difference? > Finally, also solving (B0, A0) with LARGEST_MAGNITUDE should be similar, > but one gets 1/omega as eigenvalues. In all cases the provided precond > matrix should approximate A0, right? > Is this correct, or are the differences in the implementation when using > the different formulations of the problem? > > again, many many thanks. > best wishes > Florian > > On Tue, Mar 9, 2021 at 1:20 PM Jose E. Roman wrote: > >> >> >> > El 9 mar 2021, a las 13:07, Florian Bruckner >> escribi?: >> > >> > Dear Jose, >> > I appended the output of eps-view for the original method and for the >> method you proposed. >> > Unfortunately the new method does not converge. 
This I what i did: >> > >> > es = SLEPc.EPS().create(comm=fd.COMM_WORLD) >> > es.setDimensions(k) >> > es.setType(SLEPc.EPS.Type.KRYLOVSCHUR) >> > es.setProblemType(SLEPc.EPS.ProblemType.PGNHEP) # Generalized >> Non-Hermitian eigenproblem with positive definite B >> > es.setTarget(0.0) >> > es.setWhichEigenpairs(SLEPc.EPS.Which.TARGET_MAGNITUDE) >> > es.setTolerances(1e-10) >> > es.setOperators(self.B0, self.A0) >> > es.setFromOptions() >> > st = es.getST() >> > st.setPreconditionerMat(self.P0) >> >> No, you did not set SINVERT. Also, you should set (A,B) and not (B,A). >> And forget about PGNHEP for the moment. >> >> > >> > Is TARGET_MAGNITUDE correct? If I change it back to LARGEST_MAGNITUDE i >> can reproduce the (wrong) results from before. >> > Why do I need the shift-invert mode at all? I thought this is only >> necessary if I would like to solve for the smallest eigenmodes? >> > >> > Without st.setPreconditionerMat(self.P0) the code does not work, >> because A0 is a matshell and the preconditioner cannot be set up. >> > If I use pc_type = None the method converges, but results are totally >> wrong (1e-12 GHz instead of 7GHz). >> > >> > What confuses me most, is that the slepc results (without target=0 and >> with the precond matrix) produces nearly correct results, >> > which perfectly fit the results from scipy when using P0 instead of A0. >> This is a strange coincidence, and looks like P0 is used somewhere instead >> of A0. >> >> As I said, the problem is that it is using PREONLY+LU when it should use >> GMRES+BJACOBI by default. Are you setting PREONLY+LU somehow? Try running >> with -st_ksp_type gmres -st_pc_type bjacobi >> >> Jose >> >> > >> > thanks for your help >> > Florian >> > >> > On Tue, Mar 9, 2021 at 10:48 AM Jose E. Roman >> wrote: >> > The reason may be that it is using a direct solver instead of an >> iterative solver. What do you get for -eps_view ? >> > >> > Does the code work correctly if you comment out >> st.setPreconditionerMat(self.P0) ? >> > >> > Your approach should work, but I would first try as is done in the >> example https://slepc.upv.es/slepc-main/src/eps/tutorials/ex46.c.html >> > that is, shift-and-invert with target=0 and target_magnitude. >> > >> > Jose >> > >> > >> > > El 9 mar 2021, a las 9:07, Florian Bruckner >> escribi?: >> > > >> > > Dear Jose, >> > > I asked Lawrence Mitchell from the firedrake people to help me with >> the slepc update (I think they are applying some modifications for petsc, >> which is why simply updating petsc within my docker container did not work). >> > > Now the latest slepc version runs and I already get some results of >> the eigenmode solver. The good thing is that the solver runs significantly >> faster. The bad thing is that the results are still wrong :-) >> > > >> > > Could you have a short look at the code: >> > > es = SLEPc.EPS().create(comm=fd.COMM_WORLD) >> > > es.setDimensions(k) >> > > es.setType(SLEPc.EPS.Type.KRYLOVSCHUR) >> > > es.setProblemType(SLEPc.EPS.ProblemType.PGNHEP) # Generalized >> Non-Hermitian eigenproblem with positive definite B >> > > es.setWhichEigenpairs(SLEPc.EPS.Which.LARGEST_MAGNITUDE) >> > > #es.setTrueResidual(True) >> > > es.setTolerances(1e-10) >> > > es.setOperators(self.B0, self.A0) >> > > es.setFromOptions() >> > > >> > > st = es.getST() >> > > st.setPreconditionerMat(self.P0) >> > > >> > > You wrote that when using shift-and-invert with target=0 the solver >> internally uses A0^{-1}*B0. 
>> > > Nevertheless I think the precond P0 mat should be an approximation of >> A0, right? >> > > This is because the solver uses the B0-inner product to preserve >> symmetry. >> > > Or is the B0 missing in my code? >> > > >> > > As I mentioned before, convergence of the method is extremely fast. I >> thought that maybe the tolerance is set too low, but increasing it did not >> change the situation. >> > > With using setTrueResidual, there is no convergence at all. >> > > >> > > Figures show the different results for the original scipy method >> (which has been compared to published results) as well as the new slepc >> method. >> > > For some strange reason I get nearly the same (wrong) results if i >> replace A0 with P0 in the original scipy code. >> > > In my case A0 is a non-local field operator and P0 only contains >> local and next-neighbour interaction. >> > > Is it possible that the wrong operator (P0 instead of A0) is used >> internally? >> > > >> > > best wishes >> > > Florian >> > > >> > > On Thu, Feb 18, 2021 at 1:00 PM Florian Bruckner >> wrote: >> > > Dear Jose, >> > > thanks for your work. I just looked over the code, but I didn't have >> time to implement our solver, yet. >> > > If I understand the code correctly, it allows to set a precond-matrix >> which should approximate A-sigma*B. >> > > >> > > I will try to get our code running in the next few weeks. From user >> perspective it would maybe simplify things if approximations for A as well >> as B are given, since this would hide the internal ST transformations. >> > > >> > > best wishes >> > > Florian >> > > >> > > On Tue, Feb 16, 2021 at 8:54 PM Jose E. Roman >> wrote: >> > > Florian: I have created a MR >> https://gitlab.com/slepc/slepc/-/merge_requests/149 >> > > Let me know if it fits your needs. >> > > >> > > Jose >> > > >> > > >> > > > El 15 feb 2021, a las 18:44, Jose E. Roman >> escribi?: >> > > > >> > > > >> > > > >> > > >> El 15 feb 2021, a las 14:53, Matthew Knepley >> escribi?: >> > > >> >> > > >> On Mon, Feb 15, 2021 at 7:27 AM Jose E. Roman >> wrote: >> > > >> I will think about the viability of adding an interface function >> to pass the preconditioner matrix. >> > > >> >> > > >> Regarding the question about the B-orthogonality of computed >> vectors, in the symmetric solver the B-orthogonality is enforced during the >> computation, so you have guarantee that the computed vectors satisfy it. >> But if solved as non-symetric, the computed vectors may depart from >> B-orthogonality, unless the tolerance is very small. >> > > >> >> > > >> Yes, the vectors I generate are not B-orthogonal. >> > > >> >> > > >> Jose, do you think there is a way to reformulate what I am doing >> to use the symmetric solver, even if we only have the action of B? >> > > > >> > > > Yes, you can do the following: >> > > > >> > > > ierr = EPSSetOperators(eps,S,NULL);CHKERRQ(ierr); // S is your >> shell matrix A^{-1}*B >> > > > ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRQ(ierr); // symmetric >> problem though S is not symmetric >> > > > ierr = EPSSetFromOptions(eps);CHKERRQ(ierr); >> > > > ierr = EPSSetUp(eps);CHKERRQ(ierr); // note explicitly calling >> setup here >> > > > ierr = EPSGetBV(eps,&bv);CHKERRQ(ierr); >> > > > ierr = BVSetMatrix(bv,B,PETSC_FALSE);CHKERRQ(ierr); // replace >> solver's inner product >> > > > ierr = EPSSolve(eps);CHKERRQ(ierr); >> > > > >> > > > I have tried this with test1.c and it works. The computed >> eigenvectors should be B-orthogonal in this case. 
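[Since the rest of this thread drives SLEPc from Python, a rough slepc4py translation of the C recipe quoted just above (a standard EPS_HEP solve on the shell operator S = A0^{-1}*B0, with the solver's inner product replaced by B0 through BVSetMatrix) might look like the sketch below. The names A0, B0, P0 and the shell class are placeholders assumed for illustration, not code from this thread.]

from petsc4py import PETSc
from slepc4py import SLEPc

class AinvB(object):
    """Shell context applying y = A0^{-1} (B0 x) with a preconditioned KSP."""
    def __init__(self, A0, B0, P0):
        self.B0 = B0
        self.ksp = PETSc.KSP().create(comm=A0.comm)
        self.ksp.setOperators(A0, P0)        # P0: sparse approximation of A0
        self.ksp.setType('gmres')
        self.ksp.getPC().setType('bjacobi')
        self.ksp.setFromOptions()
        self.work = B0.createVecLeft()

    def mult(self, mat, x, y):
        self.B0.mult(x, self.work)
        self.ksp.solve(self.work, y)

def solve_with_b_inner_product(A0, B0, P0, k=5):
    S = PETSc.Mat().createPython(A0.getSizes(), AinvB(A0, B0, P0), comm=A0.comm)
    S.setUp()
    eps = SLEPc.EPS().create(comm=A0.comm)
    eps.setOperators(S)                               # standard problem on S
    eps.setProblemType(SLEPc.EPS.ProblemType.HEP)     # symmetric problem, though S itself is not
    eps.setDimensions(k)
    eps.setWhichEigenpairs(SLEPc.EPS.Which.LARGEST_MAGNITUDE)
    eps.setFromOptions()
    eps.setUp()                                       # explicit setup before touching the BV
    bv = eps.getBV()
    bv.setMatrix(B0, False)                           # replace the solver's inner product by B0
    eps.solve()
    return [eps.getEigenvalue(i) for i in range(eps.getConverged())]

[The values returned this way are the theta = 1/omega of the transformed problem, so the back-transformation discussed elsewhere in this thread still has to be applied.]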
>> > > > >> > > > Jose >> > > > >> > > > >> > > >> >> > > >> Thanks, >> > > >> >> > > >> Matt >> > > >> >> > > >> Jose >> > > >> >> > > >> >> > > >>> El 14 feb 2021, a las 21:41, Barry Smith >> escribi?: >> > > >>> >> > > >>> >> > > >>> Florian, >> > > >>> >> > > >>> I'm sorry I don't know the answers; I can only speculate. There >> is a STGetShift(). >> > > >>> >> > > >>> All I was saying is theoretically there could/should be such >> support in SLEPc. >> > > >>> >> > > >>> Barry >> > > >>> >> > > >>> >> > > >>>> On Feb 13, 2021, at 6:43 PM, Florian Bruckner < >> e0425375 at gmail.com> wrote: >> > > >>>> >> > > >>>> Dear Barry, >> > > >>>> thank you for your clarification. What I wanted to say is that >> even if I could reset the KSP operators directly I would require to know >> which transformation ST applies in order to provide the preconditioning >> matrix for the correct operator. >> > > >>>> The more general solution would be that SLEPc provides the >> interface to pass the preconditioning matrix for A0 and ST applies the same >> transformations as for the operator. >> > > >>>> >> > > >>>> If you write "SLEPc could provide an interface", do you mean >> someone should implement it, or should it already be possible and I am not >> using it correctly? >> > > >>>> I wrote a small standalone example based on ex9.py from >> slepc4py, where i tried to use an operator. >> > > >>>> >> > > >>>> best wishes >> > > >>>> Florian >> > > >>>> >> > > >>>> On Sat, Feb 13, 2021 at 7:15 PM Barry Smith >> wrote: >> > > >>>> >> > > >>>> >> > > >>>>> On Feb 13, 2021, at 2:47 AM, Pierre Jolivet >> wrote: >> > > >>>>> >> > > >>>>> >> > > >>>>> >> > > >>>>>> On 13 Feb 2021, at 7:25 AM, Florian Bruckner < >> e0425375 at gmail.com> wrote: >> > > >>>>>> >> > > >>>>>> Dear Jose, Dear Barry, >> > > >>>>>> thanks again for your reply. One final question about the B0 >> orthogonality. Do you mean that eigenvectors are not B0 orthogonal, but >> they are i*B0 orthogonal? or is there an issue with Matt's approach? >> > > >>>>>> For my problem I can show that eigenvalues fulfill an >> orthogonality relation (phi_i, A0 phi_j ) = omega_i (phi_i, B0 phi_j) = >> delta_ij. This should be independent of the solving method, right? >> > > >>>>>> >> > > >>>>>> Regarding Barry's advice this is what I first tried: >> > > >>>>>> es = SLEPc.EPS().create(comm=fd.COMM_WORLD) >> > > >>>>>> st = es.getST() >> > > >>>>>> ksp = st.getKSP() >> > > >>>>>> ksp.setOperators(self.A0, self.P0) >> > > >>>>>> >> > > >>>>>> But it seems that the provided P0 is not used. Furthermore the >> interface is maybe a bit confusing if ST performs some transformation. In >> this case P0 needs to approximate A0^{-1}*B0 and not A0, right? >> > > >>>>> >> > > >>>>> No, you need to approximate (A0-sigma B0)^-1. If you have a >> null shift, which looks like it is the case, you end up with A0^-1. >> > > >>>> >> > > >>>> Just trying to provide more clarity with the terms. >> > > >>>> >> > > >>>> If ST transforms the operator in the KSP to (A0-sigma B0) and >> you are providing the "sparse matrix from which the preconditioner is to be >> built" then you need to provide something that approximates (A0-sigma B0). 
>> Since the PC will use your matrix to construct a preconditioner that >> approximates the inverse of (A0-sigma B0), you don't need to directly >> provide something that approximates (A0-sigma B0)^-1 >> > > >>>> >> > > >>>> Yes, I would think SLEPc could provide an interface where it >> manages "the matrix from which to construct the preconditioner" and >> transforms that matrix just like the true matrix. To do it by hand you >> simply need to know what A0 and B0 are and which sigma ST has selected and >> then you can construct your modA0 - sigma modB0 and pass it to the KSP. >> Where modA0 and modB0 are your "sparser approximations". >> > > >>>> >> > > >>>> Barry >> > > >>>> >> > > >>>> >> > > >>>>> >> > > >>>>>> Nevertheless I think it would be the best solution if one >> could provide P0 (approx A0) and SLEPc derives the preconditioner from >> this. Would this be hard to implement? >> > > >>>>> >> > > >>>>> This is what Barry?s suggestion is implementing. Don?t know why >> it doesn?t work with your Python operator though. >> > > >>>>> >> > > >>>>> Thanks, >> > > >>>>> Pierre >> > > >>>>> >> > > >>>>>> best wishes >> > > >>>>>> Florian >> > > >>>>>> >> > > >>>>>> >> > > >>>>>> On Sat, Feb 13, 2021 at 4:19 AM Barry Smith >> wrote: >> > > >>>>>> >> > > >>>>>> >> > > >>>>>>> On Feb 12, 2021, at 2:32 AM, Florian Bruckner < >> e0425375 at gmail.com> wrote: >> > > >>>>>>> >> > > >>>>>>> Dear Jose, Dear Matt, >> > > >>>>>>> >> > > >>>>>>> I needed some time to think about your answers. >> > > >>>>>>> If I understand correctly, the eigenmode solver internally >> uses A0^{-1}*B0, which is normally handled by the ST object, which creates >> a KSP solver and a corresponding preconditioner. >> > > >>>>>>> What I would need is an interface to provide not only the >> system Matrix A0 (which is an operator), but also a preconditioning matrix >> (sparse approximation of the operator). >> > > >>>>>>> Unfortunately this interface is not available, right? >> > > >>>>>> >> > > >>>>>> If SLEPc does not provide this directly it is still intended >> to be trivial to provide the "preconditioner matrix" (that is matrix from >> which the preconditioner is built). Just get the KSP from the ST object and >> use KSPSetOperators() to provide the "preconditioner matrix" . >> > > >>>>>> >> > > >>>>>> Barry >> > > >>>>>> >> > > >>>>>>> >> > > >>>>>>> Matt directly creates A0^{-1}*B0 as a matshell operator. The >> operator uses a KSP with a proper PC internally. SLEPc would directly get >> A0^{-1}*B0 and solve a standard eigenvalue problem with this modified >> operator. Did I understand this correctly? >> > > >>>>>>> >> > > >>>>>>> I have two further points, which I did not mention yet: the >> matrix B0 is Hermitian, but it is (purely) imaginary (B0.real=0). Right >> now, I am using Firedrake to set up the PETSc system matrices A0, i*B0 >> (which is real). Then I convert them into ScipyLinearOperators and use >> scipy.sparse.eigsh(B0, b=A0, Minv=Minv) to calculate the eigenvalues. >> Minv=A0^-1 is also solving within scipy using a preconditioned gmres. >> Advantage of this setup is that the imaginary B0 can be handled efficiently >> and also the post-processing of the eigenvectors (which requires complex >> arithmetics) is simplified. >> > > >>>>>>> >> > > >>>>>>> Nevertheless I think that the mixing of PETSc and Scipy looks >> too complicated and is not very flexible. >> > > >>>>>>> If I would use Matt's approach, could I then simply switch >> between multiple standard eigenvalue methods (e.g. LOBPCG)? 
or is it >> limited due to the use of matshell? >> > > >>>>>>> Is there a solution for the imaginary B0, or do I have to use >> the non-hermitian methods? Is this a large performance drawback? >> > > >>>>>>> >> > > >>>>>>> thanks again, >> > > >>>>>>> and best wishes >> > > >>>>>>> Florian >> > > >>>>>>> >> > > >>>>>>> On Mon, Feb 8, 2021 at 3:37 PM Jose E. Roman < >> jroman at dsic.upv.es> wrote: >> > > >>>>>>> The problem can be written as A0*v=omega*B0*v and you want >> the eigenvalues omega closest to zero. If the matrices were explicitly >> available, you would do shift-and-invert with target=0, that is >> > > >>>>>>> >> > > >>>>>>> (A0-sigma*B0)^{-1}*B0*v=theta*v for sigma=0, that is >> > > >>>>>>> >> > > >>>>>>> A0^{-1}*B0*v=theta*v >> > > >>>>>>> >> > > >>>>>>> and you compute EPS_LARGEST_MAGNITUDE eigenvalues >> theta=1/omega. >> > > >>>>>>> >> > > >>>>>>> Matt: I guess you should have EPS_LARGEST_MAGNITUDE instead >> of EPS_SMALLEST_REAL in your code. Are you getting the eigenvalues you >> need? EPS_SMALLEST_REAL will give slow convergence. >> > > >>>>>>> >> > > >>>>>>> Florian: I would not recommend setting the KSP matrices >> directly, it may produce strange side-effects. We should have an interface >> function to pass this matrix. Currently there is STPrecondSetMatForPC() but >> it has two problems: (1) it is intended for STPRECOND, so cannot be used >> with Krylov-Schur, and (2) it is not currently available in the python >> interface. >> > > >>>>>>> >> > > >>>>>>> The approach used by Matt is a workaround that does not use >> ST, so you can handle linear solves with a KSP of your own. >> > > >>>>>>> >> > > >>>>>>> As an alternative, since your problem is symmetric, you could >> try LOBPCG, assuming that the leftmost eigenvalues are those that you want >> (e.g. if all eigenvalues are non-negative). In that case you could use >> STPrecondSetMatForPC(), but the remaining issue is calling it from python. >> > > >>>>>>> >> > > >>>>>>> If you are using the git repo, I could add the relevant code. >> > > >>>>>>> >> > > >>>>>>> Jose >> > > >>>>>>> >> > > >>>>>>> >> > > >>>>>>> >> > > >>>>>>>> El 8 feb 2021, a las 14:22, Matthew Knepley < >> knepley at gmail.com> escribi?: >> > > >>>>>>>> >> > > >>>>>>>> On Mon, Feb 8, 2021 at 7:04 AM Florian Bruckner < >> e0425375 at gmail.com> wrote: >> > > >>>>>>>> Dear PETSc / SLEPc Users, >> > > >>>>>>>> >> > > >>>>>>>> my question is very similar to the one posted here: >> > > >>>>>>>> >> https://lists.mcs.anl.gov/pipermail/petsc-users/2018-August/035878.html >> > > >>>>>>>> >> > > >>>>>>>> The eigensystem I would like to solve looks like: >> > > >>>>>>>> B0 v = 1/omega A0 v >> > > >>>>>>>> B0 and A0 are both hermitian, A0 is positive definite, but >> only given as a linear operator (matshell). I am looking for the largest >> eigenvalues (=smallest omega). >> > > >>>>>>>> >> > > >>>>>>>> I also have a sparse approximation P0 of the A0 operator, >> which i would like to use as precondtioner, using something like this: >> > > >>>>>>>> >> > > >>>>>>>> es = SLEPc.EPS().create(comm=fd.COMM_WORLD) >> > > >>>>>>>> st = es.getST() >> > > >>>>>>>> ksp = st.getKSP() >> > > >>>>>>>> ksp.setOperators(self.A0, self.P0) >> > > >>>>>>>> >> > > >>>>>>>> Unfortunately PETSc still complains that it cannot create a >> preconditioner for a type 'python' matrix although P0.type == 'seqaij' (but >> A0.type == 'python'). >> > > >>>>>>>> By the way, should P0 be an approximation of A0 or does it >> have to include B0? 
>> > > >>>>>>>> >> > > >>>>>>>> Right now I am using the krylov-schur method. Are there any >> alternatives if A0 is only given as an operator? >> > > >>>>>>>> >> > > >>>>>>>> Jose can correct me if I say something wrong. >> > > >>>>>>>> >> > > >>>>>>>> When I did this, I made a shell operator for the action of >> A0^{-1} B0 which has a KSPSolve() in it, so you can use your P0 >> preconditioning matrix, and >> > > >>>>>>>> then handed that to EPS. You can see me do it here: >> > > >>>>>>>> >> > > >>>>>>>> >> https://gitlab.com/knepley/bamg/-/blob/master/src/coarse/bamgCoarseSpace.c#L123 >> > > >>>>>>>> >> > > >>>>>>>> I had a hard time getting the embedded solver to work the >> way I wanted, but maybe that is the better way. >> > > >>>>>>>> >> > > >>>>>>>> Thanks, >> > > >>>>>>>> >> > > >>>>>>>> Matt >> > > >>>>>>>> >> > > >>>>>>>> thanks for any advice >> > > >>>>>>>> best wishes >> > > >>>>>>>> Florian >> > > >>>>>>>> >> > > >>>>>>>> >> > > >>>>>>>> -- >> > > >>>>>>>> What most experimenters take for granted before they begin >> their experiments is infinitely more interesting than any results to which >> their experiments lead. >> > > >>>>>>>> -- Norbert Wiener >> > > >>>>>>>> >> > > >>>>>>>> https://www.cse.buffalo.edu/~knepley/ >> > > >>>>>>> >> > > >>>>>> >> > > >>>>> >> > > >>>> >> > > >>>> >> > > >>> >> > > >> >> > > >> >> > > >> >> > > >> -- >> > > >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > > >> -- Norbert Wiener >> > > >> >> > > >> https://www.cse.buffalo.edu/~knepley/ >> > > >> > > >> >> > >> > >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Tue Mar 9 08:17:01 2021 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 9 Mar 2021 15:17:01 +0100 Subject: [petsc-users] using preconditioner with SLEPc In-Reply-To: References: <7C5B30FE-C539-4A14-B442-B1C91618E4AC@petsc.dev> <119944FD-4F1E-4B2F-A39D-65ADDB12BB5F@petsc.dev> <6EF7889D-DC17-46FC-82A5-9409C41E231D@petsc.dev> <46C744D7-4376-46B3-B5C4-211A4C8C2291@dsic.upv.es> <80BCEEDC-4C1E-4512-AAF5-7B6E718C7D1D@dsic.upv.es> <1DCC9C38-D49E-4A02-8EE3-B3701618914A@dsic.upv.es> Message-ID: <4639F19E-5B56-43F9-A617-FC44195E462F@dsic.upv.es> Preconditioned eigensolvers such as GD, JD, LOBPCG, are intended to work with preconditioners, i.e. not very accurate approximations of (A-sigma*B)^-1, but Krylov solvers require full accuracy, so even if you use an approximation to build the preconditioner it is necessary to do a full solve, i.e. gmres or bcgs wrapping your preconditioner. When I added this, I changed it to default to gmres+bjacobi when EPSSetPreconditionerMat() is called, but for some reason in your case it is not doing it. If it's Firedrake who is setting preonly+lu I would say it is not necessary (that is the default). I don't know if there is another explanation, maybe the order is relevant: try calling es.setFromOptions() after st.setPreconditionerMat(self.P0). Anyway, I will add a check that EPSSetPreconditionerMat() fails if the user selects preonly in Krylov solvers. 
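[Collected into a single slepc4py sketch, the ordering suggested above might look as follows; A0, B0, P0 and k are the assumed names used elsewhere in this thread, and setting gmres/bjacobi on the ST's KSP here is only a programmatic stand-in for the -st_ksp_type gmres -st_pc_type bjacobi options discussed in this thread, not something SLEPc requires.]

from slepc4py import SLEPc

def setup_eps(A0, B0, P0, k):
    es = SLEPc.EPS().create(comm=A0.comm)
    es.setDimensions(k)
    es.setType(SLEPc.EPS.Type.KRYLOVSCHUR)
    es.setProblemType(SLEPc.EPS.ProblemType.GNHEP)
    es.setOperators(A0, B0)                 # (A, B), not (B, A)
    es.setTarget(0.0)
    es.setWhichEigenpairs(SLEPc.EPS.Which.TARGET_MAGNITUDE)
    es.setTolerances(1e-10)
    st = es.getST()
    st.setType(SLEPc.ST.Type.SINVERT)
    st.setPreconditionerMat(P0)             # P0 approximates A0 - sigma*B0, i.e. A0 for sigma = 0
    ksp = st.getKSP()
    ksp.setType('gmres')                    # full inner solve wrapping the preconditioner,
    ksp.getPC().setType('bjacobi')          # rather than preonly+lu on the approximate matrix
    es.setFromOptions()                     # called last, after setPreconditionerMat()
    return es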
Look at my first email, where I explained the equivalence between shift-and-invert with target=0 and regular solve with the reversed matrices: The problem can be written as A0*v=omega*B0*v and you want the eigenvalues omega closest to zero. If the matrices were explicitly available, you would do shift-and-invert with target=0, that is (A0-sigma*B0)^{-1}*B0*v=theta*v for sigma=0, that is A0^{-1}*B0*v=theta*v and you compute EPS_LARGEST_MAGNITUDE eigenvalues theta=1/omega. See also chapter 3 of the users manual. Jose > El 9 mar 2021, a las 14:29, Matthew Knepley escribi?: > > On Tue, Mar 9, 2021 at 8:23 AM Florian Bruckner wrote: > Dear Jose, > > good news: something is running. I have to admit that I don't understand the difference of what you suggested, but if I run > es = SLEPc.EPS().create(comm=fd.COMM_WORLD) > es.setDimensions(k) > es.setType(SLEPc.EPS.Type.KRYLOVSCHUR) > es.setProblemType(SLEPc.EPS.ProblemType.GNHEP) # Generalized Non-Hermitian eigenproblem with positive definite B > es.setTarget(0.0) > es.setWhichEigenpairs(SLEPc.EPS.Which.TARGET_MAGNITUDE) > es.setTolerances(1e-10) > es.setOperators(self.A0, self.B0) > es.setFromOptions() > st = es.getST() > st.setType(SLEPc.ST.Type.SINVERT) > st.setPreconditionerMat(self.P0) > es.solve() > es.view() > python run_slepc.py -st_ksp_type gmres -st_pc_type bjacobi > > i get the correct eigenvalues if as -1j*es.getEigenvalue(i) (the -1j is because B0 should be purely imaginary). > Omitting the preconditioning matrix gives the same results, but as expected is significantly slower. > > If running it with the -st_ksp_type preonly -st_pc_type lu it still converges against totally wrong values. > But this could be due to the preonly option. If only the PC is applied only, it is clear that A0 is not used at all, right? > > I didn't set LU manually. Perhaps firedrake changes the defaults somehow? But this should not be a problem. > > Hi Florian, > > This is a mismatch of assumptions by you and Firedrake. The solver (preonly, LU) factors the _preconditioning_ matrix and > solves it directly. Usually this is very robust, but it assumes that the system matrix and preconditioning matrix are the same, > so you get the solution to your actual problem. Here you have an approximate preconditioning matrix, so (preonly, LU) solves > only that approximate problem. The solver (gmres, LU) will use GMRES to solve the system matrix, preconditioned by the LU > solve of the preconditioning matrix, which gives you the right result. > > Does that make sense? > > Thanks, > > Matt > > So, the code seems to work. Just to be sure, solving with (A0, B0) for SMALLEST_MAGNITUDE should be similar to target=0 and TARGET_MAGNITUDE, or is there any difference? > Finally, also solving (B0, A0) with LARGEST_MAGNITUDE should be similar, but one gets 1/omega as eigenvalues. In all cases the provided precond matrix should approximate A0, right? > Is this correct, or are the differences in the implementation when using the different formulations of the problem? > > again, many many thanks. > best wishes > Florian > > On Tue, Mar 9, 2021 at 1:20 PM Jose E. Roman wrote: > > > > El 9 mar 2021, a las 13:07, Florian Bruckner escribi?: > > > > Dear Jose, > > I appended the output of eps-view for the original method and for the method you proposed. > > Unfortunately the new method does not converge. 
This I what i did: > > > > es = SLEPc.EPS().create(comm=fd.COMM_WORLD) > > es.setDimensions(k) > > es.setType(SLEPc.EPS.Type.KRYLOVSCHUR) > > es.setProblemType(SLEPc.EPS.ProblemType.PGNHEP) # Generalized Non-Hermitian eigenproblem with positive definite B > > es.setTarget(0.0) > > es.setWhichEigenpairs(SLEPc.EPS.Which.TARGET_MAGNITUDE) > > es.setTolerances(1e-10) > > es.setOperators(self.B0, self.A0) > > es.setFromOptions() > > st = es.getST() > > st.setPreconditionerMat(self.P0) > > No, you did not set SINVERT. Also, you should set (A,B) and not (B,A). And forget about PGNHEP for the moment. > > > > > Is TARGET_MAGNITUDE correct? If I change it back to LARGEST_MAGNITUDE i can reproduce the (wrong) results from before. > > Why do I need the shift-invert mode at all? I thought this is only necessary if I would like to solve for the smallest eigenmodes? > > > > Without st.setPreconditionerMat(self.P0) the code does not work, because A0 is a matshell and the preconditioner cannot be set up. > > If I use pc_type = None the method converges, but results are totally wrong (1e-12 GHz instead of 7GHz). > > > > What confuses me most, is that the slepc results (without target=0 and with the precond matrix) produces nearly correct results, > > which perfectly fit the results from scipy when using P0 instead of A0. This is a strange coincidence, and looks like P0 is used somewhere instead of A0. > > As I said, the problem is that it is using PREONLY+LU when it should use GMRES+BJACOBI by default. Are you setting PREONLY+LU somehow? Try running with -st_ksp_type gmres -st_pc_type bjacobi > > Jose > > > > > thanks for your help > > Florian > > > > On Tue, Mar 9, 2021 at 10:48 AM Jose E. Roman wrote: > > The reason may be that it is using a direct solver instead of an iterative solver. What do you get for -eps_view ? > > > > Does the code work correctly if you comment out st.setPreconditionerMat(self.P0) ? > > > > Your approach should work, but I would first try as is done in the example https://slepc.upv.es/slepc-main/src/eps/tutorials/ex46.c.html > > that is, shift-and-invert with target=0 and target_magnitude. > > > > Jose > > > > > > > El 9 mar 2021, a las 9:07, Florian Bruckner escribi?: > > > > > > Dear Jose, > > > I asked Lawrence Mitchell from the firedrake people to help me with the slepc update (I think they are applying some modifications for petsc, which is why simply updating petsc within my docker container did not work). > > > Now the latest slepc version runs and I already get some results of the eigenmode solver. The good thing is that the solver runs significantly faster. The bad thing is that the results are still wrong :-) > > > > > > Could you have a short look at the code: > > > es = SLEPc.EPS().create(comm=fd.COMM_WORLD) > > > es.setDimensions(k) > > > es.setType(SLEPc.EPS.Type.KRYLOVSCHUR) > > > es.setProblemType(SLEPc.EPS.ProblemType.PGNHEP) # Generalized Non-Hermitian eigenproblem with positive definite B > > > es.setWhichEigenpairs(SLEPc.EPS.Which.LARGEST_MAGNITUDE) > > > #es.setTrueResidual(True) > > > es.setTolerances(1e-10) > > > es.setOperators(self.B0, self.A0) > > > es.setFromOptions() > > > > > > st = es.getST() > > > st.setPreconditionerMat(self.P0) > > > > > > You wrote that when using shift-and-invert with target=0 the solver internally uses A0^{-1}*B0. > > > Nevertheless I think the precond P0 mat should be an approximation of A0, right? > > > This is because the solver uses the B0-inner product to preserve symmetry. > > > Or is the B0 missing in my code? 
> > > > > > As I mentioned before, convergence of the method is extremely fast. I thought that maybe the tolerance is set too low, but increasing it did not change the situation. > > > With using setTrueResidual, there is no convergence at all. > > > > > > Figures show the different results for the original scipy method (which has been compared to published results) as well as the new slepc method. > > > For some strange reason I get nearly the same (wrong) results if i replace A0 with P0 in the original scipy code. > > > In my case A0 is a non-local field operator and P0 only contains local and next-neighbour interaction. > > > Is it possible that the wrong operator (P0 instead of A0) is used internally? > > > > > > best wishes > > > Florian > > > > > > On Thu, Feb 18, 2021 at 1:00 PM Florian Bruckner wrote: > > > Dear Jose, > > > thanks for your work. I just looked over the code, but I didn't have time to implement our solver, yet. > > > If I understand the code correctly, it allows to set a precond-matrix which should approximate A-sigma*B. > > > > > > I will try to get our code running in the next few weeks. From user perspective it would maybe simplify things if approximations for A as well as B are given, since this would hide the internal ST transformations. > > > > > > best wishes > > > Florian > > > > > > On Tue, Feb 16, 2021 at 8:54 PM Jose E. Roman wrote: > > > Florian: I have created a MR https://gitlab.com/slepc/slepc/-/merge_requests/149 > > > Let me know if it fits your needs. > > > > > > Jose > > > > > > > > > > El 15 feb 2021, a las 18:44, Jose E. Roman escribi?: > > > > > > > > > > > > > > > >> El 15 feb 2021, a las 14:53, Matthew Knepley escribi?: > > > >> > > > >> On Mon, Feb 15, 2021 at 7:27 AM Jose E. Roman wrote: > > > >> I will think about the viability of adding an interface function to pass the preconditioner matrix. > > > >> > > > >> Regarding the question about the B-orthogonality of computed vectors, in the symmetric solver the B-orthogonality is enforced during the computation, so you have guarantee that the computed vectors satisfy it. But if solved as non-symetric, the computed vectors may depart from B-orthogonality, unless the tolerance is very small. > > > >> > > > >> Yes, the vectors I generate are not B-orthogonal. > > > >> > > > >> Jose, do you think there is a way to reformulate what I am doing to use the symmetric solver, even if we only have the action of B? > > > > > > > > Yes, you can do the following: > > > > > > > > ierr = EPSSetOperators(eps,S,NULL);CHKERRQ(ierr); // S is your shell matrix A^{-1}*B > > > > ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRQ(ierr); // symmetric problem though S is not symmetric > > > > ierr = EPSSetFromOptions(eps);CHKERRQ(ierr); > > > > ierr = EPSSetUp(eps);CHKERRQ(ierr); // note explicitly calling setup here > > > > ierr = EPSGetBV(eps,&bv);CHKERRQ(ierr); > > > > ierr = BVSetMatrix(bv,B,PETSC_FALSE);CHKERRQ(ierr); // replace solver's inner product > > > > ierr = EPSSolve(eps);CHKERRQ(ierr); > > > > > > > > I have tried this with test1.c and it works. The computed eigenvectors should be B-orthogonal in this case. > > > > > > > > Jose > > > > > > > > > > > >> > > > >> Thanks, > > > >> > > > >> Matt > > > >> > > > >> Jose > > > >> > > > >> > > > >>> El 14 feb 2021, a las 21:41, Barry Smith escribi?: > > > >>> > > > >>> > > > >>> Florian, > > > >>> > > > >>> I'm sorry I don't know the answers; I can only speculate. There is a STGetShift(). 
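[In slepc4py, the shift Barry refers to above could presumably be queried like this; es is the assumed EPS object from the snippets in this thread.]

st = es.getST()
sigma = st.getShift()   # the sigma that ST uses in (A0 - sigma*B0)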
> > > >>> > > > >>> All I was saying is theoretically there could/should be such support in SLEPc. > > > >>> > > > >>> Barry > > > >>> > > > >>> > > > >>>> On Feb 13, 2021, at 6:43 PM, Florian Bruckner wrote: > > > >>>> > > > >>>> Dear Barry, > > > >>>> thank you for your clarification. What I wanted to say is that even if I could reset the KSP operators directly I would require to know which transformation ST applies in order to provide the preconditioning matrix for the correct operator. > > > >>>> The more general solution would be that SLEPc provides the interface to pass the preconditioning matrix for A0 and ST applies the same transformations as for the operator. > > > >>>> > > > >>>> If you write "SLEPc could provide an interface", do you mean someone should implement it, or should it already be possible and I am not using it correctly? > > > >>>> I wrote a small standalone example based on ex9.py from slepc4py, where i tried to use an operator. > > > >>>> > > > >>>> best wishes > > > >>>> Florian > > > >>>> > > > >>>> On Sat, Feb 13, 2021 at 7:15 PM Barry Smith wrote: > > > >>>> > > > >>>> > > > >>>>> On Feb 13, 2021, at 2:47 AM, Pierre Jolivet wrote: > > > >>>>> > > > >>>>> > > > >>>>> > > > >>>>>> On 13 Feb 2021, at 7:25 AM, Florian Bruckner wrote: > > > >>>>>> > > > >>>>>> Dear Jose, Dear Barry, > > > >>>>>> thanks again for your reply. One final question about the B0 orthogonality. Do you mean that eigenvectors are not B0 orthogonal, but they are i*B0 orthogonal? or is there an issue with Matt's approach? > > > >>>>>> For my problem I can show that eigenvalues fulfill an orthogonality relation (phi_i, A0 phi_j ) = omega_i (phi_i, B0 phi_j) = delta_ij. This should be independent of the solving method, right? > > > >>>>>> > > > >>>>>> Regarding Barry's advice this is what I first tried: > > > >>>>>> es = SLEPc.EPS().create(comm=fd.COMM_WORLD) > > > >>>>>> st = es.getST() > > > >>>>>> ksp = st.getKSP() > > > >>>>>> ksp.setOperators(self.A0, self.P0) > > > >>>>>> > > > >>>>>> But it seems that the provided P0 is not used. Furthermore the interface is maybe a bit confusing if ST performs some transformation. In this case P0 needs to approximate A0^{-1}*B0 and not A0, right? > > > >>>>> > > > >>>>> No, you need to approximate (A0-sigma B0)^-1. If you have a null shift, which looks like it is the case, you end up with A0^-1. > > > >>>> > > > >>>> Just trying to provide more clarity with the terms. > > > >>>> > > > >>>> If ST transforms the operator in the KSP to (A0-sigma B0) and you are providing the "sparse matrix from which the preconditioner is to be built" then you need to provide something that approximates (A0-sigma B0). Since the PC will use your matrix to construct a preconditioner that approximates the inverse of (A0-sigma B0), you don't need to directly provide something that approximates (A0-sigma B0)^-1 > > > >>>> > > > >>>> Yes, I would think SLEPc could provide an interface where it manages "the matrix from which to construct the preconditioner" and transforms that matrix just like the true matrix. To do it by hand you simply need to know what A0 and B0 are and which sigma ST has selected and then you can construct your modA0 - sigma modB0 and pass it to the KSP. Where modA0 and modB0 are your "sparser approximations". > > > >>>> > > > >>>> Barry > > > >>>> > > > >>>> > > > >>>>> > > > >>>>>> Nevertheless I think it would be the best solution if one could provide P0 (approx A0) and SLEPc derives the preconditioner from this. Would this be hard to implement? 
> > > >>>>> > > > >>>>> This is what Barry?s suggestion is implementing. Don?t know why it doesn?t work with your Python operator though. > > > >>>>> > > > >>>>> Thanks, > > > >>>>> Pierre > > > >>>>> > > > >>>>>> best wishes > > > >>>>>> Florian > > > >>>>>> > > > >>>>>> > > > >>>>>> On Sat, Feb 13, 2021 at 4:19 AM Barry Smith wrote: > > > >>>>>> > > > >>>>>> > > > >>>>>>> On Feb 12, 2021, at 2:32 AM, Florian Bruckner wrote: > > > >>>>>>> > > > >>>>>>> Dear Jose, Dear Matt, > > > >>>>>>> > > > >>>>>>> I needed some time to think about your answers. > > > >>>>>>> If I understand correctly, the eigenmode solver internally uses A0^{-1}*B0, which is normally handled by the ST object, which creates a KSP solver and a corresponding preconditioner. > > > >>>>>>> What I would need is an interface to provide not only the system Matrix A0 (which is an operator), but also a preconditioning matrix (sparse approximation of the operator). > > > >>>>>>> Unfortunately this interface is not available, right? > > > >>>>>> > > > >>>>>> If SLEPc does not provide this directly it is still intended to be trivial to provide the "preconditioner matrix" (that is matrix from which the preconditioner is built). Just get the KSP from the ST object and use KSPSetOperators() to provide the "preconditioner matrix" . > > > >>>>>> > > > >>>>>> Barry > > > >>>>>> > > > >>>>>>> > > > >>>>>>> Matt directly creates A0^{-1}*B0 as a matshell operator. The operator uses a KSP with a proper PC internally. SLEPc would directly get A0^{-1}*B0 and solve a standard eigenvalue problem with this modified operator. Did I understand this correctly? > > > >>>>>>> > > > >>>>>>> I have two further points, which I did not mention yet: the matrix B0 is Hermitian, but it is (purely) imaginary (B0.real=0). Right now, I am using Firedrake to set up the PETSc system matrices A0, i*B0 (which is real). Then I convert them into ScipyLinearOperators and use scipy.sparse.eigsh(B0, b=A0, Minv=Minv) to calculate the eigenvalues. Minv=A0^-1 is also solving within scipy using a preconditioned gmres. Advantage of this setup is that the imaginary B0 can be handled efficiently and also the post-processing of the eigenvectors (which requires complex arithmetics) is simplified. > > > >>>>>>> > > > >>>>>>> Nevertheless I think that the mixing of PETSc and Scipy looks too complicated and is not very flexible. > > > >>>>>>> If I would use Matt's approach, could I then simply switch between multiple standard eigenvalue methods (e.g. LOBPCG)? or is it limited due to the use of matshell? > > > >>>>>>> Is there a solution for the imaginary B0, or do I have to use the non-hermitian methods? Is this a large performance drawback? > > > >>>>>>> > > > >>>>>>> thanks again, > > > >>>>>>> and best wishes > > > >>>>>>> Florian > > > >>>>>>> > > > >>>>>>> On Mon, Feb 8, 2021 at 3:37 PM Jose E. Roman wrote: > > > >>>>>>> The problem can be written as A0*v=omega*B0*v and you want the eigenvalues omega closest to zero. If the matrices were explicitly available, you would do shift-and-invert with target=0, that is > > > >>>>>>> > > > >>>>>>> (A0-sigma*B0)^{-1}*B0*v=theta*v for sigma=0, that is > > > >>>>>>> > > > >>>>>>> A0^{-1}*B0*v=theta*v > > > >>>>>>> > > > >>>>>>> and you compute EPS_LARGEST_MAGNITUDE eigenvalues theta=1/omega. > > > >>>>>>> > > > >>>>>>> Matt: I guess you should have EPS_LARGEST_MAGNITUDE instead of EPS_SMALLEST_REAL in your code. Are you getting the eigenvalues you need? EPS_SMALLEST_REAL will give slow convergence. 
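[For Matt's workaround quoted above, where the standard problem is solved on the shell operator S = A0^{-1}*B0 without ST doing the transform, the back-transformation theta = 1/omega would be applied by hand after the solve, roughly as in this sketch; eps is the assumed EPS object used on S.]

nconv = eps.getConverged()
for i in range(nconv):
    theta = eps.getEigenvalue(i)   # eigenvalue of A0^{-1}*B0
    omega = 1.0 / theta            # eigenvalue of the original pencil B0*v = (1/omega)*A0*v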
> > > >>>>>>> > > > >>>>>>> Florian: I would not recommend setting the KSP matrices directly, it may produce strange side-effects. We should have an interface function to pass this matrix. Currently there is STPrecondSetMatForPC() but it has two problems: (1) it is intended for STPRECOND, so cannot be used with Krylov-Schur, and (2) it is not currently available in the python interface. > > > >>>>>>> > > > >>>>>>> The approach used by Matt is a workaround that does not use ST, so you can handle linear solves with a KSP of your own. > > > >>>>>>> > > > >>>>>>> As an alternative, since your problem is symmetric, you could try LOBPCG, assuming that the leftmost eigenvalues are those that you want (e.g. if all eigenvalues are non-negative). In that case you could use STPrecondSetMatForPC(), but the remaining issue is calling it from python. > > > >>>>>>> > > > >>>>>>> If you are using the git repo, I could add the relevant code. > > > >>>>>>> > > > >>>>>>> Jose > > > >>>>>>> > > > >>>>>>> > > > >>>>>>> > > > >>>>>>>> El 8 feb 2021, a las 14:22, Matthew Knepley escribi?: > > > >>>>>>>> > > > >>>>>>>> On Mon, Feb 8, 2021 at 7:04 AM Florian Bruckner wrote: > > > >>>>>>>> Dear PETSc / SLEPc Users, > > > >>>>>>>> > > > >>>>>>>> my question is very similar to the one posted here: > > > >>>>>>>> https://lists.mcs.anl.gov/pipermail/petsc-users/2018-August/035878.html > > > >>>>>>>> > > > >>>>>>>> The eigensystem I would like to solve looks like: > > > >>>>>>>> B0 v = 1/omega A0 v > > > >>>>>>>> B0 and A0 are both hermitian, A0 is positive definite, but only given as a linear operator (matshell). I am looking for the largest eigenvalues (=smallest omega). > > > >>>>>>>> > > > >>>>>>>> I also have a sparse approximation P0 of the A0 operator, which i would like to use as precondtioner, using something like this: > > > >>>>>>>> > > > >>>>>>>> es = SLEPc.EPS().create(comm=fd.COMM_WORLD) > > > >>>>>>>> st = es.getST() > > > >>>>>>>> ksp = st.getKSP() > > > >>>>>>>> ksp.setOperators(self.A0, self.P0) > > > >>>>>>>> > > > >>>>>>>> Unfortunately PETSc still complains that it cannot create a preconditioner for a type 'python' matrix although P0.type == 'seqaij' (but A0.type == 'python'). > > > >>>>>>>> By the way, should P0 be an approximation of A0 or does it have to include B0? > > > >>>>>>>> > > > >>>>>>>> Right now I am using the krylov-schur method. Are there any alternatives if A0 is only given as an operator? > > > >>>>>>>> > > > >>>>>>>> Jose can correct me if I say something wrong. > > > >>>>>>>> > > > >>>>>>>> When I did this, I made a shell operator for the action of A0^{-1} B0 which has a KSPSolve() in it, so you can use your P0 preconditioning matrix, and > > > >>>>>>>> then handed that to EPS. You can see me do it here: > > > >>>>>>>> > > > >>>>>>>> https://gitlab.com/knepley/bamg/-/blob/master/src/coarse/bamgCoarseSpace.c#L123 > > > >>>>>>>> > > > >>>>>>>> I had a hard time getting the embedded solver to work the way I wanted, but maybe that is the better way. > > > >>>>>>>> > > > >>>>>>>> Thanks, > > > >>>>>>>> > > > >>>>>>>> Matt > > > >>>>>>>> > > > >>>>>>>> thanks for any advice > > > >>>>>>>> best wishes > > > >>>>>>>> Florian > > > >>>>>>>> > > > >>>>>>>> > > > >>>>>>>> -- > > > >>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
> > > >>>>>>>> -- Norbert Wiener > > > >>>>>>>> > > > >>>>>>>> https://www.cse.buffalo.edu/~knepley/ > > > >>>>>>> > > > >>>>>> > > > >>>>> > > > >>>> > > > >>>> > > > >>> > > > >> > > > >> > > > >> > > > >> -- > > > >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > > >> -- Norbert Wiener > > > >> > > > >> https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From wence at gmx.li Tue Mar 9 08:41:46 2021 From: wence at gmx.li (Lawrence Mitchell) Date: Tue, 9 Mar 2021 14:41:46 +0000 Subject: [petsc-users] using preconditioner with SLEPc In-Reply-To: <4639F19E-5B56-43F9-A617-FC44195E462F@dsic.upv.es> References: <7C5B30FE-C539-4A14-B442-B1C91618E4AC@petsc.dev> <119944FD-4F1E-4B2F-A39D-65ADDB12BB5F@petsc.dev> <6EF7889D-DC17-46FC-82A5-9409C41E231D@petsc.dev> <46C744D7-4376-46B3-B5C4-211A4C8C2291@dsic.upv.es> <80BCEEDC-4C1E-4512-AAF5-7B6E718C7D1D@dsic.upv.es> <1DCC9C38-D49E-4A02-8EE3-B3701618914A@dsic.upv.es> <4639F19E-5B56-43F9-A617-FC44195E462F@dsic.upv.es> Message-ID: <2DF988D3-E268-4D5E-BFA2-FDC79CECE0F1@gmx.li> > On 9 Mar 2021, at 14:17, Jose E. Roman wrote: > > When I added this, I changed it to default to gmres+bjacobi when EPSSetPreconditionerMat() is called, but for some reason in your case it is not doing it. If it's Firedrake who is setting preonly+lu I would say it is not necessary (that is the default). I don't think we do any setting of SLEPc object options (we do for KSP, but we don't wrap up SLEPc objects at all). Lawrence From wence at gmx.li Tue Mar 9 17:06:16 2021 From: wence at gmx.li (Lawrence Mitchell) Date: Tue, 9 Mar 2021 23:06:16 +0000 Subject: [petsc-users] DMPlex in Firedrake: scaling of mesh distribution In-Reply-To: References: Message-ID: <0B9D5B9C-81FF-4619-BF1D-0539BED52CDC@gmx.li> Dear Alexei, I echo the comments that Barry and others have made. Some more in line below. > On 5 Mar 2021, at 21:06, Alexei Colin wrote: > > To PETSc DMPlex users, Firedrake users, Dr. Knepley and Dr. Karpeev: > > Is it expected for mesh distribution step to > (A) take a share of 50-99% of total time-to-solution of an FEM problem, and We hope not! > (B) take an amount of time that increases with the number of ranks, and > (C) take an amount of memory on rank 0 that does not decrease with the > number of ranks This is a consequence, as Matt notes, of us making a serial mesh and then doing a one to all distribution. > > 1a. Is mesh distribution fundamentally necessary for any FEM framework, > or is it only needed by Firedrake? If latter, then how do other > frameworks partition the mesh and execute in parallel with MPI but avoid > the non-scalable mesh destribution step? Matt points out that we should do something smarter (namely make and distribute a small mesh from serial to parallel and then do refinement and repartitioning in parallel). This is not implemented out of the box, but here is some code that (in up to date Firedrake/petsc) does that from firedrake import * from firedrake.cython.dmcommon import CELL_SETS_LABEL, FACE_SETS_LABEL from firedrake.cython.mgimpl import filter_labels from firedrake.petsc import PETSc # Create a small mesh that is cheap to distribute. 
mesh = UnitSquareMesh(10, 10)
dm = mesh.topology_dm
dm.setRefinementUniform(True)
# Refine it a bunch of times, edge midpoint division.
rdm = dm.refine()
rdm = rdm.refine()
rdm = rdm.refine()
# Remove some labels that will be reconstructed.
filter_labels(rdm, rdm.getHeightStratum(1), "exterior_facets", "boundary_faces", FACE_SETS_LABEL)
filter_labels(rdm, rdm.getHeightStratum(0), CELL_SETS_LABEL)
for label in ["interior_facets", "pyop2_core", "pyop2_owned", "pyop2_ghost"]:
    rdm.removeLabel(label)
# Redistributed for better load balanced partitions (this is in parallel).
rdm.distribute()
# Now make the firedrake mesh object.
rmesh = Mesh(rdm, distribution_parameters={"partition": False})
# Now do things in parallel.

This is probably something we should push into the library (it's quite fiddly!), so if you can try it out easily and check that it works please let us know! Thanks, Lawrence From fdkong.jd at gmail.com Wed Mar 10 10:11:54 2021 From: fdkong.jd at gmail.com (Fande Kong) Date: Wed, 10 Mar 2021 09:11:54 -0700 Subject: [petsc-users] Exhausted all shared linker guesses. Could not determine how to create a shared library! In-Reply-To: <91DF0A8C-547A-42BC-81BA-ADFF7CA948C8@petsc.dev> References: <91DF0A8C-547A-42BC-81BA-ADFF7CA948C8@petsc.dev> Message-ID: Thanks, Barry, It seems PETSc works fine with manually built compilers. We are pretty much sure that the issue is related to conda. Conda might introduce extra flags. We still need to make it work with conda because we deliver our package via conda for users. I unset all flags from conda, and got slightly different results this time. The log was attached. Anyone could explain the motivation that we try to build executable without a main function? Thanks, Fande Executing: mpicc -c -o /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o -I/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers -fPIC /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.c Successful compile: Source: #include "confdefs.h" #include "conffix.h" #include int (*fprintf_ptr)(FILE*,const char*,...) = fprintf; void foo(void){ fprintf_ptr(stdout,"hello"); return; } void bar(void){foo();} Running Executable WITHOUT threads to time it out Executing: mpicc -o /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/libconftest.so -dynamic -fPIC /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o Possible ERROR while running linker: exit code 1 stderr: Undefined symbols for architecture x86_64: "_main", referenced from: implicit entry/start for main executable ld: symbol(s) not found for architecture x86_64 clang-11: error: linker command failed with exit code 1 (use -v to see invocation) Rejected C compiler flag -fPIC because it was not compatible with shared linker mpicc using flags ['-dynamic'] On Mon, Mar 8, 2021 at 7:28 PM Barry Smith wrote: > > Fande, > > I see you are using CONDA, this can cause issues since it sticks all > kinds of things into the environment. PETSc tries to remove some of them > but perhaps not enough. If you run printenv you will see all the mess it is > dumping in. > > Can you trying the same build without CONDA environment? > > Barry > > > On Mar 8, 2021, at 7:31 PM, Matthew Knepley wrote: > > On Mon, Mar 8, 2021 at 8:23 PM Fande Kong wrote: > >> Thanks Matthew, >> >> Hmm, we still have the same issue after shutting off all unknown flags.
>> > > Oh, I was misinterpreting the error message: > > ld: can't link with a main executable file > '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' > > So clang did not _actually_ make a shared library, it made an executable. > Did clang-11 change the options it uses to build a shared library? > > Satish, do we test with clang-11? > > Thanks, > > Matt > > Thanks, >> >> Fande >> >> On Mon, Mar 8, 2021 at 6:07 PM Matthew Knepley wrote: >> >>> On Mon, Mar 8, 2021 at 7:55 PM Fande Kong wrote: >>> >>>> Hi All, >>>> >>>> mpicc rejected "-fPIC". Anyone has a clue how to work around this issue? >>>> >>> >>> The failure is at the last step >>> >>> Executing: mpicc -o >>> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest >>> -fPIC >>> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest.o >>> -L/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers >>> -lconftest >>> Possible ERROR while running linker: exit code 1 >>> stderr: >>> ld: can't link with a main executable file >>> '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' >>> for architecture x86_64 >>> clang-11: error: linker command failed with exit code 1 (use -v to see >>> invocation) >>> >>> but you have some flags stuck in which may or may not affect this. I >>> would try shutting them off: >>> >>> LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs -rpath >>> /Users/kongf/miniconda3/envs/moose/lib >>> -L/Users/kongf/miniconda3/envs/moose/lib >>> >>> I cannot tell exactly why clang is failing because it does not report a >>> specific error. >>> >>> Thanks, >>> >>> Matt >>> >>> The log was attached. >>>> >>>> Thanks so much, >>>> >>>> Fande >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 212451 bytes Desc: not available URL: From balay at mcs.anl.gov Wed Mar 10 10:34:19 2021 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 10 Mar 2021 10:34:19 -0600 Subject: [petsc-users] Exhausted all shared linker guesses. Could not determine how to create a shared library! In-Reply-To: References: <91DF0A8C-547A-42BC-81BA-ADFF7CA948C8@petsc.dev> Message-ID: <45707425-ed77-2c54-1465-19d326994d1e@mcs.anl.gov> On Wed, 10 Mar 2021, Fande Kong wrote: > Thanks, Barry, > > It seems PETSc works fine with manually built compilers. We are pretty much > sure that the issue is related to conda. Conda might introduce extra flags. > > We still need to make it work with conda because we deliver our package via > conda for users. > > > I unset all flags from conda, and got slightly different results this > time. The log was attached. Anyone could explain the motivation that we > try to build executable without a main function? 
Its attempting to build a shared library - which shouldn't have a main. However - the primary issue is: >>>>> Executing: mpicc -o /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o -L/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers -lconftest Possible ERROR while running linker: exit code 1 stderr: ld: can't link with a main executable file '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/libconftest.dylib' for architecture x86_64 clang-11: error: linker command failed with exit code 1 (use -v to see invocation) <<< Here its built a .dylib - and attempting to use it. But the compiler gives the above error. Can you build with conda compilers - but outside conda env? It appears to have: AS=/Users/kongf/miniconda3/envs/testpetsc/bin/x86_64-apple-darwin13.4.0-as AR=/Users/kongf/miniconda3/envs/testpetsc/bin/x86_64-apple-darwin13.4.0-ar CXX_FOR_BUILD=/Users/kongf/miniconda3/envs/testpetsc/bin/x86_64-apple-darwin13.4.0-clang++ INSTALL_NAME_TOOL=/Users/kongf/miniconda3/envs/testpetsc/bin/x86_64-apple-darwin13.4.0-install_name_tool CC_FOR_BUILD=/Users/kongf/miniconda3/envs/testpetsc/bin/x86_64-apple-darwin13.4.0-clang FC_FOR_BUILD=/Users/kongf/miniconda3/envs/testpetsc/bin/-gfortran LD=/Users/kongf/miniconda3/envs/testpetsc/bin/x86_64-apple-darwin13.4.0-ld Don't know if any one of them is making a difference. Also you might want to comment out "self.executeTest(self.resetEnvCompilers)" in setCompilers.py to see if its making a difference [same with self.checkEnvCompilers?] Satish > > Thanks, > > Fande > > Executing: mpicc -c -o > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > -I/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers > -fPIC > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.c > > Successful compile: > Source: > #include "confdefs.h" > #include "conffix.h" > #include > int (*fprintf_ptr)(FILE*,const char*,...) = fprintf; > void foo(void){ > fprintf_ptr(stdout,"hello"); > return; > } > void bar(void){foo();} > Running Executable WITHOUT threads to time it out > Executing: mpicc -o > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/libconftest.so > -dynamic -fPIC > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > > Possible ERROR while running linker: exit code 1 > stderr: > Undefined symbols for architecture x86_64: > "_main", referenced from: > implicit entry/start for main executable > ld: symbol(s) not found for architecture x86_64 > clang-11: error: linker command failed with exit code 1 (use -v to see > invocation) > Rejected C compiler flag -fPIC because it was not compatible with > shared linker mpicc using flags ['-dynamic'] > > > On Mon, Mar 8, 2021 at 7:28 PM Barry Smith wrote: > > > > > Fande, > > > > I see you are using CONDA, this can cause issues since it sticks all > > kinds of things into the environment. PETSc tries to remove some of them > > but perhaps not enough. If you run printenv you will see all the mess it is > > dumping in. > > > > Can you trying the same build without CONDA environment? 
> > > > Barry > > > > > > On Mar 8, 2021, at 7:31 PM, Matthew Knepley wrote: > > > > On Mon, Mar 8, 2021 at 8:23 PM Fande Kong wrote: > > > >> Thanks Matthew, > >> > >> Hmm, we still have the same issue after shutting off all unknown flags. > >> > > > > Oh, I was misinterpreting the error message: > > > > ld: can't link with a main executable file > > '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' > > > > So clang did not _actually_ make a shared library, it made an executable. > > Did clang-11 change the options it uses to build a shared library? > > > > Satish, do we test with clang-11? > > > > Thanks, > > > > Matt > > > > Thanks, > >> > >> Fande > >> > >> On Mon, Mar 8, 2021 at 6:07 PM Matthew Knepley wrote: > >> > >>> On Mon, Mar 8, 2021 at 7:55 PM Fande Kong wrote: > >>> > >>>> Hi All, > >>>> > >>>> mpicc rejected "-fPIC". Anyone has a clue how to work around this issue? > >>>> > >>> > >>> The failure is at the last step > >>> > >>> Executing: mpicc -o > >>> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest > >>> -fPIC > >>> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest.o > >>> -L/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers > >>> -lconftest > >>> Possible ERROR while running linker: exit code 1 > >>> stderr: > >>> ld: can't link with a main executable file > >>> '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' > >>> for architecture x86_64 > >>> clang-11: error: linker command failed with exit code 1 (use -v to see > >>> invocation) > >>> > >>> but you have some flags stuck in which may or may not affect this. I > >>> would try shutting them off: > >>> > >>> LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs -rpath > >>> /Users/kongf/miniconda3/envs/moose/lib > >>> -L/Users/kongf/miniconda3/envs/moose/lib > >>> > >>> I cannot tell exactly why clang is failing because it does not report a > >>> specific error. > >>> > >>> Thanks, > >>> > >>> Matt > >>> > >>> The log was attached. > >>>> > >>>> Thanks so much, > >>>> > >>>> Fande > >>>> > >>> > >>> > >>> -- > >>> What most experimenters take for granted before they begin their > >>> experiments is infinitely more interesting than any results to which their > >>> experiments lead. > >>> -- Norbert Wiener > >>> > >>> https://www.cse.buffalo.edu/~knepley/ > >>> > >>> > >> > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which their > > experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > From fdkong.jd at gmail.com Wed Mar 10 12:15:22 2021 From: fdkong.jd at gmail.com (Fande Kong) Date: Wed, 10 Mar 2021 11:15:22 -0700 Subject: [petsc-users] Exhausted all shared linker guesses. Could not determine how to create a shared library! In-Reply-To: <45707425-ed77-2c54-1465-19d326994d1e@mcs.anl.gov> References: <91DF0A8C-547A-42BC-81BA-ADFF7CA948C8@petsc.dev> <45707425-ed77-2c54-1465-19d326994d1e@mcs.anl.gov> Message-ID: On Wed, Mar 10, 2021 at 9:34 AM Satish Balay wrote: > On Wed, 10 Mar 2021, Fande Kong wrote: > > > Thanks, Barry, > > > > It seems PETSc works fine with manually built compilers. We are pretty > much > > sure that the issue is related to conda. Conda might introduce extra > flags. 
> > > > We still need to make it work with conda because we deliver our package > via > > conda for users. > > > > > > I unset all flags from conda, and got slightly different results this > > time. The log was attached. Anyone could explain the motivation that we > > try to build executable without a main function? > > Its attempting to build a shared library - which shouldn't have a main. > > However - the primary issue is: > > >>>>> > Executing: mpicc -o > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > -L/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers > -lconftest > Possible ERROR while running linker: exit code 1 > stderr: > ld: can't link with a main executable file > '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/libconftest.dylib' > for architecture x86_64 > clang-11: error: linker command failed with exit code 1 (use -v to see > invocation) > <<< > > Here its built a .dylib - and attempting to use it. But the compiler gives > the above error. > > Can you build with conda compilers - but outside conda env? It appears to > have: > I am already outside of conda (build), but in order to use conda compilers, we need a few variables. > > AS=/Users/kongf/miniconda3/envs/testpetsc/bin/x86_64-apple-darwin13.4.0-as > AR=/Users/kongf/miniconda3/envs/testpetsc/bin/x86_64-apple-darwin13.4.0-ar > > CXX_FOR_BUILD=/Users/kongf/miniconda3/envs/testpetsc/bin/x86_64-apple-darwin13.4.0-clang++ > > INSTALL_NAME_TOOL=/Users/kongf/miniconda3/envs/testpetsc/bin/x86_64-apple-darwin13.4.0-install_name_tool > > CC_FOR_BUILD=/Users/kongf/miniconda3/envs/testpetsc/bin/x86_64-apple-darwin13.4.0-clang > FC_FOR_BUILD=/Users/kongf/miniconda3/envs/testpetsc/bin/-gfortran > LD=/Users/kongf/miniconda3/envs/testpetsc/bin/x86_64-apple-darwin13.4.0-ld > > Don't know if any one of them is making a difference. > I unset all these variables, and did not make a difference. > > Also you might want to comment out > "self.executeTest(self.resetEnvCompilers)" in setCompilers.py to see if its > making a difference [same with self.checkEnvCompilers? > Still the same issue. Might it is just a bad compiler Thanks for your help Fande, > > > Satish > > > > > > Thanks, > > > > Fande > > > > Executing: mpicc -c -o > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > > > -I/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers > > -fPIC > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.c > > > > Successful compile: > > Source: > > #include "confdefs.h" > > #include "conffix.h" > > #include > > int (*fprintf_ptr)(FILE*,const char*,...) 
= fprintf; > > void foo(void){ > > fprintf_ptr(stdout,"hello"); > > return; > > } > > void bar(void){foo();} > > Running Executable WITHOUT threads to time it out > > Executing: mpicc -o > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/libconftest.so > > -dynamic -fPIC > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > > > > Possible ERROR while running linker: exit code 1 > > stderr: > > Undefined symbols for architecture x86_64: > > "_main", referenced from: > > implicit entry/start for main executable > > ld: symbol(s) not found for architecture x86_64 > > clang-11: error: linker command failed with exit code 1 (use -v to see > > invocation) > > Rejected C compiler flag -fPIC because it was not compatible > with > > shared linker mpicc using flags ['-dynamic'] > > > > > > On Mon, Mar 8, 2021 at 7:28 PM Barry Smith wrote: > > > > > > > > Fande, > > > > > > I see you are using CONDA, this can cause issues since it sticks > all > > > kinds of things into the environment. PETSc tries to remove some of > them > > > but perhaps not enough. If you run printenv you will see all the mess > it is > > > dumping in. > > > > > > Can you trying the same build without CONDA environment? > > > > > > Barry > > > > > > > > > On Mar 8, 2021, at 7:31 PM, Matthew Knepley wrote: > > > > > > On Mon, Mar 8, 2021 at 8:23 PM Fande Kong wrote: > > > > > >> Thanks Matthew, > > >> > > >> Hmm, we still have the same issue after shutting off all unknown > flags. > > >> > > > > > > Oh, I was misinterpreting the error message: > > > > > > ld: can't link with a main executable file > > > > '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' > > > > > > So clang did not _actually_ make a shared library, it made an > executable. > > > Did clang-11 change the options it uses to build a shared library? > > > > > > Satish, do we test with clang-11? > > > > > > Thanks, > > > > > > Matt > > > > > > Thanks, > > >> > > >> Fande > > >> > > >> On Mon, Mar 8, 2021 at 6:07 PM Matthew Knepley > wrote: > > >> > > >>> On Mon, Mar 8, 2021 at 7:55 PM Fande Kong > wrote: > > >>> > > >>>> Hi All, > > >>>> > > >>>> mpicc rejected "-fPIC". Anyone has a clue how to work around this > issue? > > >>>> > > >>> > > >>> The failure is at the last step > > >>> > > >>> Executing: mpicc -o > > >>> > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest > > >>> -fPIC > > >>> > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest.o > > >>> > -L/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers > > >>> -lconftest > > >>> Possible ERROR while running linker: exit code 1 > > >>> stderr: > > >>> ld: can't link with a main executable file > > >>> > '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' > > >>> for architecture x86_64 > > >>> clang-11: error: linker command failed with exit code 1 (use -v to > see > > >>> invocation) > > >>> > > >>> but you have some flags stuck in which may or may not affect this. I > > >>> would try shutting them off: > > >>> > > >>> LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs > -rpath > > >>> /Users/kongf/miniconda3/envs/moose/lib > > >>> -L/Users/kongf/miniconda3/envs/moose/lib > > >>> > > >>> I cannot tell exactly why clang is failing because it does not > report a > > >>> specific error. 
> > >>> > > >>> Thanks, > > >>> > > >>> Matt > > >>> > > >>> The log was attached. > > >>>> > > >>>> Thanks so much, > > >>>> > > >>>> Fande > > >>>> > > >>> > > >>> > > >>> -- > > >>> What most experimenters take for granted before they begin their > > >>> experiments is infinitely more interesting than any results to which > their > > >>> experiments lead. > > >>> -- Norbert Wiener > > >>> > > >>> https://www.cse.buffalo.edu/~knepley/ > > >>> > > >>> > > >> > > > > > > -- > > > What most experimenters take for granted before they begin their > > > experiments is infinitely more interesting than any results to which > their > > > experiments lead. > > > -- Norbert Wiener > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 231658 bytes Desc: not available URL: From balay at mcs.anl.gov Wed Mar 10 12:34:14 2021 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 10 Mar 2021 12:34:14 -0600 Subject: [petsc-users] Exhausted all shared linker guesses. Could not determine how to create a shared library! In-Reply-To: References: <91DF0A8C-547A-42BC-81BA-ADFF7CA948C8@petsc.dev> <45707425-ed77-2c54-1465-19d326994d1e@mcs.anl.gov> Message-ID: <8fdb8baf-4a78-108f-c76c-89bacbf0c5f5@mcs.anl.gov> On Wed, 10 Mar 2021, Fande Kong wrote: > On Wed, Mar 10, 2021 at 9:34 AM Satish Balay wrote: > > > On Wed, 10 Mar 2021, Fande Kong wrote: > > > > > Thanks, Barry, > > > > > > It seems PETSc works fine with manually built compilers. We are pretty > > much > > > sure that the issue is related to conda. Conda might introduce extra > > flags. > > > > > > We still need to make it work with conda because we deliver our package > > via > > > conda for users. > > > > > > > > > I unset all flags from conda, and got slightly different results this > > > time. The log was attached. Anyone could explain the motivation that we > > > try to build executable without a main function? > > > > Its attempting to build a shared library - which shouldn't have a main. > > > > However - the primary issue is: > > > > >>>>> > > Executing: mpicc -o > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > > -L/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers > > -lconftest > > Possible ERROR while running linker: exit code 1 > > stderr: > > ld: can't link with a main executable file > > '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/libconftest.dylib' > > for architecture x86_64 > > clang-11: error: linker command failed with exit code 1 (use -v to see > > invocation) > > <<< > > > > Here its built a .dylib - and attempting to use it. But the compiler gives > > the above error. > > > > Can you build with conda compilers - but outside conda env? It appears to > > have: > > > > I am already outside of conda (build), but in order to use conda compilers, > we need a few variables. Can you send the configure.log for this build outside conda - but using conda tools. 
And what do you get for: /Users/kongf/miniconda3/envs/testpetsc/bin/mpicc -show /Users/kongf/miniconda3/envs/testpetsc/bin/x86_64-apple-darwin13.4.0-clang -v src/benchmarks/sizeof.c > > > > AS=/Users/kongf/miniconda3/envs/testpetsc/bin/x86_64-apple-darwin13.4.0-as > > AR=/Users/kongf/miniconda3/envs/testpetsc/bin/x86_64-apple-darwin13.4.0-ar > > > > CXX_FOR_BUILD=/Users/kongf/miniconda3/envs/testpetsc/bin/x86_64-apple-darwin13.4.0-clang++ > > > > INSTALL_NAME_TOOL=/Users/kongf/miniconda3/envs/testpetsc/bin/x86_64-apple-darwin13.4.0-install_name_tool > > > > CC_FOR_BUILD=/Users/kongf/miniconda3/envs/testpetsc/bin/x86_64-apple-darwin13.4.0-clang > > FC_FOR_BUILD=/Users/kongf/miniconda3/envs/testpetsc/bin/-gfortran > > LD=/Users/kongf/miniconda3/envs/testpetsc/bin/x86_64-apple-darwin13.4.0-ld > > > > Don't know if any one of them is making a difference. > > > > I unset all these variables, and did not make a difference. > > > > > > Also you might want to comment out > > "self.executeTest(self.resetEnvCompilers)" in setCompilers.py to see if its > > making a difference [same with self.checkEnvCompilers? > > > > Still the same issue. Might it is just a bad compiler Perhaps you can try to reproduce the above test outside configure [i.e create a .dylib, and attempt to link to it] - and attempt to debug it further. If its a broken conda compiler [or conda-ld] perhaps there is a way to use compilers/ld from outside conda - from this conda build. Satish - > > > Thanks for your help > > Fande, > > > > > > > > Satish > > > > > > > > > > Thanks, > > > > > > Fande > > > > > > Executing: mpicc -c -o > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > > > > > -I/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers > > > -fPIC > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.c > > > > > > Successful compile: > > > Source: > > > #include "confdefs.h" > > > #include "conffix.h" > > > #include > > > int (*fprintf_ptr)(FILE*,const char*,...) = fprintf; > > > void foo(void){ > > > fprintf_ptr(stdout,"hello"); > > > return; > > > } > > > void bar(void){foo();} > > > Running Executable WITHOUT threads to time it out > > > Executing: mpicc -o > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/libconftest.so > > > -dynamic -fPIC > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > > > > > > Possible ERROR while running linker: exit code 1 > > > stderr: > > > Undefined symbols for architecture x86_64: > > > "_main", referenced from: > > > implicit entry/start for main executable > > > ld: symbol(s) not found for architecture x86_64 > > > clang-11: error: linker command failed with exit code 1 (use -v to see > > > invocation) > > > Rejected C compiler flag -fPIC because it was not compatible > > with > > > shared linker mpicc using flags ['-dynamic'] > > > > > > > > > On Mon, Mar 8, 2021 at 7:28 PM Barry Smith wrote: > > > > > > > > > > > Fande, > > > > > > > > I see you are using CONDA, this can cause issues since it sticks > > all > > > > kinds of things into the environment. PETSc tries to remove some of > > them > > > > but perhaps not enough. If you run printenv you will see all the mess > > it is > > > > dumping in. > > > > > > > > Can you trying the same build without CONDA environment? 
> > > > > > > > Barry > > > > > > > > > > > > On Mar 8, 2021, at 7:31 PM, Matthew Knepley wrote: > > > > > > > > On Mon, Mar 8, 2021 at 8:23 PM Fande Kong wrote: > > > > > > > >> Thanks Matthew, > > > >> > > > >> Hmm, we still have the same issue after shutting off all unknown > > flags. > > > >> > > > > > > > > Oh, I was misinterpreting the error message: > > > > > > > > ld: can't link with a main executable file > > > > > > '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' > > > > > > > > So clang did not _actually_ make a shared library, it made an > > executable. > > > > Did clang-11 change the options it uses to build a shared library? > > > > > > > > Satish, do we test with clang-11? > > > > > > > > Thanks, > > > > > > > > Matt > > > > > > > > Thanks, > > > >> > > > >> Fande > > > >> > > > >> On Mon, Mar 8, 2021 at 6:07 PM Matthew Knepley > > wrote: > > > >> > > > >>> On Mon, Mar 8, 2021 at 7:55 PM Fande Kong > > wrote: > > > >>> > > > >>>> Hi All, > > > >>>> > > > >>>> mpicc rejected "-fPIC". Anyone has a clue how to work around this > > issue? > > > >>>> > > > >>> > > > >>> The failure is at the last step > > > >>> > > > >>> Executing: mpicc -o > > > >>> > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest > > > >>> -fPIC > > > >>> > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest.o > > > >>> > > -L/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers > > > >>> -lconftest > > > >>> Possible ERROR while running linker: exit code 1 > > > >>> stderr: > > > >>> ld: can't link with a main executable file > > > >>> > > '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' > > > >>> for architecture x86_64 > > > >>> clang-11: error: linker command failed with exit code 1 (use -v to > > see > > > >>> invocation) > > > >>> > > > >>> but you have some flags stuck in which may or may not affect this. I > > > >>> would try shutting them off: > > > >>> > > > >>> LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs > > -rpath > > > >>> /Users/kongf/miniconda3/envs/moose/lib > > > >>> -L/Users/kongf/miniconda3/envs/moose/lib > > > >>> > > > >>> I cannot tell exactly why clang is failing because it does not > > report a > > > >>> specific error. > > > >>> > > > >>> Thanks, > > > >>> > > > >>> Matt > > > >>> > > > >>> The log was attached. > > > >>>> > > > >>>> Thanks so much, > > > >>>> > > > >>>> Fande > > > >>>> > > > >>> > > > >>> > > > >>> -- > > > >>> What most experimenters take for granted before they begin their > > > >>> experiments is infinitely more interesting than any results to which > > their > > > >>> experiments lead. > > > >>> -- Norbert Wiener > > > >>> > > > >>> https://www.cse.buffalo.edu/~knepley/ > > > >>> > > > >>> > > > >> > > > > > > > > -- > > > > What most experimenters take for granted before they begin their > > > > experiments is infinitely more interesting than any results to which > > their > > > > experiments lead. > > > > -- Norbert Wiener > > > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > > > > > > > > > > > > > > From bsmith at petsc.dev Wed Mar 10 13:05:47 2021 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 10 Mar 2021 13:05:47 -0600 Subject: [petsc-users] Exhausted all shared linker guesses. Could not determine how to create a shared library! 
In-Reply-To: References: <91DF0A8C-547A-42BC-81BA-ADFF7CA948C8@petsc.dev> Message-ID: Fande, Please add in config/BuildSystem/config/framework.py line 528 two new lines # pgi dumps filename on stderr - but returns 0 errorcode' lines = [s for s in lines if lines != 'conftest.c:'] # in case -pie is always being passed to linker lines = [s for s in lines if s.find('-pie being ignored. It is only used when linking a main executable') < 0] Barry You have (another of Conda's "take over the world my way" approach) LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs -rpath /Users/kongf/miniconda3/envs/testpetsc/lib -L/Users/kongf/miniconda3/envs/testpetsc/lib Executing: mpicc -o /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest -dynamiclib -single_module /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o Possible ERROR while running linker: stderr: ld: warning: -pie being ignored. It is only used when linking a main executable Rejecting C linker flag -dynamiclib -single_module due to ld: warning: -pie being ignored. It is only used when linking a main executable This is the correct link command for the Mac but it is being rejected due to the warning message. > On Mar 10, 2021, at 10:11 AM, Fande Kong wrote: > > Thanks, Barry, > > It seems PETSc works fine with manually built compilers. We are pretty much sure that the issue is related to conda. Conda might introduce extra flags. > > We still need to make it work with conda because we deliver our package via conda for users. > > > I unset all flags from conda, and got slightly different results this time. The log was attached. Anyone could explain the motivation that we try to build executable without a main function? > > Thanks, > > Fande > > Executing: mpicc -c -o /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o -I/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers -fPIC /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.c > Successful compile: > Source: > #include "confdefs.h" > #include "conffix.h" > #include > int (*fprintf_ptr)(FILE*,const char*,...) = fprintf; > void foo(void){ > fprintf_ptr(stdout,"hello"); > return; > } > void bar(void){foo();} > Running Executable WITHOUT threads to time it out > Executing: mpicc -o /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/libconftest.so -dynamic -fPIC /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > Possible ERROR while running linker: exit code 1 > stderr: > Undefined symbols for architecture x86_64: > "_main", referenced from: > implicit entry/start for main executable > ld: symbol(s) not found for architecture x86_64 > clang-11: error: linker command failed with exit code 1 (use -v to see invocation) > Rejected C compiler flag -fPIC because it was not compatible with shared linker mpicc using flags ['-dynamic'] > > > On Mon, Mar 8, 2021 at 7:28 PM Barry Smith > wrote: > > Fande, > > I see you are using CONDA, this can cause issues since it sticks all kinds of things into the environment. PETSc tries to remove some of them but perhaps not enough. If you run printenv you will see all the mess it is dumping in. > > Can you trying the same build without CONDA environment? 
> > Barry > > >> On Mar 8, 2021, at 7:31 PM, Matthew Knepley > wrote: >> >> On Mon, Mar 8, 2021 at 8:23 PM Fande Kong > wrote: >> Thanks Matthew, >> >> Hmm, we still have the same issue after shutting off all unknown flags. >> >> Oh, I was misinterpreting the error message: >> >> ld: can't link with a main executable file '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' >> >> So clang did not _actually_ make a shared library, it made an executable. Did clang-11 change the options it uses to build a shared library? >> >> Satish, do we test with clang-11? >> >> Thanks, >> >> Matt >> >> Thanks, >> >> Fande >> >> On Mon, Mar 8, 2021 at 6:07 PM Matthew Knepley > wrote: >> On Mon, Mar 8, 2021 at 7:55 PM Fande Kong > wrote: >> Hi All, >> >> mpicc rejected "-fPIC". Anyone has a clue how to work around this issue? >> >> The failure is at the last step >> >> Executing: mpicc -o /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest -fPIC /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest.o -L/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers -lconftest >> Possible ERROR while running linker: exit code 1 >> stderr: >> ld: can't link with a main executable file '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' for architecture x86_64 >> clang-11: error: linker command failed with exit code 1 (use -v to see invocation) >> >> but you have some flags stuck in which may or may not affect this. I would try shutting them off: >> >> LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs -rpath /Users/kongf/miniconda3/envs/moose/lib -L/Users/kongf/miniconda3/envs/moose/lib >> >> I cannot tell exactly why clang is failing because it does not report a specific error. >> >> Thanks, >> >> Matt >> >> The log was attached. >> >> Thanks so much, >> >> Fande >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Wed Mar 10 13:43:38 2021 From: fdkong.jd at gmail.com (Fande Kong) Date: Wed, 10 Mar 2021 12:43:38 -0700 Subject: [petsc-users] Exhausted all shared linker guesses. Could not determine how to create a shared library! In-Reply-To: References: <91DF0A8C-547A-42BC-81BA-ADFF7CA948C8@petsc.dev> Message-ID: Thanks Barry, Got the same result, but "-pie" was not filtered out somehow. 
I did changes like this: kongf at x86_64-apple-darwin13 petsc % git diff diff --git a/config/BuildSystem/config/framework.py b/config/BuildSystem/config/framework.py index beefe82956..c31fbeb95e 100644 --- a/config/BuildSystem/config/framework.py +++ b/config/BuildSystem/config/framework.py @@ -504,6 +504,8 @@ class Framework(config.base.Configure, script.LanguageProcessor): lines = [s for s in lines if s.find('Load a valid targeting module or set CRAY_CPU_TARGET') < 0] # pgi dumps filename on stderr - but returns 0 errorcode' lines = [s for s in lines if lines != 'conftest.c:'] + # in case -pie is always being passed to linker + lines = [s for s in lines if s.find('-pie being ignored. It is only used when linking a main executable') < 0] if lines: output = reduce(lambda s, t: s+t, lines, '\n') else: output = '' log.write("Linker stderr after filtering:\n"+output+":\n") The log was attached again. Thanks, Fande On Wed, Mar 10, 2021 at 12:05 PM Barry Smith wrote: > Fande, > > Please add in config/BuildSystem/config/framework.py line 528 two new > lines > > # pgi dumps filename on stderr - but returns 0 errorcode' > lines = [s for s in lines if lines != 'conftest.c:'] > # in case -pie is always being passed to linker > lines = [s for s in lines if s.find('-pie being ignored. It is only > used when linking a main executable') < 0] > > Barry > > You have (another of Conda's "take over the world my way" approach) > > LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs -rpath > /Users/kongf/miniconda3/envs/testpetsc/lib > -L/Users/kongf/miniconda3/envs/testpetsc/lib > > Executing: mpicc -o > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest > -dynamiclib -single_module > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > Possible ERROR while running linker: > stderr: > ld: warning: -pie being ignored. It is only used when linking a main > executable > Rejecting C linker flag -dynamiclib -single_module due to > > ld: warning: -pie being ignored. It is only used when linking a main > executable > > This is the correct link command for the Mac but it is being rejected due > to the warning message. > > > On Mar 10, 2021, at 10:11 AM, Fande Kong wrote: > > Thanks, Barry, > > It seems PETSc works fine with manually built compilers. We are pretty > much sure that the issue is related to conda. Conda might introduce extra > flags. > > We still need to make it work with conda because we deliver our package > via conda for users. > > > I unset all flags from conda, and got slightly different results this > time. The log was attached. Anyone could explain the motivation that we > try to build executable without a main function? > > Thanks, > > Fande > > Executing: mpicc -c -o > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > -I/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers > -fPIC > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.c > > Successful compile: > Source: > #include "confdefs.h" > #include "conffix.h" > #include > int (*fprintf_ptr)(FILE*,const char*,...) 
= fprintf; > void foo(void){ > fprintf_ptr(stdout,"hello"); > return; > } > void bar(void){foo();} > Running Executable WITHOUT threads to time it out > Executing: mpicc -o > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/libconftest.so > -dynamic -fPIC > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > > Possible ERROR while running linker: exit code 1 > stderr: > Undefined symbols for architecture x86_64: > "_main", referenced from: > implicit entry/start for main executable > ld: symbol(s) not found for architecture x86_64 > clang-11: error: linker command failed with exit code 1 (use -v to see > invocation) > Rejected C compiler flag -fPIC because it was not compatible > with shared linker mpicc using flags ['-dynamic'] > > > On Mon, Mar 8, 2021 at 7:28 PM Barry Smith wrote: > >> >> Fande, >> >> I see you are using CONDA, this can cause issues since it sticks all >> kinds of things into the environment. PETSc tries to remove some of them >> but perhaps not enough. If you run printenv you will see all the mess it is >> dumping in. >> >> Can you trying the same build without CONDA environment? >> >> Barry >> >> >> On Mar 8, 2021, at 7:31 PM, Matthew Knepley wrote: >> >> On Mon, Mar 8, 2021 at 8:23 PM Fande Kong wrote: >> >>> Thanks Matthew, >>> >>> Hmm, we still have the same issue after shutting off all unknown flags. >>> >> >> Oh, I was misinterpreting the error message: >> >> ld: can't link with a main executable file >> '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' >> >> So clang did not _actually_ make a shared library, it made an executable. >> Did clang-11 change the options it uses to build a shared library? >> >> Satish, do we test with clang-11? >> >> Thanks, >> >> Matt >> >> Thanks, >>> >>> Fande >>> >>> On Mon, Mar 8, 2021 at 6:07 PM Matthew Knepley >>> wrote: >>> >>>> On Mon, Mar 8, 2021 at 7:55 PM Fande Kong wrote: >>>> >>>>> Hi All, >>>>> >>>>> mpicc rejected "-fPIC". Anyone has a clue how to work around this >>>>> issue? >>>>> >>>> >>>> The failure is at the last step >>>> >>>> Executing: mpicc -o >>>> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest >>>> -fPIC >>>> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest.o >>>> -L/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers >>>> -lconftest >>>> Possible ERROR while running linker: exit code 1 >>>> stderr: >>>> ld: can't link with a main executable file >>>> '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' >>>> for architecture x86_64 >>>> clang-11: error: linker command failed with exit code 1 (use -v to see >>>> invocation) >>>> >>>> but you have some flags stuck in which may or may not affect this. I >>>> would try shutting them off: >>>> >>>> LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs -rpath >>>> /Users/kongf/miniconda3/envs/moose/lib >>>> -L/Users/kongf/miniconda3/envs/moose/lib >>>> >>>> I cannot tell exactly why clang is failing because it does not report a >>>> specific error. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> The log was attached. >>>>> >>>>> Thanks so much, >>>>> >>>>> Fande >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. 
>>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 236796 bytes Desc: not available URL: From balay at mcs.anl.gov Wed Mar 10 13:51:08 2021 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 10 Mar 2021 13:51:08 -0600 Subject: [petsc-users] Exhausted all shared linker guesses. Could not determine how to create a shared library! In-Reply-To: References: <91DF0A8C-547A-42BC-81BA-ADFF7CA948C8@petsc.dev> Message-ID: <1c46a1e2-ca46-cac1-a955-529c37986ff@mcs.anl.gov> > LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs -rpath /Users/kongf/miniconda3/envs/testpetsc/lib -L/Users/kongf/miniconda3/envs/testpetsc/lib Does conda compiler pick up '-pie' from this env variable? If so - perhaps its easier to just modify it? Or is it encoded in mpicc wrapper? [mpicc -show] Satish On Wed, 10 Mar 2021, Fande Kong wrote: > Thanks Barry, > > Got the same result, but "-pie" was not filtered out somehow. > > I did changes like this: > > kongf at x86_64-apple-darwin13 petsc % git diff > diff --git a/config/BuildSystem/config/framework.py > b/config/BuildSystem/config/framework.py > index beefe82956..c31fbeb95e 100644 > --- a/config/BuildSystem/config/framework.py > +++ b/config/BuildSystem/config/framework.py > @@ -504,6 +504,8 @@ class Framework(config.base.Configure, > script.LanguageProcessor): > lines = [s for s in lines if s.find('Load a valid targeting module or > set CRAY_CPU_TARGET') < 0] > # pgi dumps filename on stderr - but returns 0 errorcode' > lines = [s for s in lines if lines != 'conftest.c:'] > + # in case -pie is always being passed to linker > + lines = [s for s in lines if s.find('-pie being ignored. It is only > used when linking a main executable') < 0] > if lines: output = reduce(lambda s, t: s+t, lines, '\n') > else: output = '' > log.write("Linker stderr after filtering:\n"+output+":\n") > > The log was attached again. > > Thanks, > > Fande > > > On Wed, Mar 10, 2021 at 12:05 PM Barry Smith wrote: > > > Fande, > > > > Please add in config/BuildSystem/config/framework.py line 528 two new > > lines > > > > # pgi dumps filename on stderr - but returns 0 errorcode' > > lines = [s for s in lines if lines != 'conftest.c:'] > > # in case -pie is always being passed to linker > > lines = [s for s in lines if s.find('-pie being ignored. It is only > > used when linking a main executable') < 0] > > > > Barry > > > > You have (another of Conda's "take over the world my way" approach) > > > > LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs -rpath > > /Users/kongf/miniconda3/envs/testpetsc/lib > > -L/Users/kongf/miniconda3/envs/testpetsc/lib > > > > Executing: mpicc -o > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest > > -dynamiclib -single_module > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > > Possible ERROR while running linker: > > stderr: > > ld: warning: -pie being ignored. 
It is only used when linking a main > > executable > > Rejecting C linker flag -dynamiclib -single_module due to > > > > ld: warning: -pie being ignored. It is only used when linking a main > > executable > > > > This is the correct link command for the Mac but it is being rejected due > > to the warning message. > > > > > > On Mar 10, 2021, at 10:11 AM, Fande Kong wrote: > > > > Thanks, Barry, > > > > It seems PETSc works fine with manually built compilers. We are pretty > > much sure that the issue is related to conda. Conda might introduce extra > > flags. > > > > We still need to make it work with conda because we deliver our package > > via conda for users. > > > > > > I unset all flags from conda, and got slightly different results this > > time. The log was attached. Anyone could explain the motivation that we > > try to build executable without a main function? > > > > Thanks, > > > > Fande > > > > Executing: mpicc -c -o > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > > -I/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers > > -fPIC > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.c > > > > Successful compile: > > Source: > > #include "confdefs.h" > > #include "conffix.h" > > #include > > int (*fprintf_ptr)(FILE*,const char*,...) = fprintf; > > void foo(void){ > > fprintf_ptr(stdout,"hello"); > > return; > > } > > void bar(void){foo();} > > Running Executable WITHOUT threads to time it out > > Executing: mpicc -o > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/libconftest.so > > -dynamic -fPIC > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > > > > Possible ERROR while running linker: exit code 1 > > stderr: > > Undefined symbols for architecture x86_64: > > "_main", referenced from: > > implicit entry/start for main executable > > ld: symbol(s) not found for architecture x86_64 > > clang-11: error: linker command failed with exit code 1 (use -v to see > > invocation) > > Rejected C compiler flag -fPIC because it was not compatible > > with shared linker mpicc using flags ['-dynamic'] > > > > > > On Mon, Mar 8, 2021 at 7:28 PM Barry Smith wrote: > > > >> > >> Fande, > >> > >> I see you are using CONDA, this can cause issues since it sticks all > >> kinds of things into the environment. PETSc tries to remove some of them > >> but perhaps not enough. If you run printenv you will see all the mess it is > >> dumping in. > >> > >> Can you trying the same build without CONDA environment? > >> > >> Barry > >> > >> > >> On Mar 8, 2021, at 7:31 PM, Matthew Knepley wrote: > >> > >> On Mon, Mar 8, 2021 at 8:23 PM Fande Kong wrote: > >> > >>> Thanks Matthew, > >>> > >>> Hmm, we still have the same issue after shutting off all unknown flags. > >>> > >> > >> Oh, I was misinterpreting the error message: > >> > >> ld: can't link with a main executable file > >> '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' > >> > >> So clang did not _actually_ make a shared library, it made an executable. > >> Did clang-11 change the options it uses to build a shared library? > >> > >> Satish, do we test with clang-11? 
> >> > >> Thanks, > >> > >> Matt > >> > >> Thanks, > >>> > >>> Fande > >>> > >>> On Mon, Mar 8, 2021 at 6:07 PM Matthew Knepley > >>> wrote: > >>> > >>>> On Mon, Mar 8, 2021 at 7:55 PM Fande Kong wrote: > >>>> > >>>>> Hi All, > >>>>> > >>>>> mpicc rejected "-fPIC". Anyone has a clue how to work around this > >>>>> issue? > >>>>> > >>>> > >>>> The failure is at the last step > >>>> > >>>> Executing: mpicc -o > >>>> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest > >>>> -fPIC > >>>> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest.o > >>>> -L/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers > >>>> -lconftest > >>>> Possible ERROR while running linker: exit code 1 > >>>> stderr: > >>>> ld: can't link with a main executable file > >>>> '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' > >>>> for architecture x86_64 > >>>> clang-11: error: linker command failed with exit code 1 (use -v to see > >>>> invocation) > >>>> > >>>> but you have some flags stuck in which may or may not affect this. I > >>>> would try shutting them off: > >>>> > >>>> LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs -rpath > >>>> /Users/kongf/miniconda3/envs/moose/lib > >>>> -L/Users/kongf/miniconda3/envs/moose/lib > >>>> > >>>> I cannot tell exactly why clang is failing because it does not report a > >>>> specific error. > >>>> > >>>> Thanks, > >>>> > >>>> Matt > >>>> > >>>> The log was attached. > >>>>> > >>>>> Thanks so much, > >>>>> > >>>>> Fande > >>>>> > >>>> > >>>> > >>>> -- > >>>> What most experimenters take for granted before they begin their > >>>> experiments is infinitely more interesting than any results to which their > >>>> experiments lead. > >>>> -- Norbert Wiener > >>>> > >>>> https://www.cse.buffalo.edu/~knepley/ > >>>> > >>>> > >>> > >> > >> -- > >> What most experimenters take for granted before they begin their > >> experiments is infinitely more interesting than any results to which their > >> experiments lead. > >> -- Norbert Wiener > >> > >> https://www.cse.buffalo.edu/~knepley/ > >> > >> > >> > >> > > > > > > > From fdkong.jd at gmail.com Wed Mar 10 14:27:23 2021 From: fdkong.jd at gmail.com (Fande Kong) Date: Wed, 10 Mar 2021 13:27:23 -0700 Subject: [petsc-users] Exhausted all shared linker guesses. Could not determine how to create a shared library! In-Reply-To: <1c46a1e2-ca46-cac1-a955-529c37986ff@mcs.anl.gov> References: <91DF0A8C-547A-42BC-81BA-ADFF7CA948C8@petsc.dev> <1c46a1e2-ca46-cac1-a955-529c37986ff@mcs.anl.gov> Message-ID: I guess it was encoded in mpicc petsc % mpicc -show x86_64-apple-darwin13.4.0-clang -march=core2 -mtune=haswell -Wl,-pie -Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs -Wl,-rpath,/Users/kongf/miniconda3/envs/testpetsc/lib -L/Users/kongf/miniconda3/envs/testpetsc/lib -Wl,-commons,use_dylibs -I/Users/kongf/miniconda3/envs/testpetsc/include -L/Users/kongf/miniconda3/envs/testpetsc/lib -lmpi -lpmpi Thanks, Fande On Wed, Mar 10, 2021 at 12:51 PM Satish Balay wrote: > > LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs -rpath > /Users/kongf/miniconda3/envs/testpetsc/lib > -L/Users/kongf/miniconda3/envs/testpetsc/lib > > Does conda compiler pick up '-pie' from this env variable? If so - perhaps > its easier to just modify it? > > Or is it encoded in mpicc wrapper? 
[mpicc -show] > > Satish > > On Wed, 10 Mar 2021, Fande Kong wrote: > > > Thanks Barry, > > > > Got the same result, but "-pie" was not filtered out somehow. > > > > I did changes like this: > > > > kongf at x86_64-apple-darwin13 petsc % git diff > > diff --git a/config/BuildSystem/config/framework.py > > b/config/BuildSystem/config/framework.py > > index beefe82956..c31fbeb95e 100644 > > --- a/config/BuildSystem/config/framework.py > > +++ b/config/BuildSystem/config/framework.py > > @@ -504,6 +504,8 @@ class Framework(config.base.Configure, > > script.LanguageProcessor): > > lines = [s for s in lines if s.find('Load a valid targeting module or > > set CRAY_CPU_TARGET') < 0] > > # pgi dumps filename on stderr - but returns 0 errorcode' > > lines = [s for s in lines if lines != 'conftest.c:'] > > + # in case -pie is always being passed to linker > > + lines = [s for s in lines if s.find('-pie being ignored. It is only > > used when linking a main executable') < 0] > > if lines: output = reduce(lambda s, t: s+t, lines, '\n') > > else: output = '' > > log.write("Linker stderr after filtering:\n"+output+":\n") > > > > The log was attached again. > > > > Thanks, > > > > Fande > > > > > > On Wed, Mar 10, 2021 at 12:05 PM Barry Smith wrote: > > > > > Fande, > > > > > > Please add in config/BuildSystem/config/framework.py line 528 two > new > > > lines > > > > > > # pgi dumps filename on stderr - but returns 0 errorcode' > > > lines = [s for s in lines if lines != 'conftest.c:'] > > > # in case -pie is always being passed to linker > > > lines = [s for s in lines if s.find('-pie being ignored. It is > only > > > used when linking a main executable') < 0] > > > > > > Barry > > > > > > You have (another of Conda's "take over the world my way" approach) > > > > > > LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs > -rpath > > > /Users/kongf/miniconda3/envs/testpetsc/lib > > > -L/Users/kongf/miniconda3/envs/testpetsc/lib > > > > > > Executing: mpicc -o > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest > > > -dynamiclib -single_module > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > > > Possible ERROR while running linker: > > > stderr: > > > ld: warning: -pie being ignored. It is only used when linking a main > > > executable > > > Rejecting C linker flag -dynamiclib -single_module due to > > > > > > ld: warning: -pie being ignored. It is only used when linking a main > > > executable > > > > > > This is the correct link command for the Mac but it is being rejected > due > > > to the warning message. > > > > > > > > > On Mar 10, 2021, at 10:11 AM, Fande Kong wrote: > > > > > > Thanks, Barry, > > > > > > It seems PETSc works fine with manually built compilers. We are pretty > > > much sure that the issue is related to conda. Conda might introduce > extra > > > flags. > > > > > > We still need to make it work with conda because we deliver our package > > > via conda for users. > > > > > > > > > I unset all flags from conda, and got slightly different results this > > > time. The log was attached. Anyone could explain the motivation that > we > > > try to build executable without a main function? 
> > > > > > Thanks, > > > > > > Fande > > > > > > Executing: mpicc -c -o > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > > > > -I/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers > > > -fPIC > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.c > > > > > > Successful compile: > > > Source: > > > #include "confdefs.h" > > > #include "conffix.h" > > > #include > > > int (*fprintf_ptr)(FILE*,const char*,...) = fprintf; > > > void foo(void){ > > > fprintf_ptr(stdout,"hello"); > > > return; > > > } > > > void bar(void){foo();} > > > Running Executable WITHOUT threads to time it out > > > Executing: mpicc -o > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/libconftest.so > > > -dynamic -fPIC > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > > > > > > Possible ERROR while running linker: exit code 1 > > > stderr: > > > Undefined symbols for architecture x86_64: > > > "_main", referenced from: > > > implicit entry/start for main executable > > > ld: symbol(s) not found for architecture x86_64 > > > clang-11: error: linker command failed with exit code 1 (use -v to see > > > invocation) > > > Rejected C compiler flag -fPIC because it was not compatible > > > with shared linker mpicc using flags ['-dynamic'] > > > > > > > > > On Mon, Mar 8, 2021 at 7:28 PM Barry Smith wrote: > > > > > >> > > >> Fande, > > >> > > >> I see you are using CONDA, this can cause issues since it sticks > all > > >> kinds of things into the environment. PETSc tries to remove some of > them > > >> but perhaps not enough. If you run printenv you will see all the mess > it is > > >> dumping in. > > >> > > >> Can you trying the same build without CONDA environment? > > >> > > >> Barry > > >> > > >> > > >> On Mar 8, 2021, at 7:31 PM, Matthew Knepley > wrote: > > >> > > >> On Mon, Mar 8, 2021 at 8:23 PM Fande Kong > wrote: > > >> > > >>> Thanks Matthew, > > >>> > > >>> Hmm, we still have the same issue after shutting off all unknown > flags. > > >>> > > >> > > >> Oh, I was misinterpreting the error message: > > >> > > >> ld: can't link with a main executable file > > >> > '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' > > >> > > >> So clang did not _actually_ make a shared library, it made an > executable. > > >> Did clang-11 change the options it uses to build a shared library? > > >> > > >> Satish, do we test with clang-11? > > >> > > >> Thanks, > > >> > > >> Matt > > >> > > >> Thanks, > > >>> > > >>> Fande > > >>> > > >>> On Mon, Mar 8, 2021 at 6:07 PM Matthew Knepley > > >>> wrote: > > >>> > > >>>> On Mon, Mar 8, 2021 at 7:55 PM Fande Kong > wrote: > > >>>> > > >>>>> Hi All, > > >>>>> > > >>>>> mpicc rejected "-fPIC". Anyone has a clue how to work around this > > >>>>> issue? 
> > >>>>> > > >>>> > > >>>> The failure is at the last step > > >>>> > > >>>> Executing: mpicc -o > > >>>> > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest > > >>>> -fPIC > > >>>> > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest.o > > >>>> > -L/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers > > >>>> -lconftest > > >>>> Possible ERROR while running linker: exit code 1 > > >>>> stderr: > > >>>> ld: can't link with a main executable file > > >>>> > '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' > > >>>> for architecture x86_64 > > >>>> clang-11: error: linker command failed with exit code 1 (use -v to > see > > >>>> invocation) > > >>>> > > >>>> but you have some flags stuck in which may or may not affect this. I > > >>>> would try shutting them off: > > >>>> > > >>>> LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs > -rpath > > >>>> /Users/kongf/miniconda3/envs/moose/lib > > >>>> -L/Users/kongf/miniconda3/envs/moose/lib > > >>>> > > >>>> I cannot tell exactly why clang is failing because it does not > report a > > >>>> specific error. > > >>>> > > >>>> Thanks, > > >>>> > > >>>> Matt > > >>>> > > >>>> The log was attached. > > >>>>> > > >>>>> Thanks so much, > > >>>>> > > >>>>> Fande > > >>>>> > > >>>> > > >>>> > > >>>> -- > > >>>> What most experimenters take for granted before they begin their > > >>>> experiments is infinitely more interesting than any results to > which their > > >>>> experiments lead. > > >>>> -- Norbert Wiener > > >>>> > > >>>> https://www.cse.buffalo.edu/~knepley/ > > >>>> > > >>>> > > >>> > > >> > > >> -- > > >> What most experimenters take for granted before they begin their > > >> experiments is infinitely more interesting than any results to which > their > > >> experiments lead. > > >> -- Norbert Wiener > > >> > > >> https://www.cse.buffalo.edu/~knepley/ > > >> > > >> > > >> > > >> > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Wed Mar 10 14:35:59 2021 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 10 Mar 2021 14:35:59 -0600 Subject: [petsc-users] Exhausted all shared linker guesses. Could not determine how to create a shared library! In-Reply-To: References: <91DF0A8C-547A-42BC-81BA-ADFF7CA948C8@petsc.dev> <1c46a1e2-ca46-cac1-a955-529c37986ff@mcs.anl.gov> Message-ID: <71f053b9-6e4-af4a-fa7c-2f20704c8029@mcs.anl.gov> Can you use a different MPI for this conda install? Alternative: ./configure CC=x86_64-apple-darwin13.4.0-clang COPTFLAGS="-march=core2 -mtune=haswell" CPPFLAGS=-I/Users/kongf/miniconda3/envs/testpetsc/include LDFLAGS="-Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs -Wl,-commons,use_dylibs" LIBS="-Wl,-rpath,/Users/kongf/miniconda3/envs/testpetsc/lib -lmpi -lpmpi" etc.. 
[don't know if you really need LDFLAGS options] Satish On Wed, 10 Mar 2021, Fande Kong wrote: > I guess it was encoded in mpicc > > petsc % mpicc -show > x86_64-apple-darwin13.4.0-clang -march=core2 -mtune=haswell -Wl,-pie > -Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs > -Wl,-rpath,/Users/kongf/miniconda3/envs/testpetsc/lib > -L/Users/kongf/miniconda3/envs/testpetsc/lib -Wl,-commons,use_dylibs > -I/Users/kongf/miniconda3/envs/testpetsc/include > -L/Users/kongf/miniconda3/envs/testpetsc/lib -lmpi -lpmpi > > > Thanks, > > Fande > > On Wed, Mar 10, 2021 at 12:51 PM Satish Balay wrote: > > > > LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs -rpath > > /Users/kongf/miniconda3/envs/testpetsc/lib > > -L/Users/kongf/miniconda3/envs/testpetsc/lib > > > > Does conda compiler pick up '-pie' from this env variable? If so - perhaps > > its easier to just modify it? > > > > Or is it encoded in mpicc wrapper? [mpicc -show] > > > > Satish > > > > On Wed, 10 Mar 2021, Fande Kong wrote: > > > > > Thanks Barry, > > > > > > Got the same result, but "-pie" was not filtered out somehow. > > > > > > I did changes like this: > > > > > > kongf at x86_64-apple-darwin13 petsc % git diff > > > diff --git a/config/BuildSystem/config/framework.py > > > b/config/BuildSystem/config/framework.py > > > index beefe82956..c31fbeb95e 100644 > > > --- a/config/BuildSystem/config/framework.py > > > +++ b/config/BuildSystem/config/framework.py > > > @@ -504,6 +504,8 @@ class Framework(config.base.Configure, > > > script.LanguageProcessor): > > > lines = [s for s in lines if s.find('Load a valid targeting module or > > > set CRAY_CPU_TARGET') < 0] > > > # pgi dumps filename on stderr - but returns 0 errorcode' > > > lines = [s for s in lines if lines != 'conftest.c:'] > > > + # in case -pie is always being passed to linker > > > + lines = [s for s in lines if s.find('-pie being ignored. It is only > > > used when linking a main executable') < 0] > > > if lines: output = reduce(lambda s, t: s+t, lines, '\n') > > > else: output = '' > > > log.write("Linker stderr after filtering:\n"+output+":\n") > > > > > > The log was attached again. > > > > > > Thanks, > > > > > > Fande > > > > > > > > > On Wed, Mar 10, 2021 at 12:05 PM Barry Smith wrote: > > > > > > > Fande, > > > > > > > > Please add in config/BuildSystem/config/framework.py line 528 two > > new > > > > lines > > > > > > > > # pgi dumps filename on stderr - but returns 0 errorcode' > > > > lines = [s for s in lines if lines != 'conftest.c:'] > > > > # in case -pie is always being passed to linker > > > > lines = [s for s in lines if s.find('-pie being ignored. It is > > only > > > > used when linking a main executable') < 0] > > > > > > > > Barry > > > > > > > > You have (another of Conda's "take over the world my way" approach) > > > > > > > > LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs > > -rpath > > > > /Users/kongf/miniconda3/envs/testpetsc/lib > > > > -L/Users/kongf/miniconda3/envs/testpetsc/lib > > > > > > > > Executing: mpicc -o > > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest > > > > -dynamiclib -single_module > > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > > > > Possible ERROR while running linker: > > > > stderr: > > > > ld: warning: -pie being ignored. 
It is only used when linking a main > > > > executable > > > > Rejecting C linker flag -dynamiclib -single_module due to > > > > > > > > ld: warning: -pie being ignored. It is only used when linking a main > > > > executable > > > > > > > > This is the correct link command for the Mac but it is being rejected > > due > > > > to the warning message. > > > > > > > > > > > > On Mar 10, 2021, at 10:11 AM, Fande Kong wrote: > > > > > > > > Thanks, Barry, > > > > > > > > It seems PETSc works fine with manually built compilers. We are pretty > > > > much sure that the issue is related to conda. Conda might introduce > > extra > > > > flags. > > > > > > > > We still need to make it work with conda because we deliver our package > > > > via conda for users. > > > > > > > > > > > > I unset all flags from conda, and got slightly different results this > > > > time. The log was attached. Anyone could explain the motivation that > > we > > > > try to build executable without a main function? > > > > > > > > Thanks, > > > > > > > > Fande > > > > > > > > Executing: mpicc -c -o > > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > > > > > > -I/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers > > > > -fPIC > > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.c > > > > > > > > Successful compile: > > > > Source: > > > > #include "confdefs.h" > > > > #include "conffix.h" > > > > #include > > > > int (*fprintf_ptr)(FILE*,const char*,...) = fprintf; > > > > void foo(void){ > > > > fprintf_ptr(stdout,"hello"); > > > > return; > > > > } > > > > void bar(void){foo();} > > > > Running Executable WITHOUT threads to time it out > > > > Executing: mpicc -o > > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/libconftest.so > > > > -dynamic -fPIC > > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > > > > > > > > Possible ERROR while running linker: exit code 1 > > > > stderr: > > > > Undefined symbols for architecture x86_64: > > > > "_main", referenced from: > > > > implicit entry/start for main executable > > > > ld: symbol(s) not found for architecture x86_64 > > > > clang-11: error: linker command failed with exit code 1 (use -v to see > > > > invocation) > > > > Rejected C compiler flag -fPIC because it was not compatible > > > > with shared linker mpicc using flags ['-dynamic'] > > > > > > > > > > > > On Mon, Mar 8, 2021 at 7:28 PM Barry Smith wrote: > > > > > > > >> > > > >> Fande, > > > >> > > > >> I see you are using CONDA, this can cause issues since it sticks > > all > > > >> kinds of things into the environment. PETSc tries to remove some of > > them > > > >> but perhaps not enough. If you run printenv you will see all the mess > > it is > > > >> dumping in. > > > >> > > > >> Can you trying the same build without CONDA environment? > > > >> > > > >> Barry > > > >> > > > >> > > > >> On Mar 8, 2021, at 7:31 PM, Matthew Knepley > > wrote: > > > >> > > > >> On Mon, Mar 8, 2021 at 8:23 PM Fande Kong > > wrote: > > > >> > > > >>> Thanks Matthew, > > > >>> > > > >>> Hmm, we still have the same issue after shutting off all unknown > > flags. 
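A quick way to double-check, once the conda flags have been shut off, whether -pie is really gone or is still being injected by the wrapper itself (a diagnostic sketch; it assumes the conda environment is active):

  env | grep -i -- -pie                      # still set in LDFLAGS / LDFLAGS_LD?
  mpicc -show | tr ' ' '\n' | grep -- -pie   # or baked into the mpicc wrapper itself?

If the second command still prints -Wl,-pie, unsetting environment variables alone cannot help, which matches the finding elsewhere in this thread that the flag is encoded in the mpicc wrapper.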
> > > >>> > > > >> > > > >> Oh, I was misinterpreting the error message: > > > >> > > > >> ld: can't link with a main executable file > > > >> > > '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' > > > >> > > > >> So clang did not _actually_ make a shared library, it made an > > executable. > > > >> Did clang-11 change the options it uses to build a shared library? > > > >> > > > >> Satish, do we test with clang-11? > > > >> > > > >> Thanks, > > > >> > > > >> Matt > > > >> > > > >> Thanks, > > > >>> > > > >>> Fande > > > >>> > > > >>> On Mon, Mar 8, 2021 at 6:07 PM Matthew Knepley > > > >>> wrote: > > > >>> > > > >>>> On Mon, Mar 8, 2021 at 7:55 PM Fande Kong > > wrote: > > > >>>> > > > >>>>> Hi All, > > > >>>>> > > > >>>>> mpicc rejected "-fPIC". Anyone has a clue how to work around this > > > >>>>> issue? > > > >>>>> > > > >>>> > > > >>>> The failure is at the last step > > > >>>> > > > >>>> Executing: mpicc -o > > > >>>> > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest > > > >>>> -fPIC > > > >>>> > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest.o > > > >>>> > > -L/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers > > > >>>> -lconftest > > > >>>> Possible ERROR while running linker: exit code 1 > > > >>>> stderr: > > > >>>> ld: can't link with a main executable file > > > >>>> > > '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' > > > >>>> for architecture x86_64 > > > >>>> clang-11: error: linker command failed with exit code 1 (use -v to > > see > > > >>>> invocation) > > > >>>> > > > >>>> but you have some flags stuck in which may or may not affect this. I > > > >>>> would try shutting them off: > > > >>>> > > > >>>> LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs > > -rpath > > > >>>> /Users/kongf/miniconda3/envs/moose/lib > > > >>>> -L/Users/kongf/miniconda3/envs/moose/lib > > > >>>> > > > >>>> I cannot tell exactly why clang is failing because it does not > > report a > > > >>>> specific error. > > > >>>> > > > >>>> Thanks, > > > >>>> > > > >>>> Matt > > > >>>> > > > >>>> The log was attached. > > > >>>>> > > > >>>>> Thanks so much, > > > >>>>> > > > >>>>> Fande > > > >>>>> > > > >>>> > > > >>>> > > > >>>> -- > > > >>>> What most experimenters take for granted before they begin their > > > >>>> experiments is infinitely more interesting than any results to > > which their > > > >>>> experiments lead. > > > >>>> -- Norbert Wiener > > > >>>> > > > >>>> https://www.cse.buffalo.edu/~knepley/ > > > >>>> > > > >>>> > > > >>> > > > >> > > > >> -- > > > >> What most experimenters take for granted before they begin their > > > >> experiments is infinitely more interesting than any results to which > > their > > > >> experiments lead. > > > >> -- Norbert Wiener > > > >> > > > >> https://www.cse.buffalo.edu/~knepley/ > > > >> > > > >> > > > >> > > > >> > > > > > > > > > > > > > > > > > > > > From fdkong.jd at gmail.com Wed Mar 10 17:17:18 2021 From: fdkong.jd at gmail.com (Fande Kong) Date: Wed, 10 Mar 2021 16:17:18 -0700 Subject: [petsc-users] Exhausted all shared linker guesses. Could not determine how to create a shared library! 
In-Reply-To: References: <91DF0A8C-547A-42BC-81BA-ADFF7CA948C8@petsc.dev> Message-ID: On Wed, Mar 10, 2021 at 12:05 PM Barry Smith wrote: > Fande, > > Please add in config/BuildSystem/config/framework.py line 528 two new > lines > > # pgi dumps filename on stderr - but returns 0 errorcode' > lines = [s for s in lines if lines != 'conftest.c:'] > # in case -pie is always being passed to linker > lines = [s for s in lines if s.find('-pie being ignored. It is only > used when linking a main executable') < 0] > > Barry > > You have (another of Conda's "take over the world my way" approach) > > LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs -rpath > /Users/kongf/miniconda3/envs/testpetsc/lib > -L/Users/kongf/miniconda3/envs/testpetsc/lib > > Executing: mpicc -o > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest > -dynamiclib -single_module > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > Possible ERROR while running linker: > stderr: > ld: warning: -pie being ignored. It is only used when linking a main > executable > Rejecting C linker flag -dynamiclib -single_module due to > > ld: warning: -pie being ignored. It is only used when linking a main > executable > > This is the correct link command for the Mac but it is being rejected due > to the warning message. > Could we somehow skip warning messages? Fande > > > On Mar 10, 2021, at 10:11 AM, Fande Kong wrote: > > Thanks, Barry, > > It seems PETSc works fine with manually built compilers. We are pretty > much sure that the issue is related to conda. Conda might introduce extra > flags. > > We still need to make it work with conda because we deliver our package > via conda for users. > > > I unset all flags from conda, and got slightly different results this > time. The log was attached. Anyone could explain the motivation that we > try to build executable without a main function? > > Thanks, > > Fande > > Executing: mpicc -c -o > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > -I/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers > -fPIC > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.c > > Successful compile: > Source: > #include "confdefs.h" > #include "conffix.h" > #include > int (*fprintf_ptr)(FILE*,const char*,...) = fprintf; > void foo(void){ > fprintf_ptr(stdout,"hello"); > return; > } > void bar(void){foo();} > Running Executable WITHOUT threads to time it out > Executing: mpicc -o > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/libconftest.so > -dynamic -fPIC > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > > Possible ERROR while running linker: exit code 1 > stderr: > Undefined symbols for architecture x86_64: > "_main", referenced from: > implicit entry/start for main executable > ld: symbol(s) not found for architecture x86_64 > clang-11: error: linker command failed with exit code 1 (use -v to see > invocation) > Rejected C compiler flag -fPIC because it was not compatible > with shared linker mpicc using flags ['-dynamic'] > > > On Mon, Mar 8, 2021 at 7:28 PM Barry Smith wrote: > >> >> Fande, >> >> I see you are using CONDA, this can cause issues since it sticks all >> kinds of things into the environment. PETSc tries to remove some of them >> but perhaps not enough. 
If you run printenv you will see all the mess it is >> dumping in. >> >> Can you trying the same build without CONDA environment? >> >> Barry >> >> >> On Mar 8, 2021, at 7:31 PM, Matthew Knepley wrote: >> >> On Mon, Mar 8, 2021 at 8:23 PM Fande Kong wrote: >> >>> Thanks Matthew, >>> >>> Hmm, we still have the same issue after shutting off all unknown flags. >>> >> >> Oh, I was misinterpreting the error message: >> >> ld: can't link with a main executable file >> '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' >> >> So clang did not _actually_ make a shared library, it made an executable. >> Did clang-11 change the options it uses to build a shared library? >> >> Satish, do we test with clang-11? >> >> Thanks, >> >> Matt >> >> Thanks, >>> >>> Fande >>> >>> On Mon, Mar 8, 2021 at 6:07 PM Matthew Knepley >>> wrote: >>> >>>> On Mon, Mar 8, 2021 at 7:55 PM Fande Kong wrote: >>>> >>>>> Hi All, >>>>> >>>>> mpicc rejected "-fPIC". Anyone has a clue how to work around this >>>>> issue? >>>>> >>>> >>>> The failure is at the last step >>>> >>>> Executing: mpicc -o >>>> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest >>>> -fPIC >>>> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest.o >>>> -L/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers >>>> -lconftest >>>> Possible ERROR while running linker: exit code 1 >>>> stderr: >>>> ld: can't link with a main executable file >>>> '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' >>>> for architecture x86_64 >>>> clang-11: error: linker command failed with exit code 1 (use -v to see >>>> invocation) >>>> >>>> but you have some flags stuck in which may or may not affect this. I >>>> would try shutting them off: >>>> >>>> LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs -rpath >>>> /Users/kongf/miniconda3/envs/moose/lib >>>> -L/Users/kongf/miniconda3/envs/moose/lib >>>> >>>> I cannot tell exactly why clang is failing because it does not report a >>>> specific error. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> The log was attached. >>>>> >>>>> Thanks so much, >>>>> >>>>> Fande >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Wed Mar 10 17:21:22 2021 From: fdkong.jd at gmail.com (Fande Kong) Date: Wed, 10 Mar 2021 16:21:22 -0700 Subject: [petsc-users] Exhausted all shared linker guesses. Could not determine how to create a shared library! In-Reply-To: <71f053b9-6e4-af4a-fa7c-2f20704c8029@mcs.anl.gov> References: <91DF0A8C-547A-42BC-81BA-ADFF7CA948C8@petsc.dev> <1c46a1e2-ca46-cac1-a955-529c37986ff@mcs.anl.gov> <71f053b9-6e4-af4a-fa7c-2f20704c8029@mcs.anl.gov> Message-ID: On Wed, Mar 10, 2021 at 1:36 PM Satish Balay wrote: > Can you use a different MPI for this conda install? > We control how to build MPI. 
If I take "-pie" options out of LDFLAGS, conda can not compile mpich. > > Alternative: > > ./configure CC=x86_64-apple-darwin13.4.0-clang COPTFLAGS="-march=core2 > -mtune=haswell" CPPFLAGS=-I/Users/kongf/miniconda3/envs/testpetsc/include > LDFLAGS="-Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs > -Wl,-commons,use_dylibs" > LIBS="-Wl,-rpath,/Users/kongf/miniconda3/envs/testpetsc/lib -lmpi -lpmpi" > MPI can not generate an executable because we took out "-pie". Thanks, Fande > > etc.. [don't know if you really need LDFLAGS options] > > Satish > > On Wed, 10 Mar 2021, Fande Kong wrote: > > > I guess it was encoded in mpicc > > > > petsc % mpicc -show > > x86_64-apple-darwin13.4.0-clang -march=core2 -mtune=haswell -Wl,-pie > > -Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs > > -Wl,-rpath,/Users/kongf/miniconda3/envs/testpetsc/lib > > -L/Users/kongf/miniconda3/envs/testpetsc/lib -Wl,-commons,use_dylibs > > -I/Users/kongf/miniconda3/envs/testpetsc/include > > -L/Users/kongf/miniconda3/envs/testpetsc/lib -lmpi -lpmpi > > > > > > Thanks, > > > > Fande > > > > On Wed, Mar 10, 2021 at 12:51 PM Satish Balay wrote: > > > > > > LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs > -rpath > > > /Users/kongf/miniconda3/envs/testpetsc/lib > > > -L/Users/kongf/miniconda3/envs/testpetsc/lib > > > > > > Does conda compiler pick up '-pie' from this env variable? If so - > perhaps > > > its easier to just modify it? > > > > > > Or is it encoded in mpicc wrapper? [mpicc -show] > > > > > > Satish > > > > > > On Wed, 10 Mar 2021, Fande Kong wrote: > > > > > > > Thanks Barry, > > > > > > > > Got the same result, but "-pie" was not filtered out somehow. > > > > > > > > I did changes like this: > > > > > > > > kongf at x86_64-apple-darwin13 petsc % git diff > > > > diff --git a/config/BuildSystem/config/framework.py > > > > b/config/BuildSystem/config/framework.py > > > > index beefe82956..c31fbeb95e 100644 > > > > --- a/config/BuildSystem/config/framework.py > > > > +++ b/config/BuildSystem/config/framework.py > > > > @@ -504,6 +504,8 @@ class Framework(config.base.Configure, > > > > script.LanguageProcessor): > > > > lines = [s for s in lines if s.find('Load a valid targeting > module or > > > > set CRAY_CPU_TARGET') < 0] > > > > # pgi dumps filename on stderr - but returns 0 errorcode' > > > > lines = [s for s in lines if lines != 'conftest.c:'] > > > > + # in case -pie is always being passed to linker > > > > + lines = [s for s in lines if s.find('-pie being ignored. It is > only > > > > used when linking a main executable') < 0] > > > > if lines: output = reduce(lambda s, t: s+t, lines, '\n') > > > > else: output = '' > > > > log.write("Linker stderr after filtering:\n"+output+":\n") > > > > > > > > The log was attached again. > > > > > > > > Thanks, > > > > > > > > Fande > > > > > > > > > > > > On Wed, Mar 10, 2021 at 12:05 PM Barry Smith > wrote: > > > > > > > > > Fande, > > > > > > > > > > Please add in config/BuildSystem/config/framework.py line 528 > two > > > new > > > > > lines > > > > > > > > > > # pgi dumps filename on stderr - but returns 0 errorcode' > > > > > lines = [s for s in lines if lines != 'conftest.c:'] > > > > > # in case -pie is always being passed to linker > > > > > lines = [s for s in lines if s.find('-pie being ignored. 
It > is > > > only > > > > > used when linking a main executable') < 0] > > > > > > > > > > Barry > > > > > > > > > > You have (another of Conda's "take over the world my way" > approach) > > > > > > > > > > LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs > > > -rpath > > > > > /Users/kongf/miniconda3/envs/testpetsc/lib > > > > > -L/Users/kongf/miniconda3/envs/testpetsc/lib > > > > > > > > > > Executing: mpicc -o > > > > > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest > > > > > -dynamiclib -single_module > > > > > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > > > > > Possible ERROR while running linker: > > > > > stderr: > > > > > ld: warning: -pie being ignored. It is only used when linking a > main > > > > > executable > > > > > Rejecting C linker flag -dynamiclib -single_module due > to > > > > > > > > > > ld: warning: -pie being ignored. It is only used when linking a > main > > > > > executable > > > > > > > > > > This is the correct link command for the Mac but it is being > rejected > > > due > > > > > to the warning message. > > > > > > > > > > > > > > > On Mar 10, 2021, at 10:11 AM, Fande Kong > wrote: > > > > > > > > > > Thanks, Barry, > > > > > > > > > > It seems PETSc works fine with manually built compilers. We are > pretty > > > > > much sure that the issue is related to conda. Conda might introduce > > > extra > > > > > flags. > > > > > > > > > > We still need to make it work with conda because we deliver our > package > > > > > via conda for users. > > > > > > > > > > > > > > > I unset all flags from conda, and got slightly different results > this > > > > > time. The log was attached. Anyone could explain the motivation > that > > > we > > > > > try to build executable without a main function? > > > > > > > > > > Thanks, > > > > > > > > > > Fande > > > > > > > > > > Executing: mpicc -c -o > > > > > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > > > > > > > > > -I/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers > > > > > -fPIC > > > > > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.c > > > > > > > > > > Successful compile: > > > > > Source: > > > > > #include "confdefs.h" > > > > > #include "conffix.h" > > > > > #include > > > > > int (*fprintf_ptr)(FILE*,const char*,...) 
= fprintf; > > > > > void foo(void){ > > > > > fprintf_ptr(stdout,"hello"); > > > > > return; > > > > > } > > > > > void bar(void){foo();} > > > > > Running Executable WITHOUT threads to time it out > > > > > Executing: mpicc -o > > > > > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/libconftest.so > > > > > -dynamic -fPIC > > > > > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > > > > > > > > > > Possible ERROR while running linker: exit code 1 > > > > > stderr: > > > > > Undefined symbols for architecture x86_64: > > > > > "_main", referenced from: > > > > > implicit entry/start for main executable > > > > > ld: symbol(s) not found for architecture x86_64 > > > > > clang-11: error: linker command failed with exit code 1 (use -v to > see > > > > > invocation) > > > > > Rejected C compiler flag -fPIC because it was not > compatible > > > > > with shared linker mpicc using flags ['-dynamic'] > > > > > > > > > > > > > > > On Mon, Mar 8, 2021 at 7:28 PM Barry Smith > wrote: > > > > > > > > > >> > > > > >> Fande, > > > > >> > > > > >> I see you are using CONDA, this can cause issues since it > sticks > > > all > > > > >> kinds of things into the environment. PETSc tries to remove some > of > > > them > > > > >> but perhaps not enough. If you run printenv you will see all the > mess > > > it is > > > > >> dumping in. > > > > >> > > > > >> Can you trying the same build without CONDA environment? > > > > >> > > > > >> Barry > > > > >> > > > > >> > > > > >> On Mar 8, 2021, at 7:31 PM, Matthew Knepley > > > wrote: > > > > >> > > > > >> On Mon, Mar 8, 2021 at 8:23 PM Fande Kong > > > wrote: > > > > >> > > > > >>> Thanks Matthew, > > > > >>> > > > > >>> Hmm, we still have the same issue after shutting off all unknown > > > flags. > > > > >>> > > > > >> > > > > >> Oh, I was misinterpreting the error message: > > > > >> > > > > >> ld: can't link with a main executable file > > > > >> > > > > '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' > > > > >> > > > > >> So clang did not _actually_ make a shared library, it made an > > > executable. > > > > >> Did clang-11 change the options it uses to build a shared library? > > > > >> > > > > >> Satish, do we test with clang-11? > > > > >> > > > > >> Thanks, > > > > >> > > > > >> Matt > > > > >> > > > > >> Thanks, > > > > >>> > > > > >>> Fande > > > > >>> > > > > >>> On Mon, Mar 8, 2021 at 6:07 PM Matthew Knepley < > knepley at gmail.com> > > > > >>> wrote: > > > > >>> > > > > >>>> On Mon, Mar 8, 2021 at 7:55 PM Fande Kong > > > wrote: > > > > >>>> > > > > >>>>> Hi All, > > > > >>>>> > > > > >>>>> mpicc rejected "-fPIC". Anyone has a clue how to work around > this > > > > >>>>> issue? 
> > > > >>>>> > > > > >>>> > > > > >>>> The failure is at the last step > > > > >>>> > > > > >>>> Executing: mpicc -o > > > > >>>> > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest > > > > >>>> -fPIC > > > > >>>> > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest.o > > > > >>>> > > > > -L/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers > > > > >>>> -lconftest > > > > >>>> Possible ERROR while running linker: exit code 1 > > > > >>>> stderr: > > > > >>>> ld: can't link with a main executable file > > > > >>>> > > > > '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' > > > > >>>> for architecture x86_64 > > > > >>>> clang-11: error: linker command failed with exit code 1 (use -v > to > > > see > > > > >>>> invocation) > > > > >>>> > > > > >>>> but you have some flags stuck in which may or may not affect > this. I > > > > >>>> would try shutting them off: > > > > >>>> > > > > >>>> LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs > > > -rpath > > > > >>>> /Users/kongf/miniconda3/envs/moose/lib > > > > >>>> -L/Users/kongf/miniconda3/envs/moose/lib > > > > >>>> > > > > >>>> I cannot tell exactly why clang is failing because it does not > > > report a > > > > >>>> specific error. > > > > >>>> > > > > >>>> Thanks, > > > > >>>> > > > > >>>> Matt > > > > >>>> > > > > >>>> The log was attached. > > > > >>>>> > > > > >>>>> Thanks so much, > > > > >>>>> > > > > >>>>> Fande > > > > >>>>> > > > > >>>> > > > > >>>> > > > > >>>> -- > > > > >>>> What most experimenters take for granted before they begin their > > > > >>>> experiments is infinitely more interesting than any results to > > > which their > > > > >>>> experiments lead. > > > > >>>> -- Norbert Wiener > > > > >>>> > > > > >>>> https://www.cse.buffalo.edu/~knepley/ > > > > >>>> > > > > >>>> > > > > >>> > > > > >> > > > > >> -- > > > > >> What most experimenters take for granted before they begin their > > > > >> experiments is infinitely more interesting than any results to > which > > > their > > > > >> experiments lead. > > > > >> -- Norbert Wiener > > > > >> > > > > >> https://www.cse.buffalo.edu/~knepley/ > > > > >> > > > > >> > > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Wed Mar 10 17:59:21 2021 From: fdkong.jd at gmail.com (Fande Kong) Date: Wed, 10 Mar 2021 16:59:21 -0700 Subject: [petsc-users] Exhausted all shared linker guesses. Could not determine how to create a shared library! 
In-Reply-To: References: <91DF0A8C-547A-42BC-81BA-ADFF7CA948C8@petsc.dev> <1c46a1e2-ca46-cac1-a955-529c37986ff@mcs.anl.gov> <71f053b9-6e4-af4a-fa7c-2f20704c8029@mcs.anl.gov> Message-ID: Do not know what the fix should look like, but this works for me @staticmethod @@ -1194,7 +1194,6 @@ class Configure(config.base.Configure): output.find('unrecognized command line option') >= 0 or output.find('unrecognized option') >= 0 or output.find('unrecognised option') >= 0 or output.find('not recognized') >= 0 or output.find('not recognised') >= 0 or output.find('unknown option') >= 0 or output.find('unknown flag') >= 0 or output.find('Unknown switch') >= 0 or - output.find('ignoring option') >= 0 or output.find('ignored') >= 0 or output.find('argument unused') >= 0 or output.find('not supported') >= 0 or # When checking for the existence of 'attribute' output.find('is unsupported and will be skipped') >= 0 or Thanks, Fande On Wed, Mar 10, 2021 at 4:21 PM Fande Kong wrote: > > > On Wed, Mar 10, 2021 at 1:36 PM Satish Balay wrote: > >> Can you use a different MPI for this conda install? >> > > We control how to build MPI. If I take "-pie" options out of LDFLAGS, > conda can not compile mpich. > > > > >> >> Alternative: >> >> ./configure CC=x86_64-apple-darwin13.4.0-clang COPTFLAGS="-march=core2 >> -mtune=haswell" CPPFLAGS=-I/Users/kongf/miniconda3/envs/testpetsc/include >> LDFLAGS="-Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs >> -Wl,-commons,use_dylibs" >> LIBS="-Wl,-rpath,/Users/kongf/miniconda3/envs/testpetsc/lib -lmpi -lpmpi" >> > > MPI can not generate an executable because we took out "-pie". > > Thanks, > > Fande > > >> >> etc.. [don't know if you really need LDFLAGS options] >> >> Satish >> >> On Wed, 10 Mar 2021, Fande Kong wrote: >> >> > I guess it was encoded in mpicc >> > >> > petsc % mpicc -show >> > x86_64-apple-darwin13.4.0-clang -march=core2 -mtune=haswell -Wl,-pie >> > -Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs >> > -Wl,-rpath,/Users/kongf/miniconda3/envs/testpetsc/lib >> > -L/Users/kongf/miniconda3/envs/testpetsc/lib -Wl,-commons,use_dylibs >> > -I/Users/kongf/miniconda3/envs/testpetsc/include >> > -L/Users/kongf/miniconda3/envs/testpetsc/lib -lmpi -lpmpi >> > >> > >> > Thanks, >> > >> > Fande >> > >> > On Wed, Mar 10, 2021 at 12:51 PM Satish Balay >> wrote: >> > >> > > > LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs >> -rpath >> > > /Users/kongf/miniconda3/envs/testpetsc/lib >> > > -L/Users/kongf/miniconda3/envs/testpetsc/lib >> > > >> > > Does conda compiler pick up '-pie' from this env variable? If so - >> perhaps >> > > its easier to just modify it? >> > > >> > > Or is it encoded in mpicc wrapper? [mpicc -show] >> > > >> > > Satish >> > > >> > > On Wed, 10 Mar 2021, Fande Kong wrote: >> > > >> > > > Thanks Barry, >> > > > >> > > > Got the same result, but "-pie" was not filtered out somehow. 
>> > > > >> > > > I did changes like this: >> > > > >> > > > kongf at x86_64-apple-darwin13 petsc % git diff >> > > > diff --git a/config/BuildSystem/config/framework.py >> > > > b/config/BuildSystem/config/framework.py >> > > > index beefe82956..c31fbeb95e 100644 >> > > > --- a/config/BuildSystem/config/framework.py >> > > > +++ b/config/BuildSystem/config/framework.py >> > > > @@ -504,6 +504,8 @@ class Framework(config.base.Configure, >> > > > script.LanguageProcessor): >> > > > lines = [s for s in lines if s.find('Load a valid targeting >> module or >> > > > set CRAY_CPU_TARGET') < 0] >> > > > # pgi dumps filename on stderr - but returns 0 errorcode' >> > > > lines = [s for s in lines if lines != 'conftest.c:'] >> > > > + # in case -pie is always being passed to linker >> > > > + lines = [s for s in lines if s.find('-pie being ignored. It is >> only >> > > > used when linking a main executable') < 0] >> > > > if lines: output = reduce(lambda s, t: s+t, lines, '\n') >> > > > else: output = '' >> > > > log.write("Linker stderr after filtering:\n"+output+":\n") >> > > > >> > > > The log was attached again. >> > > > >> > > > Thanks, >> > > > >> > > > Fande >> > > > >> > > > >> > > > On Wed, Mar 10, 2021 at 12:05 PM Barry Smith >> wrote: >> > > > >> > > > > Fande, >> > > > > >> > > > > Please add in config/BuildSystem/config/framework.py line 528 >> two >> > > new >> > > > > lines >> > > > > >> > > > > # pgi dumps filename on stderr - but returns 0 errorcode' >> > > > > lines = [s for s in lines if lines != 'conftest.c:'] >> > > > > # in case -pie is always being passed to linker >> > > > > lines = [s for s in lines if s.find('-pie being ignored. It >> is >> > > only >> > > > > used when linking a main executable') < 0] >> > > > > >> > > > > Barry >> > > > > >> > > > > You have (another of Conda's "take over the world my way" >> approach) >> > > > > >> > > > > LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs >> > > -rpath >> > > > > /Users/kongf/miniconda3/envs/testpetsc/lib >> > > > > -L/Users/kongf/miniconda3/envs/testpetsc/lib >> > > > > >> > > > > Executing: mpicc -o >> > > > > >> > > >> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest >> > > > > -dynamiclib -single_module >> > > > > >> > > >> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o >> > > > > Possible ERROR while running linker: >> > > > > stderr: >> > > > > ld: warning: -pie being ignored. It is only used when linking a >> main >> > > > > executable >> > > > > Rejecting C linker flag -dynamiclib -single_module >> due to >> > > > > >> > > > > ld: warning: -pie being ignored. It is only used when linking a >> main >> > > > > executable >> > > > > >> > > > > This is the correct link command for the Mac but it is being >> rejected >> > > due >> > > > > to the warning message. >> > > > > >> > > > > >> > > > > On Mar 10, 2021, at 10:11 AM, Fande Kong >> wrote: >> > > > > >> > > > > Thanks, Barry, >> > > > > >> > > > > It seems PETSc works fine with manually built compilers. We are >> pretty >> > > > > much sure that the issue is related to conda. Conda might >> introduce >> > > extra >> > > > > flags. >> > > > > >> > > > > We still need to make it work with conda because we deliver our >> package >> > > > > via conda for users. >> > > > > >> > > > > >> > > > > I unset all flags from conda, and got slightly different results >> this >> > > > > time. The log was attached. 
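For the clang-11 question just above, one way to see exactly which options a given clang passes to the Apple linker when asked for a shared library (a sketch; it assumes the conda environment's mpicc is on PATH):

  echo 'void foo(void) {}' > conftest.c
  mpicc -c -fPIC conftest.c
  # -### makes the clang driver print the ld invocation it would run, without running it
  mpicc -dynamiclib -o libconftest.dylib conftest.o -###

Comparing that printed ld command line across compiler versions would show whether the options clang uses to build a shared library have changed.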
Anyone could explain the motivation >> that >> > > we >> > > > > try to build executable without a main function? >> > > > > >> > > > > Thanks, >> > > > > >> > > > > Fande >> > > > > >> > > > > Executing: mpicc -c -o >> > > > > >> > > >> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o >> > > > > >> > > >> -I/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers >> > > > > -fPIC >> > > > > >> > > >> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.c >> > > > > >> > > > > Successful compile: >> > > > > Source: >> > > > > #include "confdefs.h" >> > > > > #include "conffix.h" >> > > > > #include >> > > > > int (*fprintf_ptr)(FILE*,const char*,...) = fprintf; >> > > > > void foo(void){ >> > > > > fprintf_ptr(stdout,"hello"); >> > > > > return; >> > > > > } >> > > > > void bar(void){foo();} >> > > > > Running Executable WITHOUT threads to time it out >> > > > > Executing: mpicc -o >> > > > > >> > > >> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/libconftest.so >> > > > > -dynamic -fPIC >> > > > > >> > > >> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o >> > > > > >> > > > > Possible ERROR while running linker: exit code 1 >> > > > > stderr: >> > > > > Undefined symbols for architecture x86_64: >> > > > > "_main", referenced from: >> > > > > implicit entry/start for main executable >> > > > > ld: symbol(s) not found for architecture x86_64 >> > > > > clang-11: error: linker command failed with exit code 1 (use -v >> to see >> > > > > invocation) >> > > > > Rejected C compiler flag -fPIC because it was not >> compatible >> > > > > with shared linker mpicc using flags ['-dynamic'] >> > > > > >> > > > > >> > > > > On Mon, Mar 8, 2021 at 7:28 PM Barry Smith >> wrote: >> > > > > >> > > > >> >> > > > >> Fande, >> > > > >> >> > > > >> I see you are using CONDA, this can cause issues since it >> sticks >> > > all >> > > > >> kinds of things into the environment. PETSc tries to remove some >> of >> > > them >> > > > >> but perhaps not enough. If you run printenv you will see all the >> mess >> > > it is >> > > > >> dumping in. >> > > > >> >> > > > >> Can you trying the same build without CONDA environment? >> > > > >> >> > > > >> Barry >> > > > >> >> > > > >> >> > > > >> On Mar 8, 2021, at 7:31 PM, Matthew Knepley >> > > wrote: >> > > > >> >> > > > >> On Mon, Mar 8, 2021 at 8:23 PM Fande Kong >> > > wrote: >> > > > >> >> > > > >>> Thanks Matthew, >> > > > >>> >> > > > >>> Hmm, we still have the same issue after shutting off all unknown >> > > flags. >> > > > >>> >> > > > >> >> > > > >> Oh, I was misinterpreting the error message: >> > > > >> >> > > > >> ld: can't link with a main executable file >> > > > >> >> > > >> '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' >> > > > >> >> > > > >> So clang did not _actually_ make a shared library, it made an >> > > executable. >> > > > >> Did clang-11 change the options it uses to build a shared >> library? >> > > > >> >> > > > >> Satish, do we test with clang-11? 
>> > > > >> >> > > > >> Thanks, >> > > > >> >> > > > >> Matt >> > > > >> >> > > > >> Thanks, >> > > > >>> >> > > > >>> Fande >> > > > >>> >> > > > >>> On Mon, Mar 8, 2021 at 6:07 PM Matthew Knepley < >> knepley at gmail.com> >> > > > >>> wrote: >> > > > >>> >> > > > >>>> On Mon, Mar 8, 2021 at 7:55 PM Fande Kong > > >> > > wrote: >> > > > >>>> >> > > > >>>>> Hi All, >> > > > >>>>> >> > > > >>>>> mpicc rejected "-fPIC". Anyone has a clue how to work around >> this >> > > > >>>>> issue? >> > > > >>>>> >> > > > >>>> >> > > > >>>> The failure is at the last step >> > > > >>>> >> > > > >>>> Executing: mpicc -o >> > > > >>>> >> > > >> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest >> > > > >>>> -fPIC >> > > > >>>> >> > > >> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest.o >> > > > >>>> >> > > >> -L/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers >> > > > >>>> -lconftest >> > > > >>>> Possible ERROR while running linker: exit code 1 >> > > > >>>> stderr: >> > > > >>>> ld: can't link with a main executable file >> > > > >>>> >> > > >> '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' >> > > > >>>> for architecture x86_64 >> > > > >>>> clang-11: error: linker command failed with exit code 1 (use >> -v to >> > > see >> > > > >>>> invocation) >> > > > >>>> >> > > > >>>> but you have some flags stuck in which may or may not affect >> this. I >> > > > >>>> would try shutting them off: >> > > > >>>> >> > > > >>>> LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs >> > > -rpath >> > > > >>>> /Users/kongf/miniconda3/envs/moose/lib >> > > > >>>> -L/Users/kongf/miniconda3/envs/moose/lib >> > > > >>>> >> > > > >>>> I cannot tell exactly why clang is failing because it does not >> > > report a >> > > > >>>> specific error. >> > > > >>>> >> > > > >>>> Thanks, >> > > > >>>> >> > > > >>>> Matt >> > > > >>>> >> > > > >>>> The log was attached. >> > > > >>>>> >> > > > >>>>> Thanks so much, >> > > > >>>>> >> > > > >>>>> Fande >> > > > >>>>> >> > > > >>>> >> > > > >>>> >> > > > >>>> -- >> > > > >>>> What most experimenters take for granted before they begin >> their >> > > > >>>> experiments is infinitely more interesting than any results to >> > > which their >> > > > >>>> experiments lead. >> > > > >>>> -- Norbert Wiener >> > > > >>>> >> > > > >>>> https://www.cse.buffalo.edu/~knepley/ >> > > > >>>> >> > > > >>>> >> > > > >>> >> > > > >> >> > > > >> -- >> > > > >> What most experimenters take for granted before they begin their >> > > > >> experiments is infinitely more interesting than any results to >> which >> > > their >> > > > >> experiments lead. >> > > > >> -- Norbert Wiener >> > > > >> >> > > > >> https://www.cse.buffalo.edu/~knepley/ >> > > > >> >> > > > >> >> > > > >> >> > > > >> >> > > > > >> > > > > >> > > > > >> > > > >> > > >> > > >> > >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Mar 10 19:24:15 2021 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 10 Mar 2021 19:24:15 -0600 Subject: [petsc-users] Exhausted all shared linker guesses. Could not determine how to create a shared library! 
In-Reply-To: References: <91DF0A8C-547A-42BC-81BA-ADFF7CA948C8@petsc.dev> <1c46a1e2-ca46-cac1-a955-529c37986ff@mcs.anl.gov> <71f053b9-6e4-af4a-fa7c-2f20704c8029@mcs.anl.gov> Message-ID: <01EF18AD-B106-4E9F-92CF-5A74541B0176@petsc.dev> > On Mar 10, 2021, at 5:59 PM, Fande Kong wrote: > > Do not know what the fix should look like, but this works for me Please clarify. Is this using the mpicc that has a -pie in the show or not? Is this using the first "fix" I sent you also? Please send your entire patch as an attachment and the successful configure.log Barry > > > @staticmethod > @@ -1194,7 +1194,6 @@ class Configure(config.base.Configure): > output.find('unrecognized command line option') >= 0 or output.find('unrecognized option') >= 0 or output.find('unrecognised option') >= 0 or > output.find('not recognized') >= 0 or output.find('not recognised') >= 0 or > output.find('unknown option') >= 0 or output.find('unknown flag') >= 0 or output.find('Unknown switch') >= 0 or > - output.find('ignoring option') >= 0 or output.find('ignored') >= 0 or > output.find('argument unused') >= 0 or output.find('not supported') >= 0 or > # When checking for the existence of 'attribute' > output.find('is unsupported and will be skipped') >= 0 or > > > > Thanks, > > Fande > > On Wed, Mar 10, 2021 at 4:21 PM Fande Kong > wrote: > > > On Wed, Mar 10, 2021 at 1:36 PM Satish Balay > wrote: > Can you use a different MPI for this conda install? > > We control how to build MPI. If I take "-pie" options out of LDFLAGS, conda can not compile mpich. > > > > > Alternative: > > ./configure CC=x86_64-apple-darwin13.4.0-clang COPTFLAGS="-march=core2 -mtune=haswell" CPPFLAGS=-I/Users/kongf/miniconda3/envs/testpetsc/include > LDFLAGS="-Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs -Wl,-commons,use_dylibs" LIBS="-Wl,-rpath,/Users/kongf/miniconda3/envs/testpetsc/lib -lmpi -lpmpi" > > MPI can not generate an executable because we took out "-pie". > > Thanks, > > Fande > > > etc.. [don't know if you really need LDFLAGS options] > > Satish > > On Wed, 10 Mar 2021, Fande Kong wrote: > > > I guess it was encoded in mpicc > > > > petsc % mpicc -show > > x86_64-apple-darwin13.4.0-clang -march=core2 -mtune=haswell -Wl,-pie > > -Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs > > -Wl,-rpath,/Users/kongf/miniconda3/envs/testpetsc/lib > > -L/Users/kongf/miniconda3/envs/testpetsc/lib -Wl,-commons,use_dylibs > > -I/Users/kongf/miniconda3/envs/testpetsc/include > > -L/Users/kongf/miniconda3/envs/testpetsc/lib -lmpi -lpmpi > > > > > > Thanks, > > > > Fande > > > > On Wed, Mar 10, 2021 at 12:51 PM Satish Balay > wrote: > > > > > > LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs -rpath > > > /Users/kongf/miniconda3/envs/testpetsc/lib > > > -L/Users/kongf/miniconda3/envs/testpetsc/lib > > > > > > Does conda compiler pick up '-pie' from this env variable? If so - perhaps > > > its easier to just modify it? > > > > > > Or is it encoded in mpicc wrapper? [mpicc -show] > > > > > > Satish > > > > > > On Wed, 10 Mar 2021, Fande Kong wrote: > > > > > > > Thanks Barry, > > > > > > > > Got the same result, but "-pie" was not filtered out somehow. 
> > > > > > > > I did changes like this: > > > > > > > > kongf at x86_64-apple-darwin13 petsc % git diff > > > > diff --git a/config/BuildSystem/config/framework.py > > > > b/config/BuildSystem/config/framework.py > > > > index beefe82956..c31fbeb95e 100644 > > > > --- a/config/BuildSystem/config/framework.py > > > > +++ b/config/BuildSystem/config/framework.py > > > > @@ -504,6 +504,8 @@ class Framework(config.base.Configure, > > > > script.LanguageProcessor): > > > > lines = [s for s in lines if s.find('Load a valid targeting module or > > > > set CRAY_CPU_TARGET') < 0] > > > > # pgi dumps filename on stderr - but returns 0 errorcode' > > > > lines = [s for s in lines if lines != 'conftest.c:'] > > > > + # in case -pie is always being passed to linker > > > > + lines = [s for s in lines if s.find('-pie being ignored. It is only > > > > used when linking a main executable') < 0] > > > > if lines: output = reduce(lambda s, t: s+t, lines, '\n') > > > > else: output = '' > > > > log.write("Linker stderr after filtering:\n"+output+":\n") > > > > > > > > The log was attached again. > > > > > > > > Thanks, > > > > > > > > Fande > > > > > > > > > > > > On Wed, Mar 10, 2021 at 12:05 PM Barry Smith > wrote: > > > > > > > > > Fande, > > > > > > > > > > Please add in config/BuildSystem/config/framework.py line 528 two > > > new > > > > > lines > > > > > > > > > > # pgi dumps filename on stderr - but returns 0 errorcode' > > > > > lines = [s for s in lines if lines != 'conftest.c:'] > > > > > # in case -pie is always being passed to linker > > > > > lines = [s for s in lines if s.find('-pie being ignored. It is > > > only > > > > > used when linking a main executable') < 0] > > > > > > > > > > Barry > > > > > > > > > > You have (another of Conda's "take over the world my way" approach) > > > > > > > > > > LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs > > > -rpath > > > > > /Users/kongf/miniconda3/envs/testpetsc/lib > > > > > -L/Users/kongf/miniconda3/envs/testpetsc/lib > > > > > > > > > > Executing: mpicc -o > > > > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest > > > > > -dynamiclib -single_module > > > > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > > > > > Possible ERROR while running linker: > > > > > stderr: > > > > > ld: warning: -pie being ignored. It is only used when linking a main > > > > > executable > > > > > Rejecting C linker flag -dynamiclib -single_module due to > > > > > > > > > > ld: warning: -pie being ignored. It is only used when linking a main > > > > > executable > > > > > > > > > > This is the correct link command for the Mac but it is being rejected > > > due > > > > > to the warning message. > > > > > > > > > > > > > > > On Mar 10, 2021, at 10:11 AM, Fande Kong > wrote: > > > > > > > > > > Thanks, Barry, > > > > > > > > > > It seems PETSc works fine with manually built compilers. We are pretty > > > > > much sure that the issue is related to conda. Conda might introduce > > > extra > > > > > flags. > > > > > > > > > > We still need to make it work with conda because we deliver our package > > > > > via conda for users. > > > > > > > > > > > > > > > I unset all flags from conda, and got slightly different results this > > > > > time. The log was attached. Anyone could explain the motivation that > > > we > > > > > try to build executable without a main function? 
> > > > > > > > > > Thanks, > > > > > > > > > > Fande > > > > > > > > > > Executing: mpicc -c -o > > > > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > > > > > > > > -I/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers > > > > > -fPIC > > > > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.c > > > > > > > > > > Successful compile: > > > > > Source: > > > > > #include "confdefs.h" > > > > > #include "conffix.h" > > > > > #include > > > > > int (*fprintf_ptr)(FILE*,const char*,...) = fprintf; > > > > > void foo(void){ > > > > > fprintf_ptr(stdout,"hello"); > > > > > return; > > > > > } > > > > > void bar(void){foo();} > > > > > Running Executable WITHOUT threads to time it out > > > > > Executing: mpicc -o > > > > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/libconftest.so > > > > > -dynamic -fPIC > > > > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > > > > > > > > > > Possible ERROR while running linker: exit code 1 > > > > > stderr: > > > > > Undefined symbols for architecture x86_64: > > > > > "_main", referenced from: > > > > > implicit entry/start for main executable > > > > > ld: symbol(s) not found for architecture x86_64 > > > > > clang-11: error: linker command failed with exit code 1 (use -v to see > > > > > invocation) > > > > > Rejected C compiler flag -fPIC because it was not compatible > > > > > with shared linker mpicc using flags ['-dynamic'] > > > > > > > > > > > > > > > On Mon, Mar 8, 2021 at 7:28 PM Barry Smith > wrote: > > > > > > > > > >> > > > > >> Fande, > > > > >> > > > > >> I see you are using CONDA, this can cause issues since it sticks > > > all > > > > >> kinds of things into the environment. PETSc tries to remove some of > > > them > > > > >> but perhaps not enough. If you run printenv you will see all the mess > > > it is > > > > >> dumping in. > > > > >> > > > > >> Can you trying the same build without CONDA environment? > > > > >> > > > > >> Barry > > > > >> > > > > >> > > > > >> On Mar 8, 2021, at 7:31 PM, Matthew Knepley > > > > wrote: > > > > >> > > > > >> On Mon, Mar 8, 2021 at 8:23 PM Fande Kong > > > > wrote: > > > > >> > > > > >>> Thanks Matthew, > > > > >>> > > > > >>> Hmm, we still have the same issue after shutting off all unknown > > > flags. > > > > >>> > > > > >> > > > > >> Oh, I was misinterpreting the error message: > > > > >> > > > > >> ld: can't link with a main executable file > > > > >> > > > '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' > > > > >> > > > > >> So clang did not _actually_ make a shared library, it made an > > > executable. > > > > >> Did clang-11 change the options it uses to build a shared library? > > > > >> > > > > >> Satish, do we test with clang-11? > > > > >> > > > > >> Thanks, > > > > >> > > > > >> Matt > > > > >> > > > > >> Thanks, > > > > >>> > > > > >>> Fande > > > > >>> > > > > >>> On Mon, Mar 8, 2021 at 6:07 PM Matthew Knepley > > > > > >>> wrote: > > > > >>> > > > > >>>> On Mon, Mar 8, 2021 at 7:55 PM Fande Kong > > > > wrote: > > > > >>>> > > > > >>>>> Hi All, > > > > >>>>> > > > > >>>>> mpicc rejected "-fPIC". Anyone has a clue how to work around this > > > > >>>>> issue? 
> > > > >>>>> > > > > >>>> > > > > >>>> The failure is at the last step > > > > >>>> > > > > >>>> Executing: mpicc -o > > > > >>>> > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest > > > > >>>> -fPIC > > > > >>>> > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest.o > > > > >>>> > > > -L/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers > > > > >>>> -lconftest > > > > >>>> Possible ERROR while running linker: exit code 1 > > > > >>>> stderr: > > > > >>>> ld: can't link with a main executable file > > > > >>>> > > > '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' > > > > >>>> for architecture x86_64 > > > > >>>> clang-11: error: linker command failed with exit code 1 (use -v to > > > see > > > > >>>> invocation) > > > > >>>> > > > > >>>> but you have some flags stuck in which may or may not affect this. I > > > > >>>> would try shutting them off: > > > > >>>> > > > > >>>> LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs > > > -rpath > > > > >>>> /Users/kongf/miniconda3/envs/moose/lib > > > > >>>> -L/Users/kongf/miniconda3/envs/moose/lib > > > > >>>> > > > > >>>> I cannot tell exactly why clang is failing because it does not > > > report a > > > > >>>> specific error. > > > > >>>> > > > > >>>> Thanks, > > > > >>>> > > > > >>>> Matt > > > > >>>> > > > > >>>> The log was attached. > > > > >>>>> > > > > >>>>> Thanks so much, > > > > >>>>> > > > > >>>>> Fande > > > > >>>>> > > > > >>>> > > > > >>>> > > > > >>>> -- > > > > >>>> What most experimenters take for granted before they begin their > > > > >>>> experiments is infinitely more interesting than any results to > > > which their > > > > >>>> experiments lead. > > > > >>>> -- Norbert Wiener > > > > >>>> > > > > >>>> https://www.cse.buffalo.edu/~knepley/ > > > > >>>> > > > > > >>>> > > > > >>> > > > > >> > > > > >> -- > > > > >> What most experimenters take for granted before they begin their > > > > >> experiments is infinitely more interesting than any results to which > > > their > > > > >> experiments lead. > > > > >> -- Norbert Wiener > > > > >> > > > > >> https://www.cse.buffalo.edu/~knepley/ > > > > >> > > > > > >> > > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Mar 10 19:30:09 2021 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 10 Mar 2021 19:30:09 -0600 Subject: [petsc-users] Exhausted all shared linker guesses. Could not determine how to create a shared library! In-Reply-To: References: <91DF0A8C-547A-42BC-81BA-ADFF7CA948C8@petsc.dev> <1c46a1e2-ca46-cac1-a955-529c37986ff@mcs.anl.gov> <71f053b9-6e4-af4a-fa7c-2f20704c8029@mcs.anl.gov> Message-ID: Fande, Before send the files I requested in my last email could you try with the branch barry/2021-03-10/handle-pie-flag-conda/release and send its configure.log if it fails. 
Thanks Barry > On Mar 10, 2021, at 5:59 PM, Fande Kong wrote: > > Do not know what the fix should look like, but this works for me > > > @staticmethod > @@ -1194,7 +1194,6 @@ class Configure(config.base.Configure): > output.find('unrecognized command line option') >= 0 or output.find('unrecognized option') >= 0 or output.find('unrecognised option') >= 0 or > output.find('not recognized') >= 0 or output.find('not recognised') >= 0 or > output.find('unknown option') >= 0 or output.find('unknown flag') >= 0 or output.find('Unknown switch') >= 0 or > - output.find('ignoring option') >= 0 or output.find('ignored') >= 0 or > output.find('argument unused') >= 0 or output.find('not supported') >= 0 or > # When checking for the existence of 'attribute' > output.find('is unsupported and will be skipped') >= 0 or > > > > Thanks, > > Fande > > On Wed, Mar 10, 2021 at 4:21 PM Fande Kong > wrote: > > > On Wed, Mar 10, 2021 at 1:36 PM Satish Balay > wrote: > Can you use a different MPI for this conda install? > > We control how to build MPI. If I take "-pie" options out of LDFLAGS, conda can not compile mpich. > > > > > Alternative: > > ./configure CC=x86_64-apple-darwin13.4.0-clang COPTFLAGS="-march=core2 -mtune=haswell" CPPFLAGS=-I/Users/kongf/miniconda3/envs/testpetsc/include > LDFLAGS="-Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs -Wl,-commons,use_dylibs" LIBS="-Wl,-rpath,/Users/kongf/miniconda3/envs/testpetsc/lib -lmpi -lpmpi" > > MPI can not generate an executable because we took out "-pie". > > Thanks, > > Fande > > > etc.. [don't know if you really need LDFLAGS options] > > Satish > > On Wed, 10 Mar 2021, Fande Kong wrote: > > > I guess it was encoded in mpicc > > > > petsc % mpicc -show > > x86_64-apple-darwin13.4.0-clang -march=core2 -mtune=haswell -Wl,-pie > > -Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs > > -Wl,-rpath,/Users/kongf/miniconda3/envs/testpetsc/lib > > -L/Users/kongf/miniconda3/envs/testpetsc/lib -Wl,-commons,use_dylibs > > -I/Users/kongf/miniconda3/envs/testpetsc/include > > -L/Users/kongf/miniconda3/envs/testpetsc/lib -lmpi -lpmpi > > > > > > Thanks, > > > > Fande > > > > On Wed, Mar 10, 2021 at 12:51 PM Satish Balay > wrote: > > > > > > LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs -rpath > > > /Users/kongf/miniconda3/envs/testpetsc/lib > > > -L/Users/kongf/miniconda3/envs/testpetsc/lib > > > > > > Does conda compiler pick up '-pie' from this env variable? If so - perhaps > > > its easier to just modify it? > > > > > > Or is it encoded in mpicc wrapper? [mpicc -show] > > > > > > Satish > > > > > > On Wed, 10 Mar 2021, Fande Kong wrote: > > > > > > > Thanks Barry, > > > > > > > > Got the same result, but "-pie" was not filtered out somehow. 
> > > > > > > > I did changes like this: > > > > > > > > kongf at x86_64-apple-darwin13 petsc % git diff > > > > diff --git a/config/BuildSystem/config/framework.py > > > > b/config/BuildSystem/config/framework.py > > > > index beefe82956..c31fbeb95e 100644 > > > > --- a/config/BuildSystem/config/framework.py > > > > +++ b/config/BuildSystem/config/framework.py > > > > @@ -504,6 +504,8 @@ class Framework(config.base.Configure, > > > > script.LanguageProcessor): > > > > lines = [s for s in lines if s.find('Load a valid targeting module or > > > > set CRAY_CPU_TARGET') < 0] > > > > # pgi dumps filename on stderr - but returns 0 errorcode' > > > > lines = [s for s in lines if lines != 'conftest.c:'] > > > > + # in case -pie is always being passed to linker > > > > + lines = [s for s in lines if s.find('-pie being ignored. It is only > > > > used when linking a main executable') < 0] > > > > if lines: output = reduce(lambda s, t: s+t, lines, '\n') > > > > else: output = '' > > > > log.write("Linker stderr after filtering:\n"+output+":\n") > > > > > > > > The log was attached again. > > > > > > > > Thanks, > > > > > > > > Fande > > > > > > > > > > > > On Wed, Mar 10, 2021 at 12:05 PM Barry Smith > wrote: > > > > > > > > > Fande, > > > > > > > > > > Please add in config/BuildSystem/config/framework.py line 528 two > > > new > > > > > lines > > > > > > > > > > # pgi dumps filename on stderr - but returns 0 errorcode' > > > > > lines = [s for s in lines if lines != 'conftest.c:'] > > > > > # in case -pie is always being passed to linker > > > > > lines = [s for s in lines if s.find('-pie being ignored. It is > > > only > > > > > used when linking a main executable') < 0] > > > > > > > > > > Barry > > > > > > > > > > You have (another of Conda's "take over the world my way" approach) > > > > > > > > > > LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs > > > -rpath > > > > > /Users/kongf/miniconda3/envs/testpetsc/lib > > > > > -L/Users/kongf/miniconda3/envs/testpetsc/lib > > > > > > > > > > Executing: mpicc -o > > > > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest > > > > > -dynamiclib -single_module > > > > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > > > > > Possible ERROR while running linker: > > > > > stderr: > > > > > ld: warning: -pie being ignored. It is only used when linking a main > > > > > executable > > > > > Rejecting C linker flag -dynamiclib -single_module due to > > > > > > > > > > ld: warning: -pie being ignored. It is only used when linking a main > > > > > executable > > > > > > > > > > This is the correct link command for the Mac but it is being rejected > > > due > > > > > to the warning message. > > > > > > > > > > > > > > > On Mar 10, 2021, at 10:11 AM, Fande Kong > wrote: > > > > > > > > > > Thanks, Barry, > > > > > > > > > > It seems PETSc works fine with manually built compilers. We are pretty > > > > > much sure that the issue is related to conda. Conda might introduce > > > extra > > > > > flags. > > > > > > > > > > We still need to make it work with conda because we deliver our package > > > > > via conda for users. > > > > > > > > > > > > > > > I unset all flags from conda, and got slightly different results this > > > > > time. The log was attached. Anyone could explain the motivation that > > > we > > > > > try to build executable without a main function? 
> > > > > > > > > > Thanks, > > > > > > > > > > Fande > > > > > > > > > > Executing: mpicc -c -o > > > > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > > > > > > > > -I/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers > > > > > -fPIC > > > > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.c > > > > > > > > > > Successful compile: > > > > > Source: > > > > > #include "confdefs.h" > > > > > #include "conffix.h" > > > > > #include > > > > > int (*fprintf_ptr)(FILE*,const char*,...) = fprintf; > > > > > void foo(void){ > > > > > fprintf_ptr(stdout,"hello"); > > > > > return; > > > > > } > > > > > void bar(void){foo();} > > > > > Running Executable WITHOUT threads to time it out > > > > > Executing: mpicc -o > > > > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/libconftest.so > > > > > -dynamic -fPIC > > > > > > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o > > > > > > > > > > Possible ERROR while running linker: exit code 1 > > > > > stderr: > > > > > Undefined symbols for architecture x86_64: > > > > > "_main", referenced from: > > > > > implicit entry/start for main executable > > > > > ld: symbol(s) not found for architecture x86_64 > > > > > clang-11: error: linker command failed with exit code 1 (use -v to see > > > > > invocation) > > > > > Rejected C compiler flag -fPIC because it was not compatible > > > > > with shared linker mpicc using flags ['-dynamic'] > > > > > > > > > > > > > > > On Mon, Mar 8, 2021 at 7:28 PM Barry Smith > wrote: > > > > > > > > > >> > > > > >> Fande, > > > > >> > > > > >> I see you are using CONDA, this can cause issues since it sticks > > > all > > > > >> kinds of things into the environment. PETSc tries to remove some of > > > them > > > > >> but perhaps not enough. If you run printenv you will see all the mess > > > it is > > > > >> dumping in. > > > > >> > > > > >> Can you trying the same build without CONDA environment? > > > > >> > > > > >> Barry > > > > >> > > > > >> > > > > >> On Mar 8, 2021, at 7:31 PM, Matthew Knepley > > > > wrote: > > > > >> > > > > >> On Mon, Mar 8, 2021 at 8:23 PM Fande Kong > > > > wrote: > > > > >> > > > > >>> Thanks Matthew, > > > > >>> > > > > >>> Hmm, we still have the same issue after shutting off all unknown > > > flags. > > > > >>> > > > > >> > > > > >> Oh, I was misinterpreting the error message: > > > > >> > > > > >> ld: can't link with a main executable file > > > > >> > > > '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' > > > > >> > > > > >> So clang did not _actually_ make a shared library, it made an > > > executable. > > > > >> Did clang-11 change the options it uses to build a shared library? > > > > >> > > > > >> Satish, do we test with clang-11? > > > > >> > > > > >> Thanks, > > > > >> > > > > >> Matt > > > > >> > > > > >> Thanks, > > > > >>> > > > > >>> Fande > > > > >>> > > > > >>> On Mon, Mar 8, 2021 at 6:07 PM Matthew Knepley > > > > > >>> wrote: > > > > >>> > > > > >>>> On Mon, Mar 8, 2021 at 7:55 PM Fande Kong > > > > wrote: > > > > >>>> > > > > >>>>> Hi All, > > > > >>>>> > > > > >>>>> mpicc rejected "-fPIC". Anyone has a clue how to work around this > > > > >>>>> issue? 
> > > > >>>>> > > > > >>>> > > > > >>>> The failure is at the last step > > > > >>>> > > > > >>>> Executing: mpicc -o > > > > >>>> > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest > > > > >>>> -fPIC > > > > >>>> > > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest.o > > > > >>>> > > > -L/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers > > > > >>>> -lconftest > > > > >>>> Possible ERROR while running linker: exit code 1 > > > > >>>> stderr: > > > > >>>> ld: can't link with a main executable file > > > > >>>> > > > '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' > > > > >>>> for architecture x86_64 > > > > >>>> clang-11: error: linker command failed with exit code 1 (use -v to > > > see > > > > >>>> invocation) > > > > >>>> > > > > >>>> but you have some flags stuck in which may or may not affect this. I > > > > >>>> would try shutting them off: > > > > >>>> > > > > >>>> LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs > > > -rpath > > > > >>>> /Users/kongf/miniconda3/envs/moose/lib > > > > >>>> -L/Users/kongf/miniconda3/envs/moose/lib > > > > >>>> > > > > >>>> I cannot tell exactly why clang is failing because it does not > > > report a > > > > >>>> specific error. > > > > >>>> > > > > >>>> Thanks, > > > > >>>> > > > > >>>> Matt > > > > >>>> > > > > >>>> The log was attached. > > > > >>>>> > > > > >>>>> Thanks so much, > > > > >>>>> > > > > >>>>> Fande > > > > >>>>> > > > > >>>> > > > > >>>> > > > > >>>> -- > > > > >>>> What most experimenters take for granted before they begin their > > > > >>>> experiments is infinitely more interesting than any results to > > > which their > > > > >>>> experiments lead. > > > > >>>> -- Norbert Wiener > > > > >>>> > > > > >>>> https://www.cse.buffalo.edu/~knepley/ > > > > >>>> > > > > > >>>> > > > > >>> > > > > >> > > > > >> -- > > > > >> What most experimenters take for granted before they begin their > > > > >> experiments is infinitely more interesting than any results to which > > > their > > > > >> experiments lead. > > > > >> -- Norbert Wiener > > > > >> > > > > >> https://www.cse.buffalo.edu/~knepley/ > > > > >> > > > > > >> > > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Wed Mar 10 20:46:23 2021 From: fdkong.jd at gmail.com (Fande Kong) Date: Wed, 10 Mar 2021 19:46:23 -0700 Subject: [petsc-users] Exhausted all shared linker guesses. Could not determine how to create a shared library! In-Reply-To: References: <91DF0A8C-547A-42BC-81BA-ADFF7CA948C8@petsc.dev> <1c46a1e2-ca46-cac1-a955-529c37986ff@mcs.anl.gov> <71f053b9-6e4-af4a-fa7c-2f20704c8029@mcs.anl.gov> Message-ID: Thanks Barry, Your branch works very well. Thanks for your help!!! Could you merge it to upstream? Fande On Wed, Mar 10, 2021 at 6:30 PM Barry Smith wrote: > > Fande, > > Before send the files I requested in my last email could you try with > the branch *barry/2021-03-10/handle-pie-flag-conda/release *and send its > configure.log if it fails. 
> > Thanks > > Barry > > > On Mar 10, 2021, at 5:59 PM, Fande Kong wrote: > > Do not know what the fix should look like, but this works for me > > > @staticmethod > @@ -1194,7 +1194,6 @@ class Configure(config.base.Configure): > output.find('unrecognized command line option') >= 0 or > output.find('unrecognized option') >= 0 or output.find('unrecognised > option') >= 0 or > output.find('not recognized') >= 0 or output.find('not recognised') > >= 0 or > output.find('unknown option') >= 0 or output.find('unknown flag') >= > 0 or output.find('Unknown switch') >= 0 or > - output.find('ignoring option') >= 0 or output.find('ignored') >= 0 or > output.find('argument unused') >= 0 or output.find('not supported') > >= 0 or > # When checking for the existence of 'attribute' > output.find('is unsupported and will be skipped') >= 0 or > > > > Thanks, > > Fande > > On Wed, Mar 10, 2021 at 4:21 PM Fande Kong wrote: > >> >> >> On Wed, Mar 10, 2021 at 1:36 PM Satish Balay wrote: >> >>> Can you use a different MPI for this conda install? >>> >> >> We control how to build MPI. If I take "-pie" options out of LDFLAGS, >> conda can not compile mpich. >> >> >> >> >>> >>> Alternative: >>> >>> ./configure CC=x86_64-apple-darwin13.4.0-clang COPTFLAGS="-march=core2 >>> -mtune=haswell" CPPFLAGS=-I/Users/kongf/miniconda3/envs/testpetsc/include >>> LDFLAGS="-Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs >>> -Wl,-commons,use_dylibs" >>> LIBS="-Wl,-rpath,/Users/kongf/miniconda3/envs/testpetsc/lib -lmpi -lpmpi" >>> >> >> MPI can not generate an executable because we took out "-pie". >> >> Thanks, >> >> Fande >> >> >>> >>> etc.. [don't know if you really need LDFLAGS options] >>> >>> Satish >>> >>> On Wed, 10 Mar 2021, Fande Kong wrote: >>> >>> > I guess it was encoded in mpicc >>> > >>> > petsc % mpicc -show >>> > x86_64-apple-darwin13.4.0-clang -march=core2 -mtune=haswell -Wl,-pie >>> > -Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs >>> > -Wl,-rpath,/Users/kongf/miniconda3/envs/testpetsc/lib >>> > -L/Users/kongf/miniconda3/envs/testpetsc/lib -Wl,-commons,use_dylibs >>> > -I/Users/kongf/miniconda3/envs/testpetsc/include >>> > -L/Users/kongf/miniconda3/envs/testpetsc/lib -lmpi -lpmpi >>> > >>> > >>> > Thanks, >>> > >>> > Fande >>> > >>> > On Wed, Mar 10, 2021 at 12:51 PM Satish Balay >>> wrote: >>> > >>> > > > LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs >>> -rpath >>> > > /Users/kongf/miniconda3/envs/testpetsc/lib >>> > > -L/Users/kongf/miniconda3/envs/testpetsc/lib >>> > > >>> > > Does conda compiler pick up '-pie' from this env variable? If so - >>> perhaps >>> > > its easier to just modify it? >>> > > >>> > > Or is it encoded in mpicc wrapper? [mpicc -show] >>> > > >>> > > Satish >>> > > >>> > > On Wed, 10 Mar 2021, Fande Kong wrote: >>> > > >>> > > > Thanks Barry, >>> > > > >>> > > > Got the same result, but "-pie" was not filtered out somehow. 
>>> > > > >>> > > > I did changes like this: >>> > > > >>> > > > kongf at x86_64-apple-darwin13 petsc % git diff >>> > > > diff --git a/config/BuildSystem/config/framework.py >>> > > > b/config/BuildSystem/config/framework.py >>> > > > index beefe82956..c31fbeb95e 100644 >>> > > > --- a/config/BuildSystem/config/framework.py >>> > > > +++ b/config/BuildSystem/config/framework.py >>> > > > @@ -504,6 +504,8 @@ class Framework(config.base.Configure, >>> > > > script.LanguageProcessor): >>> > > > lines = [s for s in lines if s.find('Load a valid targeting >>> module or >>> > > > set CRAY_CPU_TARGET') < 0] >>> > > > # pgi dumps filename on stderr - but returns 0 errorcode' >>> > > > lines = [s for s in lines if lines != 'conftest.c:'] >>> > > > + # in case -pie is always being passed to linker >>> > > > + lines = [s for s in lines if s.find('-pie being ignored. It is >>> only >>> > > > used when linking a main executable') < 0] >>> > > > if lines: output = reduce(lambda s, t: s+t, lines, '\n') >>> > > > else: output = '' >>> > > > log.write("Linker stderr after filtering:\n"+output+":\n") >>> > > > >>> > > > The log was attached again. >>> > > > >>> > > > Thanks, >>> > > > >>> > > > Fande >>> > > > >>> > > > >>> > > > On Wed, Mar 10, 2021 at 12:05 PM Barry Smith >>> wrote: >>> > > > >>> > > > > Fande, >>> > > > > >>> > > > > Please add in config/BuildSystem/config/framework.py line 528 >>> two >>> > > new >>> > > > > lines >>> > > > > >>> > > > > # pgi dumps filename on stderr - but returns 0 errorcode' >>> > > > > lines = [s for s in lines if lines != 'conftest.c:'] >>> > > > > # in case -pie is always being passed to linker >>> > > > > lines = [s for s in lines if s.find('-pie being ignored. >>> It is >>> > > only >>> > > > > used when linking a main executable') < 0] >>> > > > > >>> > > > > Barry >>> > > > > >>> > > > > You have (another of Conda's "take over the world my way" >>> approach) >>> > > > > >>> > > > > LDFLAGS_LD=-pie -headerpad_max_install_names >>> -dead_strip_dylibs >>> > > -rpath >>> > > > > /Users/kongf/miniconda3/envs/testpetsc/lib >>> > > > > -L/Users/kongf/miniconda3/envs/testpetsc/lib >>> > > > > >>> > > > > Executing: mpicc -o >>> > > > > >>> > > >>> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest >>> > > > > -dynamiclib -single_module >>> > > > > >>> > > >>> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o >>> > > > > Possible ERROR while running linker: >>> > > > > stderr: >>> > > > > ld: warning: -pie being ignored. It is only used when linking a >>> main >>> > > > > executable >>> > > > > Rejecting C linker flag -dynamiclib -single_module >>> due to >>> > > > > >>> > > > > ld: warning: -pie being ignored. It is only used when linking a >>> main >>> > > > > executable >>> > > > > >>> > > > > This is the correct link command for the Mac but it is being >>> rejected >>> > > due >>> > > > > to the warning message. >>> > > > > >>> > > > > >>> > > > > On Mar 10, 2021, at 10:11 AM, Fande Kong >>> wrote: >>> > > > > >>> > > > > Thanks, Barry, >>> > > > > >>> > > > > It seems PETSc works fine with manually built compilers. We are >>> pretty >>> > > > > much sure that the issue is related to conda. Conda might >>> introduce >>> > > extra >>> > > > > flags. >>> > > > > >>> > > > > We still need to make it work with conda because we deliver our >>> package >>> > > > > via conda for users. 
>>> > > > > >>> > > > > >>> > > > > I unset all flags from conda, and got slightly different results >>> this >>> > > > > time. The log was attached. Anyone could explain the >>> motivation that >>> > > we >>> > > > > try to build executable without a main function? >>> > > > > >>> > > > > Thanks, >>> > > > > >>> > > > > Fande >>> > > > > >>> > > > > Executing: mpicc -c -o >>> > > > > >>> > > >>> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o >>> > > > > >>> > > >>> -I/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers >>> > > > > -fPIC >>> > > > > >>> > > >>> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.c >>> > > > > >>> > > > > Successful compile: >>> > > > > Source: >>> > > > > #include "confdefs.h" >>> > > > > #include "conffix.h" >>> > > > > #include >>> > > > > int (*fprintf_ptr)(FILE*,const char*,...) = fprintf; >>> > > > > void foo(void){ >>> > > > > fprintf_ptr(stdout,"hello"); >>> > > > > return; >>> > > > > } >>> > > > > void bar(void){foo();} >>> > > > > Running Executable WITHOUT threads to time it out >>> > > > > Executing: mpicc -o >>> > > > > >>> > > >>> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/libconftest.so >>> > > > > -dynamic -fPIC >>> > > > > >>> > > >>> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o >>> > > > > >>> > > > > Possible ERROR while running linker: exit code 1 >>> > > > > stderr: >>> > > > > Undefined symbols for architecture x86_64: >>> > > > > "_main", referenced from: >>> > > > > implicit entry/start for main executable >>> > > > > ld: symbol(s) not found for architecture x86_64 >>> > > > > clang-11: error: linker command failed with exit code 1 (use -v >>> to see >>> > > > > invocation) >>> > > > > Rejected C compiler flag -fPIC because it was not >>> compatible >>> > > > > with shared linker mpicc using flags ['-dynamic'] >>> > > > > >>> > > > > >>> > > > > On Mon, Mar 8, 2021 at 7:28 PM Barry Smith >>> wrote: >>> > > > > >>> > > > >> >>> > > > >> Fande, >>> > > > >> >>> > > > >> I see you are using CONDA, this can cause issues since it >>> sticks >>> > > all >>> > > > >> kinds of things into the environment. PETSc tries to remove >>> some of >>> > > them >>> > > > >> but perhaps not enough. If you run printenv you will see all >>> the mess >>> > > it is >>> > > > >> dumping in. >>> > > > >> >>> > > > >> Can you trying the same build without CONDA environment? >>> > > > >> >>> > > > >> Barry >>> > > > >> >>> > > > >> >>> > > > >> On Mar 8, 2021, at 7:31 PM, Matthew Knepley >>> > > wrote: >>> > > > >> >>> > > > >> On Mon, Mar 8, 2021 at 8:23 PM Fande Kong >>> > > wrote: >>> > > > >> >>> > > > >>> Thanks Matthew, >>> > > > >>> >>> > > > >>> Hmm, we still have the same issue after shutting off all >>> unknown >>> > > flags. >>> > > > >>> >>> > > > >> >>> > > > >> Oh, I was misinterpreting the error message: >>> > > > >> >>> > > > >> ld: can't link with a main executable file >>> > > > >> >>> > > >>> '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' >>> > > > >> >>> > > > >> So clang did not _actually_ make a shared library, it made an >>> > > executable. >>> > > > >> Did clang-11 change the options it uses to build a shared >>> library? >>> > > > >> >>> > > > >> Satish, do we test with clang-11? 
>>> > > > >> >>> > > > >> Thanks, >>> > > > >> >>> > > > >> Matt >>> > > > >> >>> > > > >> Thanks, >>> > > > >>> >>> > > > >>> Fande >>> > > > >>> >>> > > > >>> On Mon, Mar 8, 2021 at 6:07 PM Matthew Knepley < >>> knepley at gmail.com> >>> > > > >>> wrote: >>> > > > >>> >>> > > > >>>> On Mon, Mar 8, 2021 at 7:55 PM Fande Kong < >>> fdkong.jd at gmail.com> >>> > > wrote: >>> > > > >>>> >>> > > > >>>>> Hi All, >>> > > > >>>>> >>> > > > >>>>> mpicc rejected "-fPIC". Anyone has a clue how to work around >>> this >>> > > > >>>>> issue? >>> > > > >>>>> >>> > > > >>>> >>> > > > >>>> The failure is at the last step >>> > > > >>>> >>> > > > >>>> Executing: mpicc -o >>> > > > >>>> >>> > > >>> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest >>> > > > >>>> -fPIC >>> > > > >>>> >>> > > >>> /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest.o >>> > > > >>>> >>> > > >>> -L/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers >>> > > > >>>> -lconftest >>> > > > >>>> Possible ERROR while running linker: exit code 1 >>> > > > >>>> stderr: >>> > > > >>>> ld: can't link with a main executable file >>> > > > >>>> >>> > > >>> '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' >>> > > > >>>> for architecture x86_64 >>> > > > >>>> clang-11: error: linker command failed with exit code 1 (use >>> -v to >>> > > see >>> > > > >>>> invocation) >>> > > > >>>> >>> > > > >>>> but you have some flags stuck in which may or may not affect >>> this. I >>> > > > >>>> would try shutting them off: >>> > > > >>>> >>> > > > >>>> LDFLAGS_LD=-pie -headerpad_max_install_names >>> -dead_strip_dylibs >>> > > -rpath >>> > > > >>>> /Users/kongf/miniconda3/envs/moose/lib >>> > > > >>>> -L/Users/kongf/miniconda3/envs/moose/lib >>> > > > >>>> >>> > > > >>>> I cannot tell exactly why clang is failing because it does not >>> > > report a >>> > > > >>>> specific error. >>> > > > >>>> >>> > > > >>>> Thanks, >>> > > > >>>> >>> > > > >>>> Matt >>> > > > >>>> >>> > > > >>>> The log was attached. >>> > > > >>>>> >>> > > > >>>>> Thanks so much, >>> > > > >>>>> >>> > > > >>>>> Fande >>> > > > >>>>> >>> > > > >>>> >>> > > > >>>> >>> > > > >>>> -- >>> > > > >>>> What most experimenters take for granted before they begin >>> their >>> > > > >>>> experiments is infinitely more interesting than any results to >>> > > which their >>> > > > >>>> experiments lead. >>> > > > >>>> -- Norbert Wiener >>> > > > >>>> >>> > > > >>>> https://www.cse.buffalo.edu/~knepley/ >>> > > > >>>> >>> > > > >>>> >>> > > > >>> >>> > > > >> >>> > > > >> -- >>> > > > >> What most experimenters take for granted before they begin their >>> > > > >> experiments is infinitely more interesting than any results to >>> which >>> > > their >>> > > > >> experiments lead. >>> > > > >> -- Norbert Wiener >>> > > > >> >>> > > > >> https://www.cse.buffalo.edu/~knepley/ >>> > > > >> >>> > > > >> >>> > > > >> >>> > > > >> >>> > > > > >>> > > > > >>> > > > > >>> > > > >>> > > >>> > > >>> > >>> >>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Mar 10 23:19:25 2021 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 10 Mar 2021 23:19:25 -0600 Subject: [petsc-users] Exhausted all shared linker guesses. Could not determine how to create a shared library! 
In-Reply-To: References: <91DF0A8C-547A-42BC-81BA-ADFF7CA948C8@petsc.dev> <1c46a1e2-ca46-cac1-a955-529c37986ff@mcs.anl.gov> <71f053b9-6e4-af4a-fa7c-2f20704c8029@mcs.anl.gov> Message-ID: Excellent, thanks for letting me know. It is a bit miraculous since I could not reproduce your problem (and was not willing to install Conda myself). https://gitlab.com/petsc/petsc/-/merge_requests/3703 > On Mar 10, 2021, at 8:46 PM, Fande Kong wrote: > > Thanks Barry, > > Your branch works very well. Thanks for your help!!! > > Could you merge it to upstream? > > Fande > > On Wed, Mar 10, 2021 at 6:30 PM Barry Smith > wrote: > > Fande, > > Before send the files I requested in my last email could you try with the branch barry/2021-03-10/handle-pie-flag-conda/release and send its configure.log if it fails. > > Thanks > > Barry > > >> On Mar 10, 2021, at 5:59 PM, Fande Kong > wrote: >> >> Do not know what the fix should look like, but this works for me >> >> >> @staticmethod >> @@ -1194,7 +1194,6 @@ class Configure(config.base.Configure): >> output.find('unrecognized command line option') >= 0 or output.find('unrecognized option') >= 0 or output.find('unrecognised option') >= 0 or >> output.find('not recognized') >= 0 or output.find('not recognised') >= 0 or >> output.find('unknown option') >= 0 or output.find('unknown flag') >= 0 or output.find('Unknown switch') >= 0 or >> - output.find('ignoring option') >= 0 or output.find('ignored') >= 0 or >> output.find('argument unused') >= 0 or output.find('not supported') >= 0 or >> # When checking for the existence of 'attribute' >> output.find('is unsupported and will be skipped') >= 0 or >> >> >> >> Thanks, >> >> Fande >> >> On Wed, Mar 10, 2021 at 4:21 PM Fande Kong > wrote: >> >> >> On Wed, Mar 10, 2021 at 1:36 PM Satish Balay > wrote: >> Can you use a different MPI for this conda install? >> >> We control how to build MPI. If I take "-pie" options out of LDFLAGS, conda can not compile mpich. >> >> >> >> >> Alternative: >> >> ./configure CC=x86_64-apple-darwin13.4.0-clang COPTFLAGS="-march=core2 -mtune=haswell" CPPFLAGS=-I/Users/kongf/miniconda3/envs/testpetsc/include >> LDFLAGS="-Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs -Wl,-commons,use_dylibs" LIBS="-Wl,-rpath,/Users/kongf/miniconda3/envs/testpetsc/lib -lmpi -lpmpi" >> >> MPI can not generate an executable because we took out "-pie". >> >> Thanks, >> >> Fande >> >> >> etc.. [don't know if you really need LDFLAGS options] >> >> Satish >> >> On Wed, 10 Mar 2021, Fande Kong wrote: >> >> > I guess it was encoded in mpicc >> > >> > petsc % mpicc -show >> > x86_64-apple-darwin13.4.0-clang -march=core2 -mtune=haswell -Wl,-pie >> > -Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs >> > -Wl,-rpath,/Users/kongf/miniconda3/envs/testpetsc/lib >> > -L/Users/kongf/miniconda3/envs/testpetsc/lib -Wl,-commons,use_dylibs >> > -I/Users/kongf/miniconda3/envs/testpetsc/include >> > -L/Users/kongf/miniconda3/envs/testpetsc/lib -lmpi -lpmpi >> > >> > >> > Thanks, >> > >> > Fande >> > >> > On Wed, Mar 10, 2021 at 12:51 PM Satish Balay > wrote: >> > >> > > > LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs -rpath >> > > /Users/kongf/miniconda3/envs/testpetsc/lib >> > > -L/Users/kongf/miniconda3/envs/testpetsc/lib >> > > >> > > Does conda compiler pick up '-pie' from this env variable? If so - perhaps >> > > its easier to just modify it? >> > > >> > > Or is it encoded in mpicc wrapper? 
[mpicc -show] >> > > >> > > Satish >> > > >> > > On Wed, 10 Mar 2021, Fande Kong wrote: >> > > >> > > > Thanks Barry, >> > > > >> > > > Got the same result, but "-pie" was not filtered out somehow. >> > > > >> > > > I did changes like this: >> > > > >> > > > kongf at x86_64-apple-darwin13 petsc % git diff >> > > > diff --git a/config/BuildSystem/config/framework.py >> > > > b/config/BuildSystem/config/framework.py >> > > > index beefe82956..c31fbeb95e 100644 >> > > > --- a/config/BuildSystem/config/framework.py >> > > > +++ b/config/BuildSystem/config/framework.py >> > > > @@ -504,6 +504,8 @@ class Framework(config.base.Configure, >> > > > script.LanguageProcessor): >> > > > lines = [s for s in lines if s.find('Load a valid targeting module or >> > > > set CRAY_CPU_TARGET') < 0] >> > > > # pgi dumps filename on stderr - but returns 0 errorcode' >> > > > lines = [s for s in lines if lines != 'conftest.c:'] >> > > > + # in case -pie is always being passed to linker >> > > > + lines = [s for s in lines if s.find('-pie being ignored. It is only >> > > > used when linking a main executable') < 0] >> > > > if lines: output = reduce(lambda s, t: s+t, lines, '\n') >> > > > else: output = '' >> > > > log.write("Linker stderr after filtering:\n"+output+":\n") >> > > > >> > > > The log was attached again. >> > > > >> > > > Thanks, >> > > > >> > > > Fande >> > > > >> > > > >> > > > On Wed, Mar 10, 2021 at 12:05 PM Barry Smith > wrote: >> > > > >> > > > > Fande, >> > > > > >> > > > > Please add in config/BuildSystem/config/framework.py line 528 two >> > > new >> > > > > lines >> > > > > >> > > > > # pgi dumps filename on stderr - but returns 0 errorcode' >> > > > > lines = [s for s in lines if lines != 'conftest.c:'] >> > > > > # in case -pie is always being passed to linker >> > > > > lines = [s for s in lines if s.find('-pie being ignored. It is >> > > only >> > > > > used when linking a main executable') < 0] >> > > > > >> > > > > Barry >> > > > > >> > > > > You have (another of Conda's "take over the world my way" approach) >> > > > > >> > > > > LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs >> > > -rpath >> > > > > /Users/kongf/miniconda3/envs/testpetsc/lib >> > > > > -L/Users/kongf/miniconda3/envs/testpetsc/lib >> > > > > >> > > > > Executing: mpicc -o >> > > > > >> > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest >> > > > > -dynamiclib -single_module >> > > > > >> > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o >> > > > > Possible ERROR while running linker: >> > > > > stderr: >> > > > > ld: warning: -pie being ignored. It is only used when linking a main >> > > > > executable >> > > > > Rejecting C linker flag -dynamiclib -single_module due to >> > > > > >> > > > > ld: warning: -pie being ignored. It is only used when linking a main >> > > > > executable >> > > > > >> > > > > This is the correct link command for the Mac but it is being rejected >> > > due >> > > > > to the warning message. >> > > > > >> > > > > >> > > > > On Mar 10, 2021, at 10:11 AM, Fande Kong > wrote: >> > > > > >> > > > > Thanks, Barry, >> > > > > >> > > > > It seems PETSc works fine with manually built compilers. We are pretty >> > > > > much sure that the issue is related to conda. Conda might introduce >> > > extra >> > > > > flags. >> > > > > >> > > > > We still need to make it work with conda because we deliver our package >> > > > > via conda for users. 
>> > > > > >> > > > > >> > > > > I unset all flags from conda, and got slightly different results this >> > > > > time. The log was attached. Anyone could explain the motivation that >> > > we >> > > > > try to build executable without a main function? >> > > > > >> > > > > Thanks, >> > > > > >> > > > > Fande >> > > > > >> > > > > Executing: mpicc -c -o >> > > > > >> > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o >> > > > > >> > > -I/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers >> > > > > -fPIC >> > > > > >> > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.c >> > > > > >> > > > > Successful compile: >> > > > > Source: >> > > > > #include "confdefs.h" >> > > > > #include "conffix.h" >> > > > > #include >> > > > > int (*fprintf_ptr)(FILE*,const char*,...) = fprintf; >> > > > > void foo(void){ >> > > > > fprintf_ptr(stdout,"hello"); >> > > > > return; >> > > > > } >> > > > > void bar(void){foo();} >> > > > > Running Executable WITHOUT threads to time it out >> > > > > Executing: mpicc -o >> > > > > >> > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/libconftest.so >> > > > > -dynamic -fPIC >> > > > > >> > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-pkset22y/config.setCompilers/conftest.o >> > > > > >> > > > > Possible ERROR while running linker: exit code 1 >> > > > > stderr: >> > > > > Undefined symbols for architecture x86_64: >> > > > > "_main", referenced from: >> > > > > implicit entry/start for main executable >> > > > > ld: symbol(s) not found for architecture x86_64 >> > > > > clang-11: error: linker command failed with exit code 1 (use -v to see >> > > > > invocation) >> > > > > Rejected C compiler flag -fPIC because it was not compatible >> > > > > with shared linker mpicc using flags ['-dynamic'] >> > > > > >> > > > > >> > > > > On Mon, Mar 8, 2021 at 7:28 PM Barry Smith > wrote: >> > > > > >> > > > >> >> > > > >> Fande, >> > > > >> >> > > > >> I see you are using CONDA, this can cause issues since it sticks >> > > all >> > > > >> kinds of things into the environment. PETSc tries to remove some of >> > > them >> > > > >> but perhaps not enough. If you run printenv you will see all the mess >> > > it is >> > > > >> dumping in. >> > > > >> >> > > > >> Can you trying the same build without CONDA environment? >> > > > >> >> > > > >> Barry >> > > > >> >> > > > >> >> > > > >> On Mar 8, 2021, at 7:31 PM, Matthew Knepley > >> > > wrote: >> > > > >> >> > > > >> On Mon, Mar 8, 2021 at 8:23 PM Fande Kong > >> > > wrote: >> > > > >> >> > > > >>> Thanks Matthew, >> > > > >>> >> > > > >>> Hmm, we still have the same issue after shutting off all unknown >> > > flags. >> > > > >>> >> > > > >> >> > > > >> Oh, I was misinterpreting the error message: >> > > > >> >> > > > >> ld: can't link with a main executable file >> > > > >> >> > > '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' >> > > > >> >> > > > >> So clang did not _actually_ make a shared library, it made an >> > > executable. >> > > > >> Did clang-11 change the options it uses to build a shared library? >> > > > >> >> > > > >> Satish, do we test with clang-11? 
>> > > > >> >> > > > >> Thanks, >> > > > >> >> > > > >> Matt >> > > > >> >> > > > >> Thanks, >> > > > >>> >> > > > >>> Fande >> > > > >>> >> > > > >>> On Mon, Mar 8, 2021 at 6:07 PM Matthew Knepley > >> > > > >>> wrote: >> > > > >>> >> > > > >>>> On Mon, Mar 8, 2021 at 7:55 PM Fande Kong > >> > > wrote: >> > > > >>>> >> > > > >>>>> Hi All, >> > > > >>>>> >> > > > >>>>> mpicc rejected "-fPIC". Anyone has a clue how to work around this >> > > > >>>>> issue? >> > > > >>>>> >> > > > >>>> >> > > > >>>> The failure is at the last step >> > > > >>>> >> > > > >>>> Executing: mpicc -o >> > > > >>>> >> > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest >> > > > >>>> -fPIC >> > > > >>>> >> > > /var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/conftest.o >> > > > >>>> >> > > -L/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers >> > > > >>>> -lconftest >> > > > >>>> Possible ERROR while running linker: exit code 1 >> > > > >>>> stderr: >> > > > >>>> ld: can't link with a main executable file >> > > > >>>> >> > > '/var/folders/tv/ljnkj46x3nq45cp9tbkc000c0000gn/T/petsc-6v1w4q4u/config.setCompilers/libconftest.dylib' >> > > > >>>> for architecture x86_64 >> > > > >>>> clang-11: error: linker command failed with exit code 1 (use -v to >> > > see >> > > > >>>> invocation) >> > > > >>>> >> > > > >>>> but you have some flags stuck in which may or may not affect this. I >> > > > >>>> would try shutting them off: >> > > > >>>> >> > > > >>>> LDFLAGS_LD=-pie -headerpad_max_install_names -dead_strip_dylibs >> > > -rpath >> > > > >>>> /Users/kongf/miniconda3/envs/moose/lib >> > > > >>>> -L/Users/kongf/miniconda3/envs/moose/lib >> > > > >>>> >> > > > >>>> I cannot tell exactly why clang is failing because it does not >> > > report a >> > > > >>>> specific error. >> > > > >>>> >> > > > >>>> Thanks, >> > > > >>>> >> > > > >>>> Matt >> > > > >>>> >> > > > >>>> The log was attached. >> > > > >>>>> >> > > > >>>>> Thanks so much, >> > > > >>>>> >> > > > >>>>> Fande >> > > > >>>>> >> > > > >>>> >> > > > >>>> >> > > > >>>> -- >> > > > >>>> What most experimenters take for granted before they begin their >> > > > >>>> experiments is infinitely more interesting than any results to >> > > which their >> > > > >>>> experiments lead. >> > > > >>>> -- Norbert Wiener >> > > > >>>> >> > > > >>>> https://www.cse.buffalo.edu/~knepley/ >> > > > >>>> > >> > > > >>>> >> > > > >>> >> > > > >> >> > > > >> -- >> > > > >> What most experimenters take for granted before they begin their >> > > > >> experiments is infinitely more interesting than any results to which >> > > their >> > > > >> experiments lead. >> > > > >> -- Norbert Wiener >> > > > >> >> > > > >> https://www.cse.buffalo.edu/~knepley/ >> > > > >> > >> > > > >> >> > > > >> >> > > > >> >> > > > > >> > > > > >> > > > > >> > > > >> > > >> > > >> > >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mathieu.dutour at gmail.com Thu Mar 11 02:26:23 2021 From: mathieu.dutour at gmail.com (Mathieu Dutour) Date: Thu, 11 Mar 2021 09:26:23 +0100 Subject: [petsc-users] Block unstructured grid Message-ID: Dear all, I would like to work with a special kind of linear system that ought to be very common but I am not sure that it is possible in PETSC. What we have is an unstructured grid with say 3.10^5 nodes in it. At each node, we have a number of frequency/direction and together this makes about 1000 values at the node. 
So, in total the linear system has say 3.10^8 values. We managed to implement this system with Petsc but the performance was unsatisfactory. We think that Petsc is not exploiting the special structure of the matrix and we wonder if this structure can be implemented in Petsc. By special structure we mean the following. An entry in the linear system is of the form (i, j) with 1<=i<=1000 and 1<=j<=N with N = 3.10^5. The node (i , j) is adjacent to all the nodes (i' , j) and thus they make a block diagonal entry. But the node (i , j) is also adjacent to some nodes (i , j') [About 6 such nodes, but it varies]. Would there be a way to exploit this special structure in Petsc? I think this should be fairly common and significant speedup could be obtained. Best, Mathieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From thijs.smit at hest.ethz.ch Thu Mar 11 02:36:45 2021 From: thijs.smit at hest.ethz.ch (Smit Thijs) Date: Thu, 11 Mar 2021 08:36:45 +0000 Subject: [petsc-users] Outputting cell data in stead of point data while writing .vtr file Message-ID: <0c6a1fcc56784bfbb2cb9bf01a9d0586@hest.ethz.ch> Hi All, I am outputting several vectors to a .vtr file successfully for viewing in Paraview. At this moment the information is written to point data. How can I change this and make sure the data is written to cell data? The code I am currently using for outputting: PetscViewer viewer; ierr = PetscViewerVTKOpen(PETSC_COMM_WORLD, "test.vtr", FILE_MODE_WRITE, &viewer); CHKERRQ(ierr); ierr = DMView(nd, viewer); CHKERRQ(ierr); PetscObjectSetName((PetscObject)xPhys,"xPhys"); ierr = VecView(xPhys, viewer); CHKERRQ(ierr); PetscObjectSetName((PetscObject)S,"SvonMises"); ierr = VecView(S, viewer); CHKERRQ(ierr); ierr = PetscViewerDestroy(&viewer); CHKERRQ(ierr); Best regards, Thijs Smit PhD Candidate ETH Zurich Institute for Biomechanics -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu Mar 11 06:52:29 2021 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 11 Mar 2021 07:52:29 -0500 Subject: [petsc-users] Block unstructured grid In-Reply-To: References: Message-ID: Mathieu, We have "FieldSplit" support for fields, but I don't know if it has ever been pushed to 1000's of fields so it might fall down. It might work. FieldSplit lets you manipulate the ordering, say field major (j) or node major (i). What was unsatisfactory? It sounds like you made a rectangular matrix A(1000,3e5) . Is that correct? Mark On Thu, Mar 11, 2021 at 3:27 AM Mathieu Dutour wrote: > Dear all, > > I would like to work with a special kind of linear system that ought to be > very common but I am not sure that it is possible in PETSC. > > What we have is an unstructured grid with say 3.10^5 nodes in it. > At each node, we have a number of frequency/direction and together > this makes about 1000 values at the node. So, in total the linear system > has say 3.10^8 values. > > We managed to implement this system with Petsc but the performance > was unsatisfactory. We think that Petsc is not exploiting the special > structure of the matrix and we wonder if this structure can be implemented > in Petsc. > > By special structure we mean the following. An entry in the linear system > is of the form (i, j) with 1<=i<=1000 and 1<=j<=N with N = 3.10^5. > The node (i , j) is adjacent to all the nodes (i' , j) and thus they make > a block > diagonal entry. But the node (i , j) is also adjacent to some nodes (i , > j') > [About 6 such nodes, but it varies]. 
> > Would there be a way to exploit this special structure in Petsc? I think > this should be fairly common and significant speedup could be obtained. > > Best, > > Mathieu > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Mar 11 07:07:51 2021 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 11 Mar 2021 08:07:51 -0500 Subject: [petsc-users] Outputting cell data in stead of point data while writing .vtr file In-Reply-To: <0c6a1fcc56784bfbb2cb9bf01a9d0586@hest.ethz.ch> References: <0c6a1fcc56784bfbb2cb9bf01a9d0586@hest.ethz.ch> Message-ID: What kind of DM is it? Thanks, Matt On Thu, Mar 11, 2021 at 3:36 AM Smit Thijs wrote: > Hi All, > > > > I am outputting several vectors to a .vtr file successfully for viewing in > Paraview. At this moment the information is written to point data. How can > I change this and make sure the data is written to cell data? > > > > The code I am currently using for outputting: > > > > PetscViewer viewer; > > > > ierr = PetscViewerVTKOpen(PETSC_COMM_WORLD, ?test.vtr?, FILE_MODE_WRITE, > &viewer); > > CHKERRQ(ierr); > > > > ierr = DMView(nd, viewer); > > CHKERRQ(ierr); > > > > PetscObjectSetName((PetscObject)xPhys,"xPhys"); > > ierr = VecView(xPhys, viewer); > > CHKERRQ(ierr); > > > > PetscObjectSetName((PetscObject)S,"SvonMises"); > > ierr = VecView(S, viewer); > > CHKERRQ(ierr); > > > > ierr = PetscViewerDestroy(&viewer); > > CHKERRQ(ierr); > > > > Best regards, > > > > Thijs Smit > > > > PhD Candidate > > ETH Zurich > > Institute for Biomechanics > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From thijs.smit at hest.ethz.ch Thu Mar 11 07:17:01 2021 From: thijs.smit at hest.ethz.ch (Smit Thijs) Date: Thu, 11 Mar 2021 13:17:01 +0000 Subject: [petsc-users] Outputting cell data in stead of point data while writing .vtr file In-Reply-To: References: <0c6a1fcc56784bfbb2cb9bf01a9d0586@hest.ethz.ch> Message-ID: <54dc57ab87464bc2a5da663226c11805@hest.ethz.ch> Hi Matt, Actually I have two 3D DMDA?s, one for the nodal data, where the FEM is solved on. The other DMDA is a cell centered one for the volume data, like the density of a particular voxel. Ideally I would like to write both point data (displacement field) and cell data (density) to the vtr. Code for DMDA. DMBoundaryType bx = DM_BOUNDARY_NONE; DMBoundaryType by = DM_BOUNDARY_NONE; DMBoundaryType bz = DM_BOUNDARY_NONE; DMDAStencilType stype = DMDA_STENCIL_BOX; PetscInt stencilwidth = 1; // Create the nodal mesh ierr = DMDACreate3d(PETSC_COMM_WORLD, bx, by, bz, stype, nx, ny, nz, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE, numnodaldof, stencilwidth, 0, 0, 0, &(da_nodes)); CHKERRQ(ierr); DMSetFromOptions(da_nodes); DMSetUp(da_nodes); ierr = DMDASetUniformCoordinates(da_nodes, xmin, xmax, ymin, ymax, zmin, zmax); CHKERRQ(ierr); ierr = DMDASetElementType(da_nodes, DMDA_ELEMENT_Q1); CHKERRQ(ierr); Best, Thijs From: Matthew Knepley Sent: 11 March 2021 14:08 To: Smit Thijs Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Outputting cell data in stead of point data while writing .vtr file What kind of DM is it? Thanks, Matt On Thu, Mar 11, 2021 at 3:36 AM Smit Thijs > wrote: Hi All, I am outputting several vectors to a .vtr file successfully for viewing in Paraview. 
At this moment the information is written to point data. How can I change this and make sure the data is written to cell data? The code I am currently using for outputting: PetscViewer viewer; ierr = PetscViewerVTKOpen(PETSC_COMM_WORLD, ?test.vtr?, FILE_MODE_WRITE, &viewer); CHKERRQ(ierr); ierr = DMView(nd, viewer); CHKERRQ(ierr); PetscObjectSetName((PetscObject)xPhys,"xPhys"); ierr = VecView(xPhys, viewer); CHKERRQ(ierr); PetscObjectSetName((PetscObject)S,"SvonMises"); ierr = VecView(S, viewer); CHKERRQ(ierr); ierr = PetscViewerDestroy(&viewer); CHKERRQ(ierr); Best regards, Thijs Smit PhD Candidate ETH Zurich Institute for Biomechanics -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Mar 11 07:19:23 2021 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 11 Mar 2021 08:19:23 -0500 Subject: [petsc-users] Block unstructured grid In-Reply-To: References: Message-ID: On Thu, Mar 11, 2021 at 3:27 AM Mathieu Dutour wrote: > Dear all, > > I would like to work with a special kind of linear system that ought to be > very common but I am not sure that it is possible in PETSC. > > What we have is an unstructured grid with say 3.10^5 nodes in it. > At each node, we have a number of frequency/direction and together > this makes about 1000 values at the node. So, in total the linear system > has say 3.10^8 values. > > We managed to implement this system with Petsc but the performance > was unsatisfactory. We think that Petsc is not exploiting the special > structure of the matrix and we wonder if this structure can be implemented > in Petsc. > > By special structure we mean the following. An entry in the linear system > is of the form (i, j) with 1<=i<=1000 and 1<=j<=N with N = 3.10^5. > The node (i , j) is adjacent to all the nodes (i' , j) and thus they make > a block > diagonal entry. But the node (i , j) is also adjacent to some nodes (i , > j') > [About 6 such nodes, but it varies]. > I do not understand this explanation "(i, j) with 1<=i<=1000 and 1<=j<=N". Your linear system is rectangular of size (1000, N)? Do you mean instead that each entry in the linear system (i, j) is a 1000x1000 block? We might have something that can help. If the structure of each entry is the same, we have this class https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATKAIJ.html This would be writing the system as a Kronecker product A \otimes T where T is your 1000x1000 matrix. We have only run this for moderate size T, say 16x16, so further optimizations might be necessary, but it is a place to start. Is it possible to write your system in this way? Thanks, Matt > Would there be a way to exploit this special structure in Petsc? I think > this should be fairly common and significant speedup could be obtained. > > Best, > > Mathieu > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From snailsoar at hotmail.com Thu Mar 11 07:35:34 2021 From: snailsoar at hotmail.com (feng wang) Date: Thu, 11 Mar 2021 13:35:34 +0000 Subject: [petsc-users] Questions on matrix-free GMRES implementation Message-ID: Dear All, I am new to petsc and trying to implement a matrix-free GMRES. I have assembled an approximate Jacobian matrix just for preconditioning. After reading some previous questions on this topic, my approach is: the matrix-free matrix is created as: ierr = MatCreateMFFD(*A_COMM_WORLD, iqe*blocksize, iqe*blocksize, PETSC_DETERMINE, PETSC_DETERMINE, &petsc_A_mf); CHKERRQ(ierr); ierr = MatMFFDSetFunction(petsc_A_mf, FormFunction_mf, this); CHKERRQ(ierr); KSP linear operator is set up as: ierr = KSPSetOperators(petsc_ksp, petsc_A_mf, petsc_A_pre); CHKERRQ(ierr); //petsc_A_pre is my assembled pre-conditioning matrix Before calling KSPSolve, I do: ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); //petsc_csv is the flow states, petsc_baserhs is the pre-computed right hand side The call back function is defined as: PetscErrorCode cFdDomain::FormFunction_mf(void *ctx, Vec in_vec, Vec out_vec) { PetscErrorCode ierr; cFdDomain *user_ctx; cout << "FormFunction_mf called\n"; //in_vec: flow states //out_vec: right hand side + diagonal contributions from CFL number user_ctx = (cFdDomain*)ctx; //get perturbed conservative variables from petsc user_ctx->petsc_getcsv(in_vec); //get new right side user_ctx->petsc_fd_rhs(); //set new right hand side to the output vector user_ctx->petsc_setrhs(out_vec); ierr = 0; return ierr; } The linear system I am solving is (J+D)x=RHS. J is the Jacobian matrix. D is a diagonal matrix and it is used to stabilise the solution at the start but reduced gradually when the solution moves on to recover Newton's method. I add D*x to the true right side when non-linear function is computed to work out finite difference Jacobian, so when finite difference is used, it actually computes (J+D)*dx. The code runs but diverges in the end. If I don't do matrix-free and use my approximate Jacobian matrix, GMRES works. So something is wrong with my matrix-free implementation. Have I missed something in my implementation? Besides, is there a way to check if the finite difference Jacobian matrix is computed correctly in a matrix-free implementation? Thanks for your help in advance. Feng -------------- next part -------------- An HTML attachment was scrubbed... URL: From mathieu.dutour at gmail.com Thu Mar 11 07:48:50 2021 From: mathieu.dutour at gmail.com (Mathieu Dutour) Date: Thu, 11 Mar 2021 14:48:50 +0100 Subject: [petsc-users] Block unstructured grid In-Reply-To: References: Message-ID: On Thu, 11 Mar 2021 at 13:52, Mark Adams wrote: > Mathieu, > We have "FieldSplit" support for fields, but I don't know if it has ever > been pushed to 1000's of fields so it might fall down. It might work. > FieldSplit lets you manipulate the ordering, say field major (j) or node > major (i). > I just looked at it and FieldSplit appears to be used in preconditioner so not exactly relevant. What was unsatisfactory? > It sounds like you made a rectangular matrix A(1000,3e5) . Is that correct? > That is incorrect. The matrix is of size (N, N) with N = 1000 * 3e^5. It is a square matrix coming from an implicit scheme. Since the other answer appears to have the same misunderstanding, let me try to re-explain my point: --- In many contexts we need a partial differential equation that is not scalar. 
For example, the shallow water equation has b = 3 fields: H, HU, HV. There are other examples like wave modelling where we have something like b = 1000 fields (in a discretization). --- So, if we work with say an unstructured grid with N nodes then the total number of variables of the system will be N_tot = 3N or N_tot = 1000N. The linear system has N_tot unknowns and N_tot equations. The entries can be written as idx = (i , j) with 1 <= i <= b and 1 <= j <= N. Thus the non-zero entries in the matrix will be of two kinds: --- (idx1, idx2) with idx1 = (i , j) and idx2 = (i' , j) , 1 <= i, i' <= b and 1 <= j <= N. Together those define a block in the matrix. --- (idx1, idx2) with idx1 = (i , j) and idx2 = (i, j'), 1<= i <= b and 1<= j, j' <= N. For each unknown idx1, there will be about 6 unknowns idx2 of this form. Otherwise, the block matrices do not have the same coefficients, so a tensor product approach does not appear to be workable. Mathieu > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Mar 11 07:54:05 2021 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 11 Mar 2021 08:54:05 -0500 Subject: [petsc-users] Outputting cell data in stead of point data while writing .vtr file In-Reply-To: <54dc57ab87464bc2a5da663226c11805@hest.ethz.ch> References: <0c6a1fcc56784bfbb2cb9bf01a9d0586@hest.ethz.ch> <54dc57ab87464bc2a5da663226c11805@hest.ethz.ch> Message-ID: On Thu, Mar 11, 2021 at 8:17 AM Smit Thijs wrote: > Hi Matt, > > > > Actually I have two 3D DMDA?s, one for the nodal data, where the FEM is > solved on. The other DMDA is a cell centered one for the volume data, like > the density of a particular voxel. Ideally I would like to write both point > data (displacement field) and cell data (density) to the vtr. > > > > Code for DMDA. > > > > DMBoundaryType bx = DM_BOUNDARY_NONE; > > DMBoundaryType by = DM_BOUNDARY_NONE; > > DMBoundaryType bz = DM_BOUNDARY_NONE; > > > > DMDAStencilType stype = DMDA_STENCIL_BOX; > > > > PetscInt stencilwidth = 1; > > > > // Create the nodal mesh > > ierr = DMDACreate3d(PETSC_COMM_WORLD, bx, by, bz, stype, nx, ny, nz, > PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE, > > numnodaldof, stencilwidth, 0, 0, 0, &(da_nodes)); > > CHKERRQ(ierr); > > > > DMSetFromOptions(da_nodes); > > DMSetUp(da_nodes); > > > > ierr = DMDASetUniformCoordinates(da_nodes, xmin, xmax, ymin, ymax, > zmin, zmax); > > CHKERRQ(ierr); > > > > ierr = DMDASetElementType(da_nodes, DMDA_ELEMENT_Q1); > > CHKERRQ(ierr); > When I wrote the original version which output to ASCII VTK, we allowed switching between point data and cell data. It is fragile since you have to assure that the different grids match properly. When the viewer was rewritten to use the XML, all output was point data. It looks like it would take code changes to get this done. Thanks, Matt > Best, Thijs > > > > *From:* Matthew Knepley > *Sent:* 11 March 2021 14:08 > *To:* Smit Thijs > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Outputting cell data in stead of point data > while writing .vtr file > > > > What kind of DM is it? > > > > Thanks, > > > > Matt > > > > On Thu, Mar 11, 2021 at 3:36 AM Smit Thijs > wrote: > > Hi All, > > > > I am outputting several vectors to a .vtr file successfully for viewing in > Paraview. At this moment the information is written to point data. How can > I change this and make sure the data is written to cell data? 
> > > > The code I am currently using for outputting: > > > > PetscViewer viewer; > > > > ierr = PetscViewerVTKOpen(PETSC_COMM_WORLD, ?test.vtr?, FILE_MODE_WRITE, > &viewer); > > CHKERRQ(ierr); > > > > ierr = DMView(nd, viewer); > > CHKERRQ(ierr); > > > > PetscObjectSetName((PetscObject)xPhys,"xPhys"); > > ierr = VecView(xPhys, viewer); > > CHKERRQ(ierr); > > > > PetscObjectSetName((PetscObject)S,"SvonMises"); > > ierr = VecView(S, viewer); > > CHKERRQ(ierr); > > > > ierr = PetscViewerDestroy(&viewer); > > CHKERRQ(ierr); > > > > Best regards, > > > > Thijs Smit > > > > PhD Candidate > > ETH Zurich > > Institute for Biomechanics > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Thu Mar 11 08:01:11 2021 From: pierre at joliv.et (Pierre Jolivet) Date: Thu, 11 Mar 2021 15:01:11 +0100 Subject: [petsc-users] Block unstructured grid In-Reply-To: References: Message-ID: <138BD618-C3A5-421B-B087-6F871069117B@joliv.et> > On 11 Mar 2021, at 2:48 PM, Mathieu Dutour wrote: > > On Thu, 11 Mar 2021 at 13:52, Mark Adams > wrote: > Mathieu, > We have "FieldSplit" support for fields, but I don't know if it has ever been pushed to 1000's of fields so it might fall down. It might work. > FieldSplit lets you manipulate the ordering, say field major (j) or node major (i). > I just looked at it and FieldSplit appears to be used in preconditioner so not exactly relevant. > > What was unsatisfactory? > It sounds like you made a rectangular matrix A(1000,3e5) . Is that correct? > That is incorrect. The matrix is of size (N, N) with N = 1000 * 3e^5. It is a square > matrix coming from an implicit scheme. > > Since the other answer appears to have the same misunderstanding, let me try > to re-explain my point: > --- In many contexts we need a partial differential equation that is not scalar. > For example, the shallow water equation has b = 3 fields: H, HU, HV. There are other > examples like wave modelling where we have something like b = 1000 fields (in a > discretization). I think what you want is MatBAIJ (https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatCreateBAIJ.html ) with bs = 1000, but just like Matt and Mark, I?m not quite sure I understand your notations from below 100%. Thanks, Pierre > --- So, if we work with say an unstructured grid with N nodes then the total number > of variables of the system will be N_tot = 3N or N_tot = 1000N. > > The linear system has N_tot unknowns and N_tot equations. The entries > can be written as idx = (i , j) with 1 <= i <= b and 1 <= j <= N. > > Thus the non-zero entries in the matrix will be of two kinds: > --- (idx1, idx2) with idx1 = (i , j) and idx2 = (i' , j) , 1 <= i, i' <= b and 1 <= j <= N. > Together those define a block in the matrix. > > --- (idx1, idx2) with idx1 = (i , j) and idx2 = (i, j'), 1<= i <= b and 1<= j, j' <= N. > For each unknown idx1, there will be about 6 unknowns idx2 of this form. > > Otherwise, the block matrices do not have the same coefficients, so a tensor > product approach does not appear to be workable. 
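To make the MatCreateBAIJ suggestion above concrete: a minimal sketch of creating such a block matrix, in which the block size b, the per-rank node count nLocalNodes and the preallocation estimates (one diagonal block per node plus roughly six neighbour blocks) are illustrative assumptions, not values from the actual application.

    #include <petscmat.h>

    /* Sketch only: create a MATBAIJ matrix whose b x b blocks couple whole
       nodes; b and nLocalNodes are assumed, illustrative parameters.        */
    static PetscErrorCode CreateBlockMatrix(MPI_Comm comm, PetscInt b,
                                            PetscInt nLocalNodes, Mat *A)
    {
      PetscErrorCode ierr;
      /* local sizes are counted in scalar rows/columns, hence b*nLocalNodes;
         7 = the node's own block + ~6 neighbour blocks (rough estimates)    */
      ierr = MatCreateBAIJ(comm, b,
                           b*nLocalNodes, b*nLocalNodes,
                           PETSC_DETERMINE, PETSC_DETERMINE,
                           7, NULL, 6, NULL, A); CHKERRQ(ierr);
      return 0;
    }

With this layout the unknown (i, j) of the notation above lands in scalar row (j-1)*b + (i-1) of the owning rank, i.e. the b field values of a node are stored contiguously (node-major ordering), which is exactly the block structure MATBAIJ exploits.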
> > Mathieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From thijs.smit at hest.ethz.ch Thu Mar 11 08:02:52 2021 From: thijs.smit at hest.ethz.ch (Smit Thijs) Date: Thu, 11 Mar 2021 14:02:52 +0000 Subject: [petsc-users] Outputting cell data in stead of point data while writing .vtr file In-Reply-To: References: <0c6a1fcc56784bfbb2cb9bf01a9d0586@hest.ethz.ch> <54dc57ab87464bc2a5da663226c11805@hest.ethz.ch> Message-ID: <9591a94bb3ec4181899a52f65d6a17b7@hest.ethz.ch> Hi Matt, Oke, would a code change be difficult? I mean, feasible for me to do as a mechanical engineer? ;) Or can I use an other version of PETSc where the output to ASCII VTK is still available? Best, Thijs From: Matthew Knepley Sent: 11 March 2021 14:54 To: Smit Thijs Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Outputting cell data in stead of point data while writing .vtr file On Thu, Mar 11, 2021 at 8:17 AM Smit Thijs > wrote: Hi Matt, Actually I have two 3D DMDA?s, one for the nodal data, where the FEM is solved on. The other DMDA is a cell centered one for the volume data, like the density of a particular voxel. Ideally I would like to write both point data (displacement field) and cell data (density) to the vtr. Code for DMDA. DMBoundaryType bx = DM_BOUNDARY_NONE; DMBoundaryType by = DM_BOUNDARY_NONE; DMBoundaryType bz = DM_BOUNDARY_NONE; DMDAStencilType stype = DMDA_STENCIL_BOX; PetscInt stencilwidth = 1; // Create the nodal mesh ierr = DMDACreate3d(PETSC_COMM_WORLD, bx, by, bz, stype, nx, ny, nz, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE, numnodaldof, stencilwidth, 0, 0, 0, &(da_nodes)); CHKERRQ(ierr); DMSetFromOptions(da_nodes); DMSetUp(da_nodes); ierr = DMDASetUniformCoordinates(da_nodes, xmin, xmax, ymin, ymax, zmin, zmax); CHKERRQ(ierr); ierr = DMDASetElementType(da_nodes, DMDA_ELEMENT_Q1); CHKERRQ(ierr); When I wrote the original version which output to ASCII VTK, we allowed switching between point data and cell data. It is fragile since you have to assure that the different grids match properly. When the viewer was rewritten to use the XML, all output was point data. It looks like it would take code changes to get this done. Thanks, Matt Best, Thijs From: Matthew Knepley > Sent: 11 March 2021 14:08 To: Smit Thijs > Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Outputting cell data in stead of point data while writing .vtr file What kind of DM is it? Thanks, Matt On Thu, Mar 11, 2021 at 3:36 AM Smit Thijs > wrote: Hi All, I am outputting several vectors to a .vtr file successfully for viewing in Paraview. At this moment the information is written to point data. How can I change this and make sure the data is written to cell data? The code I am currently using for outputting: PetscViewer viewer; ierr = PetscViewerVTKOpen(PETSC_COMM_WORLD, ?test.vtr?, FILE_MODE_WRITE, &viewer); CHKERRQ(ierr); ierr = DMView(nd, viewer); CHKERRQ(ierr); PetscObjectSetName((PetscObject)xPhys,"xPhys"); ierr = VecView(xPhys, viewer); CHKERRQ(ierr); PetscObjectSetName((PetscObject)S,"SvonMises"); ierr = VecView(S, viewer); CHKERRQ(ierr); ierr = PetscViewerDestroy(&viewer); CHKERRQ(ierr); Best regards, Thijs Smit PhD Candidate ETH Zurich Institute for Biomechanics -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Thu Mar 11 08:59:24 2021 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 11 Mar 2021 15:59:24 +0100 Subject: [petsc-users] Block unstructured grid In-Reply-To: References: Message-ID: On Thu, 11 Mar 2021 at 09:27, Mathieu Dutour wrote: > Dear all, > > I would like to work with a special kind of linear system that ought to be > very common but I am not sure that it is possible in PETSC. > > What we have is an unstructured grid with say 3.10^5 nodes in it. > At each node, we have a number of frequency/direction and together > this makes about 1000 values at the node. So, in total the linear system > has say 3.10^8 values. > > We managed to implement this system with Petsc but the performance > was unsatisfactory. > I think part of the reason the answers you are getting aren't helpful to you is that you have not identified "what" exactly you find to be unsatisfactory. Nor is it obvious what you consider to be satisfactory. For example, does "unsatisfactory" relate to any of these items? * memory usage of the matrix * time taken to assemble the matrix * time taken to perform MatMult() * solve time If it does, providing the output from -log_view (from an optimized build of petsc) would be helpful, and moreover it would provide developers with a baseline result with which they could compare to should any implementation changes be made. Having established what functionality is causing you concern, it would then be help for you to explain why you think it should be better, e.g. based on a performance model, prior experience with other software, etc. More information would help. Thanks, Dave > We think that Petsc is not exploiting the special > structure of the matrix and we wonder if this structure can be implemented > in Petsc. > > By special structure we mean the following. An entry in the linear system > is of the form (i, j) with 1<=i<=1000 and 1<=j<=N with N = 3.10^5. > The node (i , j) is adjacent to all the nodes (i' , j) and thus they make > a block > diagonal entry. But the node (i , j) is also adjacent to some nodes (i , > j') > [About 6 such nodes, but it varies]. > > Would there be a way to exploit this special structure in Petsc? I think > this should be fairly common and significant speedup could be obtained. > > Best, > > Mathieu > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Mar 11 12:09:30 2021 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 11 Mar 2021 13:09:30 -0500 Subject: [petsc-users] Outputting cell data in stead of point data while writing .vtr file In-Reply-To: <9591a94bb3ec4181899a52f65d6a17b7@hest.ethz.ch> References: <0c6a1fcc56784bfbb2cb9bf01a9d0586@hest.ethz.ch> <54dc57ab87464bc2a5da663226c11805@hest.ethz.ch> <9591a94bb3ec4181899a52f65d6a17b7@hest.ethz.ch> Message-ID: On Thu, Mar 11, 2021 at 9:02 AM Smit Thijs wrote: > Hi Matt, > > > > Oke, would a code change be difficult? I mean, feasible for me to do as a > mechanical engineer? ;) > Sure. The hard part is determining what to do with a given vector. 
Here https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/da/gr2.c#L677 you can see that we just save vectors until they are written out. You would have to 1) Enhance the DMCompatible() check to allow both cell and vertex meshes here, and mark the cell vector somehow. 2) Save that cell data mark when you save the vector 3) Then when vectors are written here https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/da/grvtk.c#L106 you have to add a section. I do not understand the structured grid XML format, but I assume you can look this up. How does that sound? Thanks, Matt > Or can I use an other version of PETSc where the output to ASCII VTK is > still available? > > > > Best, Thijs > > > > *From:* Matthew Knepley > *Sent:* 11 March 2021 14:54 > *To:* Smit Thijs > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Outputting cell data in stead of point data > while writing .vtr file > > > > On Thu, Mar 11, 2021 at 8:17 AM Smit Thijs > wrote: > > Hi Matt, > > > > Actually I have two 3D DMDA?s, one for the nodal data, where the FEM is > solved on. The other DMDA is a cell centered one for the volume data, like > the density of a particular voxel. Ideally I would like to write both point > data (displacement field) and cell data (density) to the vtr. > > > > Code for DMDA. > > > > DMBoundaryType bx = DM_BOUNDARY_NONE; > > DMBoundaryType by = DM_BOUNDARY_NONE; > > DMBoundaryType bz = DM_BOUNDARY_NONE; > > > > DMDAStencilType stype = DMDA_STENCIL_BOX; > > > > PetscInt stencilwidth = 1; > > > > // Create the nodal mesh > > ierr = DMDACreate3d(PETSC_COMM_WORLD, bx, by, bz, stype, nx, ny, nz, > PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE, > > numnodaldof, stencilwidth, 0, 0, 0, &(da_nodes)); > > CHKERRQ(ierr); > > > > DMSetFromOptions(da_nodes); > > DMSetUp(da_nodes); > > > > ierr = DMDASetUniformCoordinates(da_nodes, xmin, xmax, ymin, ymax, > zmin, zmax); > > CHKERRQ(ierr); > > > > ierr = DMDASetElementType(da_nodes, DMDA_ELEMENT_Q1); > > CHKERRQ(ierr); > > > > When I wrote the original version which output to ASCII VTK, we allowed > switching between point data and cell data. It is > > fragile since you have to assure that the different grids match properly. > When the viewer was rewritten to use the XML, > > all output was point data. It looks like it would take code changes to get > this done. > > > > Thanks, > > > > Matt > > > > Best, Thijs > > > > *From:* Matthew Knepley > *Sent:* 11 March 2021 14:08 > *To:* Smit Thijs > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Outputting cell data in stead of point data > while writing .vtr file > > > > What kind of DM is it? > > > > Thanks, > > > > Matt > > > > On Thu, Mar 11, 2021 at 3:36 AM Smit Thijs > wrote: > > Hi All, > > > > I am outputting several vectors to a .vtr file successfully for viewing in > Paraview. At this moment the information is written to point data. How can > I change this and make sure the data is written to cell data? 
> > > > The code I am currently using for outputting: > > > > PetscViewer viewer; > > > > ierr = PetscViewerVTKOpen(PETSC_COMM_WORLD, ?test.vtr?, FILE_MODE_WRITE, > &viewer); > > CHKERRQ(ierr); > > > > ierr = DMView(nd, viewer); > > CHKERRQ(ierr); > > > > PetscObjectSetName((PetscObject)xPhys,"xPhys"); > > ierr = VecView(xPhys, viewer); > > CHKERRQ(ierr); > > > > PetscObjectSetName((PetscObject)S,"SvonMises"); > > ierr = VecView(S, viewer); > > CHKERRQ(ierr); > > > > ierr = PetscViewerDestroy(&viewer); > > CHKERRQ(ierr); > > > > Best regards, > > > > Thijs Smit > > > > PhD Candidate > > ETH Zurich > > Institute for Biomechanics > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Mar 11 12:23:20 2021 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 11 Mar 2021 13:23:20 -0500 Subject: [petsc-users] Block unstructured grid In-Reply-To: References: Message-ID: On Thu, Mar 11, 2021 at 8:49 AM Mathieu Dutour wrote: > On Thu, 11 Mar 2021 at 13:52, Mark Adams wrote: > >> Mathieu, >> We have "FieldSplit" support for fields, but I don't know if it has ever >> been pushed to 1000's of fields so it might fall down. It might work. >> FieldSplit lets you manipulate the ordering, say field major (j) or node >> major (i). >> > I just looked at it and FieldSplit appears to be used in preconditioner so > not exactly relevant. > > What was unsatisfactory? >> It sounds like you made a rectangular matrix A(1000,3e5) . Is that >> correct? >> > That is incorrect. The matrix is of size (N, N) with N = 1000 * 3e^5. It > is a square > matrix coming from an implicit scheme. > > Since the other answer appears to have the same misunderstanding, let me > try > to re-explain my point: > --- In many contexts we need a partial differential equation that is not > scalar. > For example, the shallow water equation has b = 3 fields: H, HU, HV. There > are other > examples like wave modelling where we have something like b = 1000 fields > (in a > discretization). > --- So, if we work with say an unstructured grid with N nodes then the > total number > of variables of the system will be N_tot = 3N or N_tot = 1000N. > > The linear system has N_tot unknowns and N_tot equations. The entries > can be written as idx = (i , j) with 1 <= i <= b and 1 <= j <= N. > This notation does not make sense to me. You are labeling an unknown with an index (i, j) where i in [1, b] and j in [1, N]. Usually, we label an unknown in a matrix using (i, j) where i, j in [1, N]. Perhaps what you want is a multiindex alpha = (i, k) where i in [1, N] and k in [1, b] Then a matrix entry is indexed using (alpha, beta) = ((i, k), (j, l)) As Pierre points out, this is exactly what we do in MATBAIJ, where we call "i" a "block index". 
If you only have block structure, then BAIJ is the best you can do. If you have product structure, then KAIJ is better. Thanks, Matt > Thus the non-zero entries in the matrix will be of two kinds: > --- (idx1, idx2) with idx1 = (i , j) and idx2 = (i' , j) , 1 <= i, i' <= > b and 1 <= j <= N. > Together those define a block in the matrix. > > --- (idx1, idx2) with idx1 = (i , j) and idx2 = (i, j'), 1<= i <= b and > 1<= j, j' <= N. > For each unknown idx1, there will be about 6 unknowns idx2 of this form. > > Otherwise, the block matrices do not have the same coefficients, so a > tensor > product approach does not appear to be workable. > > Mathieu > >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Mar 11 16:15:03 2021 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 11 Mar 2021 16:15:03 -0600 Subject: [petsc-users] Questions on matrix-free GMRES implementation In-Reply-To: References: Message-ID: <584E6514-C3C6-469B-A256-5470811D8D52@petsc.dev> Feng, The first thing to check is that for each linear solve that involves a new operator (values in the base vector) the MFFD matrix knows it is using a new operator. The easiest way is to call MatMFFDSetBase() before each solve that involves a new operator (new values in the base vector). Also be careful about petsc_baserhs, when you change the base vector's values you also need to change the petsc_baserhs values to the function evaluation at that point. If that is correct I would check with a trivial function evaluator to make sure the infrastructure is all set up correctly. For examples use for the matrix free a 1 4 1 operator applied matrix free. Barry > On Mar 11, 2021, at 7:35 AM, feng wang wrote: > > Dear All, > > I am new to petsc and trying to implement a matrix-free GMRES. I have assembled an approximate Jacobian matrix just for preconditioning. After reading some previous questions on this topic, my approach is: > > the matrix-free matrix is created as: > > ierr = MatCreateMFFD(*A_COMM_WORLD, iqe*blocksize, iqe*blocksize, PETSC_DETERMINE, PETSC_DETERMINE, &petsc_A_mf); CHKERRQ(ierr); > ierr = MatMFFDSetFunction(petsc_A_mf, FormFunction_mf, this); CHKERRQ(ierr); > > KSP linear operator is set up as: > > ierr = KSPSetOperators(petsc_ksp, petsc_A_mf, petsc_A_pre); CHKERRQ(ierr); //petsc_A_pre is my assembled pre-conditioning matrix > > Before calling KSPSolve, I do: > > ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); //petsc_csv is the flow states, petsc_baserhs is the pre-computed right hand side > > The call back function is defined as: > > PetscErrorCode cFdDomain::FormFunction_mf(void *ctx, Vec in_vec, Vec out_vec) > { > PetscErrorCode ierr; > cFdDomain *user_ctx; > > cout << "FormFunction_mf called\n"; > > //in_vec: flow states > //out_vec: right hand side + diagonal contributions from CFL number > > user_ctx = (cFdDomain*)ctx; > > //get perturbed conservative variables from petsc > user_ctx->petsc_getcsv(in_vec); > > //get new right side > user_ctx->petsc_fd_rhs(); > > //set new right hand side to the output vector > user_ctx->petsc_setrhs(out_vec); > > ierr = 0; > return ierr; > } > > The linear system I am solving is (J+D)x=RHS. J is the Jacobian matrix. 
D is a diagonal matrix and it is used to stabilise the solution at the start but reduced gradually when the solution moves on to recover Newton's method. I add D*x to the true right side when non-linear function is computed to work out finite difference Jacobian, so when finite difference is used, it actually computes (J+D)*dx. > > The code runs but diverges in the end. If I don't do matrix-free and use my approximate Jacobian matrix, GMRES works. So something is wrong with my matrix-free implementation. Have I missed something in my implementation? Besides, is there a way to check if the finite difference Jacobian matrix is computed correctly in a matrix-free implementation? > > Thanks for your help in advance. > Feng -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Mar 11 16:27:12 2021 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 11 Mar 2021 16:27:12 -0600 Subject: [petsc-users] Block unstructured grid In-Reply-To: References: Message-ID: <05B443E6-05D1-4FEF-B8CA-6210297CEA58@petsc.dev> If it is not a BAIJ matrix, perhaps a figure of the matrix (for some small b and N) might help us understand the structure. We understand vector valued (non-scalar) PDEs and have worked with cases of 1000s of entries at each grid point but don't understand the index notation you are using below. Also did you previously use an AIJ that resulted in poor performance? Barry > On Mar 11, 2021, at 7:48 AM, Mathieu Dutour wrote: > > On Thu, 11 Mar 2021 at 13:52, Mark Adams > wrote: > Mathieu, > We have "FieldSplit" support for fields, but I don't know if it has ever been pushed to 1000's of fields so it might fall down. It might work. > FieldSplit lets you manipulate the ordering, say field major (j) or node major (i). > I just looked at it and FieldSplit appears to be used in preconditioner so not exactly relevant. > > What was unsatisfactory? > It sounds like you made a rectangular matrix A(1000,3e5) . Is that correct? > That is incorrect. The matrix is of size (N, N) with N = 1000 * 3e^5. It is a square > matrix coming from an implicit scheme. > > Since the other answer appears to have the same misunderstanding, let me try > to re-explain my point: > --- In many contexts we need a partial differential equation that is not scalar. > For example, the shallow water equation has b = 3 fields: H, HU, HV. There are other > examples like wave modelling where we have something like b = 1000 fields (in a > discretization). > --- So, if we work with say an unstructured grid with N nodes then the total number > of variables of the system will be N_tot = 3N or N_tot = 1000N. > > The linear system has N_tot unknowns and N_tot equations. The entries > can be written as idx = (i , j) with 1 <= i <= b and 1 <= j <= N. > > Thus the non-zero entries in the matrix will be of two kinds: > --- (idx1, idx2) with idx1 = (i , j) and idx2 = (i' , j) , 1 <= i, i' <= b and 1 <= j <= N. > Together those define a block in the matrix. > > --- (idx1, idx2) with idx1 = (i , j) and idx2 = (i, j'), 1<= i <= b and 1<= j, j' <= N. > For each unknown idx1, there will be about 6 unknowns idx2 of this form. > > Otherwise, the block matrices do not have the same coefficients, so a tensor > product approach does not appear to be workable. > > Mathieu -------------- next part -------------- An HTML attachment was scrubbed... 
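As a companion to the MATBAIJ suggestions in the replies above (and to the question of whether a plain AIJ was used before): with a blocked format, assembly is done with node (block) indices rather than scalar indices. A sketch, in which AddNodeCoupling, blockvals, j and jp are illustrative names rather than anything from the application code.

    #include <petscmat.h>

    /* Sketch: add the dense b x b coupling block between the nodes with
       global block (node) indices j and jp.  blockvals has length b*b and
       is row-oriented, which is the MatSetValuesBlocked default.            */
    static PetscErrorCode AddNodeCoupling(Mat A, PetscInt j, PetscInt jp,
                                          const PetscScalar *blockvals)
    {
      PetscErrorCode ierr;
      ierr = MatSetValuesBlocked(A, 1, &j, 1, &jp, blockvals, ADD_VALUES); CHKERRQ(ierr);
      return 0;
    }

    /* after all blocks have been inserted:
       MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
       MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);                                */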
URL: From snailsoar at hotmail.com Fri Mar 12 05:02:10 2021 From: snailsoar at hotmail.com (feng wang) Date: Fri, 12 Mar 2021 11:02:10 +0000 Subject: [petsc-users] Questions on matrix-free GMRES implementation In-Reply-To: <584E6514-C3C6-469B-A256-5470811D8D52@petsc.dev> References: , <584E6514-C3C6-469B-A256-5470811D8D52@petsc.dev> Message-ID: Hi Barry, Thanks for your advice. You are right on this. somehow there is some inconsistency when I compute the right hand side (true RHS + time-stepping contribution to the diagonal matrix) to compute the finite difference Jacobian. If I just use the call back function to recompute my RHS before I call MatMFFDSetBase, then it works like a charm. But now I end up with computing my RHS three times. 1st time is to compute the true RHS, the rest two is for computing finite difference Jacobian. In my previous buggy version, I only compute RHS twice. If possible, could you elaborate on your comments "Also be careful about petsc_baserhs", so I may possibly understand what was going on with my buggy version. Besides, for a parallel implementation, my code already has its own partition method, is it possible to allow petsc read in a user-defined partition? if not what is a better way to do this? Many thanks, Feng ________________________________ From: Barry Smith Sent: 11 March 2021 22:15 To: feng wang Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation Feng, The first thing to check is that for each linear solve that involves a new operator (values in the base vector) the MFFD matrix knows it is using a new operator. The easiest way is to call MatMFFDSetBase() before each solve that involves a new operator (new values in the base vector). Also be careful about petsc_baserhs, when you change the base vector's values you also need to change the petsc_baserhs values to the function evaluation at that point. If that is correct I would check with a trivial function evaluator to make sure the infrastructure is all set up correctly. For examples use for the matrix free a 1 4 1 operator applied matrix free. Barry On Mar 11, 2021, at 7:35 AM, feng wang > wrote: Dear All, I am new to petsc and trying to implement a matrix-free GMRES. I have assembled an approximate Jacobian matrix just for preconditioning. 
After reading some previous questions on this topic, my approach is: the matrix-free matrix is created as: ierr = MatCreateMFFD(*A_COMM_WORLD, iqe*blocksize, iqe*blocksize, PETSC_DETERMINE, PETSC_DETERMINE, &petsc_A_mf); CHKERRQ(ierr); ierr = MatMFFDSetFunction(petsc_A_mf, FormFunction_mf, this); CHKERRQ(ierr); KSP linear operator is set up as: ierr = KSPSetOperators(petsc_ksp, petsc_A_mf, petsc_A_pre); CHKERRQ(ierr); //petsc_A_pre is my assembled pre-conditioning matrix Before calling KSPSolve, I do: ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); //petsc_csv is the flow states, petsc_baserhs is the pre-computed right hand side The call back function is defined as: PetscErrorCode cFdDomain::FormFunction_mf(void *ctx, Vec in_vec, Vec out_vec) { PetscErrorCode ierr; cFdDomain *user_ctx; cout << "FormFunction_mf called\n"; //in_vec: flow states //out_vec: right hand side + diagonal contributions from CFL number user_ctx = (cFdDomain*)ctx; //get perturbed conservative variables from petsc user_ctx->petsc_getcsv(in_vec); //get new right side user_ctx->petsc_fd_rhs(); //set new right hand side to the output vector user_ctx->petsc_setrhs(out_vec); ierr = 0; return ierr; } The linear system I am solving is (J+D)x=RHS. J is the Jacobian matrix. D is a diagonal matrix and it is used to stabilise the solution at the start but reduced gradually when the solution moves on to recover Newton's method. I add D*x to the true right side when non-linear function is computed to work out finite difference Jacobian, so when finite difference is used, it actually computes (J+D)*dx. The code runs but diverges in the end. If I don't do matrix-free and use my approximate Jacobian matrix, GMRES works. So something is wrong with my matrix-free implementation. Have I missed something in my implementation? Besides, is there a way to check if the finite difference Jacobian matrix is computed correctly in a matrix-free implementation? Thanks for your help in advance. Feng -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Mar 12 06:05:16 2021 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 12 Mar 2021 07:05:16 -0500 Subject: [petsc-users] Questions on matrix-free GMRES implementation In-Reply-To: References: <584E6514-C3C6-469B-A256-5470811D8D52@petsc.dev> Message-ID: On Fri, Mar 12, 2021 at 6:02 AM feng wang wrote: > Hi Barry, > > Thanks for your advice. > > You are right on this. somehow there is some inconsistency when I compute > the right hand side (true RHS + time-stepping contribution to the diagonal > matrix) to compute the finite difference Jacobian. If I just use the call > back function to recompute my RHS before I call *MatMFFDSetBase*, then it > works like a charm. But now I end up with computing my RHS three times. 1st > time is to compute the true RHS, the rest two is for computing finite > difference Jacobian. > > In my previous buggy version, I only compute RHS twice. If possible, > could you elaborate on your comments "Also be careful about petsc_baserhs", > so I may possibly understand what was going on with my buggy version. > Our FD implementation is simple. It approximates the action of the Jacobian as J(b) v = (F(b + h v) - F(b)) / h ||v|| where h is some small parameter and b is the base vector, namely the one that you are linearizing around. In a Newton step, b is the previous solution and v is the proposed solution update. 
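For concreteness, that action can be reproduced by hand with Vec operations. The sketch below is a simplified version of what the MFFD matrix does internally (the real implementation chooses the differencing parameter from the vector norms and an error tolerance, which is glossed over here); F is any callback with the MatMFFDSetFunction signature and Fb holds the precomputed F(b).

    #include <petscvec.h>

    /* y ~= J(b) v by one-sided differencing; a simplified sketch of the
       matrix-free product, not the exact MATMFFD algorithm.                 */
    static PetscErrorCode FDJacobianApply(PetscErrorCode (*F)(void*,Vec,Vec),
                                          void *ctx, Vec b, Vec Fb, Vec v,
                                          Vec y, PetscReal h)
    {
      PetscErrorCode ierr;
      Vec            w;

      ierr = VecDuplicate(b, &w); CHKERRQ(ierr);
      ierr = VecWAXPY(w, h, v, b); CHKERRQ(ierr);   /* w = b + h v            */
      ierr = F(ctx, w, y); CHKERRQ(ierr);           /* y = F(b + h v)         */
      ierr = VecAXPY(y, -1.0, Fb); CHKERRQ(ierr);   /* y = F(b + h v) - F(b)  */
      ierr = VecScale(y, 1.0/h); CHKERRQ(ierr);     /* divide by h            */
      ierr = VecDestroy(&w); CHKERRQ(ierr);
      return 0;
    }

Comparing the output of such a hand-rolled product with MatMult() on the matrix-free matrix for a few vectors v is a quick consistency check of the base vector and base function value.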
> Besides, for a parallel implementation, my code already has its own > partition method, is it possible to allow petsc read in a user-defined > partition? if not what is a better way to do this? > Sure https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecSetSizes.html Thanks, Matt > Many thanks, > Feng > > ------------------------------ > *From:* Barry Smith > *Sent:* 11 March 2021 22:15 > *To:* feng wang > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Questions on matrix-free GMRES implementation > > > Feng, > > The first thing to check is that for each linear solve that involves > a new operator (values in the base vector) the MFFD matrix knows it is > using a new operator. > > The easiest way is to call MatMFFDSetBase() before each solve that > involves a new operator (new values in the base vector). Also be careful > about petsc_baserhs, when you change the base vector's values you also > need to change the petsc_baserhs values to the function evaluation at > that point. > > If that is correct I would check with a trivial function evaluator to make > sure the infrastructure is all set up correctly. For examples use for the > matrix free a 1 4 1 operator applied matrix free. > > Barry > > > On Mar 11, 2021, at 7:35 AM, feng wang wrote: > > Dear All, > > I am new to petsc and trying to implement a matrix-free GMRES. I have > assembled an approximate Jacobian matrix just for preconditioning. After > reading some previous questions on this topic, my approach is: > > the matrix-free matrix is created as: > > ierr = MatCreateMFFD(*A_COMM_WORLD, iqe*blocksize, iqe*blocksize, > PETSC_DETERMINE, PETSC_DETERMINE, &petsc_A_mf); CHKERRQ(ierr); > ierr = MatMFFDSetFunction(petsc_A_mf, FormFunction_mf, this); > CHKERRQ(ierr); > > KSP linear operator is set up as: > > ierr = KSPSetOperators(petsc_ksp, petsc_A_mf, petsc_A_pre); > CHKERRQ(ierr); //petsc_A_pre is my assembled pre-conditioning matrix > > Before calling KSPSolve, I do: > > ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); > CHKERRQ(ierr); //petsc_csv is the flow states, petsc_baserhs is the > pre-computed right hand side > > The call back function is defined as: > > PetscErrorCode cFdDomain::FormFunction_mf(void *ctx, Vec in_vec, Vec > out_vec) > { > PetscErrorCode ierr; > cFdDomain *user_ctx; > > cout << "FormFunction_mf called\n"; > > //in_vec: flow states > //out_vec: right hand side + diagonal contributions from CFL number > > user_ctx = (cFdDomain*)ctx; > > //get perturbed conservative variables from petsc > user_ctx->petsc_getcsv(in_vec); > > //get new right side > user_ctx->petsc_fd_rhs(); > > //set new right hand side to the output vector > user_ctx->petsc_setrhs(out_vec); > > ierr = 0; > return ierr; > } > > The linear system I am solving is (J+D)x=RHS. J is the Jacobian matrix. D > is a diagonal matrix and it is used to stabilise the solution at the start > but reduced gradually when the solution moves on to recover Newton's > method. I add D*x to the true right side when non-linear function is > computed to work out finite difference Jacobian, so when finite difference > is used, it actually computes (J+D)*dx. > > The code runs but diverges in the end. If I don't do matrix-free and use > my approximate Jacobian matrix, GMRES works. So something is wrong with my > matrix-free implementation. Have I missed something in my implementation? > Besides, is there a way to check if the finite difference Jacobian matrix > is computed correctly in a matrix-free implementation? 
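On that last question, one low-tech check is to apply both the matrix-free operator and the assembled approximate Jacobian to the same random vector and compare; agreement is only as good as the assembled approximation, but gross setup errors show up immediately. A sketch, assuming petsc_A_mf already has its function and base set (names follow the code earlier in the thread); for small problems MatComputeOperator() can additionally build the matrix-free operator explicitly for entry-by-entry inspection.

    #include <petscmat.h>

    /* Sketch: compare A_mf * v with A_assembled * v for a random v.         */
    static PetscErrorCode CheckMFProduct(Mat A_mf, Mat A_assembled)
    {
      PetscErrorCode ierr;
      Vec            v, y_mf, y_asm;
      PetscReal      ndiff, nref;

      ierr = MatCreateVecs(A_mf, &v, &y_mf); CHKERRQ(ierr);
      ierr = VecDuplicate(y_mf, &y_asm); CHKERRQ(ierr);
      ierr = VecSetRandom(v, NULL); CHKERRQ(ierr);
      ierr = MatMult(A_mf, v, y_mf); CHKERRQ(ierr);
      ierr = MatMult(A_assembled, v, y_asm); CHKERRQ(ierr);
      ierr = VecAXPY(y_asm, -1.0, y_mf); CHKERRQ(ierr);
      ierr = VecNorm(y_asm, NORM_2, &ndiff); CHKERRQ(ierr);
      ierr = VecNorm(y_mf, NORM_2, &nref); CHKERRQ(ierr);
      ierr = PetscPrintf(PETSC_COMM_WORLD, "relative difference %g\n",
                         (double)(ndiff/nref)); CHKERRQ(ierr);
      ierr = VecDestroy(&v); CHKERRQ(ierr);
      ierr = VecDestroy(&y_mf); CHKERRQ(ierr);
      ierr = VecDestroy(&y_asm); CHKERRQ(ierr);
      return 0;
    }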
> > Thanks for your help in advance. > Feng > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From snailsoar at hotmail.com Fri Mar 12 07:54:21 2021 From: snailsoar at hotmail.com (feng wang) Date: Fri, 12 Mar 2021 13:54:21 +0000 Subject: [petsc-users] Outputting cell data in stead of point data while writing .vtr file In-Reply-To: <9591a94bb3ec4181899a52f65d6a17b7@hest.ethz.ch> References: <0c6a1fcc56784bfbb2cb9bf01a9d0586@hest.ethz.ch> <54dc57ab87464bc2a5da663226c11805@hest.ethz.ch> , <9591a94bb3ec4181899a52f65d6a17b7@hest.ethz.ch> Message-ID: Hi Thijs, If you don't want to do any coding, In Paraview, there is a built-in filter to allow you to interpolate from point-based data to cell-based data, or vice versa. Hope this would be helpful. Feng ________________________________ From: petsc-users on behalf of Smit Thijs Sent: 11 March 2021 14:02 To: Matthew Knepley Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Outputting cell data in stead of point data while writing .vtr file Hi Matt, Oke, would a code change be difficult? I mean, feasible for me to do as a mechanical engineer? ;) Or can I use an other version of PETSc where the output to ASCII VTK is still available? Best, Thijs From: Matthew Knepley Sent: 11 March 2021 14:54 To: Smit Thijs Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Outputting cell data in stead of point data while writing .vtr file On Thu, Mar 11, 2021 at 8:17 AM Smit Thijs > wrote: Hi Matt, Actually I have two 3D DMDA?s, one for the nodal data, where the FEM is solved on. The other DMDA is a cell centered one for the volume data, like the density of a particular voxel. Ideally I would like to write both point data (displacement field) and cell data (density) to the vtr. Code for DMDA. DMBoundaryType bx = DM_BOUNDARY_NONE; DMBoundaryType by = DM_BOUNDARY_NONE; DMBoundaryType bz = DM_BOUNDARY_NONE; DMDAStencilType stype = DMDA_STENCIL_BOX; PetscInt stencilwidth = 1; // Create the nodal mesh ierr = DMDACreate3d(PETSC_COMM_WORLD, bx, by, bz, stype, nx, ny, nz, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE, numnodaldof, stencilwidth, 0, 0, 0, &(da_nodes)); CHKERRQ(ierr); DMSetFromOptions(da_nodes); DMSetUp(da_nodes); ierr = DMDASetUniformCoordinates(da_nodes, xmin, xmax, ymin, ymax, zmin, zmax); CHKERRQ(ierr); ierr = DMDASetElementType(da_nodes, DMDA_ELEMENT_Q1); CHKERRQ(ierr); When I wrote the original version which output to ASCII VTK, we allowed switching between point data and cell data. It is fragile since you have to assure that the different grids match properly. When the viewer was rewritten to use the XML, all output was point data. It looks like it would take code changes to get this done. Thanks, Matt Best, Thijs From: Matthew Knepley > Sent: 11 March 2021 14:08 To: Smit Thijs > Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Outputting cell data in stead of point data while writing .vtr file What kind of DM is it? Thanks, Matt On Thu, Mar 11, 2021 at 3:36 AM Smit Thijs > wrote: Hi All, I am outputting several vectors to a .vtr file successfully for viewing in Paraview. At this moment the information is written to point data. How can I change this and make sure the data is written to cell data? 
The code I am currently using for outputting: PetscViewer viewer; ierr = PetscViewerVTKOpen(PETSC_COMM_WORLD, ?test.vtr?, FILE_MODE_WRITE, &viewer); CHKERRQ(ierr); ierr = DMView(nd, viewer); CHKERRQ(ierr); PetscObjectSetName((PetscObject)xPhys,"xPhys"); ierr = VecView(xPhys, viewer); CHKERRQ(ierr); PetscObjectSetName((PetscObject)S,"SvonMises"); ierr = VecView(S, viewer); CHKERRQ(ierr); ierr = PetscViewerDestroy(&viewer); CHKERRQ(ierr); Best regards, Thijs Smit PhD Candidate ETH Zurich Institute for Biomechanics -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From snailsoar at hotmail.com Fri Mar 12 08:55:00 2021 From: snailsoar at hotmail.com (feng wang) Date: Fri, 12 Mar 2021 14:55:00 +0000 Subject: [petsc-users] Questions on matrix-free GMRES implementation In-Reply-To: References: <584E6514-C3C6-469B-A256-5470811D8D52@petsc.dev> , Message-ID: Hi Mat, Thanks for your reply. I will try the parallel implementation. I've got a serial matrix-free GMRES working, but I would like to know why my initial version of matrix-free implementation does not work and there is still something I don't understand. I did some debugging and find that the callback function to compute the RHS for the matrix-free matrix is called twice by Petsc when it computes the finite difference Jacobian, but it should only be called once. I don't know why, could you please give some advice? Thanks, Feng ________________________________ From: Matthew Knepley Sent: 12 March 2021 12:05 To: feng wang Cc: Barry Smith ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation On Fri, Mar 12, 2021 at 6:02 AM feng wang > wrote: Hi Barry, Thanks for your advice. You are right on this. somehow there is some inconsistency when I compute the right hand side (true RHS + time-stepping contribution to the diagonal matrix) to compute the finite difference Jacobian. If I just use the call back function to recompute my RHS before I call MatMFFDSetBase, then it works like a charm. But now I end up with computing my RHS three times. 1st time is to compute the true RHS, the rest two is for computing finite difference Jacobian. In my previous buggy version, I only compute RHS twice. If possible, could you elaborate on your comments "Also be careful about petsc_baserhs", so I may possibly understand what was going on with my buggy version. Our FD implementation is simple. It approximates the action of the Jacobian as J(b) v = (F(b + h v) - F(b)) / h ||v|| where h is some small parameter and b is the base vector, namely the one that you are linearizing around. In a Newton step, b is the previous solution and v is the proposed solution update. Besides, for a parallel implementation, my code already has its own partition method, is it possible to allow petsc read in a user-defined partition? if not what is a better way to do this? 
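The VecSetSizes pointer in the reply just below comes down to handing PETSc the local sizes directly, so an existing element-based partition can be kept as is. A minimal sketch, where nlocal stands for the number of unknowns a rank owns under the user's own partition (a placeholder, not a name from the code above):

    #include <petscmat.h>

    /* Sketch: impose a user-defined partition by specifying local sizes.
       PETSc then numbers the global unknowns contiguously per rank;
       VecGetOwnershipRange()/MatGetOwnershipRange() report the range, and an
       ISLocalToGlobalMapping or AO can translate the application numbering. */
    static PetscErrorCode CreateWithUserPartition(MPI_Comm comm, PetscInt nlocal,
                                                  Vec *x, Mat *P)
    {
      PetscErrorCode ierr;

      ierr = VecCreate(comm, x); CHKERRQ(ierr);
      ierr = VecSetSizes(*x, nlocal, PETSC_DETERMINE); CHKERRQ(ierr);
      ierr = VecSetFromOptions(*x); CHKERRQ(ierr);

      ierr = MatCreate(comm, P); CHKERRQ(ierr);
      ierr = MatSetSizes(*P, nlocal, nlocal, PETSC_DETERMINE, PETSC_DETERMINE); CHKERRQ(ierr);
      ierr = MatSetFromOptions(*P); CHKERRQ(ierr);
      return 0;
    }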
Sure https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecSetSizes.html Thanks, Matt Many thanks, Feng ________________________________ From: Barry Smith > Sent: 11 March 2021 22:15 To: feng wang > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation Feng, The first thing to check is that for each linear solve that involves a new operator (values in the base vector) the MFFD matrix knows it is using a new operator. The easiest way is to call MatMFFDSetBase() before each solve that involves a new operator (new values in the base vector). Also be careful about petsc_baserhs, when you change the base vector's values you also need to change the petsc_baserhs values to the function evaluation at that point. If that is correct I would check with a trivial function evaluator to make sure the infrastructure is all set up correctly. For examples use for the matrix free a 1 4 1 operator applied matrix free. Barry On Mar 11, 2021, at 7:35 AM, feng wang > wrote: Dear All, I am new to petsc and trying to implement a matrix-free GMRES. I have assembled an approximate Jacobian matrix just for preconditioning. After reading some previous questions on this topic, my approach is: the matrix-free matrix is created as: ierr = MatCreateMFFD(*A_COMM_WORLD, iqe*blocksize, iqe*blocksize, PETSC_DETERMINE, PETSC_DETERMINE, &petsc_A_mf); CHKERRQ(ierr); ierr = MatMFFDSetFunction(petsc_A_mf, FormFunction_mf, this); CHKERRQ(ierr); KSP linear operator is set up as: ierr = KSPSetOperators(petsc_ksp, petsc_A_mf, petsc_A_pre); CHKERRQ(ierr); //petsc_A_pre is my assembled pre-conditioning matrix Before calling KSPSolve, I do: ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); //petsc_csv is the flow states, petsc_baserhs is the pre-computed right hand side The call back function is defined as: PetscErrorCode cFdDomain::FormFunction_mf(void *ctx, Vec in_vec, Vec out_vec) { PetscErrorCode ierr; cFdDomain *user_ctx; cout << "FormFunction_mf called\n"; //in_vec: flow states //out_vec: right hand side + diagonal contributions from CFL number user_ctx = (cFdDomain*)ctx; //get perturbed conservative variables from petsc user_ctx->petsc_getcsv(in_vec); //get new right side user_ctx->petsc_fd_rhs(); //set new right hand side to the output vector user_ctx->petsc_setrhs(out_vec); ierr = 0; return ierr; } The linear system I am solving is (J+D)x=RHS. J is the Jacobian matrix. D is a diagonal matrix and it is used to stabilise the solution at the start but reduced gradually when the solution moves on to recover Newton's method. I add D*x to the true right side when non-linear function is computed to work out finite difference Jacobian, so when finite difference is used, it actually computes (J+D)*dx. The code runs but diverges in the end. If I don't do matrix-free and use my approximate Jacobian matrix, GMRES works. So something is wrong with my matrix-free implementation. Have I missed something in my implementation? Besides, is there a way to check if the finite difference Jacobian matrix is computed correctly in a matrix-free implementation? Thanks for your help in advance. Feng -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
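Pulling Barry's advice together with the snippets above: the base vector and the base function value handed to MatMFFDSetBase() have to describe the same point, so the function should be re-evaluated at the new base before every outer (pseudo-time or Newton) iteration's linear solve. A sketch of the sequence, where FormFunction_mf and the petsc_A_mf / petsc_A_pre / petsc_ksp / petsc_csv / petsc_baserhs names come from the code in this thread, while ctx, nlocal, petsc_rhs and petsc_dq are placeholders:

    /* once, at setup time */
    ierr = MatCreateMFFD(PETSC_COMM_WORLD, nlocal, nlocal,
                         PETSC_DETERMINE, PETSC_DETERMINE, &petsc_A_mf); CHKERRQ(ierr);
    ierr = MatMFFDSetFunction(petsc_A_mf, FormFunction_mf, ctx); CHKERRQ(ierr);
    ierr = KSPSetOperators(petsc_ksp, petsc_A_mf, petsc_A_pre); CHKERRQ(ierr);

    /* every outer iteration, after the flow state petsc_csv has changed */
    ierr = FormFunction_mf(ctx, petsc_csv, petsc_baserhs); CHKERRQ(ierr); /* F at the new base */
    ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr);
    ierr = KSPSolve(petsc_ksp, petsc_rhs, petsc_dq); CHKERRQ(ierr);

The MatMFFDSetBase() manual page also marks the function-value argument as optional (it may be passed as NULL, in which case the function is evaluated at the base point internally); that is worth checking against the installed PETSc version rather than taking from here.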
URL: From knepley at gmail.com Fri Mar 12 09:08:05 2021 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 12 Mar 2021 10:08:05 -0500 Subject: [petsc-users] Questions on matrix-free GMRES implementation In-Reply-To: References: <584E6514-C3C6-469B-A256-5470811D8D52@petsc.dev> Message-ID: On Fri, Mar 12, 2021 at 9:55 AM feng wang wrote: > Hi Mat, > > Thanks for your reply. I will try the parallel implementation. > > I've got a serial matrix-free GMRES working, but I would like to know why > my initial version of matrix-free implementation does not work and there is > still something I don't understand. I did some debugging and find that the > callback function to compute the RHS for the matrix-free matrix is called > twice by Petsc when it computes the finite difference Jacobian, but it > should only be called once. I don't know why, could you please give some > advice? > F is called once to calculate the base point and once to get the perturbation. The base point is not recalculated, so if you do many iterates, it is amortized. Thanks, Matt > Thanks, > Feng > > > > ------------------------------ > *From:* Matthew Knepley > *Sent:* 12 March 2021 12:05 > *To:* feng wang > *Cc:* Barry Smith ; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] Questions on matrix-free GMRES implementation > > On Fri, Mar 12, 2021 at 6:02 AM feng wang wrote: > > Hi Barry, > > Thanks for your advice. > > You are right on this. somehow there is some inconsistency when I compute > the right hand side (true RHS + time-stepping contribution to the diagonal > matrix) to compute the finite difference Jacobian. If I just use the call > back function to recompute my RHS before I call *MatMFFDSetBase*, then it > works like a charm. But now I end up with computing my RHS three times. 1st > time is to compute the true RHS, the rest two is for computing finite > difference Jacobian. > > In my previous buggy version, I only compute RHS twice. If possible, > could you elaborate on your comments "Also be careful about petsc_baserhs", > so I may possibly understand what was going on with my buggy version. > > > Our FD implementation is simple. It approximates the action of the > Jacobian as > > J(b) v = (F(b + h v) - F(b)) / h ||v|| > > where h is some small parameter and b is the base vector, namely the one > that you are linearizing around. In a Newton step, b is the previous > solution > and v is the proposed solution update. > > > Besides, for a parallel implementation, my code already has its own > partition method, is it possible to allow petsc read in a user-defined > partition? if not what is a better way to do this? > > > Sure > > > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecSetSizes.html > > Thanks, > > Matt > > > Many thanks, > Feng > > ------------------------------ > *From:* Barry Smith > *Sent:* 11 March 2021 22:15 > *To:* feng wang > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Questions on matrix-free GMRES implementation > > > Feng, > > The first thing to check is that for each linear solve that involves > a new operator (values in the base vector) the MFFD matrix knows it is > using a new operator. > > The easiest way is to call MatMFFDSetBase() before each solve that > involves a new operator (new values in the base vector). Also be careful > about petsc_baserhs, when you change the base vector's values you also > need to change the petsc_baserhs values to the function evaluation at > that point. 
> > If that is correct I would check with a trivial function evaluator to make > sure the infrastructure is all set up correctly. For examples use for the > matrix free a 1 4 1 operator applied matrix free. > > Barry > > > On Mar 11, 2021, at 7:35 AM, feng wang wrote: > > Dear All, > > I am new to petsc and trying to implement a matrix-free GMRES. I have > assembled an approximate Jacobian matrix just for preconditioning. After > reading some previous questions on this topic, my approach is: > > the matrix-free matrix is created as: > > ierr = MatCreateMFFD(*A_COMM_WORLD, iqe*blocksize, iqe*blocksize, > PETSC_DETERMINE, PETSC_DETERMINE, &petsc_A_mf); CHKERRQ(ierr); > ierr = MatMFFDSetFunction(petsc_A_mf, FormFunction_mf, this); > CHKERRQ(ierr); > > KSP linear operator is set up as: > > ierr = KSPSetOperators(petsc_ksp, petsc_A_mf, petsc_A_pre); > CHKERRQ(ierr); //petsc_A_pre is my assembled pre-conditioning matrix > > Before calling KSPSolve, I do: > > ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); > CHKERRQ(ierr); //petsc_csv is the flow states, petsc_baserhs is the > pre-computed right hand side > > The call back function is defined as: > > PetscErrorCode cFdDomain::FormFunction_mf(void *ctx, Vec in_vec, Vec > out_vec) > { > PetscErrorCode ierr; > cFdDomain *user_ctx; > > cout << "FormFunction_mf called\n"; > > //in_vec: flow states > //out_vec: right hand side + diagonal contributions from CFL number > > user_ctx = (cFdDomain*)ctx; > > //get perturbed conservative variables from petsc > user_ctx->petsc_getcsv(in_vec); > > //get new right side > user_ctx->petsc_fd_rhs(); > > //set new right hand side to the output vector > user_ctx->petsc_setrhs(out_vec); > > ierr = 0; > return ierr; > } > > The linear system I am solving is (J+D)x=RHS. J is the Jacobian matrix. D > is a diagonal matrix and it is used to stabilise the solution at the start > but reduced gradually when the solution moves on to recover Newton's > method. I add D*x to the true right side when non-linear function is > computed to work out finite difference Jacobian, so when finite difference > is used, it actually computes (J+D)*dx. > > The code runs but diverges in the end. If I don't do matrix-free and use > my approximate Jacobian matrix, GMRES works. So something is wrong with my > matrix-free implementation. Have I missed something in my implementation? > Besides, is there a way to check if the finite difference Jacobian matrix > is computed correctly in a matrix-free implementation? > > Thanks for your help in advance. > Feng > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From snailsoar at hotmail.com Fri Mar 12 09:37:26 2021 From: snailsoar at hotmail.com (feng wang) Date: Fri, 12 Mar 2021 15:37:26 +0000 Subject: [petsc-users] Questions on matrix-free GMRES implementation In-Reply-To: References: <584E6514-C3C6-469B-A256-5470811D8D52@petsc.dev> , Message-ID: Hi Matt, Thanks for your prompt response. Below are my two versions. one is buggy and the 2nd one is working. 
For the first one, I add the diagonal contribution to the true RHS (variable: rhs) and then set the base point, the callback function is somehow called twice afterwards to compute Jacobian. For the 2nd one, I just call the callback function manually to recompute everything, the callback function is then called once as expected to compute the Jacobian. For me, both versions should do the same things. but I don't know why in the first one the callback function is called twice after I set the base point. what could possibly go wrong? Thanks, Feng

//This does not work
fld->cnsv( iqs,iqe, q, aux, csv );
//add contribution of time-stepping
for(iv=0; iv {
    for(iq=0; iq {
        //use conservative variables here
        rhs[iv][iq] = -rhs[iv][iq] + csv[iv][iq]*lhsa[nlhs-1][iq]/cfl;
    }
}
ierr = petsc_setcsv(petsc_csv); CHKERRQ(ierr);
ierr = petsc_setrhs(petsc_baserhs); CHKERRQ(ierr);
ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr);

//This works
fld->cnsv( iqs,iqe, q, aux, csv );
ierr = petsc_setcsv(petsc_csv); CHKERRQ(ierr);
ierr = FormFunction_mf(this, petsc_csv, petsc_baserhs); //this is my callback function, now call it manually
ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr);

________________________________ From: Matthew Knepley Sent: 12 March 2021 15:08 To: feng wang Cc: Barry Smith ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation On Fri, Mar 12, 2021 at 9:55 AM feng wang > wrote: Hi Mat, Thanks for your reply. I will try the parallel implementation. I've got a serial matrix-free GMRES working, but I would like to know why my initial version of matrix-free implementation does not work and there is still something I don't understand. I did some debugging and find that the callback function to compute the RHS for the matrix-free matrix is called twice by Petsc when it computes the finite difference Jacobian, but it should only be called once. I don't know why, could you please give some advice? F is called once to calculate the base point and once to get the perturbation. The base point is not recalculated, so if you do many iterates, it is amortized. Thanks, Matt Thanks, Feng ________________________________ From: Matthew Knepley > Sent: 12 March 2021 12:05 To: feng wang > Cc: Barry Smith >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation On Fri, Mar 12, 2021 at 6:02 AM feng wang > wrote: Hi Barry, Thanks for your advice. You are right on this. somehow there is some inconsistency when I compute the right hand side (true RHS + time-stepping contribution to the diagonal matrix) to compute the finite difference Jacobian. If I just use the call back function to recompute my RHS before I call MatMFFDSetBase, then it works like a charm. But now I end up with computing my RHS three times. 1st time is to compute the true RHS, the rest two is for computing finite difference Jacobian. In my previous buggy version, I only compute RHS twice. If possible, could you elaborate on your comments "Also be careful about petsc_baserhs", so I may possibly understand what was going on with my buggy version. Our FD implementation is simple. It approximates the action of the Jacobian as J(b) v = (F(b + h v) - F(b)) / h ||v|| where h is some small parameter and b is the base vector, namely the one that you are linearizing around. In a Newton step, b is the previous solution and v is the proposed solution update. Besides, for a parallel implementation, my code already has its own partition method, is it possible to allow petsc read in a user-defined partition? if not what is a better way to do this?
Sure https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecSetSizes.html Thanks, Matt Many thanks, Feng ________________________________ From: Barry Smith > Sent: 11 March 2021 22:15 To: feng wang > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation Feng, The first thing to check is that for each linear solve that involves a new operator (values in the base vector) the MFFD matrix knows it is using a new operator. The easiest way is to call MatMFFDSetBase() before each solve that involves a new operator (new values in the base vector). Also be careful about petsc_baserhs, when you change the base vector's values you also need to change the petsc_baserhs values to the function evaluation at that point. If that is correct I would check with a trivial function evaluator to make sure the infrastructure is all set up correctly. For examples use for the matrix free a 1 4 1 operator applied matrix free. Barry On Mar 11, 2021, at 7:35 AM, feng wang > wrote: Dear All, I am new to petsc and trying to implement a matrix-free GMRES. I have assembled an approximate Jacobian matrix just for preconditioning. After reading some previous questions on this topic, my approach is: the matrix-free matrix is created as: ierr = MatCreateMFFD(*A_COMM_WORLD, iqe*blocksize, iqe*blocksize, PETSC_DETERMINE, PETSC_DETERMINE, &petsc_A_mf); CHKERRQ(ierr); ierr = MatMFFDSetFunction(petsc_A_mf, FormFunction_mf, this); CHKERRQ(ierr); KSP linear operator is set up as: ierr = KSPSetOperators(petsc_ksp, petsc_A_mf, petsc_A_pre); CHKERRQ(ierr); //petsc_A_pre is my assembled pre-conditioning matrix Before calling KSPSolve, I do: ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); //petsc_csv is the flow states, petsc_baserhs is the pre-computed right hand side The call back function is defined as: PetscErrorCode cFdDomain::FormFunction_mf(void *ctx, Vec in_vec, Vec out_vec) { PetscErrorCode ierr; cFdDomain *user_ctx; cout << "FormFunction_mf called\n"; //in_vec: flow states //out_vec: right hand side + diagonal contributions from CFL number user_ctx = (cFdDomain*)ctx; //get perturbed conservative variables from petsc user_ctx->petsc_getcsv(in_vec); //get new right side user_ctx->petsc_fd_rhs(); //set new right hand side to the output vector user_ctx->petsc_setrhs(out_vec); ierr = 0; return ierr; } The linear system I am solving is (J+D)x=RHS. J is the Jacobian matrix. D is a diagonal matrix and it is used to stabilise the solution at the start but reduced gradually when the solution moves on to recover Newton's method. I add D*x to the true right side when non-linear function is computed to work out finite difference Jacobian, so when finite difference is used, it actually computes (J+D)*dx. The code runs but diverges in the end. If I don't do matrix-free and use my approximate Jacobian matrix, GMRES works. So something is wrong with my matrix-free implementation. Have I missed something in my implementation? Besides, is there a way to check if the finite difference Jacobian matrix is computed correctly in a matrix-free implementation? Thanks for your help in advance. Feng -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Mar 12 09:40:31 2021 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 12 Mar 2021 10:40:31 -0500 Subject: [petsc-users] Questions on matrix-free GMRES implementation In-Reply-To: References: <584E6514-C3C6-469B-A256-5470811D8D52@petsc.dev> Message-ID: On Fri, Mar 12, 2021 at 10:37 AM feng wang wrote: > Hi Matt, > > Thanks for your prompt response. > > Below are my two versions. one is buggy and the 2nd one is working. For > the first one, I add the diagonal contribution to the true RHS (variable: > rhs) and then set the base point, the callback function is somehow called > twice afterwards to compute Jacobian. For the 2nd one, I just call the > callback function manually to recompute everything, the callback function > is then called once as expected to compute the Jacobian. For me, both > versions should do the same things. but I don't know why in the first one > the callback function is called twice after I set the base point. what > could possibly go wrong? > If you reset the base point, we need to recompute F(b), so you get two function calls. Normally, you do not reset the base point within a Newton iteration. Maybe you should write out mathematically what you are doing. I cannot understand it right now. Thanks, Matt > Thanks, > Feng > > *//This does not work* > fld->cnsv( iqs,iqe, q, aux, csv ); > //add contribution of time-stepping > for(iv=0; iv { > for(iq=0; iq { > //use conservative variables here > rhs[iv][iq] = -rhs[iv][iq] + > csv[iv][iq]*lhsa[nlhs-1][iq]/cfl; > } > } > ierr = petsc_setcsv(petsc_csv); CHKERRQ(ierr); > ierr = petsc_setrhs(petsc_baserhs); CHKERRQ(ierr); > ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); > CHKERRQ(ierr); > > *//This works* > fld->cnsv( iqs,iqe, q, aux, csv ); > ierr = petsc_setcsv(petsc_csv); CHKERRQ(ierr); > ierr = FormFunction_mf(this, petsc_csv, petsc_baserhs); //this is > my callback function, now call it manually > ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); > CHKERRQ(ierr); > > > > ------------------------------ > *From:* Matthew Knepley > *Sent:* 12 March 2021 15:08 > *To:* feng wang > *Cc:* Barry Smith ; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] Questions on matrix-free GMRES implementation > > On Fri, Mar 12, 2021 at 9:55 AM feng wang wrote: > > Hi Mat, > > Thanks for your reply. I will try the parallel implementation. > > I've got a serial matrix-free GMRES working, but I would like to know why > my initial version of matrix-free implementation does not work and there is > still something I don't understand. I did some debugging and find that the > callback function to compute the RHS for the matrix-free matrix is called > twice by Petsc when it computes the finite difference Jacobian, but it > should only be called once. I don't know why, could you please give some > advice? > > > F is called once to calculate the base point and once to get the > perturbation. The base point is not recalculated, so if you do many > iterates, it is amortized. 
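The loop headers in the code snippets quoted above were damaged when the archive converted HTML to text (everything following the '<' in each loop condition was swallowed, and the flattened copies of the same code elsewhere in this thread also lost the loop bodies). Reconstructed side by side from the fuller quote above; nv and nq are hypothetical stand-ins for the lost loop bounds:

    /* Version 1 (reported as buggy): fold the time-stepping diagonal into rhs,
       store that as the base function value, then set the base point. */
    fld->cnsv( iqs,iqe, q, aux, csv );
    /* add contribution of time-stepping */
    for (iv = 0; iv < nv; iv++) {            /* nv, nq: bounds lost in the archive */
        for (iq = 0; iq < nq; iq++) {
            /* use conservative variables here */
            rhs[iv][iq] = -rhs[iv][iq] + csv[iv][iq]*lhsa[nlhs-1][iq]/cfl;
        }
    }
    ierr = petsc_setcsv(petsc_csv); CHKERRQ(ierr);
    ierr = petsc_setrhs(petsc_baserhs); CHKERRQ(ierr);
    ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr);

    /* Version 2 (reported as working): let the callback itself fill the base
       function value, then set the base point. */
    fld->cnsv( iqs,iqe, q, aux, csv );
    ierr = petsc_setcsv(petsc_csv); CHKERRQ(ierr);
    ierr = FormFunction_mf(this, petsc_csv, petsc_baserhs); CHKERRQ(ierr);  /* call the callback manually */
    ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr);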
> > Thanks, > > Matt > > > Thanks, > Feng > > > > ------------------------------ > *From:* Matthew Knepley > *Sent:* 12 March 2021 12:05 > *To:* feng wang > *Cc:* Barry Smith ; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *Subject:* Re: [petsc-users] Questions on matrix-free GMRES implementation > > On Fri, Mar 12, 2021 at 6:02 AM feng wang wrote: > > Hi Barry, > > Thanks for your advice. > > You are right on this. somehow there is some inconsistency when I compute > the right hand side (true RHS + time-stepping contribution to the diagonal > matrix) to compute the finite difference Jacobian. If I just use the call > back function to recompute my RHS before I call *MatMFFDSetBase*, then it > works like a charm. But now I end up with computing my RHS three times. 1st > time is to compute the true RHS, the rest two is for computing finite > difference Jacobian. > > In my previous buggy version, I only compute RHS twice. If possible, > could you elaborate on your comments "Also be careful about petsc_baserhs", > so I may possibly understand what was going on with my buggy version. > > > Our FD implementation is simple. It approximates the action of the > Jacobian as > > J(b) v = (F(b + h v) - F(b)) / h ||v|| > > where h is some small parameter and b is the base vector, namely the one > that you are linearizing around. In a Newton step, b is the previous > solution > and v is the proposed solution update. > > > Besides, for a parallel implementation, my code already has its own > partition method, is it possible to allow petsc read in a user-defined > partition? if not what is a better way to do this? > > > Sure > > > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecSetSizes.html > > Thanks, > > Matt > > > Many thanks, > Feng > > ------------------------------ > *From:* Barry Smith > *Sent:* 11 March 2021 22:15 > *To:* feng wang > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Questions on matrix-free GMRES implementation > > > Feng, > > The first thing to check is that for each linear solve that involves > a new operator (values in the base vector) the MFFD matrix knows it is > using a new operator. > > The easiest way is to call MatMFFDSetBase() before each solve that > involves a new operator (new values in the base vector). Also be careful > about petsc_baserhs, when you change the base vector's values you also > need to change the petsc_baserhs values to the function evaluation at > that point. > > If that is correct I would check with a trivial function evaluator to make > sure the infrastructure is all set up correctly. For examples use for the > matrix free a 1 4 1 operator applied matrix free. > > Barry > > > On Mar 11, 2021, at 7:35 AM, feng wang wrote: > > Dear All, > > I am new to petsc and trying to implement a matrix-free GMRES. I have > assembled an approximate Jacobian matrix just for preconditioning. 
After > reading some previous questions on this topic, my approach is: > > the matrix-free matrix is created as: > > ierr = MatCreateMFFD(*A_COMM_WORLD, iqe*blocksize, iqe*blocksize, > PETSC_DETERMINE, PETSC_DETERMINE, &petsc_A_mf); CHKERRQ(ierr); > ierr = MatMFFDSetFunction(petsc_A_mf, FormFunction_mf, this); > CHKERRQ(ierr); > > KSP linear operator is set up as: > > ierr = KSPSetOperators(petsc_ksp, petsc_A_mf, petsc_A_pre); > CHKERRQ(ierr); //petsc_A_pre is my assembled pre-conditioning matrix > > Before calling KSPSolve, I do: > > ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); > CHKERRQ(ierr); //petsc_csv is the flow states, petsc_baserhs is the > pre-computed right hand side > > The call back function is defined as: > > PetscErrorCode cFdDomain::FormFunction_mf(void *ctx, Vec in_vec, Vec > out_vec) > { > PetscErrorCode ierr; > cFdDomain *user_ctx; > > cout << "FormFunction_mf called\n"; > > //in_vec: flow states > //out_vec: right hand side + diagonal contributions from CFL number > > user_ctx = (cFdDomain*)ctx; > > //get perturbed conservative variables from petsc > user_ctx->petsc_getcsv(in_vec); > > //get new right side > user_ctx->petsc_fd_rhs(); > > //set new right hand side to the output vector > user_ctx->petsc_setrhs(out_vec); > > ierr = 0; > return ierr; > } > > The linear system I am solving is (J+D)x=RHS. J is the Jacobian matrix. D > is a diagonal matrix and it is used to stabilise the solution at the start > but reduced gradually when the solution moves on to recover Newton's > method. I add D*x to the true right side when non-linear function is > computed to work out finite difference Jacobian, so when finite difference > is used, it actually computes (J+D)*dx. > > The code runs but diverges in the end. If I don't do matrix-free and use > my approximate Jacobian matrix, GMRES works. So something is wrong with my > matrix-free implementation. Have I missed something in my implementation? > Besides, is there a way to check if the finite difference Jacobian matrix > is computed correctly in a matrix-free implementation? > > Thanks for your help in advance. > Feng > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From snailsoar at hotmail.com Fri Mar 12 11:12:53 2021 From: snailsoar at hotmail.com (feng wang) Date: Fri, 12 Mar 2021 17:12:53 +0000 Subject: [petsc-users] Questions on matrix-free GMRES implementation In-Reply-To: References: <584E6514-C3C6-469B-A256-5470811D8D52@petsc.dev> , Message-ID: Hi Matt, Thanks for your reply. I've checked my math, it is correct. I did not intend to reset the base point within a Newton iteration. what criteria is used in petsc to decide if it needs to automatically re-compute the basepoint? 
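Barry's reply further down in this thread describes two ways to keep the base function value consistent: either the caller refreshes the supplied petsc_baserhs whenever the base state changes, or a NULL base-function vector is passed and PETSc fills one in by calling the registered function when it needs it. A minimal sketch of the second variant; it assumes MatMFFDSetBase() accepts NULL for the function argument, and petsc_rhs / petsc_dq are hypothetical right-hand-side and solution vectors:

    ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, NULL); CHKERRQ(ierr);   /* let PETSc manage F(base) */
    /* ... whenever the values in petsc_csv change ... */
    ierr = MatAssemblyBegin(petsc_A_mf, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
    ierr = MatAssemblyEnd(petsc_A_mf, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
    ierr = KSPSolve(petsc_ksp, petsc_rhs, petsc_dq); CHKERRQ(ierr);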
Thanks, Feng ________________________________ From: Matthew Knepley Sent: 12 March 2021 15:40 To: feng wang Cc: Barry Smith ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation On Fri, Mar 12, 2021 at 10:37 AM feng wang > wrote: Hi Matt, Thanks for your prompt response. Below are my two versions. one is buggy and the 2nd one is working. For the first one, I add the diagonal contribution to the true RHS (variable: rhs) and then set the base point, the callback function is somehow called twice afterwards to compute Jacobian. For the 2nd one, I just call the callback function manually to recompute everything, the callback function is then called once as expected to compute the Jacobian. For me, both versions should do the same things. but I don't know why in the first one the callback function is called twice after I set the base point. what could possibly go wrong? If you reset the base point, we need to recompute F(b), so you get two function calls. Normally, you do not reset the base point within a Newton iteration. Maybe you should write out mathematically what you are doing. I cannot understand it right now. Thanks, Matt Thanks, Feng //This does not work fld->cnsv( iqs,iqe, q, aux, csv ); //add contribution of time-stepping for(iv=0; ivcnsv( iqs,iqe, q, aux, csv ); ierr = petsc_setcsv(petsc_csv); CHKERRQ(ierr); ierr = FormFunction_mf(this, petsc_csv, petsc_baserhs); //this is my callback function, now call it manually ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); ________________________________ From: Matthew Knepley > Sent: 12 March 2021 15:08 To: feng wang > Cc: Barry Smith >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation On Fri, Mar 12, 2021 at 9:55 AM feng wang > wrote: Hi Mat, Thanks for your reply. I will try the parallel implementation. I've got a serial matrix-free GMRES working, but I would like to know why my initial version of matrix-free implementation does not work and there is still something I don't understand. I did some debugging and find that the callback function to compute the RHS for the matrix-free matrix is called twice by Petsc when it computes the finite difference Jacobian, but it should only be called once. I don't know why, could you please give some advice? F is called once to calculate the base point and once to get the perturbation. The base point is not recalculated, so if you do many iterates, it is amortized. Thanks, Matt Thanks, Feng ________________________________ From: Matthew Knepley > Sent: 12 March 2021 12:05 To: feng wang > Cc: Barry Smith >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation On Fri, Mar 12, 2021 at 6:02 AM feng wang > wrote: Hi Barry, Thanks for your advice. You are right on this. somehow there is some inconsistency when I compute the right hand side (true RHS + time-stepping contribution to the diagonal matrix) to compute the finite difference Jacobian. If I just use the call back function to recompute my RHS before I call MatMFFDSetBase, then it works like a charm. But now I end up with computing my RHS three times. 1st time is to compute the true RHS, the rest two is for computing finite difference Jacobian. In my previous buggy version, I only compute RHS twice. If possible, could you elaborate on your comments "Also be careful about petsc_baserhs", so I may possibly understand what was going on with my buggy version. 
Our FD implementation is simple. It approximates the action of the Jacobian as J(b) v = (F(b + h v) - F(b)) / h ||v|| where h is some small parameter and b is the base vector, namely the one that you are linearizing around. In a Newton step, b is the previous solution and v is the proposed solution update. Besides, for a parallel implementation, my code already has its own partition method, is it possible to allow petsc read in a user-defined partition? if not what is a better way to do this? Sure https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecSetSizes.html Thanks, Matt Many thanks, Feng ________________________________ From: Barry Smith > Sent: 11 March 2021 22:15 To: feng wang > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation Feng, The first thing to check is that for each linear solve that involves a new operator (values in the base vector) the MFFD matrix knows it is using a new operator. The easiest way is to call MatMFFDSetBase() before each solve that involves a new operator (new values in the base vector). Also be careful about petsc_baserhs, when you change the base vector's values you also need to change the petsc_baserhs values to the function evaluation at that point. If that is correct I would check with a trivial function evaluator to make sure the infrastructure is all set up correctly. For examples use for the matrix free a 1 4 1 operator applied matrix free. Barry On Mar 11, 2021, at 7:35 AM, feng wang > wrote: Dear All, I am new to petsc and trying to implement a matrix-free GMRES. I have assembled an approximate Jacobian matrix just for preconditioning. After reading some previous questions on this topic, my approach is: the matrix-free matrix is created as: ierr = MatCreateMFFD(*A_COMM_WORLD, iqe*blocksize, iqe*blocksize, PETSC_DETERMINE, PETSC_DETERMINE, &petsc_A_mf); CHKERRQ(ierr); ierr = MatMFFDSetFunction(petsc_A_mf, FormFunction_mf, this); CHKERRQ(ierr); KSP linear operator is set up as: ierr = KSPSetOperators(petsc_ksp, petsc_A_mf, petsc_A_pre); CHKERRQ(ierr); //petsc_A_pre is my assembled pre-conditioning matrix Before calling KSPSolve, I do: ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); //petsc_csv is the flow states, petsc_baserhs is the pre-computed right hand side The call back function is defined as: PetscErrorCode cFdDomain::FormFunction_mf(void *ctx, Vec in_vec, Vec out_vec) { PetscErrorCode ierr; cFdDomain *user_ctx; cout << "FormFunction_mf called\n"; //in_vec: flow states //out_vec: right hand side + diagonal contributions from CFL number user_ctx = (cFdDomain*)ctx; //get perturbed conservative variables from petsc user_ctx->petsc_getcsv(in_vec); //get new right side user_ctx->petsc_fd_rhs(); //set new right hand side to the output vector user_ctx->petsc_setrhs(out_vec); ierr = 0; return ierr; } The linear system I am solving is (J+D)x=RHS. J is the Jacobian matrix. D is a diagonal matrix and it is used to stabilise the solution at the start but reduced gradually when the solution moves on to recover Newton's method. I add D*x to the true right side when non-linear function is computed to work out finite difference Jacobian, so when finite difference is used, it actually computes (J+D)*dx. The code runs but diverges in the end. If I don't do matrix-free and use my approximate Jacobian matrix, GMRES works. So something is wrong with my matrix-free implementation. Have I missed something in my implementation? 
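For the (J+D) system described just above, writing the difference quotient for the function that is actually handed to the matrix-free operator makes the consistency requirement explicit. With G(x) = F(x) + D x (true residual plus the time-stepping diagonal),

    (G(b + h v) - G(b)) / h ||v||  ~  (J + D) v,

so the vector supplied as the base function value must be G(b) = F(b) + D b, evaluated at exactly the base state b passed to MatMFFDSetBase(); a base value assembled from a different state, or from F(b) alone, makes the quotient inconsistent.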
Besides, is there a way to check if the finite difference Jacobian matrix is computed correctly in a matrix-free implementation? Thanks for your help in advance. Feng -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From snailsoar at hotmail.com Fri Mar 12 11:19:16 2021 From: snailsoar at hotmail.com (feng wang) Date: Fri, 12 Mar 2021 17:19:16 +0000 Subject: [petsc-users] Questions on matrix-free GMRES implementation In-Reply-To: References: <584E6514-C3C6-469B-A256-5470811D8D52@petsc.dev> , , Message-ID: Hi Matt, please ignore my previous email. it was a silly question. Thanks, Feng ________________________________ From: petsc-users on behalf of feng wang Sent: 12 March 2021 17:12 To: Matthew Knepley Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation Hi Matt, Thanks for your reply. I've checked my math, it is correct. I did not intend to reset the base point within a Newton iteration. what criteria is used in petsc to decide if it needs to automatically re-compute the basepoint? Thanks, Feng ________________________________ From: Matthew Knepley Sent: 12 March 2021 15:40 To: feng wang Cc: Barry Smith ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation On Fri, Mar 12, 2021 at 10:37 AM feng wang > wrote: Hi Matt, Thanks for your prompt response. Below are my two versions. one is buggy and the 2nd one is working. For the first one, I add the diagonal contribution to the true RHS (variable: rhs) and then set the base point, the callback function is somehow called twice afterwards to compute Jacobian. For the 2nd one, I just call the callback function manually to recompute everything, the callback function is then called once as expected to compute the Jacobian. For me, both versions should do the same things. but I don't know why in the first one the callback function is called twice after I set the base point. what could possibly go wrong? If you reset the base point, we need to recompute F(b), so you get two function calls. Normally, you do not reset the base point within a Newton iteration. Maybe you should write out mathematically what you are doing. I cannot understand it right now. 
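On the earlier question of how to check a matrix-free finite-difference Jacobian: one low-tech possibility, not something suggested in this thread but a common sanity check, is to apply the matrix-free operator to unit vectors on a small test case and compare the resulting columns against the assembled preconditioning matrix, which here approximates the same (J+D). A sketch using the thread's variable names; the comparison step is left as a comment:

    Vec      e, y, col;
    PetscInt j, N;
    ierr = MatCreateVecs(petsc_A_mf, &e, &y); CHKERRQ(ierr);
    ierr = VecDuplicate(y, &col); CHKERRQ(ierr);
    ierr = VecGetSize(e, &N); CHKERRQ(ierr);
    for (j = 0; j < N; j++) {                    /* feasible only for small N */
        ierr = VecSet(e, 0.0); CHKERRQ(ierr);
        ierr = VecSetValue(e, j, 1.0, INSERT_VALUES); CHKERRQ(ierr);
        ierr = VecAssemblyBegin(e); CHKERRQ(ierr);
        ierr = VecAssemblyEnd(e); CHKERRQ(ierr);
        ierr = MatMult(petsc_A_mf, e, y); CHKERRQ(ierr);            /* column j of the FD operator */
        ierr = MatGetColumnVector(petsc_A_pre, col, j); CHKERRQ(ierr);
        /* compare y and col, e.g. VecAXPY(col, -1.0, y); VecNorm(col, NORM_INFINITY, &err); */
    }

The base point must already have been set with MatMFFDSetBase() before the MatMult() calls.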
Thanks, Matt Thanks, Feng //This does not work fld->cnsv( iqs,iqe, q, aux, csv ); //add contribution of time-stepping for(iv=0; ivcnsv( iqs,iqe, q, aux, csv ); ierr = petsc_setcsv(petsc_csv); CHKERRQ(ierr); ierr = FormFunction_mf(this, petsc_csv, petsc_baserhs); //this is my callback function, now call it manually ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); ________________________________ From: Matthew Knepley > Sent: 12 March 2021 15:08 To: feng wang > Cc: Barry Smith >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation On Fri, Mar 12, 2021 at 9:55 AM feng wang > wrote: Hi Mat, Thanks for your reply. I will try the parallel implementation. I've got a serial matrix-free GMRES working, but I would like to know why my initial version of matrix-free implementation does not work and there is still something I don't understand. I did some debugging and find that the callback function to compute the RHS for the matrix-free matrix is called twice by Petsc when it computes the finite difference Jacobian, but it should only be called once. I don't know why, could you please give some advice? F is called once to calculate the base point and once to get the perturbation. The base point is not recalculated, so if you do many iterates, it is amortized. Thanks, Matt Thanks, Feng ________________________________ From: Matthew Knepley > Sent: 12 March 2021 12:05 To: feng wang > Cc: Barry Smith >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation On Fri, Mar 12, 2021 at 6:02 AM feng wang > wrote: Hi Barry, Thanks for your advice. You are right on this. somehow there is some inconsistency when I compute the right hand side (true RHS + time-stepping contribution to the diagonal matrix) to compute the finite difference Jacobian. If I just use the call back function to recompute my RHS before I call MatMFFDSetBase, then it works like a charm. But now I end up with computing my RHS three times. 1st time is to compute the true RHS, the rest two is for computing finite difference Jacobian. In my previous buggy version, I only compute RHS twice. If possible, could you elaborate on your comments "Also be careful about petsc_baserhs", so I may possibly understand what was going on with my buggy version. Our FD implementation is simple. It approximates the action of the Jacobian as J(b) v = (F(b + h v) - F(b)) / h ||v|| where h is some small parameter and b is the base vector, namely the one that you are linearizing around. In a Newton step, b is the previous solution and v is the proposed solution update. Besides, for a parallel implementation, my code already has its own partition method, is it possible to allow petsc read in a user-defined partition? if not what is a better way to do this? Sure https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecSetSizes.html Thanks, Matt Many thanks, Feng ________________________________ From: Barry Smith > Sent: 11 March 2021 22:15 To: feng wang > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation Feng, The first thing to check is that for each linear solve that involves a new operator (values in the base vector) the MFFD matrix knows it is using a new operator. The easiest way is to call MatMFFDSetBase() before each solve that involves a new operator (new values in the base vector). 
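Barry's point just above, written out as a per-solve pattern; it mirrors the "This works" version quoted in this thread. The outer loop and update_base_state() are hypothetical, the remaining names come from the thread:

    for (step = 0; step < nsteps; step++) {                     /* outer time / nonlinear loop (hypothetical) */
        ierr = update_base_state(petsc_csv); CHKERRQ(ierr);      /* hypothetical: new values in the base vector */
        ierr = FormFunction_mf(this, petsc_csv, petsc_baserhs); CHKERRQ(ierr);   /* keep F(base) consistent */
        ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr);
        ierr = KSPSolve(petsc_ksp, petsc_rhs, petsc_dq); CHKERRQ(ierr);           /* petsc_rhs, petsc_dq hypothetical */
    }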
Also be careful about petsc_baserhs, when you change the base vector's values you also need to change the petsc_baserhs values to the function evaluation at that point. If that is correct I would check with a trivial function evaluator to make sure the infrastructure is all set up correctly. For examples use for the matrix free a 1 4 1 operator applied matrix free. Barry On Mar 11, 2021, at 7:35 AM, feng wang > wrote: Dear All, I am new to petsc and trying to implement a matrix-free GMRES. I have assembled an approximate Jacobian matrix just for preconditioning. After reading some previous questions on this topic, my approach is: the matrix-free matrix is created as: ierr = MatCreateMFFD(*A_COMM_WORLD, iqe*blocksize, iqe*blocksize, PETSC_DETERMINE, PETSC_DETERMINE, &petsc_A_mf); CHKERRQ(ierr); ierr = MatMFFDSetFunction(petsc_A_mf, FormFunction_mf, this); CHKERRQ(ierr); KSP linear operator is set up as: ierr = KSPSetOperators(petsc_ksp, petsc_A_mf, petsc_A_pre); CHKERRQ(ierr); //petsc_A_pre is my assembled pre-conditioning matrix Before calling KSPSolve, I do: ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); //petsc_csv is the flow states, petsc_baserhs is the pre-computed right hand side The call back function is defined as: PetscErrorCode cFdDomain::FormFunction_mf(void *ctx, Vec in_vec, Vec out_vec) { PetscErrorCode ierr; cFdDomain *user_ctx; cout << "FormFunction_mf called\n"; //in_vec: flow states //out_vec: right hand side + diagonal contributions from CFL number user_ctx = (cFdDomain*)ctx; //get perturbed conservative variables from petsc user_ctx->petsc_getcsv(in_vec); //get new right side user_ctx->petsc_fd_rhs(); //set new right hand side to the output vector user_ctx->petsc_setrhs(out_vec); ierr = 0; return ierr; } The linear system I am solving is (J+D)x=RHS. J is the Jacobian matrix. D is a diagonal matrix and it is used to stabilise the solution at the start but reduced gradually when the solution moves on to recover Newton's method. I add D*x to the true right side when non-linear function is computed to work out finite difference Jacobian, so when finite difference is used, it actually computes (J+D)*dx. The code runs but diverges in the end. If I don't do matrix-free and use my approximate Jacobian matrix, GMRES works. So something is wrong with my matrix-free implementation. Have I missed something in my implementation? Besides, is there a way to check if the finite difference Jacobian matrix is computed correctly in a matrix-free implementation? Thanks for your help in advance. Feng -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at petsc.dev Fri Mar 12 17:40:43 2021 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 12 Mar 2021 17:40:43 -0600 Subject: [petsc-users] Questions on matrix-free GMRES implementation In-Reply-To: References: <584E6514-C3C6-469B-A256-5470811D8D52@petsc.dev> Message-ID: <38509E59-D27A-47C6-8D97-EAAEBFC15FBF@petsc.dev> > On Mar 12, 2021, at 9:37 AM, feng wang wrote: > > Hi Matt, > > Thanks for your prompt response. > > Below are my two versions. one is buggy and the 2nd one is working. For the first one, I add the diagonal contribution to the true RHS (variable: rhs) and then set the base point, the callback function is somehow called twice afterwards to compute Jacobian. Do you mean "to compute the Jacobian matrix-vector product?" Is it only in the first computation of the product (for the given base vector) that it calls it twice or every matrix-vector product? It is possible there is a bug in our logic; run in the debugger with a break point in FormFunction_mf and each time the function is hit in the debugger type where or bt to get the stack frames from the calls. Send this. From this we can all see if it is being called excessively and why. > For the 2nd one, I just call the callback function manually to recompute everything, the callback function is then called once as expected to compute the Jacobian. For me, both versions should do the same things. but I don't know why in the first one the callback function is called twice after I set the base point. what could possibly go wrong? The logic of how it is suppose to work is shown below. > > Thanks, > Feng > > //This does not work > fld->cnsv( iqs,iqe, q, aux, csv ); > //add contribution of time-stepping > for(iv=0; iv { > for(iq=0; iq { > //use conservative variables here > rhs[iv][iq] = -rhs[iv][iq] + csv[iv][iq]*lhsa[nlhs-1][iq]/cfl; > } > } > ierr = petsc_setcsv(petsc_csv); CHKERRQ(ierr); > ierr = petsc_setrhs(petsc_baserhs); CHKERRQ(ierr); > ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); > > //This works > fld->cnsv( iqs,iqe, q, aux, csv ); > ierr = petsc_setcsv(petsc_csv); CHKERRQ(ierr); > ierr = FormFunction_mf(this, petsc_csv, petsc_baserhs); //this is my callback function, now call it manually > ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); > > Since you provide petsc_baserhs MatMFFD assumes (naturally) that you will keep the correct values in it. Hence for each new base value YOU need to compute the new values in petsc_baserhs. This approach gives you a bit more control over reusing the information in petsc_baserhs. If you would prefer that MatMFFD recomputes the base values, as needed, then you call FormFunction_mf(this, petsc_csv, NULL); and PETSc will allocate a vector and fill it up as needed by calling your FormFunction_mf() But you need to call MatAssemblyBegin/End each time you the base input vector this, petsc_csv values change. For example MatAssemblyBegin(petsc_A_mf,...) MatAssemblyEnd(petsc_A_mf,...) KSPSolve() > > From: Matthew Knepley > Sent: 12 March 2021 15:08 > To: feng wang > Cc: Barry Smith ; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation > > On Fri, Mar 12, 2021 at 9:55 AM feng wang > wrote: > Hi Mat, > > Thanks for your reply. I will try the parallel implementation. > > I've got a serial matrix-free GMRES working, but I would like to know why my initial version of matrix-free implementation does not work and there is still something I don't understand. 
I did some debugging and find that the callback function to compute the RHS for the matrix-free matrix is called twice by Petsc when it computes the finite difference Jacobian, but it should only be called once. I don't know why, could you please give some advice? > > F is called once to calculate the base point and once to get the perturbation. The base point is not recalculated, so if you do many iterates, it is amortized. > > Thanks, > > Matt > > Thanks, > Feng > > > > From: Matthew Knepley > > Sent: 12 March 2021 12:05 > To: feng wang > > Cc: Barry Smith >; petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation > > On Fri, Mar 12, 2021 at 6:02 AM feng wang > wrote: > Hi Barry, > > Thanks for your advice. > > You are right on this. somehow there is some inconsistency when I compute the right hand side (true RHS + time-stepping contribution to the diagonal matrix) to compute the finite difference Jacobian. If I just use the call back function to recompute my RHS before I call MatMFFDSetBase, then it works like a charm. But now I end up with computing my RHS three times. 1st time is to compute the true RHS, the rest two is for computing finite difference Jacobian. > > In my previous buggy version, I only compute RHS twice. If possible, could you elaborate on your comments "Also be careful about petsc_baserhs", so I may possibly understand what was going on with my buggy version. > > Our FD implementation is simple. It approximates the action of the Jacobian as > > J(b) v = (F(b + h v) - F(b)) / h ||v|| > > where h is some small parameter and b is the base vector, namely the one that you are linearizing around. In a Newton step, b is the previous solution > and v is the proposed solution update. > > Besides, for a parallel implementation, my code already has its own partition method, is it possible to allow petsc read in a user-defined partition? if not what is a better way to do this? > > Sure > > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecSetSizes.html > > Thanks, > > Matt > > Many thanks, > Feng > > From: Barry Smith > > Sent: 11 March 2021 22:15 > To: feng wang > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation > > > Feng, > > The first thing to check is that for each linear solve that involves a new operator (values in the base vector) the MFFD matrix knows it is using a new operator. > > The easiest way is to call MatMFFDSetBase() before each solve that involves a new operator (new values in the base vector). Also be careful about petsc_baserhs, when you change the base vector's values you also need to change the petsc_baserhs values to the function evaluation at that point. > > If that is correct I would check with a trivial function evaluator to make sure the infrastructure is all set up correctly. For examples use for the matrix free a 1 4 1 operator applied matrix free. > > Barry > > >> On Mar 11, 2021, at 7:35 AM, feng wang > wrote: >> >> Dear All, >> >> I am new to petsc and trying to implement a matrix-free GMRES. I have assembled an approximate Jacobian matrix just for preconditioning. 
After reading some previous questions on this topic, my approach is: >> >> the matrix-free matrix is created as: >> >> ierr = MatCreateMFFD(*A_COMM_WORLD, iqe*blocksize, iqe*blocksize, PETSC_DETERMINE, PETSC_DETERMINE, &petsc_A_mf); CHKERRQ(ierr); >> ierr = MatMFFDSetFunction(petsc_A_mf, FormFunction_mf, this); CHKERRQ(ierr); >> >> KSP linear operator is set up as: >> >> ierr = KSPSetOperators(petsc_ksp, petsc_A_mf, petsc_A_pre); CHKERRQ(ierr); //petsc_A_pre is my assembled pre-conditioning matrix >> >> Before calling KSPSolve, I do: >> >> ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); //petsc_csv is the flow states, petsc_baserhs is the pre-computed right hand side >> >> The call back function is defined as: >> >> PetscErrorCode cFdDomain::FormFunction_mf(void *ctx, Vec in_vec, Vec out_vec) >> { >> PetscErrorCode ierr; >> cFdDomain *user_ctx; >> >> cout << "FormFunction_mf called\n"; >> >> //in_vec: flow states >> //out_vec: right hand side + diagonal contributions from CFL number >> >> user_ctx = (cFdDomain*)ctx; >> >> //get perturbed conservative variables from petsc >> user_ctx->petsc_getcsv(in_vec); >> >> //get new right side >> user_ctx->petsc_fd_rhs(); >> >> //set new right hand side to the output vector >> user_ctx->petsc_setrhs(out_vec); >> >> ierr = 0; >> return ierr; >> } >> >> The linear system I am solving is (J+D)x=RHS. J is the Jacobian matrix. D is a diagonal matrix and it is used to stabilise the solution at the start but reduced gradually when the solution moves on to recover Newton's method. I add D*x to the true right side when non-linear function is computed to work out finite difference Jacobian, so when finite difference is used, it actually computes (J+D)*dx. >> >> The code runs but diverges in the end. If I don't do matrix-free and use my approximate Jacobian matrix, GMRES works. So something is wrong with my matrix-free implementation. Have I missed something in my implementation? Besides, is there a way to check if the finite difference Jacobian matrix is computed correctly in a matrix-free implementation? >> >> Thanks for your help in advance. >> Feng > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From sam.guo at cd-adapco.com Tue Mar 16 13:50:13 2021 From: sam.guo at cd-adapco.com (Sam Guo) Date: Tue, 16 Mar 2021 11:50:13 -0700 Subject: [petsc-users] error message Message-ID: Dear PETSc dev team, When there is an PETSc error, I go following overly verbose error message. Is it possible to get a simple error message like "Initial vector is zero or belongs to the deflection space"? Thanks, Sam [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Initial vector is zero or belongs to the deflation space [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.11.3, Jun, 26, 2019 [0]PETSC ERROR: Unknown Name on a arch-starccmplus_serial_real named pl2denbg0033pc.net.plm.eds.com by cd3ppw Tue Mar 16 16:19:28 2021 [0]PETSC ERROR: Configure options --with-x=0 --with-fc=0 --with-debugging=0 --with-blaslapack-dir=/usr/local/jenkins/dev1/mkl/2017.2-cda-001/linux/lib/intel64/../.. --with-mpi=0 -CFLAGS=-O3 -CXXFLAGS=-O3 --with-clean=1 --force --with-scalar-type=real [0]PETSC ERROR: #1 EPSGetStartVector() line 806 in ../../../slepc/src/eps/interface/epssolve.c [0]PETSC ERROR: #2 EPSSolve_KrylovSchur_Default() line 259 in ../../../slepc/src/eps/impls/krylov/krylovschur/krylovschur.c [0]PETSC ERROR: #3 EPSSolve() line 149 in ../../../slepc/src/eps/interface/epssolve.c -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Tue Mar 16 17:43:50 2021 From: dave.mayhem23 at gmail.com (Dave May) Date: Tue, 16 Mar 2021 23:43:50 +0100 Subject: [petsc-users] error message In-Reply-To: References: Message-ID: On Tue, 16 Mar 2021 at 19:50, Sam Guo wrote: > Dear PETSc dev team, > When there is an PETSc error, I go following overly verbose error > message. Is it possible to get a simple error message like "Initial vector > is zero or belongs to the deflection space"? > > When an error occurs and the execution is halted, a verbose and informative error message is shown. I would argue this is useful (very useful), and should never ever be shortened or truncated. This error thrown by PETSc gives you a stack trace. You can see where the error occurred, and the calling code which resulted in the error. In anything but a trivial code, this information is incredibly useful to isolate and fix the problem. I also think it's neat that you see the stack without having to even use a debugger. Currently if your code does not produce errors, no message is displayed. However, when an error occurs, a loud, long and informative message is displayed - and the code stops. What is the use case which would cause / require you to change the current behaviour? Thanks, Dave > Thanks, > Sam > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Initial vector is zero or belongs to the deflation space > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.11.3, Jun, 26, 2019 > [0]PETSC ERROR: Unknown Name on a arch-starccmplus_serial_real named pl2denbg0033pc.net.plm.eds.com by cd3ppw Tue Mar 16 16:19:28 2021 > [0]PETSC ERROR: Configure options --with-x=0 --with-fc=0 --with-debugging=0 --with-blaslapack-dir=/usr/local/jenkins/dev1/mkl/2017.2-cda-001/linux/lib/intel64/../.. --with-mpi=0 -CFLAGS=-O3 -CXXFLAGS=-O3 --with-clean=1 --force --with-scalar-type=real > [0]PETSC ERROR: #1 EPSGetStartVector() line 806 in ../../../slepc/src/eps/interface/epssolve.c > [0]PETSC ERROR: #2 EPSSolve_KrylovSchur_Default() line 259 in ../../../slepc/src/eps/impls/krylov/krylovschur/krylovschur.c > [0]PETSC ERROR: #3 EPSSolve() line 149 in ../../../slepc/src/eps/interface/epssolve.c > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at petsc.dev Tue Mar 16 23:28:27 2021 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 16 Mar 2021 23:28:27 -0500 Subject: [petsc-users] error message In-Reply-To: References: Message-ID: Sam, You can pass a simple C function to PetscPushErrorHandler() that prints the top message and then immediately aborts to get the effect you want. But I agree with Dave you lose a lot of useful information by producing such a simple error message. Barry > On Mar 16, 2021, at 5:43 PM, Dave May wrote: > > > > On Tue, 16 Mar 2021 at 19:50, Sam Guo > wrote: > Dear PETSc dev team, > When there is an PETSc error, I go following overly verbose error message. Is it possible to get a simple error message like "Initial vector is zero or belongs to the deflection space"? > > > When an error occurs and the execution is halted, a verbose and informative error message is shown. > I would argue this is useful (very useful), and should never ever be shortened or truncated. > > This error thrown by PETSc gives you a stack trace. You can see where the error occurred, and the calling code which resulted in the error. In anything but a trivial code, this information is incredibly useful to isolate and fix the problem. I also think it's neat that you see the stack without having to even use a debugger. > > Currently if your code does not produce errors, no message is displayed. > However, when an error occurs, a loud, long and informative message is displayed - and the code stops. > What is the use case which would cause / require you to change the current behaviour? > > Thanks, > Dave > > > Thanks, > Sam > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Initial vector is zero or belongs to the deflation space > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.11.3, Jun, 26, 2019 > [0]PETSC ERROR: Unknown Name on a arch-starccmplus_serial_real named pl2denbg0033pc.net.plm.eds.com by cd3ppw Tue Mar 16 16:19:28 2021 > [0]PETSC ERROR: Configure options --with-x=0 --with-fc=0 --with-debugging=0 --with-blaslapack-dir=/usr/local/jenkins/dev1/mkl/2017.2-cda-001/linux/lib/intel64/../.. --with-mpi=0 -CFLAGS=-O3 -CXXFLAGS=-O3 --with-clean=1 --force --with-scalar-type=real > [0]PETSC ERROR: #1 EPSGetStartVector() line 806 in ../../../slepc/src/eps/interface/epssolve.c > [0]PETSC ERROR: #2 EPSSolve_KrylovSchur_Default() line 259 in ../../../slepc/src/eps/impls/krylov/krylovschur/krylovschur.c > [0]PETSC ERROR: #3 EPSSolve() line 149 in ../../../slepc/src/eps/interface/epssolve.c -------------- next part -------------- An HTML attachment was scrubbed... URL: From sam.guo at cd-adapco.com Wed Mar 17 00:30:09 2021 From: sam.guo at cd-adapco.com (Sam Guo) Date: Tue, 16 Mar 2021 22:30:09 -0700 Subject: [petsc-users] error message In-Reply-To: References: Message-ID: Dave, You made a very point. Thanks, Sam On Tuesday, March 16, 2021, Dave May wrote: > > > On Tue, 16 Mar 2021 at 19:50, Sam Guo wrote: > >> Dear PETSc dev team, >> When there is an PETSc error, I go following overly verbose error >> message. Is it possible to get a simple error message like "Initial vector >> is zero or belongs to the deflection space"? >> >> > When an error occurs and the execution is halted, a verbose and > informative error message is shown. > I would argue this is useful (very useful), and should never ever be > shortened or truncated. 
> > This error thrown by PETSc gives you a stack trace. You can see where the > error occurred, and the calling code which resulted in the error. In > anything but a trivial code, this information is incredibly useful to > isolate and fix the problem. I also think it's neat that you see the stack > without having to even use a debugger. > > Currently if your code does not produce errors, no message is displayed. > However, when an error occurs, a loud, long and informative message is > displayed - and the code stops. > What is the use case which would cause / require you to change the current > behaviour? > > Thanks, > Dave > > > >> Thanks, >> Sam >> >> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [0]PETSC ERROR: Initial vector is zero or belongs to the deflation space >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [0]PETSC ERROR: Petsc Release Version 3.11.3, Jun, 26, 2019 >> [0]PETSC ERROR: Unknown Name on a arch-starccmplus_serial_real named pl2denbg0033pc.net.plm.eds.com by cd3ppw Tue Mar 16 16:19:28 2021 >> [0]PETSC ERROR: Configure options --with-x=0 --with-fc=0 --with-debugging=0 --with-blaslapack-dir=/usr/local/jenkins/dev1/mkl/2017.2-cda-001/linux/lib/intel64/../.. --with-mpi=0 -CFLAGS=-O3 -CXXFLAGS=-O3 --with-clean=1 --force --with-scalar-type=real >> [0]PETSC ERROR: #1 EPSGetStartVector() line 806 in ../../../slepc/src/eps/interface/epssolve.c >> [0]PETSC ERROR: #2 EPSSolve_KrylovSchur_Default() line 259 in ../../../slepc/src/eps/impls/krylov/krylovschur/krylovschur.c >> [0]PETSC ERROR: #3 EPSSolve() line 149 in ../../../slepc/src/eps/interface/epssolve.c >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From sam.guo at cd-adapco.com Wed Mar 17 00:31:07 2021 From: sam.guo at cd-adapco.com (Sam Guo) Date: Tue, 16 Mar 2021 22:31:07 -0700 Subject: [petsc-users] error message In-Reply-To: References: Message-ID: Barry, Thanks. Sam On Tuesday, March 16, 2021, Barry Smith wrote: > > Sam, > > You can pass a simple C function to PetscPushErrorHandler() that prints > the top message and then immediately aborts to get the effect you want. But > I agree with Dave you lose a lot of useful information by producing such a > simple error message. > > Barry > > On Mar 16, 2021, at 5:43 PM, Dave May wrote: > > > > On Tue, 16 Mar 2021 at 19:50, Sam Guo wrote: > >> Dear PETSc dev team, >> When there is an PETSc error, I go following overly verbose error >> message. Is it possible to get a simple error message like "Initial vector >> is zero or belongs to the deflection space"? >> >> > When an error occurs and the execution is halted, a verbose and > informative error message is shown. > I would argue this is useful (very useful), and should never ever be > shortened or truncated. > > This error thrown by PETSc gives you a stack trace. You can see where the > error occurred, and the calling code which resulted in the error. In > anything but a trivial code, this information is incredibly useful to > isolate and fix the problem. I also think it's neat that you see the stack > without having to even use a debugger. > > Currently if your code does not produce errors, no message is displayed. > However, when an error occurs, a loud, long and informative message is > displayed - and the code stops. > What is the use case which would cause / require you to change the current > behaviour? 
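Barry's PetscPushErrorHandler() suggestion above, sketched out as a handler that prints only the message and aborts. The handler signature shown matches recent PETSc releases but should be checked against the petscerror.h of the installed version; the handler name is arbitrary:

    #include <petscsys.h>

    static PetscErrorCode ShortErrorHandler(MPI_Comm comm, int line, const char *func,
                                            const char *file, PetscErrorCode n,
                                            PetscErrorType p, const char *mess, void *ctx)
    {
      PetscFPrintf(comm, PETSC_STDERR, "PETSc error: %s\n", mess ? mess : "(no message)");
      MPI_Abort(comm, (int)n);    /* stop immediately, no traceback */
      return n;
    }

    /* after PetscInitialize() */
    ierr = PetscPushErrorHandler(ShortErrorHandler, NULL); CHKERRQ(ierr);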
> > Thanks, > Dave > > > >> Thanks, >> Sam >> >> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [0]PETSC ERROR: Initial vector is zero or belongs to the deflation space >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [0]PETSC ERROR: Petsc Release Version 3.11.3, Jun, 26, 2019 >> [0]PETSC ERROR: Unknown Name on a arch-starccmplus_serial_real named pl2denbg0033pc.net.plm.eds.com by cd3ppw Tue Mar 16 16:19:28 2021 >> [0]PETSC ERROR: Configure options --with-x=0 --with-fc=0 --with-debugging=0 --with-blaslapack-dir=/usr/local/jenkins/dev1/mkl/2017.2-cda-001/linux/lib/intel64/../.. --with-mpi=0 -CFLAGS=-O3 -CXXFLAGS=-O3 --with-clean=1 --force --with-scalar-type=real >> [0]PETSC ERROR: #1 EPSGetStartVector() line 806 in ../../../slepc/src/eps/interface/epssolve.c >> [0]PETSC ERROR: #2 EPSSolve_KrylovSchur_Default() line 259 in ../../../slepc/src/eps/impls/krylov/krylovschur/krylovschur.c >> [0]PETSC ERROR: #3 EPSSolve() line 149 in ../../../slepc/src/eps/interface/epssolve.c >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mazumder at purdue.edu Wed Mar 17 09:15:27 2021 From: mazumder at purdue.edu (Sanjoy Kumar Mazumder) Date: Wed, 17 Mar 2021 14:15:27 +0000 Subject: [petsc-users] OOM error while using TSSUNDIALS in PETSc Message-ID: Hi all, I am trying to solve a set of coupled stiff ODEs in parallel using TSSUNDIALS with SUNDIALS_BDF as 'TSSundialsSetType' in PETSc. I am using a sparse Jacobian matrix of type MATMPIAIJ with no preconditioner. It runs for a long time with a very small timestep (~10^-8 - 10^-10) and then terminates abruptly with the following error: 'slurmstepd: error: Detected 4 oom-kill event(s) in step 1701844.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.' After going through some of the common suggestions in the mailing list before, 1) I tried increasing the memory alloted per cpu (--mem-per-cpu) in my batch script but the problem still remains. 2) I have also checked for proper deallocation of the arrays in my function and jacobian sub-routines before every TS iteration. 3) The time allotted for my job in the assigned nodes (wall-time) far exceed the time for which the job is actually running. Is there anything I am missing out or not doing properly? Given below is the complete error that is showing up after the termination. Thanks With regards, Sanjoy -------------------------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted. -------------------------------------------------------------------------- -------------------------------------------------------------------------- MPI_ABORT was invoked on rank 11 in communicator MPI_COMM_WORLD with errorcode 50176059. NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them. 
-------------------------------------------------------------------------- [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [1]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [1]PETSC ERROR: likely location of problem given in stack below [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [1]PETSC ERROR: INSTEAD the line number of the start of the function [1]PETSC ERROR: is given. [1]PETSC ERROR: [1] TSStep_Sundials line 121 /home/mazumder/petsc-3.14.5/src/ts/impls/implicit/sundials/sundials.c [1]PETSC ERROR: [1] TSStep line 3736 /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c [1]PETSC ERROR: [1] TSSolve line 4046 /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Signal received [1]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [1]PETSC ERROR: Petsc Release Version 3.14.5, Mar 03, 2021 [1]PETSC ERROR: ./ThO2 on a arch-linux-c-debug named bell-a017.rcac.purdue.edu by mazumder Mon Mar 15 13:26:36 2021 [1]PETSC ERROR: Configure options --with-cc-mpicc --with-cxx=mpicxx --with-fc=mpif90 --download-fblaslapack --download-sundials=yes --with-debugging [1]PETSC ERROR: #1 User provided function() line 0 in unknown file [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [2]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [2]PETSC ERROR: likely location of problem given in stack below [2]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [2]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [2]PETSC ERROR: INSTEAD the line number of the start of the function [2]PETSC ERROR: is given. [2]PETSC ERROR: [2] TSStep_Sundials line 121 /home/mazumder/petsc-3.14.5/src/ts/impls/implicit/sundials/sundials.c [2]PETSC ERROR: [2] TSStep line 3736 /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c [2]PETSC ERROR: [2] TSSolve line 4046 /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [2]PETSC ERROR: Signal received [2]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[2]PETSC ERROR: Petsc Release Version 3.14.5, Mar 03, 2021 [2]PETSC ERROR: ./ThO2 on a arch-linux-c-debug named bell-a017.rcac.purdue.edu by mazumder Mon Mar 15 13:26:36 2021 [2]PETSC ERROR: Configure options --with-cc-mpicc --with-cxx=mpicxx --with-fc=mpif90 --download-fblaslapack --download-sundials=yes --with-debugging [2]PETSC ERROR: #1 User provided function() line 0 in unknown file [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [3]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [3]PETSC ERROR: likely location of problem given in stack below [3]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [3]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [3]PETSC ERROR: INSTEAD the line number of the start of the function [3]PETSC ERROR: is given. [3]PETSC ERROR: [3] TSStep_Sundials line 121 /home/mazumder/petsc-3.14.5/src/ts/impls/implicit/sundials/sundials.c [3]PETSC ERROR: [3] TSStep line 3736 /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c [3]PETSC ERROR: [3] TSSolve line 4046 /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c [3]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [3]PETSC ERROR: Signal received [3]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [3]PETSC ERROR: Petsc Release Version 3.14.5, Mar 03, 2021 [3]PETSC ERROR: ./ThO2 on a arch-linux-c-debug named bell-a017.rcac.purdue.edu by mazumder Mon Mar 15 13:26:36 2021 [3]PETSC ERROR: Configure options --with-cc-mpicc --with-cxx=mpicxx --with-fc=mpif90 --download-fblaslapack --download-sundials=yes --with-debugging [3]PETSC ERROR: #1 User provided function() line 0 in unknown file [4]PETSC ERROR: ------------------------------------------------------------------------ [4]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [4]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [4]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [4]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [4]PETSC ERROR: likely location of problem given in stack below [4]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [4]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [4]PETSC ERROR: INSTEAD the line number of the start of the function [4]PETSC ERROR: is given. [4]PETSC ERROR: [4] TSStep_Sundials line 121 /home/mazumder/petsc-3.14.5/src/ts/impls/implicit/sundials/sundials.c [4]PETSC ERROR: [4] TSStep line 3736 /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c [4]PETSC ERROR: [4] TSSolve line 4046 /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c [4]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [4]PETSC ERROR: Signal received [4]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[4]PETSC ERROR: Petsc Release Version 3.14.5, Mar 03, 2021 [4]PETSC ERROR: ./ThO2 on a arch-linux-c-debug named bell-a017.rcac.purdue.edu by mazumder Mon Mar 15 13:26:36 2021 [4]PETSC ERROR: Configure options --with-cc-mpicc --with-cxx=mpicxx --with-fc=mpif90 --download-fblaslapack --download-sundials=yes --with-debugging [4]PETSC ERROR: #1 User provided function() line 0 in unknown file -------------------------------------------------------------------------- mpirun noticed that process rank 0 with PID 0 on node bell-a017 exited on signal 9 (Killed). -------------------------------------------------------------------------- [bell-a017.rcac.purdue.edu:62310] 62 more processes have sent help message help-mpi-api.txt / mpi-abort [bell-a017.rcac.purdue.edu:62310] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages slurmstepd: error: Detected 4 oom-kill event(s) in step 1701844.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler. -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrick.sanan at gmail.com Wed Mar 17 09:27:14 2021 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Wed, 17 Mar 2021 15:27:14 +0100 Subject: [petsc-users] OOM error while using TSSUNDIALS in PETSc In-Reply-To: References: Message-ID: > Am 17.03.2021 um 15:15 schrieb Sanjoy Kumar Mazumder : > > Hi all, > > I am trying to solve a set of coupled stiff ODEs in parallel using TSSUNDIALS with SUNDIALS_BDF as 'TSSundialsSetType' in PETSc. I am using a sparse Jacobian matrix of type MATMPIAIJ with no preconditioner. It runs for a long time with a very small timestep (~10^-8 - 10^-10) and then terminates abruptly with the following error: > > 'slurmstepd: error: Detected 4 oom-kill event(s) in step 1701844.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.' > This general class of problem can arise if there is a (small) memory leak occuring at every time step, so that is the first thing to rule out. > After going through some of the common suggestions in the mailing list before, > > 1) I tried increasing the memory alloted per cpu (--mem-per-cpu) in my batch script but the problem still remains. When you tried increasing the memory allocated per CPU, did the solver take more timesteps before the OOM error? > 2) I have also checked for proper deallocation of the arrays in my function and jacobian sub-routines before every TS iteration. Did you confirm this with a tool like valgrind? If not, Is it possible for you to run a few time steps of your code on a local machine with valgrind? > 3) The time allotted for my job in the assigned nodes (wall-time) far exceed the time for which the job is actually running. > > Is there anything I am missing out or not doing properly? Given below is the complete error that is showing up after the termination. > > Thanks > > With regards, > Sanjoy > > -------------------------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code. Per user-direction, the job has been aborted. > -------------------------------------------------------------------------- > -------------------------------------------------------------------------- > MPI_ABORT was invoked on rank 11 in communicator MPI_COMM_WORLD > with errorcode 50176059. > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. 
> You may or may not see output from other processes, depending on > exactly when Open MPI kills them. > -------------------------------------------------------------------------- > [1]PETSC ERROR: ------------------------------------------------------------------------ > [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [1]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [1]PETSC ERROR: likely location of problem given in stack below > [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [1]PETSC ERROR: INSTEAD the line number of the start of the function > [1]PETSC ERROR: is given. > [1]PETSC ERROR: [1] TSStep_Sundials line 121 /home/mazumder/petsc-3.14.5/src/ts/impls/implicit/sundials/sundials.c > [1]PETSC ERROR: [1] TSStep line 3736 /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c > [1]PETSC ERROR: [1] TSSolve line 4046 /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c > [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [1]PETSC ERROR: Signal received > [1]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [1]PETSC ERROR: Petsc Release Version 3.14.5, Mar 03, 2021 > [1]PETSC ERROR: ./ThO2 on a arch-linux-c-debug named bell-a017.rcac.purdue.edu by mazumder Mon Mar 15 13:26:36 2021 > [1]PETSC ERROR: Configure options --with-cc-mpicc --with-cxx=mpicxx --with-fc=mpif90 --download-fblaslapack --download-sundials=yes --with-debugging > [1]PETSC ERROR: #1 User provided function() line 0 in unknown file > [2]PETSC ERROR: ------------------------------------------------------------------------ > [2]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [2]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [2]PETSC ERROR: likely location of problem given in stack below > [2]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [2]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [2]PETSC ERROR: INSTEAD the line number of the start of the function > [2]PETSC ERROR: is given. > [2]PETSC ERROR: [2] TSStep_Sundials line 121 /home/mazumder/petsc-3.14.5/src/ts/impls/implicit/sundials/sundials.c > [2]PETSC ERROR: [2] TSStep line 3736 /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c > [2]PETSC ERROR: [2] TSSolve line 4046 /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c > [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [2]PETSC ERROR: Signal received > [2]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [2]PETSC ERROR: Petsc Release Version 3.14.5, Mar 03, 2021 > [2]PETSC ERROR: ./ThO2 on a arch-linux-c-debug named bell-a017.rcac.purdue.edu by mazumder Mon Mar 15 13:26:36 2021 > [2]PETSC ERROR: Configure options --with-cc-mpicc --with-cxx=mpicxx --with-fc=mpif90 --download-fblaslapack --download-sundials=yes --with-debugging > [2]PETSC ERROR: #1 User provided function() line 0 in unknown file > [3]PETSC ERROR: ------------------------------------------------------------------------ > [3]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [3]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [3]PETSC ERROR: likely location of problem given in stack below > [3]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [3]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [3]PETSC ERROR: INSTEAD the line number of the start of the function > [3]PETSC ERROR: is given. > [3]PETSC ERROR: [3] TSStep_Sundials line 121 /home/mazumder/petsc-3.14.5/src/ts/impls/implicit/sundials/sundials.c > [3]PETSC ERROR: [3] TSStep line 3736 /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c > [3]PETSC ERROR: [3] TSSolve line 4046 /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c > [3]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [3]PETSC ERROR: Signal received > [3]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [3]PETSC ERROR: Petsc Release Version 3.14.5, Mar 03, 2021 > [3]PETSC ERROR: ./ThO2 on a arch-linux-c-debug named bell-a017.rcac.purdue.edu by mazumder Mon Mar 15 13:26:36 2021 > [3]PETSC ERROR: Configure options --with-cc-mpicc --with-cxx=mpicxx --with-fc=mpif90 --download-fblaslapack --download-sundials=yes --with-debugging > [3]PETSC ERROR: #1 User provided function() line 0 in unknown file > [4]PETSC ERROR: ------------------------------------------------------------------------ > [4]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [4]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [4]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [4]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [4]PETSC ERROR: likely location of problem given in stack below > [4]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [4]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [4]PETSC ERROR: INSTEAD the line number of the start of the function > [4]PETSC ERROR: is given. > [4]PETSC ERROR: [4] TSStep_Sundials line 121 /home/mazumder/petsc-3.14.5/src/ts/impls/implicit/sundials/sundials.c > [4]PETSC ERROR: [4] TSStep line 3736 /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c > [4]PETSC ERROR: [4] TSSolve line 4046 /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c > [4]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [4]PETSC ERROR: Signal received > [4]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [4]PETSC ERROR: Petsc Release Version 3.14.5, Mar 03, 2021 > [4]PETSC ERROR: ./ThO2 on a arch-linux-c-debug named bell-a017.rcac.purdue.edu by mazumder Mon Mar 15 13:26:36 2021 > [4]PETSC ERROR: Configure options --with-cc-mpicc --with-cxx=mpicxx --with-fc=mpif90 --download-fblaslapack --download-sundials=yes --with-debugging > [4]PETSC ERROR: #1 User provided function() line 0 in unknown file > -------------------------------------------------------------------------- > mpirun noticed that process rank 0 with PID 0 on node bell-a017 exited on signal 9 (Killed). > -------------------------------------------------------------------------- > [bell-a017.rcac.purdue.edu:62310 ] 62 more processes have sent help message help-mpi-api.txt / mpi-abort > [bell-a017.rcac.purdue.edu:62310 ] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages > slurmstepd: error: Detected 4 oom-kill event(s) in step 1701844.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler. -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Wed Mar 17 09:42:01 2021 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 17 Mar 2021 09:42:01 -0500 Subject: [petsc-users] OOM error while using TSSUNDIALS in PETSc In-Reply-To: References: Message-ID: <549fe285-88cf-3c24-bec0-19919dd967b3@mcs.anl.gov> On Wed, 17 Mar 2021, Patrick Sanan wrote: > > > > > > Am 17.03.2021 um 15:15 schrieb Sanjoy Kumar Mazumder : > > > > Hi all, > > > > I am trying to solve a set of coupled stiff ODEs in parallel using TSSUNDIALS with SUNDIALS_BDF as 'TSSundialsSetType' in PETSc. I am using a sparse Jacobian matrix of type MATMPIAIJ with no preconditioner. It runs for a long time with a very small timestep (~10^-8 - 10^-10) and then terminates abruptly with the following error: > > > > 'slurmstepd: error: Detected 4 oom-kill event(s) in step 1701844.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.' > > > This general class of problem can arise if there is a (small) memory leak occuring at every time step, so that is the first thing to rule out. > > > After going through some of the common suggestions in the mailing list before, > > > > 1) I tried increasing the memory alloted per cpu (--mem-per-cpu) in my batch script but the problem still remains. > When you tried increasing the memory allocated per CPU, did the solver take more timesteps before the OOM error? > > > 2) I have also checked for proper deallocation of the arrays in my function and jacobian sub-routines before every TS iteration. > Did you confirm this with a tool like valgrind? If not, Is it possible for you to run a few time steps of your code on a local machine with valgrind? If PetscMalloc is used [or petsc objects not destroyed] - you can check with -malloc_dump Satish > > > 3) The time allotted for my job in the assigned nodes (wall-time) far exceed the time for which the job is actually running. > > > > Is there anything I am missing out or not doing properly? Given below is the complete error that is showing up after the termination. > > > > Thanks > > > > With regards, > > Sanjoy > > > > -------------------------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code. Per user-direction, the job has been aborted. 
> > -------------------------------------------------------------------------- > > -------------------------------------------------------------------------- > > MPI_ABORT was invoked on rank 11 in communicator MPI_COMM_WORLD > > with errorcode 50176059. > > > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > > You may or may not see output from other processes, depending on > > exactly when Open MPI kills them. > > -------------------------------------------------------------------------- > > [1]PETSC ERROR: ------------------------------------------------------------------------ > > [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > > [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [1]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > [1]PETSC ERROR: likely location of problem given in stack below > > [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > > [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > > [1]PETSC ERROR: INSTEAD the line number of the start of the function > > [1]PETSC ERROR: is given. > > [1]PETSC ERROR: [1] TSStep_Sundials line 121 /home/mazumder/petsc-3.14.5/src/ts/impls/implicit/sundials/sundials.c > > [1]PETSC ERROR: [1] TSStep line 3736 /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c > > [1]PETSC ERROR: [1] TSSolve line 4046 /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c > > [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [1]PETSC ERROR: Signal received > > [1]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [1]PETSC ERROR: Petsc Release Version 3.14.5, Mar 03, 2021 > > [1]PETSC ERROR: ./ThO2 on a arch-linux-c-debug named bell-a017.rcac.purdue.edu by mazumder Mon Mar 15 13:26:36 2021 > > [1]PETSC ERROR: Configure options --with-cc-mpicc --with-cxx=mpicxx --with-fc=mpif90 --download-fblaslapack --download-sundials=yes --with-debugging > > [1]PETSC ERROR: #1 User provided function() line 0 in unknown file > > [2]PETSC ERROR: ------------------------------------------------------------------------ > > [2]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > > [2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [2]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > [2]PETSC ERROR: likely location of problem given in stack below > > [2]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > > [2]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > > [2]PETSC ERROR: INSTEAD the line number of the start of the function > > [2]PETSC ERROR: is given. 
> > [2]PETSC ERROR: [2] TSStep_Sundials line 121 /home/mazumder/petsc-3.14.5/src/ts/impls/implicit/sundials/sundials.c > > [2]PETSC ERROR: [2] TSStep line 3736 /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c > > [2]PETSC ERROR: [2] TSSolve line 4046 /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c > > [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [2]PETSC ERROR: Signal received > > [2]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [2]PETSC ERROR: Petsc Release Version 3.14.5, Mar 03, 2021 > > [2]PETSC ERROR: ./ThO2 on a arch-linux-c-debug named bell-a017.rcac.purdue.edu by mazumder Mon Mar 15 13:26:36 2021 > > [2]PETSC ERROR: Configure options --with-cc-mpicc --with-cxx=mpicxx --with-fc=mpif90 --download-fblaslapack --download-sundials=yes --with-debugging > > [2]PETSC ERROR: #1 User provided function() line 0 in unknown file > > [3]PETSC ERROR: ------------------------------------------------------------------------ > > [3]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > > [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [3]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > [3]PETSC ERROR: likely location of problem given in stack below > > [3]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > > [3]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > > [3]PETSC ERROR: INSTEAD the line number of the start of the function > > [3]PETSC ERROR: is given. > > [3]PETSC ERROR: [3] TSStep_Sundials line 121 /home/mazumder/petsc-3.14.5/src/ts/impls/implicit/sundials/sundials.c > > [3]PETSC ERROR: [3] TSStep line 3736 /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c > > [3]PETSC ERROR: [3] TSSolve line 4046 /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c > > [3]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [3]PETSC ERROR: Signal received > > [3]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > [3]PETSC ERROR: Petsc Release Version 3.14.5, Mar 03, 2021 > > [3]PETSC ERROR: ./ThO2 on a arch-linux-c-debug named bell-a017.rcac.purdue.edu by mazumder Mon Mar 15 13:26:36 2021 > > [3]PETSC ERROR: Configure options --with-cc-mpicc --with-cxx=mpicxx --with-fc=mpif90 --download-fblaslapack --download-sundials=yes --with-debugging > > [3]PETSC ERROR: #1 User provided function() line 0 in unknown file > > [4]PETSC ERROR: ------------------------------------------------------------------------ > > [4]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > > [4]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [4]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [4]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > [4]PETSC ERROR: likely location of problem given in stack below > > [4]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > > [4]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > > [4]PETSC ERROR: INSTEAD the line number of the start of the function > > [4]PETSC ERROR: is given. > > [4]PETSC ERROR: [4] TSStep_Sundials line 121 /home/mazumder/petsc-3.14.5/src/ts/impls/implicit/sundials/sundials.c > > [4]PETSC ERROR: [4] TSStep line 3736 /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c > > [4]PETSC ERROR: [4] TSSolve line 4046 /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c > > [4]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [4]PETSC ERROR: Signal received > > [4]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [4]PETSC ERROR: Petsc Release Version 3.14.5, Mar 03, 2021 > > [4]PETSC ERROR: ./ThO2 on a arch-linux-c-debug named bell-a017.rcac.purdue.edu by mazumder Mon Mar 15 13:26:36 2021 > > [4]PETSC ERROR: Configure options --with-cc-mpicc --with-cxx=mpicxx --with-fc=mpif90 --download-fblaslapack --download-sundials=yes --with-debugging > > [4]PETSC ERROR: #1 User provided function() line 0 in unknown file > > -------------------------------------------------------------------------- > > mpirun noticed that process rank 0 with PID 0 on node bell-a017 exited on signal 9 (Killed). > > -------------------------------------------------------------------------- > > [bell-a017.rcac.purdue.edu:62310 ] 62 more processes have sent help message help-mpi-api.txt / mpi-abort > > [bell-a017.rcac.purdue.edu:62310 ] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages > > slurmstepd: error: Detected 4 oom-kill event(s) in step 1701844.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler. > > From knepley at gmail.com Wed Mar 17 09:51:30 2021 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 17 Mar 2021 10:51:30 -0400 Subject: [petsc-users] OOM error while using TSSUNDIALS in PETSc In-Reply-To: References: Message-ID: On Wed, Mar 17, 2021 at 10:15 AM Sanjoy Kumar Mazumder wrote: > Hi all, > > I am trying to solve a set of coupled stiff ODEs in parallel using > TSSUNDIALS with SUNDIALS_BDF as 'TSSundialsSetType' in PETSc. I am using a > sparse Jacobian matrix of type MATMPIAIJ with no preconditioner. > What is the convergence of your linear solver like? 
You can see this using -ksp_converged_reason Thanks, Matt > It runs for a long time with a very small timestep (~10^-8 - 10^-10) and > then terminates abruptly with the following error: > > 'slurmstepd: error: Detected 4 oom-kill event(s) in step 1701844.batch > cgroup. Some of your processes may have been killed by the cgroup > out-of-memory handler.' > > After going through some of the common suggestions in the mailing list > before, > > 1) I tried increasing the memory alloted per cpu (--mem-per-cpu) in my > batch script but the problem still remains. > 2) I have also checked for proper deallocation of the arrays in my > function and jacobian sub-routines before every TS iteration. > 3) The time allotted for my job in the assigned nodes (wall-time) far > exceed the time for which the job is actually running. > > Is there anything I am missing out or not doing properly? Given below is > the complete error that is showing up after the termination. > > Thanks > > With regards, > Sanjoy > > -------------------------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code. Per user-direction, the job has been aborted. > -------------------------------------------------------------------------- > -------------------------------------------------------------------------- > MPI_ABORT was invoked on rank 11 in communicator MPI_COMM_WORLD > with errorcode 50176059. > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > You may or may not see output from other processes, depending on > exactly when Open MPI kills them. > -------------------------------------------------------------------------- > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [1]PETSC ERROR: or see > https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > [1]PETSC ERROR: likely location of problem given in stack below > [1]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > [1]PETSC ERROR: INSTEAD the line number of the start of the function > [1]PETSC ERROR: is given. > [1]PETSC ERROR: [1] TSStep_Sundials line 121 > /home/mazumder/petsc-3.14.5/src/ts/impls/implicit/sundials/sundials.c > [1]PETSC ERROR: [1] TSStep line 3736 > /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c > [1]PETSC ERROR: [1] TSSolve line 4046 > /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [1]PETSC ERROR: Signal received > [1]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> [1]PETSC ERROR: Petsc Release Version 3.14.5, Mar 03, 2021 > [1]PETSC ERROR: ./ThO2 on a arch-linux-c-debug named > bell-a017.rcac.purdue.edu by mazumder Mon Mar 15 13:26:36 2021 > [1]PETSC ERROR: Configure options --with-cc-mpicc --with-cxx=mpicxx > --with-fc=mpif90 --download-fblaslapack --download-sundials=yes > --with-debugging > [1]PETSC ERROR: #1 User provided function() line 0 in unknown file > [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > [2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [2]PETSC ERROR: or see > https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > [2]PETSC ERROR: likely location of problem given in stack below > [2]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [2]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > [2]PETSC ERROR: INSTEAD the line number of the start of the function > [2]PETSC ERROR: is given. > [2]PETSC ERROR: [2] TSStep_Sundials line 121 > /home/mazumder/petsc-3.14.5/src/ts/impls/implicit/sundials/sundials.c > [2]PETSC ERROR: [2] TSStep line 3736 > /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c > [2]PETSC ERROR: [2] TSSolve line 4046 > /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c > [2]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [2]PETSC ERROR: Signal received > [2]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [2]PETSC ERROR: Petsc Release Version 3.14.5, Mar 03, 2021 > [2]PETSC ERROR: ./ThO2 on a arch-linux-c-debug named > bell-a017.rcac.purdue.edu by mazumder Mon Mar 15 13:26:36 2021 > [2]PETSC ERROR: Configure options --with-cc-mpicc --with-cxx=mpicxx > --with-fc=mpif90 --download-fblaslapack --download-sundials=yes > --with-debugging > [2]PETSC ERROR: #1 User provided function() line 0 in unknown file > [3]PETSC ERROR: > ------------------------------------------------------------------------ > [3]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [3]PETSC ERROR: or see > https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > [3]PETSC ERROR: likely location of problem given in stack below > [3]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [3]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > [3]PETSC ERROR: INSTEAD the line number of the start of the function > [3]PETSC ERROR: is given. 
> [3]PETSC ERROR: [3] TSStep_Sundials line 121 > /home/mazumder/petsc-3.14.5/src/ts/impls/implicit/sundials/sundials.c > [3]PETSC ERROR: [3] TSStep line 3736 > /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c > [3]PETSC ERROR: [3] TSSolve line 4046 > /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c > [3]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [3]PETSC ERROR: Signal received > [3]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [3]PETSC ERROR: Petsc Release Version 3.14.5, Mar 03, 2021 > [3]PETSC ERROR: ./ThO2 on a arch-linux-c-debug named > bell-a017.rcac.purdue.edu by mazumder Mon Mar 15 13:26:36 2021 > [3]PETSC ERROR: Configure options --with-cc-mpicc --with-cxx=mpicxx > --with-fc=mpif90 --download-fblaslapack --download-sundials=yes > --with-debugging > [3]PETSC ERROR: #1 User provided function() line 0 in unknown file > [4]PETSC ERROR: > ------------------------------------------------------------------------ > [4]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > [4]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [4]PETSC ERROR: or see > https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [4]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > [4]PETSC ERROR: likely location of problem given in stack below > [4]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [4]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > [4]PETSC ERROR: INSTEAD the line number of the start of the function > [4]PETSC ERROR: is given. > [4]PETSC ERROR: [4] TSStep_Sundials line 121 > /home/mazumder/petsc-3.14.5/src/ts/impls/implicit/sundials/sundials.c > [4]PETSC ERROR: [4] TSStep line 3736 > /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c > [4]PETSC ERROR: [4] TSSolve line 4046 > /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c > [4]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [4]PETSC ERROR: Signal received > [4]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [4]PETSC ERROR: Petsc Release Version 3.14.5, Mar 03, 2021 > [4]PETSC ERROR: ./ThO2 on a arch-linux-c-debug named > bell-a017.rcac.purdue.edu by mazumder Mon Mar 15 13:26:36 2021 > [4]PETSC ERROR: Configure options --with-cc-mpicc --with-cxx=mpicxx > --with-fc=mpif90 --download-fblaslapack --download-sundials=yes > --with-debugging > [4]PETSC ERROR: #1 User provided function() line 0 in unknown file > -------------------------------------------------------------------------- > mpirun noticed that process rank 0 with PID 0 on node bell-a017 exited on > signal 9 (Killed). > -------------------------------------------------------------------------- > [bell-a017.rcac.purdue.edu:62310] 62 more processes have sent help > message help-mpi-api.txt / mpi-abort > [bell-a017.rcac.purdue.edu:62310] Set MCA parameter > "orte_base_help_aggregate" to 0 to see all help / error messages > slurmstepd: error: Detected 4 oom-kill event(s) in step 1701844.batch > cgroup. Some of your processes may have been killed by the cgroup > out-of-memory handler. 
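A minimal sketch of a run with the diagnostics suggested above (the process count is illustrative, the executable name is the one appearing in the logs, and the exact launcher depends on the batch system):

    mpirun -np 64 ./ThO2 -ksp_converged_reason -malloc_dump -log_view

Here -ksp_converged_reason reports the convergence reason for the linear solves, -malloc_dump lists any memory obtained with PetscMalloc (or PETSc objects) still alive at PetscFinalize(), and -log_view ends the run with a per-object-class memory summary, which helps separate growth inside PETSc from growth in user-allocated arrays.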
> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Mar 19 09:44:32 2021 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 19 Mar 2021 10:44:32 -0400 Subject: [petsc-users] sort of error messages Message-ID: We are statically linking and I get this "error" message but the tests look successful. Should we care about this? Thanks, Mark 07:40 nid00274 release= ~/petsc_install/petsc$ make PETSC_DIR=/project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel PETSC_ARCH="" check Running check examples to verify correct installation Using PETSC_DIR=/project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel and PETSC_ARCH= *******************Error detected during compile or link!******************* See http://www.mcs.anl.gov/petsc/documentation/faq.html /global/homes/m/madams/petsc_install/petsc/src/snes/tutorials ex19 ********************************************************************************* cc -g -O2 -fp-model fast -g -O2 -fp-model fast -I/project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel/include ex19.c -L/project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel/lib -Wl,-rpath,/project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel/lib -L/project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel/lib -L/opt/cray/pe/libsci/19.06.1/INTEL/16.0/x86_64/lib -L/opt/cray/dmapp/default/lib64 -L/opt/cray/pe/mpt/7.7.10/gni/mpich-intel/16.0/lib -L/opt/cray/rca/2.2.20-7.0.1.1_4.61__g8e3fb5b.ari/lib64 -L/opt/cray/alps/6.6.58-7.0.1.1_6.19__g437d88db.ari/lib64 -L/opt/cray/xpmem/2.2.20-7.0.1.1_4.20__g0475745.ari/lib64 -L/opt/cray/pe/pmi/5.0.14/lib64 -L/opt/cray/ugni/6.0.14.0-7.0.1.1_7.49__ge78e5b0.ari/lib64 -L/opt/cray/udreg/2.3.2-7.0.1.1_3.47__g8175d3d.ari/lib64 -L/opt/cray/pe/atp/2.1.3/libApp -L/opt/cray/wlm_detect/1.3.3-7.0.1.1_4.19__g7109084.ari/lib64 -L/opt/intel/compilers_and_libraries_2019.3.199/linux/compiler/lib/intel64 -L/opt/intel/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64 -L/opt/intel/compilers_and_libraries_2019.3.199/linux/compiler/lib/intel64_lin -L/usr/lib64/gcc/x86_64-suse-linux/7 -L/usr/x86_64-suse-linux/lib -lpetsc -lflapack -lfblas -lparmetis -lmetis -lstdc++ -ldl -lpthread -lmpichf90_intel -lrt -lugni -lpmi -lm -lsci_intel_mpi -lsci_intel -lmpich_intel -lalpslli -lwlm_detect -lalpsutil -lrca -lxpmem -ludreg -lhugetlbfs -lAtpSigHandler -lAtpSigHCommData -limf -lifport -lifcore -lsvml -lipgo -lirc -lgcc_eh -lirc_s -lstdc++ -ldl -o ex19 /usr/bin/ld: /project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel/lib/libpetsc.a(dlimpl.o): in function `PetscDLOpen': /global/u2/m/madams/petsc_install/petsc/src/sys/dll/dlimpl.c:108: warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking /usr/bin/ld: /project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel/lib/libpetsc.a(send.o): in function `PetscOpenSocket': /global/u2/m/madams/petsc_install/petsc/src/sys/classes/viewer/impls/socket/send.c:108: warning: Using 'gethostbyname' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process C/C++ example src/snes/tutorials/ex19 run 
successfully with 2 MPI processes *******************Error detected during compile or link!******************* See http://www.mcs.anl.gov/petsc/documentation/faq.html /global/homes/m/madams/petsc_install/petsc/src/snes/tutorials ex5f ********************************************************* ftn -g -O2 -fp-model fast -g -O2 -fp-model fast -I/project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel/include ex5f.F90 -L/project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel/lib -Wl,-rpath,/project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel/lib -L/project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel/lib -L/opt/cray/pe/libsci/19.06.1/INTEL/16.0/x86_64/lib -L/opt/cray/dmapp/default/lib64 -L/opt/cray/pe/mpt/7.7.10/gni/mpich-intel/16.0/lib -L/opt/cray/rca/2.2.20-7.0.1.1_4.61__g8e3fb5b.ari/lib64 -L/opt/cray/alps/6.6.58-7.0.1.1_6.19__g437d88db.ari/lib64 -L/opt/cray/xpmem/2.2.20-7.0.1.1_4.20__g0475745.ari/lib64 -L/opt/cray/pe/pmi/5.0.14/lib64 -L/opt/cray/ugni/6.0.14.0-7.0.1.1_7.49__ge78e5b0.ari/lib64 -L/opt/cray/udreg/2.3.2-7.0.1.1_3.47__g8175d3d.ari/lib64 -L/opt/cray/pe/atp/2.1.3/libApp -L/opt/cray/wlm_detect/1.3.3-7.0.1.1_4.19__g7109084.ari/lib64 -L/opt/intel/compilers_and_libraries_2019.3.199/linux/compiler/lib/intel64 -L/opt/intel/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64 -L/opt/intel/compilers_and_libraries_2019.3.199/linux/compiler/lib/intel64_lin -L/usr/lib64/gcc/x86_64-suse-linux/7 -L/usr/x86_64-suse-linux/lib -lpetsc -lflapack -lfblas -lparmetis -lmetis -lstdc++ -ldl -lpthread -lmpichf90_intel -lrt -lugni -lpmi -lm -lsci_intel_mpi -lsci_intel -lmpich_intel -lalpslli -lwlm_detect -lalpsutil -lrca -lxpmem -ludreg -lhugetlbfs -lAtpSigHandler -lAtpSigHCommData -limf -lifport -lifcore -lsvml -lipgo -lirc -lgcc_eh -lirc_s -lstdc++ -ldl -o ex5f /usr/bin/ld: /project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel/lib/libpetsc.a(dlimpl.o): in function `PetscDLOpen': /global/u2/m/madams/petsc_install/petsc/src/sys/dll/dlimpl.c:108: warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking /usr/bin/ld: /project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel/lib/libpetsc.a(send.o): in function `PetscOpenSocket': /global/u2/m/madams/petsc_install/petsc/src/sys/classes/viewer/impls/socket/send.c:108: warning: Using 'gethostbyname' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process Completed test examples -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Mar 19 10:02:52 2021 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 19 Mar 2021 11:02:52 -0400 Subject: [petsc-users] sort of error messages In-Reply-To: References: Message-ID: Oh, these are just common things that we see. Mark On Fri, Mar 19, 2021 at 10:44 AM Mark Adams wrote: > We are statically linking and I get this "error" message but the tests > look successful. > > Should we care about this? 
> > Thanks, > Mark > > 07:40 nid00274 release= ~/petsc_install/petsc$ make > PETSC_DIR=/project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel > PETSC_ARCH="" check > Running check examples to verify correct installation > Using > PETSC_DIR=/project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel > and PETSC_ARCH= > *******************Error detected during compile or > link!******************* > See http://www.mcs.anl.gov/petsc/documentation/faq.html > /global/homes/m/madams/petsc_install/petsc/src/snes/tutorials ex19 > > ********************************************************************************* > cc -g -O2 -fp-model fast -g -O2 -fp-model fast > -I/project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel/include > ex19.c > -L/project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel/lib > -Wl,-rpath,/project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel/lib > -L/project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel/lib > -L/opt/cray/pe/libsci/19.06.1/INTEL/16.0/x86_64/lib > -L/opt/cray/dmapp/default/lib64 > -L/opt/cray/pe/mpt/7.7.10/gni/mpich-intel/16.0/lib > -L/opt/cray/rca/2.2.20-7.0.1.1_4.61__g8e3fb5b.ari/lib64 > -L/opt/cray/alps/6.6.58-7.0.1.1_6.19__g437d88db.ari/lib64 > -L/opt/cray/xpmem/2.2.20-7.0.1.1_4.20__g0475745.ari/lib64 > -L/opt/cray/pe/pmi/5.0.14/lib64 > -L/opt/cray/ugni/6.0.14.0-7.0.1.1_7.49__ge78e5b0.ari/lib64 > -L/opt/cray/udreg/2.3.2-7.0.1.1_3.47__g8175d3d.ari/lib64 > -L/opt/cray/pe/atp/2.1.3/libApp > -L/opt/cray/wlm_detect/1.3.3-7.0.1.1_4.19__g7109084.ari/lib64 > -L/opt/intel/compilers_and_libraries_2019.3.199/linux/compiler/lib/intel64 > -L/opt/intel/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64 > -L/opt/intel/compilers_and_libraries_2019.3.199/linux/compiler/lib/intel64_lin > -L/usr/lib64/gcc/x86_64-suse-linux/7 -L/usr/x86_64-suse-linux/lib -lpetsc > -lflapack -lfblas -lparmetis -lmetis -lstdc++ -ldl -lpthread > -lmpichf90_intel -lrt -lugni -lpmi -lm -lsci_intel_mpi -lsci_intel > -lmpich_intel -lalpslli -lwlm_detect -lalpsutil -lrca -lxpmem -ludreg > -lhugetlbfs -lAtpSigHandler -lAtpSigHCommData -limf -lifport -lifcore > -lsvml -lipgo -lirc -lgcc_eh -lirc_s -lstdc++ -ldl -o ex19 > /usr/bin/ld: > /project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel/lib/libpetsc.a(dlimpl.o): > in function `PetscDLOpen': > /global/u2/m/madams/petsc_install/petsc/src/sys/dll/dlimpl.c:108: warning: > Using 'dlopen' in statically linked applications requires at runtime the > shared libraries from the glibc version used for linking > /usr/bin/ld: > /project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel/lib/libpetsc.a(send.o): > in function `PetscOpenSocket': > /global/u2/m/madams/petsc_install/petsc/src/sys/classes/viewer/impls/socket/send.c:108: > warning: Using 'gethostbyname' in statically linked applications requires > at runtime the shared libraries from the glibc version used for linking > C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process > C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI processes > *******************Error detected during compile or > link!******************* > See http://www.mcs.anl.gov/petsc/documentation/faq.html > /global/homes/m/madams/petsc_install/petsc/src/snes/tutorials ex5f > ********************************************************* > ftn -g -O2 -fp-model fast -g -O2 -fp-model fast > -I/project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel/include > ex5f.F90 > 
-L/project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel/lib > -Wl,-rpath,/project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel/lib > -L/project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel/lib > -L/opt/cray/pe/libsci/19.06.1/INTEL/16.0/x86_64/lib > -L/opt/cray/dmapp/default/lib64 > -L/opt/cray/pe/mpt/7.7.10/gni/mpich-intel/16.0/lib > -L/opt/cray/rca/2.2.20-7.0.1.1_4.61__g8e3fb5b.ari/lib64 > -L/opt/cray/alps/6.6.58-7.0.1.1_6.19__g437d88db.ari/lib64 > -L/opt/cray/xpmem/2.2.20-7.0.1.1_4.20__g0475745.ari/lib64 > -L/opt/cray/pe/pmi/5.0.14/lib64 > -L/opt/cray/ugni/6.0.14.0-7.0.1.1_7.49__ge78e5b0.ari/lib64 > -L/opt/cray/udreg/2.3.2-7.0.1.1_3.47__g8175d3d.ari/lib64 > -L/opt/cray/pe/atp/2.1.3/libApp > -L/opt/cray/wlm_detect/1.3.3-7.0.1.1_4.19__g7109084.ari/lib64 > -L/opt/intel/compilers_and_libraries_2019.3.199/linux/compiler/lib/intel64 > -L/opt/intel/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64 > -L/opt/intel/compilers_and_libraries_2019.3.199/linux/compiler/lib/intel64_lin > -L/usr/lib64/gcc/x86_64-suse-linux/7 -L/usr/x86_64-suse-linux/lib -lpetsc > -lflapack -lfblas -lparmetis -lmetis -lstdc++ -ldl -lpthread > -lmpichf90_intel -lrt -lugni -lpmi -lm -lsci_intel_mpi -lsci_intel > -lmpich_intel -lalpslli -lwlm_detect -lalpsutil -lrca -lxpmem -ludreg > -lhugetlbfs -lAtpSigHandler -lAtpSigHCommData -limf -lifport -lifcore > -lsvml -lipgo -lirc -lgcc_eh -lirc_s -lstdc++ -ldl -o ex5f > /usr/bin/ld: > /project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel/lib/libpetsc.a(dlimpl.o): > in function `PetscDLOpen': > /global/u2/m/madams/petsc_install/petsc/src/sys/dll/dlimpl.c:108: warning: > Using 'dlopen' in statically linked applications requires at runtime the > shared libraries from the glibc version used for linking > /usr/bin/ld: > /project/projectdirs/m499/Software/petsc/3.14.5/cori_haswell/intel/lib/libpetsc.a(send.o): > in function `PetscOpenSocket': > /global/u2/m/madams/petsc_install/petsc/src/sys/classes/viewer/impls/socket/send.c:108: > warning: Using 'gethostbyname' in statically linked applications requires > at runtime the shared libraries from the glibc version used for linking > Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process > Completed test examples > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bourdin at lsu.edu Fri Mar 19 10:22:00 2021 From: bourdin at lsu.edu (Blaise A Bourdin) Date: Fri, 19 Mar 2021 15:22:00 +0000 Subject: [petsc-users] Petsc on AMD EPYC Message-ID: Hi, I am pricing out a new small cluster for my group and have been out of the HW loop for a while. Does anybody on the list have experience running petsc on recent generation AMD EPYC? I assume that the intel will not generate optimized code for non-intel COU. How about gcc / clang? My metric is not pure performance but rather performance over price. I do mostly implicit finite element codes and 90% of my walltime is KSP / SNES solves Regards, Blaise -- A.K. & Shirley Barton Professor of Mathematics Adjunct Professor of Mechanical Engineering Adjunct of the Center for Computation & Technology Louisiana State University, Lockett Hall Room 344, Baton Rouge, LA 70803, USA Tel. 
+1 (225) 578 1612, Fax +1 (225) 578 4276 Web http://www.math.lsu.edu/~bourdin From jed at jedbrown.org Fri Mar 19 10:27:36 2021 From: jed at jedbrown.org (Jed Brown) Date: Fri, 19 Mar 2021 09:27:36 -0600 Subject: [petsc-users] Petsc on AMD EPYC In-Reply-To: References: Message-ID: <87blbf2mfr.fsf@jedbrown.org> Blaise A Bourdin writes: > Hi, > > I am pricing out a new small cluster for my group and have been out of the HW loop for a while. > Does anybody on the list have experience running petsc on recent generation AMD EPYC? > I assume that the intel will not generate optimized code for non-intel COU. How about gcc / clang? > My metric is not pure performance but rather performance over price. I have a 2-socket 7452, which is a great price point for our sort of work. If you're purely memory bandwidth-limited, you can get fewer cores, but that doesn't save a lot of money. You'd probably want to compare pricing with the new Zen 3 chips too. I'll forward you a log file from a multigrid solver with some analysis. Suffice it to say, it matches to significantly outperforms (depending on the operation) a 2-socket Xeon 8280. > I do mostly implicit finite element codes and 90% of my walltime is KSP / SNES solves > > Regards, > Blaise > > -- > A.K. & Shirley Barton Professor of Mathematics > Adjunct Professor of Mechanical Engineering > Adjunct of the Center for Computation & Technology > Louisiana State University, Lockett Hall Room 344, Baton Rouge, LA 70803, USA > Tel. +1 (225) 578 1612, Fax +1 (225) 578 4276 Web http://www.math.lsu.edu/~bourdin From bourdin at lsu.edu Fri Mar 19 11:19:47 2021 From: bourdin at lsu.edu (Blaise A Bourdin) Date: Fri, 19 Mar 2021 16:19:47 +0000 Subject: [petsc-users] Petsc on AMD EPYC In-Reply-To: <87blbf2mfr.fsf@jedbrown.org> References: <87blbf2mfr.fsf@jedbrown.org> Message-ID: That would be great. What would be even greater would be if I could run the same test on LSU machines (Xeon 8260). Blaise On Mar 19, 2021, at 10:27 AM, Jed Brown > wrote: Blaise A Bourdin > writes: Hi, I am pricing out a new small cluster for my group and have been out of the HW loop for a while. Does anybody on the list have experience running petsc on recent generation AMD EPYC? I assume that the intel will not generate optimized code for non-intel COU. How about gcc / clang? My metric is not pure performance but rather performance over price. I have a 2-socket 7452, which is a great price point for our sort of work. If you're purely memory bandwidth-limited, you can get fewer cores, but that doesn't save a lot of money. You'd probably want to compare pricing with the new Zen 3 chips too. I'll forward you a log file from a multigrid solver with some analysis. Suffice it to say, it matches to significantly outperforms (depending on the operation) a 2-socket Xeon 8280. I do mostly implicit finite element codes and 90% of my walltime is KSP / SNES solves Regards, Blaise -- A.K. & Shirley Barton Professor of Mathematics Adjunct Professor of Mechanical Engineering Adjunct of the Center for Computation & Technology Louisiana State University, Lockett Hall Room 344, Baton Rouge, LA 70803, USA Tel. +1 (225) 578 1612, Fax +1 (225) 578 4276 Web http://www.math.lsu.edu/~bourdin -- A.K. & Shirley Barton Professor of Mathematics Adjunct Professor of Mechanical Engineering Adjunct of the Center for Computation & Technology Louisiana State University, Lockett Hall Room 344, Baton Rouge, LA 70803, USA Tel. 
+1 (225) 578 1612, Fax +1 (225) 578 4276 Web http://www.math.lsu.edu/~bourdin -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Mar 19 13:22:10 2021 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 19 Mar 2021 14:22:10 -0400 Subject: [petsc-users] Petsc on AMD EPYC In-Reply-To: References: <87blbf2mfr.fsf@jedbrown.org> Message-ID: I think to some extent it matters what deal you can get. I got the vendor to throw in the interconnect if I got the right processors, which overall was a much better deal for us. Matt On Fri, Mar 19, 2021 at 12:19 PM Blaise A Bourdin wrote: > That would be great. > What would be even greater would be if I could run the same test on LSU > machines (Xeon 8260). > Blaise > > > On Mar 19, 2021, at 10:27 AM, Jed Brown wrote: > > Blaise A Bourdin writes: > > Hi, > > I am pricing out a new small cluster for my group and have been out of the > HW loop for a while. > Does anybody on the list have experience running petsc on recent > generation AMD EPYC? > I assume that the intel will not generate optimized code for non-intel > COU. How about gcc / clang? > My metric is not pure performance but rather performance over price. > > > I have a 2-socket 7452, which is a great price point for our sort of work. > If you're purely memory bandwidth-limited, you can get fewer cores, but > that doesn't save a lot of money. You'd probably want to compare pricing > with the new Zen 3 chips too. I'll forward you a log file from a multigrid > solver with some analysis. Suffice it to say, it matches to significantly > outperforms (depending on the operation) a 2-socket Xeon 8280. > > I do mostly implicit finite element codes and 90% of my walltime is KSP / > SNES solves > > Regards, > Blaise > > -- > A.K. & Shirley Barton Professor of Mathematics > Adjunct Professor of Mechanical Engineering > Adjunct of the Center for Computation & Technology > Louisiana State University, Lockett Hall Room 344, Baton Rouge, LA 70803, > USA > Tel. +1 (225) 578 1612, Fax +1 (225) 578 4276 Web > http://www.math.lsu.edu/~bourdin > > > -- > A.K. & Shirley Barton Professor of Mathematics > Adjunct Professor of Mechanical Engineering > Adjunct of the Center for Computation & Technology > Louisiana State University, Lockett Hall Room 344, Baton Rouge, LA 70803, > USA > Tel. +1 (225) 578 1612, Fax +1 (225) 578 4276 Web > http://www.math.lsu.edu/~bourdin > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Fri Mar 19 14:11:47 2021 From: jed at jedbrown.org (Jed Brown) Date: Fri, 19 Mar 2021 13:11:47 -0600 Subject: [petsc-users] Petsc on AMD EPYC In-Reply-To: References: <87blbf2mfr.fsf@jedbrown.org> Message-ID: <87sg4r0xho.fsf@jedbrown.org> Should all work from the logs I sent. I'll mention a couple points here: 1. Set NPS4 in the BIOS -- it improves memory bandwidth for jobs with good memory locality, such as MPI. 2. Make sure to get a chassis that supports the fastest memory (DDR4-3200 for Zen 2). 3. Use BLIS instead of MKL or OpenBLAS. You can use a package manager or configure PETSc with --download-blis --download-f2cblaslapack. https://github.com/flame/blis/blob/master/docs/Performance.md#zen2-results Blaise A Bourdin writes: > That would be great. 
> What would be even greater would be if I could run the same test on LSU machines (Xeon 8260). > Blaise > > > On Mar 19, 2021, at 10:27 AM, Jed Brown > wrote: > > Blaise A Bourdin > writes: > > Hi, > > I am pricing out a new small cluster for my group and have been out of the HW loop for a while. > Does anybody on the list have experience running petsc on recent generation AMD EPYC? > I assume that the intel will not generate optimized code for non-intel COU. How about gcc / clang? > My metric is not pure performance but rather performance over price. > > I have a 2-socket 7452, which is a great price point for our sort of work. If you're purely memory bandwidth-limited, you can get fewer cores, but that doesn't save a lot of money. You'd probably want to compare pricing with the new Zen 3 chips too. I'll forward you a log file from a multigrid solver with some analysis. Suffice it to say, it matches to significantly outperforms (depending on the operation) a 2-socket Xeon 8280. > > I do mostly implicit finite element codes and 90% of my walltime is KSP / SNES solves > > Regards, > Blaise > > -- > A.K. & Shirley Barton Professor of Mathematics > Adjunct Professor of Mechanical Engineering > Adjunct of the Center for Computation & Technology > Louisiana State University, Lockett Hall Room 344, Baton Rouge, LA 70803, USA > Tel. +1 (225) 578 1612, Fax +1 (225) 578 4276 Web http://www.math.lsu.edu/~bourdin > > -- > A.K. & Shirley Barton Professor of Mathematics > Adjunct Professor of Mechanical Engineering > Adjunct of the Center for Computation & Technology > Louisiana State University, Lockett Hall Room 344, Baton Rouge, LA 70803, USA > Tel. +1 (225) 578 1612, Fax +1 (225) 578 4276 Web http://www.math.lsu.edu/~bourdin From zjorti at lanl.gov Fri Mar 19 19:49:54 2021 From: zjorti at lanl.gov (Jorti, Zakariae) Date: Sat, 20 Mar 2021 00:49:54 +0000 Subject: [petsc-users] Runtime error Message-ID: Hi, I have a PETSc code which works in some cases circumstances whereas in others it gives me the error message below. The code works on my machine when I compile PETSc with debug flag on: ./configure PETSC_DIR=/Users/zjorti/software/petsc-3.14.5 PETSC_ARCH=macx --with-fc=0 --with-mpi-dir=$HOME/.brew --download-hypre --with-debugging=1 --with-cxx-dialect=C++11 It does not work however when I compile Petsc with debug flag off: ./configure PETSC_DIR=/Users/zjorti/software/petsc-3.14.5 PETSC_ARCH=macx --with-fc=0 --with-mpi-dir=$HOME/.brew --download-hypre --with-debugging=0 --with-cxx-dialect=C++11 I asked a colleague to test this same code. He compiles PETSc with debug flag off, runs the code and it works for him. He has an older machine but we both use the same version PETSc version which is 3.14.5. As I do not have any debugger I could not identify the cause of this runtime error. One more thing: We did not run the code in parallel. Is this a hardware related issue or is it something else? Thanks, Zakariae [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. 
[0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.14.5, Mar 03, 2021 [0]PETSC ERROR: ./mimeticcurleuler2 on a macx named pn2032683.lanl.gov by zjorti Fri Mar 19 16:07:03 2021 [0]PETSC ERROR: Configure options PETSC_DIR=/Users/zjorti/software/petsc-3.14.5 PETSC_ARCH=macx --with-fc=0 --with-mpi-dir=/Users/zjorti/.brew --download-hypre --with-debugging=0 --with-cxx-dialect=C++11 [0]PETSC ERROR: #1 User provided function() line 0 in unknown file [0]PETSC ERROR: Run with -malloc_debug to check if memory corruption is causing the crash. -------------------------------------------------------------------------- MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD with errorcode 50176059. NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Mar 19 21:23:32 2021 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 19 Mar 2021 22:23:32 -0400 Subject: [petsc-users] Runtime error In-Reply-To: References: Message-ID: On Fri, Mar 19, 2021 at 8:50 PM Jorti, Zakariae via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi, > > > I have a PETSc code which works in some cases circumstances whereas in > others it gives me the error message below. > > The code works on my machine when I compile PETSc with debug flag on: > > ./configure PETSC_DIR=/Users/zjorti/software/petsc-3.14.5 PETSC_ARCH=macx > --with-fc=0 --with-mpi-dir=$HOME/.brew --download-hypre --with-debugging=1 > --with-cxx-dialect=C++11 > > > > It does not work however when I compile Petsc with debug flag off: > > ./configure PETSC_DIR=/Users/zjorti/software/petsc-3.14.5 PETSC_ARCH=macx > --with-fc=0 --with-mpi-dir=$HOME/.brew --download-hypre --with-debugging=0 > --with-cxx-dialect=C++11 > > > > I asked a colleague to test this same code. > > He compiles PETSc with debug flag off, runs the code and it works for him. > He has an older machine but we both use the same version PETSc version > which is 3.14.5. > These are the symptoms of an uninitialized variable or memory overwrite. Both things should be caught by valgrind, which we highly recommend, but you can start by running with -malloc_debug. Thanks, Matt > As I do not have any debugger I could not identify the cause of this > runtime error. > > One more thing: We did not run the code in parallel. > > Is this a hardware related issue or is it something else? > > Thanks, > > > Zakariae > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [0]PETSC ERROR: to get more information on the crash. 
> > *[0]PETSC ERROR: --------------------- Error Message > --------------------------------------------------------------* > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.14.5, Mar 03, 2021 > > [0]PETSC ERROR: ./mimeticcurleuler2 on a macx named pn2032683.lanl.gov by > zjorti Fri Mar 19 16:07:03 2021 > > [0]PETSC ERROR: Configure options > PETSC_DIR=/Users/zjorti/software/petsc-3.14.5 PETSC_ARCH=macx --with-fc=0 > --with-mpi-dir=/Users/zjorti/.brew --download-hypre --with-debugging=0 > --with-cxx-dialect=C++11 > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > [0]PETSC ERROR: Run with -malloc_debug to check if memory corruption is > causing the crash. > > -------------------------------------------------------------------------- > > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > > with errorcode 50176059. > > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > > You may or may not see output from other processes, depending on > > exactly when Open MPI kills them. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.barral at math.u-bordeaux.fr Sat Mar 20 09:06:57 2021 From: nicolas.barral at math.u-bordeaux.fr (Nicolas Barral) Date: Sat, 20 Mar 2021 15:06:57 +0100 Subject: [petsc-users] Re-ordering in DMPlexCreateFromCellListParallelPetsc Message-ID: Hi all, I'm building a plex from elements arrays using DMPlexCreateFromCellListParallelPetsc. Once the plex is built, I need to set up boundary labels. I have an array of faces containing a series of 3 vertex local indices. To rebuild boundary labels, I need to loop over the array and get the join of 3 consecutive points to find the corresponding face point in the DAG. Problem, vertices get reordered by DMPlexCreateFromCellListParallelPetsc so that locally owned vertices are before remote ones, so local indices are changed and the indices in the face array are not good anymore. Is there a way to track this renumbering ? For owned vertices, I can find the local index from the global one (so do old local index -> global index -> new local index). For the remote ones, I'm not sure. I can hash global indices, but is there a more idiomatic way ? Thanks, -- Nicolas From mfadams at lbl.gov Sun Mar 21 07:25:12 2021 From: mfadams at lbl.gov (Mark Adams) Date: Sun, 21 Mar 2021 08:25:12 -0400 Subject: [petsc-users] funny link error Message-ID: We are having problems with linking and use static linking. We get this error and have seen others like it (eg, lpetsc_lib_gcc_s) /usr/bin/ld: cannot find -lpetsc_lib_wlm_detect-NOTFOUND wlm_detect is some sort of system library, but I have no idea where this petsc string comes from. This is on Cori and the application uses cmake. I can run PETSc tests fine. Any ideas? Thanks, Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: make.log Type: application/octet-stream Size: 130541 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: configure.log Type: application/octet-stream Size: 2044412 bytes Desc: not available URL: From stefano.zampini at gmail.com Sun Mar 21 07:32:00 2021 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Sun, 21 Mar 2021 15:32:00 +0300 Subject: [petsc-users] funny link error In-Reply-To: References: Message-ID: This looks like a CMAKE issue. Good luck Il giorno dom 21 mar 2021 alle ore 15:26 Mark Adams ha scritto: > We are having problems with linking and use static linking. > We get this error and have seen others like it (eg, lpetsc_lib_gcc_s) > > /usr/bin/ld: cannot find -lpetsc_lib_wlm_detect-NOTFOUND > > wlm_detect is some sort of system library, but I have no idea where this > petsc string comes from. > This is on Cori and the application uses cmake. > I can run PETSc tests fine. > > Any ideas? > > Thanks, > Mark > -- Stefano -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Mar 21 10:27:01 2021 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 21 Mar 2021 11:27:01 -0400 Subject: [petsc-users] funny link error In-Reply-To: References: Message-ID: Stefano is right I think. Start grepping for that string in the CMake logs. Thanks, Matt On Sun, Mar 21, 2021 at 8:32 AM Stefano Zampini wrote: > This looks like a CMAKE issue. Good luck > > Il giorno dom 21 mar 2021 alle ore 15:26 Mark Adams ha > scritto: > >> We are having problems with linking and use static linking. >> We get this error and have seen others like it (eg, lpetsc_lib_gcc_s) >> >> /usr/bin/ld: cannot find -lpetsc_lib_wlm_detect-NOTFOUND >> >> wlm_detect is some sort of system library, but I have no idea where this >> petsc string comes from. >> This is on Cori and the application uses cmake. >> I can run PETSc tests fine. >> >> Any ideas? >> >> Thanks, >> Mark >> > > > -- > Stefano > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Mar 21 15:29:31 2021 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 21 Mar 2021 16:29:31 -0400 Subject: [petsc-users] Re-ordering in DMPlexCreateFromCellListParallelPetsc In-Reply-To: References: Message-ID: On Sat, Mar 20, 2021 at 10:07 AM Nicolas Barral < nicolas.barral at math.u-bordeaux.fr> wrote: > Hi all, > > I'm building a plex from elements arrays using > DMPlexCreateFromCellListParallelPetsc. Once the plex is built, I need to > set up boundary labels. I have an array of faces containing a series of > 3 vertex local indices. To rebuild boundary labels, I need to loop over > the array and get the join of 3 consecutive points to find the > corresponding face point in the DAG. > This is very common. We should have a built-in thing that does this. > Problem, vertices get reordered by DMPlexCreateFromCellListParallelPetsc > so that locally owned vertices are before remote ones, so local indices > are changed and the indices in the face array are not good anymore. > This is not exactly what happens. I will talk through the algorithm so that maybe we can find a good interface. I can probably write the code quickly: 1) We take in cells[numCells, numCorners], which is a list of all the vertices in each cell The vertex numbers do not have to be a contiguous set. You can have any integers you want. 
2) We create a sorted list of the unique vertex numbers on each process. The new local vertex numbers are the locations in this list. Here is my proposed interface. We preserve this list of unique vertices, just as we preserve the vertexSF. Then after DMPlexCreateFromCellListParallelPetsc(), can DMPlexInterpolate(), you could call DMPlexBuildFaceLabelsFromCellList(dm, numFaces, faces, labelName, labelValues) Would that work for you? I think I could do that in a couple of hours. Thanks, Matt > Is there a way to track this renumbering ? For owned vertices, I can > find the local index from the global one (so do old local index -> > global index -> new local index). For the remote ones, I'm not sure. I > can hash global indices, but is there a more idiomatic way ? > > Thanks, > > -- > Nicolas > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From snailsoar at hotmail.com Sun Mar 21 18:22:22 2021 From: snailsoar at hotmail.com (feng wang) Date: Sun, 21 Mar 2021 23:22:22 +0000 Subject: [petsc-users] Questions on matrix-free GMRES implementation In-Reply-To: <38509E59-D27A-47C6-8D97-EAAEBFC15FBF@petsc.dev> References: <584E6514-C3C6-469B-A256-5470811D8D52@petsc.dev> , <38509E59-D27A-47C6-8D97-EAAEBFC15FBF@petsc.dev> Message-ID: Hi Barry, Thanks for your help, I really appreciate it. In the end I used a shell matrix to compute the matrix-vector product, it is clearer to me and there are more things under my control. I am now trying to do a parallel implementation, I have some questions on setting up parallel matrices and vectors for a user-defined partition, could you please provide some advice? Suppose I have already got a partition for 2 CPUs. Each cpu is assigned a list of elements and also their halo elements. 1. The global element index for each partition is not necessarily continuous, do I have to I re-order them to make them continuous? 2. When I set up the size of the matrix and vectors for each cpu, should I take into account the halo elements? 3. In my serial version, when I initialize my RHS vector, I am not using VecSetValues, Instead I use VecGetArray/VecRestoreArray to assign the values. VecAssemblyBegin()/VecAssemblyEnd() is never used. would this still work for a parallel version? Thanks, Feng ________________________________ From: Barry Smith Sent: 12 March 2021 23:40 To: feng wang Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation On Mar 12, 2021, at 9:37 AM, feng wang > wrote: Hi Matt, Thanks for your prompt response. Below are my two versions. one is buggy and the 2nd one is working. For the first one, I add the diagonal contribution to the true RHS (variable: rhs) and then set the base point, the callback function is somehow called twice afterwards to compute Jacobian. Do you mean "to compute the Jacobian matrix-vector product?" Is it only in the first computation of the product (for the given base vector) that it calls it twice or every matrix-vector product? It is possible there is a bug in our logic; run in the debugger with a break point in FormFunction_mf and each time the function is hit in the debugger type where or bt to get the stack frames from the calls. Send this. From this we can all see if it is being called excessively and why. 
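[For reference, a minimal sketch of the shell-matrix route mentioned near the top of this message, assuming a user context struct and an application routine that applies (J+D) to a vector; none of these names come from the actual code in this thread. MatCreateShell() is given the same local/global sizes the KSP vectors will have, and MatShellSetOperation() registers the user-supplied MatMult.]

#include <petscksp.h>

/* Hypothetical user context; the application stores whatever it needs
   to evaluate the matrix-free product y = (J + D) x. */
typedef struct {
  void *app;
} ShellCtx;

static PetscErrorCode ShellMatMult(Mat A, Vec x, Vec y)
{
  ShellCtx       *ctx;
  PetscErrorCode  ierr;

  PetscFunctionBeginUser;
  ierr = MatShellGetContext(A, &ctx);CHKERRQ(ierr);
  /* ... application code: read x, compute y = (J + D) x, write y ... */
  PetscFunctionReturn(0);
}

static PetscErrorCode CreateShellOperator(MPI_Comm comm, PetscInt nlocal, ShellCtx *ctx, Mat *A)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatCreateShell(comm, nlocal, nlocal, PETSC_DETERMINE, PETSC_DETERMINE, ctx, A);CHKERRQ(ierr);
  ierr = MatShellSetOperation(*A, MATOP_MULT, (void (*)(void))ShellMatMult);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

[The shell matrix is then passed to KSPSetOperators() in place of the MFFD matrix, together with the assembled preconditioning matrix, exactly as in the MFFD variant quoted later in this thread.]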
For the 2nd one, I just call the callback function manually to recompute everything, the callback function is then called once as expected to compute the Jacobian. For me, both versions should do the same things. but I don't know why in the first one the callback function is called twice after I set the base point. what could possibly go wrong? The logic of how it is suppose to work is shown below. Thanks, Feng //This does not work fld->cnsv( iqs,iqe, q, aux, csv ); //add contribution of time-stepping for(iv=0; ivcnsv( iqs,iqe, q, aux, csv ); ierr = petsc_setcsv(petsc_csv); CHKERRQ(ierr); ierr = FormFunction_mf(this, petsc_csv, petsc_baserhs); //this is my callback function, now call it manually ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); Since you provide petsc_baserhs MatMFFD assumes (naturally) that you will keep the correct values in it. Hence for each new base value YOU need to compute the new values in petsc_baserhs. This approach gives you a bit more control over reusing the information in petsc_baserhs. If you would prefer that MatMFFD recomputes the base values, as needed, then you call FormFunction_mf(this, petsc_csv, NULL); and PETSc will allocate a vector and fill it up as needed by calling your FormFunction_mf() But you need to call MatAssemblyBegin/End each time you the base input vector this, petsc_csv values change. For example MatAssemblyBegin(petsc_A_mf,...) MatAssemblyEnd(petsc_A_mf,...) KSPSolve() ________________________________ From: Matthew Knepley > Sent: 12 March 2021 15:08 To: feng wang > Cc: Barry Smith >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation On Fri, Mar 12, 2021 at 9:55 AM feng wang > wrote: Hi Mat, Thanks for your reply. I will try the parallel implementation. I've got a serial matrix-free GMRES working, but I would like to know why my initial version of matrix-free implementation does not work and there is still something I don't understand. I did some debugging and find that the callback function to compute the RHS for the matrix-free matrix is called twice by Petsc when it computes the finite difference Jacobian, but it should only be called once. I don't know why, could you please give some advice? F is called once to calculate the base point and once to get the perturbation. The base point is not recalculated, so if you do many iterates, it is amortized. Thanks, Matt Thanks, Feng ________________________________ From: Matthew Knepley > Sent: 12 March 2021 12:05 To: feng wang > Cc: Barry Smith >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation On Fri, Mar 12, 2021 at 6:02 AM feng wang > wrote: Hi Barry, Thanks for your advice. You are right on this. somehow there is some inconsistency when I compute the right hand side (true RHS + time-stepping contribution to the diagonal matrix) to compute the finite difference Jacobian. If I just use the call back function to recompute my RHS before I call MatMFFDSetBase, then it works like a charm. But now I end up with computing my RHS three times. 1st time is to compute the true RHS, the rest two is for computing finite difference Jacobian. In my previous buggy version, I only compute RHS twice. If possible, could you elaborate on your comments "Also be careful about petsc_baserhs", so I may possibly understand what was going on with my buggy version. Our FD implementation is simple. 
It approximates the action of the Jacobian as J(b) v = (F(b + h v) - F(b)) / h ||v|| where h is some small parameter and b is the base vector, namely the one that you are linearizing around. In a Newton step, b is the previous solution and v is the proposed solution update. Besides, for a parallel implementation, my code already has its own partition method, is it possible to allow petsc read in a user-defined partition? if not what is a better way to do this? Sure https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecSetSizes.html Thanks, Matt Many thanks, Feng ________________________________ From: Barry Smith > Sent: 11 March 2021 22:15 To: feng wang > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation Feng, The first thing to check is that for each linear solve that involves a new operator (values in the base vector) the MFFD matrix knows it is using a new operator. The easiest way is to call MatMFFDSetBase() before each solve that involves a new operator (new values in the base vector). Also be careful about petsc_baserhs, when you change the base vector's values you also need to change the petsc_baserhs values to the function evaluation at that point. If that is correct I would check with a trivial function evaluator to make sure the infrastructure is all set up correctly. For examples use for the matrix free a 1 4 1 operator applied matrix free. Barry On Mar 11, 2021, at 7:35 AM, feng wang > wrote: Dear All, I am new to petsc and trying to implement a matrix-free GMRES. I have assembled an approximate Jacobian matrix just for preconditioning. After reading some previous questions on this topic, my approach is: the matrix-free matrix is created as: ierr = MatCreateMFFD(*A_COMM_WORLD, iqe*blocksize, iqe*blocksize, PETSC_DETERMINE, PETSC_DETERMINE, &petsc_A_mf); CHKERRQ(ierr); ierr = MatMFFDSetFunction(petsc_A_mf, FormFunction_mf, this); CHKERRQ(ierr); KSP linear operator is set up as: ierr = KSPSetOperators(petsc_ksp, petsc_A_mf, petsc_A_pre); CHKERRQ(ierr); //petsc_A_pre is my assembled pre-conditioning matrix Before calling KSPSolve, I do: ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); //petsc_csv is the flow states, petsc_baserhs is the pre-computed right hand side The call back function is defined as: PetscErrorCode cFdDomain::FormFunction_mf(void *ctx, Vec in_vec, Vec out_vec) { PetscErrorCode ierr; cFdDomain *user_ctx; cout << "FormFunction_mf called\n"; //in_vec: flow states //out_vec: right hand side + diagonal contributions from CFL number user_ctx = (cFdDomain*)ctx; //get perturbed conservative variables from petsc user_ctx->petsc_getcsv(in_vec); //get new right side user_ctx->petsc_fd_rhs(); //set new right hand side to the output vector user_ctx->petsc_setrhs(out_vec); ierr = 0; return ierr; } The linear system I am solving is (J+D)x=RHS. J is the Jacobian matrix. D is a diagonal matrix and it is used to stabilise the solution at the start but reduced gradually when the solution moves on to recover Newton's method. I add D*x to the true right side when non-linear function is computed to work out finite difference Jacobian, so when finite difference is used, it actually computes (J+D)*dx. The code runs but diverges in the end. If I don't do matrix-free and use my approximate Jacobian matrix, GMRES works. So something is wrong with my matrix-free implementation. Have I missed something in my implementation? 
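[As a side note on checking a matrix-free operator: one rough sanity check, not something prescribed in this thread, is to let PETSc build an explicit copy of the operator by applying it to every basis vector and then compare that copy against the assembled preconditioning matrix. Recent PETSc releases provide MatComputeOperator() for this; it is only affordable for small test problems, and with the (J+D) shift described here the two matrices are not expected to agree exactly, only to be close. The variable names below are taken from the code quoted in this thread.]

Mat       Jexplicit;
PetscReal nrm;

ierr = MatComputeOperator(petsc_A_mf, MATAIJ, &Jexplicit);CHKERRQ(ierr); /* apply the MFFD operator to all basis vectors */
ierr = MatAXPY(Jexplicit, -1.0, petsc_A_pre, DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr); /* Jexplicit -= petsc_A_pre */
ierr = MatNorm(Jexplicit, NORM_FROBENIUS, &nrm);CHKERRQ(ierr);
ierr = PetscPrintf(PETSC_COMM_WORLD, "||J_mf - J_assembled||_F = %g\n", (double)nrm);CHKERRQ(ierr);
ierr = MatDestroy(&Jexplicit);CHKERRQ(ierr);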
Besides, is there a way to check if the finite difference Jacobian matrix is computed correctly in a matrix-free implementation? Thanks for your help in advance. Feng -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Sun Mar 21 20:28:28 2021 From: bsmith at petsc.dev (Barry Smith) Date: Sun, 21 Mar 2021 20:28:28 -0500 Subject: [petsc-users] Questions on matrix-free GMRES implementation In-Reply-To: References: <584E6514-C3C6-469B-A256-5470811D8D52@petsc.dev> <38509E59-D27A-47C6-8D97-EAAEBFC15FBF@petsc.dev> Message-ID: <5B018B57-B679-4015-8097-042B7C6B9D38@petsc.dev> > On Mar 21, 2021, at 6:22 PM, feng wang wrote: > > Hi Barry, > > Thanks for your help, I really appreciate it. > > In the end I used a shell matrix to compute the matrix-vector product, it is clearer to me and there are more things under my control. I am now trying to do a parallel implementation, I have some questions on setting up parallel matrices and vectors for a user-defined partition, could you please provide some advice? Suppose I have already got a partition for 2 CPUs. Each cpu is assigned a list of elements and also their halo elements. > The global element index for each partition is not necessarily continuous, do I have to I re-order them to make them continuous? Yes, in some sense. So long as each process can march over ITS elements computing the function and Jacobian matrix-vector product it doesn't matter how you have named/numbered entries. But conceptually the first process has the first set of vector entries and the second the second set. > When I set up the size of the matrix and vectors for each cpu, should I take into account the halo elements? The matrix and vectors the algebraic solvers see DO NOT have halo elements in their sizes. You will likely need a halo-ed work vector to do the matrix-free multiply from. The standard model is use VecScatterBegin/End to get the values from the non-halo-ed algebraic vector input to MatMult into a halo-ed one to do the local product. > In my serial version, when I initialize my RHS vector, I am not using VecSetValues, Instead I use VecGetArray/VecRestoreArray to assign the values. VecAssemblyBegin()/VecAssemblyEnd() is never used. would this still work for a parallel version? Yes, you can use Get/Restore but the input vector x will need to be, as noted above, scattered into a haloed version to get all the entries you will need to do the local part of the product. > Thanks, > Feng > > From: Barry Smith > > Sent: 12 March 2021 23:40 > To: feng wang > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation > > > >> On Mar 12, 2021, at 9:37 AM, feng wang > wrote: >> >> Hi Matt, >> >> Thanks for your prompt response. >> >> Below are my two versions. one is buggy and the 2nd one is working. For the first one, I add the diagonal contribution to the true RHS (variable: rhs) and then set the base point, the callback function is somehow called twice afterwards to compute Jacobian. > > Do you mean "to compute the Jacobian matrix-vector product?" 
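[A minimal sketch of the halo-ed work vector pattern Barry describes above, under the assumption that the application can list, on each process, the global indices it needs (owned entries plus halo); all names here are hypothetical. The scatter is built once against a distributed template vector with the same layout as the KSP vectors, and then reused inside the shell MatMult on the vector the KSP passes in.]

/* One-time setup */
IS         isNeeded;     /* global indices of owned + halo entries, in the order the local kernel expects */
Vec        xWork;        /* sequential work vector holding owned + halo values */
VecScatter haloScatter;

ierr = ISCreateGeneral(PETSC_COMM_SELF, nOwned + nHalo, neededGlobalIdx, PETSC_COPY_VALUES, &isNeeded);CHKERRQ(ierr);
ierr = VecCreateSeq(PETSC_COMM_SELF, nOwned + nHalo, &xWork);CHKERRQ(ierr);
ierr = VecScatterCreate(xGlobal, isNeeded, xWork, NULL, &haloScatter);CHKERRQ(ierr); /* NULL IS: fill xWork in order */

/* Inside the matrix-free MatMult, for the input vector x the KSP provides */
ierr = VecScatterBegin(haloScatter, x, xWork, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
ierr = VecScatterEnd(haloScatter, x, xWork, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
/* ... compute the local part of the product from xWork, write only the owned entries of y ... */

[VecCreateGhost() with VecGhostUpdateBegin()/End() is an alternative that bundles the same owned-plus-ghost pattern into a single vector object.]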
> > Is it only in the first computation of the product (for the given base vector) that it calls it twice or every matrix-vector product? > > It is possible there is a bug in our logic; run in the debugger with a break point in FormFunction_mf and each time the function is hit in the debugger type where or bt to get the stack frames from the calls. Send this. From this we can all see if it is being called excessively and why. > >> For the 2nd one, I just call the callback function manually to recompute everything, the callback function is then called once as expected to compute the Jacobian. For me, both versions should do the same things. but I don't know why in the first one the callback function is called twice after I set the base point. what could possibly go wrong? > > The logic of how it is suppose to work is shown below. >> >> Thanks, >> Feng >> >> //This does not work >> fld->cnsv( iqs,iqe, q, aux, csv ); >> //add contribution of time-stepping >> for(iv=0; iv> { >> for(iq=0; iq> { >> //use conservative variables here >> rhs[iv][iq] = -rhs[iv][iq] + csv[iv][iq]*lhsa[nlhs-1][iq]/cfl; >> } >> } >> ierr = petsc_setcsv(petsc_csv); CHKERRQ(ierr); >> ierr = petsc_setrhs(petsc_baserhs); CHKERRQ(ierr); >> ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); >> >> //This works >> fld->cnsv( iqs,iqe, q, aux, csv ); >> ierr = petsc_setcsv(petsc_csv); CHKERRQ(ierr); >> ierr = FormFunction_mf(this, petsc_csv, petsc_baserhs); //this is my callback function, now call it manually >> ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); >> >> > Since you provide petsc_baserhs MatMFFD assumes (naturally) that you will keep the correct values in it. Hence for each new base value YOU need to compute the new values in petsc_baserhs. This approach gives you a bit more control over reusing the information in petsc_baserhs. > > If you would prefer that MatMFFD recomputes the base values, as needed, then you call FormFunction_mf(this, petsc_csv, NULL); and PETSc will allocate a vector and fill it up as needed by calling your FormFunction_mf() But you need to call MatAssemblyBegin/End each time you the base input vector this, petsc_csv values change. For example > > MatAssemblyBegin(petsc_A_mf,...) > MatAssemblyEnd(petsc_A_mf,...) > KSPSolve() > > > > >> >> From: Matthew Knepley > >> Sent: 12 March 2021 15:08 >> To: feng wang > >> Cc: Barry Smith >; petsc-users at mcs.anl.gov > >> Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation >> >> On Fri, Mar 12, 2021 at 9:55 AM feng wang > wrote: >> Hi Mat, >> >> Thanks for your reply. I will try the parallel implementation. >> >> I've got a serial matrix-free GMRES working, but I would like to know why my initial version of matrix-free implementation does not work and there is still something I don't understand. I did some debugging and find that the callback function to compute the RHS for the matrix-free matrix is called twice by Petsc when it computes the finite difference Jacobian, but it should only be called once. I don't know why, could you please give some advice? >> >> F is called once to calculate the base point and once to get the perturbation. The base point is not recalculated, so if you do many iterates, it is amortized. 
>> >> Thanks, >> >> Matt >> >> Thanks, >> Feng >> >> >> >> From: Matthew Knepley > >> Sent: 12 March 2021 12:05 >> To: feng wang > >> Cc: Barry Smith >; petsc-users at mcs.anl.gov > >> Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation >> >> On Fri, Mar 12, 2021 at 6:02 AM feng wang > wrote: >> Hi Barry, >> >> Thanks for your advice. >> >> You are right on this. somehow there is some inconsistency when I compute the right hand side (true RHS + time-stepping contribution to the diagonal matrix) to compute the finite difference Jacobian. If I just use the call back function to recompute my RHS before I call MatMFFDSetBase, then it works like a charm. But now I end up with computing my RHS three times. 1st time is to compute the true RHS, the rest two is for computing finite difference Jacobian. >> >> In my previous buggy version, I only compute RHS twice. If possible, could you elaborate on your comments "Also be careful about petsc_baserhs", so I may possibly understand what was going on with my buggy version. >> >> Our FD implementation is simple. It approximates the action of the Jacobian as >> >> J(b) v = (F(b + h v) - F(b)) / h ||v|| >> >> where h is some small parameter and b is the base vector, namely the one that you are linearizing around. In a Newton step, b is the previous solution >> and v is the proposed solution update. >> >> Besides, for a parallel implementation, my code already has its own partition method, is it possible to allow petsc read in a user-defined partition? if not what is a better way to do this? >> >> Sure >> >> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecSetSizes.html >> >> Thanks, >> >> Matt >> >> Many thanks, >> Feng >> >> From: Barry Smith > >> Sent: 11 March 2021 22:15 >> To: feng wang > >> Cc: petsc-users at mcs.anl.gov > >> Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation >> >> >> Feng, >> >> The first thing to check is that for each linear solve that involves a new operator (values in the base vector) the MFFD matrix knows it is using a new operator. >> >> The easiest way is to call MatMFFDSetBase() before each solve that involves a new operator (new values in the base vector). Also be careful about petsc_baserhs, when you change the base vector's values you also need to change the petsc_baserhs values to the function evaluation at that point. >> >> If that is correct I would check with a trivial function evaluator to make sure the infrastructure is all set up correctly. For examples use for the matrix free a 1 4 1 operator applied matrix free. >> >> Barry >> >> >>> On Mar 11, 2021, at 7:35 AM, feng wang > wrote: >>> >>> Dear All, >>> >>> I am new to petsc and trying to implement a matrix-free GMRES. I have assembled an approximate Jacobian matrix just for preconditioning. 
After reading some previous questions on this topic, my approach is: >>> >>> the matrix-free matrix is created as: >>> >>> ierr = MatCreateMFFD(*A_COMM_WORLD, iqe*blocksize, iqe*blocksize, PETSC_DETERMINE, PETSC_DETERMINE, &petsc_A_mf); CHKERRQ(ierr); >>> ierr = MatMFFDSetFunction(petsc_A_mf, FormFunction_mf, this); CHKERRQ(ierr); >>> >>> KSP linear operator is set up as: >>> >>> ierr = KSPSetOperators(petsc_ksp, petsc_A_mf, petsc_A_pre); CHKERRQ(ierr); //petsc_A_pre is my assembled pre-conditioning matrix >>> >>> Before calling KSPSolve, I do: >>> >>> ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); //petsc_csv is the flow states, petsc_baserhs is the pre-computed right hand side >>> >>> The call back function is defined as: >>> >>> PetscErrorCode cFdDomain::FormFunction_mf(void *ctx, Vec in_vec, Vec out_vec) >>> { >>> PetscErrorCode ierr; >>> cFdDomain *user_ctx; >>> >>> cout << "FormFunction_mf called\n"; >>> >>> //in_vec: flow states >>> //out_vec: right hand side + diagonal contributions from CFL number >>> >>> user_ctx = (cFdDomain*)ctx; >>> >>> //get perturbed conservative variables from petsc >>> user_ctx->petsc_getcsv(in_vec); >>> >>> //get new right side >>> user_ctx->petsc_fd_rhs(); >>> >>> //set new right hand side to the output vector >>> user_ctx->petsc_setrhs(out_vec); >>> >>> ierr = 0; >>> return ierr; >>> } >>> >>> The linear system I am solving is (J+D)x=RHS. J is the Jacobian matrix. D is a diagonal matrix and it is used to stabilise the solution at the start but reduced gradually when the solution moves on to recover Newton's method. I add D*x to the true right side when non-linear function is computed to work out finite difference Jacobian, so when finite difference is used, it actually computes (J+D)*dx. >>> >>> The code runs but diverges in the end. If I don't do matrix-free and use my approximate Jacobian matrix, GMRES works. So something is wrong with my matrix-free implementation. Have I missed something in my implementation? Besides, is there a way to check if the finite difference Jacobian matrix is computed correctly in a matrix-free implementation? >>> >>> Thanks for your help in advance. >>> Feng >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.barral at math.u-bordeaux.fr Mon Mar 22 05:22:55 2021 From: nicolas.barral at math.u-bordeaux.fr (Nicolas Barral) Date: Mon, 22 Mar 2021 11:22:55 +0100 Subject: [petsc-users] Re-ordering in DMPlexCreateFromCellListParallelPetsc In-Reply-To: References: Message-ID: <954681c5-eb9f-a393-f09b-b1efc09bccdc@math.u-bordeaux.fr> On 21/03/2021 21:29, Matthew Knepley wrote: > On Sat, Mar 20, 2021 at 10:07 AM Nicolas Barral > > wrote: > > Hi all, > > I'm building a plex from elements arrays using > DMPlexCreateFromCellListParallelPetsc. Once the plex is built, I > need to > set up boundary labels. I have an array of faces containing a series of > 3 vertex local indices. 
To rebuild boundary labels, I need to loop over > the array and get the join of 3 consecutive points to find the > corresponding face point in the DAG. > > > This is very common. We should have a built-in thing that does this. > Ordering apart, it's not very complicated once you figure out "join" is the right operation to use. We need more doc on graph layout and operations in DMPlex, I think I'm going to make pictures when I'm done with that code because I waste too much time every time. Is there a starting point you like ? > Problem, vertices get reordered by > DMPlexCreateFromCellListParallelPetsc > so that locally owned vertices are before remote ones, so local indices > are changed and the indices in the face array are not good anymore. > > > This is not exactly what happens. I will talk through the algorithm so > that maybe we can find a good > interface. I can probably write the code quickly: > > 1) We take in cells[numCells, numCorners], which is a list of all the > vertices in each cell > > ? ? ?The vertex numbers do not have to be a contiguous set. You can > have any integers you want. > > 2) We create a sorted list of the unique vertex numbers on each process. > The new local vertex numbers > ? ? ?are the locations in this list. > Ok It took me re-writing this email a couple times but I think I understand. I was too focused on local/global indices. But if I get this right, you still make an assumption: that the numVertices*dim coordinates passed in vertexCoords are the coordinates of the numvertices first vertices in the sorted list. Is that true ? > Here is my proposed interface. We preserve this list of unique vertices, > just as we preserve the vertexSF. > Then after DMPlexCreateFromCellListParallelPetsc(), can > DMPlexInterpolate(), you could call > > ? DMPlexBuildFaceLabelsFromCellList(dm, numFaces, faces, labelName, > labelValues) > > Would that work for you? I think I could do that in a couple of hours. > So that function would be very helpful. But is it as simple as you're thinking ? The sorted list gives local index -> unique identifier, but what we need is the other way round, isn't it ? Once I understand better, I can have a first draft in my plexadapt code and we pull it out later. Thanks -- Nicolas > ? Thanks, > > ? ? ?Matt > > Is there a way to track this renumbering ? For owned vertices, I can > find the local index from the global one (so do old local index -> > global index -> new local index). For the remote ones, I'm not sure. I > can hash global indices, but is there a more idiomatic way ? > > Thanks, > > -- > Nicolas > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From knepley at gmail.com Mon Mar 22 07:45:37 2021 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 22 Mar 2021 08:45:37 -0400 Subject: [petsc-users] Re-ordering in DMPlexCreateFromCellListParallelPetsc In-Reply-To: <954681c5-eb9f-a393-f09b-b1efc09bccdc@math.u-bordeaux.fr> References: <954681c5-eb9f-a393-f09b-b1efc09bccdc@math.u-bordeaux.fr> Message-ID: On Mon, Mar 22, 2021 at 6:22 AM Nicolas Barral < nicolas.barral at math.u-bordeaux.fr> wrote: > On 21/03/2021 21:29, Matthew Knepley wrote: > > On Sat, Mar 20, 2021 at 10:07 AM Nicolas Barral > > > > wrote: > > > > Hi all, > > > > I'm building a plex from elements arrays using > > DMPlexCreateFromCellListParallelPetsc. 
Once the plex is built, I > > need to > > set up boundary labels. I have an array of faces containing a series > of > > 3 vertex local indices. To rebuild boundary labels, I need to loop > over > > the array and get the join of 3 consecutive points to find the > > corresponding face point in the DAG. > > > > > > This is very common. We should have a built-in thing that does this. > > > Ordering apart, it's not very complicated once you figure out "join" is > the right operation to use. We need more doc on graph layout and > operations in DMPlex, I think I'm going to make pictures when I'm done > with that code because I waste too much time every time. Is there a > starting point you like ? > I think the discussion in @article{LangeMitchellKnepleyGorman2015, title = {Efficient mesh management in {Firedrake} using {PETSc-DMPlex}}, author = {Michael Lange and Lawrence Mitchell and Matthew G. Knepley and Gerard J. Gorman}, journal = {SIAM Journal on Scientific Computing}, volume = {38}, number = {5}, pages = {S143--S155}, eprint = {http://arxiv.org/abs/1506.07749}, doi = {10.1137/15M1026092}, year = {2016} } is pretty good. > > Problem, vertices get reordered by > > DMPlexCreateFromCellListParallelPetsc > > so that locally owned vertices are before remote ones, so local > indices > > are changed and the indices in the face array are not good anymore. > > > > > > This is not exactly what happens. I will talk through the algorithm so > > that maybe we can find a good > > interface. I can probably write the code quickly: > > > > 1) We take in cells[numCells, numCorners], which is a list of all the > > vertices in each cell > > > > The vertex numbers do not have to be a contiguous set. You can > > have any integers you want. > > > > 2) We create a sorted list of the unique vertex numbers on each process. > > The new local vertex numbers > > are the locations in this list. > > > > Ok It took me re-writing this email a couple times but I think I > understand. I was too focused on local/global indices. But if I get this > right, you still make an assumption: that the numVertices*dim > coordinates passed in vertexCoords are the coordinates of the > numvertices first vertices in the sorted list. Is that true ? > No. It can be arbitrary. That is why we make the vertexSF, so we can map those coordinates back to the right processes. > > Here is my proposed interface. We preserve this list of unique vertices, > > just as we preserve the vertexSF. > > Then after DMPlexCreateFromCellListParallelPetsc(), can > > DMPlexInterpolate(), you could call > > > > DMPlexBuildFaceLabelsFromCellList(dm, numFaces, faces, labelName, > > labelValues) > > > > Would that work for you? I think I could do that in a couple of hours. > > > > So that function would be very helpful. But is it as simple as you're > thinking ? The sorted list gives local index -> unique identifier, but what we need is the other way round, isn't it ? > You get the other way using search. Thanks, Matt > Once I understand better, I can have a first draft in my plexadapt code > and we pull it out later. > > Thanks > > -- > Nicolas > > > > Thanks, > > > > Matt > > > > Is there a way to track this renumbering ? For owned vertices, I can > > find the local index from the global one (so do old local index -> > > global index -> new local index). For the remote ones, I'm not sure. > I > > can hash global indices, but is there a more idiomatic way ? 
> > > > Thanks, > > > > -- > > Nicolas > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ < > http://www.cse.buffalo.edu/~knepley/> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From skavou1 at lsu.edu Mon Mar 22 09:04:30 2021 From: skavou1 at lsu.edu (Sepideh Kavousi) Date: Mon, 22 Mar 2021 14:04:30 +0000 Subject: [petsc-users] PF+Navier stokes Message-ID: Hello, I want to solve PF solidification+Navier stokes using Finite different method, and I have a strange problem. My code runs fine for some system sizes and fails for some of the system sizes. When I run with the following options: mpirun -np 2 ./one.out -ts_monitor -snes_fd_color -ts_max_snes_failures -1 -ts_type bdf -ts_bdf_adapt -pc_type bjacobi -snes_linesearch_type l2 -snes_type ksponly -ksp_type gmres -ksp_gmres_restart 1001 -sub_pc_type ilu -sub_ksp_type preonly -snes_monitor -ksp_monitor -snes_linesearch_monitor -ksp_monitor_true_residual -ksp_converged_reason -log_view 0 SNES Function norm 1.465357113711e+01 0 SNES Function norm 1.465357113711e+01 Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC_FAILED due to SUBPC_ERROR Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC_FAILED due to SUBPC_ERROR 0 SNES Function norm 1.465357113711e+01 0 SNES Function norm 1.465357113711e+01 Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC_FAILED due to SUBPC_ERROR Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC_FAILED due to SUBPC_ERROR 0 SNES Function norm 1.465357113711e+01 0 SNES Function norm 1.465357113711e+01 ^C Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC_FAILED due to SUBPC_ERROR 0 SNES Function norm 1.465357113711e+01 Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC_FAILED due to SUBPC_ERROR 0 SNES Function norm 1.465357113711e+01 Even setting pc_type to LU does not solve the problem. 0 TS dt 0.0001 time 0. copy! copy! Write output at step= 0! Write output at step= 0! 0 SNES Function norm 1.465357113711e+01 0 SNES Function norm 1.465357113711e+01 Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC_FAILED due to FACTOR_NUMERIC_ZEROPIVOT Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC_FAILED due to FACTOR_NUMERIC_ZEROPIVOT I guess the problem is that in mass conservation I used forward discretization for u (velocity in x) and for the moment in x , I used forward discretization for p (pressure) to ensure non-zero terms on the diagonal of matrix. I tried to run it with valgrind but it did not output anything. Does anyone have suggestions on how to solve this issue? Best, Sepideh -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Mon Mar 22 09:15:14 2021 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 22 Mar 2021 10:15:14 -0400 Subject: [petsc-users] PF+Navier stokes In-Reply-To: References: Message-ID: On Mon, Mar 22, 2021 at 10:04 AM Sepideh Kavousi wrote: > Hello, > I want to solve PF solidification+Navier stokes using Finite different > method, and I have a strange problem. My code runs fine for some system > sizes and fails for some of the system sizes. When I run with the following > options: > mpirun -np 2 ./one.out -ts_monitor -snes_fd_color -ts_max_snes_failures -1 > -ts_type bdf -ts_bdf_adapt -pc_type bjacobi -snes_linesearch_type l2 > -snes_type ksponly -ksp_type gmres -ksp_gmres_restart 1001 -sub_pc_type ilu > -sub_ksp_type preonly -snes_monitor -ksp_monitor -snes_linesearch_monitor > -ksp_monitor_true_residual -ksp_converged_reason -log_view > > 0 SNES Function norm 1.465357113711e+01 > 0 SNES Function norm 1.465357113711e+01 > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC_FAILED due to SUBPC_ERROR > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC_FAILED due to SUBPC_ERROR > 0 SNES Function norm 1.465357113711e+01 > 0 SNES Function norm 1.465357113711e+01 > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC_FAILED due to SUBPC_ERROR > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC_FAILED due to SUBPC_ERROR > 0 SNES Function norm 1.465357113711e+01 > 0 SNES Function norm 1.465357113711e+01 > ^C Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC_FAILED due to SUBPC_ERROR > 0 SNES Function norm 1.465357113711e+01 > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC_FAILED due to SUBPC_ERROR > 0 SNES Function norm 1.465357113711e+01 > > Even setting pc_type to LU does not solve the problem. > 0 TS dt 0.0001 time 0. > copy! > copy! > Write output at step= 0! > Write output at step= 0! > 0 SNES Function norm 1.465357113711e+01 > 0 SNES Function norm 1.465357113711e+01 > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC_FAILED due to FACTOR_NUMERIC_ZEROPIVOT > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC_FAILED due to FACTOR_NUMERIC_ZEROPIVOT > > I guess the problem is that in mass conservation I used forward > discretization for u (velocity in x) and for the moment in x , I used > forward discretization for p (pressure) to ensure non-zero terms on the > diagonal of matrix. I tried to run it with valgrind but it did not output > anything. > > Does anyone have suggestions on how to solve this issue? > Your subproblems in Block_jacobi are singular. With multiphysics problems like this, definition of blocks can be tricky. I would first try to find a good preconditioner for this system in the literature, and then we can help you try it out. Thanks, Matt > > Best, > Sepideh > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
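[Not part of the reply above, but for readers who hit the same SUBPC_ERROR / FACTOR_NUMERIC_ZEROPIVOT failures on coupled systems with a pressure-like variable: a common first experiment in PETSc is a fieldsplit preconditioner, for example a Schur-complement split between the momentum/phase-field unknowns and the pressure. Whether that is appropriate for this particular discretization is exactly the "find a preconditioner in the literature" step suggested above; the snippet below is only a sketch, and the index sets are hypothetical.]

PC pc;

/* 'ksp' would come from the TS, e.g. via TSGetSNES() followed by SNESGetKSP(). */
ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
ierr = PCSetType(pc, PCFIELDSPLIT);CHKERRQ(ierr);
ierr = PCFieldSplitSetType(pc, PC_COMPOSITE_SCHUR);CHKERRQ(ierr);
ierr = PCFieldSplitSetIS(pc, "uvphi", isMomentumAndPhase);CHKERRQ(ierr); /* hypothetical IS for velocity + phase field */
ierr = PCFieldSplitSetIS(pc, "p",     isPressure);CHKERRQ(ierr);         /* hypothetical IS for pressure */
ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); /* so -fieldsplit_* command-line options still apply */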
URL: From kruger at txcorp.com Mon Mar 22 09:45:30 2021 From: kruger at txcorp.com (Scott Kruger) Date: Mon, 22 Mar 2021 08:45:30 -0600 Subject: [petsc-users] funny link error In-Reply-To: References: Message-ID: <20210322144530.6kge37t3fuxswz32@txcorp.com> >From you make.log, it looks like petsc found it here: /opt/cray/wlm_detect/1.3.3-7.0.1.1_4.19__g7109084.ari/lib64 PETSc has it and found it because BuildSystem did the query of what it takes to get C/C++/Fortran to work together. Why is it needed? That's up to Cray. But the question is: How is your CMake build getting PETSc info? If a) it using pkg-config and CMake's ability to parse it, then it looks like our pkg-config export might need work. We'd need to see the pkg-config file to be sure though. and if b) it is not using pkg-config, then the answer is it should. Scott On 2021-03-21 08:25, Mark Adams did write: > We are having problems with linking and use static linking. > We get this error and have seen others like it (eg, lpetsc_lib_gcc_s) > > /usr/bin/ld: cannot find -lpetsc_lib_wlm_detect-NOTFOUND > > wlm_detect is some sort of system library, but I have no idea where this > petsc string comes from. > This is on Cori and the application uses cmake. > I can run PETSc tests fine. > > Any ideas? > > Thanks, > Mark -- Scott Kruger Tech-X Corporation kruger at txcorp.com 5621 Arapahoe Ave, Suite A Phone: (720) 466-3196 Boulder, CO 80303 Fax: (303) 448-7756 From mfadams at lbl.gov Mon Mar 22 10:03:39 2021 From: mfadams at lbl.gov (Mark Adams) Date: Mon, 22 Mar 2021 11:03:39 -0400 Subject: [petsc-users] funny link error In-Reply-To: <20210322144530.6kge37t3fuxswz32@txcorp.com> References: <20210322144530.6kge37t3fuxswz32@txcorp.com> Message-ID: Thanks Scott, Can you please tell us where this pkg-config file is? Thank again, Mark On Mon, Mar 22, 2021 at 10:45 AM Scott Kruger wrote: > > From you make.log, it looks like petsc found it here: > /opt/cray/wlm_detect/1.3.3-7.0.1.1_4.19__g7109084.ari/lib64 > > PETSc has it and found it because BuildSystem did the query of what > it takes to get C/C++/Fortran to work together. Why is it needed? > That's up to Cray. > > But the question is: > > How is your CMake build getting PETSc info? > > If > a) it using pkg-config and CMake's ability to parse it, then it looks like > our pkg-config export might need work. > We'd need to see the pkg-config file to be sure though. > > and if > b) it is not using pkg-config, then the answer is it should. > > Scott > > > On 2021-03-21 08:25, Mark Adams did write: > > We are having problems with linking and use static linking. > > We get this error and have seen others like it (eg, lpetsc_lib_gcc_s) > > > > /usr/bin/ld: cannot find -lpetsc_lib_wlm_detect-NOTFOUND > > > > wlm_detect is some sort of system library, but I have no idea where this > > petsc string comes from. > > This is on Cori and the application uses cmake. > > I can run PETSc tests fine. > > > > Any ideas? > > > > Thanks, > > Mark > > > > > -- > Scott Kruger > Tech-X Corporation kruger at txcorp.com > 5621 Arapahoe Ave, Suite A Phone: (720) 466-3196 > Boulder, CO 80303 Fax: (303) 448-7756 > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Mon Mar 22 11:11:12 2021 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 22 Mar 2021 12:11:12 -0400 Subject: [petsc-users] funny link error In-Reply-To: References: <20210322144530.6kge37t3fuxswz32@txcorp.com> Message-ID: On Mon, Mar 22, 2021 at 11:04 AM Mark Adams wrote: > Thanks Scott, > > Can you please tell us where this pkg-config file is? > Looks like ${PETSC_ARCH}/lib/pkgconfig Matt > Thank again, > Mark > > On Mon, Mar 22, 2021 at 10:45 AM Scott Kruger wrote: > >> >> From you make.log, it looks like petsc found it here: >> /opt/cray/wlm_detect/1.3.3-7.0.1.1_4.19__g7109084.ari/lib64 >> >> PETSc has it and found it because BuildSystem did the query of what >> it takes to get C/C++/Fortran to work together. Why is it needed? >> That's up to Cray. >> >> But the question is: >> >> How is your CMake build getting PETSc info? >> >> If >> a) it using pkg-config and CMake's ability to parse it, then it looks >> like our pkg-config export might need work. >> We'd need to see the pkg-config file to be sure though. >> >> and if >> b) it is not using pkg-config, then the answer is it should. >> >> Scott >> >> >> On 2021-03-21 08:25, Mark Adams did write: >> > We are having problems with linking and use static linking. >> > We get this error and have seen others like it (eg, lpetsc_lib_gcc_s) >> > >> > /usr/bin/ld: cannot find -lpetsc_lib_wlm_detect-NOTFOUND >> > >> > wlm_detect is some sort of system library, but I have no idea where this >> > petsc string comes from. >> > This is on Cori and the application uses cmake. >> > I can run PETSc tests fine. >> > >> > Any ideas? >> > >> > Thanks, >> > Mark >> >> >> >> >> -- >> Scott Kruger >> Tech-X Corporation kruger at txcorp.com >> 5621 Arapahoe Ave, Suite A Phone: (720) 466-3196 >> Boulder, CO 80303 Fax: (303) 448-7756 >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From kruger at txcorp.com Mon Mar 22 11:16:39 2021 From: kruger at txcorp.com (Scott Kruger) Date: Mon, 22 Mar 2021 10:16:39 -0600 Subject: [petsc-users] funny link error In-Reply-To: References: <20210322144530.6kge37t3fuxswz32@txcorp.com> Message-ID: <20210322161639.kn4dnwwqeypqter7@txcorp.com> The short answer is $PETSC_INSTALL_DIR/lib/pkgconfig/PETSc.pc Attached is a PETSc CMake snippet that should show you how to use pkg-config. It also shows how to set the compilers to the same as PETSc's for build consistency, but this part is not strictly necessary (but perhaps a good idea). FYI, Barry has been pushing to get a CMake build example as part of the test harness with the help of Jed and myself, but the Windows issues have been a real hold-up. Scott On 2021-03-22 11:03, Mark Adams did write: > Thanks Scott, > > Can you please tell us where this pkg-config file is? > > Thank again, > Mark > > On Mon, Mar 22, 2021 at 10:45 AM Scott Kruger wrote: > > > > > From you make.log, it looks like petsc found it here: > > /opt/cray/wlm_detect/1.3.3-7.0.1.1_4.19__g7109084.ari/lib64 > > > > PETSc has it and found it because BuildSystem did the query of what > > it takes to get C/C++/Fortran to work together. Why is it needed? > > That's up to Cray. > > > > But the question is: > > > > How is your CMake build getting PETSc info? 
> > > > If > > a) it using pkg-config and CMake's ability to parse it, then it looks like > > our pkg-config export might need work. > > We'd need to see the pkg-config file to be sure though. > > > > and if > > b) it is not using pkg-config, then the answer is it should. > > > > Scott > > > > > > On 2021-03-21 08:25, Mark Adams did write: > > > We are having problems with linking and use static linking. > > > We get this error and have seen others like it (eg, lpetsc_lib_gcc_s) > > > > > > /usr/bin/ld: cannot find -lpetsc_lib_wlm_detect-NOTFOUND > > > > > > wlm_detect is some sort of system library, but I have no idea where this > > > petsc string comes from. > > > This is on Cori and the application uses cmake. > > > I can run PETSc tests fine. > > > > > > Any ideas? > > > > > > Thanks, > > > Mark > > > > > > > > > > -- > > Scott Kruger > > Tech-X Corporation kruger at txcorp.com > > 5621 Arapahoe Ave, Suite A Phone: (720) 466-3196 > > Boulder, CO 80303 Fax: (303) 448-7756 > > -- Scott Kruger Tech-X Corporation kruger at txcorp.com 5621 Arapahoe Ave, Suite A Phone: (720) 466-3196 Boulder, CO 80303 Fax: (303) 448-7756 -------------- next part -------------- # set root of location to find PETSc configuration set(PETSC $ENV{PETSC_DIR}/$ENV{PETSC_ARCH}) set(ENV{PKG_CONFIG_PATH} ${PETSC}/lib/pkgconfig) # Determine PETSc compilers execute_process ( COMMAND pkg-config PETSc --variable=ccompiler COMMAND tr -d '\n' OUTPUT_VARIABLE C_COMPILER) SET(CMAKE_C_COMPILER ${C_COMPILER}) execute_process ( COMMAND pkg-config PETSc --variable=cxxcompiler COMMAND tr -d '\n' OUTPUT_VARIABLE CXX_COMPILER) if (CXX_COMPILER) SET(CMAKE_CXX_COMPILER ${CXX_COMPILER}) endif (CXX_COMPILER) execute_process ( COMMAND pkg-config PETSc --variable=fcompiler COMMAND tr -d '\n' OUTPUT_VARIABLE FORTRAN_COMPILER) if (FORTRAN_COMPILER) SET(CMAKE_Fortran_COMPILER ${FORTRAN_COMPILER}) enable_language(Fortran) endif (FORTRAN_COMPILER) # Get petsc and it's dependencies find_package(PkgConfig REQUIRED) pkg_search_module(PETSC REQUIRED IMPORTED_TARGET PETSc) target_link_libraries(ex1 PkgConfig::PETSC) From nicolas.barral at math.u-bordeaux.fr Mon Mar 22 11:20:35 2021 From: nicolas.barral at math.u-bordeaux.fr (Nicolas Barral) Date: Mon, 22 Mar 2021 17:20:35 +0100 Subject: [petsc-users] Re-ordering in DMPlexCreateFromCellListParallelPetsc In-Reply-To: References: <954681c5-eb9f-a393-f09b-b1efc09bccdc@math.u-bordeaux.fr> Message-ID: <8c151189-d30d-b0ec-485d-3d75e954210e@math.u-bordeaux.fr> Thanks for your answers Matt. On 22/03/2021 13:45, Matthew Knepley wrote: > On Mon, Mar 22, 2021 at 6:22 AM Nicolas Barral > > wrote: > > On 21/03/2021 21:29, Matthew Knepley wrote: > > On Sat, Mar 20, 2021 at 10:07 AM Nicolas Barral > > > > >> wrote: > > > >? ? ?Hi all, > > > >? ? ?I'm building a plex from elements arrays using > >? ? ?DMPlexCreateFromCellListParallelPetsc. Once the plex is built, I > >? ? ?need to > >? ? ?set up boundary labels. I have an array of faces containing a > series of > >? ? ?3 vertex local indices. To rebuild boundary labels, I need to > loop over > >? ? ?the array and get the join of 3 consecutive points to find the > >? ? ?corresponding face point in the DAG. > > > > > > This is very common. We should have a built-in thing that does this. > > > Ordering apart, it's not very complicated once you figure out "join" is > the right operation to use. We need more doc on graph layout and > operations in DMPlex, I think I'm going to make pictures when I'm done > with that code because I waste too much time every time. 
Is there a > starting point you like ? > > > I think the discussion in > > @article{LangeMitchellKnepleyGorman2015, > ? title ? ? = {Efficient mesh management in {Firedrake} using > {PETSc-DMPlex}}, > ? author ? ?= {Michael Lange and Lawrence Mitchell and Matthew G. > Knepley and Gerard J. Gorman}, > ? journal ? = {SIAM Journal on Scientific Computing}, > ? volume ? ?= {38}, > ? number ? ?= {5}, > ? pages ? ? = {S143--S155}, > ? eprint ? ?= {http://arxiv.org/abs/1506.07749}, > ? doi ? ? ? = {10.1137/15M1026092}, > ? year ? ? ?= {2016} > } > > is pretty good. > It is, I'd just like to have something more complete (meet, join, height, depth...) and with more 2D & 3D pictures. It's all information available somewhere, but I would find it convenient to be all at the same place. Are sources of the paper available somewhere ? > >? ? ?Problem, vertices get reordered by > >? ? ?DMPlexCreateFromCellListParallelPetsc > >? ? ?so that locally owned vertices are before remote ones, so > local indices > >? ? ?are changed and the indices in the face array are not good > anymore. > > > > > > This is not exactly what happens. I will talk through the > algorithm so > > that maybe we can find a good > > interface. I can probably write the code quickly: > > > > 1) We take in cells[numCells, numCorners], which is a list of all > the > > vertices in each cell > > > >? ? ? ?The vertex numbers do not have to be a contiguous set. You can > > have any integers you want. > > > > 2) We create a sorted list of the unique vertex numbers on each > process. > > The new local vertex numbers > >? ? ? ?are the locations in this list. > > > > Ok It took me re-writing this email a couple times but I think I > understand. I was too focused on local/global indices. But if I get > this > right, you still make an assumption: that the numVertices*dim > coordinates passed in vertexCoords are the coordinates of the > numvertices first vertices in the sorted list. Is that true ? > > > No. It can be arbitrary. That is why we make the vertexSF, so we can map > those coordinates back to the right processes. > So how do you know which coordinates correspond to which vertex since no map is explicitly provided ? > > Here is my proposed interface. We preserve this list of unique > vertices, > > just as we preserve the vertexSF. > > Then after DMPlexCreateFromCellListParallelPetsc(), can > > DMPlexInterpolate(), you could call > > > >? ? DMPlexBuildFaceLabelsFromCellList(dm, numFaces, faces, labelName, > > labelValues) > > > > Would that work for you? I think I could do that in a couple of > hours. > > > > So that function would be very helpful. But is it as simple as you're > thinking ? The sorted list gives local index -> unique identifier, but > > what we need is the other way round, isn't it ? > > > You get the other way using search. > Do you mean a linear search for each vertex ? Thanks, -- Nicolas > ? Thanks, > > ? ? ?Matt > > Once I understand better, I can have a first draft in my plexadapt code > and we pull it out later. > > Thanks > > -- > Nicolas > > > >? ? Thanks, > > > >? ? ? ?Matt > > > >? ? ?Is there a way to track this renumbering ? For owned > vertices, I can > >? ? ?find the local index from the global one (so do old local > index -> > >? ? ?global index -> new local index). For the remote ones, I'm > not sure. I > >? ? ?can hash global indices, but is there a more idiomatic way ? > > > >? ? ?Thanks, > > > >? ? ?-- > >? ? 
?Nicolas > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From knepley at gmail.com Mon Mar 22 11:53:06 2021 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 22 Mar 2021 12:53:06 -0400 Subject: [petsc-users] Re-ordering in DMPlexCreateFromCellListParallelPetsc In-Reply-To: <8c151189-d30d-b0ec-485d-3d75e954210e@math.u-bordeaux.fr> References: <954681c5-eb9f-a393-f09b-b1efc09bccdc@math.u-bordeaux.fr> <8c151189-d30d-b0ec-485d-3d75e954210e@math.u-bordeaux.fr> Message-ID: On Mon, Mar 22, 2021 at 12:20 PM Nicolas Barral < nicolas.barral at math.u-bordeaux.fr> wrote: > Thanks for your answers Matt. > > On 22/03/2021 13:45, Matthew Knepley wrote: > > On Mon, Mar 22, 2021 at 6:22 AM Nicolas Barral > > > > wrote: > > > > On 21/03/2021 21:29, Matthew Knepley wrote: > > > On Sat, Mar 20, 2021 at 10:07 AM Nicolas Barral > > > > > > > > >> wrote: > > > > > > Hi all, > > > > > > I'm building a plex from elements arrays using > > > DMPlexCreateFromCellListParallelPetsc. Once the plex is > built, I > > > need to > > > set up boundary labels. I have an array of faces containing a > > series of > > > 3 vertex local indices. To rebuild boundary labels, I need to > > loop over > > > the array and get the join of 3 consecutive points to find the > > > corresponding face point in the DAG. > > > > > > > > > This is very common. We should have a built-in thing that does > this. > > > > > Ordering apart, it's not very complicated once you figure out "join" > is > > the right operation to use. We need more doc on graph layout and > > operations in DMPlex, I think I'm going to make pictures when I'm > done > > with that code because I waste too much time every time. Is there a > > starting point you like ? > > > > > > I think the discussion in > > > > @article{LangeMitchellKnepleyGorman2015, > > title = {Efficient mesh management in {Firedrake} using > > {PETSc-DMPlex}}, > > author = {Michael Lange and Lawrence Mitchell and Matthew G. > > Knepley and Gerard J. Gorman}, > > journal = {SIAM Journal on Scientific Computing}, > > volume = {38}, > > number = {5}, > > pages = {S143--S155}, > > eprint = {http://arxiv.org/abs/1506.07749}, > > doi = {10.1137/15M1026092}, > > year = {2016} > > } > > > > is pretty good. > > > It is, I'd just like to have something more complete (meet, join, > height, depth...) and with more 2D & 3D pictures. It's all information > available somewhere, but I would find it convenient to be all at the > same place. Are sources of the paper available somewhere ? > I think the right answer is no, there is not a complete reference. I have some things in presentations, but no complete work has been written. Maybe it is time to do that. At this point, a book is probably possible :) > > > Problem, vertices get reordered by > > > DMPlexCreateFromCellListParallelPetsc > > > so that locally owned vertices are before remote ones, so > > local indices > > > are changed and the indices in the face array are not good > > anymore. > > > > > > > > > This is not exactly what happens. 
I will talk through the > > algorithm so > > > that maybe we can find a good > > > interface. I can probably write the code quickly: > > > > > > 1) We take in cells[numCells, numCorners], which is a list of all > > the > > > vertices in each cell > > > > > > The vertex numbers do not have to be a contiguous set. You > can > > > have any integers you want. > > > > > > 2) We create a sorted list of the unique vertex numbers on each > > process. > > > The new local vertex numbers > > > are the locations in this list. > > > > > > > Ok It took me re-writing this email a couple times but I think I > > understand. I was too focused on local/global indices. But if I get > > this > > right, you still make an assumption: that the numVertices*dim > > coordinates passed in vertexCoords are the coordinates of the > > numvertices first vertices in the sorted list. Is that true ? > > > > > > No. It can be arbitrary. That is why we make the vertexSF, so we can map > > those coordinates back to the right processes. > > > So how do you know which coordinates correspond to which vertex since no > map is explicitly provided ? > Ah, it is based on the input assumptions when everything is put together. You could call BuildParallel() with any old numbering of vertices. However, if you call CreateParallel(), so that you are also passing in an array for vertex coordinates. Now vertices are assumed to be numbered [0, Nv) to correspond to the input array. Now, that array of coordinates can be chopped up differently in parallel than the vertices for subdomains. The vertexSF can convert between the mappings. > > > Here is my proposed interface. We preserve this list of unique > > vertices, > > > just as we preserve the vertexSF. > > > Then after DMPlexCreateFromCellListParallelPetsc(), can > > > DMPlexInterpolate(), you could call > > > > > > DMPlexBuildFaceLabelsFromCellList(dm, numFaces, faces, > labelName, > > > labelValues) > > > > > > Would that work for you? I think I could do that in a couple of > > hours. > > > > > > > So that function would be very helpful. But is it as simple as you're > > thinking ? The sorted list gives local index -> unique identifier, > but > > > > what we need is the other way round, isn't it ? > > > > > > You get the other way using search. > > > Do you mean a linear search for each vertex ? > Since it is sorted, it is a log search. Thanks, Matt > Thanks, > > -- > Nicolas > > > Thanks, > > > > Matt > > > > Once I understand better, I can have a first draft in my plexadapt > code > > and we pull it out later. > > > > Thanks > > > > -- > > Nicolas > > > > > > > Thanks, > > > > > > Matt > > > > > > Is there a way to track this renumbering ? For owned > > vertices, I can > > > find the local index from the global one (so do old local > > index -> > > > global index -> new local index). For the remote ones, I'm > > not sure. I > > > can hash global indices, but is there a more idiomatic way ? > > > > > > Thanks, > > > > > > -- > > > Nicolas > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > > > experiments is infinitely more interesting than any results to > which > > > their experiments lead. > > > -- Norbert Wiener > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. 
> > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ < > http://www.cse.buffalo.edu/~knepley/> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Mar 22 12:19:42 2021 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 22 Mar 2021 13:19:42 -0400 Subject: [petsc-users] funny link error In-Reply-To: <20210322161639.kn4dnwwqeypqter7@txcorp.com> References: <20210322144530.6kge37t3fuxswz32@txcorp.com> <20210322161639.kn4dnwwqeypqter7@txcorp.com> Message-ID: On Mon, Mar 22, 2021 at 12:16 PM Scott Kruger wrote: > > The short answer is $PETSC_INSTALL_DIR/lib/pkgconfig/PETSc.pc > > Attached is a PETSc CMake snippet that should show you how to use > pkg-config. > > It also shows how to set the compilers to the same as PETSc's for build > consistency, but this part is not strictly necessary (but perhaps a good > idea). > > FYI, Barry has been pushing to get a CMake build example as part of the > test harness with the help of Jed and myself, but the Windows issues > have been a real hold-up. > Also, letting people go over a cliff is one thing, but pointing them over is another ;) Matt > Scott > > > On 2021-03-22 11:03, Mark Adams did write: > > Thanks Scott, > > > > Can you please tell us where this pkg-config file is? > > > > Thank again, > > Mark > > > > On Mon, Mar 22, 2021 at 10:45 AM Scott Kruger wrote: > > > > > > > > From you make.log, it looks like petsc found it here: > > > /opt/cray/wlm_detect/1.3.3-7.0.1.1_4.19__g7109084.ari/lib64 > > > > > > PETSc has it and found it because BuildSystem did the query of what > > > it takes to get C/C++/Fortran to work together. Why is it needed? > > > That's up to Cray. > > > > > > But the question is: > > > > > > How is your CMake build getting PETSc info? > > > > > > If > > > a) it using pkg-config and CMake's ability to parse it, then it looks > like > > > our pkg-config export might need work. > > > We'd need to see the pkg-config file to be sure though. > > > > > > and if > > > b) it is not using pkg-config, then the answer is it should. > > > > > > Scott > > > > > > > > > On 2021-03-21 08:25, Mark Adams did write: > > > > We are having problems with linking and use static linking. > > > > We get this error and have seen others like it (eg, lpetsc_lib_gcc_s) > > > > > > > > /usr/bin/ld: cannot find -lpetsc_lib_wlm_detect-NOTFOUND > > > > > > > > wlm_detect is some sort of system library, but I have no idea where > this > > > > petsc string comes from. > > > > This is on Cori and the application uses cmake. > > > > I can run PETSc tests fine. > > > > > > > > Any ideas? > > > > > > > > Thanks, > > > > Mark > > > > > > > > > > > > > > > -- > > > Scott Kruger > > > Tech-X Corporation kruger at txcorp.com > > > 5621 Arapahoe Ave, Suite A Phone: (720) 466-3196 > > > Boulder, CO 80303 Fax: (303) 448-7756 > > > > > -- > Scott Kruger > Tech-X Corporation kruger at txcorp.com > 5621 Arapahoe Ave, Suite A Phone: (720) 466-3196 > Boulder, CO 80303 Fax: (303) 448-7756 > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chris at resfrac.com Mon Mar 22 12:56:06 2021 From: chris at resfrac.com (Chris Hewson) Date: Mon, 22 Mar 2021 11:56:06 -0600 Subject: [petsc-users] MUMPS failure Message-ID: Hi All, I have been having a problem with MUMPS randomly crashing in our program and causing the entire program to crash. I am compiling in -O2 optimization mode and using --download-mumps etc. to compile PETSc. If I rerun the program, 95%+ of the time I can't reproduce the error. It seems to be a similar issue to this thread: https://lists.mcs.anl.gov/pipermail/petsc-users/2018-October/036372.html Similar to the resolution there I am going to try and increase icntl_14 and see if that resolves the issue. Any other thoughts on this? Thanks, *Chris Hewson* Senior Reservoir Simulation Engineer ResFrac +1.587.575.9792 -------------- next part -------------- An HTML attachment was scrubbed... URL: From skavou1 at lsu.edu Mon Mar 22 13:00:14 2021 From: skavou1 at lsu.edu (Sepideh Kavousi) Date: Mon, 22 Mar 2021 18:00:14 +0000 Subject: [petsc-users] PF+Navier stokes In-Reply-To: References: , Message-ID: I found some discussions in the following link and https://lists.mcs.anl.gov/pipermail/petsc-users/2010-May/006422.html and the following paper: https://www.sciencedirect.com/science/article/pii/S0021999107004330 But I am engineer and the discussions are confusing for me. Would you please tell me what is the correct path to follow? Should I go ahead and apply SIMPLER algorithm for this problem, or I should learn to apply fieldsplit to determine individual preconditioning on each unknown? Best, Sepideh ________________________________ From: Matthew Knepley Sent: Monday, March 22, 2021 9:15 AM To: Sepideh Kavousi Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] PF+Navier stokes On Mon, Mar 22, 2021 at 10:04 AM Sepideh Kavousi > wrote: Hello, I want to solve PF solidification+Navier stokes using Finite different method, and I have a strange problem. My code runs fine for some system sizes and fails for some of the system sizes. When I run with the following options: mpirun -np 2 ./one.out -ts_monitor -snes_fd_color -ts_max_snes_failures -1 -ts_type bdf -ts_bdf_adapt -pc_type bjacobi -snes_linesearch_type l2 -snes_type ksponly -ksp_type gmres -ksp_gmres_restart 1001 -sub_pc_type ilu -sub_ksp_type preonly -snes_monitor -ksp_monitor -snes_linesearch_monitor -ksp_monitor_true_residual -ksp_converged_reason -log_view 0 SNES Function norm 1.465357113711e+01 0 SNES Function norm 1.465357113711e+01 Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC_FAILED due to SUBPC_ERROR Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC_FAILED due to SUBPC_ERROR 0 SNES Function norm 1.465357113711e+01 0 SNES Function norm 1.465357113711e+01 Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC_FAILED due to SUBPC_ERROR Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC_FAILED due to SUBPC_ERROR 0 SNES Function norm 1.465357113711e+01 0 SNES Function norm 1.465357113711e+01 ^C Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC_FAILED due to SUBPC_ERROR 0 SNES Function norm 1.465357113711e+01 Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC_FAILED due to SUBPC_ERROR 0 SNES Function norm 1.465357113711e+01 Even setting pc_type to LU does not solve the problem. 0 TS dt 0.0001 time 0. copy! copy! Write output at step= 0! Write output at step= 0! 
0 SNES Function norm 1.465357113711e+01 0 SNES Function norm 1.465357113711e+01 Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC_FAILED due to FACTOR_NUMERIC_ZEROPIVOT Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC_FAILED due to FACTOR_NUMERIC_ZEROPIVOT I guess the problem is that in mass conservation I used forward discretization for u (velocity in x) and for the moment in x , I used forward discretization for p (pressure) to ensure non-zero terms on the diagonal of matrix. I tried to run it with valgrind but it did not output anything. Does anyone have suggestions on how to solve this issue? Your subproblems in Block_jacobi are singular. With multiphysics problems like this, definition of blocks can be tricky. I would first try to find a good preconditioner for this system in the literature, and then we can help you try it out. Thanks, Matt Best, Sepideh -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Mar 22 13:02:36 2021 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 22 Mar 2021 14:02:36 -0400 Subject: [petsc-users] PF+Navier stokes In-Reply-To: References: Message-ID: On Mon, Mar 22, 2021 at 2:00 PM Sepideh Kavousi wrote: > I found some discussions in the following link and > https://lists.mcs.anl.gov/pipermail/petsc-users/2010-May/006422.html and > the following paper: > https://www.sciencedirect.com/science/article/pii/S0021999107004330 > But I am engineer and the discussions are confusing for me. Would you > please tell me what is the correct path to follow? > Should I go ahead and apply SIMPLER algorithm for this problem, or I > should learn to apply fieldsplit to determine individual preconditioning on > each unknown? > If you implement SIMPLER you will have done all the work you need to do to use PCFIELDSPLIT, but FieldSplit has a wide array of solvers you can try. Thus, I would first make the ISes that split your system into fields. Thanks, Matt > Best, > Sepideh > ------------------------------ > *From:* Matthew Knepley > *Sent:* Monday, March 22, 2021 9:15 AM > *To:* Sepideh Kavousi > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] PF+Navier stokes > > On Mon, Mar 22, 2021 at 10:04 AM Sepideh Kavousi wrote: > > Hello, > I want to solve PF solidification+Navier stokes using Finite different > method, and I have a strange problem. My code runs fine for some system > sizes and fails for some of the system sizes. 
When I run with the following > options: > mpirun -np 2 ./one.out -ts_monitor -snes_fd_color -ts_max_snes_failures -1 > -ts_type bdf -ts_bdf_adapt -pc_type bjacobi -snes_linesearch_type l2 > -snes_type ksponly -ksp_type gmres -ksp_gmres_restart 1001 -sub_pc_type ilu > -sub_ksp_type preonly -snes_monitor -ksp_monitor -snes_linesearch_monitor > -ksp_monitor_true_residual -ksp_converged_reason -log_view > > 0 SNES Function norm 1.465357113711e+01 > 0 SNES Function norm 1.465357113711e+01 > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC_FAILED due to SUBPC_ERROR > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC_FAILED due to SUBPC_ERROR > 0 SNES Function norm 1.465357113711e+01 > 0 SNES Function norm 1.465357113711e+01 > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC_FAILED due to SUBPC_ERROR > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC_FAILED due to SUBPC_ERROR > 0 SNES Function norm 1.465357113711e+01 > 0 SNES Function norm 1.465357113711e+01 > ^C Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC_FAILED due to SUBPC_ERROR > 0 SNES Function norm 1.465357113711e+01 > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC_FAILED due to SUBPC_ERROR > 0 SNES Function norm 1.465357113711e+01 > > Even setting pc_type to LU does not solve the problem. > 0 TS dt 0.0001 time 0. > copy! > copy! > Write output at step= 0! > Write output at step= 0! > 0 SNES Function norm 1.465357113711e+01 > 0 SNES Function norm 1.465357113711e+01 > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC_FAILED due to FACTOR_NUMERIC_ZEROPIVOT > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC_FAILED due to FACTOR_NUMERIC_ZEROPIVOT > > I guess the problem is that in mass conservation I used forward > discretization for u (velocity in x) and for the moment in x , I used > forward discretization for p (pressure) to ensure non-zero terms on the > diagonal of matrix. I tried to run it with valgrind but it did not output > anything. > > Does anyone have suggestions on how to solve this issue? > > > Your subproblems in Block_jacobi are singular. With multiphysics problems > like this, definition of blocks can be > tricky. I would first try to find a good preconditioner for this system in > the literature, and then we can help you > try it out. > > Thanks, > > Matt > > > > Best, > Sepideh > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Mar 22 13:04:01 2021 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 22 Mar 2021 14:04:01 -0400 Subject: [petsc-users] MUMPS failure In-Reply-To: References: Message-ID: On Mon, Mar 22, 2021 at 1:56 PM Chris Hewson wrote: > Hi All, > > I have been having a problem with MUMPS randomly crashing in our program > and causing the entire program to crash. I am compiling in -O2 optimization > mode and using --download-mumps etc. to compile PETSc. 
If I rerun the > program, 95%+ of the time I can't reproduce the error. It seems to be a > similar issue to this thread: > > https://lists.mcs.anl.gov/pipermail/petsc-users/2018-October/036372.html > > Similar to the resolution there I am going to try and increase icntl_14 > and see if that resolves the issue. Any other thoughts on this? > When it fails, do you get a stack trace? Thanks, Matt > Thanks, > > *Chris Hewson* > Senior Reservoir Simulation Engineer > ResFrac > +1.587.575.9792 > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris at resfrac.com Mon Mar 22 13:07:06 2021 From: chris at resfrac.com (Chris Hewson) Date: Mon, 22 Mar 2021 12:07:06 -0600 Subject: [petsc-users] MUMPS failure In-Reply-To: References: Message-ID: Hi Matt, No, we are running it without debugging in prod and then running debug I can't reproduce the error, from stderr we get: [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [1]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [1]PETSC ERROR: to get more information on the crash. [1]PETSC ERROR: Run with -malloc_debug to check if memory corruption is causing the crash. application called MPI_Abort(MPI_COMM_WORLD, 50176059) - process 1 *Chris Hewson* Senior Reservoir Simulation Engineer ResFrac +1.587.575.9792 On Mon, Mar 22, 2021 at 12:04 PM Matthew Knepley wrote: > On Mon, Mar 22, 2021 at 1:56 PM Chris Hewson wrote: > >> Hi All, >> >> I have been having a problem with MUMPS randomly crashing in our program >> and causing the entire program to crash. I am compiling in -O2 optimization >> mode and using --download-mumps etc. to compile PETSc. If I rerun the >> program, 95%+ of the time I can't reproduce the error. It seems to be a >> similar issue to this thread: >> >> https://lists.mcs.anl.gov/pipermail/petsc-users/2018-October/036372.html >> >> Similar to the resolution there I am going to try and increase icntl_14 >> and see if that resolves the issue. Any other thoughts on this? >> > > When it fails, do you get a stack trace? > > Thanks, > > Matt > > >> Thanks, >> >> *Chris Hewson* >> Senior Reservoir Simulation Engineer >> ResFrac >> +1.587.575.9792 >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Mon Mar 22 13:09:33 2021 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 22 Mar 2021 14:09:33 -0400 Subject: [petsc-users] MUMPS failure In-Reply-To: References: Message-ID: On Mon, Mar 22, 2021 at 2:07 PM Chris Hewson wrote: > Hi Matt, > > No, we are running it without debugging in prod and then running debug I > can't reproduce the error, from stderr we get: > > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [1]PETSC ERROR: or see > https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > [1]PETSC ERROR: to get more information on the crash. > [1]PETSC ERROR: Run with -malloc_debug to check if memory corruption is > causing the crash. > application called MPI_Abort(MPI_COMM_WORLD, 50176059) - process 1 > If you can afford it, running an instance with -on_error_attach_debugger so that if it fails we can get a stack trace, would be very valuable, since right now we do not know exactly what is failing. Thanks, Matt > *Chris Hewson* > Senior Reservoir Simulation Engineer > ResFrac > +1.587.575.9792 > > > On Mon, Mar 22, 2021 at 12:04 PM Matthew Knepley > wrote: > >> On Mon, Mar 22, 2021 at 1:56 PM Chris Hewson wrote: >> >>> Hi All, >>> >>> I have been having a problem with MUMPS randomly crashing in our program >>> and causing the entire program to crash. I am compiling in -O2 optimization >>> mode and using --download-mumps etc. to compile PETSc. If I rerun the >>> program, 95%+ of the time I can't reproduce the error. It seems to be a >>> similar issue to this thread: >>> >>> https://lists.mcs.anl.gov/pipermail/petsc-users/2018-October/036372.html >>> >>> Similar to the resolution there I am going to try and increase icntl_14 >>> and see if that resolves the issue. Any other thoughts on this? >>> >> >> When it fails, do you get a stack trace? >> >> Thanks, >> >> Matt >> >> >>> Thanks, >>> >>> *Chris Hewson* >>> Senior Reservoir Simulation Engineer >>> ResFrac >>> +1.587.575.9792 >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Mon Mar 22 13:16:15 2021 From: pierre at joliv.et (Pierre Jolivet) Date: Mon, 22 Mar 2021 19:16:15 +0100 Subject: [petsc-users] MUMPS failure In-Reply-To: References: Message-ID: <71B2E6AD-0B3A-4872-8111-2E675F09BC86@joliv.et> Also, maybe run with -info dump and grep for MUMPS errors in dump.%p, because some failures are silent otherwise. 
Thanks, Pierre > On 22 Mar 2021, at 7:09 PM, Matthew Knepley wrote: > > On Mon, Mar 22, 2021 at 2:07 PM Chris Hewson > wrote: > Hi Matt, > > No, we are running it without debugging in prod and then running debug I can't reproduce the error, from stderr we get: > > [1]PETSC ERROR: ------------------------------------------------------------------------ > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [1]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > [1]PETSC ERROR: to get more information on the crash. > [1]PETSC ERROR: Run with -malloc_debug to check if memory corruption is causing the crash. > application called MPI_Abort(MPI_COMM_WORLD, 50176059) - process 1 > > If you can afford it, running an instance with -on_error_attach_debugger so that if it fails we can get a stack trace, would be > very valuable, since right now we do not know exactly what is failing. > > Thanks, > > Matt > > Chris Hewson > Senior Reservoir Simulation Engineer > ResFrac > +1.587.575.9792 > > > On Mon, Mar 22, 2021 at 12:04 PM Matthew Knepley > wrote: > On Mon, Mar 22, 2021 at 1:56 PM Chris Hewson > wrote: > Hi All, > > I have been having a problem with MUMPS randomly crashing in our program and causing the entire program to crash. I am compiling in -O2 optimization mode and using --download-mumps etc. to compile PETSc. If I rerun the program, 95%+ of the time I can't reproduce the error. It seems to be a similar issue to this thread: > > https://lists.mcs.anl.gov/pipermail/petsc-users/2018-October/036372.html > > Similar to the resolution there I am going to try and increase icntl_14 and see if that resolves the issue. Any other thoughts on this? > > When it fails, do you get a stack trace? > > Thanks, > > Matt > > Thanks, > > Chris Hewson > Senior Reservoir Simulation Engineer > ResFrac > +1.587.575.9792 > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Mar 22 13:39:47 2021 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 22 Mar 2021 13:39:47 -0500 Subject: [petsc-users] MUMPS failure In-Reply-To: References: Message-ID: Version of PETSc and MUMPS? We fixed a bug in MUMPs a couple years ago that produced error messages as below. Please confirm you are using the latest PETSc and MUMPS. You can run your production version with the option -malloc_debug ; this will slow it down a bit but if there is memory corruption it may detect it and indicate the problematic error. One also has to be careful about the size of the problem passed to MUMPs since PETSc/MUMPs does not fully support using all 64 bit integers. Is it only crashing for problems near 2 billion entries in the sparse matrix? 
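One quick way to check is sketched below. This helper is illustrative only (the function name is made up, not something from the thread or from PETSc itself); it just queries MatGetInfo() for the global nonzero count of the assembled matrix. Keep in mind that the factors MUMPS builds contain many more entries than the matrix, so a count that is only moderately below the 32-bit limit can still be a problem.

#include <petscmat.h>

/* Illustrative helper: report how many nonzeros the assembled matrix holds,
   to see whether the problem is anywhere near the ~2.1e9 limit of 32-bit
   integers that Barry mentions. */
static PetscErrorCode ReportGlobalNonzeros(Mat A)
{
  MatInfo        info;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatGetInfo(A, MAT_GLOBAL_SUM, &info);CHKERRQ(ierr);  /* sum counts over all ranks */
  ierr = PetscPrintf(PetscObjectComm((PetscObject)A),
                     "assembled matrix nonzeros: %.0f (32-bit limit ~2.1e9)\n",
                     info.nz_used);CHKERRQ(ierr);             /* nz_used is a PetscLogDouble */
  PetscFunctionReturn(0);
}
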
valgrind is the gold standard for detecting memory corruption. Barry > On Mar 22, 2021, at 12:56 PM, Chris Hewson wrote: > > Hi All, > > I have been having a problem with MUMPS randomly crashing in our program and causing the entire program to crash. I am compiling in -O2 optimization mode and using --download-mumps etc. to compile PETSc. If I rerun the program, 95%+ of the time I can't reproduce the error. It seems to be a similar issue to this thread: > > https://lists.mcs.anl.gov/pipermail/petsc-users/2018-October/036372.html > > Similar to the resolution there I am going to try and increase icntl_14 and see if that resolves the issue. Any other thoughts on this? > > Thanks, > > Chris Hewson > Senior Reservoir Simulation Engineer > ResFrac > +1.587.575.9792 -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Mar 22 13:44:04 2021 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 22 Mar 2021 13:44:04 -0500 Subject: [petsc-users] funny link error In-Reply-To: References: <20210322144530.6kge37t3fuxswz32@txcorp.com> <20210322161639.kn4dnwwqeypqter7@txcorp.com> Message-ID: <50EA71D2-A37F-421B-9F34-F4906908FD45@petsc.dev> > On Mar 22, 2021, at 12:19 PM, Matthew Knepley wrote: > > On Mon, Mar 22, 2021 at 12:16 PM Scott Kruger > wrote: > > The short answer is $PETSC_INSTALL_DIR/lib/pkgconfig/PETSc.pc > > Attached is a PETSc CMake snippet that should show you how to use > pkg-config. > > It also shows how to set the compilers to the same as PETSc's for build > consistency, but this part is not strictly necessary (but perhaps a good > idea). > > FYI, Barry has been pushing to get a CMake build example as part of the > test harness with the help of Jed and myself, but the Windows issues > have been a real hold-up. > > Also, letting people go over a cliff is one thing, but pointing them over is another ;) We when see people racing to the cliff in their fast broken cars we need to have a nice gentle reliable donkey to point them to to get them down the cmake cliff. This does not mean we are advocating the donkey, it is just there to break the fall of those determined to plunge off the cliff regardless of the warnings they get. > > Matt > > Scott > > > On 2021-03-22 11:03, Mark Adams did write: > > Thanks Scott, > > > > Can you please tell us where this pkg-config file is? > > > > Thank again, > > Mark > > > > On Mon, Mar 22, 2021 at 10:45 AM Scott Kruger > wrote: > > > > > > > > From you make.log, it looks like petsc found it here: > > > /opt/cray/wlm_detect/1.3.3-7.0.1.1_4.19__g7109084.ari/lib64 > > > > > > PETSc has it and found it because BuildSystem did the query of what > > > it takes to get C/C++/Fortran to work together. Why is it needed? > > > That's up to Cray. > > > > > > But the question is: > > > > > > How is your CMake build getting PETSc info? > > > > > > If > > > a) it using pkg-config and CMake's ability to parse it, then it looks like > > > our pkg-config export might need work. > > > We'd need to see the pkg-config file to be sure though. > > > > > > and if > > > b) it is not using pkg-config, then the answer is it should. > > > > > > Scott > > > > > > > > > On 2021-03-21 08:25, Mark Adams did write: > > > > We are having problems with linking and use static linking. 
> > > > We get this error and have seen others like it (eg, lpetsc_lib_gcc_s) > > > > > > > > /usr/bin/ld: cannot find -lpetsc_lib_wlm_detect-NOTFOUND > > > > > > > > wlm_detect is some sort of system library, but I have no idea where this > > > > petsc string comes from. > > > > This is on Cori and the application uses cmake. > > > > I can run PETSc tests fine. > > > > > > > > Any ideas? > > > > > > > > Thanks, > > > > Mark > > > > > > > > > > > > > > > -- > > > Scott Kruger > > > Tech-X Corporation kruger at txcorp.com > > > 5621 Arapahoe Ave, Suite A Phone: (720) 466-3196 > > > Boulder, CO 80303 Fax: (303) 448-7756 > > > > > -- > Scott Kruger > Tech-X Corporation kruger at txcorp.com > 5621 Arapahoe Ave, Suite A Phone: (720) 466-3196 > Boulder, CO 80303 Fax: (303) 448-7756 > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Mar 22 13:56:30 2021 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 22 Mar 2021 13:56:30 -0500 Subject: [petsc-users] PF+Navier stokes In-Reply-To: References: Message-ID: <75FC52E9-8615-4966-9BA6-85BD43C01F7B@petsc.dev> Singular systems come up in solving PDEs almost always due to issues related to boundary conditions. For example all Neumann (natural) boundary conditions can produce singular systems. Direct factorizations generically will eventually hit a zero pivot in such cases and there is no universally acceptable approaches for what to do at that point to recover. If you think your operator is singular you should start by using MatSetNullSpace(), it won't "cure" the problem but is the tool we use to manage null spaces in operators. > On Mar 22, 2021, at 9:04 AM, Sepideh Kavousi wrote: > > Hello, > I want to solve PF solidification+Navier stokes using Finite different method, and I have a strange problem. My code runs fine for some system sizes and fails for some of the system sizes. When I run with the following options: > mpirun -np 2 ./one.out -ts_monitor -snes_fd_color -ts_max_snes_failures -1 -ts_type bdf -ts_bdf_adapt -pc_type bjacobi -snes_linesearch_type l2 -snes_type ksponly -ksp_type gmres -ksp_gmres_restart 1001 -sub_pc_type ilu -sub_ksp_type preonly -snes_monitor -ksp_monitor -snes_linesearch_monitor -ksp_monitor_true_residual -ksp_converged_reason -log_view > > 0 SNES Function norm 1.465357113711e+01 > 0 SNES Function norm 1.465357113711e+01 > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC_FAILED due to SUBPC_ERROR > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC_FAILED due to SUBPC_ERROR > 0 SNES Function norm 1.465357113711e+01 > 0 SNES Function norm 1.465357113711e+01 > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC_FAILED due to SUBPC_ERROR > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC_FAILED due to SUBPC_ERROR > 0 SNES Function norm 1.465357113711e+01 > 0 SNES Function norm 1.465357113711e+01 > ^C Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC_FAILED due to SUBPC_ERROR > 0 SNES Function norm 1.465357113711e+01 > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC_FAILED due to SUBPC_ERROR > 0 SNES Function norm 1.465357113711e+01 > > Even setting pc_type to LU does not solve the problem. 
> 0 TS dt 0.0001 time 0. > copy! > copy! > Write output at step= 0! > Write output at step= 0! > 0 SNES Function norm 1.465357113711e+01 > 0 SNES Function norm 1.465357113711e+01 > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC_FAILED due to FACTOR_NUMERIC_ZEROPIVOT > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > PC_FAILED due to FACTOR_NUMERIC_ZEROPIVOT > > I guess the problem is that in mass conservation I used forward discretization for u (velocity in x) and for the moment in x , I used forward discretization for p (pressure) to ensure non-zero terms on the diagonal of matrix. I tried to run it with valgrind but it did not output anything. > > Does anyone have suggestions on how to solve this issue? > Best, > Sepideh -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Mon Mar 22 15:24:16 2021 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Mon, 22 Mar 2021 15:24:16 -0500 Subject: [petsc-users] MUMPS failure In-Reply-To: References: Message-ID: On Mon, Mar 22, 2021 at 1:39 PM Barry Smith wrote: > > Version of PETSc and MUMPS? We fixed a bug in MUMPs a couple years ago > that produced error messages as below. Please confirm you are using the > latest PETSc and MUMPS. > > You can run your production version with the option -malloc_debug ; > this will slow it down a bit but if there is memory corruption it may > detect it and indicate the problematic error. > > One also has to be careful about the size of the problem passed to > MUMPs since PETSc/MUMPs does not fully support using all 64 bit integers. > Is it only crashing for problems near 2 billion entries in the sparse > matrix? > "problems near 2 billion entries"? I don't understand. Should not be an issue if building petsc with 64-bit indices. > valgrind is the gold standard for detecting memory corruption. > > Barry > > > On Mar 22, 2021, at 12:56 PM, Chris Hewson wrote: > > Hi All, > > I have been having a problem with MUMPS randomly crashing in our program > and causing the entire program to crash. I am compiling in -O2 optimization > mode and using --download-mumps etc. to compile PETSc. If I rerun the > program, 95%+ of the time I can't reproduce the error. It seems to be a > similar issue to this thread: > > https://lists.mcs.anl.gov/pipermail/petsc-users/2018-October/036372.html > > Similar to the resolution there I am going to try and increase icntl_14 > and see if that resolves the issue. Any other thoughts on this? > > Thanks, > > *Chris Hewson* > Senior Reservoir Simulation Engineer > ResFrac > +1.587.575.9792 > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skavou1 at lsu.edu Mon Mar 22 16:05:40 2021 From: skavou1 at lsu.edu (Sepideh Kavousi) Date: Mon, 22 Mar 2021 21:05:40 +0000 Subject: [petsc-users] PF+Navier stokes In-Reply-To: <75FC52E9-8615-4966-9BA6-85BD43C01F7B@petsc.dev> References: , <75FC52E9-8615-4966-9BA6-85BD43C01F7B@petsc.dev> Message-ID: I modified my BC such that on the left and right side of the interface the BC are constant value instead of Neumann(zero flux). This solves the problem but still the code has convergence problem: I even tried field split with the following order: the block size is 9. the first block is the fields related to for PF equation, the second split block is the velocities in x and y direction and the third block is pressure. 
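For reference, the same top-level splits could also be set up in code rather than on the command line. The sketch below is illustrative only: the function name and the surrounding SNES are assumed, it mirrors just the three top-level splits described above (not the nested velocity split), and the field indices are taken from that description (0,1 = phase field, 2,3 = velocity, 4 = pressure).

#include <petscsnes.h>

/* Illustrative sketch: define the three splits programmatically on the
   KSP/PC used by the (assumed) SNES solver. */
static PetscErrorCode SetupFieldSplit(SNES snes)
{
  KSP            ksp;
  PC             pc;
  PetscInt       pf[2]   = {0, 1};   /* phase-field unknowns */
  PetscInt       vel[2]  = {2, 3};   /* x and y velocity     */
  PetscInt       pres[1] = {4};      /* pressure             */
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = SNESGetKSP(snes, &ksp);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCFIELDSPLIT);CHKERRQ(ierr);
  ierr = PCFieldSplitSetBlockSize(pc, 9);CHKERRQ(ierr);        /* 9 dofs per grid point */
  ierr = PCFieldSplitSetFields(pc, "pf",   2, pf,   pf);CHKERRQ(ierr);
  ierr = PCFieldSplitSetFields(pc, "vel",  2, vel,  vel);CHKERRQ(ierr);
  ierr = PCFieldSplitSetFields(pc, "pres", 1, pres, pres);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

The options actually used in the run were: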
-pc_type fieldsplit -pc_fieldsplit_block_size 9 -pc_fieldsplit_0_fields 0,1 (two fields related to for the Phasefield model) -pc_fieldsplit_1_fields 2,3 (velocity in x and y direction) -pc_fieldsplit_2_fields 4 (pressure) -fieldsplit_1_pc_fieldsplit_block_size 2 -fieldsplit_1_fieldsplit_0_pc_type ml (based on https://lists.mcs.anl.gov/pipermail/petsc-users/2015-February/024191.html) -fieldsplit_1_fieldsplit_1_pc_type ml (based on https://lists.mcs.anl.gov/pipermail/petsc-users/2015-February/024191.html) -fieldsplit_0_pc_type ilu (based on previous solutions of phase-field equations) -fieldsplit_2_pc_type ilu I guess changing the BCs the main reason that at first few steps the code does not fail. And as time increases, true resid norm increases such that at a finite time step (~30) it reaches 1e7 and the code results non-accurate velocity calculations. Can this also be resulted by forward/backward discritization? Best, Sepideh 34 TS dt 3.12462e-07 time 0.00709097 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 copy! copy! Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Write output at step= 34! Write output at step= 34! 7 KSP preconditioned resid norm 7.989463672336e-05 true resid norm 1.904266194607e+07 ||r(i)||/||b|| 9.471328148409e-01 0 SNES Function norm 6.393295863037e+07 0 SNES Function norm 6.393295863037e+07 7 KSP preconditioned resid norm 7.989463672336e-05 true resid norm 1.904266194607e+07 ||r(i)||/||b|| 9.471328148409e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 8 KSP preconditioned resid norm 4.768163844493e-05 true resid norm 1.904266226755e+07 ||r(i)||/||b|| 9.471328308303e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 8 KSP preconditioned resid norm 4.768163844493e-05 true resid norm 1.904266226755e+07 ||r(i)||/||b|| 9.471328308303e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 1.502321597276e+01 true resid norm 2.464456787205e+07 ||r(i)||/||b|| 1.000000000000e+00 0 KSP preconditioned resid norm 1.502321597276e+01 true resid norm 2.464456787205e+07 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 9 KSP preconditioned resid norm 2.073486308871e-05 true resid norm 1.904266231364e+07 ||r(i)||/||b|| 9.471328331228e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve 
converged due to CONVERGED_ITS iterations 1 9 KSP preconditioned resid norm 2.073486308871e-05 true resid norm 1.904266231364e+07 ||r(i)||/||b|| 9.471328331228e-01 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 1 SNES Function norm 1.904272255718e+07 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 33 TS dt 1.56231e-07 time 0.00709082 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 copy! 1 SNES Function norm 1.904272255718e+07 Write output at step= 33! 33 TS dt 1.56231e-07 time 0.00709082 1 KSP preconditioned resid norm 8.755672227974e-03 true resid norm 1.938151775153e+07 ||r(i)||/||b|| 7.864417770341e-01 copy! 1 KSP preconditioned resid norm 8.755672227974e-03 true resid norm 1.938151775153e+07 ||r(i)||/||b|| 7.864417770341e-01 Write output at step= 33! 0 SNES Function norm 2.464456787205e+07 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 0 SNES Function norm 2.464456787205e+07 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 2 KSP preconditioned resid norm 1.508190120513e-03 true resid norm 1.938016275793e+07 ||r(i)||/||b|| 7.863867956035e-01 2 KSP preconditioned resid norm 1.508190120513e-03 true resid norm 1.938016275793e+07 ||r(i)||/||b|| 7.863867956035e-01 0 KSP preconditioned resid norm 2.003930340911e+01 true resid norm 6.987120567963e+07 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 3 KSP preconditioned resid norm 1.422130457598e-03 true resid norm 1.938011909055e+07 ||r(i)||/||b|| 7.863850237166e-01 3 KSP preconditioned resid norm 1.422130457598e-03 true resid norm 1.938011909055e+07 ||r(i)||/||b|| 7.863850237166e-01 1 KSP preconditioned resid norm 1.199890501875e-02 true resid norm 1.879731143354e+07 ||r(i)||/||b|| 2.690280101896e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 
Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 4 KSP preconditioned resid norm 4.957954730602e-04 true resid norm 1.938028047800e+07 ||r(i)||/||b|| 7.863915723180e-01 4 KSP preconditioned resid norm 4.957954730602e-04 true resid norm 1.938028047800e+07 ||r(i)||/||b|| 7.863915723180e-01 2 KSP preconditioned resid norm 3.018100763012e-04 true resid norm 1.879893603977e+07 ||r(i)||/||b|| 2.690512616309e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 5 KSP preconditioned resid norm 3.552719096462e-04 true resid norm 1.938029700510e+07 ||r(i)||/||b|| 7.863922429363e-01 5 KSP preconditioned resid norm 3.552719096462e-04 true resid norm 1.938029700510e+07 ||r(i)||/||b|| 7.863922429363e-01 3 KSP preconditioned resid norm 2.835332741838e-04 true resid norm 1.879893794065e+07 ||r(i)||/||b|| 2.690512888363e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 6 KSP preconditioned resid norm 1.508455316659e-04 true resid norm 1.938039640595e+07 ||r(i)||/||b|| 7.863962763140e-01 6 KSP preconditioned resid norm 1.508455316659e-04 true resid norm 1.938039640595e+07 ||r(i)||/||b|| 7.863962763140e-01 4 KSP preconditioned resid norm 1.860011376508e-04 true resid norm 1.879893735946e+07 ||r(i)||/||b|| 2.690512805182e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 1 SNES Function norm 1.886737547995e+07 31 TS dt 3.90578e-08 time 0.0070907 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 copy! Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Write output at step= 31! 7 KSP preconditioned resid norm 7.631525161839e-05 true resid norm 1.938041339557e+07 ||r(i)||/||b|| 7.863969657001e-01 7 KSP preconditioned resid norm 7.631525161839e-05 true resid norm 1.938041339557e+07 ||r(i)||/||b|| 7.863969657001e-01 0 SNES Function norm 1.888557765431e+07 1 SNES Function norm 1.938116032575e+07 1 SNES Function norm 1.938116032575e+07 34 TS dt 3.12462e-07 time 0.00709097 34 TS dt 3.12462e-07 time 0.00709097 copy! copy! Write output at step= 34! Write output at step= 34! 
0 SNES Function norm 6.393295863037e+07 0 SNES Function norm 6.393295863037e+07 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 1.502321597276e+01 true resid norm 2.464456787205e+07 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 8.755672227974e-03 true resid norm 1.938151775153e+07 ||r(i)||/||b|| 7.864417770341e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 2 KSP preconditioned resid norm 1.508190120513e-03 true resid norm 1.938016275793e+07 ||r(i)||/||b|| 7.863867956035e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 3 KSP preconditioned resid norm 1.422130457598e-03 true resid norm 1.938011909055e+07 ||r(i)||/||b|| 7.863850237166e-01 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 1.212273553655e+02 true resid norm 6.393295863037e+07 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 1.212273553655e+02 true resid norm 6.393295863037e+07 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 4 KSP preconditioned resid norm 4.957954730602e-04 true resid norm 1.938028047800e+07 ||r(i)||/||b|| 7.863915723180e-01 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 2.328823680294e-02 true resid norm 2.016516515602e+07 ||r(i)||/||b|| 3.154111054458e-01 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 2.328823680294e-02 true resid norm 2.016516515602e+07 ||r(i)||/||b|| 3.154111054458e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 5 KSP preconditioned resid norm 3.552719096462e-04 true resid norm 1.938029700510e+07 ||r(i)||/||b|| 7.863922429363e-01 
[Consolidated solver monitor log for time steps 32-40 and the start of the following step. In the original post the output of two ranks was interleaved and every line appeared twice; the lines below are grouped by time step. Each KSP iteration is additionally accompanied by the three lines
   Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1
   Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1
   Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1
which are not repeated below.]

    0 KSP preconditioned resid norm 1.338586074554e-01 true resid norm 1.888557765431e+07 ||r(i)||/||b|| 1.000000000000e+00
    1 KSP preconditioned resid norm 5.451259747447e-03 true resid norm 1.887927947148e+07 ||r(i)||/||b|| 9.996665083300e-01
    2 KSP preconditioned resid norm 9.554577345960e-04 true resid norm 1.887930135577e+07 ||r(i)||/||b|| 9.996676671129e-01
    3 KSP preconditioned resid norm 9.378991224281e-04 true resid norm 1.887930134907e+07 ||r(i)||/||b|| 9.996676667583e-01
    4 KSP preconditioned resid norm 3.652611805745e-04 true resid norm 1.887930205974e+07 ||r(i)||/||b|| 9.996677043885e-01
    5 KSP preconditioned resid norm 2.918222127367e-04 true resid norm 1.887930204569e+07 ||r(i)||/||b|| 9.996677036447e-01
    6 KSP preconditioned resid norm 6.114488674627e-05 true resid norm 1.887930243837e+07 ||r(i)||/||b|| 9.996677244370e-01
    7 KSP preconditioned resid norm 3.763532951474e-05 true resid norm 1.887930248279e+07 ||r(i)||/||b|| 9.996677267895e-01
    8 KSP preconditioned resid norm 2.112644035802e-05 true resid norm 1.887930251181e+07 ||r(i)||/||b|| 9.996677283257e-01
    9 KSP preconditioned resid norm 1.113068460252e-05 true resid norm 1.887930250969e+07 ||r(i)||/||b|| 9.996677282137e-01
   10 KSP preconditioned resid norm 1.352518287887e-06 true resid norm 1.887930250333e+07 ||r(i)||/||b|| 9.996677278767e-01
   11 KSP preconditioned resid norm 7.434707372444e-07 true resid norm 1.887930250410e+07 ||r(i)||/||b|| 9.996677279175e-01
  1 SNES Function norm 1.887938190335e+07
32 TS dt 7.81156e-08 time 0.00709074
copy!
Write output at step= 32!
  0 SNES Function norm 2.010558777785e+07
    0 KSP preconditioned resid norm 3.165458473526e+00 true resid norm 2.010558777785e+07 ||r(i)||/||b|| 1.000000000000e+00
    1 KSP preconditioned resid norm 3.655364946441e-03 true resid norm 1.904269379864e+07 ||r(i)||/||b|| 9.471343991057e-01
    2 KSP preconditioned resid norm 2.207564350060e-03 true resid norm 1.904265942845e+07 ||r(i)||/||b|| 9.471326896210e-01
    3 KSP preconditioned resid norm 2.188447918524e-03 true resid norm 1.904266151317e+07 ||r(i)||/||b|| 9.471327933098e-01
    4 KSP preconditioned resid norm 7.425314556150e-04 true resid norm 1.904265807404e+07 ||r(i)||/||b|| 9.471326222560e-01
    5 KSP preconditioned resid norm 6.052794097111e-04 true resid norm 1.904265841692e+07 ||r(i)||/||b|| 9.471326393103e-01
    6 KSP preconditioned resid norm 1.251197915617e-04 true resid norm 1.904266159346e+07 ||r(i)||/||b|| 9.471327973028e-01
    7 KSP preconditioned resid norm 7.989463672336e-05 true resid norm 1.904266194607e+07 ||r(i)||/||b|| 9.471328148409e-01
    8 KSP preconditioned resid norm 4.768163844493e-05 true resid norm 1.904266226755e+07 ||r(i)||/||b|| 9.471328308303e-01
    9 KSP preconditioned resid norm 2.073486308871e-05 true resid norm 1.904266231364e+07 ||r(i)||/||b|| 9.471328331228e-01
  1 SNES Function norm 1.904272255718e+07
33 TS dt 1.56231e-07 time 0.00709082
copy!
Write output at step= 33!
  0 SNES Function norm 2.464456787205e+07
    0 KSP preconditioned resid norm 1.502321597276e+01 true resid norm 2.464456787205e+07 ||r(i)||/||b|| 1.000000000000e+00
    1 KSP preconditioned resid norm 8.755672227974e-03 true resid norm 1.938151775153e+07 ||r(i)||/||b|| 7.864417770341e-01
    2 KSP preconditioned resid norm 1.508190120513e-03 true resid norm 1.938016275793e+07 ||r(i)||/||b|| 7.863867956035e-01
    3 KSP preconditioned resid norm 1.422130457598e-03 true resid norm 1.938011909055e+07 ||r(i)||/||b|| 7.863850237166e-01
    4 KSP preconditioned resid norm 4.957954730602e-04 true resid norm 1.938028047800e+07 ||r(i)||/||b|| 7.863915723180e-01
    5 KSP preconditioned resid norm 3.552719096462e-04 true resid norm 1.938029700510e+07 ||r(i)||/||b|| 7.863922429363e-01
    6 KSP preconditioned resid norm 1.508455316659e-04 true resid norm 1.938039640595e+07 ||r(i)||/||b|| 7.863962763140e-01
    7 KSP preconditioned resid norm 7.631525161839e-05 true resid norm 1.938041339557e+07 ||r(i)||/||b|| 7.863969657001e-01
  1 SNES Function norm 1.938116032575e+07
34 TS dt 3.12462e-07 time 0.00709097
copy!
Write output at step= 34!
  0 SNES Function norm 6.393295863037e+07
    0 KSP preconditioned resid norm 1.212273553655e+02 true resid norm 6.393295863037e+07 ||r(i)||/||b|| 1.000000000000e+00
    1 KSP preconditioned resid norm 2.328823680294e-02 true resid norm 2.016516515602e+07 ||r(i)||/||b|| 3.154111054458e-01
    2 KSP preconditioned resid norm 2.757087112828e-04 true resid norm 2.010380358203e+07 ||r(i)||/||b|| 3.144513254621e-01
  1 SNES Function norm 2.011100207442e+07
35 TS dt 6.24925e-07 time 0.00709129
copy!
Write output at step= 35!
  0 SNES Function norm 2.790215258123e+08
    0 KSP preconditioned resid norm 1.143120146053e+03 true resid norm 2.790215258123e+08 ||r(i)||/||b|| 1.000000000000e+00
    1 KSP preconditioned resid norm 1.086405760770e-02 true resid norm 2.165319343001e+07 ||r(i)||/||b|| 7.760402487575e-02
  1 SNES Function norm 3.485261902880e+07
36 TS dt 1.24985e-06 time 0.00709191
copy!
Write output at step= 36!
  0 SNES Function norm 4.684646000717e+08
    0 KSP preconditioned resid norm 4.028400414901e+03 true resid norm 4.684646000717e+08 ||r(i)||/||b|| 1.000000000000e+00
    1 KSP preconditioned resid norm 7.513114811690e-02 true resid norm 2.515581080270e+07 ||r(i)||/||b|| 5.369842416878e-02
    2 KSP preconditioned resid norm 3.361950980934e-02 true resid norm 2.589885319795e+07 ||r(i)||/||b|| 5.528454699457e-02
  1 SNES Function norm 5.194818889106e+07
37 TS dt 1.4811e-06 time 0.00709316
copy!
Write output at step= 37!
  0 SNES Function norm 3.786081856480e+09
    0 KSP preconditioned resid norm 4.626907080947e+04 true resid norm 3.786081856480e+09 ||r(i)||/||b|| 1.000000000000e+00
    1 KSP preconditioned resid norm 1.411143979706e-01 true resid norm 3.252255320579e+07 ||r(i)||/||b|| 8.590029069268e-03
  1 SNES Function norm 1.679071798524e+08
38 TS dt 2.09926e-06 time 0.00709464
copy!
Write output at step= 38!
  0 SNES Function norm 4.969343279719e+09
    0 KSP preconditioned resid norm 8.865880769163e+04 true resid norm 4.969343279719e+09 ||r(i)||/||b|| 1.000000000000e+00
    1 KSP preconditioned resid norm 4.636177252582e-01 true resid norm 4.472959395106e+07 ||r(i)||/||b|| 9.001107678275e-03
  1 SNES Function norm 1.426574249724e+08
39 TS dt 3.42747e-06 time 0.00709674
copy!
Write output at step= 39!
  0 SNES Function norm 6.081085806316e+09
    0 KSP preconditioned resid norm 1.932453778489e+05 true resid norm 6.081085806316e+09 ||r(i)||/||b|| 1.000000000000e+00
    1 KSP preconditioned resid norm 1.434243067909e+00 true resid norm 7.171981611988e+07 ||r(i)||/||b|| 1.179391615316e-02
  1 SNES Function norm 1.906687815261e+08
40 TS dt 6.85494e-06 time 0.00710017
copy!
Write output at step= 40!
  0 SNES Function norm 2.386070312612e+09
    0 KSP preconditioned resid norm 1.698771295747e+05 true resid norm 2.386070312612e+09 ||r(i)||/||b|| 1.000000000000e+00
    1 KSP preconditioned resid norm 5.030869192505e+00 true resid norm 1.519850942875e+08 ||r(i)||/||b|| 6.369682128989e-02
    2 KSP preconditioned resid norm 3.859648344440e+00 true resid norm 9.578310406856e+07 ||r(i)||/||b|| 4.014261589958e-02
    3 KSP preconditioned resid norm 3.133252984070e+00 true resid norm 7.649343929054e+07 ||r(i)||/||b|| 3.205833410953e-02
    4 KSP preconditioned resid norm 2.753196206679e+00 true resid norm 6.468496966917e+07 ||r(i)||/||b|| 2.710941472566e-02
    5 KSP preconditioned resid norm 2.500754637621e+00 true resid norm 5.655851544362e+07 ||r(i)||/||b|| 2.370362480296e-02
    6 KSP preconditioned resid norm 2.263885927226e+00 true resid norm 5.368792420993e+07 ||r(i)||/||b|| 2.250056250487e-02
    7 KSP preconditioned resid norm 2.104083158879e+00 true resid norm 5.002745241660e+07 ||r(i)||/||b|| 2.096646194883e-02
    8 KSP preconditioned resid norm 1.976241956139e+00 true resid norm 4.707835526096e+07 ||r(i)||/||b|| 1.973049788689e-02
    9 KSP preconditioned resid norm 1.870078692623e+00 true resid norm 4.468270947214e+07 ||r(i)||/||b|| 1.872648481311e-02
   10 KSP preconditioned resid norm 1.779870885679e+00 true resid norm 4.269989357798e+07 ||r(i)||/||b|| 1.789548839038e-02
   11 KSP preconditioned resid norm 1.701919459227e+00 true resid norm 4.102844905926e+07 ||r(i)||/||b|| 1.719498744124e-02
   12 KSP preconditioned resid norm 1.633671715659e+00 true resid norm 3.959508749664e+07 ||r(i)||/||b|| 1.659426685264e-02
  1 SNES Function norm 6.245293405655e+10
  0 SNES Function norm 2.097192603535e+10
solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 8 KSP preconditioned resid norm 1.976241956139e+00 true resid norm 4.707835526096e+07 ||r(i)||/||b|| 1.973049788689e-02 8 KSP preconditioned resid norm 1.976241956139e+00 true resid norm 4.707835526096e+07 ||r(i)||/||b|| 1.973049788689e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 9 KSP preconditioned resid norm 1.870078692623e+00 true resid norm 4.468270947214e+07 ||r(i)||/||b|| 1.872648481311e-02 9 KSP preconditioned resid norm 1.870078692623e+00 true resid norm 4.468270947214e+07 ||r(i)||/||b|| 1.872648481311e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 10 KSP preconditioned resid norm 1.779870885679e+00 true resid norm 4.269989357798e+07 ||r(i)||/||b|| 1.789548839038e-02 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 10 KSP preconditioned resid norm 1.779870885679e+00 true resid norm 4.269989357798e+07 ||r(i)||/||b|| 1.789548839038e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 1.698771295747e+05 true resid norm 2.386070312612e+09 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 11 KSP preconditioned resid norm 1.701919459227e+00 true resid norm 4.102844905926e+07 ||r(i)||/||b|| 1.719498744124e-02 11 KSP preconditioned resid norm 1.701919459227e+00 true resid norm 4.102844905926e+07 ||r(i)||/||b|| 1.719498744124e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 5.030869192505e+00 true resid norm 1.519850942875e+08 ||r(i)||/||b|| 6.369682128989e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve 
converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 12 KSP preconditioned resid norm 1.633671715659e+00 true resid norm 3.959508749664e+07 ||r(i)||/||b|| 1.659426685264e-02 12 KSP preconditioned resid norm 1.633671715659e+00 true resid norm 3.959508749664e+07 ||r(i)||/||b|| 1.659426685264e-02 0 KSP preconditioned resid norm 4.028400414901e+03 true resid norm 4.684646000717e+08 ||r(i)||/||b|| 1.000000000000e+00 2 KSP preconditioned resid norm 3.859648344440e+00 true resid norm 9.578310406856e+07 ||r(i)||/||b|| 4.014261589958e-02 1 SNES Function norm 6.245293405655e+10 1 SNES Function norm 6.245293405655e+10 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 0 SNES Function norm 2.097192603535e+10 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 SNES Function norm 2.097192603535e+10 1 KSP preconditioned resid norm 7.513114811690e-02 true resid norm 2.515581080270e+07 ||r(i)||/||b|| 5.369842416878e-02 3 KSP preconditioned resid norm 3.133252984070e+00 true resid norm 7.649343929054e+07 ||r(i)||/||b|| 3.205833410953e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 2 KSP preconditioned resid norm 3.361950980934e-02 true resid norm 2.589885319795e+07 ||r(i)||/||b|| 5.528454699457e-02 4 KSP preconditioned resid norm 2.753196206679e+00 true resid norm 6.468496966917e+07 ||r(i)||/||b|| 2.710941472566e-02 1 SNES Function norm 5.194818889106e+07 37 TS dt 1.4811e-06 time 0.00709316 copy! Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Write output at step= 37! 
Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 SNES Function norm 3.786081856480e+09 5 KSP preconditioned resid norm 2.500754637621e+00 true resid norm 5.655851544362e+07 ||r(i)||/||b|| 2.370362480296e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 6 KSP preconditioned resid norm 2.263885927226e+00 true resid norm 5.368792420993e+07 ||r(i)||/||b|| 2.250056250487e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 7 KSP preconditioned resid norm 2.104083158879e+00 true resid norm 5.002745241660e+07 ||r(i)||/||b|| 2.096646194883e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 8 KSP preconditioned resid norm 1.976241956139e+00 true resid norm 4.707835526096e+07 ||r(i)||/||b|| 1.973049788689e-02 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 4.670669894784e+05 true resid norm 2.097192603535e+10 ||r(i)||/||b|| 1.000000000000e+00 0 KSP preconditioned resid norm 4.670669894784e+05 true resid norm 2.097192603535e+10 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 9 KSP preconditioned resid norm 1.870078692623e+00 true resid norm 4.468270947214e+07 ||r(i)||/||b|| 1.872648481311e-02 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 2.135766802417e+00 true resid norm 8.990660342789e+07 ||r(i)||/||b|| 4.286997926482e-03 1 KSP preconditioned resid norm 2.135766802417e+00 true resid norm 8.990660342789e+07 ||r(i)||/||b|| 4.286997926482e-03 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 1.698771295747e+05 true resid norm 2.386070312612e+09 
1 SNES Function norm 7.093938024894e+08

________________________________
From: Barry Smith
Sent: Monday, March 22, 2021 1:56 PM
To: Sepideh Kavousi
Cc: petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] PF+Navier stokes

   Singular systems come up in solving PDEs almost always due to issues related to boundary conditions. For example, all Neumann (natural) boundary conditions can produce singular systems. Direct factorizations will generically hit a zero pivot in such cases, and there are no universally acceptable approaches for what to do at that point to recover. If you think your operator is singular, you should start by using MatSetNullSpace(); it won't "cure" the problem, but it is the tool we use to manage null spaces in operators.

On Mar 22, 2021, at 9:04 AM, Sepideh Kavousi wrote:

Hello,
I want to solve PF solidification + Navier-Stokes using the finite difference method, and I have a strange problem. My code runs fine for some system sizes and fails for some of the system sizes. When I run with the following options:

mpirun -np 2 ./one.out -ts_monitor -snes_fd_color -ts_max_snes_failures -1 -ts_type bdf -ts_bdf_adapt -pc_type bjacobi -snes_linesearch_type l2 -snes_type ksponly -ksp_type gmres -ksp_gmres_restart 1001 -sub_pc_type ilu -sub_ksp_type preonly -snes_monitor -ksp_monitor -snes_linesearch_monitor -ksp_monitor_true_residual -ksp_converged_reason -log_view

0 SNES Function norm 1.465357113711e+01
Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
PC_FAILED due to SUBPC_ERROR
0 SNES Function norm 1.465357113711e+01
Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
PC_FAILED due to SUBPC_ERROR
0 SNES Function norm 1.465357113711e+01
^C

Even setting pc_type to LU does not solve the problem.

0 TS dt 0.0001 time 0.
copy!
Write output at step= 0!
0 SNES Function norm 1.465357113711e+01
Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
PC_FAILED due to FACTOR_NUMERIC_ZEROPIVOT

I guess the problem is that in the mass conservation equation I used forward discretization for u (velocity in x), and for the momentum equation in x I used forward discretization for p (pressure) to ensure non-zero terms on the diagonal of the matrix. I tried to run it with valgrind but it did not output anything.

Does anyone have suggestions on how to solve this issue?
Best,
Sepideh
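Following up on the MatSetNullSpace() suggestion above, a minimal sketch for the common case where the null space is just the constant vector (for example an all-Neumann/natural-BC operator); the matrix name J is only illustrative, standing for whatever operator is handed to the KSP:

    MatNullSpace nullsp;
    /* PETSC_TRUE: the null space is spanned by the constant vector; no extra basis vectors */
    MatNullSpaceCreate(PetscObjectComm((PetscObject)J), PETSC_TRUE, 0, NULL, &nullsp);
    MatSetNullSpace(J, nullsp);   /* the Krylov solver then projects the null space out of the residual */
    MatNullSpaceDestroy(&nullsp);

If the null space is larger than the constants, the basis vectors are created explicitly and passed in place of the NULL argument.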
From nicolas.barral at math.u-bordeaux.fr Mon Mar 22 16:54:54 2021
From: nicolas.barral at math.u-bordeaux.fr (Nicolas Barral)
Date: Mon, 22 Mar 2021 22:54:54 +0100
Subject: [petsc-users] Re-ordering in DMPlexCreateFromCellListParallelPetsc
In-Reply-To: 
References: <954681c5-eb9f-a393-f09b-b1efc09bccdc@math.u-bordeaux.fr> <8c151189-d30d-b0ec-485d-3d75e954210e@math.u-bordeaux.fr>
Message-ID: 

@+

--
Nicolas

On 22/03/2021 17:53, Matthew Knepley wrote:
> On Mon, Mar 22, 2021 at 12:20 PM Nicolas Barral wrote:
>
> > Thanks for your answers Matt.
> >
> > On 22/03/2021 13:45, Matthew Knepley wrote:
> > > On Mon, Mar 22, 2021 at 6:22 AM Nicolas Barral wrote:
> > >
> > > > On 21/03/2021 21:29, Matthew Knepley wrote:
> > > > > On Sat, Mar 20, 2021 at 10:07 AM Nicolas Barral wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I'm building a plex from elements arrays using
> > > > > > DMPlexCreateFromCellListParallelPetsc. Once the plex is built, I need to
> > > > > > set up boundary labels. I have an array of faces containing a series of
> > > > > > 3 vertex local indices. To rebuild boundary labels, I need to loop over
> > > > > > the array and get the join of 3 consecutive points to find the
> > > > > > corresponding face point in the DAG.
> > > > >
> > > > > This is very common. We should have a built-in thing that does this.
> > > >
> > > > Ordering apart, it's not very complicated once you figure out "join" is
> > > > the right operation to use. We need more doc on graph layout and
> > > > operations in DMPlex, I think I'm going to make pictures when I'm done
> > > > with that code because I waste too much time every time. Is there a
> > > > starting point you like ?
> > >
> > > I think the discussion in
> > >
> > > @article{LangeMitchellKnepleyGorman2015,
> > >   title   = {Efficient mesh management in {Firedrake} using {PETSc-DMPlex}},
> > >   author  = {Michael Lange and Lawrence Mitchell and Matthew G. Knepley and Gerard J. Gorman},
> > >   journal = {SIAM Journal on Scientific Computing},
> > >   volume  = {38},
> > >   number  = {5},
> > >   pages   = {S143--S155},
> > >   eprint  = {http://arxiv.org/abs/1506.07749},
> > >   doi     = {10.1137/15M1026092},
> > >   year    = {2016}
> > > }
> > >
> > > is pretty good.
> >
> > It is, I'd just like to have something more complete (meet, join,
> > height, depth...) and with more 2D & 3D pictures. It's all information
> > available somewhere, but I would find it convenient to be all at the
> > same place. Are sources of the paper available somewhere ?
>
> I think the right answer is no, there is not a complete reference. I
> have some things in presentations, but no complete work has been written.
> Maybe it is time to do that. At this point, a book is probably possible :)

I'll gladly help with figures or coffees if it can speed up the process ;)

> > > > > > Problem, vertices get reordered by DMPlexCreateFromCellListParallelPetsc
> > > > > > so that locally owned vertices are before remote ones, so local indices
> > > > > > are changed and the indices in the face array are not good anymore.
> > > > >
> > > > > This is not exactly what happens. I will talk through the algorithm so
> > > > > that maybe we can find a good interface. I can probably write the code quickly:
> > > > >
> > > > > 1) We take in cells[numCells, numCorners], which is a list of all the
> > > > > vertices in each cell
> > > > >
> > > > >    The vertex numbers do not have to be a contiguous set. You can
> > > > > have any integers you want.
> > > > >
> > > > > 2) We create a sorted list of the unique vertex numbers on each process.
> > > > > The new local vertex numbers are the locations in this list.
> > > >
> > > > Ok It took me re-writing this email a couple times but I think I
> > > > understand. I was too focused on local/global indices. But if I get this
> > > > right, you still make an assumption: that the numVertices*dim
> > > > coordinates passed in vertexCoords are the coordinates of the
> > > > numvertices first vertices in the sorted list. Is that true ?
> > >
> > > No. It can be arbitrary. That is why we make the vertexSF, so we can map
> > > those coordinates back to the right processes.
> >
> > So how do you know which coordinates correspond to which vertex since no
> > map is explicitly provided ?
>
> Ah, it is based on the input assumptions when everything is put together.
> You could call BuildParallel() with any old numbering of vertices. However,
> if you call CreateParallel(), so that you are also passing in an array for
> vertex coordinates. Now vertices are assumed to be numbered [0, Nv) to
> correspond to the input array. Now, that array of coordinates can be chopped
> up differently in parallel than the vertices for subdomains. The vertexSF
> can convert between the mappings.

So if on rank 0, I have numCells tets using say 10 different vertices, and I
pass a list of 7*2 coordinates, the coordinates will be assumed to be those
of the 7 first vertices in the sorted unique identifier list ? Is this fool
proof ? (meaning: if it works for me now, can I assume I did the right thing
or should I be careful and look twice ?)

> > > > > Here is my proposed interface. We preserve this list of unique vertices,
> > > > > just as we preserve the vertexSF. Then after
> > > > > DMPlexCreateFromCellListParallelPetsc(), can DMPlexInterpolate(), you
> > > > > could call
> > > > >
> > > > >    DMPlexBuildFaceLabelsFromCellList(dm, numFaces, faces, labelName, labelValues)
> > > > >
> > > > > Would that work for you? I think I could do that in a couple of hours.
> > > >
> > > > So that function would be very helpful. But is it as simple as you're
> > > > thinking ? The sorted list gives local index -> unique identifier, but
> > > > what we need is the other way round, isn't it ?
> > >
> > > You get the other way using search.
> >
> > Do you mean a linear search for each vertex ?
>
> Since it is sorted, it is a log search.

You are right. Are you opposed to hash tables though ? :)

Thanks

--
Nicolas

>   Thanks,
>
>      Matt
>
> > Thanks,
> >
> > --
> > Nicolas
> >
> > >    Thanks,
> > >
> > >       Matt
> > >
> > > > Once I understand better, I can have a first draft in my plexadapt code
> > > > and we pull it out later.
> > > >
> > > > Thanks
> > > >
> > > > --
> > > > Nicolas
> > > >
> > > > >    Thanks,
> > > > >
> > > > >       Matt
> > > > >
> > > > > > Is there a way to track this renumbering ? For owned vertices, I can
> > > > > > find the local index from the global one (so do old local index ->
> > > > > > global index -> new local index). For the remote ones, I'm not sure. I
> > > > > > can hash global indices, but is there a more idiomatic way ?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > --
> > > > > > Nicolas
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which
> their experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
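Along the lines discussed in the thread above, a rough sketch of rebuilding a face label after DMPlexCreateFromCellListParallelPetsc() and DMPlexInterpolate(). It assumes origVerts is the sorted list (length nVertsLocal) of the unique original vertex numbers known on this rank, that local plex vertex k corresponds to entry k of that list as described above, and that faceVerts/faceValues/numFaces hold the face connectivity in the original numbering; all of these names, and the label name, are illustrative only:

    PetscInt vStart, vEnd;
    DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd);          /* vertex points are [vStart, vEnd) */
    for (PetscInt f = 0; f < numFaces; ++f) {
      PetscInt        points[3], nJoin, loc;
      const PetscInt *join;
      PetscBool       onRank = PETSC_TRUE;

      for (PetscInt k = 0; k < 3; ++k) {
        /* original vertex number -> position in the sorted unique list -> plex vertex point */
        PetscFindInt(faceVerts[3*f + k], nVertsLocal, origVerts, &loc);
        if (loc < 0) { onRank = PETSC_FALSE; break; }      /* vertex not known on this rank */
        points[k] = vStart + loc;
      }
      if (!onRank) continue;
      /* the face is the join of its three vertices; if the plain join comes back
         empty for your mesh, DMPlexGetFullJoin() is the variant to try */
      DMPlexGetJoin(dm, 3, points, &nJoin, &join);
      if (nJoin == 1) DMSetLabelValue(dm, "Face Sets", join[0], faceValues[f]);
      DMPlexRestoreJoin(dm, 3, points, &nJoin, &join);
    }

A hash table from original vertex number to plex point would work just as well for the reverse lookup; the binary search over the sorted list simply avoids building the extra structure.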
Besides, I was wondering if I have to change the way I define the value of the solution on the boundary. What I am doing so far in both versions is something like:

B_phi [phi = 0] = 1.0;
B_phi [phi = 2*pi] = 1.0;
E_z [r, phi = 0] = 1/r;
E_z [r, phi = 2*pi] = 1/r;

Assuming that values at phi = 0 should be the same as at phi = 2*pi with the periodic boundary conditions, is it sufficient for example to have only the following boundary conditions:

B_phi [phi = 0] = 1.0;
E_z [r, phi = 0] = 1/r ?

Thank you.
Best regards,

Zakariae Jorti
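For concreteness, a sketch of where DM_BOUNDARY_PERIODIC enters in the second setup described above, using PETSc's DMStag staggered-grid DM. It assumes r, phi, z map to the x, y, z directions of the DMStag, and nr, nphi, nz and the stencil width are placeholders; this is only an illustration, not the poster's actual code:

    DM dm;
    /* vertices: 0 dof, edges: 1 dof (E), faces: 1 dof (B), elements: 0 dof */
    DMStagCreate3d(PETSC_COMM_WORLD,
                   DM_BOUNDARY_NONE,      /* r   */
                   DM_BOUNDARY_PERIODIC,  /* phi */
                   DM_BOUNDARY_NONE,      /* z   */
                   nr, nphi, nz,
                   PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE,
                   0, 1, 1, 0,
                   DMSTAG_STENCIL_BOX, 1,
                   NULL, NULL, NULL, &dm);
    DMSetFromOptions(dm);
    DMSetUp(dm);

With a periodic direction, the points at phi = 2*pi are the same unknowns as those at phi = 0 and are not stored again, which would explain the shorter vector observed in the second version and would mean only the phi = 0 values need to be set.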
From knepley at gmail.com Mon Mar 22 19:20:19 2021
From: knepley at gmail.com (Matthew Knepley)
Date: Mon, 22 Mar 2021 20:20:19 -0400
Subject: [petsc-users] Local Discontinuous Galerkin with PETSc TS
In-Reply-To: <3EE29E70-8ECF-4842-99DC-30E867769875@llnl.gov>
References: <3EE29E70-8ECF-4842-99DC-30E867769875@llnl.gov>
Message-ID: 

On Mon, Mar 22, 2021 at 7:53 PM Salazar De Troya, Miguel via petsc-users <petsc-users at mcs.anl.gov> wrote:

> Hello
>
> I am interested in implementing the LDG method in "A local discontinuous
> Galerkin method for directly solving Hamilton-Jacobi equations"
> https://www.sciencedirect.com/science/article/pii/S0021999110005255. The
> equation is more or less of the form (for 1D case):
>
> p1 = f(u_x)
> p2 = g(u_x)
> u_t = H(p1, p2)
>
> where typically one solves for p1 and p2 using the previous time step
> solution "u" and then plugs them into the third equation to obtain the next
> step solution. I am wondering if the TS infrastructure could be used to
> implement this solution scheme. Looking at the manual, I think one could
> set G(t, U) to the right-hand side in the above equations and F(t, u, u') =
> 0 to the left-hand side, although the first two equations would not have
> time derivative. In that case, how could one take advantage of the operator
> split scheme I mentioned? Maybe using some block preconditioners?

Hi Miguel,

I have a simple-minded way of understanding these TS things. My heuristic is that you put things in F that you expect to want
at u^{n+1}, and things in G that you expect to want at u^n. It is not that simple, since you could for instance move F and G
to the LHS and have Backward Euler, but it is my rule of thumb.

So, were you looking for an IMEX scheme? If so, which terms should be lagged? Also, from the equations above, it is hard to
see why you need a solve to calculate p1/p2. It looks like just a forward application of an operator.

  Thanks,

     Matt

> I am trying to solve the Hamilton-Jacobi equation u_t - H(u_x) = 0. I
> welcome any suggestion for better methods.
>
> Thanks
> Miguel
>
> Miguel A. Salazar de Troya
> Postdoctoral Researcher, Lawrence Livermore National Laboratory
> B141
> Rm: 1085-5
> Ph: 1(925) 422-6411

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/

From bsmith at petsc.dev Mon Mar 22 21:42:33 2021
From: bsmith at petsc.dev (Barry Smith)
Date: Mon, 22 Mar 2021 21:42:33 -0500
Subject: [petsc-users] Local Discontinuous Galerkin with PETSc TS
In-Reply-To: 
References: <3EE29E70-8ECF-4842-99DC-30E867769875@llnl.gov>
Message-ID: <775766A0-D6D6-4007-888C-A261A139941F@petsc.dev>

   u_t = G(u)

   I don't see why you won't just compute any needed u_x from the given u and then you can use any explicit or implicit TS solver trivially. For implicit methods it can automatically compute the Jacobian of G for you or you can provide it directly. Explicit methods will just use the "old" u while implicit methods will use the new.

   Barry

> On Mar 22, 2021, at 7:20 PM, Matthew Knepley wrote:
>
> [quoted copies of the earlier messages in this thread trimmed; see above]
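Following Barry's suggestion, a minimal sketch of what the TS setup could look like: the spatial derivative is computed from the current u inside the RHS function, so p1 and p2 never appear as separate unknowns. The 1D periodic DMDA, the one-sided differences, and the particular H below are placeholder choices for illustration, not the scheme from the paper:

    #include <petscts.h>
    #include <petscdmda.h>

    static PetscErrorCode RHSFunction(TS ts, PetscReal t, Vec U, Vec F, void *ctx)
    {
      DM                 da;
      DMDALocalInfo      info;
      Vec                Uloc;
      PetscScalar       *f;
      const PetscScalar *u;
      PetscReal          h = *(PetscReal *)ctx;   /* grid spacing passed as context */

      TSGetDM(ts, &da);
      DMDAGetLocalInfo(da, &info);
      DMGetLocalVector(da, &Uloc);
      DMGlobalToLocalBegin(da, U, INSERT_VALUES, Uloc);
      DMGlobalToLocalEnd(da, U, INSERT_VALUES, Uloc);
      DMDAVecGetArrayRead(da, Uloc, &u);
      DMDAVecGetArray(da, F, &f);
      for (PetscInt i = info.xs; i < info.xs + info.xm; ++i) {
        PetscScalar ux_m = (u[i] - u[i-1]) / h;   /* one-sided u_x, standing in for p1 = f(u_x) */
        PetscScalar ux_p = (u[i+1] - u[i]) / h;   /* one-sided u_x, standing in for p2 = g(u_x) */
        f[i] = -0.5 * (ux_m + ux_p);              /* u_t = H(p1, p2); H is just an average here */
      }
      DMDAVecRestoreArray(da, F, &f);
      DMDAVecRestoreArrayRead(da, Uloc, &u);
      DMRestoreLocalVector(da, &Uloc);
      return 0;
    }

    /* driver sketch */
    DM        da;
    TS        ts;
    Vec       u;
    PetscReal h = 1.0 / 128;
    DMDACreate1d(PETSC_COMM_WORLD, DM_BOUNDARY_PERIODIC, 128, 1, 1, NULL, &da);
    DMSetUp(da);
    DMCreateGlobalVector(da, &u);       /* set the initial condition in u here */
    TSCreate(PETSC_COMM_WORLD, &ts);
    TSSetDM(ts, da);
    TSSetRHSFunction(ts, NULL, RHSFunction, &h);
    TSSetType(ts, TSRK);                /* explicit; an implicit type works the same way */
    TSSetTimeStep(ts, 1e-3);
    TSSetMaxTime(ts, 1.0);
    TSSetFromOptions(ts);
    TSSolve(ts, u);

Because only a RHS function is provided, explicit integrators evaluate it at the old u and implicit ones at the new u, as Barry describes; for implicit runs the Jacobian of G can be finite-differenced automatically (e.g. with coloring) or supplied via TSSetRHSJacobian().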
This is scary, 7 KSP preconditioned resid norm 7.989463672336e-05 true resid norm 1.904266194607e+07 often these huge differences in true residual come from the operator having a null space that is not handled properly. Or they can come from an ILU that produces absurd pivots. Recommend using the same PCFIELDSPLIT but using a direct solver on all the splits (no ML no ILU). If that results in great overall convergence it means one of the sub-systems is not converging properly with ML or ILU and you need better preconditioners on some of the splits. Barry > On Mar 22, 2021, at 4:05 PM, Sepideh Kavousi wrote: > > I modified my BC such that on the left and right side of the interface the BC are constant value instead of Neumann(zero flux). This solves the problem but still the code has convergence problem: > I even tried field split with the following order: > the block size is 9. the first block is the fields related to for PF equation, the second split block is the velocities in x and y direction and the third block is pressure. > > -pc_type fieldsplit > -pc_fieldsplit_block_size 9 > -pc_fieldsplit_0_fields 0,1 (two fields related to for the Phasefield model) > -pc_fieldsplit_1_fields 2,3 (velocity in x and y direction) > -pc_fieldsplit_2_fields 4 (pressure) > -fieldsplit_1_pc_fieldsplit_block_size 2 > -fieldsplit_1_fieldsplit_0_pc_type ml (based on https://lists.mcs.anl.gov/pipermail/petsc-users/2015-February/024191.html ) > -fieldsplit_1_fieldsplit_1_pc_type ml (based on https://lists.mcs.anl.gov/pipermail/petsc-users/2015-February/024191.html ) > -fieldsplit_0_pc_type ilu (based on previous solutions of phase-field equations) > -fieldsplit_2_pc_type ilu > > I guess changing the BCs the main reason that at first few steps the code does not fail. And as time increases, true resid norm increases such that at a finite time step (~30) it reaches 1e7 and the code results non-accurate velocity calculations. Can this also be resulted by forward/backward discritization? > Best, > Sepideh > > > 34 TS dt 3.12462e-07 time 0.00709097 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > copy! > copy! > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Write output at step= 34! > Write output at step= 34! 
> 7 KSP preconditioned resid norm 7.989463672336e-05 true resid norm 1.904266194607e+07 ||r(i)||/||b|| 9.471328148409e-01
> 0 SNES Function norm 6.393295863037e+07
> [The rest of the attached output repeats the same pattern from both ranks (duplicate lines collapsed here): every split reports "Linear fieldsplit_0_/1_/2_ solve converged due to CONVERGED_ITS iterations 1", the preconditioned residual norms fall to ~1e-5, while the true residual norms stay at about 1.9e+07 or above, through time steps 31-35. Representative lines:]
> 8 KSP preconditioned resid norm 4.768163844493e-05 true resid norm 1.904266226755e+07 ||r(i)||/||b|| 9.471328308303e-01
> 9 KSP preconditioned resid norm 2.073486308871e-05 true resid norm 1.904266231364e+07 ||r(i)||/||b|| 9.471328331228e-01
> 1 SNES Function norm 1.904272255718e+07
> 33 TS dt 1.56231e-07 time 0.00709082
> copy!
> Write output at step= 33!
> 0 SNES Function norm 2.464456787205e+07
> 1 KSP preconditioned resid norm 8.755672227974e-03 true resid norm 1.938151775153e+07 ||r(i)||/||b|| 7.864417770341e-01
> 1 SNES Function norm 1.938116032575e+07
> 34 TS dt 3.12462e-07 time 0.00709097
> copy!
> Write output at step= 34!
> 1 SNES Function norm 1.886737547995e+07
> 31 TS dt 3.90578e-08 time 0.0070907
> 1 SNES Function norm 2.011100207442e+07
> 35 TS dt 6.24925e-07 time 0.00709129
> copy!
> Write output at step= 35!
> 0 SNES Function norm 2.790215258123e+08
> 8 KSP preconditioned resid norm 2.112644035802e-05 true resid norm 1.887930251181e+07 ||r(i)||/||b|| 9.996677283257e-01
> Linear
fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 9 KSP preconditioned resid norm 1.113068460252e-05 true resid norm 1.887930250969e+07 ||r(i)||/||b|| 9.996677282137e-01 > 0 KSP preconditioned resid norm 1.143120146053e+03 true resid norm 2.790215258123e+08 ||r(i)||/||b|| 1.000000000000e+00 > 0 KSP preconditioned resid norm 1.143120146053e+03 true resid norm 2.790215258123e+08 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 10 KSP preconditioned resid norm 1.352518287887e-06 true resid norm 1.887930250333e+07 ||r(i)||/||b|| 9.996677278767e-01 > 1 KSP preconditioned resid norm 1.086405760770e-02 true resid norm 2.165319343001e+07 ||r(i)||/||b|| 7.760402487575e-02 > 1 KSP preconditioned resid norm 1.086405760770e-02 true resid norm 2.165319343001e+07 ||r(i)||/||b|| 7.760402487575e-02 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 1 SNES Function norm 3.485261902880e+07 > 1 SNES Function norm 3.485261902880e+07 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 36 TS dt 1.24985e-06 time 0.00709191 > 36 TS dt 1.24985e-06 time 0.00709191 > copy! > copy! > 0 KSP preconditioned resid norm 1.212273553655e+02 true resid norm 6.393295863037e+07 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Write output at step= 36! > Write output at step= 36! > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 0 SNES Function norm 4.684646000717e+08 > 0 SNES Function norm 4.684646000717e+08 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 11 KSP preconditioned resid norm 7.434707372444e-07 true resid norm 1.887930250410e+07 ||r(i)||/||b|| 9.996677279175e-01 > 1 SNES Function norm 1.887938190335e+07 > 1 KSP preconditioned resid norm 2.328823680294e-02 true resid norm 2.016516515602e+07 ||r(i)||/||b|| 3.154111054458e-01 > 32 TS dt 7.81156e-08 time 0.00709074 > copy! > Write output at step= 32! 
> Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 0 SNES Function norm 2.010558777785e+07 > 2 KSP preconditioned resid norm 2.757087112828e-04 true resid norm 2.010380358203e+07 ||r(i)||/||b|| 3.144513254621e-01 > 1 SNES Function norm 2.011100207442e+07 > 35 TS dt 6.24925e-07 time 0.00709129 > copy! > Write output at step= 35! > 0 SNES Function norm 2.790215258123e+08 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 0 KSP preconditioned resid norm 1.212273553655e+02 true resid norm 6.393295863037e+07 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 0 KSP preconditioned resid norm 1.212273553655e+02 true resid norm 6.393295863037e+07 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 1 KSP preconditioned resid norm 2.328823680294e-02 true resid norm 2.016516515602e+07 ||r(i)||/||b|| 3.154111054458e-01 > 0 KSP preconditioned resid norm 1.143120146053e+03 true resid norm 2.790215258123e+08 ||r(i)||/||b|| 1.000000000000e+00 > 0 KSP preconditioned resid norm 1.143120146053e+03 true resid norm 2.790215258123e+08 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 2.328823680294e-02 true resid norm 2.016516515602e+07 ||r(i)||/||b|| 3.154111054458e-01 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 2 KSP preconditioned resid norm 2.757087112828e-04 true resid norm 2.010380358203e+07 
||r(i)||/||b|| 3.144513254621e-01 > 1 KSP preconditioned resid norm 1.086405760770e-02 true resid norm 2.165319343001e+07 ||r(i)||/||b|| 7.760402487575e-02 > 1 KSP preconditioned resid norm 1.086405760770e-02 true resid norm 2.165319343001e+07 ||r(i)||/||b|| 7.760402487575e-02 > 2 KSP preconditioned resid norm 2.757087112828e-04 true resid norm 2.010380358203e+07 ||r(i)||/||b|| 3.144513254621e-01 > 1 SNES Function norm 2.011100207442e+07 > 1 SNES Function norm 3.485261902880e+07 > 36 TS dt 1.24985e-06 time 0.00709191 > 35 TS dt 6.24925e-07 time 0.00709129 > copy! > 1 SNES Function norm 3.485261902880e+07 > copy! > 36 TS dt 1.24985e-06 time 0.00709191 > 1 SNES Function norm 2.011100207442e+07 > copy! > 35 TS dt 6.24925e-07 time 0.00709129 > Write output at step= 35! > Write output at step= 36! > copy! > Write output at step= 36! > Write output at step= 35! > 0 SNES Function norm 2.790215258123e+08 > 0 SNES Function norm 4.684646000717e+08 > 0 SNES Function norm 4.684646000717e+08 > 0 SNES Function norm 2.790215258123e+08 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 0 KSP preconditioned resid norm 4.028400414901e+03 true resid norm 4.684646000717e+08 ||r(i)||/||b|| 1.000000000000e+00 > 0 KSP preconditioned resid norm 4.028400414901e+03 true resid norm 4.684646000717e+08 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 1 KSP preconditioned resid norm 7.513114811690e-02 true resid norm 2.515581080270e+07 ||r(i)||/||b|| 5.369842416878e-02 > 1 KSP preconditioned resid norm 7.513114811690e-02 true resid norm 2.515581080270e+07 ||r(i)||/||b|| 5.369842416878e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 2 KSP preconditioned resid norm 3.361950980934e-02 true resid norm 2.589885319795e+07 ||r(i)||/||b|| 5.528454699457e-02 > 2 KSP preconditioned resid norm 3.361950980934e-02 true resid norm 2.589885319795e+07 ||r(i)||/||b|| 5.528454699457e-02 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 1 SNES Function norm 
5.194818889106e+07 > 1 SNES Function norm 5.194818889106e+07 > 37 TS dt 1.4811e-06 time 0.00709316 > 37 TS dt 1.4811e-06 time 0.00709316 > copy! > copy! > 0 KSP preconditioned resid norm 1.143120146053e+03 true resid norm 2.790215258123e+08 ||r(i)||/||b|| 1.000000000000e+00 > Write output at step= 37! > 0 KSP preconditioned resid norm 3.165458473526e+00 true resid norm 2.010558777785e+07 ||r(i)||/||b|| 1.000000000000e+00 > Write output at step= 37! > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 0 SNES Function norm 3.786081856480e+09 > 0 SNES Function norm 3.786081856480e+09 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 1 KSP preconditioned resid norm 1.086405760770e-02 true resid norm 2.165319343001e+07 ||r(i)||/||b|| 7.760402487575e-02 > 1 KSP preconditioned resid norm 3.655364946441e-03 true resid norm 1.904269379864e+07 ||r(i)||/||b|| 9.471343991057e-01 > 1 SNES Function norm 3.485261902880e+07 > 36 TS dt 1.24985e-06 time 0.00709191 > copy! > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Write output at step= 36! > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 0 SNES Function norm 4.684646000717e+08 > 2 KSP preconditioned resid norm 2.207564350060e-03 true resid norm 1.904265942845e+07 ||r(i)||/||b|| 9.471326896210e-01 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 3 KSP preconditioned resid norm 2.188447918524e-03 true resid norm 1.904266151317e+07 ||r(i)||/||b|| 9.471327933098e-01 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 4 KSP preconditioned resid norm 7.425314556150e-04 true resid norm 1.904265807404e+07 ||r(i)||/||b|| 9.471326222560e-01 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 5 KSP preconditioned resid norm 6.052794097111e-04 true resid norm 1.904265841692e+07 ||r(i)||/||b|| 9.471326393103e-01 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 6 KSP preconditioned resid norm 1.251197915617e-04 true resid norm 1.904266159346e+07 ||r(i)||/||b|| 9.471327973028e-01 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve 
converged due to CONVERGED_ITS iterations 1 > 7 KSP preconditioned resid norm 7.989463672336e-05 true resid norm 1.904266194607e+07 ||r(i)||/||b|| 9.471328148409e-01 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 0 KSP preconditioned resid norm 4.028400414901e+03 true resid norm 4.684646000717e+08 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > 0 KSP preconditioned resid norm 4.028400414901e+03 true resid norm 4.684646000717e+08 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > 0 KSP preconditioned resid norm 1.143120146053e+03 true resid norm 2.790215258123e+08 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 0 KSP preconditioned resid norm 1.143120146053e+03 true resid norm 2.790215258123e+08 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > 8 KSP preconditioned resid norm 4.768163844493e-05 true resid norm 1.904266226755e+07 ||r(i)||/||b|| 9.471328308303e-01 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 1 KSP preconditioned resid norm 7.513114811690e-02 true resid norm 2.515581080270e+07 ||r(i)||/||b|| 5.369842416878e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 1 KSP preconditioned resid norm 7.513114811690e-02 true resid norm 2.515581080270e+07 ||r(i)||/||b|| 5.369842416878e-02 > 1 KSP preconditioned resid norm 1.086405760770e-02 true resid norm 2.165319343001e+07 ||r(i)||/||b|| 7.760402487575e-02 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 1 KSP preconditioned resid norm 1.086405760770e-02 true resid norm 2.165319343001e+07 ||r(i)||/||b|| 7.760402487575e-02 > 1 SNES Function norm 3.485261902880e+07 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 36 TS dt 1.24985e-06 time 0.00709191 > Linear 
fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > copy! > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 9 KSP preconditioned resid norm 2.073486308871e-05 true resid norm 1.904266231364e+07 ||r(i)||/||b|| 9.471328331228e-01 > 1 SNES Function norm 3.485261902880e+07 > Write output at step= 36! > 36 TS dt 1.24985e-06 time 0.00709191 > copy! > 2 KSP preconditioned resid norm 3.361950980934e-02 true resid norm 2.589885319795e+07 ||r(i)||/||b|| 5.528454699457e-02 > 1 SNES Function norm 1.904272255718e+07 > 2 KSP preconditioned resid norm 3.361950980934e-02 true resid norm 2.589885319795e+07 ||r(i)||/||b|| 5.528454699457e-02 > Write output at step= 36! > 0 SNES Function norm 4.684646000717e+08 > 33 TS dt 1.56231e-07 time 0.00709082 > copy! > 1 SNES Function norm 5.194818889106e+07 > 37 TS dt 1.4811e-06 time 0.00709316 > 1 SNES Function norm 5.194818889106e+07 > Write output at step= 33! > copy! > 0 SNES Function norm 4.684646000717e+08 > 37 TS dt 1.4811e-06 time 0.00709316 > copy! > Write output at step= 37! > Write output at step= 37! > 0 SNES Function norm 2.464456787205e+07 > 0 SNES Function norm 3.786081856480e+09 > 0 SNES Function norm 3.786081856480e+09 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 0 KSP preconditioned resid norm 4.626907080947e+04 true resid norm 3.786081856480e+09 ||r(i)||/||b|| 1.000000000000e+00 > 0 KSP preconditioned resid norm 4.626907080947e+04 true resid norm 3.786081856480e+09 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 1 KSP preconditioned resid norm 1.411143979706e-01 true resid norm 3.252255320579e+07 ||r(i)||/||b|| 8.590029069268e-03 > 1 KSP preconditioned resid norm 1.411143979706e-01 true resid norm 3.252255320579e+07 ||r(i)||/||b|| 8.590029069268e-03 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 1 SNES Function norm 1.679071798524e+08 > 1 SNES Function norm 1.679071798524e+08 > 38 TS dt 2.09926e-06 time 0.00709464 > 38 TS dt 2.09926e-06 time 0.00709464 > copy! > copy! > 0 KSP preconditioned resid norm 4.028400414901e+03 true resid norm 4.684646000717e+08 ||r(i)||/||b|| 1.000000000000e+00 > Write output at step= 38! > Write output at step= 38! 
> 0 SNES Function norm 4.969343279719e+09 > 0 SNES Function norm 4.969343279719e+09 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 1 KSP preconditioned resid norm 7.513114811690e-02 true resid norm 2.515581080270e+07 ||r(i)||/||b|| 5.369842416878e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 2 KSP preconditioned resid norm 3.361950980934e-02 true resid norm 2.589885319795e+07 ||r(i)||/||b|| 5.528454699457e-02 > 1 SNES Function norm 5.194818889106e+07 > 37 TS dt 1.4811e-06 time 0.00709316 > copy! > Write output at step= 37! > 0 SNES Function norm 3.786081856480e+09 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > 0 KSP preconditioned resid norm 4.028400414901e+03 true resid norm 4.684646000717e+08 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 0 KSP preconditioned resid norm 4.028400414901e+03 true resid norm 4.684646000717e+08 ||r(i)||/||b|| 1.000000000000e+00 > 0 KSP preconditioned resid norm 4.626907080947e+04 true resid norm 3.786081856480e+09 ||r(i)||/||b|| 1.000000000000e+00 > 0 KSP preconditioned resid norm 4.626907080947e+04 true resid norm 3.786081856480e+09 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > 1 KSP preconditioned resid norm 7.513114811690e-02 true resid norm 2.515581080270e+07 ||r(i)||/||b|| 5.369842416878e-02 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 1 KSP preconditioned resid norm 
7.513114811690e-02 true resid norm 2.515581080270e+07 ||r(i)||/||b|| 5.369842416878e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 1 KSP preconditioned resid norm 1.411143979706e-01 true resid norm 3.252255320579e+07 ||r(i)||/||b|| 8.590029069268e-03 > 1 KSP preconditioned resid norm 1.411143979706e-01 true resid norm 3.252255320579e+07 ||r(i)||/||b|| 8.590029069268e-03 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 0 KSP preconditioned resid norm 1.502321597276e+01 true resid norm 2.464456787205e+07 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 1 SNES Function norm 1.679071798524e+08 > 38 TS dt 2.09926e-06 time 0.00709464 > 1 SNES Function norm 1.679071798524e+08 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > copy! > 38 TS dt 2.09926e-06 time 0.00709464 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > copy! > 2 KSP preconditioned resid norm 3.361950980934e-02 true resid norm 2.589885319795e+07 ||r(i)||/||b|| 5.528454699457e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Write output at step= 38! > Write output at step= 38! > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 1 SNES Function norm 5.194818889106e+07 > 2 KSP preconditioned resid norm 3.361950980934e-02 true resid norm 2.589885319795e+07 ||r(i)||/||b|| 5.528454699457e-02 > 37 TS dt 1.4811e-06 time 0.00709316 > 0 SNES Function norm 4.969343279719e+09 > copy! > 0 SNES Function norm 4.969343279719e+09 > Write output at step= 37! > 1 KSP preconditioned resid norm 8.755672227974e-03 true resid norm 1.938151775153e+07 ||r(i)||/||b|| 7.864417770341e-01 > 1 SNES Function norm 5.194818889106e+07 > 37 TS dt 1.4811e-06 time 0.00709316 > copy! > 0 SNES Function norm 3.786081856480e+09 > Write output at step= 37! 
> Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 0 SNES Function norm 3.786081856480e+09 > 2 KSP preconditioned resid norm 1.508190120513e-03 true resid norm 1.938016275793e+07 ||r(i)||/||b|| 7.863867956035e-01 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 3 KSP preconditioned resid norm 1.422130457598e-03 true resid norm 1.938011909055e+07 ||r(i)||/||b|| 7.863850237166e-01 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 4 KSP preconditioned resid norm 4.957954730602e-04 true resid norm 1.938028047800e+07 ||r(i)||/||b|| 7.863915723180e-01 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 5 KSP preconditioned resid norm 3.552719096462e-04 true resid norm 1.938029700510e+07 ||r(i)||/||b|| 7.863922429363e-01 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 6 KSP preconditioned resid norm 1.508455316659e-04 true resid norm 1.938039640595e+07 ||r(i)||/||b|| 7.863962763140e-01 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 7 KSP preconditioned resid norm 7.631525161839e-05 true resid norm 1.938041339557e+07 ||r(i)||/||b|| 7.863969657001e-01 > 0 KSP preconditioned resid norm 8.865880769163e+04 true resid norm 4.969343279719e+09 ||r(i)||/||b|| 1.000000000000e+00 > 0 KSP preconditioned resid norm 8.865880769163e+04 true resid norm 4.969343279719e+09 ||r(i)||/||b|| 1.000000000000e+00 > 1 SNES Function norm 1.938116032575e+07 > 34 TS dt 3.12462e-07 time 0.00709097 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > copy! > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Write output at step= 34! 
> Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 1 KSP preconditioned resid norm 4.636177252582e-01 true resid norm 4.472959395106e+07 ||r(i)||/||b|| 9.001107678275e-03 > 0 SNES Function norm 6.393295863037e+07 > 1 KSP preconditioned resid norm 4.636177252582e-01 true resid norm 4.472959395106e+07 ||r(i)||/||b|| 9.001107678275e-03 > 1 SNES Function norm 1.426574249724e+08 > 1 SNES Function norm 1.426574249724e+08 > 39 TS dt 3.42747e-06 time 0.00709674 > 39 TS dt 3.42747e-06 time 0.00709674 > copy! > copy! > Write output at step= 39! > Write output at step= 39! > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 0 SNES Function norm 6.081085806316e+09 > 0 SNES Function norm 6.081085806316e+09 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 0 KSP preconditioned resid norm 4.626907080947e+04 true resid norm 3.786081856480e+09 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 1 KSP preconditioned resid norm 1.411143979706e-01 true resid norm 3.252255320579e+07 ||r(i)||/||b|| 8.590029069268e-03 > 1 SNES Function norm 1.679071798524e+08 > 38 TS dt 2.09926e-06 time 0.00709464 > copy! > Write output at step= 38! > 0 SNES Function norm 4.969343279719e+09 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 0 KSP preconditioned resid norm 8.865880769163e+04 true resid norm 4.969343279719e+09 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 0 KSP preconditioned resid norm 8.865880769163e+04 true resid norm 4.969343279719e+09 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > 0 KSP preconditioned resid norm 4.626907080947e+04 true resid norm 3.786081856480e+09 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 1 KSP preconditioned resid norm 4.636177252582e-01 true resid norm 4.472959395106e+07 ||r(i)||/||b|| 9.001107678275e-03 > 1 KSP preconditioned resid norm 4.636177252582e-01 true resid norm 4.472959395106e+07 ||r(i)||/||b|| 9.001107678275e-03 > Linear fieldsplit_0_ solve converged 
due to CONVERGED_ITS iterations 1 > 0 KSP preconditioned resid norm 4.626907080947e+04 true resid norm 3.786081856480e+09 ||r(i)||/||b|| 1.000000000000e+00 > 1 SNES Function norm 1.426574249724e+08 > 39 TS dt 3.42747e-06 time 0.00709674 > 1 SNES Function norm 1.426574249724e+08 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > copy! > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 39 TS dt 3.42747e-06 time 0.00709674 > copy! > Write output at step= 39! > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Write output at step= 39! > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > 1 KSP preconditioned resid norm 1.411143979706e-01 true resid norm 3.252255320579e+07 ||r(i)||/||b|| 8.590029069268e-03 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 0 SNES Function norm 6.081085806316e+09 > 0 SNES Function norm 6.081085806316e+09 > 1 SNES Function norm 1.679071798524e+08 > 38 TS dt 2.09926e-06 time 0.00709464 > 1 KSP preconditioned resid norm 1.411143979706e-01 true resid norm 3.252255320579e+07 ||r(i)||/||b|| 8.590029069268e-03 > copy! > Write output at step= 38! > 1 SNES Function norm 1.679071798524e+08 > 38 TS dt 2.09926e-06 time 0.00709464 > copy! > 0 SNES Function norm 4.969343279719e+09 > Write output at step= 38! > 0 SNES Function norm 4.969343279719e+09 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 0 KSP preconditioned resid norm 1.932453778489e+05 true resid norm 6.081085806316e+09 ||r(i)||/||b|| 1.000000000000e+00 > 0 KSP preconditioned resid norm 1.932453778489e+05 true resid norm 6.081085806316e+09 ||r(i)||/||b|| 1.000000000000e+00 > 0 KSP preconditioned resid norm 1.212273553655e+02 true resid norm 6.393295863037e+07 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 1 KSP preconditioned resid norm 1.434243067909e+00 true resid norm 7.171981611988e+07 ||r(i)||/||b|| 1.179391615316e-02 > 1 KSP preconditioned resid norm 1.434243067909e+00 true resid norm 7.171981611988e+07 ||r(i)||/||b|| 1.179391615316e-02 > 1 SNES Function norm 1.906687815261e+08 > 1 SNES Function norm 1.906687815261e+08 > 1 KSP preconditioned resid norm 2.328823680294e-02 true resid norm 2.016516515602e+07 ||r(i)||/||b|| 3.154111054458e-01 > 40 TS dt 6.85494e-06 
time 0.00710017 > 40 TS dt 6.85494e-06 time 0.00710017 > copy! > copy! > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Write output at step= 40! > Write output at step= 40! > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 0 SNES Function norm 2.386070312612e+09 > 0 SNES Function norm 2.386070312612e+09 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 2 KSP preconditioned resid norm 2.757087112828e-04 true resid norm 2.010380358203e+07 ||r(i)||/||b|| 3.144513254621e-01 > 0 KSP preconditioned resid norm 8.865880769163e+04 true resid norm 4.969343279719e+09 ||r(i)||/||b|| 1.000000000000e+00 > 1 SNES Function norm 2.011100207442e+07 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 35 TS dt 6.24925e-07 time 0.00709129 > copy! > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Write output at step= 35! > 1 KSP preconditioned resid norm 4.636177252582e-01 true resid norm 4.472959395106e+07 ||r(i)||/||b|| 9.001107678275e-03 > 0 SNES Function norm 2.790215258123e+08 > 1 SNES Function norm 1.426574249724e+08 > 39 TS dt 3.42747e-06 time 0.00709674 > copy! > Write output at step= 39! > 0 SNES Function norm 6.081085806316e+09 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 0 KSP preconditioned resid norm 1.932453778489e+05 true resid norm 6.081085806316e+09 ||r(i)||/||b|| 1.000000000000e+00 > 0 KSP preconditioned resid norm 1.932453778489e+05 true resid norm 6.081085806316e+09 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > 1 KSP preconditioned resid norm 1.434243067909e+00 true resid norm 7.171981611988e+07 ||r(i)||/||b|| 1.179391615316e-02 > 0 KSP preconditioned resid norm 8.865880769163e+04 true resid norm 4.969343279719e+09 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 1 KSP preconditioned resid norm 1.434243067909e+00 true resid norm 7.171981611988e+07 ||r(i)||/||b|| 1.179391615316e-02 > 1 SNES Function norm 1.906687815261e+08 > Linear fieldsplit_0_ solve 
converged due to CONVERGED_ITS iterations 1 > 40 TS dt 6.85494e-06 time 0.00710017 > 1 SNES Function norm 1.906687815261e+08 > 0 KSP preconditioned resid norm 8.865880769163e+04 true resid norm 4.969343279719e+09 ||r(i)||/||b|| 1.000000000000e+00 > copy! > 40 TS dt 6.85494e-06 time 0.00710017 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > copy! > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Write output at step= 40! > Write output at step= 40! > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > 0 SNES Function norm 2.386070312612e+09 > 1 KSP preconditioned resid norm 4.636177252582e-01 true resid norm 4.472959395106e+07 ||r(i)||/||b|| 9.001107678275e-03 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 0 SNES Function norm 2.386070312612e+09 > 1 SNES Function norm 1.426574249724e+08 > 39 TS dt 3.42747e-06 time 0.00709674 > 1 KSP preconditioned resid norm 4.636177252582e-01 true resid norm 4.472959395106e+07 ||r(i)||/||b|| 9.001107678275e-03 > copy! > Write output at step= 39! > 1 SNES Function norm 1.426574249724e+08 > 39 TS dt 3.42747e-06 time 0.00709674 > copy! > 0 SNES Function norm 6.081085806316e+09 > Write output at step= 39! > 0 SNES Function norm 6.081085806316e+09 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 0 KSP preconditioned resid norm 1.698771295747e+05 true resid norm 2.386070312612e+09 ||r(i)||/||b|| 1.000000000000e+00 > 0 KSP preconditioned resid norm 1.698771295747e+05 true resid norm 2.386070312612e+09 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 1 KSP preconditioned resid norm 5.030869192505e+00 true resid norm 1.519850942875e+08 ||r(i)||/||b|| 6.369682128989e-02 > 1 KSP preconditioned resid norm 5.030869192505e+00 true resid norm 1.519850942875e+08 ||r(i)||/||b|| 6.369682128989e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 2 KSP preconditioned resid norm 3.859648344440e+00 true resid norm 9.578310406856e+07 ||r(i)||/||b|| 4.014261589958e-02 > 2 KSP preconditioned resid norm 3.859648344440e+00 true resid norm 9.578310406856e+07 ||r(i)||/||b|| 4.014261589958e-02 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS 
iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 0 KSP preconditioned resid norm 1.932453778489e+05 true resid norm 6.081085806316e+09 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 0 KSP preconditioned resid norm 1.143120146053e+03 true resid norm 2.790215258123e+08 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 3 KSP preconditioned resid norm 3.133252984070e+00 true resid norm 7.649343929054e+07 ||r(i)||/||b|| 3.205833410953e-02 > 3 KSP preconditioned resid norm 3.133252984070e+00 true resid norm 7.649343929054e+07 ||r(i)||/||b|| 3.205833410953e-02 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 1 KSP preconditioned resid norm 1.434243067909e+00 true resid norm 7.171981611988e+07 ||r(i)||/||b|| 1.179391615316e-02 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 1 SNES Function norm 1.906687815261e+08 > 1 KSP preconditioned resid norm 1.086405760770e-02 true resid norm 2.165319343001e+07 ||r(i)||/||b|| 7.760402487575e-02 > 40 TS dt 6.85494e-06 time 0.00710017 > copy! > 4 KSP preconditioned resid norm 2.753196206679e+00 true resid norm 6.468496966917e+07 ||r(i)||/||b|| 2.710941472566e-02 > 4 KSP preconditioned resid norm 2.753196206679e+00 true resid norm 6.468496966917e+07 ||r(i)||/||b|| 2.710941472566e-02 > Write output at step= 40! > 1 SNES Function norm 3.485261902880e+07 > 36 TS dt 1.24985e-06 time 0.00709191 > copy! > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 0 SNES Function norm 2.386070312612e+09 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Write output at step= 36! 
> Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 0 SNES Function norm 4.684646000717e+08 > 5 KSP preconditioned resid norm 2.500754637621e+00 true resid norm 5.655851544362e+07 ||r(i)||/||b|| 2.370362480296e-02 > 5 KSP preconditioned resid norm 2.500754637621e+00 true resid norm 5.655851544362e+07 ||r(i)||/||b|| 2.370362480296e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 6 KSP preconditioned resid norm 2.263885927226e+00 true resid norm 5.368792420993e+07 ||r(i)||/||b|| 2.250056250487e-02 > 6 KSP preconditioned resid norm 2.263885927226e+00 true resid norm 5.368792420993e+07 ||r(i)||/||b|| 2.250056250487e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 7 KSP preconditioned resid norm 2.104083158879e+00 true resid norm 5.002745241660e+07 ||r(i)||/||b|| 2.096646194883e-02 > 7 KSP preconditioned resid norm 2.104083158879e+00 true resid norm 5.002745241660e+07 ||r(i)||/||b|| 2.096646194883e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 8 KSP preconditioned resid norm 1.976241956139e+00 true resid norm 4.707835526096e+07 ||r(i)||/||b|| 1.973049788689e-02 > 8 KSP preconditioned resid norm 1.976241956139e+00 true resid norm 4.707835526096e+07 ||r(i)||/||b|| 1.973049788689e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 9 KSP preconditioned resid norm 1.870078692623e+00 true resid norm 4.468270947214e+07 ||r(i)||/||b|| 1.872648481311e-02 > 9 KSP preconditioned resid norm 1.870078692623e+00 true resid norm 4.468270947214e+07 ||r(i)||/||b|| 1.872648481311e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve 
converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 10 KSP preconditioned resid norm 1.779870885679e+00 true resid norm 4.269989357798e+07 ||r(i)||/||b|| 1.789548839038e-02 > 10 KSP preconditioned resid norm 1.779870885679e+00 true resid norm 4.269989357798e+07 ||r(i)||/||b|| 1.789548839038e-02 > 0 KSP preconditioned resid norm 1.698771295747e+05 true resid norm 2.386070312612e+09 ||r(i)||/||b|| 1.000000000000e+00 > 0 KSP preconditioned resid norm 1.698771295747e+05 true resid norm 2.386070312612e+09 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 11 KSP preconditioned resid norm 1.701919459227e+00 true resid norm 4.102844905926e+07 ||r(i)||/||b|| 1.719498744124e-02 > 11 KSP preconditioned resid norm 1.701919459227e+00 true resid norm 4.102844905926e+07 ||r(i)||/||b|| 1.719498744124e-02 > 1 KSP preconditioned resid norm 5.030869192505e+00 true resid norm 1.519850942875e+08 ||r(i)||/||b|| 6.369682128989e-02 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > 1 KSP preconditioned resid norm 5.030869192505e+00 true resid norm 1.519850942875e+08 ||r(i)||/||b|| 6.369682128989e-02 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 0 KSP preconditioned resid norm 1.932453778489e+05 true resid norm 6.081085806316e+09 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 
> Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 2 KSP preconditioned resid norm 3.859648344440e+00 true resid norm 9.578310406856e+07 ||r(i)||/||b|| 4.014261589958e-02 > 12 KSP preconditioned resid norm 1.633671715659e+00 true resid norm 3.959508749664e+07 ||r(i)||/||b|| 1.659426685264e-02 > 12 KSP preconditioned resid norm 1.633671715659e+00 true resid norm 3.959508749664e+07 ||r(i)||/||b|| 1.659426685264e-02 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > 2 KSP preconditioned resid norm 3.859648344440e+00 true resid norm 9.578310406856e+07 ||r(i)||/||b|| 4.014261589958e-02 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 0 KSP preconditioned resid norm 1.932453778489e+05 true resid norm 6.081085806316e+09 ||r(i)||/||b|| 1.000000000000e+00 > 1 SNES Function norm 6.245293405655e+10 > 1 SNES Function norm 6.245293405655e+10 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 1 KSP preconditioned resid norm 1.434243067909e+00 true resid norm 7.171981611988e+07 ||r(i)||/||b|| 1.179391615316e-02 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > 0 SNES Function norm 2.097192603535e+10 > 0 SNES Function norm 2.097192603535e+10 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 1 SNES Function norm 1.906687815261e+08 > 40 TS dt 6.85494e-06 time 0.00710017 > copy! > 3 KSP preconditioned resid norm 3.133252984070e+00 true resid norm 7.649343929054e+07 ||r(i)||/||b|| 3.205833410953e-02 > 3 KSP preconditioned resid norm 3.133252984070e+00 true resid norm 7.649343929054e+07 ||r(i)||/||b|| 3.205833410953e-02 > 1 KSP preconditioned resid norm 1.434243067909e+00 true resid norm 7.171981611988e+07 ||r(i)||/||b|| 1.179391615316e-02 > Write output at step= 40! > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 1 SNES Function norm 1.906687815261e+08 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 0 SNES Function norm 2.386070312612e+09 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > 40 TS dt 6.85494e-06 time 0.00710017 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > copy! > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Write output at step= 40! 
> 4 KSP preconditioned resid norm 2.753196206679e+00 true resid norm 6.468496966917e+07 ||r(i)||/||b|| 2.710941472566e-02 > 4 KSP preconditioned resid norm 2.753196206679e+00 true resid norm 6.468496966917e+07 ||r(i)||/||b|| 2.710941472566e-02 > 0 SNES Function norm 2.386070312612e+09 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 5 KSP preconditioned resid norm 2.500754637621e+00 true resid norm 5.655851544362e+07 ||r(i)||/||b|| 2.370362480296e-02 > 5 KSP preconditioned resid norm 2.500754637621e+00 true resid norm 5.655851544362e+07 ||r(i)||/||b|| 2.370362480296e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 6 KSP preconditioned resid norm 2.263885927226e+00 true resid norm 5.368792420993e+07 ||r(i)||/||b|| 2.250056250487e-02 > 6 KSP preconditioned resid norm 2.263885927226e+00 true resid norm 5.368792420993e+07 ||r(i)||/||b|| 2.250056250487e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 7 KSP preconditioned resid norm 2.104083158879e+00 true resid norm 5.002745241660e+07 ||r(i)||/||b|| 2.096646194883e-02 > 7 KSP preconditioned resid norm 2.104083158879e+00 true resid norm 5.002745241660e+07 ||r(i)||/||b|| 2.096646194883e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 8 KSP preconditioned resid norm 1.976241956139e+00 true resid norm 4.707835526096e+07 ||r(i)||/||b|| 1.973049788689e-02 > 8 KSP preconditioned resid norm 1.976241956139e+00 true resid norm 4.707835526096e+07 ||r(i)||/||b|| 1.973049788689e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 9 KSP preconditioned resid norm 1.870078692623e+00 true resid norm 4.468270947214e+07 ||r(i)||/||b|| 
1.872648481311e-02 > 9 KSP preconditioned resid norm 1.870078692623e+00 true resid norm 4.468270947214e+07 ||r(i)||/||b|| 1.872648481311e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 10 KSP preconditioned resid norm 1.779870885679e+00 true resid norm 4.269989357798e+07 ||r(i)||/||b|| 1.789548839038e-02 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 10 KSP preconditioned resid norm 1.779870885679e+00 true resid norm 4.269989357798e+07 ||r(i)||/||b|| 1.789548839038e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 0 KSP preconditioned resid norm 1.698771295747e+05 true resid norm 2.386070312612e+09 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 11 KSP preconditioned resid norm 1.701919459227e+00 true resid norm 4.102844905926e+07 ||r(i)||/||b|| 1.719498744124e-02 > 11 KSP preconditioned resid norm 1.701919459227e+00 true resid norm 4.102844905926e+07 ||r(i)||/||b|| 1.719498744124e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 1 KSP preconditioned resid norm 5.030869192505e+00 true resid norm 1.519850942875e+08 ||r(i)||/||b|| 6.369682128989e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 12 KSP preconditioned resid norm 1.633671715659e+00 true resid norm 3.959508749664e+07 ||r(i)||/||b|| 1.659426685264e-02 > 12 KSP preconditioned resid norm 1.633671715659e+00 true resid norm 3.959508749664e+07 ||r(i)||/||b|| 1.659426685264e-02 > 0 KSP preconditioned resid norm 4.028400414901e+03 true resid norm 4.684646000717e+08 ||r(i)||/||b|| 1.000000000000e+00 > 2 KSP preconditioned resid norm 3.859648344440e+00 true resid norm 9.578310406856e+07 ||r(i)||/||b|| 
4.014261589958e-02 > 1 SNES Function norm 6.245293405655e+10 > 1 SNES Function norm 6.245293405655e+10 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > 0 SNES Function norm 2.097192603535e+10 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 0 SNES Function norm 2.097192603535e+10 > 1 KSP preconditioned resid norm 7.513114811690e-02 true resid norm 2.515581080270e+07 ||r(i)||/||b|| 5.369842416878e-02 > 3 KSP preconditioned resid norm 3.133252984070e+00 true resid norm 7.649343929054e+07 ||r(i)||/||b|| 3.205833410953e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 2 KSP preconditioned resid norm 3.361950980934e-02 true resid norm 2.589885319795e+07 ||r(i)||/||b|| 5.528454699457e-02 > 4 KSP preconditioned resid norm 2.753196206679e+00 true resid norm 6.468496966917e+07 ||r(i)||/||b|| 2.710941472566e-02 > 1 SNES Function norm 5.194818889106e+07 > 37 TS dt 1.4811e-06 time 0.00709316 > copy! > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Write output at step= 37! > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 0 SNES Function norm 3.786081856480e+09 > 5 KSP preconditioned resid norm 2.500754637621e+00 true resid norm 5.655851544362e+07 ||r(i)||/||b|| 2.370362480296e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 6 KSP preconditioned resid norm 2.263885927226e+00 true resid norm 5.368792420993e+07 ||r(i)||/||b|| 2.250056250487e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 7 KSP preconditioned resid norm 2.104083158879e+00 true resid norm 5.002745241660e+07 ||r(i)||/||b|| 2.096646194883e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 8 KSP preconditioned resid norm 1.976241956139e+00 true resid norm 4.707835526096e+07 ||r(i)||/||b|| 1.973049788689e-02 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due 
to CONVERGED_ITS iterations 1 > 0 KSP preconditioned resid norm 4.670669894784e+05 true resid norm 2.097192603535e+10 ||r(i)||/||b|| 1.000000000000e+00 > 0 KSP preconditioned resid norm 4.670669894784e+05 true resid norm 2.097192603535e+10 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 9 KSP preconditioned resid norm 1.870078692623e+00 true resid norm 4.468270947214e+07 ||r(i)||/||b|| 1.872648481311e-02 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > 1 KSP preconditioned resid norm 2.135766802417e+00 true resid norm 8.990660342789e+07 ||r(i)||/||b|| 4.286997926482e-03 > 1 KSP preconditioned resid norm 2.135766802417e+00 true resid norm 8.990660342789e+07 ||r(i)||/||b|| 4.286997926482e-03 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 > 0 KSP preconditioned resid norm 1.698771295747e+05 true resid norm 2.386070312612e+09 ||r(i)||/||b|| 1.000000000000e+00 > 1 SNES Function norm 7.093938024894e+08 > 1 SNES Function norm 7.093938024894e+08 > > From: Barry Smith > > Sent: Monday, March 22, 2021 1:56 PM > To: Sepideh Kavousi > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] PF+Navier stokes > > > Singular systems come up in solving PDEs almost always due to issues related to boundary conditions. For example all Neumann (natural) boundary conditions can produce singular systems. Direct factorizations generically will eventually hit a zero pivot in such cases and there is no universally acceptable approaches for what to do at that point to recover. If you think your operator is singular you should start by using MatSetNullSpace(), it won't "cure" the problem but is the tool we use to manage null spaces in operators. > > > > > >> On Mar 22, 2021, at 9:04 AM, Sepideh Kavousi > wrote: >> >> Hello, >> I want to solve PF solidification+Navier stokes using Finite different method, and I have a strange problem. My code runs fine for some system sizes and fails for some of the system sizes. 
When I run with the following options:
>> mpirun -np 2 ./one.out -ts_monitor -snes_fd_color -ts_max_snes_failures -1 -ts_type bdf -ts_bdf_adapt -pc_type bjacobi -snes_linesearch_type l2 -snes_type ksponly -ksp_type gmres -ksp_gmres_restart 1001 -sub_pc_type ilu -sub_ksp_type preonly -snes_monitor -ksp_monitor -snes_linesearch_monitor -ksp_monitor_true_residual -ksp_converged_reason -log_view
>>
>> 0 SNES Function norm 1.465357113711e+01
>> 0 SNES Function norm 1.465357113711e+01
>> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
>> PC_FAILED due to SUBPC_ERROR
>> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
>> PC_FAILED due to SUBPC_ERROR
>> 0 SNES Function norm 1.465357113711e+01
>> 0 SNES Function norm 1.465357113711e+01
>> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
>> PC_FAILED due to SUBPC_ERROR
>> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
>> PC_FAILED due to SUBPC_ERROR
>> 0 SNES Function norm 1.465357113711e+01
>> 0 SNES Function norm 1.465357113711e+01
>> ^C Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
>> PC_FAILED due to SUBPC_ERROR
>> 0 SNES Function norm 1.465357113711e+01
>> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
>> PC_FAILED due to SUBPC_ERROR
>> 0 SNES Function norm 1.465357113711e+01
>>
>> Even setting pc_type to LU does not solve the problem.
>> 0 TS dt 0.0001 time 0.
>> copy!
>> copy!
>> Write output at step= 0!
>> Write output at step= 0!
>> 0 SNES Function norm 1.465357113711e+01
>> 0 SNES Function norm 1.465357113711e+01
>> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
>> PC_FAILED due to FACTOR_NUMERIC_ZEROPIVOT
>> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
>> PC_FAILED due to FACTOR_NUMERIC_ZEROPIVOT
>>
>> I guess the problem is that in the mass conservation equation I used a forward discretization for u (velocity in x), and for the momentum equation in x I used a forward discretization for p (pressure), to ensure non-zero terms on the diagonal of the matrix. I tried to run it with valgrind but it did not output anything.
>>
>> Does anyone have suggestions on how to solve this issue?
>> Best,
>> Sepideh
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From bsmith at petsc.dev Mon Mar 22 22:53:41 2021
From: bsmith at petsc.dev (Barry Smith)
Date: Mon, 22 Mar 2021 22:53:41 -0500
Subject: [petsc-users] MUMPS failure
In-Reply-To:
References:
Message-ID:

> On Mar 22, 2021, at 3:24 PM, Junchao Zhang wrote:
>
> On Mon, Mar 22, 2021 at 1:39 PM Barry Smith wrote:
>
> Version of PETSc and MUMPS? We fixed a bug in MUMPS a couple years ago that produced error messages as below. Please confirm you are using the latest PETSc and MUMPS.
>
> You can run your production version with the option -malloc_debug ; this will slow it down a bit but if there is memory corruption it may detect it and indicate the problematic error.
>
> One also has to be careful about the size of the problem passed to MUMPS since PETSc/MUMPS does not fully support using all 64 bit integers. Is it only crashing for problems near 2 billion entries in the sparse matrix?
>
> "problems near 2 billion entries"? I don't understand. Should not be an issue if building petsc with 64-bit indices.

MUMPS does not have proper support for 64 bit indices. It relies on ad hoc Fortran compiler command line options to convert integers to 64-bit integers and does not work generically.
Yes, Fortran lovers have been doing this for 30 years inside their applications but it does not really work in a library environment. But then a big feature of Fortran is "who needs libraries, we just write all the code we need" (except Eispack,Linpack,LAPACK :=-). > > > valgrind is the gold standard for detecting memory corruption. > > Barry > > >> On Mar 22, 2021, at 12:56 PM, Chris Hewson > wrote: >> >> Hi All, >> >> I have been having a problem with MUMPS randomly crashing in our program and causing the entire program to crash. I am compiling in -O2 optimization mode and using --download-mumps etc. to compile PETSc. If I rerun the program, 95%+ of the time I can't reproduce the error. It seems to be a similar issue to this thread: >> >> https://lists.mcs.anl.gov/pipermail/petsc-users/2018-October/036372.html >> >> Similar to the resolution there I am going to try and increase icntl_14 and see if that resolves the issue. Any other thoughts on this? >> >> Thanks, >> >> Chris Hewson >> Senior Reservoir Simulation Engineer >> ResFrac >> +1.587.575.9792 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From s_g at berkeley.edu Tue Mar 23 00:07:15 2021 From: s_g at berkeley.edu (Sanjay Govindjee) Date: Mon, 22 Mar 2021 22:07:15 -0700 Subject: [petsc-users] MUMPS failure In-Reply-To: References: Message-ID: Barry, I am curious about your statement "does not work generically".? If I compile with -fdefault-integer-8, I would assume that this produces objects/libraries that will use 64bit integers.? As long as I have not declared explicit kind=4 integers, what else could go wrong. -sanjay PS: I am not advocating this as a great idea, but I am curious if there or other obscure compiler level things that could go wrong. On 3/22/21 8:53 PM, Barry Smith wrote: > > >> On Mar 22, 2021, at 3:24 PM, Junchao Zhang > > wrote: >> >> >> >> >> On Mon, Mar 22, 2021 at 1:39 PM Barry Smith > > wrote: >> >> >> ? ?Version of PETSc and MUMPS? We fixed a bug in MUMPs a couple >> years ago that produced error messages as below. Please confirm >> you are using the latest PETSc and MUMPS. >> >> ? ?You can run your production version with the option >> -malloc_debug ; this will slow it down a bit but if there is >> memory corruption it may detect it and indicate the problematic >> error. >> >> ? ? One also has to be careful about the size of the problem >> passed to MUMPs since PETSc/MUMPs does not fully support using >> all 64 bit integers. Is it only crashing for problems near 2 >> billion entries in the sparse matrix? >> >> ?"problems near 2 billion entries"?? I don't understand. Should not >> be an issue if building petsc with 64-bit indices. > > ? MUMPS does not have proper support for 64 bit indices. It relies on > add-hoc Fortran compiler command line options to support to converting > integer to 64 bit integers and does not work generically. Yes, Fortran > lovers have been doing this for 30 years inside their applications but > it does not really work in a library environment. But then a big > feature of Fortran is "who needs libraries, we just write all the code > we need" (except Eispack,Linpack,LAPACK :=-). > >> >> >> ? ? ?valgrind is the gold standard for detecting memory corruption. >> >> Barry >> >> >>> On Mar 22, 2021, at 12:56 PM, Chris Hewson >> > wrote: >>> >>> Hi All, >>> >>> I have been having a problem with MUMPS randomly crashing in our >>> program and causing the entire program to crash. I am compiling >>> in -O2 optimization mode and using --download-mumps etc. 
to >>> compile PETSc. If I rerun the program, 95%+ of the time I can't >>> reproduce the error. It seems to be a similar issue to this thread: >>> >>> https://lists.mcs.anl.gov/pipermail/petsc-users/2018-October/036372.html >>> >>> >>> Similar to the resolution there I am going to try and increase >>> icntl_14 and see if that resolves the issue. Any other thoughts >>> on this? >>> >>> Thanks, >>> * >>> * >>> *Chris Hewson* >>> Senior Reservoir Simulation Engineer >>> ResFrac >>> +1.587.575.9792 >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gohardoust at gmail.com Tue Mar 23 00:23:25 2021 From: gohardoust at gmail.com (Mohammad Gohardoust) Date: Mon, 22 Mar 2021 22:23:25 -0700 Subject: [petsc-users] Code speedup after upgrading Message-ID: Hi, I am using a code which is based on petsc (and also parmetis). Recently I made the following changes and now the code is running about two times faster than before: - Upgraded Ubuntu 18.04 to 20.04 - Upgraded petsc 3.13.4 to 3.14.5 - This time I installed parmetis and metis directly via petsc by --download-parmetis --download-metis flags instead of installing them separately and using --with-parmetis-include=... and --with-parmetis-lib=... (the version of installed parmetis was 4.0.3 before) I was wondering what can possibly explain this speedup? Does anyone have any suggestions? Thanks, Mohammad -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Tue Mar 23 03:37:38 2021 From: dave.mayhem23 at gmail.com (Dave May) Date: Tue, 23 Mar 2021 09:37:38 +0100 Subject: [petsc-users] Code speedup after upgrading In-Reply-To: References: Message-ID: Nice to hear! The answer is simple, PETSc is awesome :) Jokes aside, assuming both petsc builds were configured with ?with-debugging=0, I don?t think there is a definitive answer to your question with the information you provided. It could be as simple as one specific implementation you use was improved between petsc releases. Not being an Ubuntu expert, the change might be associated with using a different compiler, and or a more efficient BLAS implementation (non threaded vs threaded). However I doubt this is the origin of your 2x performance increase. If you really want to understand where the performance improvement originated from, you?d need to send to the email list the result of -log_view from both the old and new versions, running the exact same problem. >From that info, we can see what implementations in PETSc are being used and where the time reduction is occurring. Knowing that, it should be clearer to provide an explanation for it. Thanks, Dave On Tue 23. Mar 2021 at 06:24, Mohammad Gohardoust wrote: > Hi, > > I am using a code which is based on petsc (and also parmetis). Recently I > made the following changes and now the code is running about two times > faster than before: > > - Upgraded Ubuntu 18.04 to 20.04 > - Upgraded petsc 3.13.4 to 3.14.5 > - This time I installed parmetis and metis directly via petsc by > --download-parmetis --download-metis flags instead of installing them > separately and using --with-parmetis-include=... and > --with-parmetis-lib=... (the version of installed parmetis was 4.0.3 before) > > I was wondering what can possibly explain this speedup? Does anyone have > any suggestions? > > Thanks, > Mohammad > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mfadams at lbl.gov Tue Mar 23 08:40:05 2021 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 23 Mar 2021 09:40:05 -0400 Subject: [petsc-users] what is wrong? Message-ID: I rebased over main and seem to have a p4est problem. Thanks, Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 1017515 bytes Desc: not available URL: From knepley at gmail.com Tue Mar 23 08:52:47 2021 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 23 Mar 2021 09:52:47 -0400 Subject: [petsc-users] what is wrong? In-Reply-To: References: Message-ID: You mpicc is somehow not linking the MPI/IO stuff? configure:5932: /usr/local/Cellar/mpich/3.4.1_1/bin/mpicc -o conftest -fstack-protector -fno-stack-check -Qunused-arguments -g -O0 -Wno-implicit-function-declaration -I/usr/local/Cellar/mpich/3.4.1_1/include conftest.c -lz -llapack -lblas -L/usr/local/opt/libevent/lib -L/usr/local/Cellar/open-mpi/4.1.0/lib -L/usr/local/Cellar/gcc/10.2.0_4/lib/gcc/10/gcc/x86_64-apple-darwin19/10.2.0 -L/usr/local/Cellar/gcc/10.2.0_4/lib/gcc/10/gcc/x86_64-apple-darwin19/10.2.0/../../.. -lz -llapack -lblas -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lquadmath -lm >&5 Undefined symbols for architecture x86_64: "_ADIOI_Datarep_head", referenced from: import-atom in libpmpi.dylib "_ADIOI_Datatype_iscontig", referenced from: import-atom in libpmpi.dylib "_ADIOI_Free_fn", referenced from: import-atom in libpmpi.dylib "_ADIOI_Get_byte_offset", referenced from: import-atom in libpmpi.dylib "_ADIOI_Get_eof_offset", referenced from: import-atom in libpmpi.dylib "_ADIOI_Get_position", referenced from: import-atom in libpmpi.dylib "_ADIOI_Malloc_fn", referenced from: import-atom in libpmpi.dylib "_ADIOI_Shfp_fname", referenced from: import-atom in libpmpi.dylib Matt On Tue, Mar 23, 2021 at 9:40 AM Mark Adams wrote: > I rebased over main and seem to have a p4est problem. > Thanks, > Mark > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From salazardetro1 at llnl.gov Tue Mar 23 10:54:26 2021 From: salazardetro1 at llnl.gov (Salazar De Troya, Miguel) Date: Tue, 23 Mar 2021 15:54:26 +0000 Subject: [petsc-users] Local Discontinuous Galerkin with PETSc TS In-Reply-To: <775766A0-D6D6-4007-888C-A261A139941F@petsc.dev> References: <3EE29E70-8ECF-4842-99DC-30E867769875@llnl.gov> <775766A0-D6D6-4007-888C-A261A139941F@petsc.dev> Message-ID: The calculation of p1 and p2 are done by solving an element-wise local problem using u^n. I guess I could embed this calculation inside of the calculation for G = H(p1, p2). However, I am hoping to be able to solve the problem using firedrake-ts so the formulation is all clearly in one place and in variational form. Reading the manual, Section 2.5.2 DAE formulations, the Hessenberg Index-1 DAE case seems to be what I need, although it is not clear to me how one can achieve this with an IMEX scheme. If I have: F(U', U, t) = G(t,U) p1 = f(u_x) p2 = g(u_x) u' - H(p1, p2) = 0 where U = (p1, p2, u), F(U?, U, t) = [p1, p2, u? - H(p1, p2)],] and G(t, U) = [f(u_x), g(u_x), 0], is there a solver strategy that will solve for p1 and p2 first and then use that to solve the last equation? 
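
As a minimal sketch of how this split can be registered with TS (added here purely for illustration, not taken from this thread: the callback bodies are stubs, and the packed vector U = (p1, p2, u), f, g and H all come from the actual discretization):

#include <petscts.h>

/* Stub for the implicit residual F(t, U, Udot) = [p1; p2; udot_u - H(p1, p2)] (assembly omitted). */
static PetscErrorCode FormIFunction(TS ts, PetscReal t, Vec U, Vec Udot, Vec F, void *ctx)
{
  PetscFunctionBeginUser;
  PetscFunctionReturn(0);
}

/* Stub for the explicit part G(t, U) = [f(u_x); g(u_x); 0] (assembly omitted). */
static PetscErrorCode FormRHSFunction(TS ts, PetscReal t, Vec U, Vec G, void *ctx)
{
  PetscFunctionBeginUser;
  PetscFunctionReturn(0);
}

int main(int argc, char **argv)
{
  TS             ts;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
  ierr = TSCreate(PETSC_COMM_WORLD, &ts); CHKERRQ(ierr);
  ierr = TSSetType(ts, TSARKIMEX); CHKERRQ(ierr);                          /* additive IMEX Runge-Kutta */
  ierr = TSSetIFunction(ts, NULL, FormIFunction, NULL); CHKERRQ(ierr);     /* treated implicitly */
  ierr = TSSetRHSFunction(ts, NULL, FormRHSFunction, NULL); CHKERRQ(ierr); /* treated explicitly */
  ierr = TSSetFromOptions(ts); CHKERRQ(ierr);                              /* e.g. -ts_arkimex_type 2e */
  /* TSSolve(ts, U) would go here once the packed vector U = (p1, p2, u) is set up. */
  ierr = TSDestroy(&ts); CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

With that registration, the IMEX integrators solve F(t, U, Udot) = G(t, U), taking the IFunction part implicitly and the RHSFunction part explicitly.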
The jacobian for F in this formulation would be dF/dU = [[M, 0, 0], [0, M, 0], [H'(p1), H'(p2), \sigma*M]] where M is a mass matrix, H'(p1) is the jacobian of H(p1, p2) w.r.t. p1 and H'(p2), the jacobian of H(p1, p2) w.r.t. p2. H'(p1) and H'(p2) are unnecessary for the solver strategy I want to implement. Thanks Miguel From: Barry Smith Date: Monday, March 22, 2021 at 7:42 PM To: Matthew Knepley Cc: "Salazar De Troya, Miguel" , "Jorti, Zakariae via petsc-users" Subject: Re: [petsc-users] Local Discontinuous Galerkin with PETSc TS u_t = G(u) I don't see why you won't just compute any needed u_x from the given u and then you can use any explicit or implicit TS solver trivially. For implicit methods it can automatically compute the Jacobian of G for you or you can provide it directly. Explicit methods will just use the "old" u while implicit methods will use the new. Barry On Mar 22, 2021, at 7:20 PM, Matthew Knepley > wrote: On Mon, Mar 22, 2021 at 7:53 PM Salazar De Troya, Miguel via petsc-users > wrote: Hello I am interested in implementing the LDG method in ?A local discontinuous Galerkin method for directly solving Hamilton?Jacobi equations? https://www.sciencedirect.com/science/article/pii/S0021999110005255. The equation is more or less of the form (for 1D case): p1 = f(u_x) p2 = g(u_x) u_t = H(p1, p2) where typically one solves for p1 and p2 using the previous time step solution ?u? and then plugs them into the third equation to obtain the next step solution. I am wondering if the TS infrastructure could be used to implement this solution scheme. Looking at the manual, I think one could set G(t, U) to the right-hand side in the above equations and F(t, u, u?) = 0 to the left-hand side, although the first two equations would not have time derivative. In that case, how could one take advantage of the operator split scheme I mentioned? Maybe using some block preconditioners? Hi Miguel, I have a simple-minded way of understanding these TS things. My heuristic is that you put things in F that you expect to want at u^{n+1}, and things in G that you expect to want at u^n. It is not that simple, since you could for instance move F and G to the LHS and have Backward Euler, but it is my rule of thumb. So, were you looking for an IMEX scheme? If so, which terms should be lagged? Also, from the equations above, it is hard to see why you need a solve to calculate p1/p2. It looks like just a forward application of an operator. Thanks, Matt I am trying to solve the Hamilton-Jacobi equation u_t ? H(u_x) = 0. I welcome any suggestion for better methods. Thanks Miguel Miguel A. Salazar de Troya Postdoctoral Researcher, Lawrence Livermore National Laboratory B141 Rm: 1085-5 Ph: 1(925) 422-6411 -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.huysegoms at fz-juelich.de Tue Mar 23 11:39:38 2021 From: m.huysegoms at fz-juelich.de (Marcel Huysegoms) Date: Tue, 23 Mar 2021 17:39:38 +0100 Subject: [petsc-users] Singular jocabian using SNES Message-ID: Hello everyone, I have a large system of nonlinear equations for which I'm trying to find the optimal solution. In order to get familiar with the SNES framework, I created a standalone python script (see below), which creates a set of 2D points and transforms them using an affine transformation. 
The optimizer should then "move" the points back to their original position given the jacobian and the residual vector.

Now I have 2 questions regarding the usage of SNES.

- As in my real application the jacobian often becomes singular (contains many rows of only zeros), especially as it approaches the solution. I simulate this below by setting the 10th row to zero in the fill functions. I read (https://scicomp.stackexchange.com/questions/21781/newtons-method-goes-to-zero-determinant-jacobian) that quasi-Newton approaches like BFGS might be able to deal with such a singular jacobian, but I cannot figure out a combination of solvers that converges in that case. I always get the message: Nonlinear solve did not converge due to DIVERGED_INNER iterations 0. What can I do to make the solver converge (to the least-squares minimum-length solution)? Is there a solver that can deal with such a situation? What would I need to change in the example script?

- In my real application I actually have an overdetermined MxN system. I've read in the manual that the SNES package expects a square jacobian. Is it possible to solve a system having more equations than unknowns?

Many thanks in advance,
Marcel

-----------------------------------------

import sys
import petsc4py
import numpy as np

petsc4py.init(sys.argv)
from petsc4py import PETSc


def fill_function(snes, x, f, points_original):
    x_values = x.getArray(readonly=True)
    diff_vectors = points_original.ravel() - x_values
    f_values = np.square(diff_vectors)
    # f_values[10] = 0
    f.setValues(np.arange(f_values.size), f_values)
    f.assemble()


def fill_jacobian(snes, x, J, P, points_original):
    x_values = x.getArray(readonly=True)
    points_original_flat = points_original.ravel()
    deriv_values = -2*(points_original_flat - x_values)
    # deriv_values[10] = 0
    for i in range(x_values.size):
        P.setValue(i, i, deriv_values[i])
    # print(deriv_values)
    P.assemble()


# ---------------------------------------------------------------------------------------------
if __name__ == '__main__':
    # Initialize original grid points
    grid_dim = 10
    grid_spacing = 100
    num_points = grid_dim * grid_dim
    points_original = np.zeros(shape=(num_points, 2), dtype=np.float64)
    for i in range(grid_dim):
        for j in range(grid_dim):
            points_original[i*grid_dim+j] = (i*grid_spacing, j*grid_spacing)

    # Compute transformed grid points
    affine_mat = np.array([[-0.5, -0.86, 100], [0.86, -0.5, 100]])  # createAffineMatrix(120, 1, 1, 100, 100)
    points_transformed = np.matmul(affine_mat[:2,:2], points_original.T).T + affine_mat[:2,2]

    # Initialize PETSc objects
    num_unknown = points_transformed.size
    mat_shape = (num_unknown, num_unknown)
    A = PETSc.Mat()
    A.createAIJ(size=mat_shape, comm=PETSc.COMM_WORLD)
    A.setUp()
    x, f = A.createVecs()

    options = PETSc.Options()
    options.setValue("-snes_qn_type", "lbfgs")       # broyden/lbfgs
    options.setValue("-snes_qn_scale_type", "none")  # none, diagonal, scalar, jacobian
    options.setValue("-snes_monitor", "")
    # options.setValue("-snes_view", "")
    options.setValue("-snes_converged_reason", "")
    options.setFromOptions()

    snes = PETSc.SNES()
    snes.create(PETSc.COMM_WORLD)
    snes.setType("qn")
    snes.setFunction(fill_function, f, args=(points_original,))
    snes.setJacobian(fill_jacobian, A, None, args=(points_original,))
    snes_pc = snes.getNPC()  # Inner snes instance (newtonls by default!)
    # snes_pc.setType("ngmres")
    snes.setFromOptions()

    ksp = snes_pc.getKSP()
    ksp.setType("cg")
    ksp.setTolerances(rtol=1e-10, max_it=40000)
    pc = ksp.getPC()
    pc.setType("asm")
    ksp.setFromOptions()

    x.setArray(points_transformed.ravel())
    snes.solve(None, x)

------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
Forschungszentrum Juelich GmbH
52425 Juelich
Sitz der Gesellschaft: Juelich
Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
Vorsitzender des Aufsichtsrats: MinDir Volker Rieke
Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt
------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From patrick.sanan at gmail.com Tue Mar 23 12:37:04 2021
From: patrick.sanan at gmail.com (Patrick Sanan)
Date: Tue, 23 Mar 2021 18:37:04 +0100
Subject: [petsc-users] Question about periodic conditions
In-Reply-To: <1cf6b948af3d47308e69115f1f8e543f@lanl.gov>
References: <1cf6b948af3d47308e69115f1f8e543f@lanl.gov>
Message-ID: <3CEA21BE-8019-46E0-A2A0-D3E2A887E8F8@gmail.com>

Hi Zakariae - sorry about the delay - responses inline below. I'd be curious to see your code (which you can send directly to me if you don't want to post it publicly), so I can give you more comments, as DMStag is a new component.

> Am 23.03.2021 um 00:54 schrieb Jorti, Zakariae :
>
> Hi,
>
> I implemented a PETSc code to solve Maxwell's equations for the magnetic and electric fields (B and E) in a cylinder:
> 0 < r_min <= r <= r_max; with r_max > r_min
> phi_min = 0 <= phi <= phi_max = 2*pi
> z_min <= z <= z_max; with z_max > z_min.
>
> I am using a PETSc staggered grid with the electric field E defined on edge centers and the magnetic field B defined on face centers (dof0 = 0, dof1 = 1, dof2 = 1, dof3 = 0).
>
> I have two versions of my code:
> 1 - A first version in which I set the boundary type to DM_BOUNDARY_NONE in the three directions r, phi and z
> 2 - A second version in which I set the boundary type to DM_BOUNDARY_NONE in the r and z directions, and DM_BOUNDARY_PERIODIC in the phi direction.
>
> When I print the solution vector X, which contains both E and B components, I notice that the vector is shorter with the second version compared to the first one.
> Is it normal?

Yes - with the periodic boundary conditions, there will be fewer points since there won't be the "extra" layer of faces and edges at phi = 2 * pi. If you consider a 1-d example with 1 dof on vertices and cells, with three elements, the periodic case looks like this, globally,

x ---- x ---- x ----

as opposed to the non-periodic case,

x ---- x ---- x ---- x
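
To make this concrete, here is a small standalone C sketch (grid sizes are made up for illustration and are not taken from the code discussed here) that creates a 3-D DMStag with the dof layout (0, 1, 1, 0) and periodic topology only in the second (phi) direction, then prints the global vector length; switching that boundary type to DM_BOUNDARY_NONE gives the longer vector described above.

#include <petscdmstag.h>

int main(int argc, char **argv)
{
  DM             dm;
  Vec            X;
  PetscInt       N;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
  /* Illustrative sizes: 8 x 16 x 8 elements in (r, phi, z); dof per vertex/edge/face/element = (0,1,1,0) */
  ierr = DMStagCreate3d(PETSC_COMM_WORLD,
                        DM_BOUNDARY_NONE, DM_BOUNDARY_PERIODIC, DM_BOUNDARY_NONE,
                        8, 16, 8,
                        PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE,
                        0, 1, 1, 0,
                        DMSTAG_STENCIL_BOX, 1, NULL, NULL, NULL, &dm); CHKERRQ(ierr);
  ierr = DMSetUp(dm); CHKERRQ(ierr);
  ierr = DMCreateGlobalVector(dm, &X); CHKERRQ(ierr);
  ierr = VecGetSize(X, &N); CHKERRQ(ierr);  /* shorter than with DM_BOUNDARY_NONE in the phi direction */
  ierr = PetscPrintf(PETSC_COMM_WORLD, "Global vector length: %D\n", N); CHKERRQ(ierr);
  ierr = VecDestroy(&X); CHKERRQ(ierr);
  ierr = DMDestroy(&dm); CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}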
Yes - this is the intention, since the boundary at phi = 2 * pi is represented by the same entries in the global vector. Of course, you need to make sure that your continuous problem is well-posed, which in general could change when using different boundary conditions. > Thank you. > Best regards, > > Zakariae Jorti -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Mar 23 13:04:09 2021 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 23 Mar 2021 13:04:09 -0500 Subject: [petsc-users] MUMPS failure In-Reply-To: References: Message-ID: <3B8CB18A-556C-458E-8285-56D3C522E80E@petsc.dev> In a pure Fortran code using -fdefault-integer-8 is probably fine. But MUMPS is a mixture of Fortran and C code and PETSc uses MUMPs C interface. The -fdefault-integer-8 doesn't magically fix anything in the C parts of MUMPS. I also don't know about MPI calls and if they would need editing. I am not saying it is impossible to get it to work but one needs are to insure the C portions also switch to 64 bit integers in a consistent way. This may be all doable bit is not simply using -fdefault-integer-8 on MUMPS. Barry > On Mar 23, 2021, at 12:07 AM, Sanjay Govindjee wrote: > > Barry, > I am curious about your statement "does not work generically". If I compile with -fdefault-integer-8, > I would assume that this produces objects/libraries that will use 64bit integers. As long as I have not declared > explicit kind=4 integers, what else could go wrong. > -sanjay > > PS: I am not advocating this as a great idea, but I am curious if there or other obscure compiler level things that could go wrong. > > > On 3/22/21 8:53 PM, Barry Smith wrote: >> >> >>> On Mar 22, 2021, at 3:24 PM, Junchao Zhang > wrote: >>> >>> >>> >>> >>> On Mon, Mar 22, 2021 at 1:39 PM Barry Smith > wrote: >>> >>> Version of PETSc and MUMPS? We fixed a bug in MUMPs a couple years ago that produced error messages as below. Please confirm you are using the latest PETSc and MUMPS. >>> >>> You can run your production version with the option -malloc_debug ; this will slow it down a bit but if there is memory corruption it may detect it and indicate the problematic error. >>> >>> One also has to be careful about the size of the problem passed to MUMPs since PETSc/MUMPs does not fully support using all 64 bit integers. Is it only crashing for problems near 2 billion entries in the sparse matrix? >>> "problems near 2 billion entries"? I don't understand. Should not be an issue if building petsc with 64-bit indices. >> >> MUMPS does not have proper support for 64 bit indices. It relies on add-hoc Fortran compiler command line options to support to converting integer to 64 bit integers and does not work generically. Yes, Fortran lovers have been doing this for 30 years inside their applications but it does not really work in a library environment. But then a big feature of Fortran is "who needs libraries, we just write all the code we need" (except Eispack,Linpack,LAPACK :=-). >> >>> >>> >>> valgrind is the gold standard for detecting memory corruption. >>> >>> Barry >>> >>> >>>> On Mar 22, 2021, at 12:56 PM, Chris Hewson > wrote: >>>> >>>> Hi All, >>>> >>>> I have been having a problem with MUMPS randomly crashing in our program and causing the entire program to crash. I am compiling in -O2 optimization mode and using --download-mumps etc. to compile PETSc. If I rerun the program, 95%+ of the time I can't reproduce the error. 
It seems to be a similar issue to this thread: >>>> >>>> https://lists.mcs.anl.gov/pipermail/petsc-users/2018-October/036372.html >>>> >>>> Similar to the resolution there I am going to try and increase icntl_14 and see if that resolves the issue. Any other thoughts on this? >>>> >>>> Thanks, >>>> >>>> Chris Hewson >>>> Senior Reservoir Simulation Engineer >>>> ResFrac >>>> +1.587.575.9792 >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From s_g at berkeley.edu Tue Mar 23 13:20:37 2021 From: s_g at berkeley.edu (Sanjay Govindjee) Date: Tue, 23 Mar 2021 11:20:37 -0700 Subject: [petsc-users] MUMPS failure In-Reply-To: <3B8CB18A-556C-458E-8285-56D3C522E80E@petsc.dev> References: <3B8CB18A-556C-458E-8285-56D3C522E80E@petsc.dev> Message-ID: <8dc371cc-61d8-c545-7ce9-6ae19221acdc@berkeley.edu> I agree.? If you are mixing C and Fortran, everything is /nota bene. /It is easy to miss argument mismatches. -sanjay On 3/23/21 11:04 AM, Barry Smith wrote: > > ? ?In a pure Fortran code using -fdefault-integer-8 is probably fine. > But MUMPS is a mixture of Fortran and C code and PETSc uses MUMPs C > interface. The ?-fdefault-integer-8 doesn't magically fix anything in > the C parts of MUMPS. ?I also don't know about MPI calls and if they > would need editing. > > ? ?I am not saying it is impossible to get it to work but one needs > are to insure the C portions also switch to 64 bit integers in a > consistent way. This may be all doable bit is not simply using > -fdefault-integer-8 on MUMPS. > > ? Barry > > >> On Mar 23, 2021, at 12:07 AM, Sanjay Govindjee > > wrote: >> >> Barry, >> I am curious about your statement "does not work generically".? If I >> compile with -fdefault-integer-8, >> I would assume that this produces objects/libraries that will use >> 64bit integers.? As long as I have not declared >> explicit kind=4 integers, what else could go wrong. >> -sanjay >> >> PS: I am not advocating this as a great idea, but I am curious if >> there or other obscure compiler level things that could go wrong. >> >> >> On 3/22/21 8:53 PM, Barry Smith wrote: >>> >>> >>>> On Mar 22, 2021, at 3:24 PM, Junchao Zhang >>> > wrote: >>>> >>>> >>>> >>>> >>>> On Mon, Mar 22, 2021 at 1:39 PM Barry Smith >>> > wrote: >>>> >>>> >>>> ? ?Version of PETSc and MUMPS? We fixed a bug in MUMPs a couple >>>> years ago that produced error messages as below. Please confirm >>>> you are using the latest PETSc and MUMPS. >>>> >>>> ? ?You can run your production version with the option >>>> -malloc_debug ; this will slow it down a bit but if there is >>>> memory corruption it may detect it and indicate the problematic >>>> error. >>>> >>>> ? ? One also has to be careful about the size of the problem >>>> passed to MUMPs since PETSc/MUMPs does not fully support using >>>> all 64 bit integers. Is it only crashing for problems near 2 >>>> billion entries in the sparse matrix? >>>> >>>> ?"problems near 2 billion entries"?? I don't understand. Should not >>>> be an issue if building petsc with 64-bit indices. >>> >>> ? MUMPS does not have proper support for 64 bit indices. It relies >>> on add-hoc Fortran compiler command line options to support to >>> converting integer to 64 bit integers and does not work generically. >>> Yes, Fortran lovers have been doing this for 30 years inside their >>> applications but it does not really work in a library environment. >>> But then a big feature of Fortran is "who needs libraries, we just >>> write all the code we need" (except Eispack,Linpack,LAPACK :=-). 
>>> >>>> >>>> >>>> ? ? ?valgrind is the gold standard for detecting memory >>>> corruption. >>>> >>>> Barry >>>> >>>> >>>>> On Mar 22, 2021, at 12:56 PM, Chris Hewson >>>> > wrote: >>>>> >>>>> Hi All, >>>>> >>>>> I have been having a problem with MUMPS randomly crashing in >>>>> our program and causing the entire program to crash. I am >>>>> compiling in -O2 optimization mode and using --download-mumps >>>>> etc. to compile PETSc. If I rerun the program, 95%+ of the >>>>> time I can't reproduce the error. It seems to be a similar >>>>> issue to this thread: >>>>> >>>>> https://lists.mcs.anl.gov/pipermail/petsc-users/2018-October/036372.html >>>>> >>>>> >>>>> Similar to the resolution there I am going to try and increase >>>>> icntl_14 and see if that resolves the issue. Any other >>>>> thoughts on this? >>>>> >>>>> Thanks, >>>>> * >>>>> * >>>>> *Chris Hewson* >>>>> Senior Reservoir Simulation Engineer >>>>> ResFrac >>>>> +1.587.575.9792 >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Mar 23 14:42:09 2021 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 23 Mar 2021 15:42:09 -0400 Subject: [petsc-users] Singular jocabian using SNES In-Reply-To: References: Message-ID: On Tue, Mar 23, 2021 at 12:39 PM Marcel Huysegoms wrote: > Hello everyone, > > I have a large system of nonlinear equations for which I'm trying to find > the optimal solution. > In order to get familiar with the SNES framework, I created a standalone > python script (see below), which creates a set of 2D points and transforms > them using an affine transformation. The optimizer should then "move" the > points back to their original position given the jacobian and the residual > vector. > > Now I have 2 questions regarding the usage of SNES. > > - As in my real application the jacobian often gets singular (contains > many rows of only zeros), especially when it approaches the solution. This > I simulate below by setting the 10th row equal to zero in the > fill-functions. I read ( > https://scicomp.stackexchange.com/questions/21781/newtons > method-goes-to-zero-determinant-jacobian) that quasi-newton approaches like > BFGS might be able to deal with such a singular jacobian, however I cannot > figure out a combination of solvers that converges in that case. > > I always get the message: *Nonlinear solve did not converge due to > DIVERGED_INNER iterations 0.* What can I do in order to make the solver > converge (to the least square minimum length solution)? Is there a solver > that can deal with such a situation? What do I need to change in the > example script? > > - In my real application I actually have an overdetermined MxN system. > I've read in the manual that the SNES package expects a square jacobian. Is > it possible to solve a system having more equations than unknowns? > SNES is only for solving systems of nonlinear equations. If you want optimization (least-square, etc.) then you want to formulate your problem in the TAO interface. It has quasi-Newton methods for those problems, and other methods as well. That is where I would start. 
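
As an illustrative sketch (not code from this thread) of how an overdetermined residual system can be handed to TAO's nonlinear least-squares interface, with the residual evaluation itself left as a stub and the sizes chosen arbitrarily:

#include <petsctao.h>

/* Stub: fill R (length m) with the application's residuals evaluated at X (length n). */
static PetscErrorCode EvaluateResidual(Tao tao, Vec X, Vec R, void *ctx)
{
  PetscFunctionBeginUser;
  PetscFunctionReturn(0);
}

int main(int argc, char **argv)
{
  Tao            tao;
  Vec            x, r;
  PetscInt       n = 200, m = 300;  /* illustrative: n unknowns, m > n equations */
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
  ierr = VecCreateMPI(PETSC_COMM_WORLD, PETSC_DECIDE, n, &x); CHKERRQ(ierr);
  ierr = VecCreateMPI(PETSC_COMM_WORLD, PETSC_DECIDE, m, &r); CHKERRQ(ierr);
  ierr = VecSet(x, 0.0); CHKERRQ(ierr);                        /* initial guess */

  ierr = TaoCreate(PETSC_COMM_WORLD, &tao); CHKERRQ(ierr);
  ierr = TaoSetType(tao, TAOPOUNDERS); CHKERRQ(ierr);          /* derivative-free least squares */
  ierr = TaoSetInitialVector(tao, x); CHKERRQ(ierr);
  ierr = TaoSetResidualRoutine(tao, r, EvaluateResidual, NULL); CHKERRQ(ierr);
  ierr = TaoSetFromOptions(tao); CHKERRQ(ierr);                /* e.g. -tao_monitor */
  ierr = TaoSolve(tao); CHKERRQ(ierr);

  ierr = TaoDestroy(&tao); CHKERRQ(ierr);
  ierr = VecDestroy(&x); CHKERRQ(ierr);
  ierr = VecDestroy(&r); CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

A Gauss-Newton type solver such as TAOBRGN can be used instead when the Jacobian of the residual is also provided (via TaoSetJacobianResidualRoutine()).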
Thanks, Matt > Many thanks in advance, > Marcel > > ----------------------------------------- > > import sysimport petsc4pyimport numpy as np > > petsc4py.init(sys.argv)from petsc4py import PETSc > def fill_function(snes, x, f, points_original): > x_values = x.getArray(readonly=True) > diff_vectors = points_original.ravel() - x_values > f_values = np.square(diff_vectors) > # f_values[10] = 0 f.setValues(np.arange(f_values.size), f_values) > f.assemble() > def fill_jacobian(snes, x, J, P, points_original): > x_values = x.getArray(readonly=True) > points_original_flat = points_original.ravel() > deriv_values = -2*(points_original_flat - x_values) > # deriv_values[10] = 0 for i in range(x_values.size): > P.setValue(i, i, deriv_values[i]) > # print(deriv_values) P.assemble() > # ---------------------------------------------------------------------------------------------if __name__ == '__main__': > # Initialize original grid points grid_dim = 10 grid_spacing = 100 num_points = grid_dim * grid_dim > points_original = np.zeros(shape=(num_points, 2), dtype=np.float64) > for i in range(grid_dim): > for j in range(grid_dim): > points_original[i*grid_dim+j] = (i*grid_spacing, j*grid_spacing) > > # Compute transformed grid points affine_mat = np.array([[-0.5, -0.86, 100], [0.86, -0.5, 100]]) # createAffineMatrix(120, 1, 1, 100, 100) points_transformed = np.matmul(affine_mat[:2,:2], points_original.T).T + affine_mat[:2,2] > > # Initialize PETSc objects num_unknown = points_transformed.size > mat_shape = (num_unknown, num_unknown) > A = PETSc.Mat() > A.createAIJ(size=mat_shape, comm=PETSc.COMM_WORLD) > A.setUp() > x, f = A.createVecs() > > options = PETSc.Options() > options.setValue("-snes_qn_type", "lbfgs") # broyden/lbfgs options.setValue("-snes_qn_scale_type", "none") # none, diagonal, scalar, jacobian, options.setValue("-snes_monitor", "") > # options.setValue("-snes_view", "") options.setValue("-snes_converged_reason", "") > options.setFromOptions() > > snes = PETSc.SNES() > snes.create(PETSc.COMM_WORLD) > snes.setType("qn") snes.setFunction(fill_function, f, args=(points_original,)) > snes.setJacobian(fill_jacobian, A, None, args=(points_original,)) > snes_pc = snes.getNPC() # Inner snes instance (newtonls by default!) # snes_pc.setType("ngmres") snes.setFromOptions() > > ksp = snes_pc.getKSP() > ksp.setType("cg") > ksp.setTolerances(rtol=1e-10, max_it=40000) > pc = ksp.getPC() > pc.setType("asm") > ksp.setFromOptions() > > x.setArray(points_transformed.ravel()) > snes.solve(None, x) > > > > > ------------------------------------------------------------------------------------------------ > > ------------------------------------------------------------------------------------------------ > Forschungszentrum Juelich GmbH > 52425 Juelich > Sitz der Gesellschaft: Juelich > Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 > Vorsitzender des Aufsichtsrats: MinDir Volker Rieke > Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), > Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt > > ------------------------------------------------------------------------------------------------ > > ------------------------------------------------------------------------------------------------ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/

From knepley at gmail.com Tue Mar 23 14:57:40 2021
From: knepley at gmail.com (Matthew Knepley)
Date: Tue, 23 Mar 2021 15:57:40 -0400
Subject: [petsc-users] Local Discontinuous Galerkin with PETSc TS
In-Reply-To: 
References: <3EE29E70-8ECF-4842-99DC-30E867769875@llnl.gov>
 <775766A0-D6D6-4007-888C-A261A139941F@petsc.dev>
Message-ID: 

On Tue, Mar 23, 2021 at 11:54 AM Salazar De Troya, Miguel <
salazardetro1 at llnl.gov> wrote:

> The calculation of p1 and p2 is done by solving an element-wise local
> problem using u^n. I guess I could embed this calculation inside of the
> calculation for G = H(p1, p2). However, I am hoping to be able to solve
> the problem using firedrake-ts so the formulation is all clearly in one
> place and in variational form. Reading the manual, Section 2.5.2 DAE
> formulations, the Hessenberg Index-1 DAE case seems to be what I need,
> although it is not clear to me how one can achieve this with an IMEX
> scheme. If I have:
>

I am almost certain that you do not want to do this. I am guessing the
Firedrake guys will agree. Did they tell you to do this?

If you had a large, nonlinear system for p1/p2, then a DAE would make
sense. Since it is just element-wise elimination, you should roll it into
the easy equation

  u' = H

Then you can use any integrator, as Barry says, in particular a nice
symplectic integrator. My understanding is that SLATE is for exactly this
kind of thing.

  Thanks,

     Matt
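(For concreteness, a rough petsc4py sketch of the "roll the element-wise
elimination into u_t = H(p1, p2)" idea; it is illustrative only and not
code from this thread. The choices of f, g, H, the periodic
centered-difference u_x, and the RK integrator are placeholder assumptions.)

import sys
import numpy as np
import petsc4py
petsc4py.init(sys.argv)
from petsc4py import PETSc

n = 100
dx = 1.0 / n


def rhs(ts, t, u, F):
    # u_x by centered differences on a periodic grid (placeholder discretization)
    ua = u.getArray(readonly=True)
    ux = (np.roll(ua, -1) - np.roll(ua, 1)) / (2.0 * dx)
    p1 = ux              # f(u_x), placeholder
    p2 = ux * ux         # g(u_x), placeholder
    F.setArray(p1 + p2)  # H(p1, p2), placeholder Hamiltonian
    F.assemble()


u = PETSc.Vec().createSeq(n)
u.setArray(np.sin(2.0 * np.pi * dx * np.arange(n)))

ts = PETSc.TS().create(PETSc.COMM_SELF)
ts.setType(PETSc.TS.Type.RK)     # any explicit (or implicit) TS type works here
ts.setRHSFunction(rhs, u.duplicate())
ts.setTimeStep(1.0e-3)
ts.setMaxTime(0.1)
ts.setExactFinalTime(PETSc.TS.ExactFinalTime.MATCHSTEP)
ts.setFromOptions()
ts.solve(u)

The same structure works with an implicit method by switching the TS type
on the command line, since p1 and p2 are recomputed from whatever u the
integrator hands to the RHS callback.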
> F(U', U, t) = G(t, U)
>
> p1 = f(u_x)
>
> p2 = g(u_x)
>
> u' - H(p1, p2) = 0
>
> where U = (p1, p2, u), F(U', U, t) = [p1, p2, u' - H(p1, p2)], and
> G(t, U) = [f(u_x), g(u_x), 0], is there a solver strategy that will solve
> for p1 and p2 first and then use that to solve the last equation? The
> Jacobian for F in this formulation would be
>
> dF/dU = [[M,      0,      0],
>          [0,      M,      0],
>          [H'(p1), H'(p2), \sigma*M]]
>
> where M is a mass matrix, H'(p1) is the Jacobian of H(p1, p2) w.r.t. p1,
> and H'(p2) the Jacobian of H(p1, p2) w.r.t. p2. H'(p1) and H'(p2) are
> unnecessary for the solver strategy I want to implement.
>
> Thanks
> Miguel
>
> From: Barry Smith
> Date: Monday, March 22, 2021 at 7:42 PM
> To: Matthew Knepley
> Cc: "Salazar De Troya, Miguel", "Jorti, Zakariae via petsc-users"
> Subject: Re: [petsc-users] Local Discontinuous Galerkin with PETSc TS
>
>    u_t = G(u)
>
>    I don't see why you wouldn't just compute any needed u_x from the
> given u and then you can use any explicit or implicit TS solver trivially.
> For implicit methods it can automatically compute the Jacobian of G for
> you, or you can provide it directly. Explicit methods will just use the
> "old" u while implicit methods will use the new.
>
>    Barry
>
> On Mar 22, 2021, at 7:20 PM, Matthew Knepley wrote:
>
> On Mon, Mar 22, 2021 at 7:53 PM Salazar De Troya, Miguel via petsc-users <
> petsc-users at mcs.anl.gov> wrote:
>
> Hello,
>
> I am interested in implementing the LDG method in "A local discontinuous
> Galerkin method for directly solving Hamilton-Jacobi equations"
> https://www.sciencedirect.com/science/article/pii/S0021999110005255.
> The equation is more or less of the form (for the 1D case):
>
> p1 = f(u_x)
> p2 = g(u_x)
> u_t = H(p1, p2)
>
> where typically one solves for p1 and p2 using the previous time step
> solution "u" and then plugs them into the third equation to obtain the
> next step solution. I am wondering if the TS infrastructure could be used
> to implement this solution scheme. Looking at the manual, I think one
> could set G(t, U) to the right-hand side in the above equations and
> F(t, u, u') = 0 to the left-hand side, although the first two equations
> would not have a time derivative. In that case, how could one take
> advantage of the operator split scheme I mentioned? Maybe using some
> block preconditioners?
>
> Hi Miguel,
>
> I have a simple-minded way of understanding these TS things. My heuristic
> is that you put things in F that you expect to want at u^{n+1}, and things
> in G that you expect to want at u^n. It is not that simple, since you
> could for instance move F and G to the LHS and have Backward Euler, but it
> is my rule of thumb.
>
> So, were you looking for an IMEX scheme? If so, which terms should be
> lagged? Also, from the equations above, it is hard to see why you need a
> solve to calculate p1/p2. It looks like just a forward application of an
> operator.
>
> Thanks,
>
> Matt
>
> I am trying to solve the Hamilton-Jacobi equation u_t - H(u_x) = 0. I
> welcome any suggestion for better methods.
>
> Thanks
> Miguel
>
> Miguel A. Salazar de Troya
> Postdoctoral Researcher, Lawrence Livermore National Laboratory
> B141
> Rm: 1085-5
> Ph: 1(925) 422-6411
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which
> their experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/

From skavou1 at lsu.edu Tue Mar 23 15:07:56 2021
From: skavou1 at lsu.edu (Sepideh Kavousi)
Date: Tue, 23 Mar 2021 20:07:56 +0000
Subject: [petsc-users] PF+Navier stokes
In-Reply-To: <22EEC3F3-4B97-4CCE-8B53-54409298E0FB@petsc.dev>
References: <75FC52E9-8615-4966-9BA6-85BD43C01F7B@petsc.dev>
 <22EEC3F3-4B97-4CCE-8B53-54409298E0FB@petsc.dev>
Message-ID: 

Thank you for your suggestions. In the discretization, I made sure that the
main diagonal of the matrix is non-zero, and I even chose Dirichlet boundary
conditions for pressure and velocity; therefore, I do not get a zero pivot.
But even when setting the preconditioner to LU, the residual norm is still
large. When I analyze the data, the velocity at regions close to the corners
of the solid phase becomes large. I am not sure how I could get reasonable
results for one system size and total nonsense for another.

Sepideh

29 TS dt 3.2769e-06 time 0.00689069
 0 KSP preconditioned resid norm 5.978616012547e+03 true resid norm 3.032434597885e+08 ||r(i)||/||b|| 1.000000000000e+00
copy!
Write output at step= 29!
Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 SNES Function norm 1.453900186218e+08 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 2 KSP preconditioned resid norm 4.099453531745e-02 true resid norm 2.719463080396e+06 ||r(i)||/||b|| 8.967919975234e-03 1 SNES Function norm 1.298945530538e+08 1 KSP preconditioned resid norm 1.760119119843e-01 true resid norm 2.196480894736e+06 ||r(i)||/||b|| 7.243291895784e-03 29 TS dt 3.2769e-06 time 0.00689069 copy! Write output at step= 29! Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 0 SNES Function norm 1.453900186218e+08 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 2 KSP preconditioned resid norm 4.099453531745e-02 true resid norm 2.719463080396e+06 ||r(i)||/||b|| 8.967919975234e-03 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 1 SNES Function norm 1.298945530538e+08 29 TS dt 3.2769e-06 time 0.00689069 copy! 0 KSP preconditioned resid norm 5.978616012547e+03 true resid norm 3.032434597885e+08 ||r(i)||/||b|| 1.000000000000e+00 Write output at step= 29! Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 0 SNES Function norm 1.453900186218e+08 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 5.978616012547e+03 true resid norm 3.032434597885e+08 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.760119119843e-01 true resid norm 2.196480894736e+06 ||r(i)||/||b|| 7.243291895784e-03 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 1.760119119843e-01 true resid norm 2.196480894736e+06 ||r(i)||/||b|| 7.243291895784e-03 2 KSP preconditioned resid norm 4.099453531745e-02 true resid norm 2.719463080396e+06 ||r(i)||/||b|| 8.967919975234e-03 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 1 SNES Function norm 1.298945530538e+08 29 TS dt 3.2769e-06 time 0.00689069 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 copy! Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Write output at step= 29! 
Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 5.978616012547e+03 true resid norm 3.032434597885e+08 ||r(i)||/||b|| 1.000000000000e+00 0 SNES Function norm 1.453900186218e+08 2 KSP preconditioned resid norm 4.099453531745e-02 true resid norm 2.719463080396e+06 ||r(i)||/||b|| 8.967919975234e-03 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 1 SNES Function norm 1.298945530538e+08 29 TS dt 3.2769e-06 time 0.00689069 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 copy! Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Write output at step= 29! 1 KSP preconditioned resid norm 1.760119119843e-01 true resid norm 2.196480894736e+06 ||r(i)||/||b|| 7.243291895784e-03 0 SNES Function norm 1.453900186218e+08 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 5.978616012547e+03 true resid norm 3.032434597885e+08 ||r(i)||/||b|| 1.000000000000e+00 2 KSP preconditioned resid norm 4.099453531745e-02 true resid norm 2.719463080396e+06 ||r(i)||/||b|| 8.967919975234e-03 1 SNES Function norm 1.298945530538e+08 29 TS dt 3.2769e-06 time 0.00689069 copy! Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Write output at step= 29! Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 0 SNES Function norm 1.453900186218e+08 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 1.760119119843e-01 true resid norm 2.196480894736e+06 ||r(i)||/||b|| 7.243291895784e-03 0 KSP preconditioned resid norm 5.978616012547e+03 true resid norm 3.032434597885e+08 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 2 KSP preconditioned resid norm 4.099453531745e-02 true resid norm 2.719463080396e+06 ||r(i)||/||b|| 8.967919975234e-03 1 KSP preconditioned resid norm 1.760119119843e-01 true resid norm 2.196480894736e+06 ||r(i)||/||b|| 7.243291895784e-03 1 SNES Function norm 1.298945530538e+08 ________________________________ From: Barry Smith Sent: Monday, March 22, 2021 10:48 PM To: Sepideh Kavousi Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] PF+Navier stokes Each of the individual fields "converging" does not imply the overall PCFIELDSPLIT will converge. 
This is scary, 7 KSP preconditioned resid norm 7.989463672336e-05 true resid norm 1.904266194607e+07 often these huge differences in true residual come from the operator having a null space that is not handled properly. Or they can come from an ILU that produces absurd pivots. Recommend using the same PCFIELDSPLIT but using a direct solver on all the splits (no ML no ILU). If that results in great overall convergence it means one of the sub-systems is not converging properly with ML or ILU and you need better preconditioners on some of the splits. Barry On Mar 22, 2021, at 4:05 PM, Sepideh Kavousi > wrote: I modified my BC such that on the left and right side of the interface the BC are constant value instead of Neumann(zero flux). This solves the problem but still the code has convergence problem: I even tried field split with the following order: the block size is 9. the first block is the fields related to for PF equation, the second split block is the velocities in x and y direction and the third block is pressure. -pc_type fieldsplit -pc_fieldsplit_block_size 9 -pc_fieldsplit_0_fields 0,1 (two fields related to for the Phasefield model) -pc_fieldsplit_1_fields 2,3 (velocity in x and y direction) -pc_fieldsplit_2_fields 4 (pressure) -fieldsplit_1_pc_fieldsplit_block_size 2 -fieldsplit_1_fieldsplit_0_pc_type ml (based on https://lists.mcs.anl.gov/pipermail/petsc-users/2015-February/024191.html) -fieldsplit_1_fieldsplit_1_pc_type ml (based on https://lists.mcs.anl.gov/pipermail/petsc-users/2015-February/024191.html) -fieldsplit_0_pc_type ilu (based on previous solutions of phase-field equations) -fieldsplit_2_pc_type ilu I guess changing the BCs the main reason that at first few steps the code does not fail. And as time increases, true resid norm increases such that at a finite time step (~30) it reaches 1e7 and the code results non-accurate velocity calculations. Can this also be resulted by forward/backward discritization? Best, Sepideh 34 TS dt 3.12462e-07 time 0.00709097 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 copy! copy! Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Write output at step= 34! Write output at step= 34! 
7 KSP preconditioned resid norm 7.989463672336e-05 true resid norm 1.904266194607e+07 ||r(i)||/||b|| 9.471328148409e-01 0 SNES Function norm 6.393295863037e+07 0 SNES Function norm 6.393295863037e+07 7 KSP preconditioned resid norm 7.989463672336e-05 true resid norm 1.904266194607e+07 ||r(i)||/||b|| 9.471328148409e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 8 KSP preconditioned resid norm 4.768163844493e-05 true resid norm 1.904266226755e+07 ||r(i)||/||b|| 9.471328308303e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 8 KSP preconditioned resid norm 4.768163844493e-05 true resid norm 1.904266226755e+07 ||r(i)||/||b|| 9.471328308303e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 1.502321597276e+01 true resid norm 2.464456787205e+07 ||r(i)||/||b|| 1.000000000000e+00 0 KSP preconditioned resid norm 1.502321597276e+01 true resid norm 2.464456787205e+07 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 9 KSP preconditioned resid norm 2.073486308871e-05 true resid norm 1.904266231364e+07 ||r(i)||/||b|| 9.471328331228e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 9 KSP preconditioned resid norm 2.073486308871e-05 true resid norm 1.904266231364e+07 ||r(i)||/||b|| 9.471328331228e-01 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 1 SNES Function norm 1.904272255718e+07 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 33 TS dt 1.56231e-07 time 0.00709082 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 copy! 1 SNES Function norm 1.904272255718e+07 Write output at step= 33! 33 TS dt 1.56231e-07 time 0.00709082 1 KSP preconditioned resid norm 8.755672227974e-03 true resid norm 1.938151775153e+07 ||r(i)||/||b|| 7.864417770341e-01 copy! 1 KSP preconditioned resid norm 8.755672227974e-03 true resid norm 1.938151775153e+07 ||r(i)||/||b|| 7.864417770341e-01 Write output at step= 33! 
0 SNES Function norm 2.464456787205e+07 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 0 SNES Function norm 2.464456787205e+07 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 2 KSP preconditioned resid norm 1.508190120513e-03 true resid norm 1.938016275793e+07 ||r(i)||/||b|| 7.863867956035e-01 2 KSP preconditioned resid norm 1.508190120513e-03 true resid norm 1.938016275793e+07 ||r(i)||/||b|| 7.863867956035e-01 0 KSP preconditioned resid norm 2.003930340911e+01 true resid norm 6.987120567963e+07 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 3 KSP preconditioned resid norm 1.422130457598e-03 true resid norm 1.938011909055e+07 ||r(i)||/||b|| 7.863850237166e-01 3 KSP preconditioned resid norm 1.422130457598e-03 true resid norm 1.938011909055e+07 ||r(i)||/||b|| 7.863850237166e-01 1 KSP preconditioned resid norm 1.199890501875e-02 true resid norm 1.879731143354e+07 ||r(i)||/||b|| 2.690280101896e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 4 KSP preconditioned resid norm 4.957954730602e-04 true resid norm 1.938028047800e+07 ||r(i)||/||b|| 7.863915723180e-01 4 KSP preconditioned resid norm 4.957954730602e-04 true resid norm 1.938028047800e+07 ||r(i)||/||b|| 7.863915723180e-01 2 KSP preconditioned resid norm 3.018100763012e-04 true resid norm 1.879893603977e+07 ||r(i)||/||b|| 2.690512616309e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to 
CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 5 KSP preconditioned resid norm 3.552719096462e-04 true resid norm 1.938029700510e+07 ||r(i)||/||b|| 7.863922429363e-01 5 KSP preconditioned resid norm 3.552719096462e-04 true resid norm 1.938029700510e+07 ||r(i)||/||b|| 7.863922429363e-01 3 KSP preconditioned resid norm 2.835332741838e-04 true resid norm 1.879893794065e+07 ||r(i)||/||b|| 2.690512888363e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 6 KSP preconditioned resid norm 1.508455316659e-04 true resid norm 1.938039640595e+07 ||r(i)||/||b|| 7.863962763140e-01 6 KSP preconditioned resid norm 1.508455316659e-04 true resid norm 1.938039640595e+07 ||r(i)||/||b|| 7.863962763140e-01 4 KSP preconditioned resid norm 1.860011376508e-04 true resid norm 1.879893735946e+07 ||r(i)||/||b|| 2.690512805182e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 1 SNES Function norm 1.886737547995e+07 31 TS dt 3.90578e-08 time 0.0070907 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 copy! Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Write output at step= 31! 7 KSP preconditioned resid norm 7.631525161839e-05 true resid norm 1.938041339557e+07 ||r(i)||/||b|| 7.863969657001e-01 7 KSP preconditioned resid norm 7.631525161839e-05 true resid norm 1.938041339557e+07 ||r(i)||/||b|| 7.863969657001e-01 0 SNES Function norm 1.888557765431e+07 1 SNES Function norm 1.938116032575e+07 1 SNES Function norm 1.938116032575e+07 34 TS dt 3.12462e-07 time 0.00709097 34 TS dt 3.12462e-07 time 0.00709097 copy! copy! Write output at step= 34! Write output at step= 34! 
0 SNES Function norm 6.393295863037e+07 0 SNES Function norm 6.393295863037e+07 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 1.502321597276e+01 true resid norm 2.464456787205e+07 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 8.755672227974e-03 true resid norm 1.938151775153e+07 ||r(i)||/||b|| 7.864417770341e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 2 KSP preconditioned resid norm 1.508190120513e-03 true resid norm 1.938016275793e+07 ||r(i)||/||b|| 7.863867956035e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 3 KSP preconditioned resid norm 1.422130457598e-03 true resid norm 1.938011909055e+07 ||r(i)||/||b|| 7.863850237166e-01 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 1.212273553655e+02 true resid norm 6.393295863037e+07 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 1.212273553655e+02 true resid norm 6.393295863037e+07 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 4 KSP preconditioned resid norm 4.957954730602e-04 true resid norm 1.938028047800e+07 ||r(i)||/||b|| 7.863915723180e-01 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 2.328823680294e-02 true resid norm 2.016516515602e+07 ||r(i)||/||b|| 3.154111054458e-01 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 2.328823680294e-02 true resid norm 2.016516515602e+07 ||r(i)||/||b|| 3.154111054458e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 5 KSP preconditioned resid norm 3.552719096462e-04 true resid norm 1.938029700510e+07 ||r(i)||/||b|| 7.863922429363e-01 
Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 2 KSP preconditioned resid norm 2.757087112828e-04 true resid norm 2.010380358203e+07 ||r(i)||/||b|| 3.144513254621e-01 2 KSP preconditioned resid norm 2.757087112828e-04 true resid norm 2.010380358203e+07 ||r(i)||/||b|| 3.144513254621e-01 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 1 SNES Function norm 2.011100207442e+07 1 SNES Function norm 2.011100207442e+07 35 TS dt 6.24925e-07 time 0.00709129 35 TS dt 6.24925e-07 time 0.00709129 copy! copy! 6 KSP preconditioned resid norm 1.508455316659e-04 true resid norm 1.938039640595e+07 ||r(i)||/||b|| 7.863962763140e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Write output at step= 35! Write output at step= 35! Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 SNES Function norm 2.790215258123e+08 0 SNES Function norm 2.790215258123e+08 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 1.502321597276e+01 true resid norm 2.464456787205e+07 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 7 KSP preconditioned resid norm 7.631525161839e-05 true resid norm 1.938041339557e+07 ||r(i)||/||b|| 7.863969657001e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 1.502321597276e+01 true resid norm 2.464456787205e+07 ||r(i)||/||b|| 1.000000000000e+00 1 SNES Function norm 1.938116032575e+07 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 34 TS dt 3.12462e-07 time 0.00709097 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 copy! Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Write output at step= 34! 
Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 8.755672227974e-03 true resid norm 1.938151775153e+07 ||r(i)||/||b|| 7.864417770341e-01 0 SNES Function norm 6.393295863037e+07 1 KSP preconditioned resid norm 8.755672227974e-03 true resid norm 1.938151775153e+07 ||r(i)||/||b|| 7.864417770341e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 2 KSP preconditioned resid norm 1.508190120513e-03 true resid norm 1.938016275793e+07 ||r(i)||/||b|| 7.863867956035e-01 2 KSP preconditioned resid norm 1.508190120513e-03 true resid norm 1.938016275793e+07 ||r(i)||/||b|| 7.863867956035e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 3 KSP preconditioned resid norm 1.422130457598e-03 true resid norm 1.938011909055e+07 ||r(i)||/||b|| 7.863850237166e-01 3 KSP preconditioned resid norm 1.422130457598e-03 true resid norm 1.938011909055e+07 ||r(i)||/||b|| 7.863850237166e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 4 KSP preconditioned resid norm 4.957954730602e-04 true resid norm 1.938028047800e+07 ||r(i)||/||b|| 7.863915723180e-01 4 KSP preconditioned resid norm 4.957954730602e-04 true resid norm 1.938028047800e+07 ||r(i)||/||b|| 7.863915723180e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 5 KSP preconditioned resid norm 3.552719096462e-04 true resid norm 1.938029700510e+07 ||r(i)||/||b|| 7.863922429363e-01 5 KSP preconditioned resid norm 3.552719096462e-04 true resid norm 1.938029700510e+07 ||r(i)||/||b|| 7.863922429363e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due 
to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 6 KSP preconditioned resid norm 1.508455316659e-04 true resid norm 1.938039640595e+07 ||r(i)||/||b|| 7.863962763140e-01 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 6 KSP preconditioned resid norm 1.508455316659e-04 true resid norm 1.938039640595e+07 ||r(i)||/||b|| 7.863962763140e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 1.212273553655e+02 true resid norm 6.393295863037e+07 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 1.212273553655e+02 true resid norm 6.393295863037e+07 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 1.338586074554e-01 true resid norm 1.888557765431e+07 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 7 KSP preconditioned resid norm 7.631525161839e-05 true resid norm 1.938041339557e+07 ||r(i)||/||b|| 7.863969657001e-01 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 7 KSP preconditioned resid norm 7.631525161839e-05 true resid norm 1.938041339557e+07 ||r(i)||/||b|| 7.863969657001e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 1 SNES Function norm 1.938116032575e+07 34 TS dt 3.12462e-07 time 0.00709097 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 2.328823680294e-02 true resid norm 2.016516515602e+07 ||r(i)||/||b|| 3.154111054458e-01 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 copy! 1 KSP preconditioned resid norm 2.328823680294e-02 true resid norm 2.016516515602e+07 ||r(i)||/||b|| 3.154111054458e-01 1 SNES Function norm 1.938116032575e+07 34 TS dt 3.12462e-07 time 0.00709097 Write output at step= 34! copy! Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 5.451259747447e-03 true resid norm 1.887927947148e+07 ||r(i)||/||b|| 9.996665083300e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Write output at step= 34! 
Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 SNES Function norm 6.393295863037e+07 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 0 SNES Function norm 6.393295863037e+07 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 2 KSP preconditioned resid norm 2.757087112828e-04 true resid norm 2.010380358203e+07 ||r(i)||/||b|| 3.144513254621e-01 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 2 KSP preconditioned resid norm 2.757087112828e-04 true resid norm 2.010380358203e+07 ||r(i)||/||b|| 3.144513254621e-01 1 SNES Function norm 2.011100207442e+07 35 TS dt 6.24925e-07 time 0.00709129 1 SNES Function norm 2.011100207442e+07 copy! 2 KSP preconditioned resid norm 9.554577345960e-04 true resid norm 1.887930135577e+07 ||r(i)||/||b|| 9.996676671129e-01 35 TS dt 6.24925e-07 time 0.00709129 copy! Write output at step= 35! Write output at step= 35! Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 0 SNES Function norm 2.790215258123e+08 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 SNES Function norm 2.790215258123e+08 3 KSP preconditioned resid norm 9.378991224281e-04 true resid norm 1.887930134907e+07 ||r(i)||/||b|| 9.996676667583e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 4 KSP preconditioned resid norm 3.652611805745e-04 true resid norm 1.887930205974e+07 ||r(i)||/||b|| 9.996677043885e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 5 KSP preconditioned resid norm 2.918222127367e-04 true resid norm 1.887930204569e+07 ||r(i)||/||b|| 9.996677036447e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 6 KSP preconditioned resid norm 6.114488674627e-05 true resid norm 1.887930243837e+07 ||r(i)||/||b|| 9.996677244370e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 7 KSP preconditioned resid norm 3.763532951474e-05 true resid norm 1.887930248279e+07 ||r(i)||/||b|| 9.996677267895e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 8 KSP preconditioned resid norm 2.112644035802e-05 true resid norm 1.887930251181e+07 ||r(i)||/||b|| 9.996677283257e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged 
due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 9 KSP preconditioned resid norm 1.113068460252e-05 true resid norm 1.887930250969e+07 ||r(i)||/||b|| 9.996677282137e-01 0 KSP preconditioned resid norm 1.143120146053e+03 true resid norm 2.790215258123e+08 ||r(i)||/||b|| 1.000000000000e+00 0 KSP preconditioned resid norm 1.143120146053e+03 true resid norm 2.790215258123e+08 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 10 KSP preconditioned resid norm 1.352518287887e-06 true resid norm 1.887930250333e+07 ||r(i)||/||b|| 9.996677278767e-01 1 KSP preconditioned resid norm 1.086405760770e-02 true resid norm 2.165319343001e+07 ||r(i)||/||b|| 7.760402487575e-02 1 KSP preconditioned resid norm 1.086405760770e-02 true resid norm 2.165319343001e+07 ||r(i)||/||b|| 7.760402487575e-02 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 1 SNES Function norm 3.485261902880e+07 1 SNES Function norm 3.485261902880e+07 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 36 TS dt 1.24985e-06 time 0.00709191 36 TS dt 1.24985e-06 time 0.00709191 copy! copy! 0 KSP preconditioned resid norm 1.212273553655e+02 true resid norm 6.393295863037e+07 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Write output at step= 36! Write output at step= 36! Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 0 SNES Function norm 4.684646000717e+08 0 SNES Function norm 4.684646000717e+08 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 11 KSP preconditioned resid norm 7.434707372444e-07 true resid norm 1.887930250410e+07 ||r(i)||/||b|| 9.996677279175e-01 1 SNES Function norm 1.887938190335e+07 1 KSP preconditioned resid norm 2.328823680294e-02 true resid norm 2.016516515602e+07 ||r(i)||/||b|| 3.154111054458e-01 32 TS dt 7.81156e-08 time 0.00709074 copy! Write output at step= 32! Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 SNES Function norm 2.010558777785e+07 2 KSP preconditioned resid norm 2.757087112828e-04 true resid norm 2.010380358203e+07 ||r(i)||/||b|| 3.144513254621e-01 1 SNES Function norm 2.011100207442e+07 35 TS dt 6.24925e-07 time 0.00709129 copy! 
Write output at step= 35! 0 SNES Function norm 2.790215258123e+08 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 1.212273553655e+02 true resid norm 6.393295863037e+07 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 1.212273553655e+02 true resid norm 6.393295863037e+07 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 2.328823680294e-02 true resid norm 2.016516515602e+07 ||r(i)||/||b|| 3.154111054458e-01 0 KSP preconditioned resid norm 1.143120146053e+03 true resid norm 2.790215258123e+08 ||r(i)||/||b|| 1.000000000000e+00 0 KSP preconditioned resid norm 1.143120146053e+03 true resid norm 2.790215258123e+08 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 2.328823680294e-02 true resid norm 2.016516515602e+07 ||r(i)||/||b|| 3.154111054458e-01 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 2 KSP preconditioned resid norm 2.757087112828e-04 true resid norm 2.010380358203e+07 ||r(i)||/||b|| 3.144513254621e-01 1 KSP preconditioned resid norm 1.086405760770e-02 true resid norm 2.165319343001e+07 ||r(i)||/||b|| 7.760402487575e-02 1 KSP preconditioned resid norm 1.086405760770e-02 true resid norm 2.165319343001e+07 ||r(i)||/||b|| 7.760402487575e-02 2 KSP preconditioned resid norm 2.757087112828e-04 true resid norm 2.010380358203e+07 ||r(i)||/||b|| 3.144513254621e-01 1 SNES Function norm 2.011100207442e+07 1 SNES Function norm 3.485261902880e+07 36 TS dt 1.24985e-06 time 0.00709191 35 TS dt 6.24925e-07 time 0.00709129 
copy! 1 SNES Function norm 3.485261902880e+07 copy! 36 TS dt 1.24985e-06 time 0.00709191 1 SNES Function norm 2.011100207442e+07 copy! 35 TS dt 6.24925e-07 time 0.00709129 Write output at step= 35! Write output at step= 36! copy! Write output at step= 36! Write output at step= 35! 0 SNES Function norm 2.790215258123e+08 0 SNES Function norm 4.684646000717e+08 0 SNES Function norm 4.684646000717e+08 0 SNES Function norm 2.790215258123e+08 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 4.028400414901e+03 true resid norm 4.684646000717e+08 ||r(i)||/||b|| 1.000000000000e+00 0 KSP preconditioned resid norm 4.028400414901e+03 true resid norm 4.684646000717e+08 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 7.513114811690e-02 true resid norm 2.515581080270e+07 ||r(i)||/||b|| 5.369842416878e-02 1 KSP preconditioned resid norm 7.513114811690e-02 true resid norm 2.515581080270e+07 ||r(i)||/||b|| 5.369842416878e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 2 KSP preconditioned resid norm 3.361950980934e-02 true resid norm 2.589885319795e+07 ||r(i)||/||b|| 5.528454699457e-02 2 KSP preconditioned resid norm 3.361950980934e-02 true resid norm 2.589885319795e+07 ||r(i)||/||b|| 5.528454699457e-02 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 1 SNES Function norm 5.194818889106e+07 1 SNES Function norm 5.194818889106e+07 37 TS dt 1.4811e-06 time 0.00709316 37 TS dt 1.4811e-06 time 0.00709316 copy! copy! 0 KSP preconditioned resid norm 1.143120146053e+03 true resid norm 2.790215258123e+08 ||r(i)||/||b|| 1.000000000000e+00 Write output at step= 37! 0 KSP preconditioned resid norm 3.165458473526e+00 true resid norm 2.010558777785e+07 ||r(i)||/||b|| 1.000000000000e+00 Write output at step= 37! 
[The run log continues with -ts_monitor / -snes_monitor / -ksp_monitor_true_residual
output printed by both MPI ranks, interleaved and duplicated in the archive; it is
condensed here. Time steps 33 through 40 complete:

 33 TS dt 1.56231e-07 time 0.00709082
 34 TS dt 3.12462e-07 time 0.00709097
 35 TS dt 6.24925e-07 time 0.00709129
 36 TS dt 1.24985e-06 time 0.00709191
 37 TS dt 1.4811e-06 time 0.00709316
 38 TS dt 2.09926e-06 time 0.00709464
 39 TS dt 3.42747e-06 time 0.00709674
 40 TS dt 6.85494e-06 time 0.00710017

each followed by "copy!" and "Write output at step= N!". At every outer KSP iteration
each split reports "Linear fieldsplit_{0,1,2}_ solve converged due to CONVERGED_ITS
iterations 1". In several of the outer solves the preconditioned residual drops by many
orders of magnitude while the true residual stagnates, e.g.

    9 KSP preconditioned resid norm 2.073486308871e-05 true resid norm 1.904266231364e+07 ||r(i)||/||b|| 9.471328331228e-01

and the SNES function norms grow over these steps. A representative later stretch of the
log, de-interleaved:]

    0 SNES Function norm 6.081085806316e+09
      0 KSP preconditioned resid norm 1.932453778489e+05 true resid norm 6.081085806316e+09 ||r(i)||/||b|| 1.000000000000e+00
      1 KSP preconditioned resid norm 1.434243067909e+00 true resid norm 7.171981611988e+07 ||r(i)||/||b|| 1.179391615316e-02
    1 SNES Function norm 1.906687815261e+08
 40 TS dt 6.85494e-06 time 0.00710017
 copy!
 Write output at step= 40!
    0 SNES Function norm 2.386070312612e+09
      0 KSP preconditioned resid norm 1.698771295747e+05 true resid norm 2.386070312612e+09 ||r(i)||/||b|| 1.000000000000e+00
      1 KSP preconditioned resid norm 5.030869192505e+00 true resid norm 1.519850942875e+08 ||r(i)||/||b|| 6.369682128989e-02
      2 KSP preconditioned resid norm 3.859648344440e+00 true resid norm 9.578310406856e+07 ||r(i)||/||b|| 4.014261589958e-02
      3 KSP preconditioned resid norm 3.133252984070e+00 true resid norm 7.649343929054e+07 ||r(i)||/||b|| 3.205833410953e-02
      4 KSP preconditioned resid norm 2.753196206679e+00 true resid norm 6.468496966917e+07 ||r(i)||/||b|| 2.710941472566e-02
      5 KSP preconditioned resid norm 2.500754637621e+00 true resid norm 5.655851544362e+07 ||r(i)||/||b|| 2.370362480296e-02
      6 KSP preconditioned resid norm 2.263885927226e+00 true resid norm 5.368792420993e+07 ||r(i)||/||b|| 2.250056250487e-02
      7 KSP preconditioned resid norm 2.104083158879e+00 true resid norm 5.002745241660e+07 ||r(i)||/||b|| 2.096646194883e-02
      8 KSP preconditioned resid norm 1.976241956139e+00 true resid norm 4.707835526096e+07 ||r(i)||/||b|| 1.973049788689e-02
      9 KSP preconditioned resid norm 1.870078692623e+00 true resid norm 4.468270947214e+07 ||r(i)||/||b|| 1.872648481311e-02
     10 KSP preconditioned resid norm 1.779870885679e+00 true resid norm 4.269989357798e+07 ||r(i)||/||b|| 1.789548839038e-02
     11 KSP preconditioned resid norm 1.701919459227e+00 true resid norm 4.102844905926e+07 ||r(i)||/||b|| 1.719498744124e-02
     12 KSP preconditioned resid norm 1.633671715659e+00 true resid norm 3.959508749664e+07 ||r(i)||/||b|| 1.659426685264e-02
    1 SNES Function norm 6.245293405655e+10
    0 SNES Function norm 2.097192603535e+10
      0 KSP preconditioned resid norm 4.670669894784e+05 true resid norm 2.097192603535e+10 ||r(i)||/||b|| 1.000000000000e+00
      1 KSP preconditioned resid norm 2.135766802417e+00 true resid norm 8.990660342789e+07 ||r(i)||/||b|| 4.286997926482e-03
    1 SNES Function norm 7.093938024894e+08

________________________________
From: Barry Smith
Sent: Monday, March 22, 2021 1:56 PM
To: Sepideh Kavousi
Cc: petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] PF+Navier stokes

   Singular systems come up in solving PDEs almost always due to issues related
to boundary conditions. For example, all Neumann (natural) boundary conditions
can produce singular systems. Direct factorizations generically will eventually
hit a zero pivot in such cases, and there is no universally accepted approach
for what to do at that point to recover. If you think your operator is singular
you should start by using MatSetNullSpace(); it won't "cure" the problem, but it
is the tool we use to manage null spaces in operators.

On Mar 22, 2021, at 9:04 AM, Sepideh Kavousi wrote:

Hello,
I want to solve PF solidification + Navier-Stokes using the finite difference
method, and I have a strange problem. My code runs fine for some system sizes
and fails for others. When I run with the following options:

mpirun -np 2 ./one.out -ts_monitor -snes_fd_color -ts_max_snes_failures -1
  -ts_type bdf -ts_bdf_adapt -pc_type bjacobi -snes_linesearch_type l2
  -snes_type ksponly -ksp_type gmres -ksp_gmres_restart 1001 -sub_pc_type ilu
  -sub_ksp_type preonly -snes_monitor -ksp_monitor -snes_linesearch_monitor
  -ksp_monitor_true_residual -ksp_converged_reason -log_view

 0 SNES Function norm 1.465357113711e+01
 0 SNES Function norm 1.465357113711e+01
Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
               PC_FAILED due to SUBPC_ERROR
Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
               PC_FAILED due to SUBPC_ERROR
 0 SNES Function norm 1.465357113711e+01
 0 SNES Function norm 1.465357113711e+01
Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
               PC_FAILED due to SUBPC_ERROR
Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
               PC_FAILED due to SUBPC_ERROR
 0 SNES Function norm 1.465357113711e+01
 0 SNES Function norm 1.465357113711e+01
^C
Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
               PC_FAILED due to SUBPC_ERROR
 0 SNES Function norm 1.465357113711e+01
Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
               PC_FAILED due to SUBPC_ERROR
 0 SNES Function norm 1.465357113711e+01

Even setting pc_type to LU does not solve the problem:

0 TS dt 0.0001 time 0.
copy!
copy!
Write output at step= 0!
Write output at step= 0!
 0 SNES Function norm 1.465357113711e+01
 0 SNES Function norm 1.465357113711e+01
Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
               PC_FAILED due to FACTOR_NUMERIC_ZEROPIVOT
Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
               PC_FAILED due to FACTOR_NUMERIC_ZEROPIVOT

I guess the problem is that in the mass conservation equation I used forward
discretization for u (the velocity in x), and in the x-momentum equation I used
forward discretization for p (the pressure) to ensure non-zero terms on the
diagonal of the matrix. I tried to run it with valgrind, but it did not report
anything. Does anyone have suggestions on how to solve this issue?
Best,
Sepideh
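(A minimal sketch of the MatSetNullSpace() usage Barry refers to, for the common
case where an all-Neumann boundary condition leaves a constant null space in the
operator; "J" is a placeholder name for the assembled Jacobian/operator, not code
taken from this thread:)

    Mat          J;        /* the (possibly singular) operator handed to KSP */
    MatNullSpace nullsp;
    PetscBool    isNull;

    /* PETSC_TRUE: the null space is spanned by the constant vector */
    MatNullSpaceCreate(PetscObjectComm((PetscObject) J), PETSC_TRUE, 0, NULL, &nullsp);
    /* optional sanity check: does J really annihilate the claimed null space? */
    MatNullSpaceTest(nullsp, J, &isNull);
    /* tell the solvers the operator is singular with this null space */
    MatSetNullSpace(J, nullsp);
    MatNullSpaceDestroy(&nullsp);

For a multi-field problem like this one the null space usually lives in only one
block (e.g. the pressure block), in which case the same two calls would be applied
to that block's sub-matrix rather than to the full Jacobian.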
From knepley at gmail.com  Tue Mar 23 15:20:30 2021
From: knepley at gmail.com (Matthew Knepley)
Date: Tue, 23 Mar 2021 16:20:30 -0400
Subject: [petsc-users] PF+Navier stokes
In-Reply-To:
References: <75FC52E9-8615-4966-9BA6-85BD43C01F7B@petsc.dev>
	<22EEC3F3-4B97-4CCE-8B53-54409298E0FB@petsc.dev>
Message-ID:

On Tue, Mar 23, 2021 at 4:08 PM Sepideh Kavousi wrote:

> Thank you for your suggestions. In the discretization I made sure that the
> main diagonal of the matrix is non-zero, and I even chose Dirichlet
> boundary conditions for pressure and velocity, so I do not get a zero
> pivot. But even when I set the preconditioner to LU, the resid norm is
> still large:

It does not sound like this problem is well-posed. If these linear systems do
not have solutions, how do you know your problem has a solution?

  Thanks,

     Matt

> When I analyze the data, the velocity at regions close to the corners of
> the solid phase becomes large. I am not sure how I can get reasonable
> results for one system size and total nonsense for another.
> Sepideh
>
> 29 TS dt 3.2769e-06 time 0.00689069
> copy!
> Write output at step= 29!
> 0 SNES Function norm 1.453900186218e+08
> Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1
> Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1
> Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1
> 0 KSP preconditioned resid norm 5.978616012547e+03 true resid norm 3.032434597885e+08 ||r(i)||/||b|| 1.000000000000e+00
> Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1
> Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1
> Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1
> 1 KSP preconditioned resid norm 1.760119119843e-01 true resid norm 2.196480894736e+06 ||r(i)||/||b|| 7.243291895784e-03
> Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1
> Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1
> Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1
> 2 KSP preconditioned resid norm 4.099453531745e-02 true resid norm 2.719463080396e+06 ||r(i)||/||b|| 8.967919975234e-03
> 1 SNES Function norm 1.298945530538e+08
>
> ------------------------------
> *From:* Barry Smith
> *Sent:* Monday, March 22, 2021 10:48 PM
> *To:* Sepideh Kavousi
> *Cc:* petsc-users at mcs.anl.gov
> *Subject:* Re: [petsc-users] PF+Navier stokes
>
> Each of the individual fields "converging" does not imply that the overall
> PCFIELDSPLIT will converge.
>
> This is scary: "7 KSP preconditioned resid norm 7.989463672336e-05 true
> resid norm 1.904266194607e+07". Often such huge differences in the true
> residual come from the operator having a null space that is not handled
> properly, or from an ILU that produces absurd pivots.
>
> I recommend using the same PCFIELDSPLIT but with a direct solver on all
> the splits (no ML, no ILU). If that gives good overall convergence, it
> means one of the sub-systems is not converging properly with ML or ILU and
> you need better preconditioners on some of the splits.
>
> Barry
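With the split prefixes Sepideh lists below, Barry's suggestion of a direct
solver on every split would amount to roughly the following options (a sketch
assembled from those prefixes, not taken from Barry's message; in parallel,
LU on a split may additionally need an external factorization package, e.g.
-fieldsplit_0_pc_factor_mat_solver_type mumps, if one is available):

-fieldsplit_0_pc_type lu
-fieldsplit_1_fieldsplit_0_pc_type lu
-fieldsplit_1_fieldsplit_1_pc_type lu
-fieldsplit_2_pc_type lu

(or simply -fieldsplit_1_pc_type lu if the nested split inside the velocity
block is dropped for the test).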
> On Mar 22, 2021, at 4:05 PM, Sepideh Kavousi wrote:
>
> I modified my BC such that on the left and right sides of the interface
> the BC is a constant value instead of Neumann (zero flux). This solves the
> problem, but the code still has convergence problems. I even tried a field
> split with the following ordering: the block size is 9; the first split is
> the fields of the phase-field equation, the second split is the velocities
> in the x and y directions, and the third split is the pressure.
>
> -pc_type fieldsplit
> -pc_fieldsplit_block_size 9
> -pc_fieldsplit_0_fields 0,1 (the two fields of the phase-field model)
> -pc_fieldsplit_1_fields 2,3 (velocity in the x and y directions)
> -pc_fieldsplit_2_fields 4 (pressure)
> -fieldsplit_1_pc_fieldsplit_block_size 2
> -fieldsplit_1_fieldsplit_0_pc_type ml (based on
> https://lists.mcs.anl.gov/pipermail/petsc-users/2015-February/024191.html)
> -fieldsplit_1_fieldsplit_1_pc_type ml (based on
> https://lists.mcs.anl.gov/pipermail/petsc-users/2015-February/024191.html)
> -fieldsplit_0_pc_type ilu (based on previous solutions of the phase-field
> equations)
> -fieldsplit_2_pc_type ilu
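The same top-level splits can also be set up in code rather than through
options; the sketch below is only illustrative (it assumes the SNES comes
from TSGetSNES(), mirrors the component numbering in the options above, and
is not Sepideh's actual code; the nested split inside block 1 and the
sub-preconditioners are still left to the command line):

#include <petscsnes.h>

/* Sketch: configure the 9-component field split described above on an
   existing SNES. */
static PetscErrorCode ConfigureFieldSplit(SNES snes)
{
  KSP            ksp;
  PC             pc;
  const PetscInt pf[]  = {0,1};   /* phase-field components       */
  const PetscInt vel[] = {2,3};   /* x- and y-velocity components */
  const PetscInt prs[] = {4};     /* pressure component           */
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = SNESGetKSP(snes,&ksp);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCFIELDSPLIT);CHKERRQ(ierr);
  ierr = PCFieldSplitSetBlockSize(pc,9);CHKERRQ(ierr);
  ierr = PCFieldSplitSetFields(pc,"0",2,pf,pf);CHKERRQ(ierr);
  ierr = PCFieldSplitSetFields(pc,"1",2,vel,vel);CHKERRQ(ierr);
  ierr = PCFieldSplitSetFields(pc,"2",1,prs,prs);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}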
> I guess changing the BCs is the main reason the code does not fail during
> the first few steps. As time increases, the true resid norm grows, and at
> a certain time step (~30) it reaches 1e7 and the code produces inaccurate
> velocity calculations. Could this also be caused by the forward/backward
> discretization?
> Best,
> Sepideh
>
> 1 SNES Function norm 1.886737547995e+07
> 31 TS dt 3.90578e-08 time 0.0070907
> copy!
> Write output at step= 31!
> 0 SNES Function norm 1.888557765431e+07
> 0 KSP preconditioned resid norm 1.338586074554e-01 true resid norm 1.888557765431e+07 ||r(i)||/||b|| 1.000000000000e+00
> 1 KSP preconditioned resid norm 5.451259747447e-03 true resid norm 1.887927947148e+07 ||r(i)||/||b|| 9.996665083300e-01
> 2 KSP preconditioned resid norm 9.554577345960e-04 true resid norm 1.887930135577e+07 ||r(i)||/||b|| 9.996676671129e-01
> 3 KSP preconditioned resid norm 9.378991224281e-04 true resid norm 1.887930134907e+07 ||r(i)||/||b|| 9.996676667583e-01
> 4 KSP preconditioned resid norm 3.652611805745e-04 true resid norm 1.887930205974e+07 ||r(i)||/||b|| 9.996677043885e-01
> 5 KSP preconditioned resid norm 2.918222127367e-04 true resid norm 1.887930204569e+07 ||r(i)||/||b|| 9.996677036447e-01
> 6 KSP preconditioned resid norm 6.114488674627e-05 true resid norm 1.887930243837e+07 ||r(i)||/||b|| 9.996677244370e-01
> 7 KSP preconditioned resid norm 3.763532951474e-05 true resid norm 1.887930248279e+07 ||r(i)||/||b|| 9.996677267895e-01
> 8 KSP preconditioned resid norm 2.112644035802e-05 true resid norm 1.887930251181e+07 ||r(i)||/||b|| 9.996677283257e-01
> 9 KSP preconditioned resid norm 1.113068460252e-05 true resid norm 1.887930250969e+07 ||r(i)||/||b|| 9.996677282137e-01
> 10 KSP preconditioned resid norm 1.352518287887e-06 true resid norm 1.887930250333e+07 ||r(i)||/||b|| 9.996677278767e-01
> 11 KSP preconditioned resid norm 7.434707372444e-07 true resid norm 1.887930250410e+07 ||r(i)||/||b|| 9.996677279175e-01
> 1 SNES Function norm 1.887938190335e+07
> 32 TS dt 7.81156e-08 time 0.00709074
> copy!
> Write output at step= 32!
> 0 SNES Function norm 2.010558777785e+07
> 0 KSP preconditioned resid norm 3.165458473526e+00 true resid norm 2.010558777785e+07 ||r(i)||/||b|| 1.000000000000e+00
> 1 KSP preconditioned resid norm 3.655364946441e-03 true resid norm 1.904269379864e+07 ||r(i)||/||b|| 9.471343991057e-01
> 2 KSP preconditioned resid norm 2.207564350060e-03 true resid norm 1.904265942845e+07 ||r(i)||/||b|| 9.471326896210e-01
> 3 KSP preconditioned resid norm 2.188447918524e-03 true resid norm 1.904266151317e+07 ||r(i)||/||b|| 9.471327933098e-01
> 4 KSP preconditioned resid norm 7.425314556150e-04 true resid norm 1.904265807404e+07 ||r(i)||/||b|| 9.471326222560e-01
> 5 KSP preconditioned resid norm 6.052794097111e-04 true resid norm 1.904265841692e+07 ||r(i)||/||b|| 9.471326393103e-01
> 6 KSP preconditioned resid norm 1.251197915617e-04 true resid norm 1.904266159346e+07 ||r(i)||/||b|| 9.471327973028e-01
> 7 KSP preconditioned resid norm 7.989463672336e-05 true resid norm 1.904266194607e+07 ||r(i)||/||b|| 9.471328148409e-01
> 8 KSP preconditioned resid norm 4.768163844493e-05 true resid norm 1.904266226755e+07 ||r(i)||/||b|| 9.471328308303e-01
> 9 KSP preconditioned resid norm 2.073486308871e-05 true resid norm 1.904266231364e+07 ||r(i)||/||b|| 9.471328331228e-01
> 1 SNES Function norm 1.904272255718e+07
> 33 TS dt 1.56231e-07 time 0.00709082
> copy!
> Write output at step= 33!
> 0 SNES Function norm 2.464456787205e+07
> 0 KSP preconditioned resid norm 1.502321597276e+01 true resid norm 2.464456787205e+07 ||r(i)||/||b|| 1.000000000000e+00
> 1 KSP preconditioned resid norm 8.755672227974e-03 true resid norm 1.938151775153e+07 ||r(i)||/||b|| 7.864417770341e-01
> 2 KSP preconditioned resid norm 1.508190120513e-03 true resid norm 1.938016275793e+07 ||r(i)||/||b|| 7.863867956035e-01
> 3 KSP preconditioned resid norm 1.422130457598e-03 true resid norm 1.938011909055e+07 ||r(i)||/||b|| 7.863850237166e-01
> 4 KSP preconditioned resid norm 4.957954730602e-04 true resid norm 1.938028047800e+07 ||r(i)||/||b|| 7.863915723180e-01
> 5 KSP preconditioned resid norm 3.552719096462e-04 true resid norm 1.938029700510e+07 ||r(i)||/||b|| 7.863922429363e-01
> 6 KSP preconditioned resid norm 1.508455316659e-04 true resid norm 1.938039640595e+07 ||r(i)||/||b|| 7.863962763140e-01
> 7 KSP preconditioned resid norm 7.631525161839e-05 true resid norm 1.938041339557e+07 ||r(i)||/||b|| 7.863969657001e-01
> 1 SNES Function norm 1.938116032575e+07
> 34 TS dt 3.12462e-07 time 0.00709097
> copy!
> Write output at step= 34!
> 0 SNES Function norm 6.393295863037e+07
> 0 KSP preconditioned resid norm 1.212273553655e+02 true resid norm 6.393295863037e+07 ||r(i)||/||b|| 1.000000000000e+00
> 1 KSP preconditioned resid norm 2.328823680294e-02 true resid norm 2.016516515602e+07 ||r(i)||/||b|| 3.154111054458e-01
> 2 KSP preconditioned resid norm 2.757087112828e-04 true resid norm 2.010380358203e+07 ||r(i)||/||b|| 3.144513254621e-01
> 1 SNES Function norm 2.011100207442e+07
> 35 TS dt 6.24925e-07 time 0.00709129
> copy!
> Write output at step= 35!
> 0 SNES Function norm 2.790215258123e+08
> 0 KSP preconditioned resid norm 1.143120146053e+03 true resid norm 2.790215258123e+08 ||r(i)||/||b|| 1.000000000000e+00
> 1 KSP preconditioned resid norm 1.086405760770e-02 true resid norm 2.165319343001e+07 ||r(i)||/||b|| 7.760402487575e-02
> 1 SNES Function norm 3.485261902880e+07
> 36 TS dt 1.24985e-06 time 0.00709191
> copy!
> Write output at step= 36!
> 0 SNES Function norm 4.684646000717e+08
> 0 KSP preconditioned resid norm 4.028400414901e+03 true resid norm 4.684646000717e+08 ||r(i)||/||b|| 1.000000000000e+00
> 1 KSP preconditioned resid norm 7.513114811690e-02 true resid norm 2.515581080270e+07 ||r(i)||/||b|| 5.369842416878e-02
> 2 KSP preconditioned resid norm 3.361950980934e-02 true resid norm 2.589885319795e+07 ||r(i)||/||b|| 5.528454699457e-02
> 1 SNES Function norm 5.194818889106e+07
> 37 TS dt 1.4811e-06 time 0.00709316
> copy!
> Write output at step= 37!
> 0 SNES Function norm 3.786081856480e+09
> 0 KSP preconditioned resid norm 4.626907080947e+04 true resid norm 3.786081856480e+09 ||r(i)||/||b|| 1.000000000000e+00
> 1 KSP preconditioned resid norm 1.411143979706e-01 true resid norm 3.252255320579e+07 ||r(i)||/||b|| 8.590029069268e-03
> 1 SNES Function norm 1.679071798524e+08
> 38 TS dt 2.09926e-06 time 0.00709464
> copy!
> Write output at step= 38!
> 0 SNES Function norm 4.969343279719e+09
> 0 KSP preconditioned resid norm 8.865880769163e+04 true resid norm 4.969343279719e+09 ||r(i)||/||b|| 1.000000000000e+00
> 1 KSP preconditioned resid norm 4.636177252582e-01 true resid norm 4.472959395106e+07 ||r(i)||/||b|| 9.001107678275e-03
> 1 SNES Function norm 1.426574249724e+08
> 39 TS dt 3.42747e-06 time 0.00709674
> copy!
> Write output at step= 39!
> 0 SNES Function norm 6.081085806316e+09
> 0 SNES Function norm 4.969343279719e+09 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 0 KSP preconditioned resid norm 8.865880769163e+04 true resid norm > 4.969343279719e+09 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > 0 KSP preconditioned resid norm 8.865880769163e+04 true resid norm > 4.969343279719e+09 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > 0 KSP preconditioned resid norm 4.626907080947e+04 true resid norm > 3.786081856480e+09 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 1 KSP preconditioned resid norm 4.636177252582e-01 true resid norm > 4.472959395106e+07 ||r(i)||/||b|| 9.001107678275e-03 > 1 KSP preconditioned resid norm 4.636177252582e-01 true resid norm > 4.472959395106e+07 ||r(i)||/||b|| 9.001107678275e-03 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > 0 KSP preconditioned resid norm 4.626907080947e+04 true resid norm > 3.786081856480e+09 ||r(i)||/||b|| 1.000000000000e+00 > 1 SNES Function norm 1.426574249724e+08 > 39 TS dt 3.42747e-06 time 0.00709674 > 1 SNES Function norm 1.426574249724e+08 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > copy! > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 39 TS dt 3.42747e-06 time 0.00709674 > copy! > Write output at step= 39! > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Write output at step= 39! > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > 1 KSP preconditioned resid norm 1.411143979706e-01 true resid norm > 3.252255320579e+07 ||r(i)||/||b|| 8.590029069268e-03 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 0 SNES Function norm 6.081085806316e+09 > 0 SNES Function norm 6.081085806316e+09 > 1 SNES Function norm 1.679071798524e+08 > 38 TS dt 2.09926e-06 time 0.00709464 > 1 KSP preconditioned resid norm 1.411143979706e-01 true resid norm > 3.252255320579e+07 ||r(i)||/||b|| 8.590029069268e-03 > copy! > Write output at step= 38! > 1 SNES Function norm 1.679071798524e+08 > 38 TS dt 2.09926e-06 time 0.00709464 > copy! > 0 SNES Function norm 4.969343279719e+09 > Write output at step= 38! 
> 0 SNES Function norm 4.969343279719e+09 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 0 KSP preconditioned resid norm 1.932453778489e+05 true resid norm > 6.081085806316e+09 ||r(i)||/||b|| 1.000000000000e+00 > 0 KSP preconditioned resid norm 1.932453778489e+05 true resid norm > 6.081085806316e+09 ||r(i)||/||b|| 1.000000000000e+00 > 0 KSP preconditioned resid norm 1.212273553655e+02 true resid norm > 6.393295863037e+07 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 1 KSP preconditioned resid norm 1.434243067909e+00 true resid norm > 7.171981611988e+07 ||r(i)||/||b|| 1.179391615316e-02 > 1 KSP preconditioned resid norm 1.434243067909e+00 true resid norm > 7.171981611988e+07 ||r(i)||/||b|| 1.179391615316e-02 > 1 SNES Function norm 1.906687815261e+08 > 1 SNES Function norm 1.906687815261e+08 > 1 KSP preconditioned resid norm 2.328823680294e-02 true resid norm > 2.016516515602e+07 ||r(i)||/||b|| 3.154111054458e-01 > 40 TS dt 6.85494e-06 time 0.00710017 > 40 TS dt 6.85494e-06 time 0.00710017 > copy! > copy! > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Write output at step= 40! > Write output at step= 40! > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 0 SNES Function norm 2.386070312612e+09 > 0 SNES Function norm 2.386070312612e+09 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 2 KSP preconditioned resid norm 2.757087112828e-04 true resid norm > 2.010380358203e+07 ||r(i)||/||b|| 3.144513254621e-01 > 0 KSP preconditioned resid norm 8.865880769163e+04 true resid norm > 4.969343279719e+09 ||r(i)||/||b|| 1.000000000000e+00 > 1 SNES Function norm 2.011100207442e+07 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > 35 TS dt 6.24925e-07 time 0.00709129 > copy! > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Write output at step= 35! 
> 1 KSP preconditioned resid norm 4.636177252582e-01 true resid norm > 4.472959395106e+07 ||r(i)||/||b|| 9.001107678275e-03 > 0 SNES Function norm 2.790215258123e+08 > 1 SNES Function norm 1.426574249724e+08 > 39 TS dt 3.42747e-06 time 0.00709674 > copy! > Write output at step= 39! > 0 SNES Function norm 6.081085806316e+09 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 0 KSP preconditioned resid norm 1.932453778489e+05 true resid norm > 6.081085806316e+09 ||r(i)||/||b|| 1.000000000000e+00 > 0 KSP preconditioned resid norm 1.932453778489e+05 true resid norm > 6.081085806316e+09 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > 1 KSP preconditioned resid norm 1.434243067909e+00 true resid norm > 7.171981611988e+07 ||r(i)||/||b|| 1.179391615316e-02 > 0 KSP preconditioned resid norm 8.865880769163e+04 true resid norm > 4.969343279719e+09 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 1 KSP preconditioned resid norm 1.434243067909e+00 true resid norm > 7.171981611988e+07 ||r(i)||/||b|| 1.179391615316e-02 > 1 SNES Function norm 1.906687815261e+08 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > 40 TS dt 6.85494e-06 time 0.00710017 > 1 SNES Function norm 1.906687815261e+08 > 0 KSP preconditioned resid norm 8.865880769163e+04 true resid norm > 4.969343279719e+09 ||r(i)||/||b|| 1.000000000000e+00 > copy! > 40 TS dt 6.85494e-06 time 0.00710017 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > copy! > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Write output at step= 40! > Write output at step= 40! > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > 0 SNES Function norm 2.386070312612e+09 > 1 KSP preconditioned resid norm 4.636177252582e-01 true resid norm > 4.472959395106e+07 ||r(i)||/||b|| 9.001107678275e-03 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 0 SNES Function norm 2.386070312612e+09 > 1 SNES Function norm 1.426574249724e+08 > 39 TS dt 3.42747e-06 time 0.00709674 > 1 KSP preconditioned resid norm 4.636177252582e-01 true resid norm > 4.472959395106e+07 ||r(i)||/||b|| 9.001107678275e-03 > copy! > Write output at step= 39! 
> 1 SNES Function norm 1.426574249724e+08 > 39 TS dt 3.42747e-06 time 0.00709674 > copy! > 0 SNES Function norm 6.081085806316e+09 > Write output at step= 39! > 0 SNES Function norm 6.081085806316e+09 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 0 KSP preconditioned resid norm 1.698771295747e+05 true resid norm > 2.386070312612e+09 ||r(i)||/||b|| 1.000000000000e+00 > 0 KSP preconditioned resid norm 1.698771295747e+05 true resid norm > 2.386070312612e+09 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 1 KSP preconditioned resid norm 5.030869192505e+00 true resid norm > 1.519850942875e+08 ||r(i)||/||b|| 6.369682128989e-02 > 1 KSP preconditioned resid norm 5.030869192505e+00 true resid norm > 1.519850942875e+08 ||r(i)||/||b|| 6.369682128989e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > 2 KSP preconditioned resid norm 3.859648344440e+00 true resid norm > 9.578310406856e+07 ||r(i)||/||b|| 4.014261589958e-02 > 2 KSP preconditioned resid norm 3.859648344440e+00 true resid norm > 9.578310406856e+07 ||r(i)||/||b|| 4.014261589958e-02 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 0 KSP preconditioned resid norm 1.932453778489e+05 true resid norm > 6.081085806316e+09 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 0 KSP preconditioned resid norm 1.143120146053e+03 true resid norm > 2.790215258123e+08 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > 3 KSP preconditioned resid norm 
3.133252984070e+00 true resid norm > 7.649343929054e+07 ||r(i)||/||b|| 3.205833410953e-02 > 3 KSP preconditioned resid norm 3.133252984070e+00 true resid norm > 7.649343929054e+07 ||r(i)||/||b|| 3.205833410953e-02 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 1 KSP preconditioned resid norm 1.434243067909e+00 true resid norm > 7.171981611988e+07 ||r(i)||/||b|| 1.179391615316e-02 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 1 SNES Function norm 1.906687815261e+08 > 1 KSP preconditioned resid norm 1.086405760770e-02 true resid norm > 2.165319343001e+07 ||r(i)||/||b|| 7.760402487575e-02 > 40 TS dt 6.85494e-06 time 0.00710017 > copy! > 4 KSP preconditioned resid norm 2.753196206679e+00 true resid norm > 6.468496966917e+07 ||r(i)||/||b|| 2.710941472566e-02 > 4 KSP preconditioned resid norm 2.753196206679e+00 true resid norm > 6.468496966917e+07 ||r(i)||/||b|| 2.710941472566e-02 > Write output at step= 40! > 1 SNES Function norm 3.485261902880e+07 > 36 TS dt 1.24985e-06 time 0.00709191 > copy! > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > 0 SNES Function norm 2.386070312612e+09 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Write output at step= 36! 
> Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 0 SNES Function norm 4.684646000717e+08 > 5 KSP preconditioned resid norm 2.500754637621e+00 true resid norm > 5.655851544362e+07 ||r(i)||/||b|| 2.370362480296e-02 > 5 KSP preconditioned resid norm 2.500754637621e+00 true resid norm > 5.655851544362e+07 ||r(i)||/||b|| 2.370362480296e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 6 KSP preconditioned resid norm 2.263885927226e+00 true resid norm > 5.368792420993e+07 ||r(i)||/||b|| 2.250056250487e-02 > 6 KSP preconditioned resid norm 2.263885927226e+00 true resid norm > 5.368792420993e+07 ||r(i)||/||b|| 2.250056250487e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 7 KSP preconditioned resid norm 2.104083158879e+00 true resid norm > 5.002745241660e+07 ||r(i)||/||b|| 2.096646194883e-02 > 7 KSP preconditioned resid norm 2.104083158879e+00 true resid norm > 5.002745241660e+07 ||r(i)||/||b|| 2.096646194883e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 8 KSP preconditioned resid norm 1.976241956139e+00 true resid norm > 4.707835526096e+07 ||r(i)||/||b|| 1.973049788689e-02 > 8 KSP preconditioned resid norm 1.976241956139e+00 true resid norm > 4.707835526096e+07 ||r(i)||/||b|| 1.973049788689e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 9 KSP preconditioned resid norm 1.870078692623e+00 true resid norm > 4.468270947214e+07 ||r(i)||/||b|| 1.872648481311e-02 > 9 KSP preconditioned resid norm 1.870078692623e+00 true resid norm > 4.468270947214e+07 ||r(i)||/||b|| 1.872648481311e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ 
solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 10 KSP preconditioned resid norm 1.779870885679e+00 true resid norm > 4.269989357798e+07 ||r(i)||/||b|| 1.789548839038e-02 > 10 KSP preconditioned resid norm 1.779870885679e+00 true resid norm > 4.269989357798e+07 ||r(i)||/||b|| 1.789548839038e-02 > 0 KSP preconditioned resid norm 1.698771295747e+05 true resid norm > 2.386070312612e+09 ||r(i)||/||b|| 1.000000000000e+00 > 0 KSP preconditioned resid norm 1.698771295747e+05 true resid norm > 2.386070312612e+09 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > 11 KSP preconditioned resid norm 1.701919459227e+00 true resid norm > 4.102844905926e+07 ||r(i)||/||b|| 1.719498744124e-02 > 11 KSP preconditioned resid norm 1.701919459227e+00 true resid norm > 4.102844905926e+07 ||r(i)||/||b|| 1.719498744124e-02 > 1 KSP preconditioned resid norm 5.030869192505e+00 true resid norm > 1.519850942875e+08 ||r(i)||/||b|| 6.369682128989e-02 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > 1 KSP preconditioned resid norm 5.030869192505e+00 true resid norm > 1.519850942875e+08 ||r(i)||/||b|| 6.369682128989e-02 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > 0 KSP preconditioned resid norm 1.932453778489e+05 true resid norm > 6.081085806316e+09 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to 
CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > 2 KSP preconditioned resid norm 3.859648344440e+00 true resid norm > 9.578310406856e+07 ||r(i)||/||b|| 4.014261589958e-02 > 12 KSP preconditioned resid norm 1.633671715659e+00 true resid norm > 3.959508749664e+07 ||r(i)||/||b|| 1.659426685264e-02 > 12 KSP preconditioned resid norm 1.633671715659e+00 true resid norm > 3.959508749664e+07 ||r(i)||/||b|| 1.659426685264e-02 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > 2 KSP preconditioned resid norm 3.859648344440e+00 true resid norm > 9.578310406856e+07 ||r(i)||/||b|| 4.014261589958e-02 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 0 KSP preconditioned resid norm 1.932453778489e+05 true resid norm > 6.081085806316e+09 ||r(i)||/||b|| 1.000000000000e+00 > 1 SNES Function norm 6.245293405655e+10 > 1 SNES Function norm 6.245293405655e+10 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > 1 KSP preconditioned resid norm 1.434243067909e+00 true resid norm > 7.171981611988e+07 ||r(i)||/||b|| 1.179391615316e-02 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > 0 SNES Function norm 2.097192603535e+10 > 0 SNES Function norm 2.097192603535e+10 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 1 SNES Function norm 1.906687815261e+08 > 40 TS dt 6.85494e-06 time 0.00710017 > copy! > 3 KSP preconditioned resid norm 3.133252984070e+00 true resid norm > 7.649343929054e+07 ||r(i)||/||b|| 3.205833410953e-02 > 3 KSP preconditioned resid norm 3.133252984070e+00 true resid norm > 7.649343929054e+07 ||r(i)||/||b|| 3.205833410953e-02 > 1 KSP preconditioned resid norm 1.434243067909e+00 true resid norm > 7.171981611988e+07 ||r(i)||/||b|| 1.179391615316e-02 > Write output at step= 40! > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > 1 SNES Function norm 1.906687815261e+08 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > 0 SNES Function norm 2.386070312612e+09 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > 40 TS dt 6.85494e-06 time 0.00710017 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > copy! > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Write output at step= 40! 
> 4 KSP preconditioned resid norm 2.753196206679e+00 true resid norm > 6.468496966917e+07 ||r(i)||/||b|| 2.710941472566e-02 > 4 KSP preconditioned resid norm 2.753196206679e+00 true resid norm > 6.468496966917e+07 ||r(i)||/||b|| 2.710941472566e-02 > 0 SNES Function norm 2.386070312612e+09 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 5 KSP preconditioned resid norm 2.500754637621e+00 true resid norm > 5.655851544362e+07 ||r(i)||/||b|| 2.370362480296e-02 > 5 KSP preconditioned resid norm 2.500754637621e+00 true resid norm > 5.655851544362e+07 ||r(i)||/||b|| 2.370362480296e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 6 KSP preconditioned resid norm 2.263885927226e+00 true resid norm > 5.368792420993e+07 ||r(i)||/||b|| 2.250056250487e-02 > 6 KSP preconditioned resid norm 2.263885927226e+00 true resid norm > 5.368792420993e+07 ||r(i)||/||b|| 2.250056250487e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 7 KSP preconditioned resid norm 2.104083158879e+00 true resid norm > 5.002745241660e+07 ||r(i)||/||b|| 2.096646194883e-02 > 7 KSP preconditioned resid norm 2.104083158879e+00 true resid norm > 5.002745241660e+07 ||r(i)||/||b|| 2.096646194883e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 8 KSP preconditioned resid norm 1.976241956139e+00 true resid norm > 4.707835526096e+07 ||r(i)||/||b|| 1.973049788689e-02 > 8 KSP preconditioned resid norm 1.976241956139e+00 true resid norm > 4.707835526096e+07 ||r(i)||/||b|| 1.973049788689e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 9 KSP preconditioned 
resid norm 1.870078692623e+00 true resid norm > 4.468270947214e+07 ||r(i)||/||b|| 1.872648481311e-02 > 9 KSP preconditioned resid norm 1.870078692623e+00 true resid norm > 4.468270947214e+07 ||r(i)||/||b|| 1.872648481311e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > 10 KSP preconditioned resid norm 1.779870885679e+00 true resid norm > 4.269989357798e+07 ||r(i)||/||b|| 1.789548839038e-02 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 10 KSP preconditioned resid norm 1.779870885679e+00 true resid norm > 4.269989357798e+07 ||r(i)||/||b|| 1.789548839038e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > 0 KSP preconditioned resid norm 1.698771295747e+05 true resid norm > 2.386070312612e+09 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 11 KSP preconditioned resid norm 1.701919459227e+00 true resid norm > 4.102844905926e+07 ||r(i)||/||b|| 1.719498744124e-02 > 11 KSP preconditioned resid norm 1.701919459227e+00 true resid norm > 4.102844905926e+07 ||r(i)||/||b|| 1.719498744124e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > 1 KSP preconditioned resid norm 5.030869192505e+00 true resid norm > 1.519850942875e+08 ||r(i)||/||b|| 6.369682128989e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 12 KSP preconditioned resid norm 1.633671715659e+00 true resid norm > 3.959508749664e+07 ||r(i)||/||b|| 1.659426685264e-02 > 12 KSP preconditioned resid norm 1.633671715659e+00 true resid norm > 3.959508749664e+07 ||r(i)||/||b|| 1.659426685264e-02 > 0 KSP preconditioned resid norm 4.028400414901e+03 true resid norm > 
4.684646000717e+08 ||r(i)||/||b|| 1.000000000000e+00 > 2 KSP preconditioned resid norm 3.859648344440e+00 true resid norm > 9.578310406856e+07 ||r(i)||/||b|| 4.014261589958e-02 > 1 SNES Function norm 6.245293405655e+10 > 1 SNES Function norm 6.245293405655e+10 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > 0 SNES Function norm 2.097192603535e+10 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 0 SNES Function norm 2.097192603535e+10 > 1 KSP preconditioned resid norm 7.513114811690e-02 true resid norm > 2.515581080270e+07 ||r(i)||/||b|| 5.369842416878e-02 > 3 KSP preconditioned resid norm 3.133252984070e+00 true resid norm > 7.649343929054e+07 ||r(i)||/||b|| 3.205833410953e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 2 KSP preconditioned resid norm 3.361950980934e-02 true resid norm > 2.589885319795e+07 ||r(i)||/||b|| 5.528454699457e-02 > 4 KSP preconditioned resid norm 2.753196206679e+00 true resid norm > 6.468496966917e+07 ||r(i)||/||b|| 2.710941472566e-02 > 1 SNES Function norm 5.194818889106e+07 > 37 TS dt 1.4811e-06 time 0.00709316 > copy! > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Write output at step= 37! 
> Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 0 SNES Function norm 3.786081856480e+09 > 5 KSP preconditioned resid norm 2.500754637621e+00 true resid norm > 5.655851544362e+07 ||r(i)||/||b|| 2.370362480296e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 6 KSP preconditioned resid norm 2.263885927226e+00 true resid norm > 5.368792420993e+07 ||r(i)||/||b|| 2.250056250487e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 7 KSP preconditioned resid norm 2.104083158879e+00 true resid norm > 5.002745241660e+07 ||r(i)||/||b|| 2.096646194883e-02 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > 8 KSP preconditioned resid norm 1.976241956139e+00 true resid norm > 4.707835526096e+07 ||r(i)||/||b|| 1.973049788689e-02 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > 0 KSP preconditioned resid norm 4.670669894784e+05 true resid norm > 2.097192603535e+10 ||r(i)||/||b|| 1.000000000000e+00 > 0 KSP preconditioned resid norm 4.670669894784e+05 true resid norm > 2.097192603535e+10 ||r(i)||/||b|| 1.000000000000e+00 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > 9 KSP preconditioned resid norm 1.870078692623e+00 true resid norm > 4.468270947214e+07 ||r(i)||/||b|| 1.872648481311e-02 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations > 1 > 1 KSP preconditioned resid norm 2.135766802417e+00 true resid norm > 8.990660342789e+07 ||r(i)||/||b|| 4.286997926482e-03 > 1 KSP preconditioned resid norm 2.135766802417e+00 true resid norm > 8.990660342789e+07 ||r(i)||/||b|| 4.286997926482e-03 > Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations > 1 > Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 
> 1
> Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1
> 0 KSP preconditioned resid norm 1.698771295747e+05 true resid norm 2.386070312612e+09 ||r(i)||/||b|| 1.000000000000e+00
> 1 SNES Function norm 7.093938024894e+08
> 1 SNES Function norm 7.093938024894e+08
>
> ------------------------------
> From: Barry Smith
> Sent: Monday, March 22, 2021 1:56 PM
> To: Sepideh Kavousi
> Cc: petsc-users at mcs.anl.gov
> Subject: Re: [petsc-users] PF+Navier stokes
>
> Singular systems come up in solving PDEs almost always due to issues
> related to boundary conditions. For example, all Neumann (natural) boundary
> conditions can produce singular systems. Direct factorizations generically
> will eventually hit a zero pivot in such cases, and there is no universally
> acceptable approach for what to do at that point to recover. If you think
> your operator is singular, you should start by using MatSetNullSpace(); it
> won't "cure" the problem, but it is the tool we use to manage null spaces
> in operators.
>
> On Mar 22, 2021, at 9:04 AM, Sepideh Kavousi wrote:
>
> Hello,
> I want to solve PF solidification + Navier-Stokes using the finite
> difference method, and I have a strange problem. My code runs fine for
> some system sizes and fails for others. When I run with the following
> options:
>
> mpirun -np 2 ./one.out -ts_monitor -snes_fd_color -ts_max_snes_failures -1
> -ts_type bdf -ts_bdf_adapt -pc_type bjacobi -snes_linesearch_type l2
> -snes_type ksponly -ksp_type gmres -ksp_gmres_restart 1001 -sub_pc_type ilu
> -sub_ksp_type preonly -snes_monitor -ksp_monitor -snes_linesearch_monitor
> -ksp_monitor_true_residual -ksp_converged_reason -log_view
>
> 0 SNES Function norm 1.465357113711e+01
> 0 SNES Function norm 1.465357113711e+01
> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
> PC_FAILED due to SUBPC_ERROR
> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
> PC_FAILED due to SUBPC_ERROR
> 0 SNES Function norm 1.465357113711e+01
> 0 SNES Function norm 1.465357113711e+01
> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
> PC_FAILED due to SUBPC_ERROR
> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
> PC_FAILED due to SUBPC_ERROR
> 0 SNES Function norm 1.465357113711e+01
> 0 SNES Function norm 1.465357113711e+01
> ^C Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
> PC_FAILED due to SUBPC_ERROR
> 0 SNES Function norm 1.465357113711e+01
> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
> PC_FAILED due to SUBPC_ERROR
> 0 SNES Function norm 1.465357113711e+01
>
> Even setting pc_type to LU does not solve the problem.
> 0 TS dt 0.0001 time 0.
> copy!
> copy!
> Write output at step= 0!
> Write output at step= 0!
> 0 SNES Function norm 1.465357113711e+01
> 0 SNES Function norm 1.465357113711e+01
> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
> PC_FAILED due to FACTOR_NUMERIC_ZEROPIVOT
> Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
> PC_FAILED due to FACTOR_NUMERIC_ZEROPIVOT
>
> I guess the problem is that in the mass conservation equation I used
> forward discretization for u (velocity in x), and in the momentum equation
> in x I used forward discretization for p (pressure) to ensure non-zero
> terms on the diagonal of the matrix. I tried to run it with valgrind, but
> it did not output anything.
>
> Does anyone have suggestions on how to solve this issue?
> Best,
> Sepideh

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/
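For reference, a minimal sketch of what attaching a null space with MatSetNullSpace() can look like is given below. This is not code from this thread: the Jacobian J and the loop that marks the pressure dofs are placeholders, and it assumes the singular mode really is the constant-pressure mode that an all-Neumann pressure block produces; only the PETSc calls themselves are real API.

  #include <petscmat.h>

  /* Build a vector that is 1 on the pressure dofs and 0 elsewhere,
     normalize it, and attach it to the Jacobian as its null space. */
  Vec          nullvec;
  MatNullSpace nsp;

  MatCreateVecs(J, &nullvec, NULL);   /* J: assembled Jacobian (placeholder) */
  VecSet(nullvec, 0.0);
  /* ... set the entries of nullvec belonging to pressure dofs to 1.0,
     e.g. by looping over the local section offsets of the pressure field ... */
  VecAssemblyBegin(nullvec);
  VecAssemblyEnd(nullvec);
  VecNormalize(nullvec, NULL);
  MatNullSpaceCreate(PetscObjectComm((PetscObject)J), PETSC_FALSE, 1, &nullvec, &nsp);
  MatSetNullSpace(J, nsp);
  MatNullSpaceDestroy(&nsp);
  VecDestroy(&nullvec);

If the null space of the whole operator is just the constant vector, passing PETSC_TRUE for the has_cnst argument with no vectors, as in MatNullSpaceCreate(comm, PETSC_TRUE, 0, NULL, &nsp), is simpler. With a Krylov method the attached null space is projected out of the right-hand side and the solution, which is what makes the singular system solvable; it does not help a direct LU factorization avoid the zero pivot.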
From skavou1 at lsu.edu  Tue Mar 23 15:25:18 2021
From: skavou1 at lsu.edu (Sepideh Kavousi)
Date: Tue, 23 Mar 2021 20:25:18 +0000
Subject: [petsc-users] PF+Navier stokes
In-Reply-To: 
References: <75FC52E9-8615-4966-9BA6-85BD43C01F7B@petsc.dev> <22EEC3F3-4B97-4CCE-8B53-54409298E0FB@petsc.dev> ,
Message-ID: 

This is exactly the problem I have. When the mesh has 150 points in the x direction (no matter how many mesh points I have in the y direction), the code converges and the results are reasonable. But it fails when I choose, for example, 200 points in the x direction.

Sepideh

________________________________
From: Matthew Knepley
Sent: Tuesday, March 23, 2021 3:20:30 PM
To: Sepideh Kavousi
Cc: Barry Smith ; petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] PF+Navier stokes

On Tue, Mar 23, 2021 at 4:08 PM Sepideh Kavousi wrote:

Thank you for your suggestions. In the discretization, I made sure that the main diagonal of the matrix is non-zero, and I even chose Dirichlet boundary conditions for pressure and velocity. Therefore, I do not get a zero pivot. But even when I set the preconditioner to LU, the residual norm is still large:

It does not sound like this problem is well-posed. If these linear systems do not have solutions, how do you know your problem has a solution?

  Thanks,

     Matt

When I analyze the data, the velocity at regions close to the corners of the solid phase becomes large. I am not sure how I could get reasonable results for one system size and total nonsense for another.

Sepideh

29 TS dt 3.2769e-06 time 0.00689069
0 KSP preconditioned resid norm 5.978616012547e+03 true resid norm 3.032434597885e+08 ||r(i)||/||b|| 1.000000000000e+00
copy!
Write output at step= 29!
Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1
Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1
Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1
0 SNES Function norm 1.453900186218e+08
Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1
Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1
Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1
2 KSP preconditioned resid norm 4.099453531745e-02 true resid norm 2.719463080396e+06 ||r(i)||/||b|| 8.967919975234e-03
1 SNES Function norm 1.298945530538e+08
1 KSP preconditioned resid norm 1.760119119843e-01 true resid norm 2.196480894736e+06 ||r(i)||/||b|| 7.243291895784e-03
29 TS dt 3.2769e-06 time 0.00689069
copy!
Write output at step= 29!
Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1
0 SNES Function norm 1.453900186218e+08
Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1
Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1
Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1
Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1
2 KSP preconditioned resid norm 4.099453531745e-02 true resid norm 2.719463080396e+06 ||r(i)||/||b|| 8.967919975234e-03
Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1
1 SNES Function norm 1.298945530538e+08
29 TS dt 3.2769e-06 time 0.00689069
copy!
0 KSP preconditioned resid norm 5.978616012547e+03 true resid norm 3.032434597885e+08 ||r(i)||/||b|| 1.000000000000e+00 Write output at step= 29! Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 0 SNES Function norm 1.453900186218e+08 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 5.978616012547e+03 true resid norm 3.032434597885e+08 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.760119119843e-01 true resid norm 2.196480894736e+06 ||r(i)||/||b|| 7.243291895784e-03 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 1.760119119843e-01 true resid norm 2.196480894736e+06 ||r(i)||/||b|| 7.243291895784e-03 2 KSP preconditioned resid norm 4.099453531745e-02 true resid norm 2.719463080396e+06 ||r(i)||/||b|| 8.967919975234e-03 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 1 SNES Function norm 1.298945530538e+08 29 TS dt 3.2769e-06 time 0.00689069 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 copy! Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Write output at step= 29! Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 5.978616012547e+03 true resid norm 3.032434597885e+08 ||r(i)||/||b|| 1.000000000000e+00 0 SNES Function norm 1.453900186218e+08 2 KSP preconditioned resid norm 4.099453531745e-02 true resid norm 2.719463080396e+06 ||r(i)||/||b|| 8.967919975234e-03 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 1 SNES Function norm 1.298945530538e+08 29 TS dt 3.2769e-06 time 0.00689069 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 copy! Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Write output at step= 29! 
1 KSP preconditioned resid norm 1.760119119843e-01 true resid norm 2.196480894736e+06 ||r(i)||/||b|| 7.243291895784e-03 0 SNES Function norm 1.453900186218e+08 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 5.978616012547e+03 true resid norm 3.032434597885e+08 ||r(i)||/||b|| 1.000000000000e+00 2 KSP preconditioned resid norm 4.099453531745e-02 true resid norm 2.719463080396e+06 ||r(i)||/||b|| 8.967919975234e-03 1 SNES Function norm 1.298945530538e+08 29 TS dt 3.2769e-06 time 0.00689069 copy! Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Write output at step= 29! Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 0 SNES Function norm 1.453900186218e+08 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 1.760119119843e-01 true resid norm 2.196480894736e+06 ||r(i)||/||b|| 7.243291895784e-03 0 KSP preconditioned resid norm 5.978616012547e+03 true resid norm 3.032434597885e+08 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 2 KSP preconditioned resid norm 4.099453531745e-02 true resid norm 2.719463080396e+06 ||r(i)||/||b|| 8.967919975234e-03 1 KSP preconditioned resid norm 1.760119119843e-01 true resid norm 2.196480894736e+06 ||r(i)||/||b|| 7.243291895784e-03 1 SNES Function norm 1.298945530538e+08 ________________________________ From: Barry Smith > Sent: Monday, March 22, 2021 10:48 PM To: Sepideh Kavousi > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] PF+Navier stokes Each of the individual fields "converging" does not imply the overall PCFIELDSPLIT will converge. This is scary, 7 KSP preconditioned resid norm 7.989463672336e-05 true resid norm 1.904266194607e+07 often these huge differences in true residual come from the operator having a null space that is not handled properly. Or they can come from an ILU that produces absurd pivots. Recommend using the same PCFIELDSPLIT but using a direct solver on all the splits (no ML no ILU). If that results in great overall convergence it means one of the sub-systems is not converging properly with ML or ILU and you need better preconditioners on some of the splits. Barry On Mar 22, 2021, at 4:05 PM, Sepideh Kavousi > wrote: I modified my BC such that on the left and right side of the interface the BC are constant value instead of Neumann(zero flux). This solves the problem but still the code has convergence problem: I even tried field split with the following order: the block size is 9. 

On Mar 22, 2021, at 4:05 PM, Sepideh Kavousi wrote:

I modified my BCs such that on the left and right sides of the interface the BCs are constant values instead of Neumann (zero flux). This solves that problem, but the code still has convergence trouble. I even tried a field split ordered as follows: the block size is 9; the first block holds the fields of the phase-field equation, the second block the velocities in the x and y directions, and the third block the pressure:

-pc_type fieldsplit
-pc_fieldsplit_block_size 9
-pc_fieldsplit_0_fields 0,1        (the two fields of the phase-field model)
-pc_fieldsplit_1_fields 2,3        (velocity in the x and y directions)
-pc_fieldsplit_2_fields 4          (pressure)
-fieldsplit_1_pc_fieldsplit_block_size 2
-fieldsplit_1_fieldsplit_0_pc_type ml    (based on https://lists.mcs.anl.gov/pipermail/petsc-users/2015-February/024191.html)
-fieldsplit_1_fieldsplit_1_pc_type ml    (based on https://lists.mcs.anl.gov/pipermail/petsc-users/2015-February/024191.html)
-fieldsplit_0_pc_type ilu          (based on previous solutions of the phase-field equations)
-fieldsplit_2_pc_type ilu

I guess changing the BCs is the main reason the code does not fail in the first few steps. As time increases, the true residual norm grows until, at a finite time step (~30), it reaches 1e7 and the code produces inaccurate velocity values. Could this also be caused by the forward/backward discretization?

Best,
Sepideh
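
   On the null-space point above: if in the original all-Neumann setup the pressure is only determined up to a constant, the solver needs to be told about that constant null space. A minimal sketch (not taken from the code in this thread; Apress is a stand-in for wherever the pressure operator is assembled):

     Mat          Apress;   /* hypothetical handle to the assembled pressure block */
     MatNullSpace nullsp;

     MatNullSpaceCreate(PetscObjectComm((PetscObject)Apress), PETSC_TRUE, 0, NULL, &nullsp);
     MatSetNullSpace(Apress, nullsp);   /* lets KSP remove the constant component during the solve */
     MatNullSpaceDestroy(&nullsp);

   With PCFIELDSPLIT the null space has to end up on the operator the pressure split actually solves with, so where exactly to attach it depends on how the splits are set up.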

0 KSP preconditioned resid norm 2.003930340911e+01 true resid norm 6.987120567963e+07 ||r(i)||/||b|| 1.000000000000e+00
1 KSP preconditioned resid norm 1.199890501875e-02 true resid norm 1.879731143354e+07 ||r(i)||/||b|| 2.690280101896e-01
2 KSP preconditioned resid norm 3.018100763012e-04 true resid norm 1.879893603977e+07 ||r(i)||/||b|| 2.690512616309e-01
3 KSP preconditioned resid norm 2.835332741838e-04 true resid norm 1.879893794065e+07 ||r(i)||/||b|| 2.690512888363e-01
4 KSP preconditioned resid norm 1.860011376508e-04 true resid norm 1.879893735946e+07 ||r(i)||/||b|| 2.690512805182e-01
1 SNES Function norm 1.886737547995e+07
31 TS dt 3.90578e-08 time 0.0070907
copy!
Write output at step= 31!

0 SNES Function norm 1.888557765431e+07
Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1
Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1
Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1
0 KSP preconditioned resid norm 1.338586074554e-01 true resid norm 1.888557765431e+07 ||r(i)||/||b|| 1.000000000000e+00
1 KSP preconditioned resid norm 5.451259747447e-03 true resid norm 1.887927947148e+07 ||r(i)||/||b|| 9.996665083300e-01
2 KSP preconditioned resid norm 9.554577345960e-04 true resid norm 1.887930135577e+07 ||r(i)||/||b|| 9.996676671129e-01
3 KSP preconditioned resid norm 9.378991224281e-04 true resid norm 1.887930134907e+07 ||r(i)||/||b|| 9.996676667583e-01
4 KSP preconditioned resid norm 3.652611805745e-04 true resid norm 1.887930205974e+07 ||r(i)||/||b|| 9.996677043885e-01
5 KSP preconditioned resid norm 2.918222127367e-04 true resid norm 1.887930204569e+07 ||r(i)||/||b|| 9.996677036447e-01
6 KSP preconditioned resid norm 6.114488674627e-05 true resid norm 1.887930243837e+07 ||r(i)||/||b|| 9.996677244370e-01
7 KSP preconditioned resid norm 3.763532951474e-05 true resid norm 1.887930248279e+07 ||r(i)||/||b|| 9.996677267895e-01
8 KSP preconditioned resid norm 2.112644035802e-05 true resid norm 1.887930251181e+07 ||r(i)||/||b|| 9.996677283257e-01
9 KSP preconditioned resid norm 1.113068460252e-05 true resid norm 1.887930250969e+07 ||r(i)||/||b|| 9.996677282137e-01
10 KSP preconditioned resid norm 1.352518287887e-06 true resid norm 1.887930250333e+07 ||r(i)||/||b|| 9.996677278767e-01
11 KSP preconditioned resid norm 7.434707372444e-07 true resid norm 1.887930250410e+07 ||r(i)||/||b|| 9.996677279175e-01
1 SNES Function norm 1.887938190335e+07
32 TS dt 7.81156e-08 time 0.00709074
copy!
Write output at step= 32!

0 SNES Function norm 2.010558777785e+07
Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1
Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1
Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1
0 KSP preconditioned resid norm 3.165458473526e+00 true resid norm 2.010558777785e+07 ||r(i)||/||b|| 1.000000000000e+00
1 KSP preconditioned resid norm 3.655364946441e-03 true resid norm 1.904269379864e+07 ||r(i)||/||b|| 9.471343991057e-01
2 KSP preconditioned resid norm 2.207564350060e-03 true resid norm 1.904265942845e+07 ||r(i)||/||b|| 9.471326896210e-01
3 KSP preconditioned resid norm 2.188447918524e-03 true resid norm 1.904266151317e+07 ||r(i)||/||b|| 9.471327933098e-01
4 KSP preconditioned resid norm 7.425314556150e-04 true resid norm 1.904265807404e+07 ||r(i)||/||b|| 9.471326222560e-01
5 KSP preconditioned resid norm 6.052794097111e-04 true resid norm 1.904265841692e+07 ||r(i)||/||b|| 9.471326393103e-01
6 KSP preconditioned resid norm 1.251197915617e-04 true resid norm 1.904266159346e+07 ||r(i)||/||b|| 9.471327973028e-01
7 KSP preconditioned resid norm 7.989463672336e-05 true resid norm 1.904266194607e+07 ||r(i)||/||b|| 9.471328148409e-01
8 KSP preconditioned resid norm 4.768163844493e-05 true resid norm 1.904266226755e+07 ||r(i)||/||b|| 9.471328308303e-01
9 KSP preconditioned resid norm 2.073486308871e-05 true resid norm 1.904266231364e+07 ||r(i)||/||b|| 9.471328331228e-01
1 SNES Function norm 1.904272255718e+07
33 TS dt 1.56231e-07 time 0.00709082
copy!
Write output at step= 33!

0 SNES Function norm 2.464456787205e+07
Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1
Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1
Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1
0 KSP preconditioned resid norm 1.502321597276e+01 true resid norm 2.464456787205e+07 ||r(i)||/||b|| 1.000000000000e+00
1 KSP preconditioned resid norm 8.755672227974e-03 true resid norm 1.938151775153e+07 ||r(i)||/||b|| 7.864417770341e-01
2 KSP preconditioned resid norm 1.508190120513e-03 true resid norm 1.938016275793e+07 ||r(i)||/||b|| 7.863867956035e-01
3 KSP preconditioned resid norm 1.422130457598e-03 true resid norm 1.938011909055e+07 ||r(i)||/||b|| 7.863850237166e-01
4 KSP preconditioned resid norm 4.957954730602e-04 true resid norm 1.938028047800e+07 ||r(i)||/||b|| 7.863915723180e-01
5 KSP preconditioned resid norm 3.552719096462e-04 true resid norm 1.938029700510e+07 ||r(i)||/||b|| 7.863922429363e-01
6 KSP preconditioned resid norm 1.508455316659e-04 true resid norm 1.938039640595e+07 ||r(i)||/||b|| 7.863962763140e-01
7 KSP preconditioned resid norm 7.631525161839e-05 true resid norm 1.938041339557e+07 ||r(i)||/||b|| 7.863969657001e-01
1 SNES Function norm 1.938116032575e+07
34 TS dt 3.12462e-07 time 0.00709097
copy!
Write output at step= 34!

0 SNES Function norm 6.393295863037e+07
Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1
Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1
Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1
0 KSP preconditioned resid norm 1.212273553655e+02 true resid norm 6.393295863037e+07 ||r(i)||/||b|| 1.000000000000e+00
1 KSP preconditioned resid norm 2.328823680294e-02 true resid norm 2.016516515602e+07 ||r(i)||/||b|| 3.154111054458e-01
2 KSP preconditioned resid norm 2.757087112828e-04 true resid norm 2.010380358203e+07 ||r(i)||/||b|| 3.144513254621e-01
1 SNES Function norm 2.011100207442e+07
35 TS dt 6.24925e-07 time 0.00709129
copy!
Write output at step= 35!

0 SNES Function norm 2.790215258123e+08
Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1
Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1
Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1
0 KSP preconditioned resid norm 1.143120146053e+03 true resid norm 2.790215258123e+08 ||r(i)||/||b|| 1.000000000000e+00
1 KSP preconditioned resid norm 1.086405760770e-02 true resid norm 2.165319343001e+07 ||r(i)||/||b|| 7.760402487575e-02
1 SNES Function norm 3.485261902880e+07
36 TS dt 1.24985e-06 time 0.00709191
copy!
Write output at step= 36!

0 SNES Function norm 4.684646000717e+08
Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1
Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1
Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1
0 KSP preconditioned resid norm 4.028400414901e+03 true resid norm 4.684646000717e+08 ||r(i)||/||b|| 1.000000000000e+00
1 KSP preconditioned resid norm 7.513114811690e-02 true resid norm 2.515581080270e+07 ||r(i)||/||b|| 5.369842416878e-02
2 KSP preconditioned resid norm 3.361950980934e-02 true resid norm 2.589885319795e+07 ||r(i)||/||b|| 5.528454699457e-02
1 SNES Function norm 5.194818889106e+07
37 TS dt 1.4811e-06 time 0.00709316
copy!
Write output at step= 37!

0 SNES Function norm 3.786081856480e+09
Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1
Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1
Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1
0 KSP preconditioned resid norm 4.626907080947e+04 true resid norm 3.786081856480e+09 ||r(i)||/||b|| 1.000000000000e+00
1 KSP preconditioned resid norm 1.411143979706e-01 true resid norm 3.252255320579e+07 ||r(i)||/||b|| 8.590029069268e-03
1 SNES Function norm 1.679071798524e+08
38 TS dt 2.09926e-06 time 0.00709464
copy!
Write output at step= 38!

0 SNES Function norm 4.969343279719e+09
Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1
Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1
Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1
0 KSP preconditioned resid norm 8.865880769163e+04 true resid norm 4.969343279719e+09 ||r(i)||/||b|| 1.000000000000e+00
1 KSP preconditioned resid norm 4.636177252582e-01 true resid norm 4.472959395106e+07 ||r(i)||/||b|| 9.001107678275e-03
1 SNES Function norm 1.426574249724e+08
39 TS dt 3.42747e-06 time 0.00709674
copy!
Write output at step= 39!

0 SNES Function norm 6.081085806316e+09
Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1
Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1
Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1
0 KSP preconditioned resid norm 1.932453778489e+05 true resid norm 6.081085806316e+09 ||r(i)||/||b|| 1.000000000000e+00
1 KSP preconditioned resid norm 1.434243067909e+00 true resid norm 7.171981611988e+07 ||r(i)||/||b|| 1.179391615316e-02
1 SNES Function norm 1.906687815261e+08
40 TS dt 6.85494e-06 time 0.00710017
copy!
Write output at step= 40!

0 SNES Function norm 2.386070312612e+09
1 KSP preconditioned resid norm 4.636177252582e-01 true resid norm 4.472959395106e+07 ||r(i)||/||b|| 9.001107678275e-03 0 SNES Function norm 2.790215258123e+08 1 SNES Function norm 1.426574249724e+08 39 TS dt 3.42747e-06 time 0.00709674 copy! Write output at step= 39! 0 SNES Function norm 6.081085806316e+09 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 1.932453778489e+05 true resid norm 6.081085806316e+09 ||r(i)||/||b|| 1.000000000000e+00 0 KSP preconditioned resid norm 1.932453778489e+05 true resid norm 6.081085806316e+09 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 1.434243067909e+00 true resid norm 7.171981611988e+07 ||r(i)||/||b|| 1.179391615316e-02 0 KSP preconditioned resid norm 8.865880769163e+04 true resid norm 4.969343279719e+09 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 1.434243067909e+00 true resid norm 7.171981611988e+07 ||r(i)||/||b|| 1.179391615316e-02 1 SNES Function norm 1.906687815261e+08 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 40 TS dt 6.85494e-06 time 0.00710017 1 SNES Function norm 1.906687815261e+08 0 KSP preconditioned resid norm 8.865880769163e+04 true resid norm 4.969343279719e+09 ||r(i)||/||b|| 1.000000000000e+00 copy! 40 TS dt 6.85494e-06 time 0.00710017 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 copy! Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Write output at step= 40! Write output at step= 40! Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 0 SNES Function norm 2.386070312612e+09 1 KSP preconditioned resid norm 4.636177252582e-01 true resid norm 4.472959395106e+07 ||r(i)||/||b|| 9.001107678275e-03 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 SNES Function norm 2.386070312612e+09 1 SNES Function norm 1.426574249724e+08 39 TS dt 3.42747e-06 time 0.00709674 1 KSP preconditioned resid norm 4.636177252582e-01 true resid norm 4.472959395106e+07 ||r(i)||/||b|| 9.001107678275e-03 copy! Write output at step= 39! 1 SNES Function norm 1.426574249724e+08 39 TS dt 3.42747e-06 time 0.00709674 copy! 0 SNES Function norm 6.081085806316e+09 Write output at step= 39! 
0 SNES Function norm 6.081085806316e+09 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 1.698771295747e+05 true resid norm 2.386070312612e+09 ||r(i)||/||b|| 1.000000000000e+00 0 KSP preconditioned resid norm 1.698771295747e+05 true resid norm 2.386070312612e+09 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 5.030869192505e+00 true resid norm 1.519850942875e+08 ||r(i)||/||b|| 6.369682128989e-02 1 KSP preconditioned resid norm 5.030869192505e+00 true resid norm 1.519850942875e+08 ||r(i)||/||b|| 6.369682128989e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 2 KSP preconditioned resid norm 3.859648344440e+00 true resid norm 9.578310406856e+07 ||r(i)||/||b|| 4.014261589958e-02 2 KSP preconditioned resid norm 3.859648344440e+00 true resid norm 9.578310406856e+07 ||r(i)||/||b|| 4.014261589958e-02 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 1.932453778489e+05 true resid norm 6.081085806316e+09 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 1.143120146053e+03 true resid norm 2.790215258123e+08 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 3 KSP preconditioned resid norm 3.133252984070e+00 true resid norm 7.649343929054e+07 ||r(i)||/||b|| 3.205833410953e-02 3 KSP preconditioned resid norm 3.133252984070e+00 true resid norm 7.649343929054e+07 ||r(i)||/||b|| 3.205833410953e-02 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due 
to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 1.434243067909e+00 true resid norm 7.171981611988e+07 ||r(i)||/||b|| 1.179391615316e-02 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 1 SNES Function norm 1.906687815261e+08 1 KSP preconditioned resid norm 1.086405760770e-02 true resid norm 2.165319343001e+07 ||r(i)||/||b|| 7.760402487575e-02 40 TS dt 6.85494e-06 time 0.00710017 copy! 4 KSP preconditioned resid norm 2.753196206679e+00 true resid norm 6.468496966917e+07 ||r(i)||/||b|| 2.710941472566e-02 4 KSP preconditioned resid norm 2.753196206679e+00 true resid norm 6.468496966917e+07 ||r(i)||/||b|| 2.710941472566e-02 Write output at step= 40! 1 SNES Function norm 3.485261902880e+07 36 TS dt 1.24985e-06 time 0.00709191 copy! Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 0 SNES Function norm 2.386070312612e+09 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Write output at step= 36! Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 SNES Function norm 4.684646000717e+08 5 KSP preconditioned resid norm 2.500754637621e+00 true resid norm 5.655851544362e+07 ||r(i)||/||b|| 2.370362480296e-02 5 KSP preconditioned resid norm 2.500754637621e+00 true resid norm 5.655851544362e+07 ||r(i)||/||b|| 2.370362480296e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 6 KSP preconditioned resid norm 2.263885927226e+00 true resid norm 5.368792420993e+07 ||r(i)||/||b|| 2.250056250487e-02 6 KSP preconditioned resid norm 2.263885927226e+00 true resid norm 5.368792420993e+07 ||r(i)||/||b|| 2.250056250487e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 7 KSP preconditioned resid norm 2.104083158879e+00 true resid norm 5.002745241660e+07 ||r(i)||/||b|| 2.096646194883e-02 7 KSP preconditioned resid norm 2.104083158879e+00 true resid norm 5.002745241660e+07 ||r(i)||/||b|| 2.096646194883e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged 
due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 8 KSP preconditioned resid norm 1.976241956139e+00 true resid norm 4.707835526096e+07 ||r(i)||/||b|| 1.973049788689e-02 8 KSP preconditioned resid norm 1.976241956139e+00 true resid norm 4.707835526096e+07 ||r(i)||/||b|| 1.973049788689e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 9 KSP preconditioned resid norm 1.870078692623e+00 true resid norm 4.468270947214e+07 ||r(i)||/||b|| 1.872648481311e-02 9 KSP preconditioned resid norm 1.870078692623e+00 true resid norm 4.468270947214e+07 ||r(i)||/||b|| 1.872648481311e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 10 KSP preconditioned resid norm 1.779870885679e+00 true resid norm 4.269989357798e+07 ||r(i)||/||b|| 1.789548839038e-02 10 KSP preconditioned resid norm 1.779870885679e+00 true resid norm 4.269989357798e+07 ||r(i)||/||b|| 1.789548839038e-02 0 KSP preconditioned resid norm 1.698771295747e+05 true resid norm 2.386070312612e+09 ||r(i)||/||b|| 1.000000000000e+00 0 KSP preconditioned resid norm 1.698771295747e+05 true resid norm 2.386070312612e+09 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 11 KSP preconditioned resid norm 1.701919459227e+00 true resid norm 
4.102844905926e+07 ||r(i)||/||b|| 1.719498744124e-02 11 KSP preconditioned resid norm 1.701919459227e+00 true resid norm 4.102844905926e+07 ||r(i)||/||b|| 1.719498744124e-02 1 KSP preconditioned resid norm 5.030869192505e+00 true resid norm 1.519850942875e+08 ||r(i)||/||b|| 6.369682128989e-02 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 5.030869192505e+00 true resid norm 1.519850942875e+08 ||r(i)||/||b|| 6.369682128989e-02 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 1.932453778489e+05 true resid norm 6.081085806316e+09 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 2 KSP preconditioned resid norm 3.859648344440e+00 true resid norm 9.578310406856e+07 ||r(i)||/||b|| 4.014261589958e-02 12 KSP preconditioned resid norm 1.633671715659e+00 true resid norm 3.959508749664e+07 ||r(i)||/||b|| 1.659426685264e-02 12 KSP preconditioned resid norm 1.633671715659e+00 true resid norm 3.959508749664e+07 ||r(i)||/||b|| 1.659426685264e-02 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 2 KSP preconditioned resid norm 3.859648344440e+00 true resid norm 9.578310406856e+07 ||r(i)||/||b|| 4.014261589958e-02 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 1.932453778489e+05 true resid norm 6.081085806316e+09 ||r(i)||/||b|| 1.000000000000e+00 1 SNES Function norm 6.245293405655e+10 1 SNES Function norm 6.245293405655e+10 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 1.434243067909e+00 true resid norm 7.171981611988e+07 ||r(i)||/||b|| 1.179391615316e-02 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 0 SNES Function norm 2.097192603535e+10 0 SNES Function norm 2.097192603535e+10 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 1 SNES Function norm 1.906687815261e+08 40 TS dt 6.85494e-06 time 0.00710017 
copy! 3 KSP preconditioned resid norm 3.133252984070e+00 true resid norm 7.649343929054e+07 ||r(i)||/||b|| 3.205833410953e-02 3 KSP preconditioned resid norm 3.133252984070e+00 true resid norm 7.649343929054e+07 ||r(i)||/||b|| 3.205833410953e-02 1 KSP preconditioned resid norm 1.434243067909e+00 true resid norm 7.171981611988e+07 ||r(i)||/||b|| 1.179391615316e-02 Write output at step= 40! Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 1 SNES Function norm 1.906687815261e+08 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 0 SNES Function norm 2.386070312612e+09 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 40 TS dt 6.85494e-06 time 0.00710017 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 copy! Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Write output at step= 40! 4 KSP preconditioned resid norm 2.753196206679e+00 true resid norm 6.468496966917e+07 ||r(i)||/||b|| 2.710941472566e-02 4 KSP preconditioned resid norm 2.753196206679e+00 true resid norm 6.468496966917e+07 ||r(i)||/||b|| 2.710941472566e-02 0 SNES Function norm 2.386070312612e+09 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 5 KSP preconditioned resid norm 2.500754637621e+00 true resid norm 5.655851544362e+07 ||r(i)||/||b|| 2.370362480296e-02 5 KSP preconditioned resid norm 2.500754637621e+00 true resid norm 5.655851544362e+07 ||r(i)||/||b|| 2.370362480296e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 6 KSP preconditioned resid norm 2.263885927226e+00 true resid norm 5.368792420993e+07 ||r(i)||/||b|| 2.250056250487e-02 6 KSP preconditioned resid norm 2.263885927226e+00 true resid norm 5.368792420993e+07 ||r(i)||/||b|| 2.250056250487e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 7 KSP preconditioned resid norm 2.104083158879e+00 true resid norm 5.002745241660e+07 ||r(i)||/||b|| 2.096646194883e-02 7 KSP preconditioned resid norm 2.104083158879e+00 true resid norm 5.002745241660e+07 ||r(i)||/||b|| 2.096646194883e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ 
solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 8 KSP preconditioned resid norm 1.976241956139e+00 true resid norm 4.707835526096e+07 ||r(i)||/||b|| 1.973049788689e-02 8 KSP preconditioned resid norm 1.976241956139e+00 true resid norm 4.707835526096e+07 ||r(i)||/||b|| 1.973049788689e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 9 KSP preconditioned resid norm 1.870078692623e+00 true resid norm 4.468270947214e+07 ||r(i)||/||b|| 1.872648481311e-02 9 KSP preconditioned resid norm 1.870078692623e+00 true resid norm 4.468270947214e+07 ||r(i)||/||b|| 1.872648481311e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 10 KSP preconditioned resid norm 1.779870885679e+00 true resid norm 4.269989357798e+07 ||r(i)||/||b|| 1.789548839038e-02 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 10 KSP preconditioned resid norm 1.779870885679e+00 true resid norm 4.269989357798e+07 ||r(i)||/||b|| 1.789548839038e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 1.698771295747e+05 true resid norm 2.386070312612e+09 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 11 KSP preconditioned resid norm 1.701919459227e+00 true resid norm 4.102844905926e+07 ||r(i)||/||b|| 1.719498744124e-02 11 KSP preconditioned resid norm 1.701919459227e+00 true resid norm 4.102844905926e+07 ||r(i)||/||b|| 1.719498744124e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 5.030869192505e+00 true resid norm 1.519850942875e+08 ||r(i)||/||b|| 6.369682128989e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve 
converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 12 KSP preconditioned resid norm 1.633671715659e+00 true resid norm 3.959508749664e+07 ||r(i)||/||b|| 1.659426685264e-02 12 KSP preconditioned resid norm 1.633671715659e+00 true resid norm 3.959508749664e+07 ||r(i)||/||b|| 1.659426685264e-02 0 KSP preconditioned resid norm 4.028400414901e+03 true resid norm 4.684646000717e+08 ||r(i)||/||b|| 1.000000000000e+00 2 KSP preconditioned resid norm 3.859648344440e+00 true resid norm 9.578310406856e+07 ||r(i)||/||b|| 4.014261589958e-02 1 SNES Function norm 6.245293405655e+10 1 SNES Function norm 6.245293405655e+10 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 0 SNES Function norm 2.097192603535e+10 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 SNES Function norm 2.097192603535e+10 1 KSP preconditioned resid norm 7.513114811690e-02 true resid norm 2.515581080270e+07 ||r(i)||/||b|| 5.369842416878e-02 3 KSP preconditioned resid norm 3.133252984070e+00 true resid norm 7.649343929054e+07 ||r(i)||/||b|| 3.205833410953e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 2 KSP preconditioned resid norm 3.361950980934e-02 true resid norm 2.589885319795e+07 ||r(i)||/||b|| 5.528454699457e-02 4 KSP preconditioned resid norm 2.753196206679e+00 true resid norm 6.468496966917e+07 ||r(i)||/||b|| 2.710941472566e-02 1 SNES Function norm 5.194818889106e+07 37 TS dt 1.4811e-06 time 0.00709316 copy! Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Write output at step= 37! 
Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 SNES Function norm 3.786081856480e+09 5 KSP preconditioned resid norm 2.500754637621e+00 true resid norm 5.655851544362e+07 ||r(i)||/||b|| 2.370362480296e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 6 KSP preconditioned resid norm 2.263885927226e+00 true resid norm 5.368792420993e+07 ||r(i)||/||b|| 2.250056250487e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 7 KSP preconditioned resid norm 2.104083158879e+00 true resid norm 5.002745241660e+07 ||r(i)||/||b|| 2.096646194883e-02 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 8 KSP preconditioned resid norm 1.976241956139e+00 true resid norm 4.707835526096e+07 ||r(i)||/||b|| 1.973049788689e-02 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 4.670669894784e+05 true resid norm 2.097192603535e+10 ||r(i)||/||b|| 1.000000000000e+00 0 KSP preconditioned resid norm 4.670669894784e+05 true resid norm 2.097192603535e+10 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 9 KSP preconditioned resid norm 1.870078692623e+00 true resid norm 4.468270947214e+07 ||r(i)||/||b|| 1.872648481311e-02 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 2.135766802417e+00 true resid norm 8.990660342789e+07 ||r(i)||/||b|| 4.286997926482e-03 1 KSP preconditioned resid norm 2.135766802417e+00 true resid norm 8.990660342789e+07 ||r(i)||/||b|| 4.286997926482e-03 Linear fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_0_ solve converged due to CONVERGED_ITS iterations 1 Linear fieldsplit_2_ solve converged due to CONVERGED_ITS iterations 1 0 KSP preconditioned resid norm 1.698771295747e+05 true resid norm 2.386070312612e+09 
________________________________
From: Barry Smith
Sent: Monday, March 22, 2021 1:56 PM
To: Sepideh Kavousi
Cc: petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] PF+Navier stokes

   Singular systems come up in solving PDEs almost always due to issues related to boundary conditions. For example, all Neumann (natural) boundary conditions can produce singular systems. Direct factorizations generically will eventually hit a zero pivot in such cases, and there is no universally acceptable approach for what to do at that point to recover. If you think your operator is singular, you should start by using MatSetNullSpace(); it won't "cure" the problem, but it is the tool we use to manage null spaces in operators.

On Mar 22, 2021, at 9:04 AM, Sepideh Kavousi wrote:

Hello,
I want to solve PF solidification + Navier-Stokes using a finite difference method, and I have a strange problem. My code runs fine for some system sizes and fails for others. When I run with the following options:

mpirun -np 2 ./one.out -ts_monitor -snes_fd_color -ts_max_snes_failures -1 -ts_type bdf -ts_bdf_adapt -pc_type bjacobi -snes_linesearch_type l2 -snes_type ksponly -ksp_type gmres -ksp_gmres_restart 1001 -sub_pc_type ilu -sub_ksp_type preonly -snes_monitor -ksp_monitor -snes_linesearch_monitor -ksp_monitor_true_residual -ksp_converged_reason -log_view

0 SNES Function norm 1.465357113711e+01
0 SNES Function norm 1.465357113711e+01
Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
               PC_FAILED due to SUBPC_ERROR
Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
               PC_FAILED due to SUBPC_ERROR
0 SNES Function norm 1.465357113711e+01
0 SNES Function norm 1.465357113711e+01
Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
               PC_FAILED due to SUBPC_ERROR
Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
               PC_FAILED due to SUBPC_ERROR
0 SNES Function norm 1.465357113711e+01
0 SNES Function norm 1.465357113711e+01
^C
Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
               PC_FAILED due to SUBPC_ERROR
0 SNES Function norm 1.465357113711e+01
Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
               PC_FAILED due to SUBPC_ERROR
0 SNES Function norm 1.465357113711e+01

Even setting pc_type to LU does not solve the problem.

0 TS dt 0.0001 time 0.
copy!
copy!
Write output at step= 0!
Write output at step= 0!
0 SNES Function norm 1.465357113711e+01
0 SNES Function norm 1.465357113711e+01
Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
               PC_FAILED due to FACTOR_NUMERIC_ZEROPIVOT
Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0
               PC_FAILED due to FACTOR_NUMERIC_ZEROPIVOT

I guess the problem is that in the mass conservation equation I used a forward discretization for u (the velocity in x), and in the x-momentum equation I used a forward discretization for p (the pressure) to ensure non-zero terms on the diagonal of the matrix. I tried to run it with valgrind but it did not output anything.

Does anyone have suggestions on how to solve this issue?

Best,
Sepideh

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/
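A minimal sketch of what Barry's MatSetNullSpace() suggestion looks like in code for the common all-Neumann case, where the null space is spanned by the constant vector (A below is a placeholder name for the assembled operator, and error checking is omitted):

    MatNullSpace nsp;
    /* PETSC_TRUE: the null space contains the constant vector, which is the
       typical situation when all boundary conditions are Neumann (natural) */
    MatNullSpaceCreate(PetscObjectComm((PetscObject)A), PETSC_TRUE, 0, NULL, &nsp);
    MatSetNullSpace(A, nsp);
    MatNullSpaceDestroy(&nsp);

With the null space attached, the Krylov solver removes it from the iterates; a direct LU factorization of a singular matrix can still hit a zero pivot, which is consistent with the FACTOR_NUMERIC_ZEROPIVOT message above.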
From salazardetro1 at llnl.gov  Tue Mar 23 16:17:38 2021
From: salazardetro1 at llnl.gov (Salazar De Troya, Miguel)
Date: Tue, 23 Mar 2021 21:17:38 +0000
Subject: [petsc-users] Local Discontinuous Galerkin with PETSc TS
In-Reply-To:
References: <3EE29E70-8ECF-4842-99DC-30E867769875@llnl.gov> <775766A0-D6D6-4007-888C-A261A139941F@petsc.dev>
Message-ID: <1F5BC542-F685-4DE0-8D40-43087DB3D1B0@llnl.gov>

Ok, I will investigate implementing it using SLATE, thanks.

Miguel

From: Matthew Knepley
Date: Tuesday, March 23, 2021 at 12:57 PM
To: "Salazar De Troya, Miguel"
Cc: Barry Smith, "Jorti, Zakariae via petsc-users"
Subject: Re: [petsc-users] Local Discontinuous Galerkin with PETSc TS

On Tue, Mar 23, 2021 at 11:54 AM Salazar De Troya, Miguel wrote:

    The calculation of p1 and p2 is done by solving an element-wise local problem using u^n. I guess I could embed this calculation inside the calculation for G = H(p1, p2). However, I am hoping to be able to solve the problem using firedrake-ts so the formulation is all clearly in one place and in variational form. Reading the manual, Section 2.5.2 (DAE formulations), the Hessenberg index-1 DAE case seems to be what I need, although it is not clear to me how one can achieve this with an IMEX scheme. If I have:

I am almost certain that you do not want to do this. I am guessing the Firedrake guys will agree. Did they tell you to do this? If you had a large, nonlinear system for p1/p2, then a DAE would make sense. Since it is just element-wise elimination, you should roll it into the easy equation

    u' = H

Then you can use any integrator, as Barry says, in particular a nice symplectic integrator. My understanding is that SLATE is for exactly this kind of thing.

   Thanks,

      Matt

    F(U', U, t) = G(t, U)

        p1 = f(u_x)
        p2 = g(u_x)
        u' - H(p1, p2) = 0

    where U = (p1, p2, u), F(U', U, t) = [p1, p2, u' - H(p1, p2)], and G(t, U) = [f(u_x), g(u_x), 0], is there a solver strategy that will solve for p1 and p2 first and then use that to solve the last equation? The Jacobian for F in this formulation would be

        dF/dU = [[M,      0,      0        ],
                 [0,      M,      0        ],
                 [H'(p1), H'(p2), \sigma*M ]]

    where M is a mass matrix, H'(p1) is the Jacobian of H(p1, p2) w.r.t. p1 and H'(p2) the Jacobian of H(p1, p2) w.r.t. p2. H'(p1) and H'(p2) are unnecessary for the solver strategy I want to implement.

    Thanks
    Miguel

From: Barry Smith
Date: Monday, March 22, 2021 at 7:42 PM
To: Matthew Knepley
Cc: "Salazar De Troya, Miguel", "Jorti, Zakariae via petsc-users"
Subject: Re: [petsc-users] Local Discontinuous Galerkin with PETSc TS

    u_t = G(u)

   I don't see why you wouldn't just compute any needed u_x from the given u; then you can use any explicit or implicit TS solver trivially. For implicit methods it can automatically compute the Jacobian of G for you, or you can provide it directly. Explicit methods will just use the "old" u while implicit methods will use the new.

  Barry

On Mar 22, 2021, at 7:20 PM, Matthew Knepley wrote:

On Mon, Mar 22, 2021 at 7:53 PM Salazar De Troya, Miguel via petsc-users wrote:

    Hello

    I am interested in implementing the LDG method in "A local discontinuous Galerkin method for directly solving Hamilton-Jacobi equations" https://www.sciencedirect.com/science/article/pii/S0021999110005255. The equation is more or less of the form (for the 1D case):

        p1 = f(u_x)
        p2 = g(u_x)
        u_t = H(p1, p2)

    where typically one solves for p1 and p2 using the previous time step solution u and then plugs them into the third equation to obtain the next step solution.
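As a rough C-level illustration of the approach Matt and Barry describe, eliminating p1 and p2 pointwise inside the right-hand-side evaluation and handing TS a plain ODE, a sketch is given below. The callback body and the choice of TSRK are placeholders, not anything from Miguel's code; an IMEX variant would instead register an IFunction via TSSetIFunction() and pick a type such as TSARKIMEX.

    /* Hypothetical right-hand-side callback: U holds only u; p1 and p2 are
       eliminated element by element inside the evaluation.                 */
    static PetscErrorCode RHSFunction(TS ts, PetscReal t, Vec U, Vec F, void *ctx)
    {
      /* 1. reconstruct u_x from U (DG lift or finite differences, problem dependent)
         2. evaluate p1 = f(u_x) and p2 = g(u_x) locally
         3. fill F with H(p1, p2)                                                    */
      return 0;
    }

    /* ... later, with Vec u already created and holding u(t=0) ... */
    TS ts;
    TSCreate(PETSC_COMM_WORLD, &ts);
    TSSetRHSFunction(ts, NULL, RHSFunction, NULL);  /* u' = H(p1(u), p2(u)) */
    TSSetType(ts, TSRK);                            /* or choose -ts_type at run time */
    TSSetFromOptions(ts);
    TSSolve(ts, u);
    TSDestroy(&ts);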
    I am wondering if the TS infrastructure could be used to implement this solution scheme. Looking at the manual, I think one could set G(t, U) to the right-hand side in the above equations and F(t, u, u') = 0 to the left-hand side, although the first two equations would not have a time derivative. In that case, how could one take advantage of the operator-split scheme I mentioned? Maybe using some block preconditioners?

Hi Miguel,

I have a simple-minded way of understanding these TS things. My heuristic is that you put things in F that you expect to want at u^{n+1}, and things in G that you expect to want at u^n. It is not that simple, since you could for instance move F and G to the LHS and have Backward Euler, but it is my rule of thumb.

So, were you looking for an IMEX scheme? If so, which terms should be lagged? Also, from the equations above, it is hard to see why you need a solve to calculate p1/p2. It looks like just a forward application of an operator.

   Thanks,

      Matt

    I am trying to solve the Hamilton-Jacobi equation u_t - H(u_x) = 0. I welcome any suggestion for better methods.

    Thanks
    Miguel

    Miguel A. Salazar de Troya
    Postdoctoral Researcher, Lawrence Livermore National Laboratory
    B141 Rm: 1085-5
    Ph: 1(925) 422-6411

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/

From gohardoust at gmail.com  Tue Mar 23 17:23:30 2021
From: gohardoust at gmail.com (Mohammad Gohardoust)
Date: Tue, 23 Mar 2021 15:23:30 -0700
Subject: [petsc-users] Code speedup after upgrading
In-Reply-To:
References:
Message-ID:

Thanks Dave for your reply.

For sure PETSc is awesome :D

Yes, in both cases petsc was configured with --with-debugging=0, and fortunately I do have the old and new -log_view outputs, which I attached.

Best,
Mohammad

On Tue, Mar 23, 2021 at 1:37 AM Dave May wrote:

> Nice to hear!
> The answer is simple, PETSc is awesome :)
>
> Jokes aside, assuming both petsc builds were configured with
> --with-debugging=0, I don't think there is a definitive answer to your
> question with the information you provided.
>
> It could be as simple as one specific implementation you use was improved
> between petsc releases. Not being an Ubuntu expert, the change might be
> associated with using a different compiler, and/or a more efficient BLAS
> implementation (non-threaded vs threaded). However I doubt this is the
> origin of your 2x performance increase.
>
> If you really want to understand where the performance improvement
> originated from, you'd need to send to the email list the result of
> -log_view from both the old and new versions, running the exact same
> problem.
>
> From that info, we can see what implementations in PETSc are being used
> and where the time reduction is occurring. Knowing that, it should be
> clearer to provide an explanation for it.
>
> Thanks,
> Dave
>
> On Tue 23. Mar 2021 at 06:24, Mohammad Gohardoust wrote:
>
>> Hi,
>>
>> I am using a code which is based on petsc (and also parmetis).
>> Recently I made the following changes and now the code is running about
>> two times faster than before:
>>
>> - Upgraded Ubuntu 18.04 to 20.04
>> - Upgraded petsc 3.13.4 to 3.14.5
>> - This time I installed parmetis and metis directly via petsc with the
>>   --download-parmetis --download-metis flags instead of installing them
>>   separately and using --with-parmetis-include=... and
>>   --with-parmetis-lib=... (the version of the installed parmetis was 4.0.3 before)
>>
>> I was wondering what can possibly explain this speedup? Does anyone have
>> any suggestions?
>>
>> Thanks,
>> Mohammad
>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: run_new.py   Type: text/x-python   Size: 18749 bytes   Desc: not available
-------------- next part --------------
A non-text attachment was scrubbed...
Name: run_old.py   Type: text/x-python   Size: 19545 bytes   Desc: not available

From junchao.zhang at gmail.com  Tue Mar 23 20:07:52 2021
From: junchao.zhang at gmail.com (Junchao Zhang)
Date: Tue, 23 Mar 2021 20:07:52 -0500
Subject: [petsc-users] Code speedup after upgrading
In-Reply-To:
References:
Message-ID:

In the new log, I saw

Summary of Stages:   ----- Time ------  ----- Flop ------  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total    Count   %Total     Avg        %Total    Count   %Total
 0:      Main Stage: 5.4095e+00   2.3%  4.3700e+03   0.0%  4.764e+05   3.0%  3.135e+02       1.0%  2.244e+04  12.6%
 1: Solute_Assembly: 1.3977e+02  59.4%  7.3353e+09   4.6%  3.263e+06  20.7%  1.278e+03      26.9%  1.059e+04   6.0%

But I didn't see any event in this stage with a cost close to 140s. What happened?

--- Event Stage 1: Solute_Assembly

BuildTwoSided      3531 1.0 2.8025e+0026.3 0.00e+00 0.0 3.6e+05 4.0e+00 3.5e+03  1  0  2  0  2   1  0 11  0 33     0
BuildTwoSidedF     3531 1.0 2.8678e+0013.2 0.00e+00 0.0 7.1e+05 3.6e+03 3.5e+03  1  0  5 17  2   1  0 22 62 33     0
VecScatterBegin    7062 1.0 7.1911e-02 1.9 0.00e+00 0.0 7.1e+05 3.5e+02 0.0e+00  0  0  5  2  0   0  0 22  6  0     0
VecScatterEnd      7062 1.0 2.1248e-01 3.0 1.60e+06 2.7 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    73
SFBcastOpBegin     3531 1.0 2.6516e-02 2.4 0.00e+00 0.0 3.6e+05 3.5e+02 0.0e+00  0  0  2  1  0   0  0 11  3  0     0
SFBcastOpEnd       3531 1.0 9.5041e-02 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFReduceBegin      3531 1.0 3.8955e-02 2.1 0.00e+00 0.0 3.6e+05 3.5e+02 0.0e+00  0  0  2  1  0   0  0 11  3  0     0
SFReduceEnd        3531 1.0 1.3791e-01 3.9 1.60e+06 2.7 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   112
SFPack             7062 1.0 6.5591e-03 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFUnpack           7062 1.0 7.4186e-03 2.1 1.60e+06 2.7 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  2080
MatAssemblyBegin   3531 1.0 4.7846e+00 1.1 0.00e+00 0.0 7.1e+05 3.6e+03 3.5e+03  2  0  5 17  2   3  0 22 62 33     0
MatAssemblyEnd     3531 1.0 1.5468e+00 2.7 1.68e+07 2.7 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  2  0  0  0   104
MatZeroEntries     3531 1.0 3.0998e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0

--Junchao Zhang
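For readers following the log: Solute_Assembly is not a built-in PETSc stage; it is registered by the application. A minimal sketch of how such a -log_view stage is typically set up follows (the stage name matches the log above, but where the push/pop sit in Mohammad's code is an assumption):

    PetscLogStage solute_stage;
    PetscLogStageRegister("Solute_Assembly", &solute_stage);

    PetscLogStagePush(solute_stage);
    /* work done here, e.g. MatZeroEntries / MatSetValues /
       MatAssemblyBegin / MatAssemblyEnd, is charged to Event Stage 1 */
    PetscLogStagePop();

The 1.3977e+02 s (59.4%) reported for the stage is the wall-clock time between the push and the pop, so it also includes application code that is not instrumented as a PETSc event, which is why the PETSc events listed under the stage do not add up to 140 s.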
From knepley at gmail.com  Tue Mar 23 20:30:01 2021
From: knepley at gmail.com (Matthew Knepley)
Date: Tue, 23 Mar 2021 21:30:01 -0400
Subject: [petsc-users] Code speedup after upgrading
In-Reply-To:
References:
Message-ID:

On Tue, Mar 23, 2021 at 9:08 PM Junchao Zhang wrote:

> In the new log, I saw
>
> Summary of Stages:   ----- Time ------  ----- Flop ------  --- Messages ---  -- Message Lengths --  -- Reductions --
>                         Avg     %Total     Avg     %Total    Count   %Total     Avg        %Total    Count   %Total
>  0:      Main Stage: 5.4095e+00   2.3%  4.3700e+03   0.0%  4.764e+05   3.0%  3.135e+02       1.0%  2.244e+04  12.6%
>  1: Solute_Assembly: 1.3977e+02  59.4%  7.3353e+09   4.6%  3.263e+06  20.7%  1.278e+03      26.9%  1.059e+04   6.0%
>
> But I didn't see any event in this stage with a cost close to 140s. What
> happened?

This is true, but all the PETSc operations are speeding up by a factor of 2x.
It is hard to believe these were run on the same machine.
For example, VecScale speeds up!?! So it is not network, or optimizations.
I cannot explain this.
   Matt
>>>> >>>> Thanks, >>>> Mohammad >>>> >>> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Tue Mar 23 20:49:16 2021 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 23 Mar 2021 20:49:16 -0500 Subject: [petsc-users] Code speedup after upgrading In-Reply-To: References: Message-ID: <5b1891-509d-97e0-f7c5-8e543c22ab8@mcs.anl.gov> On Tue, 23 Mar 2021, Matthew Knepley wrote: > On Tue, Mar 23, 2021 at 9:08 PM Junchao Zhang > wrote: > > > In the new log, I saw > > > > Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- > > Avg %Total Avg %Total Count %Total Avg %Total Count %Total > > 0: Main Stage: 5.4095e+00 2.3% 4.3700e+03 0.0% 4.764e+05 3.0% 3.135e+02 1.0% 2.244e+04 12.6% 1: Solute_Assembly: 1.3977e+02 59.4% 7.3353e+09 4.6% 3.263e+06 20.7% 1.278e+03 26.9% 1.059e+04 6.0% > > > > > > But I didn't see any event in this stage had a cost close to 140s. What > > happened? > > > > This is true, but all the PETSc operations are speeding up by a factor 2x. > It is hard to believe these were run on the same machine. > For example, VecScale speeds up!?! So it is not network, or optimizations. > I cannot explain this. * Using C compiler: /home/mohammad/Programs/petsc/arch-linux-c-opt/bin/mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -Ofast -march=native -mtune=native Perhaps the CPU is new enough that '-march=native -mtune=native' makes a difference between '18.04 to 20.04'? You can build 3.13.4 again and see if the numbers are similar to the old or new numbers you currently have.. * --download-fblaslapack --download-openblas You should use one or the other - but not both. Perhaps one is using openblas in thread mode [vs single thread for the other]? 
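If the new build does link OpenBLAS, one quick experiment is to force it to a single thread (OPENBLAS_NUM_THREADS=1 in the environment) and rerun the same timing. The small check below is only a sketch: it assumes the OpenBLAS-specific extensions openblas_get_num_threads()/openblas_set_num_threads() are available, which is not the case for the reference BLAS or fblaslapack.

#include <petscsys.h>

/* Sketch only: query/force the OpenBLAS thread count at runtime.
   These two functions are OpenBLAS extensions, not standard BLAS. */
extern int  openblas_get_num_threads(void);
extern void openblas_set_num_threads(int);

int main(int argc,char **argv)
{
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;
  ierr = PetscPrintf(PETSC_COMM_SELF,"OpenBLAS reports %d thread(s)\n",openblas_get_num_threads());CHKERRQ(ierr);
  openblas_set_num_threads(1); /* single-threaded BLAS for a like-for-like comparison */
  ierr = PetscFinalize();
  return ierr;
}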
Satish > > Matt > > --- Event Stage 1: Solute_Assembly > > > > BuildTwoSided 3531 1.0 2.8025e+0026.3 0.00e+00 0.0 3.6e+05 4.0e+00 3.5e+03 1 0 2 0 2 1 0 11 0 33 0 > > BuildTwoSidedF 3531 1.0 2.8678e+0013.2 0.00e+00 0.0 7.1e+05 3.6e+03 3.5e+03 1 0 5 17 2 1 0 22 62 33 0 > > VecScatterBegin 7062 1.0 7.1911e-02 1.9 0.00e+00 0.0 7.1e+05 3.5e+02 0.0e+00 0 0 5 2 0 0 0 22 6 0 0 > > VecScatterEnd 7062 1.0 2.1248e-01 3.0 1.60e+06 2.7 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 73 > > SFBcastOpBegin 3531 1.0 2.6516e-02 2.4 0.00e+00 0.0 3.6e+05 3.5e+02 0.0e+00 0 0 2 1 0 0 0 11 3 0 0 > > SFBcastOpEnd 3531 1.0 9.5041e-02 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > SFReduceBegin 3531 1.0 3.8955e-02 2.1 0.00e+00 0.0 3.6e+05 3.5e+02 0.0e+00 0 0 2 1 0 0 0 11 3 0 0 > > SFReduceEnd 3531 1.0 1.3791e-01 3.9 1.60e+06 2.7 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 112 > > SFPack 7062 1.0 6.5591e-03 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > SFUnpack 7062 1.0 7.4186e-03 2.1 1.60e+06 2.7 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2080 > > MatAssemblyBegin 3531 1.0 4.7846e+00 1.1 0.00e+00 0.0 7.1e+05 3.6e+03 3.5e+03 2 0 5 17 2 3 0 22 62 33 0 > > MatAssemblyEnd 3531 1.0 1.5468e+00 2.7 1.68e+07 2.7 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 2 0 0 0 104 > > MatZeroEntries 3531 1.0 3.0998e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > > > > --Junchao Zhang > > > > > > > > On Tue, Mar 23, 2021 at 5:24 PM Mohammad Gohardoust > > wrote: > > > >> Thanks Dave for your reply. > >> > >> For sure PETSc is awesome :D > >> > >> Yes, in both cases petsc was configured with --with-debugging=0 and > >> fortunately I do have the old and new -log-veiw outputs which I attached. > >> > >> Best, > >> Mohammad > >> > >> On Tue, Mar 23, 2021 at 1:37 AM Dave May wrote: > >> > >>> Nice to hear! > >>> The answer is simple, PETSc is awesome :) > >>> > >>> Jokes aside, assuming both petsc builds were configured with > >>> ?with-debugging=0, I don?t think there is a definitive answer to your > >>> question with the information you provided. > >>> > >>> It could be as simple as one specific implementation you use was > >>> improved between petsc releases. Not being an Ubuntu expert, the change > >>> might be associated with using a different compiler, and or a more > >>> efficient BLAS implementation (non threaded vs threaded). However I doubt > >>> this is the origin of your 2x performance increase. > >>> > >>> If you really want to understand where the performance improvement > >>> originated from, you?d need to send to the email list the result of > >>> -log_view from both the old and new versions, running the exact same > >>> problem. > >>> > >>> From that info, we can see what implementations in PETSc are being used > >>> and where the time reduction is occurring. Knowing that, it should be > >>> clearer to provide an explanation for it. > >>> > >>> > >>> Thanks, > >>> Dave > >>> > >>> > >>> On Tue 23. Mar 2021 at 06:24, Mohammad Gohardoust > >>> wrote: > >>> > >>>> Hi, > >>>> > >>>> I am using a code which is based on petsc (and also parmetis). Recently > >>>> I made the following changes and now the code is running about two times > >>>> faster than before: > >>>> > >>>> - Upgraded Ubuntu 18.04 to 20.04 > >>>> - Upgraded petsc 3.13.4 to 3.14.5 > >>>> - This time I installed parmetis and metis directly via petsc by > >>>> --download-parmetis --download-metis flags instead of installing them > >>>> separately and using --with-parmetis-include=... 
and > >>>> --with-parmetis-lib=... (the version of installed parmetis was 4.0.3 before) > >>>> > >>>> I was wondering what can possibly explain this speedup? Does anyone > >>>> have any suggestions? > >>>> > >>>> Thanks, > >>>> Mohammad > >>>> > >>> > > From gohardoust at gmail.com Wed Mar 24 02:16:45 2021 From: gohardoust at gmail.com (Mohammad Gohardoust) Date: Wed, 24 Mar 2021 00:16:45 -0700 Subject: [petsc-users] Code speedup after upgrading In-Reply-To: References: Message-ID: So the code itself is a finite-element scheme and in stage 1 and 3 there are expensive loops over entire mesh elements which consume a lot of time. Mohammad On Tue, Mar 23, 2021 at 6:08 PM Junchao Zhang wrote: > In the new log, I saw > > Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total Count %Total Avg %Total Count %Total > 0: Main Stage: 5.4095e+00 2.3% 4.3700e+03 0.0% 4.764e+05 3.0% 3.135e+02 1.0% 2.244e+04 12.6% 1: Solute_Assembly: 1.3977e+02 59.4% 7.3353e+09 4.6% 3.263e+06 20.7% 1.278e+03 26.9% 1.059e+04 6.0% > > > But I didn't see any event in this stage had a cost close to 140s. What > happened? > > --- Event Stage 1: Solute_Assembly > > BuildTwoSided 3531 1.0 2.8025e+0026.3 0.00e+00 0.0 3.6e+05 4.0e+00 3.5e+03 1 0 2 0 2 1 0 11 0 33 0 > BuildTwoSidedF 3531 1.0 2.8678e+0013.2 0.00e+00 0.0 7.1e+05 3.6e+03 3.5e+03 1 0 5 17 2 1 0 22 62 33 0 > VecScatterBegin 7062 1.0 7.1911e-02 1.9 0.00e+00 0.0 7.1e+05 3.5e+02 0.0e+00 0 0 5 2 0 0 0 22 6 0 0 > VecScatterEnd 7062 1.0 2.1248e-01 3.0 1.60e+06 2.7 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 73 > SFBcastOpBegin 3531 1.0 2.6516e-02 2.4 0.00e+00 0.0 3.6e+05 3.5e+02 0.0e+00 0 0 2 1 0 0 0 11 3 0 0 > SFBcastOpEnd 3531 1.0 9.5041e-02 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > SFReduceBegin 3531 1.0 3.8955e-02 2.1 0.00e+00 0.0 3.6e+05 3.5e+02 0.0e+00 0 0 2 1 0 0 0 11 3 0 0 > SFReduceEnd 3531 1.0 1.3791e-01 3.9 1.60e+06 2.7 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 112 > SFPack 7062 1.0 6.5591e-03 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > SFUnpack 7062 1.0 7.4186e-03 2.1 1.60e+06 2.7 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2080 > MatAssemblyBegin 3531 1.0 4.7846e+00 1.1 0.00e+00 0.0 7.1e+05 3.6e+03 3.5e+03 2 0 5 17 2 3 0 22 62 33 0 > MatAssemblyEnd 3531 1.0 1.5468e+00 2.7 1.68e+07 2.7 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 2 0 0 0 104 > MatZeroEntries 3531 1.0 3.0998e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > --Junchao Zhang > > > > On Tue, Mar 23, 2021 at 5:24 PM Mohammad Gohardoust > wrote: > >> Thanks Dave for your reply. >> >> For sure PETSc is awesome :D >> >> Yes, in both cases petsc was configured with --with-debugging=0 and >> fortunately I do have the old and new -log-veiw outputs which I attached. >> >> Best, >> Mohammad >> >> On Tue, Mar 23, 2021 at 1:37 AM Dave May wrote: >> >>> Nice to hear! >>> The answer is simple, PETSc is awesome :) >>> >>> Jokes aside, assuming both petsc builds were configured with >>> ?with-debugging=0, I don?t think there is a definitive answer to your >>> question with the information you provided. >>> >>> It could be as simple as one specific implementation you use was >>> improved between petsc releases. Not being an Ubuntu expert, the change >>> might be associated with using a different compiler, and or a more >>> efficient BLAS implementation (non threaded vs threaded). However I doubt >>> this is the origin of your 2x performance increase. 
>>> >>> If you really want to understand where the performance improvement >>> originated from, you?d need to send to the email list the result of >>> -log_view from both the old and new versions, running the exact same >>> problem. >>> >>> From that info, we can see what implementations in PETSc are being used >>> and where the time reduction is occurring. Knowing that, it should be >>> clearer to provide an explanation for it. >>> >>> >>> Thanks, >>> Dave >>> >>> >>> On Tue 23. Mar 2021 at 06:24, Mohammad Gohardoust >>> wrote: >>> >>>> Hi, >>>> >>>> I am using a code which is based on petsc (and also parmetis). Recently >>>> I made the following changes and now the code is running about two times >>>> faster than before: >>>> >>>> - Upgraded Ubuntu 18.04 to 20.04 >>>> - Upgraded petsc 3.13.4 to 3.14.5 >>>> - This time I installed parmetis and metis directly via petsc by >>>> --download-parmetis --download-metis flags instead of installing them >>>> separately and using --with-parmetis-include=... and >>>> --with-parmetis-lib=... (the version of installed parmetis was 4.0.3 before) >>>> >>>> I was wondering what can possibly explain this speedup? Does anyone >>>> have any suggestions? >>>> >>>> Thanks, >>>> Mohammad >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From gohardoust at gmail.com Wed Mar 24 02:24:01 2021 From: gohardoust at gmail.com (Mohammad Gohardoust) Date: Wed, 24 Mar 2021 00:24:01 -0700 Subject: [petsc-users] Code speedup after upgrading In-Reply-To: <5b1891-509d-97e0-f7c5-8e543c22ab8@mcs.anl.gov> References: <5b1891-509d-97e0-f7c5-8e543c22ab8@mcs.anl.gov> Message-ID: I built petsc 3.13.4 and got results similar to the old ones. I am attaching the log-view output file here. Mohammad On Tue, Mar 23, 2021 at 6:49 PM Satish Balay via petsc-users < petsc-users at mcs.anl.gov> wrote: > On Tue, 23 Mar 2021, Matthew Knepley wrote: > > > On Tue, Mar 23, 2021 at 9:08 PM Junchao Zhang > > wrote: > > > > > In the new log, I saw > > > > > > Summary of Stages: ----- Time ------ ----- Flop ------ --- > Messages --- -- Message Lengths -- -- Reductions -- > > > Avg %Total Avg %Total Count > %Total Avg %Total Count %Total > > > 0: Main Stage: 5.4095e+00 2.3% 4.3700e+03 0.0% 4.764e+05 > 3.0% 3.135e+02 1.0% 2.244e+04 12.6% 1: Solute_Assembly: > 1.3977e+02 59.4% 7.3353e+09 4.6% 3.263e+06 20.7% 1.278e+03 > 26.9% 1.059e+04 6.0% > > > > > > > > > But I didn't see any event in this stage had a cost close to 140s. What > > > happened? > > > > > > > This is true, but all the PETSc operations are speeding up by a factor > 2x. > > It is hard to believe these were run on the same machine. > > For example, VecScale speeds up!?! So it is not network, or > optimizations. > > I cannot explain this. > > * Using C compiler: > /home/mohammad/Programs/petsc/arch-linux-c-opt/bin/mpicc -Wall > -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector > -fvisibility=hidden -Ofast -march=native -mtune=native > > Perhaps the CPU is new enough that '-march=native -mtune=native' makes a > difference between '18.04 to 20.04'? > > You can build 3.13.4 again and see if the numbers are similar to the old > or new numbers you currently have.. > > * --download-fblaslapack --download-openblas > > You should use one or the other - but not both. Perhaps one is using > openblas in thread mode [vs single thread for the other]? 
> > Satish > > > > > > Matt > > > > --- Event Stage 1: Solute_Assembly > > > > > > BuildTwoSided 3531 1.0 2.8025e+0026.3 0.00e+00 0.0 3.6e+05 > 4.0e+00 3.5e+03 1 0 2 0 2 1 0 11 0 33 0 > > > BuildTwoSidedF 3531 1.0 2.8678e+0013.2 0.00e+00 0.0 7.1e+05 > 3.6e+03 3.5e+03 1 0 5 17 2 1 0 22 62 33 0 > > > VecScatterBegin 7062 1.0 7.1911e-02 1.9 0.00e+00 0.0 7.1e+05 > 3.5e+02 0.0e+00 0 0 5 2 0 0 0 22 6 0 0 > > > VecScatterEnd 7062 1.0 2.1248e-01 3.0 1.60e+06 2.7 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 73 > > > SFBcastOpBegin 3531 1.0 2.6516e-02 2.4 0.00e+00 0.0 3.6e+05 > 3.5e+02 0.0e+00 0 0 2 1 0 0 0 11 3 0 0 > > > SFBcastOpEnd 3531 1.0 9.5041e-02 4.7 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > SFReduceBegin 3531 1.0 3.8955e-02 2.1 0.00e+00 0.0 3.6e+05 > 3.5e+02 0.0e+00 0 0 2 1 0 0 0 11 3 0 0 > > > SFReduceEnd 3531 1.0 1.3791e-01 3.9 1.60e+06 2.7 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 112 > > > SFPack 7062 1.0 6.5591e-03 2.5 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > SFUnpack 7062 1.0 7.4186e-03 2.1 1.60e+06 2.7 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2080 > > > MatAssemblyBegin 3531 1.0 4.7846e+00 1.1 0.00e+00 0.0 7.1e+05 > 3.6e+03 3.5e+03 2 0 5 17 2 3 0 22 62 33 0 > > > MatAssemblyEnd 3531 1.0 1.5468e+00 2.7 1.68e+07 2.7 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 1 2 0 0 0 104 > > > MatZeroEntries 3531 1.0 3.0998e-02 1.2 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > > > > > > > --Junchao Zhang > > > > > > > > > > > > On Tue, Mar 23, 2021 at 5:24 PM Mohammad Gohardoust < > gohardoust at gmail.com> > > > wrote: > > > > > >> Thanks Dave for your reply. > > >> > > >> For sure PETSc is awesome :D > > >> > > >> Yes, in both cases petsc was configured with --with-debugging=0 and > > >> fortunately I do have the old and new -log-veiw outputs which I > attached. > > >> > > >> Best, > > >> Mohammad > > >> > > >> On Tue, Mar 23, 2021 at 1:37 AM Dave May > wrote: > > >> > > >>> Nice to hear! > > >>> The answer is simple, PETSc is awesome :) > > >>> > > >>> Jokes aside, assuming both petsc builds were configured with > > >>> ?with-debugging=0, I don?t think there is a definitive answer to your > > >>> question with the information you provided. > > >>> > > >>> It could be as simple as one specific implementation you use was > > >>> improved between petsc releases. Not being an Ubuntu expert, the > change > > >>> might be associated with using a different compiler, and or a more > > >>> efficient BLAS implementation (non threaded vs threaded). However I > doubt > > >>> this is the origin of your 2x performance increase. > > >>> > > >>> If you really want to understand where the performance improvement > > >>> originated from, you?d need to send to the email list the result of > > >>> -log_view from both the old and new versions, running the exact same > > >>> problem. > > >>> > > >>> From that info, we can see what implementations in PETSc are being > used > > >>> and where the time reduction is occurring. Knowing that, it should be > > >>> clearer to provide an explanation for it. > > >>> > > >>> > > >>> Thanks, > > >>> Dave > > >>> > > >>> > > >>> On Tue 23. Mar 2021 at 06:24, Mohammad Gohardoust < > gohardoust at gmail.com> > > >>> wrote: > > >>> > > >>>> Hi, > > >>>> > > >>>> I am using a code which is based on petsc (and also parmetis). 
> Recently > > >>>> I made the following changes and now the code is running about two > times > > >>>> faster than before: > > >>>> > > >>>> - Upgraded Ubuntu 18.04 to 20.04 > > >>>> - Upgraded petsc 3.13.4 to 3.14.5 > > >>>> - This time I installed parmetis and metis directly via petsc by > > >>>> --download-parmetis --download-metis flags instead of installing > them > > >>>> separately and using --with-parmetis-include=... and > > >>>> --with-parmetis-lib=... (the version of installed parmetis was > 4.0.3 before) > > >>>> > > >>>> I was wondering what can possibly explain this speedup? Does anyone > > >>>> have any suggestions? > > >>>> > > >>>> Thanks, > > >>>> Mohammad > > >>>> > > >>> > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: run_new_v3.13.4.py Type: text/x-python Size: 18795 bytes Desc: not available URL: From snailsoar at hotmail.com Wed Mar 24 05:08:29 2021 From: snailsoar at hotmail.com (feng wang) Date: Wed, 24 Mar 2021 10:08:29 +0000 Subject: [petsc-users] Questions on matrix-free GMRES implementation In-Reply-To: <5B018B57-B679-4015-8097-042B7C6B9D38@petsc.dev> References: <584E6514-C3C6-469B-A256-5470811D8D52@petsc.dev> <38509E59-D27A-47C6-8D97-EAAEBFC15FBF@petsc.dev> , <5B018B57-B679-4015-8097-042B7C6B9D38@petsc.dev> Message-ID: Hi Barry, Thanks for your comments. It's very helpful. For your comments, I have a bit more questions 1. for your 1st comment " Yes, in some sense. So long as each process ....". * If I understand it correctly (hopefully) a parallel vector in petsc can hold discontinuous rows of data in a global array. If this is true, If I call "VecGetArray", it would create a copy in a continuous space if the data is not continuous, do some operations and petsc will figure out how to put updated values back to the right place in the global array? * This would generate an overhead. If I do the renumbering to make each process hold continuous rows, this overhead can be avoided when I call "VecGetArray"? 2. for your 2nd comment " The matrix and vectors the algebraic solvers see DO NOT have......." For the callback function of my shell matrix "mymult(Mat m ,Vec x, Vec y)", I need to get "x" for the halo elements to compute the non-linear function. My code will take care of other halo exchanges, but I am not sure how to use petsc to get the halo elements "x" in the shell matrix, could you please elaborate on this? some related examples or simple pesudo code would be great. Thanks, Feng ________________________________ From: Barry Smith Sent: 22 March 2021 1:28 To: feng wang Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation On Mar 21, 2021, at 6:22 PM, feng wang > wrote: Hi Barry, Thanks for your help, I really appreciate it. In the end I used a shell matrix to compute the matrix-vector product, it is clearer to me and there are more things under my control. I am now trying to do a parallel implementation, I have some questions on setting up parallel matrices and vectors for a user-defined partition, could you please provide some advice? Suppose I have already got a partition for 2 CPUs. Each cpu is assigned a list of elements and also their halo elements. 1. The global element index for each partition is not necessarily continuous, do I have to I re-order them to make them continuous? Yes, in some sense. 
So long as each process can march over ITS elements computing the function and Jacobian matrix-vector product it doesn't matter how you have named/numbered entries. But conceptually the first process has the first set of vector entries and the second the second set. 1. 2. When I set up the size of the matrix and vectors for each cpu, should I take into account the halo elements? The matrix and vectors the algebraic solvers see DO NOT have halo elements in their sizes. You will likely need a halo-ed work vector to do the matrix-free multiply from. The standard model is use VecScatterBegin/End to get the values from the non-halo-ed algebraic vector input to MatMult into a halo-ed one to do the local product. 1. In my serial version, when I initialize my RHS vector, I am not using VecSetValues, Instead I use VecGetArray/VecRestoreArray to assign the values. VecAssemblyBegin()/VecAssemblyEnd() is never used. would this still work for a parallel version? Yes, you can use Get/Restore but the input vector x will need to be, as noted above, scattered into a haloed version to get all the entries you will need to do the local part of the product. Thanks, Feng ________________________________ From: Barry Smith > Sent: 12 March 2021 23:40 To: feng wang > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation On Mar 12, 2021, at 9:37 AM, feng wang > wrote: Hi Matt, Thanks for your prompt response. Below are my two versions. one is buggy and the 2nd one is working. For the first one, I add the diagonal contribution to the true RHS (variable: rhs) and then set the base point, the callback function is somehow called twice afterwards to compute Jacobian. Do you mean "to compute the Jacobian matrix-vector product?" Is it only in the first computation of the product (for the given base vector) that it calls it twice or every matrix-vector product? It is possible there is a bug in our logic; run in the debugger with a break point in FormFunction_mf and each time the function is hit in the debugger type where or bt to get the stack frames from the calls. Send this. From this we can all see if it is being called excessively and why. For the 2nd one, I just call the callback function manually to recompute everything, the callback function is then called once as expected to compute the Jacobian. For me, both versions should do the same things. but I don't know why in the first one the callback function is called twice after I set the base point. what could possibly go wrong? The logic of how it is suppose to work is shown below. Thanks, Feng //This does not work fld->cnsv( iqs,iqe, q, aux, csv ); //add contribution of time-stepping for(iv=0; ivcnsv( iqs,iqe, q, aux, csv ); ierr = petsc_setcsv(petsc_csv); CHKERRQ(ierr); ierr = FormFunction_mf(this, petsc_csv, petsc_baserhs); //this is my callback function, now call it manually ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); Since you provide petsc_baserhs MatMFFD assumes (naturally) that you will keep the correct values in it. Hence for each new base value YOU need to compute the new values in petsc_baserhs. This approach gives you a bit more control over reusing the information in petsc_baserhs. 
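A minimal sketch of that first approach (the user keeps petsc_baserhs consistent with the current base point) for one outer iteration; petsc_A_mf, petsc_csv, petsc_baserhs and FormFunction_mf follow the snippets in this thread, while rhs, dx and user_ctx are hypothetical work objects:

#include <petscksp.h>

extern PetscErrorCode FormFunction_mf(void*,Vec,Vec); /* the callback given to MatMFFDSetFunction() */

/* Sketch only: one outer (Newton-like) iteration when the user maintains the
   base residual themselves.  Sign conventions and any diagonal shift follow
   the user's own residual definition. */
PetscErrorCode OuterIteration(KSP ksp,Mat petsc_A_mf,Vec petsc_csv,Vec petsc_baserhs,
                              Vec rhs,Vec dx,void *user_ctx)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  /* recompute the residual at the current base point */
  ierr = FormFunction_mf(user_ctx,petsc_csv,petsc_baserhs);CHKERRQ(ierr);
  /* register the new base point and the matching base residual with the MFFD matrix */
  ierr = MatMFFDSetBase(petsc_A_mf,petsc_csv,petsc_baserhs);CHKERRQ(ierr);
  ierr = MatAssemblyBegin(petsc_A_mf,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(petsc_A_mf,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  /* linear solve for the update; copy first so petsc_baserhs stays untouched
     for the finite-difference products */
  ierr = VecCopy(petsc_baserhs,rhs);CHKERRQ(ierr);
  ierr = VecScale(rhs,-1.0);CHKERRQ(ierr);
  ierr = KSPSolve(ksp,rhs,dx);CHKERRQ(ierr);
  /* simple full update; a CFL ramp or line search would go here instead */
  ierr = VecAXPY(petsc_csv,1.0,dx);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}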
If you would prefer that MatMFFD recomputes the base values, as needed, then you call FormFunction_mf(this, petsc_csv, NULL); and PETSc will allocate a vector and fill it up as needed by calling your FormFunction_mf() But you need to call MatAssemblyBegin/End each time you the base input vector this, petsc_csv values change. For example MatAssemblyBegin(petsc_A_mf,...) MatAssemblyEnd(petsc_A_mf,...) KSPSolve() ________________________________ From: Matthew Knepley > Sent: 12 March 2021 15:08 To: feng wang > Cc: Barry Smith >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation On Fri, Mar 12, 2021 at 9:55 AM feng wang > wrote: Hi Mat, Thanks for your reply. I will try the parallel implementation. I've got a serial matrix-free GMRES working, but I would like to know why my initial version of matrix-free implementation does not work and there is still something I don't understand. I did some debugging and find that the callback function to compute the RHS for the matrix-free matrix is called twice by Petsc when it computes the finite difference Jacobian, but it should only be called once. I don't know why, could you please give some advice? F is called once to calculate the base point and once to get the perturbation. The base point is not recalculated, so if you do many iterates, it is amortized. Thanks, Matt Thanks, Feng ________________________________ From: Matthew Knepley > Sent: 12 March 2021 12:05 To: feng wang > Cc: Barry Smith >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation On Fri, Mar 12, 2021 at 6:02 AM feng wang > wrote: Hi Barry, Thanks for your advice. You are right on this. somehow there is some inconsistency when I compute the right hand side (true RHS + time-stepping contribution to the diagonal matrix) to compute the finite difference Jacobian. If I just use the call back function to recompute my RHS before I call MatMFFDSetBase, then it works like a charm. But now I end up with computing my RHS three times. 1st time is to compute the true RHS, the rest two is for computing finite difference Jacobian. In my previous buggy version, I only compute RHS twice. If possible, could you elaborate on your comments "Also be careful about petsc_baserhs", so I may possibly understand what was going on with my buggy version. Our FD implementation is simple. It approximates the action of the Jacobian as J(b) v = (F(b + h v) - F(b)) / h ||v|| where h is some small parameter and b is the base vector, namely the one that you are linearizing around. In a Newton step, b is the previous solution and v is the proposed solution update. Besides, for a parallel implementation, my code already has its own partition method, is it possible to allow petsc read in a user-defined partition? if not what is a better way to do this? Sure https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecSetSizes.html Thanks, Matt Many thanks, Feng ________________________________ From: Barry Smith > Sent: 11 March 2021 22:15 To: feng wang > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation Feng, The first thing to check is that for each linear solve that involves a new operator (values in the base vector) the MFFD matrix knows it is using a new operator. The easiest way is to call MatMFFDSetBase() before each solve that involves a new operator (new values in the base vector). 
Also be careful about petsc_baserhs, when you change the base vector's values you also need to change the petsc_baserhs values to the function evaluation at that point. If that is correct I would check with a trivial function evaluator to make sure the infrastructure is all set up correctly. For examples use for the matrix free a 1 4 1 operator applied matrix free. Barry On Mar 11, 2021, at 7:35 AM, feng wang > wrote: Dear All, I am new to petsc and trying to implement a matrix-free GMRES. I have assembled an approximate Jacobian matrix just for preconditioning. After reading some previous questions on this topic, my approach is: the matrix-free matrix is created as: ierr = MatCreateMFFD(*A_COMM_WORLD, iqe*blocksize, iqe*blocksize, PETSC_DETERMINE, PETSC_DETERMINE, &petsc_A_mf); CHKERRQ(ierr); ierr = MatMFFDSetFunction(petsc_A_mf, FormFunction_mf, this); CHKERRQ(ierr); KSP linear operator is set up as: ierr = KSPSetOperators(petsc_ksp, petsc_A_mf, petsc_A_pre); CHKERRQ(ierr); //petsc_A_pre is my assembled pre-conditioning matrix Before calling KSPSolve, I do: ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); //petsc_csv is the flow states, petsc_baserhs is the pre-computed right hand side The call back function is defined as: PetscErrorCode cFdDomain::FormFunction_mf(void *ctx, Vec in_vec, Vec out_vec) { PetscErrorCode ierr; cFdDomain *user_ctx; cout << "FormFunction_mf called\n"; //in_vec: flow states //out_vec: right hand side + diagonal contributions from CFL number user_ctx = (cFdDomain*)ctx; //get perturbed conservative variables from petsc user_ctx->petsc_getcsv(in_vec); //get new right side user_ctx->petsc_fd_rhs(); //set new right hand side to the output vector user_ctx->petsc_setrhs(out_vec); ierr = 0; return ierr; } The linear system I am solving is (J+D)x=RHS. J is the Jacobian matrix. D is a diagonal matrix and it is used to stabilise the solution at the start but reduced gradually when the solution moves on to recover Newton's method. I add D*x to the true right side when non-linear function is computed to work out finite difference Jacobian, so when finite difference is used, it actually computes (J+D)*dx. The code runs but diverges in the end. If I don't do matrix-free and use my approximate Jacobian matrix, GMRES works. So something is wrong with my matrix-free implementation. Have I missed something in my implementation? Besides, is there a way to check if the finite difference Jacobian matrix is computed correctly in a matrix-free implementation? Thanks for your help in advance. Feng -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.huysegoms at fz-juelich.de Wed Mar 24 05:19:13 2021 From: m.huysegoms at fz-juelich.de (Marcel Huysegoms) Date: Wed, 24 Mar 2021 11:19:13 +0100 Subject: [petsc-users] Singular jocabian using SNES In-Reply-To: References: Message-ID: thanks a lot for your suggestion! The TAO interface seems to be perfectly suited for the problem and I wasn't aware of its existence. However, I believe there is a *wrapper function missing in the petsc4py library*. 
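In callback form, the shift just described amounts to handing MatMFFDSetFunction() a function that evaluates G(x) = F(x) + D x, so that the finite-difference quotient automatically approximates (J + D) v. A minimal sketch with hypothetical names (TrueResidual, the diagonal vector diag and the work vector work are not from the code in this thread):

#include <petscmat.h>

/* Sketch only: shifted residual for the MFFD callback.
   diag holds the entries of the diagonal matrix D (e.g. the CFL / pseudo-time term). */
extern PetscErrorCode TrueResidual(void *ctx,Vec x,Vec f); /* F(x): the physical residual */

typedef struct { void *user; Vec diag; Vec work; } ShiftCtx;

PetscErrorCode ShiftedResidual_mf(void *ctx,Vec x,Vec g)
{
  ShiftCtx      *sc = (ShiftCtx*)ctx;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = TrueResidual(sc->user,x,g);CHKERRQ(ierr);            /* g    = F(x)       */
  ierr = VecPointwiseMult(sc->work,sc->diag,x);CHKERRQ(ierr); /* work = D x        */
  ierr = VecAXPY(g,1.0,sc->work);CHKERRQ(ierr);               /* g    = F(x) + D x */
  PetscFunctionReturn(0);
}

Because the same shifted function is used for the base residual and for the perturbed evaluations, the finite-difference product approximates (J + D) v with no further changes to the MFFD setup.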
When I perform (from the attached python script): tao = PETSc.TAO() tao.create(PETSc.COMM_WORLD) tao.setType("brgn") tao.setFromOptions() tao.setResidual(fill_function, f,args=(points_original,)) tao.setJacobian(fill_jacobian, A,args=(points_original,)) tao.solve(x) ... I get the following error message petsc4py.PETSc.Error: error code 58 [0] TaoSolve() line 215 in /opt/petsc/src/tao/interface/taosolver.c [0] TaoSetUp() line 269 in /opt/petsc/src/tao/interface/taosolver.c [0] TaoSetUp_BRGN() line 242 in /opt/petsc/src/tao/leastsquares/impls/brgn/brgn.c [0] Operation done in wrong order [0] *TaoSetResidualJacobianRoutine() must be called before setup!* The TAO documentation states, for solving LLSQ problems (i.e. using BRGN) the two functions "setResidualRoutine()" and "setJacobianResidualRoutine()" need to be called. The first one is invoked by petsc4py.setResidual(). However there is no petsc4py function that invokes the second routine. Looking into the source code of petsc4py (TAO.pyx), there is only setJacobian() which invokes ToaSetJacobianRoutine() but not ToaSetJacobian*Residual*Routine() https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Tao/TaoSetJacobianRoutine.html https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Tao/TaoSetJacobianResidualRoutine.html Am I missing something or is petsc4py actually lacking a wrapper function like setJacobianResidual()? Best regards, Marcel On 23.03.21 20:42, Matthew Knepley wrote: > On Tue, Mar 23, 2021 at 12:39 PM Marcel Huysegoms > > wrote: > > Hello everyone, > > I have a large system of nonlinear equations for which I'm trying > to find the optimal solution. > In order to get familiar with the SNES framework, I created a > standalone python script (see below), which creates a set of 2D > points and transforms them using an affine transformation. The > optimizer should then "move" the points back to their original > position given the jacobian and the residual vector. > > Now I have 2 questions regarding the usage of SNES. > > - As in my real application the jacobian often gets singular > (contains many rows of only zeros), especially when it approaches > the solution. This I simulate below by setting the 10th row equal > to zero in the fill-functions. I read > (https://scicomp.stackexchange.com/questions/21781/newtons > method-goes-to-zero-determinant-jacobian) that quasi-newton > approaches like BFGS might be able to deal with such a singular > jacobian, however I cannot figure out a combination of solvers > that converges in that case. > > I always get the message: /Nonlinear solve did not converge due to > DIVERGED_INNER iterations 0./ What can I do in order to make the > solver converge (to the least square minimum length solution)? Is > there a solver that can deal with such a situation? What do I need > to change in the example script? > > - In my real application I actually have an overdetermined MxN > system. I've read in the manual that the SNES package expects a > square jacobian. Is it possible to solve a system having more > equations than unknowns? > > > SNES is only for solving systems of nonlinear?equations. If you want > optimization (least-square, etc.) then you want to formulate your > problem in the TAO interface. It has quasi-Newton methods for those > problems, and other methods as well. That is where I would start. > > ? Thanks, > > ? ? 
?Matt > > Many thanks in advance, > Marcel > > ----------------------------------------- > > import sys > import petsc4py > import numpyas np > > petsc4py.init(sys.argv) > from petsc4pyimport PETSc > > def fill_function(snes, x, f, points_original): > x_values = x.getArray(readonly=True) > diff_vectors = points_original.ravel() - x_values > f_values = np.square(diff_vectors) > # f_values[10] = 0 f.setValues(np.arange(f_values.size), f_values) > f.assemble() > > def fill_jacobian(snes, x,J, P, points_original): > x_values = x.getArray(readonly=True) > points_original_flat = points_original.ravel() > deriv_values = -2*(points_original_flat - x_values) > # deriv_values[10] = 0 for iin range(x_values.size): > P.setValue(i, i, deriv_values[i]) > # print(deriv_values) P.assemble() > > # > --------------------------------------------------------------------------------------------- > if __name__ =='__main__': > # Initialize original grid points grid_dim =10 grid_spacing =100 num_points = grid_dim * grid_dim > points_original = np.zeros(shape=(num_points,2),dtype=np.float64) > for iin range(grid_dim): > for jin range(grid_dim): > points_original[i*grid_dim+j] = (i*grid_spacing, j*grid_spacing) > > # Compute transformed grid points affine_mat = np.array([[-0.5, -0.86,100], [0.86, -0.5,100]])# createAffineMatrix(120, 1, 1, 100, 100) points_transformed = np.matmul(affine_mat[:2,:2], points_original.T).T + affine_mat[:2,2] > > # Initialize PETSc objects num_unknown = points_transformed.size > mat_shape = (num_unknown, num_unknown) > A = PETSc.Mat() > A.createAIJ(size=mat_shape,comm=PETSc.COMM_WORLD) > A.setUp() > x, f = A.createVecs() > > options = PETSc.Options() > options.setValue("-snes_qn_type","lbfgs")# broyden/lbfgs options.setValue("-snes_qn_scale_type","none")# none, diagonal, scalar, jacobian, options.setValue("-snes_monitor","") > # options.setValue("-snes_view", "") options.setValue("-snes_converged_reason","") > options.setFromOptions() > > snes = PETSc.SNES() > snes.create(PETSc.COMM_WORLD) > snes.setType("qn")snes.setFunction(fill_function, f,args=(points_original,)) > snes.setJacobian(fill_jacobian, A,None,args=(points_original,)) > snes_pc = snes.getNPC()# Inner snes instance (newtonls by default!) # > snes_pc.setType("ngmres") snes.setFromOptions() > > ksp = snes_pc.getKSP() > ksp.setType("cg") > ksp.setTolerances(rtol=1e-10,max_it=40000) > pc = ksp.getPC() > pc.setType("asm") > ksp.setFromOptions() > > x.setArray(points_transformed.ravel()) > snes.solve(None, x) > > > > ------------------------------------------------------------------------------------------------ > ------------------------------------------------------------------------------------------------ > Forschungszentrum Juelich GmbH > 52425 Juelich > Sitz der Gesellschaft: Juelich > Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 > Vorsitzender des Aufsichtsrats: MinDir Volker Rieke > Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), > Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt > ------------------------------------------------------------------------------------------------ > ------------------------------------------------------------------------------------------------ > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. 
> -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -- Marcel Huysegoms Institut f?r Neurowissenschaften und Medizin (INM-1) Forschungszentrum J?lich GmbH 52425 J?lich Telefon: +49 2461 61 3678 Email: m.huysegoms at fz-juelich.de -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: tao_test.py Type: text/x-python Size: 2096 bytes Desc: not available URL: From junchao.zhang at gmail.com Wed Mar 24 17:23:33 2021 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Wed, 24 Mar 2021 17:23:33 -0500 Subject: [petsc-users] Code speedup after upgrading In-Reply-To: References: Message-ID: On Wed, Mar 24, 2021 at 2:17 AM Mohammad Gohardoust wrote: > So the code itself is a finite-element scheme and in stage 1 and 3 there > are expensive loops over entire mesh elements which consume a lot of time. > So these expensive loops must also take half time with newer petsc? And these loops do not call petsc routines? I think you can build two PETSc versions with the same configuration options, then run your code with one MPI rank to see if there is a difference. If they give the same performance, then scale to 2, 4, ... ranks and see what happens. > > Mohammad > > On Tue, Mar 23, 2021 at 6:08 PM Junchao Zhang > wrote: > >> In the new log, I saw >> >> Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- >> Avg %Total Avg %Total Count %Total Avg %Total Count %Total >> 0: Main Stage: 5.4095e+00 2.3% 4.3700e+03 0.0% 4.764e+05 3.0% 3.135e+02 1.0% 2.244e+04 12.6% 1: Solute_Assembly: 1.3977e+02 59.4% 7.3353e+09 4.6% 3.263e+06 20.7% 1.278e+03 26.9% 1.059e+04 6.0% >> >> >> But I didn't see any event in this stage had a cost close to 140s. What >> happened? >> >> --- Event Stage 1: Solute_Assembly >> >> BuildTwoSided 3531 1.0 2.8025e+0026.3 0.00e+00 0.0 3.6e+05 4.0e+00 3.5e+03 1 0 2 0 2 1 0 11 0 33 0 >> BuildTwoSidedF 3531 1.0 2.8678e+0013.2 0.00e+00 0.0 7.1e+05 3.6e+03 3.5e+03 1 0 5 17 2 1 0 22 62 33 0 >> VecScatterBegin 7062 1.0 7.1911e-02 1.9 0.00e+00 0.0 7.1e+05 3.5e+02 0.0e+00 0 0 5 2 0 0 0 22 6 0 0 >> VecScatterEnd 7062 1.0 2.1248e-01 3.0 1.60e+06 2.7 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 73 >> SFBcastOpBegin 3531 1.0 2.6516e-02 2.4 0.00e+00 0.0 3.6e+05 3.5e+02 0.0e+00 0 0 2 1 0 0 0 11 3 0 0 >> SFBcastOpEnd 3531 1.0 9.5041e-02 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFReduceBegin 3531 1.0 3.8955e-02 2.1 0.00e+00 0.0 3.6e+05 3.5e+02 0.0e+00 0 0 2 1 0 0 0 11 3 0 0 >> SFReduceEnd 3531 1.0 1.3791e-01 3.9 1.60e+06 2.7 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 112 >> SFPack 7062 1.0 6.5591e-03 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFUnpack 7062 1.0 7.4186e-03 2.1 1.60e+06 2.7 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2080 >> MatAssemblyBegin 3531 1.0 4.7846e+00 1.1 0.00e+00 0.0 7.1e+05 3.6e+03 3.5e+03 2 0 5 17 2 3 0 22 62 33 0 >> MatAssemblyEnd 3531 1.0 1.5468e+00 2.7 1.68e+07 2.7 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 2 0 0 0 104 >> MatZeroEntries 3531 1.0 3.0998e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> >> --Junchao Zhang >> >> >> >> On Tue, Mar 23, 2021 at 5:24 PM Mohammad Gohardoust >> wrote: >> >>> Thanks Dave for your reply. >>> >>> For sure PETSc is awesome :D >>> >>> Yes, in both cases petsc was configured with --with-debugging=0 and >>> fortunately I do have the old and new -log-veiw outputs which I attached. 
>>> >>> Best, >>> Mohammad >>> >>> On Tue, Mar 23, 2021 at 1:37 AM Dave May >>> wrote: >>> >>>> Nice to hear! >>>> The answer is simple, PETSc is awesome :) >>>> >>>> Jokes aside, assuming both petsc builds were configured with >>>> ?with-debugging=0, I don?t think there is a definitive answer to your >>>> question with the information you provided. >>>> >>>> It could be as simple as one specific implementation you use was >>>> improved between petsc releases. Not being an Ubuntu expert, the change >>>> might be associated with using a different compiler, and or a more >>>> efficient BLAS implementation (non threaded vs threaded). However I doubt >>>> this is the origin of your 2x performance increase. >>>> >>>> If you really want to understand where the performance improvement >>>> originated from, you?d need to send to the email list the result of >>>> -log_view from both the old and new versions, running the exact same >>>> problem. >>>> >>>> From that info, we can see what implementations in PETSc are being used >>>> and where the time reduction is occurring. Knowing that, it should be >>>> clearer to provide an explanation for it. >>>> >>>> >>>> Thanks, >>>> Dave >>>> >>>> >>>> On Tue 23. Mar 2021 at 06:24, Mohammad Gohardoust >>>> wrote: >>>> >>>>> Hi, >>>>> >>>>> I am using a code which is based on petsc (and also parmetis). >>>>> Recently I made the following changes and now the code is running about two >>>>> times faster than before: >>>>> >>>>> - Upgraded Ubuntu 18.04 to 20.04 >>>>> - Upgraded petsc 3.13.4 to 3.14.5 >>>>> - This time I installed parmetis and metis directly via petsc by >>>>> --download-parmetis --download-metis flags instead of installing them >>>>> separately and using --with-parmetis-include=... and >>>>> --with-parmetis-lib=... (the version of installed parmetis was 4.0.3 before) >>>>> >>>>> I was wondering what can possibly explain this speedup? Does anyone >>>>> have any suggestions? >>>>> >>>>> Thanks, >>>>> Mohammad >>>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From zjorti at lanl.gov Wed Mar 24 18:17:26 2021 From: zjorti at lanl.gov (Jorti, Zakariae) Date: Wed, 24 Mar 2021 23:17:26 +0000 Subject: [petsc-users] [EXTERNAL] Re: Question about periodic conditions In-Reply-To: <3CEA21BE-8019-46E0-A2A0-D3E2A887E8F8@gmail.com> References: <1cf6b948af3d47308e69115f1f8e543f@lanl.gov>, <3CEA21BE-8019-46E0-A2A0-D3E2A887E8F8@gmail.com> Message-ID: Hi Patrick, Thanks for your responses. As for the code, I was not granted permission to share it yet. So, I cannot send it to you for the moment. I apologize for that. I wanted to let you know that while I was testing my code, I discovered that when the periodic boundary conditions are activated, the coordinates accessed might be incorrect on one side of the boundary. 
Let me give you an example in cylindrical coordinates with a 3x3x3 DMStag mesh: PetscInt startr,startphi,startz,nr,nphi,nz,d; PetscInt er,ephi,ez,icErmphip[3]; DM dmCoorda, coordDA; Vec coordaLocal; PetscScalar ****arrCoord; PetscScalar surf; DMStagCreate3d(PETSC_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,3,3,3,PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,1,1,1,1,DMSTAG_STENCIL_BOX,1,NULL,NULL,NULL,&coordDA); DMSetFromOptions(coordDA); DMSetUp(coordDA); DMStagGetCorners(coordDA,&startr,&startphi,&startz,&nr,&nphi,&nz,NULL,NULL,NULL); DMGetCoordinateDM(coordDA,&dmCoorda); DMGetCoordinatesLocal(coordDA,&coordaLocal); DMStagVecGetArrayRead(dmCoorda,coordaLocal,&arrCoord); for (d=0; d< 3; ++d){ DMStagGetLocationSlot(dmCoorda,UP_LEFT,d,&icErmphip[d]); } er = 1; ez = 0; for (ephi=0; ephi< 3; ++ephi){ PetscPrintf(PETSC_COMM_WORLD,"Phi_p(%d,%d,%d) = %E\n",er,ephi,ez,(double)arrCoord[ez][ephi][er][icErmphip[1]); } When I execute this example, I get this output: Phi_p(1,0,0) = 2.094395E+00 Phi_p(1,1,0) = 4.188790E+00 Phi_p(1,2,0) = 0.000000E+00 Note here that the first two lines correspond to 2? / 3 and 4? / 3 respectively. Thus, nothing is wrong here. But the last line should rather give 2? instead of 0. I understand that degrees of freedom should be the same on both sides of the boundary, but should the coordinates not be preserved? Thank you. Best regards, Zakariae Jorti ________________________________ From: Patrick Sanan Sent: Tuesday, March 23, 2021 11:37:04 AM To: Jorti, Zakariae Cc: petsc-users at mcs.anl.gov Subject: [EXTERNAL] Re: Question about periodic conditions Hi Zakariae - sorry about the delay - responses inline below. I'd be curious to see your code (which you can send directly to me if you don't want to post it publicly), so I can give you more comments, as DMStag is a new component. Am 23.03.2021 um 00:54 schrieb Jorti, Zakariae >: Hi, I implemented a PETSc code to solve Maxwell's equations for the magnetic and electric fields (B and E) in a cylinder: 0 < r_min <= r <= r_max; with r_max > r_min phi_min = 0 <= r <= phi_max = 2 ? z_min <= z =< z_max; with z_max > z_min. I am using a PETSc staggered grid with the electric field E defined on edge centers and the magnetic field B defined on face centers. (dof0 = 0, dof1 = 1,dof2 = 1, dof3 = 0;). I have two versions of my code: 1 - A first version in which I set the boundary type to DM_BOUNDARY_NONE in the three directions r, phi and z 2- A second version in which I set the boundary type to DM_BOUNDARY_NONE in the r and z directions, and DM_BOUNDARY_PERIODIC in the phi direction. When I print the solution vector X, which contains both E and B components, I notice that the vector is shorter with the second version compared to the first one. Is it normal? Yes - with the periodic boundary conditions, there will be fewer points since there won't be the "extra" layer of faces and edges at phi = 2 * pi . If you consider a 1-d example with 1 dof on vertices and cells, with three elements, the periodic case looks like this, globally, x ---- x ---- x ---- as opposed to the non-periodic case, x ---- x ---- x ---- x Besides, I was wondering if I have to change the way I define the value of the solution on the boundary. What I am doing so far in both versions is something like: B_phi [phi = 0] = 1.0; B_phi [phi = 2?] = 1.0; E_z [r, phi = 0] = 1/r; E_z [r, phi = 2?] = 1/r; Assuming that values at phi = 0 should be the same as at phi=2? 
with the periodic boundary conditions, is it sufficient for example to have only the following boundary conditions: B_phi [phi = 0] = 1.0; E_z [r, phi = 0] = 1/r ? Yes - this is the intention, since the boundary at phi = 2 * pi is represented by the same entries in the global vector. Of course, you need to make sure that your continuous problem is well-posed, which in general could change when using different boundary conditions. Thank you. Best regards, Zakariae Jorti -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Mar 24 18:20:48 2021 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 Mar 2021 19:20:48 -0400 Subject: [petsc-users] [EXTERNAL] Re: Question about periodic conditions In-Reply-To: References: <1cf6b948af3d47308e69115f1f8e543f@lanl.gov> <3CEA21BE-8019-46E0-A2A0-D3E2A887E8F8@gmail.com> Message-ID: On Wed, Mar 24, 2021 at 7:17 PM Jorti, Zakariae via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi Patrick, > > > Thanks for your responses. > > As for the code, I was not granted permission to share it yet. So, I > cannot send it to you for the moment. I apologize for that. > > > I wanted to let you know that while I was testing my code, I discovered > that when the periodic boundary conditions are activated, the coordinates > accessed might be incorrect on one side of the boundary. > > Let me give you an example in cylindrical coordinates with a 3x3x3 > DMStag mesh: > > > > > > PetscInt startr,startphi,startz,nr,nphi,nz,d; > > PetscInt er,ephi,ez,icErmphip[3]; > > DM dmCoorda, coordDA; > > Vec coordaLocal; > > PetscScalar ****arrCoord; > > PetscScalar surf; > > > > DMStagCreate3d(PETSC_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,3,3,3,PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,1,1,1,1,DMSTAG_STENCIL_BOX,1,NULL,NULL,NULL,&coordDA); > > > DMSetFromOptions(coordDA); > > DMSetUp(coordDA); > > > > DMStagGetCorners(coordDA,&startr,&startphi,&startz,&nr,&nphi,&nz,NULL,NULL,NULL); > > > DMGetCoordinateDM(coordDA,&dmCoorda); > > DMGetCoordinatesLocal(coordDA,&coordaLocal); > > DMStagVecGetArrayRead(dmCoorda,coordaLocal,&arrCoord); > > > for (d=0; d< 3; ++d){ > > DMStagGetLocationSlot(dmCoorda,UP_LEFT,d,&icErmphip[d]); > > } > > > er = 1; ez = 0; > > for (ephi=0; ephi< 3; ++ephi){ > > PetscPrintf(PETSC_COMM_WORLD,"Phi_p(%d,%d,%d) = %E\n",er,ephi,ez,(double) > arrCoord[ez][ephi][er][icErmphip[1]); > > } > > > When I execute this example, I get this output: > > Phi_p(1,0,0) = 2.094395E+00 > > Phi_p(1,1,0) = 4.188790E+00 > > Phi_p(1,2,0) = 0.000000E+00 > > > Note here that the first two lines correspond to 2? / 3 and 4? / 3 > respectively. Thus, nothing is wrong here. > > But the last line should rather give 2? instead of 0. > > > I understand that degrees of freedom should be the same on both sides of > the boundary, but should the coordinates not be preserved? > > I don't think so. The circle has coordinates in [0, 2\pi), so the point at 2\pi is identified with the point at 0 and you must choose one, so we choose 0. Thanks, Matt > Thank you. > > Best regards, > > > Zakariae Jorti > ------------------------------ > *From:* Patrick Sanan > *Sent:* Tuesday, March 23, 2021 11:37:04 AM > *To:* Jorti, Zakariae > *Cc:* petsc-users at mcs.anl.gov > *Subject:* [EXTERNAL] Re: Question about periodic conditions > > Hi Zakariae - sorry about the delay - responses inline below. 
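One practical consequence of that identification: if user code needs the geometric value phi = 2 pi on the wrapped upper edge (for example to compute cell widths), it has to reconstruct it by adding the period. A small sketch in the spirit of the snippet above; the DOWN_LEFT slot icErmphim and the periodicity check are assumptions, not taken from the code in this thread:

/* Sketch only: recover phi = 2*pi on the wrapped upper edge of the last cell
   when the phi direction is periodic.  icErmphip is the UP_LEFT coordinate slot
   from the snippet above; icErmphim (DOWN_LEFT) is assumed here. */
DMBoundaryType btr,btphi,btz;
PetscReal      phi_lo,phi_hi;

DMStagGetBoundaryTypes(coordDA,&btr,&btphi,&btz);
phi_lo = PetscRealPart(arrCoord[ez][ephi][er][icErmphim[1]]); /* lower phi edge of the cell  */
phi_hi = PetscRealPart(arrCoord[ez][ephi][er][icErmphip[1]]); /* upper phi edge (wraps to 0) */
if (btphi == DM_BOUNDARY_PERIODIC && phi_hi <= phi_lo) phi_hi += 2.0*PETSC_PI;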
> > I'd be curious to see your code (which you can send directly to me if you > don't want to post it publicly), so I can give you more comments, as DMStag > is a new component. > > > Am 23.03.2021 um 00:54 schrieb Jorti, Zakariae : > > Hi, > > I implemented a PETSc code to solve Maxwell's equations for the magnetic > and electric fields (B and E) in a cylinder: > 0 < r_min <= r <= r_max; with r_max > r_min > phi_min = 0 <= r <= phi_max = 2 ? > z_min <= z =< z_max; with z_max > z_min. > > I am using a PETSc staggered grid with the electric field E defined on > edge centers and the magnetic field B defined on face centers. (dof0 = 0, > dof1 = 1,dof2 = 1, dof3 = 0;). > > > I have two versions of my code: > 1 - A first version in which I set the boundary type to DM_BOUNDARY_NONE > in the three directions r, phi and z > 2- A second version in which I set the boundary type to DM_BOUNDARY_NONE > in the r and z directions, and DM_BOUNDARY_PERIODIC in the phi direction. > > When I print the solution vector X, which contains both E and B > components, I notice that the vector is shorter with the second version > compared to the first one. > Is it normal? > > Yes - with the periodic boundary conditions, there will be fewer points > since there won't be the "extra" layer of faces and edges at phi = 2 * pi . > > If you consider a 1-d example with 1 dof on vertices and cells, with three > elements, the periodic case looks like this, globally, > > x ---- x ---- x ---- > > as opposed to the non-periodic case, > > x ---- x ---- x ---- x > > > > Besides, I was wondering if I have to change the way I define the value of > the solution on the boundary. What I am doing so far in both versions is > something like: > B_phi [phi = 0] = 1.0; > B_phi [phi = 2?] = 1.0; > E_z [r, phi = 0] = 1/r; > E_z [r, phi = 2?] = 1/r; > > Assuming that values at phi = 0 should be the same as at phi=2? with > the periodic boundary conditions, is it sufficient for example to have only > the following boundary conditions: > B_phi [phi = 0] = 1.0; > > E_z [r, phi = 0] = 1/r ? > > > Yes - this is the intention, since the boundary at phi = 2 * pi is > represented by the same entries in the global vector. > > Of course, you need to make sure that your continuous problem is > well-posed, which in general could change when using different boundary > conditions. > > Thank you. > Best regards, > > Zakariae Jorti > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Mar 24 19:03:15 2021 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 24 Mar 2021 19:03:15 -0500 Subject: [petsc-users] Questions on matrix-free GMRES implementation In-Reply-To: References: <584E6514-C3C6-469B-A256-5470811D8D52@petsc.dev> <38509E59-D27A-47C6-8D97-EAAEBFC15FBF@petsc.dev> <5B018B57-B679-4015-8097-042B7C6B9D38@petsc.dev> Message-ID: <151FDDB8-2384-4A3E-9B17-45318E2CC7CC@petsc.dev> > On Mar 24, 2021, at 5:08 AM, feng wang wrote: > > Hi Barry, > > Thanks for your comments. It's very helpful. For your comments, I have a bit more questions > > for your 1st comment " Yes, in some sense. So long as each process ....". > If I understand it correctly (hopefully) a parallel vector in petsc can hold discontinuous rows of data in a global array. 
If this is true, If I call "VecGetArray", it would create a copy in a continuous space if the data is not continuous, do some operations and petsc will figure out how to put updated values back to the right place in the global array? > This would generate an overhead. If I do the renumbering to make each process hold continuous rows, this overhead can be avoided when I call "VecGetArray"? GetArray does nothing except return the pointer to the data in the vector. It does not copy anything or reorder anything. Whatever order the numbers are in vector they are in the same order as in the array you obtain with VecGetArray. > for your 2nd comment " The matrix and vectors the algebraic solvers see DO NOT have......." For the callback function of my shell matrix "mymult(Mat m ,Vec x, Vec y)", I need to get "x" for the halo elements to compute the non-linear function. My code will take care of other halo exchanges, but I am not sure how to use petsc to get the halo elements "x" in the shell matrix, could you please elaborate on this? some related examples or simple pesudo code would be great. Basically all the parallel code in PETSc does this. How you need to set up the halo communication depends on how you are managing the assignment of degrees of freedom on each process and between processes. VecScatterCreate() is the tool you will use to tell PETSc how to get the correct values from one process to their halo-ed location on the process. It like everything in PETSc uses a number in the vectors of 0 ... n_0-1 on the first process, n_0, n_0+1, ... n_1-1 on the second etc. Since you are managing the partitioning and distribution of parallel data you must renumber the vector entry numbering in your data structures to match that shown above. Just do the numbering once after you have setup your distributed data and use it for the rest of the run. You might use the object from AOCreate to do the renumbering for you. Barry > Thanks, > Feng > > From: Barry Smith > > Sent: 22 March 2021 1:28 > To: feng wang > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation > > > >> On Mar 21, 2021, at 6:22 PM, feng wang > wrote: >> >> Hi Barry, >> >> Thanks for your help, I really appreciate it. >> >> In the end I used a shell matrix to compute the matrix-vector product, it is clearer to me and there are more things under my control. I am now trying to do a parallel implementation, I have some questions on setting up parallel matrices and vectors for a user-defined partition, could you please provide some advice? Suppose I have already got a partition for 2 CPUs. Each cpu is assigned a list of elements and also their halo elements. >> The global element index for each partition is not necessarily continuous, do I have to I re-order them to make them continuous? > > Yes, in some sense. So long as each process can march over ITS elements computing the function and Jacobian matrix-vector product it doesn't matter how you have named/numbered entries. But conceptually the first process has the first set of vector entries and the second the second set. >> >> When I set up the size of the matrix and vectors for each cpu, should I take into account the halo elements? > > The matrix and vectors the algebraic solvers see DO NOT have halo elements in their sizes. You will likely need a halo-ed work vector to do the matrix-free multiply from. 
The standard model is use VecScatterBegin/End to get the values from the non-halo-ed algebraic vector input to MatMult into a halo-ed one to do the local product. > >> In my serial version, when I initialize my RHS vector, I am not using VecSetValues, Instead I use VecGetArray/VecRestoreArray to assign the values. VecAssemblyBegin()/VecAssemblyEnd() is never used. would this still work for a parallel version? > > Yes, you can use Get/Restore but the input vector x will need to be, as noted above, scattered into a haloed version to get all the entries you will need to do the local part of the product. > > >> Thanks, >> Feng >> >> From: Barry Smith > >> Sent: 12 March 2021 23:40 >> To: feng wang > >> Cc: petsc-users at mcs.anl.gov > >> Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation >> >> >> >>> On Mar 12, 2021, at 9:37 AM, feng wang > wrote: >>> >>> Hi Matt, >>> >>> Thanks for your prompt response. >>> >>> Below are my two versions. one is buggy and the 2nd one is working. For the first one, I add the diagonal contribution to the true RHS (variable: rhs) and then set the base point, the callback function is somehow called twice afterwards to compute Jacobian. >> >> Do you mean "to compute the Jacobian matrix-vector product?" >> >> Is it only in the first computation of the product (for the given base vector) that it calls it twice or every matrix-vector product? >> >> It is possible there is a bug in our logic; run in the debugger with a break point in FormFunction_mf and each time the function is hit in the debugger type where or bt to get the stack frames from the calls. Send this. From this we can all see if it is being called excessively and why. >> >>> For the 2nd one, I just call the callback function manually to recompute everything, the callback function is then called once as expected to compute the Jacobian. For me, both versions should do the same things. but I don't know why in the first one the callback function is called twice after I set the base point. what could possibly go wrong? >> >> The logic of how it is suppose to work is shown below. >>> >>> Thanks, >>> Feng >>> >>> //This does not work >>> fld->cnsv( iqs,iqe, q, aux, csv ); >>> //add contribution of time-stepping >>> for(iv=0; iv>> { >>> for(iq=0; iq>> { >>> //use conservative variables here >>> rhs[iv][iq] = -rhs[iv][iq] + csv[iv][iq]*lhsa[nlhs-1][iq]/cfl; >>> } >>> } >>> ierr = petsc_setcsv(petsc_csv); CHKERRQ(ierr); >>> ierr = petsc_setrhs(petsc_baserhs); CHKERRQ(ierr); >>> ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); >>> >>> //This works >>> fld->cnsv( iqs,iqe, q, aux, csv ); >>> ierr = petsc_setcsv(petsc_csv); CHKERRQ(ierr); >>> ierr = FormFunction_mf(this, petsc_csv, petsc_baserhs); //this is my callback function, now call it manually >>> ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); >>> >>> >> Since you provide petsc_baserhs MatMFFD assumes (naturally) that you will keep the correct values in it. Hence for each new base value YOU need to compute the new values in petsc_baserhs. This approach gives you a bit more control over reusing the information in petsc_baserhs. >> >> If you would prefer that MatMFFD recomputes the base values, as needed, then you call FormFunction_mf(this, petsc_csv, NULL); and PETSc will allocate a vector and fill it up as needed by calling your FormFunction_mf() But you need to call MatAssemblyBegin/End each time you the base input vector this, petsc_csv values change. 
For example >> >> MatAssemblyBegin(petsc_A_mf,...) >> MatAssemblyEnd(petsc_A_mf,...) >> KSPSolve() >> >> >> >> >>> >>> From: Matthew Knepley > >>> Sent: 12 March 2021 15:08 >>> To: feng wang > >>> Cc: Barry Smith >; petsc-users at mcs.anl.gov > >>> Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation >>> >>> On Fri, Mar 12, 2021 at 9:55 AM feng wang > wrote: >>> Hi Mat, >>> >>> Thanks for your reply. I will try the parallel implementation. >>> >>> I've got a serial matrix-free GMRES working, but I would like to know why my initial version of matrix-free implementation does not work and there is still something I don't understand. I did some debugging and find that the callback function to compute the RHS for the matrix-free matrix is called twice by Petsc when it computes the finite difference Jacobian, but it should only be called once. I don't know why, could you please give some advice? >>> >>> F is called once to calculate the base point and once to get the perturbation. The base point is not recalculated, so if you do many iterates, it is amortized. >>> >>> Thanks, >>> >>> Matt >>> >>> Thanks, >>> Feng >>> >>> >>> >>> From: Matthew Knepley > >>> Sent: 12 March 2021 12:05 >>> To: feng wang > >>> Cc: Barry Smith >; petsc-users at mcs.anl.gov > >>> Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation >>> >>> On Fri, Mar 12, 2021 at 6:02 AM feng wang > wrote: >>> Hi Barry, >>> >>> Thanks for your advice. >>> >>> You are right on this. somehow there is some inconsistency when I compute the right hand side (true RHS + time-stepping contribution to the diagonal matrix) to compute the finite difference Jacobian. If I just use the call back function to recompute my RHS before I call MatMFFDSetBase, then it works like a charm. But now I end up with computing my RHS three times. 1st time is to compute the true RHS, the rest two is for computing finite difference Jacobian. >>> >>> In my previous buggy version, I only compute RHS twice. If possible, could you elaborate on your comments "Also be careful about petsc_baserhs", so I may possibly understand what was going on with my buggy version. >>> >>> Our FD implementation is simple. It approximates the action of the Jacobian as >>> >>> J(b) v = (F(b + h v) - F(b)) / h ||v|| >>> >>> where h is some small parameter and b is the base vector, namely the one that you are linearizing around. In a Newton step, b is the previous solution >>> and v is the proposed solution update. >>> >>> Besides, for a parallel implementation, my code already has its own partition method, is it possible to allow petsc read in a user-defined partition? if not what is a better way to do this? >>> >>> Sure >>> >>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecSetSizes.html >>> >>> Thanks, >>> >>> Matt >>> >>> Many thanks, >>> Feng >>> >>> From: Barry Smith > >>> Sent: 11 March 2021 22:15 >>> To: feng wang > >>> Cc: petsc-users at mcs.anl.gov > >>> Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation >>> >>> >>> Feng, >>> >>> The first thing to check is that for each linear solve that involves a new operator (values in the base vector) the MFFD matrix knows it is using a new operator. >>> >>> The easiest way is to call MatMFFDSetBase() before each solve that involves a new operator (new values in the base vector). Also be careful about petsc_baserhs, when you change the base vector's values you also need to change the petsc_baserhs values to the function evaluation at that point. 
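One way to spell out the two options as a sketch (the petsc_* names and FormFunction_mf are from the snippets in this thread; b and dq stand in for whatever the application calls the linear right-hand side and solution, and passing NULL as the third argument of MatMFFDSetBase() is taken here to mean "let MatMFFD evaluate the base function itself"):

/* Option 1: you own petsc_baserhs and keep it equal to the function value at the base. */
ierr = petsc_setcsv(petsc_csv); CHKERRQ(ierr);                           /* load the new base state   */
ierr = FormFunction_mf(this, petsc_csv, petsc_baserhs); CHKERRQ(ierr);   /* refresh F at the new base */
ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr);
ierr = KSPSolve(petsc_ksp, b, dq); CHKERRQ(ierr);

/* Option 2: let MatMFFD compute the base function value itself; flag each new base
   vector with an assembly before the solve.                                        */
ierr = petsc_setcsv(petsc_csv); CHKERRQ(ierr);
ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, NULL); CHKERRQ(ierr);
ierr = MatAssemblyBegin(petsc_A_mf, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
ierr = MatAssemblyEnd(petsc_A_mf, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
ierr = KSPSolve(petsc_ksp, b, dq); CHKERRQ(ierr);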
>>> >>> If that is correct I would check with a trivial function evaluator to make sure the infrastructure is all set up correctly. For examples use for the matrix free a 1 4 1 operator applied matrix free. >>> >>> Barry >>> >>> >>>> On Mar 11, 2021, at 7:35 AM, feng wang > wrote: >>>> >>>> Dear All, >>>> >>>> I am new to petsc and trying to implement a matrix-free GMRES. I have assembled an approximate Jacobian matrix just for preconditioning. After reading some previous questions on this topic, my approach is: >>>> >>>> the matrix-free matrix is created as: >>>> >>>> ierr = MatCreateMFFD(*A_COMM_WORLD, iqe*blocksize, iqe*blocksize, PETSC_DETERMINE, PETSC_DETERMINE, &petsc_A_mf); CHKERRQ(ierr); >>>> ierr = MatMFFDSetFunction(petsc_A_mf, FormFunction_mf, this); CHKERRQ(ierr); >>>> >>>> KSP linear operator is set up as: >>>> >>>> ierr = KSPSetOperators(petsc_ksp, petsc_A_mf, petsc_A_pre); CHKERRQ(ierr); //petsc_A_pre is my assembled pre-conditioning matrix >>>> >>>> Before calling KSPSolve, I do: >>>> >>>> ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); //petsc_csv is the flow states, petsc_baserhs is the pre-computed right hand side >>>> >>>> The call back function is defined as: >>>> >>>> PetscErrorCode cFdDomain::FormFunction_mf(void *ctx, Vec in_vec, Vec out_vec) >>>> { >>>> PetscErrorCode ierr; >>>> cFdDomain *user_ctx; >>>> >>>> cout << "FormFunction_mf called\n"; >>>> >>>> //in_vec: flow states >>>> //out_vec: right hand side + diagonal contributions from CFL number >>>> >>>> user_ctx = (cFdDomain*)ctx; >>>> >>>> //get perturbed conservative variables from petsc >>>> user_ctx->petsc_getcsv(in_vec); >>>> >>>> //get new right side >>>> user_ctx->petsc_fd_rhs(); >>>> >>>> //set new right hand side to the output vector >>>> user_ctx->petsc_setrhs(out_vec); >>>> >>>> ierr = 0; >>>> return ierr; >>>> } >>>> >>>> The linear system I am solving is (J+D)x=RHS. J is the Jacobian matrix. D is a diagonal matrix and it is used to stabilise the solution at the start but reduced gradually when the solution moves on to recover Newton's method. I add D*x to the true right side when non-linear function is computed to work out finite difference Jacobian, so when finite difference is used, it actually computes (J+D)*dx. >>>> >>>> The code runs but diverges in the end. If I don't do matrix-free and use my approximate Jacobian matrix, GMRES works. So something is wrong with my matrix-free implementation. Have I missed something in my implementation? Besides, is there a way to check if the finite difference Jacobian matrix is computed correctly in a matrix-free implementation? >>>> >>>> Thanks for your help in advance. >>>> Feng >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
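As a concrete reading of the "1 4 1 operator" check suggested above: a throwaway evaluator that applies the tridiagonal [1 4 1] stencil matrix-free could look like the sketch below (serial case, illustrative names only). Because this F is linear, the finite-difference product J(b)v reduces to the same tridiagonal operator applied to v, which is easy to verify against an explicitly assembled tridiagonal matrix.

/* Trivial test evaluator: F(u) = T u with T the tridiagonal [1 4 1] operator. */
PetscErrorCode FormFunction_141(void *ctx, Vec in_vec, Vec out_vec)
{
  const PetscScalar *u;
  PetscScalar       *f;
  PetscInt           i, n;
  PetscErrorCode     ierr;

  ierr = VecGetLocalSize(in_vec, &n); CHKERRQ(ierr);
  ierr = VecGetArrayRead(in_vec, &u); CHKERRQ(ierr);
  ierr = VecGetArray(out_vec, &f); CHKERRQ(ierr);
  for (i = 0; i < n; i++) {
    f[i] = 4.0*u[i];
    if (i > 0)     f[i] += u[i-1];
    if (i < n - 1) f[i] += u[i+1];
  }
  ierr = VecRestoreArray(out_vec, &f); CHKERRQ(ierr);
  ierr = VecRestoreArrayRead(in_vec, &u); CHKERRQ(ierr);
  return 0;
}

While testing, it would be registered in place of the real callback with MatMFFDSetFunction(petsc_A_mf, FormFunction_141, NULL).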
URL: From wence at gmx.li Thu Mar 25 06:54:26 2021 From: wence at gmx.li (Lawrence Mitchell) Date: Thu, 25 Mar 2021 11:54:26 +0000 Subject: [petsc-users] Code speedup after upgrading In-Reply-To: References: Message-ID: > On 24 Mar 2021, at 01:30, Matthew Knepley wrote: > > This is true, but all the PETSc operations are speeding up by a factor 2x. It is hard to believe these were run on the same machine. > For example, VecScale speeds up!?! So it is not network, or optimizations. I cannot explain this. > VecMDot speeds up by a factor of 8! Unrelatedly, one thing I see, which _may_ offer potential for much more speedup, is this: BuildTwoSided 17548 1.0 4.9331e+00 9.9 0.00e+00 0.0 5.9e+05 4.0e+00 1.8e+04 BuildTwoSidedF 17547 1.0 5.0489e+00 7.3 0.00e+00 0.0 1.2e+06 3.6e+03 1.8e+04 ... MatAssemblyBegin 17547 1.0 8.8252e+00 1.1 0.00e+00 0.0 1.2e+06 3.6e+03 1.8e+04 MatAssemblyEnd 17547 1.0 2.6903e+00 2.8 2.79e+07 2.7 2.1e+02 2.0e+02 1.0e+01 I think these BuildTwoSided calls are coming from the MatAssemblyBegin/End pairs. If you preallocate and fill your matrices with zeros in all the possible places that you might end up putting a non-zero, then calling MatSetOption(mat, MAT_SUBSET_OFF_PROC_ENTRIES, PETSC_TRUE) on the matrix you create will reduce this time in BuildTwoSided to almost zero. Lawrence From gohardoust at gmail.com Thu Mar 25 14:51:19 2021 From: gohardoust at gmail.com (Mohammad Gohardoust) Date: Thu, 25 Mar 2021 12:51:19 -0700 Subject: [petsc-users] Code speedup after upgrading In-Reply-To: References: Message-ID: That's right, these loops also take roughly half time as well. If I am not mistaken, petsc (MatSetValue) is called after doing some calculations over each tetrahedral element. Thanks for your suggestion. I will try that and will post the results. Mohammad On Wed, Mar 24, 2021 at 3:23 PM Junchao Zhang wrote: > > > > On Wed, Mar 24, 2021 at 2:17 AM Mohammad Gohardoust > wrote: > >> So the code itself is a finite-element scheme and in stage 1 and 3 there >> are expensive loops over entire mesh elements which consume a lot of time. >> > So these expensive loops must also take half time with newer petsc? And > these loops do not call petsc routines? > I think you can build two PETSc versions with the same configuration > options, then run your code with one MPI rank to see if there is a > difference. > If they give the same performance, then scale to 2, 4, ... ranks and see > what happens. > > > >> >> Mohammad >> >> On Tue, Mar 23, 2021 at 6:08 PM Junchao Zhang >> wrote: >> >>> In the new log, I saw >>> >>> Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- >>> Avg %Total Avg %Total Count %Total Avg %Total Count %Total >>> 0: Main Stage: 5.4095e+00 2.3% 4.3700e+03 0.0% 4.764e+05 3.0% 3.135e+02 1.0% 2.244e+04 12.6% 1: Solute_Assembly: 1.3977e+02 59.4% 7.3353e+09 4.6% 3.263e+06 20.7% 1.278e+03 26.9% 1.059e+04 6.0% >>> >>> >>> But I didn't see any event in this stage had a cost close to 140s. What >>> happened? 
>>> >>> --- Event Stage 1: Solute_Assembly >>> >>> BuildTwoSided 3531 1.0 2.8025e+0026.3 0.00e+00 0.0 3.6e+05 4.0e+00 3.5e+03 1 0 2 0 2 1 0 11 0 33 0 >>> BuildTwoSidedF 3531 1.0 2.8678e+0013.2 0.00e+00 0.0 7.1e+05 3.6e+03 3.5e+03 1 0 5 17 2 1 0 22 62 33 0 >>> VecScatterBegin 7062 1.0 7.1911e-02 1.9 0.00e+00 0.0 7.1e+05 3.5e+02 0.0e+00 0 0 5 2 0 0 0 22 6 0 0 >>> VecScatterEnd 7062 1.0 2.1248e-01 3.0 1.60e+06 2.7 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 73 >>> SFBcastOpBegin 3531 1.0 2.6516e-02 2.4 0.00e+00 0.0 3.6e+05 3.5e+02 0.0e+00 0 0 2 1 0 0 0 11 3 0 0 >>> SFBcastOpEnd 3531 1.0 9.5041e-02 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> SFReduceBegin 3531 1.0 3.8955e-02 2.1 0.00e+00 0.0 3.6e+05 3.5e+02 0.0e+00 0 0 2 1 0 0 0 11 3 0 0 >>> SFReduceEnd 3531 1.0 1.3791e-01 3.9 1.60e+06 2.7 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 112 >>> SFPack 7062 1.0 6.5591e-03 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> SFUnpack 7062 1.0 7.4186e-03 2.1 1.60e+06 2.7 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2080 >>> MatAssemblyBegin 3531 1.0 4.7846e+00 1.1 0.00e+00 0.0 7.1e+05 3.6e+03 3.5e+03 2 0 5 17 2 3 0 22 62 33 0 >>> MatAssemblyEnd 3531 1.0 1.5468e+00 2.7 1.68e+07 2.7 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 2 0 0 0 104 >>> MatZeroEntries 3531 1.0 3.0998e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> >>> >>> --Junchao Zhang >>> >>> >>> >>> On Tue, Mar 23, 2021 at 5:24 PM Mohammad Gohardoust < >>> gohardoust at gmail.com> wrote: >>> >>>> Thanks Dave for your reply. >>>> >>>> For sure PETSc is awesome :D >>>> >>>> Yes, in both cases petsc was configured with --with-debugging=0 and >>>> fortunately I do have the old and new -log-veiw outputs which I attached. >>>> >>>> Best, >>>> Mohammad >>>> >>>> On Tue, Mar 23, 2021 at 1:37 AM Dave May >>>> wrote: >>>> >>>>> Nice to hear! >>>>> The answer is simple, PETSc is awesome :) >>>>> >>>>> Jokes aside, assuming both petsc builds were configured with >>>>> ?with-debugging=0, I don?t think there is a definitive answer to your >>>>> question with the information you provided. >>>>> >>>>> It could be as simple as one specific implementation you use was >>>>> improved between petsc releases. Not being an Ubuntu expert, the change >>>>> might be associated with using a different compiler, and or a more >>>>> efficient BLAS implementation (non threaded vs threaded). However I doubt >>>>> this is the origin of your 2x performance increase. >>>>> >>>>> If you really want to understand where the performance improvement >>>>> originated from, you?d need to send to the email list the result of >>>>> -log_view from both the old and new versions, running the exact same >>>>> problem. >>>>> >>>>> From that info, we can see what implementations in PETSc are being >>>>> used and where the time reduction is occurring. Knowing that, it should be >>>>> clearer to provide an explanation for it. >>>>> >>>>> >>>>> Thanks, >>>>> Dave >>>>> >>>>> >>>>> On Tue 23. Mar 2021 at 06:24, Mohammad Gohardoust < >>>>> gohardoust at gmail.com> wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> I am using a code which is based on petsc (and also parmetis). 
>>>>>> Recently I made the following changes and now the code is running about two >>>>>> times faster than before: >>>>>> >>>>>> - Upgraded Ubuntu 18.04 to 20.04 >>>>>> - Upgraded petsc 3.13.4 to 3.14.5 >>>>>> - This time I installed parmetis and metis directly via petsc by >>>>>> --download-parmetis --download-metis flags instead of installing them >>>>>> separately and using --with-parmetis-include=... and >>>>>> --with-parmetis-lib=... (the version of installed parmetis was 4.0.3 before) >>>>>> >>>>>> I was wondering what can possibly explain this speedup? Does anyone >>>>>> have any suggestions? >>>>>> >>>>>> Thanks, >>>>>> Mohammad >>>>>> >>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From gohardoust at gmail.com Thu Mar 25 15:01:20 2021 From: gohardoust at gmail.com (Mohammad Gohardoust) Date: Thu, 25 Mar 2021 13:01:20 -0700 Subject: [petsc-users] Code speedup after upgrading In-Reply-To: References: Message-ID: Thanks Lawrence for your suggestion. It did work and the BuildTwoSided time is almost zero now: BuildTwoSided 4 1.0 1.6279e-0310.1 0.00e+00 0.0 2.0e+02 4.0e+00 4.0e+00 BuildTwoSidedF 3 1.0 1.5969e-0310.5 0.00e+00 0.0 2.0e+02 3.6e+03 3.0e+00 Mohammad On Thu, Mar 25, 2021 at 4:54 AM Lawrence Mitchell wrote: > > > > On 24 Mar 2021, at 01:30, Matthew Knepley wrote: > > > > This is true, but all the PETSc operations are speeding up by a factor > 2x. It is hard to believe these were run on the same machine. > > For example, VecScale speeds up!?! So it is not network, or > optimizations. I cannot explain this. > > > > VecMDot speeds up by a factor of 8! > > Unrelatedly, one thing I see, which _may_ offer potential for much more > speedup, is this: > > BuildTwoSided 17548 1.0 4.9331e+00 9.9 0.00e+00 0.0 5.9e+05 4.0e+00 > 1.8e+04 > BuildTwoSidedF 17547 1.0 5.0489e+00 7.3 0.00e+00 0.0 1.2e+06 3.6e+03 > 1.8e+04 > > ... > > MatAssemblyBegin 17547 1.0 8.8252e+00 1.1 0.00e+00 0.0 1.2e+06 3.6e+03 > 1.8e+04 > MatAssemblyEnd 17547 1.0 2.6903e+00 2.8 2.79e+07 2.7 2.1e+02 2.0e+02 > 1.0e+01 > > I think these BuildTwoSided calls are coming from the MatAssemblyBegin/End > pairs. > > If you preallocate and fill your matrices with zeros in all the possible > places that you might end up putting a non-zero, then calling > > MatSetOption(mat, MAT_SUBSET_OFF_PROC_ENTRIES, PETSC_TRUE) > > on the matrix you create will reduce this time in BuildTwoSided to almost > zero. > > Lawrence > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From snailsoar at hotmail.com Thu Mar 25 18:39:19 2021 From: snailsoar at hotmail.com (feng wang) Date: Thu, 25 Mar 2021 23:39:19 +0000 Subject: [petsc-users] Questions on matrix-free GMRES implementation In-Reply-To: <151FDDB8-2384-4A3E-9B17-45318E2CC7CC@petsc.dev> References: <584E6514-C3C6-469B-A256-5470811D8D52@petsc.dev> <38509E59-D27A-47C6-8D97-EAAEBFC15FBF@petsc.dev> <5B018B57-B679-4015-8097-042B7C6B9D38@petsc.dev> , <151FDDB8-2384-4A3E-9B17-45318E2CC7CC@petsc.dev> Message-ID: Hi Barry, Thanks for your comments. I will renumber the cells in the way as you recommended. I went through the manual again and understand how to update the halo elements for my shell matrix routine "mymult(Mat m ,Vec x, Vec y)". I can use the global index of ghost cells for each rank and "Vec x" to get the ghost values for each rank via scattering. It should be similar to the example in page 40 in the manual. One more question, I also have an assembled approximate Jacobian matrix for pre-conditioning GMRES. 
If I re-number the cells properly as your suggested, I don't need to worry about communication and petsc will handle it properly together with my shell-matrix? Thanks, Feng ________________________________ From: Barry Smith Sent: 25 March 2021 0:03 To: feng wang Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation On Mar 24, 2021, at 5:08 AM, feng wang > wrote: Hi Barry, Thanks for your comments. It's very helpful. For your comments, I have a bit more questions 1. for your 1st comment " Yes, in some sense. So long as each process ....". * If I understand it correctly (hopefully) a parallel vector in petsc can hold discontinuous rows of data in a global array. If this is true, If I call "VecGetArray", it would create a copy in a continuous space if the data is not continuous, do some operations and petsc will figure out how to put updated values back to the right place in the global array? * This would generate an overhead. If I do the renumbering to make each process hold continuous rows, this overhead can be avoided when I call "VecGetArray"? GetArray does nothing except return the pointer to the data in the vector. It does not copy anything or reorder anything. Whatever order the numbers are in vector they are in the same order as in the array you obtain with VecGetArray. 1. for your 2nd comment " The matrix and vectors the algebraic solvers see DO NOT have......." For the callback function of my shell matrix "mymult(Mat m ,Vec x, Vec y)", I need to get "x" for the halo elements to compute the non-linear function. My code will take care of other halo exchanges, but I am not sure how to use petsc to get the halo elements "x" in the shell matrix, could you please elaborate on this? some related examples or simple pesudo code would be great. Basically all the parallel code in PETSc does this. How you need to set up the halo communication depends on how you are managing the assignment of degrees of freedom on each process and between processes. VecScatterCreate() is the tool you will use to tell PETSc how to get the correct values from one process to their halo-ed location on the process. It like everything in PETSc uses a number in the vectors of 0 ... n_0-1 on the first process, n_0, n_0+1, ... n_1-1 on the second etc. Since you are managing the partitioning and distribution of parallel data you must renumber the vector entry numbering in your data structures to match that shown above. Just do the numbering once after you have setup your distributed data and use it for the rest of the run. You might use the object from AOCreate to do the renumbering for you. Barry Thanks, Feng ________________________________ From: Barry Smith > Sent: 22 March 2021 1:28 To: feng wang > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation On Mar 21, 2021, at 6:22 PM, feng wang > wrote: Hi Barry, Thanks for your help, I really appreciate it. In the end I used a shell matrix to compute the matrix-vector product, it is clearer to me and there are more things under my control. I am now trying to do a parallel implementation, I have some questions on setting up parallel matrices and vectors for a user-defined partition, could you please provide some advice? Suppose I have already got a partition for 2 CPUs. Each cpu is assigned a list of elements and also their halo elements. 1. The global element index for each partition is not necessarily continuous, do I have to I re-order them to make them continuous? 
Yes, in some sense. So long as each process can march over ITS elements computing the function and Jacobian matrix-vector product it doesn't matter how you have named/numbered entries. But conceptually the first process has the first set of vector entries and the second the second set. 1. 2. When I set up the size of the matrix and vectors for each cpu, should I take into account the halo elements? The matrix and vectors the algebraic solvers see DO NOT have halo elements in their sizes. You will likely need a halo-ed work vector to do the matrix-free multiply from. The standard model is use VecScatterBegin/End to get the values from the non-halo-ed algebraic vector input to MatMult into a halo-ed one to do the local product. 1. In my serial version, when I initialize my RHS vector, I am not using VecSetValues, Instead I use VecGetArray/VecRestoreArray to assign the values. VecAssemblyBegin()/VecAssemblyEnd() is never used. would this still work for a parallel version? Yes, you can use Get/Restore but the input vector x will need to be, as noted above, scattered into a haloed version to get all the entries you will need to do the local part of the product. Thanks, Feng ________________________________ From: Barry Smith > Sent: 12 March 2021 23:40 To: feng wang > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation On Mar 12, 2021, at 9:37 AM, feng wang > wrote: Hi Matt, Thanks for your prompt response. Below are my two versions. one is buggy and the 2nd one is working. For the first one, I add the diagonal contribution to the true RHS (variable: rhs) and then set the base point, the callback function is somehow called twice afterwards to compute Jacobian. Do you mean "to compute the Jacobian matrix-vector product?" Is it only in the first computation of the product (for the given base vector) that it calls it twice or every matrix-vector product? It is possible there is a bug in our logic; run in the debugger with a break point in FormFunction_mf and each time the function is hit in the debugger type where or bt to get the stack frames from the calls. Send this. From this we can all see if it is being called excessively and why. For the 2nd one, I just call the callback function manually to recompute everything, the callback function is then called once as expected to compute the Jacobian. For me, both versions should do the same things. but I don't know why in the first one the callback function is called twice after I set the base point. what could possibly go wrong? The logic of how it is suppose to work is shown below. Thanks, Feng //This does not work fld->cnsv( iqs,iqe, q, aux, csv ); //add contribution of time-stepping for(iv=0; ivcnsv( iqs,iqe, q, aux, csv ); ierr = petsc_setcsv(petsc_csv); CHKERRQ(ierr); ierr = FormFunction_mf(this, petsc_csv, petsc_baserhs); //this is my callback function, now call it manually ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); Since you provide petsc_baserhs MatMFFD assumes (naturally) that you will keep the correct values in it. Hence for each new base value YOU need to compute the new values in petsc_baserhs. This approach gives you a bit more control over reusing the information in petsc_baserhs. 
If you would prefer that MatMFFD recomputes the base values, as needed, then you call FormFunction_mf(this, petsc_csv, NULL); and PETSc will allocate a vector and fill it up as needed by calling your FormFunction_mf() But you need to call MatAssemblyBegin/End each time you the base input vector this, petsc_csv values change. For example MatAssemblyBegin(petsc_A_mf,...) MatAssemblyEnd(petsc_A_mf,...) KSPSolve() ________________________________ From: Matthew Knepley > Sent: 12 March 2021 15:08 To: feng wang > Cc: Barry Smith >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation On Fri, Mar 12, 2021 at 9:55 AM feng wang > wrote: Hi Mat, Thanks for your reply. I will try the parallel implementation. I've got a serial matrix-free GMRES working, but I would like to know why my initial version of matrix-free implementation does not work and there is still something I don't understand. I did some debugging and find that the callback function to compute the RHS for the matrix-free matrix is called twice by Petsc when it computes the finite difference Jacobian, but it should only be called once. I don't know why, could you please give some advice? F is called once to calculate the base point and once to get the perturbation. The base point is not recalculated, so if you do many iterates, it is amortized. Thanks, Matt Thanks, Feng ________________________________ From: Matthew Knepley > Sent: 12 March 2021 12:05 To: feng wang > Cc: Barry Smith >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation On Fri, Mar 12, 2021 at 6:02 AM feng wang > wrote: Hi Barry, Thanks for your advice. You are right on this. somehow there is some inconsistency when I compute the right hand side (true RHS + time-stepping contribution to the diagonal matrix) to compute the finite difference Jacobian. If I just use the call back function to recompute my RHS before I call MatMFFDSetBase, then it works like a charm. But now I end up with computing my RHS three times. 1st time is to compute the true RHS, the rest two is for computing finite difference Jacobian. In my previous buggy version, I only compute RHS twice. If possible, could you elaborate on your comments "Also be careful about petsc_baserhs", so I may possibly understand what was going on with my buggy version. Our FD implementation is simple. It approximates the action of the Jacobian as J(b) v = (F(b + h v) - F(b)) / h ||v|| where h is some small parameter and b is the base vector, namely the one that you are linearizing around. In a Newton step, b is the previous solution and v is the proposed solution update. Besides, for a parallel implementation, my code already has its own partition method, is it possible to allow petsc read in a user-defined partition? if not what is a better way to do this? Sure https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecSetSizes.html Thanks, Matt Many thanks, Feng ________________________________ From: Barry Smith > Sent: 11 March 2021 22:15 To: feng wang > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation Feng, The first thing to check is that for each linear solve that involves a new operator (values in the base vector) the MFFD matrix knows it is using a new operator. The easiest way is to call MatMFFDSetBase() before each solve that involves a new operator (new values in the base vector). 
Also be careful about petsc_baserhs, when you change the base vector's values you also need to change the petsc_baserhs values to the function evaluation at that point. If that is correct I would check with a trivial function evaluator to make sure the infrastructure is all set up correctly. For examples use for the matrix free a 1 4 1 operator applied matrix free. Barry On Mar 11, 2021, at 7:35 AM, feng wang > wrote: Dear All, I am new to petsc and trying to implement a matrix-free GMRES. I have assembled an approximate Jacobian matrix just for preconditioning. After reading some previous questions on this topic, my approach is: the matrix-free matrix is created as: ierr = MatCreateMFFD(*A_COMM_WORLD, iqe*blocksize, iqe*blocksize, PETSC_DETERMINE, PETSC_DETERMINE, &petsc_A_mf); CHKERRQ(ierr); ierr = MatMFFDSetFunction(petsc_A_mf, FormFunction_mf, this); CHKERRQ(ierr); KSP linear operator is set up as: ierr = KSPSetOperators(petsc_ksp, petsc_A_mf, petsc_A_pre); CHKERRQ(ierr); //petsc_A_pre is my assembled pre-conditioning matrix Before calling KSPSolve, I do: ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); //petsc_csv is the flow states, petsc_baserhs is the pre-computed right hand side The call back function is defined as: PetscErrorCode cFdDomain::FormFunction_mf(void *ctx, Vec in_vec, Vec out_vec) { PetscErrorCode ierr; cFdDomain *user_ctx; cout << "FormFunction_mf called\n"; //in_vec: flow states //out_vec: right hand side + diagonal contributions from CFL number user_ctx = (cFdDomain*)ctx; //get perturbed conservative variables from petsc user_ctx->petsc_getcsv(in_vec); //get new right side user_ctx->petsc_fd_rhs(); //set new right hand side to the output vector user_ctx->petsc_setrhs(out_vec); ierr = 0; return ierr; } The linear system I am solving is (J+D)x=RHS. J is the Jacobian matrix. D is a diagonal matrix and it is used to stabilise the solution at the start but reduced gradually when the solution moves on to recover Newton's method. I add D*x to the true right side when non-linear function is computed to work out finite difference Jacobian, so when finite difference is used, it actually computes (J+D)*dx. The code runs but diverges in the end. If I don't do matrix-free and use my approximate Jacobian matrix, GMRES works. So something is wrong with my matrix-free implementation. Have I missed something in my implementation? Besides, is there a way to check if the finite difference Jacobian matrix is computed correctly in a matrix-free implementation? Thanks for your help in advance. Feng -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From smithc11 at rpi.edu Fri Mar 26 10:40:01 2021 From: smithc11 at rpi.edu (Cameron Smith) Date: Fri, 26 Mar 2021 11:40:01 -0400 Subject: [petsc-users] DMPlex: vtk/vtu support for labels Message-ID: <6a45fec1-2e55-7e60-f68b-39dacd572daf@rpi.edu> Hello, I'm debugging our use of DMLabel to mark the mesh vertices in a 2d triangular mesh on the geometric model boundary. 
From what I understand, the latex/tikz and glvis viewers support rendering user labels via the following options: tikz: -dm_view :mesh.tex:ascii_latex -dm_plex_view_scale 8.0 #not related to label, better formatting -dm_plex_view_labels glvis (I have not tried this yet): -viewer_glvis_dm_plex_bmarker Are there any other graphical viewers that support rendering the existence and/or value of DMLabel? I'm most familiar with paraview but didn't see vtu/vtk support in dm/impls/plex/plex[vtu].c . Thank-you, Cameron From jed at jedbrown.org Fri Mar 26 10:42:51 2021 From: jed at jedbrown.org (Jed Brown) Date: Fri, 26 Mar 2021 09:42:51 -0600 Subject: [petsc-users] DMPlex: vtk/vtu support for labels In-Reply-To: <6a45fec1-2e55-7e60-f68b-39dacd572daf@rpi.edu> References: <6a45fec1-2e55-7e60-f68b-39dacd572daf@rpi.edu> Message-ID: <87eeg1udj8.fsf@jedbrown.org> Cameron Smith writes: > Hello, > > I'm debugging our use of DMLabel to mark the mesh vertices in a 2d > triangular mesh on the geometric model boundary. > > From what I understand, the latex/tikz and glvis viewers support > rendering user labels via the following options: > > tikz: > -dm_view :mesh.tex:ascii_latex > -dm_plex_view_scale 8.0 #not related to label, better formatting > -dm_plex_view_labels > > glvis (I have not tried this yet): > -viewer_glvis_dm_plex_bmarker > > Are there any other graphical viewers that support rendering the > existence and/or value of DMLabel? I'm most familiar with paraview but > didn't see vtu/vtk support in dm/impls/plex/plex[vtu].c . I think it would be good to include labels. Is there a more efficient way to do it than to lower labels to an integer-valued field over the entire mesh with some default value inserted for unlabeled points? From knepley at gmail.com Fri Mar 26 12:28:39 2021 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 26 Mar 2021 13:28:39 -0400 Subject: [petsc-users] DMPlex: vtk/vtu support for labels In-Reply-To: <87eeg1udj8.fsf@jedbrown.org> References: <6a45fec1-2e55-7e60-f68b-39dacd572daf@rpi.edu> <87eeg1udj8.fsf@jedbrown.org> Message-ID: On Fri, Mar 26, 2021 at 11:43 AM Jed Brown wrote: > Cameron Smith writes: > > > Hello, > > > > I'm debugging our use of DMLabel to mark the mesh vertices in a 2d > > triangular mesh on the geometric model boundary. > > > > From what I understand, the latex/tikz and glvis viewers support > > rendering user labels via the following options: > > > > tikz: > > -dm_view :mesh.tex:ascii_latex > > -dm_plex_view_scale 8.0 #not related to label, better formatting > > -dm_plex_view_labels > > > > glvis (I have not tried this yet): > > -viewer_glvis_dm_plex_bmarker > > > > Are there any other graphical viewers that support rendering the > > existence and/or value of DMLabel? I'm most familiar with paraview but > > didn't see vtu/vtk support in dm/impls/plex/plex[vtu].c . > > I think it would be good to include labels. Is there a more efficient way > to do it than to lower labels to an integer-valued field over the entire > mesh with some default value inserted for unlabeled points? > Last time I looked at the XDMF standard, I did not see something, but maybe that was too long ago. I did this kind of automatic integer field thing with 'rank', so doing it with a label is trivial, but it would be nice if it were more integrated on the platform. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Mar 26 18:44:57 2021 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 26 Mar 2021 18:44:57 -0500 Subject: [petsc-users] Questions on matrix-free GMRES implementation In-Reply-To: References: <584E6514-C3C6-469B-A256-5470811D8D52@petsc.dev> <38509E59-D27A-47C6-8D97-EAAEBFC15FBF@petsc.dev> <5B018B57-B679-4015-8097-042B7C6B9D38@petsc.dev> <151FDDB8-2384-4A3E-9B17-45318E2CC7CC@petsc.dev> Message-ID: <1599C26D-14C3-4EA7-9CD3-F0526F098AD6@petsc.dev> > On Mar 25, 2021, at 6:39 PM, feng wang wrote: > > Hi Barry, > > Thanks for your comments. > > I will renumber the cells in the way as you recommended. I went through the manual again and understand how to update the halo elements for my shell matrix routine "mymult(Mat m ,Vec x, Vec y)". I can use the global index of ghost cells for each rank and "Vec x" to get the ghost values for each rank via scattering. It should be similar to the example in page 40 in the manual. > > One more question, I also have an assembled approximate Jacobian matrix for pre-conditioning GMRES. If I re-number the cells properly as your suggested, I don't need to worry about communication and petsc will handle it properly together with my shell-matrix? If you assembly the approximate Jaocobian using the "new" ordering then it will reflect the same function evaluation and matrix free operators so should be ok. Barry > > Thanks, > Feng > > From: Barry Smith > > Sent: 25 March 2021 0:03 > To: feng wang > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation > > > >> On Mar 24, 2021, at 5:08 AM, feng wang > wrote: >> >> Hi Barry, >> >> Thanks for your comments. It's very helpful. For your comments, I have a bit more questions >> >> for your 1st comment " Yes, in some sense. So long as each process ....". >> If I understand it correctly (hopefully) a parallel vector in petsc can hold discontinuous rows of data in a global array. If this is true, If I call "VecGetArray", it would create a copy in a continuous space if the data is not continuous, do some operations and petsc will figure out how to put updated values back to the right place in the global array? >> This would generate an overhead. If I do the renumbering to make each process hold continuous rows, this overhead can be avoided when I call "VecGetArray"? > > GetArray does nothing except return the pointer to the data in the vector. It does not copy anything or reorder anything. Whatever order the numbers are in vector they are in the same order as in the array you obtain with VecGetArray. > >> for your 2nd comment " The matrix and vectors the algebraic solvers see DO NOT have......." For the callback function of my shell matrix "mymult(Mat m ,Vec x, Vec y)", I need to get "x" for the halo elements to compute the non-linear function. My code will take care of other halo exchanges, but I am not sure how to use petsc to get the halo elements "x" in the shell matrix, could you please elaborate on this? some related examples or simple pesudo code would be great. > Basically all the parallel code in PETSc does this. How you need to set up the halo communication depends on how you are managing the assignment of degrees of freedom on each process and between processes. 
VecScatterCreate() is the tool you will use to tell PETSc how to get the correct values from one process to their halo-ed location on the process. It like everything in PETSc uses a number in the vectors of 0 ... n_0-1 on the first process, n_0, n_0+1, ... n_1-1 on the second etc. Since you are managing the partitioning and distribution of parallel data you must renumber the vector entry numbering in your data structures to match that shown above. Just do the numbering once after you have setup your distributed data and use it for the rest of the run. You might use the object from AOCreate to do the renumbering for you. > > Barry > > > >> Thanks, >> Feng >> >> From: Barry Smith > >> Sent: 22 March 2021 1:28 >> To: feng wang > >> Cc: petsc-users at mcs.anl.gov > >> Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation >> >> >> >>> On Mar 21, 2021, at 6:22 PM, feng wang > wrote: >>> >>> Hi Barry, >>> >>> Thanks for your help, I really appreciate it. >>> >>> In the end I used a shell matrix to compute the matrix-vector product, it is clearer to me and there are more things under my control. I am now trying to do a parallel implementation, I have some questions on setting up parallel matrices and vectors for a user-defined partition, could you please provide some advice? Suppose I have already got a partition for 2 CPUs. Each cpu is assigned a list of elements and also their halo elements. >>> The global element index for each partition is not necessarily continuous, do I have to I re-order them to make them continuous? >> >> Yes, in some sense. So long as each process can march over ITS elements computing the function and Jacobian matrix-vector product it doesn't matter how you have named/numbered entries. But conceptually the first process has the first set of vector entries and the second the second set. >>> >>> When I set up the size of the matrix and vectors for each cpu, should I take into account the halo elements? >> >> The matrix and vectors the algebraic solvers see DO NOT have halo elements in their sizes. You will likely need a halo-ed work vector to do the matrix-free multiply from. The standard model is use VecScatterBegin/End to get the values from the non-halo-ed algebraic vector input to MatMult into a halo-ed one to do the local product. >> >>> In my serial version, when I initialize my RHS vector, I am not using VecSetValues, Instead I use VecGetArray/VecRestoreArray to assign the values. VecAssemblyBegin()/VecAssemblyEnd() is never used. would this still work for a parallel version? >> >> Yes, you can use Get/Restore but the input vector x will need to be, as noted above, scattered into a haloed version to get all the entries you will need to do the local part of the product. >> >> >>> Thanks, >>> Feng >>> >>> >>> From: Barry Smith > >>> Sent: 12 March 2021 23:40 >>> To: feng wang > >>> Cc: petsc-users at mcs.anl.gov > >>> Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation >>> >>> >>> >>>> On Mar 12, 2021, at 9:37 AM, feng wang > wrote: >>>> >>>> Hi Matt, >>>> >>>> Thanks for your prompt response. >>>> >>>> Below are my two versions. one is buggy and the 2nd one is working. For the first one, I add the diagonal contribution to the true RHS (variable: rhs) and then set the base point, the callback function is somehow called twice afterwards to compute Jacobian. >>> >>> Do you mean "to compute the Jacobian matrix-vector product?" 
>>> >>> Is it only in the first computation of the product (for the given base vector) that it calls it twice or every matrix-vector product? >>> >>> It is possible there is a bug in our logic; run in the debugger with a break point in FormFunction_mf and each time the function is hit in the debugger type where or bt to get the stack frames from the calls. Send this. From this we can all see if it is being called excessively and why. >>> >>>> For the 2nd one, I just call the callback function manually to recompute everything, the callback function is then called once as expected to compute the Jacobian. For me, both versions should do the same things. but I don't know why in the first one the callback function is called twice after I set the base point. what could possibly go wrong? >>> >>> The logic of how it is suppose to work is shown below. >>>> >>>> Thanks, >>>> Feng >>>> >>>> //This does not work >>>> fld->cnsv( iqs,iqe, q, aux, csv ); >>>> //add contribution of time-stepping >>>> for(iv=0; iv>>> { >>>> for(iq=0; iq>>> { >>>> //use conservative variables here >>>> rhs[iv][iq] = -rhs[iv][iq] + csv[iv][iq]*lhsa[nlhs-1][iq]/cfl; >>>> } >>>> } >>>> ierr = petsc_setcsv(petsc_csv); CHKERRQ(ierr); >>>> ierr = petsc_setrhs(petsc_baserhs); CHKERRQ(ierr); >>>> ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); >>>> >>>> //This works >>>> fld->cnsv( iqs,iqe, q, aux, csv ); >>>> ierr = petsc_setcsv(petsc_csv); CHKERRQ(ierr); >>>> ierr = FormFunction_mf(this, petsc_csv, petsc_baserhs); //this is my callback function, now call it manually >>>> ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); >>>> >>>> >>> Since you provide petsc_baserhs MatMFFD assumes (naturally) that you will keep the correct values in it. Hence for each new base value YOU need to compute the new values in petsc_baserhs. This approach gives you a bit more control over reusing the information in petsc_baserhs. >>> >>> If you would prefer that MatMFFD recomputes the base values, as needed, then you call FormFunction_mf(this, petsc_csv, NULL); and PETSc will allocate a vector and fill it up as needed by calling your FormFunction_mf() But you need to call MatAssemblyBegin/End each time you the base input vector this, petsc_csv values change. For example >>> >>> MatAssemblyBegin(petsc_A_mf,...) >>> MatAssemblyEnd(petsc_A_mf,...) >>> KSPSolve() >>> >>> >>> >>> >>>> >>>> >>>> From: Matthew Knepley > >>>> Sent: 12 March 2021 15:08 >>>> To: feng wang > >>>> Cc: Barry Smith >; petsc-users at mcs.anl.gov > >>>> Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation >>>> >>>> On Fri, Mar 12, 2021 at 9:55 AM feng wang > wrote: >>>> Hi Mat, >>>> >>>> Thanks for your reply. I will try the parallel implementation. >>>> >>>> I've got a serial matrix-free GMRES working, but I would like to know why my initial version of matrix-free implementation does not work and there is still something I don't understand. I did some debugging and find that the callback function to compute the RHS for the matrix-free matrix is called twice by Petsc when it computes the finite difference Jacobian, but it should only be called once. I don't know why, could you please give some advice? >>>> >>>> F is called once to calculate the base point and once to get the perturbation. The base point is not recalculated, so if you do many iterates, it is amortized. 
>>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> Thanks, >>>> Feng >>>> >>>> >>>> >>>> >>>> From: Matthew Knepley > >>>> Sent: 12 March 2021 12:05 >>>> To: feng wang > >>>> Cc: Barry Smith >; petsc-users at mcs.anl.gov > >>>> Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation >>>> >>>> On Fri, Mar 12, 2021 at 6:02 AM feng wang > wrote: >>>> Hi Barry, >>>> >>>> Thanks for your advice. >>>> >>>> You are right on this. somehow there is some inconsistency when I compute the right hand side (true RHS + time-stepping contribution to the diagonal matrix) to compute the finite difference Jacobian. If I just use the call back function to recompute my RHS before I call MatMFFDSetBase, then it works like a charm. But now I end up with computing my RHS three times. 1st time is to compute the true RHS, the rest two is for computing finite difference Jacobian. >>>> >>>> In my previous buggy version, I only compute RHS twice. If possible, could you elaborate on your comments "Also be careful about petsc_baserhs", so I may possibly understand what was going on with my buggy version. >>>> >>>> Our FD implementation is simple. It approximates the action of the Jacobian as >>>> >>>> J(b) v = (F(b + h v) - F(b)) / h ||v|| >>>> >>>> where h is some small parameter and b is the base vector, namely the one that you are linearizing around. In a Newton step, b is the previous solution >>>> and v is the proposed solution update. >>>> >>>> Besides, for a parallel implementation, my code already has its own partition method, is it possible to allow petsc read in a user-defined partition? if not what is a better way to do this? >>>> >>>> Sure >>>> >>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecSetSizes.html >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> Many thanks, >>>> Feng >>>> >>>> >>>> From: Barry Smith > >>>> Sent: 11 March 2021 22:15 >>>> To: feng wang > >>>> Cc: petsc-users at mcs.anl.gov > >>>> Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation >>>> >>>> >>>> Feng, >>>> >>>> The first thing to check is that for each linear solve that involves a new operator (values in the base vector) the MFFD matrix knows it is using a new operator. >>>> >>>> The easiest way is to call MatMFFDSetBase() before each solve that involves a new operator (new values in the base vector). Also be careful about petsc_baserhs, when you change the base vector's values you also need to change the petsc_baserhs values to the function evaluation at that point. >>>> >>>> If that is correct I would check with a trivial function evaluator to make sure the infrastructure is all set up correctly. For examples use for the matrix free a 1 4 1 operator applied matrix free. >>>> >>>> Barry >>>> >>>> >>>>> On Mar 11, 2021, at 7:35 AM, feng wang > wrote: >>>>> >>>>> Dear All, >>>>> >>>>> I am new to petsc and trying to implement a matrix-free GMRES. I have assembled an approximate Jacobian matrix just for preconditioning. 
After reading some previous questions on this topic, my approach is: >>>>> >>>>> the matrix-free matrix is created as: >>>>> >>>>> ierr = MatCreateMFFD(*A_COMM_WORLD, iqe*blocksize, iqe*blocksize, PETSC_DETERMINE, PETSC_DETERMINE, &petsc_A_mf); CHKERRQ(ierr); >>>>> ierr = MatMFFDSetFunction(petsc_A_mf, FormFunction_mf, this); CHKERRQ(ierr); >>>>> >>>>> KSP linear operator is set up as: >>>>> >>>>> ierr = KSPSetOperators(petsc_ksp, petsc_A_mf, petsc_A_pre); CHKERRQ(ierr); //petsc_A_pre is my assembled pre-conditioning matrix >>>>> >>>>> Before calling KSPSolve, I do: >>>>> >>>>> ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); //petsc_csv is the flow states, petsc_baserhs is the pre-computed right hand side >>>>> >>>>> The call back function is defined as: >>>>> >>>>> PetscErrorCode cFdDomain::FormFunction_mf(void *ctx, Vec in_vec, Vec out_vec) >>>>> { >>>>> PetscErrorCode ierr; >>>>> cFdDomain *user_ctx; >>>>> >>>>> cout << "FormFunction_mf called\n"; >>>>> >>>>> //in_vec: flow states >>>>> //out_vec: right hand side + diagonal contributions from CFL number >>>>> >>>>> user_ctx = (cFdDomain*)ctx; >>>>> >>>>> //get perturbed conservative variables from petsc >>>>> user_ctx->petsc_getcsv(in_vec); >>>>> >>>>> //get new right side >>>>> user_ctx->petsc_fd_rhs(); >>>>> >>>>> //set new right hand side to the output vector >>>>> user_ctx->petsc_setrhs(out_vec); >>>>> >>>>> ierr = 0; >>>>> return ierr; >>>>> } >>>>> >>>>> The linear system I am solving is (J+D)x=RHS. J is the Jacobian matrix. D is a diagonal matrix and it is used to stabilise the solution at the start but reduced gradually when the solution moves on to recover Newton's method. I add D*x to the true right side when non-linear function is computed to work out finite difference Jacobian, so when finite difference is used, it actually computes (J+D)*dx. >>>>> >>>>> The code runs but diverges in the end. If I don't do matrix-free and use my approximate Jacobian matrix, GMRES works. So something is wrong with my matrix-free implementation. Have I missed something in my implementation? Besides, is there a way to check if the finite difference Jacobian matrix is computed correctly in a matrix-free implementation? >>>>> >>>>> Thanks for your help in advance. >>>>> Feng >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Mar 26 19:20:30 2021 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 26 Mar 2021 19:20:30 -0500 Subject: [petsc-users] Local Discontinuous Galerkin with PETSc TS In-Reply-To: References: <3EE29E70-8ECF-4842-99DC-30E867769875@llnl.gov> <775766A0-D6D6-4007-888C-A261A139941F@petsc.dev> Message-ID: <430E5D90-16D8-4F02-884D-4AE804BF21D3@petsc.dev> What is SLATE in this context? > On Mar 23, 2021, at 2:57 PM, Matthew Knepley wrote: > > On Tue, Mar 23, 2021 at 11:54 AM Salazar De Troya, Miguel > wrote: > The calculation of p1 and p2 are done by solving an element-wise local problem using u^n. 
I guess I could embed this calculation inside of the calculation for G = H(p1, p2). However, I am hoping to be able to solve the problem using firedrake-ts so the formulation is all clearly in one place and in variational form. Reading the manual, Section 2.5.2 DAE formulations, the Hessenberg Index-1 DAE case seems to be what I need, although it is not clear to me how one can achieve this with an IMEX scheme. If I have: > > > I am almost certain that you do not want to do this. I am guessing the Firedrake guys will agree. Did they tell you to do this? > If you had a large, nonlinear system for p1/p2, then a DAE would make sense. Since it is just element-wise elimination, you should > roll it into the easy equation > > u' = H > > Then you can use any integrator, as Barry says, in particular a nice symplectic integrator. My understand is that SLATE is for exactly > this kind of thing. > > Thanks, > > Matt > > F(U', U, t) = G(t,U) > > p1 = f(u_x) > > p2 = g(u_x) > > u' - H(p1, p2) = 0 > > > > where U = (p1, p2, u), F(U?, U, t) = [p1, p2, u? - H(p1, p2)],] and G(t, U) = [f(u_x), g(u_x), 0], is there a solver strategy that will solve for p1 and p2 first and then use that to solve the last equation? The jacobian for F in this formulation would be > > > > dF/dU = [[M, 0, 0], > > [0, M, 0], > > [H'(p1), H'(p2), \sigma*M]] > > > > where M is a mass matrix, H'(p1) is the jacobian of H(p1, p2) w.r.t. p1 and H'(p2), the jacobian of H(p1, p2) w.r.t. p2. H'(p1) and H'(p2) are unnecessary for the solver strategy I want to implement. > > > > Thanks > > Miguel > > > > > > > > From: Barry Smith > > Date: Monday, March 22, 2021 at 7:42 PM > To: Matthew Knepley > > Cc: "Salazar De Troya, Miguel" >, "Jorti, Zakariae via petsc-users" > > Subject: Re: [petsc-users] Local Discontinuous Galerkin with PETSc TS > > > > > > u_t = G(u) > > > > I don't see why you won't just compute any needed u_x from the given u and then you can use any explicit or implicit TS solver trivially. For implicit methods it can automatically compute the Jacobian of G for you or you can provide it directly. Explicit methods will just use the "old" u while implicit methods will use the new. > > > > Barry > > > > > > > On Mar 22, 2021, at 7:20 PM, Matthew Knepley > wrote: > > > > On Mon, Mar 22, 2021 at 7:53 PM Salazar De Troya, Miguel via petsc-users > wrote: > > Hello > > > > I am interested in implementing the LDG method in ?A local discontinuous Galerkin method for directly solving Hamilton?Jacobi equations?https://www.sciencedirect.com/science/article/pii/S0021999110005255 . The equation is more or less of the form (for 1D case): > > p1 = f(u_x) > > p2 = g(u_x) > > u_t = H(p1, p2) > > > > where typically one solves for p1 and p2 using the previous time step solution ?u? and then plugs them into the third equation to obtain the next step solution. I am wondering if the TS infrastructure could be used to implement this solution scheme. Looking at the manual, I think one could set G(t, U) to the right-hand side in the above equations and F(t, u, u?) = 0 to the left-hand side, although the first two equations would not have time derivative. In that case, how could one take advantage of the operator split scheme I mentioned? Maybe using some block preconditioners? > > > > Hi Miguel, > > > > I have a simple-minded way of understanding these TS things. My heuristic is that you put things in F that you expect to want > > at u^{n+1}, and things in G that you expect to want at u^n. 
It is not that simple, since you could for instance move F and G > > to the LHS and have Backward Euler, but it is my rule of thumb. > > > > So, were you looking for an IMEX scheme? If so, which terms should be lagged? Also, from the equations above, it is hard to > > see why you need a solve to calculate p1/p2. It looks like just a forward application of an operator. > > > > Thanks, > > > > Matt > > > > I am trying to solve the Hamilton-Jacobi equation u_t ? H(u_x) = 0. I welcome any suggestion for better methods. > > > > Thanks > > Miguel > > > > Miguel A. Salazar de Troya > > Postdoctoral Researcher, Lawrence Livermore National Laboratory > > B141 > > Rm: 1085-5 > > Ph: 1(925) 422-6411 > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.barral at math.u-bordeaux.fr Sat Mar 27 08:27:01 2021 From: nicolas.barral at math.u-bordeaux.fr (Nicolas Barral) Date: Sat, 27 Mar 2021 14:27:01 +0100 Subject: [petsc-users] DMPlex overlap Message-ID: <0f529250-4af3-c5c2-9b9f-8e272ada24e9@math.u-bordeaux.fr> Hi all, First, I'm not sure I understand what the overlap parameter in DMPlexDistributeOverlap does. I tried the following: generate a small mesh on 1 rank with DMPlexCreateBoxMesh, then distribute it with DMPlexDistribute. At this point I have two nice partitions, with shared vertices and no overlapping cells. Then I call DMPlexDistributeOverlap with the overlap parameter set to 0 or 1, and get the same resulting plex in both cases. Why is that ? Second, I'm wondering what would be a good way to handle two overlaps and associated local vectors. In my adaptation code, the remeshing library requires a non-overlapping mesh, while the refinement criterion computation is based on hessian computations, which require a layer of overlap. What I can do is clone the dm before distributing the overlap, then manage two independent plex objects with their own local sections etc. and copy/trim local vectors manually. Is there a more automatic way to do this ? Thanks -- Nicolas From fdkong.jd at gmail.com Sat Mar 27 09:55:25 2021 From: fdkong.jd at gmail.com (Fande Kong) Date: Sat, 27 Mar 2021 08:55:25 -0600 Subject: [petsc-users] MUMPS failure In-Reply-To: <8dc371cc-61d8-c545-7ce9-6ae19221acdc@berkeley.edu> References: <3B8CB18A-556C-458E-8285-56D3C522E80E@petsc.dev> <8dc371cc-61d8-c545-7ce9-6ae19221acdc@berkeley.edu> Message-ID: There are some statements from MUMPS user manual http://mumps.enseeiht.fr/doc/userguide_5.3.5.pdf " A full 64-bit integer version can be obtained compiling MUMPS with C preprocessing flag -DINTSIZE64 and Fortran compiler option -i8, -fdefault-integer-8 or something equivalent depending on your compiler, and compiling all libraries including MPI, BLACS, ScaLAPACK, LAPACK and BLAS also with 64-bit integers. We refer the reader to the ?INSTALL? file provided with the package for details and explanations of the compilation flags controlling integer sizes. " It seems possible to build a full-64-bit-integer version of MUMPS. 
However, I do not understand how to build MPI with 64-bit integer support. From my understanding, MPI is hard coded with an integer type (int), and there is no way to make "int" become "long" . Thanks, Fande On Tue, Mar 23, 2021 at 12:20 PM Sanjay Govindjee wrote: > I agree. If you are mixing C and Fortran, everything is *nota bene. *It > is easy to miss argument mismatches. > -sanjay > > On 3/23/21 11:04 AM, Barry Smith wrote: > > > In a pure Fortran code using -fdefault-integer-8 is probably fine. But > MUMPS is a mixture of Fortran and C code and PETSc uses MUMPs C interface. > The -fdefault-integer-8 doesn't magically fix anything in the C parts of > MUMPS. I also don't know about MPI calls and if they would need editing. > > I am not saying it is impossible to get it to work but one needs are to > insure the C portions also switch to 64 bit integers in a consistent way. > This may be all doable bit is not simply using -fdefault-integer-8 on MUMPS. > > Barry > > > On Mar 23, 2021, at 12:07 AM, Sanjay Govindjee wrote: > > Barry, > I am curious about your statement "does not work generically". If I > compile with -fdefault-integer-8, > I would assume that this produces objects/libraries that will use 64bit > integers. As long as I have not declared > explicit kind=4 integers, what else could go wrong. > -sanjay > > PS: I am not advocating this as a great idea, but I am curious if there or > other obscure compiler level things that could go wrong. > > > On 3/22/21 8:53 PM, Barry Smith wrote: > > > > On Mar 22, 2021, at 3:24 PM, Junchao Zhang > wrote: > > > > > On Mon, Mar 22, 2021 at 1:39 PM Barry Smith wrote: > >> >> Version of PETSc and MUMPS? We fixed a bug in MUMPs a couple years ago >> that produced error messages as below. Please confirm you are using the >> latest PETSc and MUMPS. >> >> You can run your production version with the option -malloc_debug ; >> this will slow it down a bit but if there is memory corruption it may >> detect it and indicate the problematic error. >> >> One also has to be careful about the size of the problem passed to >> MUMPs since PETSc/MUMPs does not fully support using all 64 bit integers. >> Is it only crashing for problems near 2 billion entries in the sparse >> matrix? >> > "problems near 2 billion entries"? I don't understand. Should not be an > issue if building petsc with 64-bit indices. > > > MUMPS does not have proper support for 64 bit indices. It relies on > add-hoc Fortran compiler command line options to support to converting > integer to 64 bit integers and does not work generically. Yes, Fortran > lovers have been doing this for 30 years inside their applications but it > does not really work in a library environment. But then a big feature of > Fortran is "who needs libraries, we just write all the code we need" > (except Eispack,Linpack,LAPACK :=-). > > > >> valgrind is the gold standard for detecting memory corruption. >> >> Barry >> >> >> On Mar 22, 2021, at 12:56 PM, Chris Hewson wrote: >> >> Hi All, >> >> I have been having a problem with MUMPS randomly crashing in our program >> and causing the entire program to crash. I am compiling in -O2 optimization >> mode and using --download-mumps etc. to compile PETSc. If I rerun the >> program, 95%+ of the time I can't reproduce the error. It seems to be a >> similar issue to this thread: >> >> https://lists.mcs.anl.gov/pipermail/petsc-users/2018-October/036372.html >> >> Similar to the resolution there I am going to try and increase icntl_14 >> and see if that resolves the issue. 
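(A small C sketch of raising ICNTL(14), the MUMPS percentage increase of the estimated working space, from PETSc; not code from this thread. It assumes a KSP that already has its operators set, and the function name is made up. The same effect is available at runtime with the option -mat_mumps_icntl_14 <pct>.)

    #include <petscksp.h>

    static PetscErrorCode BumpMumpsWorkspace(KSP ksp, PetscInt pct)
    {
      PC             pc;
      Mat            F;                                      /* the MUMPS factor matrix */
      PetscErrorCode ierr;

      PetscFunctionBeginUser;
      ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
      ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);
      ierr = PCFactorSetMatSolverType(pc, MATSOLVERMUMPS);CHKERRQ(ierr);
      ierr = PCFactorSetUpMatSolverType(pc);CHKERRQ(ierr);    /* create the factor matrix so it can be queried */
      ierr = PCFactorGetMatrix(pc, &F);CHKERRQ(ierr);
      ierr = MatMumpsSetIcntl(F, 14, pct);CHKERRQ(ierr);      /* e.g. pct = 50 for 50% extra workspace */
      PetscFunctionReturn(0);
    }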
Any other thoughts on this? >> >> Thanks, >> >> *Chris Hewson* >> Senior Reservoir Simulation Engineer >> ResFrac >> +1.587.575.9792 >> >> >> > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Mar 27 11:39:56 2021 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 27 Mar 2021 12:39:56 -0400 Subject: [petsc-users] Local Discontinuous Galerkin with PETSc TS In-Reply-To: <430E5D90-16D8-4F02-884D-4AE804BF21D3@petsc.dev> References: <3EE29E70-8ECF-4842-99DC-30E867769875@llnl.gov> <775766A0-D6D6-4007-888C-A261A139941F@petsc.dev> <430E5D90-16D8-4F02-884D-4AE804BF21D3@petsc.dev> Message-ID: On Fri, Mar 26, 2021 at 8:20 PM Barry Smith wrote: > > What is SLATE in this context? > SLATE is an extension to the Firedrake DSL that describes local elimination. The idea is that you declaratively tell it what you want, say static condensation or elimination to get the hybridized problem or Wheeler Yotov elimination, and it automatically transforms the problem to give the solve the problem after elimination, handling the local solves automatically. We definitely want this capability if we ever seriously pursue hybridization. Thomas Gibson did this, who just moved to UIUC to work with Andres and company. Thanks, Matt > On Mar 23, 2021, at 2:57 PM, Matthew Knepley wrote: > > On Tue, Mar 23, 2021 at 11:54 AM Salazar De Troya, Miguel < > salazardetro1 at llnl.gov> wrote: > >> The calculation of p1 and p2 are done by solving an element-wise local >> problem using u^n. I guess I could embed this calculation inside of the >> calculation for G = H(p1, p2). However, I am hoping to be able to solve the >> problem using firedrake-ts so the formulation is all clearly in one place >> and in variational form. Reading the manual, Section 2.5.2 DAE >> formulations, the Hessenberg Index-1 DAE case seems to be what I need, >> although it is not clear to me how one can achieve this with an IMEX >> scheme. If I have: >> > > I am almost certain that you do not want to do this. I am guessing the > Firedrake guys will agree. Did they tell you to do this? > If you had a large, nonlinear system for p1/p2, then a DAE would make > sense. Since it is just element-wise elimination, you should > roll it into the easy equation > > u' = H > > Then you can use any integrator, as Barry says, in particular a nice > symplectic integrator. My understand is that SLATE is for exactly > this kind of thing. > > Thanks, > > Matt > > >> F(U', U, t) = G(t,U) >> >> p1 = f(u_x) >> >> p2 = g(u_x) >> >> u' - H(p1, p2) = 0 >> >> >> >> where U = (p1, p2, u), F(U?, U, t) = [p1, p2, u? - H(p1, p2)],] and G(t, >> U) = [f(u_x), g(u_x), 0], is there a solver strategy that will solve for p1 >> and p2 first and then use that to solve the last equation? The jacobian for >> F in this formulation would be >> >> >> >> dF/dU = [[M, 0, 0], >> >> [0, M, 0], >> >> [H'(p1), H'(p2), \sigma*M]] >> >> >> >> where M is a mass matrix, H'(p1) is the jacobian of H(p1, p2) w.r.t. p1 >> and H'(p2), the jacobian of H(p1, p2) w.r.t. p2. H'(p1) and H'(p2) are >> unnecessary for the solver strategy I want to implement. 
>> >> >> >> Thanks >> >> Miguel >> >> >> >> >> >> >> >> *From: *Barry Smith >> *Date: *Monday, March 22, 2021 at 7:42 PM >> *To: *Matthew Knepley >> *Cc: *"Salazar De Troya, Miguel" , "Jorti, >> Zakariae via petsc-users" >> *Subject: *Re: [petsc-users] Local Discontinuous Galerkin with PETSc TS >> >> >> >> >> >> u_t = G(u) >> >> >> >> I don't see why you won't just compute any needed u_x from the given u >> and then you can use any explicit or implicit TS solver trivially. For >> implicit methods it can automatically compute the Jacobian of G for you or >> you can provide it directly. Explicit methods will just use the "old" u >> while implicit methods will use the new. >> >> >> >> Barry >> >> >> >> >> >> On Mar 22, 2021, at 7:20 PM, Matthew Knepley wrote: >> >> >> >> On Mon, Mar 22, 2021 at 7:53 PM Salazar De Troya, Miguel via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >> Hello >> >> >> >> I am interested in implementing the LDG method in ?A local discontinuous >> Galerkin method for directly solving Hamilton?Jacobi equations? >> https://www.sciencedirect.com/science/article/pii/S0021999110005255 >> . >> The equation is more or less of the form (for 1D case): >> >> p1 = f(u_x) >> >> p2 = g(u_x) >> >> u_t = H(p1, p2) >> >> >> >> where typically one solves for p1 and p2 using the previous time step >> solution ?u? and then plugs them into the third equation to obtain the next >> step solution. I am wondering if the TS infrastructure could be used to >> implement this solution scheme. Looking at the manual, I think one could >> set G(t, U) to the right-hand side in the above equations and F(t, u, u?) = >> 0 to the left-hand side, although the first two equations would not have >> time derivative. In that case, how could one take advantage of the operator >> split scheme I mentioned? Maybe using some block preconditioners? >> >> >> >> Hi Miguel, >> >> >> >> I have a simple-minded way of understanding these TS things. My heuristic >> is that you put things in F that you expect to want >> >> at u^{n+1}, and things in G that you expect to want at u^n. It is not >> that simple, since you could for instance move F and G >> >> to the LHS and have Backward Euler, but it is my rule of thumb. >> >> >> >> So, were you looking for an IMEX scheme? If so, which terms should be >> lagged? Also, from the equations above, it is hard to >> >> see why you need a solve to calculate p1/p2. It looks like just a forward >> application of an operator. >> >> >> >> Thanks, >> >> >> >> Matt >> >> >> >> I am trying to solve the Hamilton-Jacobi equation u_t ? H(u_x) = 0. I >> welcome any suggestion for better methods. >> >> >> >> Thanks >> >> Miguel >> >> >> >> Miguel A. Salazar de Troya >> >> Postdoctoral Researcher, Lawrence Livermore National Laboratory >> >> B141 >> >> Rm: 1085-5 >> >> Ph: 1(925) 422-6411 >> >> >> >> >> -- >> >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Mar 27 11:41:57 2021 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 27 Mar 2021 12:41:57 -0400 Subject: [petsc-users] MUMPS failure In-Reply-To: References: <3B8CB18A-556C-458E-8285-56D3C522E80E@petsc.dev> <8dc371cc-61d8-c545-7ce9-6ae19221acdc@berkeley.edu> Message-ID: On Sat, Mar 27, 2021 at 10:55 AM Fande Kong wrote: > There are some statements from MUMPS user manual > http://mumps.enseeiht.fr/doc/userguide_5.3.5.pdf > > " > A full 64-bit integer version can be obtained compiling MUMPS with C > preprocessing flag -DINTSIZE64 and Fortran compiler option -i8, > -fdefault-integer-8 or something equivalent depending on your compiler, and > compiling all libraries including MPI, BLACS, ScaLAPACK, LAPACK and BLAS > also with 64-bit integers. We refer the reader to the ?INSTALL? file > provided with the package for details and explanations of the compilation > flags controlling integer sizes. > " > > It seems possible to build a full-64-bit-integer version of MUMPS. > However, I do not understand how to build MPI with 64-bit integer support. > From my understanding, MPI is hard coded with an integer type (int), and > there is no way to make "int" become "long" . > We had a long conversation with the MUMPs developers about this and are aware of what it does. Thanks, Matt > Thanks, > > Fande > > > On Tue, Mar 23, 2021 at 12:20 PM Sanjay Govindjee > wrote: > >> I agree. If you are mixing C and Fortran, everything is *nota bene. *It >> is easy to miss argument mismatches. >> -sanjay >> >> On 3/23/21 11:04 AM, Barry Smith wrote: >> >> >> In a pure Fortran code using -fdefault-integer-8 is probably fine. But >> MUMPS is a mixture of Fortran and C code and PETSc uses MUMPs C interface. >> The -fdefault-integer-8 doesn't magically fix anything in the C parts of >> MUMPS. I also don't know about MPI calls and if they would need editing. >> >> I am not saying it is impossible to get it to work but one needs are >> to insure the C portions also switch to 64 bit integers in a consistent >> way. This may be all doable bit is not simply using -fdefault-integer-8 on >> MUMPS. >> >> Barry >> >> >> On Mar 23, 2021, at 12:07 AM, Sanjay Govindjee wrote: >> >> Barry, >> I am curious about your statement "does not work generically". If I >> compile with -fdefault-integer-8, >> I would assume that this produces objects/libraries that will use 64bit >> integers. As long as I have not declared >> explicit kind=4 integers, what else could go wrong. >> -sanjay >> >> PS: I am not advocating this as a great idea, but I am curious if there >> or other obscure compiler level things that could go wrong. >> >> >> On 3/22/21 8:53 PM, Barry Smith wrote: >> >> >> >> On Mar 22, 2021, at 3:24 PM, Junchao Zhang >> wrote: >> >> >> >> >> On Mon, Mar 22, 2021 at 1:39 PM Barry Smith wrote: >> >>> >>> Version of PETSc and MUMPS? We fixed a bug in MUMPs a couple years >>> ago that produced error messages as below. Please confirm you are using the >>> latest PETSc and MUMPS. >>> >>> You can run your production version with the option -malloc_debug ; >>> this will slow it down a bit but if there is memory corruption it may >>> detect it and indicate the problematic error. >>> >>> One also has to be careful about the size of the problem passed to >>> MUMPs since PETSc/MUMPs does not fully support using all 64 bit integers. 
>>> Is it only crashing for problems near 2 billion entries in the sparse >>> matrix? >>> >> "problems near 2 billion entries"? I don't understand. Should not be an >> issue if building petsc with 64-bit indices. >> >> >> MUMPS does not have proper support for 64 bit indices. It relies on >> add-hoc Fortran compiler command line options to support to converting >> integer to 64 bit integers and does not work generically. Yes, Fortran >> lovers have been doing this for 30 years inside their applications but it >> does not really work in a library environment. But then a big feature of >> Fortran is "who needs libraries, we just write all the code we need" >> (except Eispack,Linpack,LAPACK :=-). >> >> >> >>> valgrind is the gold standard for detecting memory corruption. >>> >>> Barry >>> >>> >>> On Mar 22, 2021, at 12:56 PM, Chris Hewson wrote: >>> >>> Hi All, >>> >>> I have been having a problem with MUMPS randomly crashing in our program >>> and causing the entire program to crash. I am compiling in -O2 optimization >>> mode and using --download-mumps etc. to compile PETSc. If I rerun the >>> program, 95%+ of the time I can't reproduce the error. It seems to be a >>> similar issue to this thread: >>> >>> https://lists.mcs.anl.gov/pipermail/petsc-users/2018-October/036372.html >>> >>> Similar to the resolution there I am going to try and increase icntl_14 >>> and see if that resolves the issue. Any other thoughts on this? >>> >>> Thanks, >>> >>> *Chris Hewson* >>> Senior Reservoir Simulation Engineer >>> ResFrac >>> +1.587.575.9792 >>> >>> >>> >> >> >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Sat Mar 27 12:32:57 2021 From: bsmith at petsc.dev (Barry Smith) Date: Sat, 27 Mar 2021 12:32:57 -0500 Subject: [petsc-users] Local Discontinuous Galerkin with PETSc TS In-Reply-To: References: <3EE29E70-8ECF-4842-99DC-30E867769875@llnl.gov> <775766A0-D6D6-4007-888C-A261A139941F@petsc.dev> <430E5D90-16D8-4F02-884D-4AE804BF21D3@petsc.dev> Message-ID: Ok, thanks. I took a quick look at the paper, I don't see any elimination or need for SLATE. The p1 and p2 are two different "up wind" approximations for Ux. There are two of them so their difference can provide a stabilization in Lax-Friedrichs. I don't understand everything but the implementation looks very straightforward with RK. Barry > On Mar 27, 2021, at 11:39 AM, Matthew Knepley wrote: > > On Fri, Mar 26, 2021 at 8:20 PM Barry Smith > wrote: > > What is SLATE in this context? > > SLATE is an extension to the Firedrake DSL that describes local elimination. The idea is that you declaratively tell it what you want, > say static condensation or elimination to get the hybridized problem or Wheeler Yotov elimination, and it automatically transforms the > problem to give the solve the problem after elimination, handling the local solves automatically. We definitely want this capability if we > ever seriously pursue hybridization. Thomas Gibson did this, who just moved to UIUC to work with Andres and company. > > Thanks, > > Matt >> On Mar 23, 2021, at 2:57 PM, Matthew Knepley > wrote: >> >> On Tue, Mar 23, 2021 at 11:54 AM Salazar De Troya, Miguel > wrote: >> The calculation of p1 and p2 are done by solving an element-wise local problem using u^n. 
I guess I could embed this calculation inside of the calculation for G = H(p1, p2). However, I am hoping to be able to solve the problem using firedrake-ts so the formulation is all clearly in one place and in variational form. Reading the manual, Section 2.5.2 DAE formulations, the Hessenberg Index-1 DAE case seems to be what I need, although it is not clear to me how one can achieve this with an IMEX scheme. If I have: >> >> >> I am almost certain that you do not want to do this. I am guessing the Firedrake guys will agree. Did they tell you to do this? >> If you had a large, nonlinear system for p1/p2, then a DAE would make sense. Since it is just element-wise elimination, you should >> roll it into the easy equation >> >> u' = H >> >> Then you can use any integrator, as Barry says, in particular a nice symplectic integrator. My understand is that SLATE is for exactly >> this kind of thing. >> >> Thanks, >> >> Matt >> >> F(U', U, t) = G(t,U) >> >> p1 = f(u_x) >> >> p2 = g(u_x) >> >> u' - H(p1, p2) = 0 >> >> >> >> where U = (p1, p2, u), F(U?, U, t) = [p1, p2, u? - H(p1, p2)],] and G(t, U) = [f(u_x), g(u_x), 0], is there a solver strategy that will solve for p1 and p2 first and then use that to solve the last equation? The jacobian for F in this formulation would be >> >> >> >> dF/dU = [[M, 0, 0], >> >> [0, M, 0], >> >> [H'(p1), H'(p2), \sigma*M]] >> >> >> >> where M is a mass matrix, H'(p1) is the jacobian of H(p1, p2) w.r.t. p1 and H'(p2), the jacobian of H(p1, p2) w.r.t. p2. H'(p1) and H'(p2) are unnecessary for the solver strategy I want to implement. >> >> >> >> Thanks >> >> Miguel >> >> >> >> >> >> >> >> From: Barry Smith > >> Date: Monday, March 22, 2021 at 7:42 PM >> To: Matthew Knepley > >> Cc: "Salazar De Troya, Miguel" >, "Jorti, Zakariae via petsc-users" > >> Subject: Re: [petsc-users] Local Discontinuous Galerkin with PETSc TS >> >> >> >> >> >> u_t = G(u) >> >> >> >> I don't see why you won't just compute any needed u_x from the given u and then you can use any explicit or implicit TS solver trivially. For implicit methods it can automatically compute the Jacobian of G for you or you can provide it directly. Explicit methods will just use the "old" u while implicit methods will use the new. >> >> >> >> Barry >> >> >> >> >> >> >> On Mar 22, 2021, at 7:20 PM, Matthew Knepley > wrote: >> >> >> >> On Mon, Mar 22, 2021 at 7:53 PM Salazar De Troya, Miguel via petsc-users > wrote: >> >> Hello >> >> >> >> I am interested in implementing the LDG method in ?A local discontinuous Galerkin method for directly solving Hamilton?Jacobi equations?https://www.sciencedirect.com/science/article/pii/S0021999110005255 . The equation is more or less of the form (for 1D case): >> >> p1 = f(u_x) >> >> p2 = g(u_x) >> >> u_t = H(p1, p2) >> >> >> >> where typically one solves for p1 and p2 using the previous time step solution ?u? and then plugs them into the third equation to obtain the next step solution. I am wondering if the TS infrastructure could be used to implement this solution scheme. Looking at the manual, I think one could set G(t, U) to the right-hand side in the above equations and F(t, u, u?) = 0 to the left-hand side, although the first two equations would not have time derivative. In that case, how could one take advantage of the operator split scheme I mentioned? Maybe using some block preconditioners? >> >> >> >> Hi Miguel, >> >> >> >> I have a simple-minded way of understanding these TS things. 
My heuristic is that you put things in F that you expect to want >> >> at u^{n+1}, and things in G that you expect to want at u^n. It is not that simple, since you could for instance move F and G >> >> to the LHS and have Backward Euler, but it is my rule of thumb. >> >> >> >> So, were you looking for an IMEX scheme? If so, which terms should be lagged? Also, from the equations above, it is hard to >> >> see why you need a solve to calculate p1/p2. It looks like just a forward application of an operator. >> >> >> >> Thanks, >> >> >> >> Matt >> >> >> >> I am trying to solve the Hamilton-Jacobi equation u_t ? H(u_x) = 0. I welcome any suggestion for better methods. >> >> >> >> Thanks >> >> Miguel >> >> >> >> Miguel A. Salazar de Troya >> >> Postdoctoral Researcher, Lawrence Livermore National Laboratory >> >> B141 >> >> Rm: 1085-5 >> >> Ph: 1(925) 422-6411 >> >> >> >> >> >> -- >> >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Sat Mar 27 12:39:44 2021 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Sat, 27 Mar 2021 12:39:44 -0500 Subject: [petsc-users] MUMPS failure In-Reply-To: References: <3B8CB18A-556C-458E-8285-56D3C522E80E@petsc.dev> <8dc371cc-61d8-c545-7ce9-6ae19221acdc@berkeley.edu> Message-ID: On Sat, Mar 27, 2021 at 9:55 AM Fande Kong wrote: > There are some statements from MUMPS user manual > http://mumps.enseeiht.fr/doc/userguide_5.3.5.pdf > > " > A full 64-bit integer version can be obtained compiling MUMPS with C > preprocessing flag -DINTSIZE64 and Fortran compiler option -i8, > -fdefault-integer-8 or something equivalent depending on your compiler, and > compiling all libraries including MPI, BLACS, ScaLAPACK, LAPACK and BLAS > also with 64-bit integers. We refer the reader to the ?INSTALL? file > provided with the package for details and explanations of the compilation > flags controlling integer sizes. > " > > It seems possible to build a full-64-bit-integer version of MUMPS. > However, I do not understand how to build MPI with 64-bit integer support. > From my understanding, MPI is hard coded with an integer type (int), and > there is no way to make "int" become "long" . > Fande, it is possible for Fortran code, see https://software.intel.com/content/www/us/en/develop/documentation/mpi-developer-guide-linux/top/compiling-and-linking/ilp64-support.html > > > Thanks, > > Fande > > > On Tue, Mar 23, 2021 at 12:20 PM Sanjay Govindjee > wrote: > >> I agree. If you are mixing C and Fortran, everything is *nota bene. *It >> is easy to miss argument mismatches. >> -sanjay >> >> On 3/23/21 11:04 AM, Barry Smith wrote: >> >> >> In a pure Fortran code using -fdefault-integer-8 is probably fine. But >> MUMPS is a mixture of Fortran and C code and PETSc uses MUMPs C interface. 
>> The -fdefault-integer-8 doesn't magically fix anything in the C parts of >> MUMPS. I also don't know about MPI calls and if they would need editing. >> >> I am not saying it is impossible to get it to work but one needs are >> to insure the C portions also switch to 64 bit integers in a consistent >> way. This may be all doable bit is not simply using -fdefault-integer-8 on >> MUMPS. >> >> Barry >> >> >> On Mar 23, 2021, at 12:07 AM, Sanjay Govindjee wrote: >> >> Barry, >> I am curious about your statement "does not work generically". If I >> compile with -fdefault-integer-8, >> I would assume that this produces objects/libraries that will use 64bit >> integers. As long as I have not declared >> explicit kind=4 integers, what else could go wrong. >> -sanjay >> >> PS: I am not advocating this as a great idea, but I am curious if there >> or other obscure compiler level things that could go wrong. >> >> >> On 3/22/21 8:53 PM, Barry Smith wrote: >> >> >> >> On Mar 22, 2021, at 3:24 PM, Junchao Zhang >> wrote: >> >> >> >> >> On Mon, Mar 22, 2021 at 1:39 PM Barry Smith wrote: >> >>> >>> Version of PETSc and MUMPS? We fixed a bug in MUMPs a couple years >>> ago that produced error messages as below. Please confirm you are using the >>> latest PETSc and MUMPS. >>> >>> You can run your production version with the option -malloc_debug ; >>> this will slow it down a bit but if there is memory corruption it may >>> detect it and indicate the problematic error. >>> >>> One also has to be careful about the size of the problem passed to >>> MUMPs since PETSc/MUMPs does not fully support using all 64 bit integers. >>> Is it only crashing for problems near 2 billion entries in the sparse >>> matrix? >>> >> "problems near 2 billion entries"? I don't understand. Should not be an >> issue if building petsc with 64-bit indices. >> >> >> MUMPS does not have proper support for 64 bit indices. It relies on >> add-hoc Fortran compiler command line options to support to converting >> integer to 64 bit integers and does not work generically. Yes, Fortran >> lovers have been doing this for 30 years inside their applications but it >> does not really work in a library environment. But then a big feature of >> Fortran is "who needs libraries, we just write all the code we need" >> (except Eispack,Linpack,LAPACK :=-). >> >> >> >>> valgrind is the gold standard for detecting memory corruption. >>> >>> Barry >>> >>> >>> On Mar 22, 2021, at 12:56 PM, Chris Hewson wrote: >>> >>> Hi All, >>> >>> I have been having a problem with MUMPS randomly crashing in our program >>> and causing the entire program to crash. I am compiling in -O2 optimization >>> mode and using --download-mumps etc. to compile PETSc. If I rerun the >>> program, 95%+ of the time I can't reproduce the error. It seems to be a >>> similar issue to this thread: >>> >>> https://lists.mcs.anl.gov/pipermail/petsc-users/2018-October/036372.html >>> >>> Similar to the resolution there I am going to try and increase icntl_14 >>> and see if that resolves the issue. Any other thoughts on this? >>> >>> Thanks, >>> >>> *Chris Hewson* >>> Senior Reservoir Simulation Engineer >>> ResFrac >>> +1.587.575.9792 >>> >>> >>> >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cebau.mail at gmail.com Sat Mar 27 13:58:59 2021 From: cebau.mail at gmail.com (C B) Date: Sat, 27 Mar 2021 13:58:59 -0500 Subject: [petsc-users] unsubscribe In-Reply-To: References: <3B8CB18A-556C-458E-8285-56D3C522E80E@petsc.dev> <8dc371cc-61d8-c545-7ce9-6ae19221acdc@berkeley.edu> Message-ID: unsubscribe >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From sayosale at hotmail.com Sun Mar 28 04:25:16 2021 From: sayosale at hotmail.com (dazza simplythebest) Date: Sun, 28 Mar 2021 09:25:16 +0000 Subject: [petsc-users] Newbie question: Something is wrong with this Slepc simple example for generalised eigenvalue problem Message-ID: Dear All, I am seeking to use slepc/petsc to solve a generalised eigenvalue problem that arises in a hydrodynamic stability problem. The code is a parallelisation of an existing serial Fortran code. Before I get to grips with this target problem, I would of course like to get some relevant examples working. I have installed petsc/ slepc seemingly without any problem, and the provided slepc example fortran program ex1f.F, which solves a regular eigenvalue problem Ax = lambda x, seemed to compile and run correctly. I have now written a short program to instead solve the complex generalised problem Ax = lambda B x (see below) . This code compiles and runs w/out errors but for some reason hangs when calling EPSSolve - we enter EPSSolve but never leave. The matrices appear to be correctly assembled -all the values are correct in the Matview printout, so I am not quite sure where I have gone wrong, can anyone spot my mistake? ( Note that for the actual problem I wish to solve I have already written the code to construct the matrix, which distributes the rows across the processes and it is fully tested and working. Hence I want to specify the distribution of rows and not leave it up to a PETS_DECIDE .) I would be very grateful if someone can point out what is wrong with this small example code (see below), Many thanks, Dan. ! this program MUST be run with NGLOBAL = 10, MY_NO_ROWS = 5 ! and two MPI processes since row distribution is hard-baked into code ! program main #include use slepceps implicit none ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ! Declarations ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ! ! Variables: ! A , B double complex operator matrices ! eps eigenproblem solver context Mat A,B EPS eps EPSType tname PetscReal tol, error PetscScalar kr, ki Vec xr, xi PetscInt NGLOBAL , MY_NO_ROWS, NL3, owner PetscInt nev, maxit, its, nconv PetscInt i,j,ic,jc PetscReal reala, imaga, realb, imagb, di, dj PetscScalar a_entry, b_entry PetscMPIInt rank PetscErrorCode ierr PetscInt,parameter :: zero = 0, one = 1, two = 2, three = 3 PetscInt M1, N1, mm1, nn1 ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ! Beginning of program ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - call SlepcInitialize(PETSC_NULL_CHARACTER,ierr) call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) ! make sure you set NGLOBAL = 10, MY_NO_ROWS = 5 and run with two processes NGLOBAL = 10 MY_NO_ROWS = 5 ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ! Compute the operator matrices that define the eigensystem, Ax=kBx ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - !!!!!!!!! 
Setup A matrix call MatCreate(PETSC_COMM_WORLD,A,ierr) call MatSetsizes(A,MY_NO_ROWS, MY_NO_ROWS ,NGLOBAL,NGLOBAL,ierr) call MatSetFromOptions(A,ierr) call MatGetSize(A,M1,N1,ierr) write(*,*)'Rank [',rank,']: global size of A is ',M1, N1 call MatGetLocalSize(A,mm1,nn1,ierr) write(*,*)'Rank [',rank,']: my local size of A is ',mm1, nn1 call MatMPIAIJSetPreallocation(A,three, PETSC_NULL_INTEGER,one, & & PETSC_NULL_INTEGER,ierr) !parallel (MPI) allocation !!!!!!!!! Setup B matrix call MatCreate(PETSC_COMM_WORLD,B,ierr) call MatSetsizes(B,MY_NO_ROWS, MY_NO_ROWS ,NGLOBAL,NGLOBAL,ierr) call MatSetFromOptions(B,ierr) call MatGetSize(B,M1,N1,ierr) write(*,*)'Rank [',rank,']: global size of B is ',M1, N1 call MatGetLocalSize(B,mm1,nn1,ierr) write(*,*)'Rank [',rank,']: my local size of B is ',mm1, nn1 call MatMPIAIJSetPreallocation(B,three, PETSC_NULL_INTEGER,one, & & PETSC_NULL_INTEGER,ierr) !parallel (MPI) allocation ! initalise call MatZeroEntries(A,ierr) call MatZeroEntries(B,ierr) ! Fill in values of A, B and assemble matrices ! Both matrices are tridiagonal with ! Aij = cmplx( (i-j)**2, (i+j)**2) ! Bij = cmplx( ij/i + j, (i/j)**2) ! (numbering from 1 ) do i = 1, NGLOBAL ! a rather crude way to distribute rows if (i < 6) owner = 0 if (i >= 6) owner = 1 if (rank /= owner) cycle do j = 1, NGLOBAL if ( abs(i-j) < 2 ) then write(*,*)rank,' : Setting ',i,j di = dble(i) ; dj = dble(j) reala = (di - dj)**2 ; imaga = (di + dj)**2 a_entry = dcmplx(reala, imaga) realb = (di*dj)/(di + dj) ; imagb = di**2/dj**2 b_entry = dcmplx(realb, imagb) ic = i -1 ; jc = j-1 ! convert to C indexing call MatSetValue(A, ic, jc, a_entry, ADD_VALUES,ierr) call MatSetValue(B, ic, jc, b_entry, ADD_VALUES,ierr) endif enddo enddo call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr) call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr) call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,ierr) call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,ierr) ! Check matrices write(*,*)'A matrix ... ' call MatView(A,PETSC_VIEWER_STDOUT_WORLD,ierr) write(*,*)'B matrix ... ' call MatView(B,PETSC_VIEWER_STDOUT_WORLD,ierr) call MatCreateVecs(A,PETSC_NULL_VEC,xr) call MatCreateVecs(A, PETSC_NULL_VEC,xi) ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ! Create the eigensolver and display info ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ! ** Create eigensolver context call EPSCreate(PETSC_COMM_WORLD,eps,ierr) ! ** Set operators.for general problem Ax = lambda B x call EPSSetOperators(eps,A, B, ierr) call EPSSetProblemType(eps,EPS_GNHEP,ierr) ! ** Set solver parameters at runtime call EPSSetFromOptions(eps,ierr) ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ! Solve the eigensystem ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - write(*,*)'rank',rank, 'entering solver ...' call EPSSolve(eps,ierr) ! ** Free work space call EPSDestroy(eps,ierr) call MatDestroy(A,ierr) call MatDestroy(B,ierr) call VecDestroy(xr,ierr) call VecDestroy(xi,ierr) call SlepcFinalize(ierr) end -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Sun Mar 28 11:42:23 2021 From: jroman at dsic.upv.es (Jose E. 
Roman) Date: Sun, 28 Mar 2021 18:42:23 +0200 Subject: [petsc-users] Newbie question: Something is wrong with this Slepc simple example for generalised eigenvalue problem In-Reply-To: References: Message-ID: <52F02412-EAAE-4A66-97AF-AD97E5A3AED5@dsic.upv.es> You should run in debug mode until you get a correct code, otherwise you may no see some error messages. Also, it is recommended to add error checking after every call to PETSc, otherwise the execution continues and may get blocked. See the CHKERRA macro in https://slepc.upv.es/documentation/current/src/eps/tutorials/ex1f90.F90.html The problem you are probably having is that you are running with several MPI processes, so you need a parallel LU solver. See FAQ #10 https://slepc.upv.es/documentation/faq.htm Jose > El 28 mar 2021, a las 11:25, dazza simplythebest escribi?: > > Dear All, > I am seeking to use slepc/petsc to solve a generalised eigenvalue problem > that arises in a hydrodynamic stability problem. The code is a parallelisation of an existing > serial Fortran code. Before I get to grips with this target problem, I would of course like to get > some relevant examples working. I have installed petsc/ slepc seemingly without any problem, > and the provided slepc example fortran program ex1f.F, which solves a regular eigenvalue > problem Ax = lambda x, seemed to compile and run correctly. > I have now written a short program to instead solve the complex generalised > problem Ax = lambda B x (see below) . This code compiles and runs w/out errors but > for some reason hangs when calling EPSSolve - we enter EPSSolve but never leave. > The matrices appear to be correctly assembled -all the values are correct in the Matview > printout, so I am not quite sure where I have gone wrong, can anyone spot my mistake? > ( Note that for the actual problem I wish to solve I have already written the code to construct the matrix, > which distributes the rows across the processes and it is fully tested and working. Hence I want to specify > the distribution of rows and not leave it up to a PETS_DECIDE .) > I would be very grateful if someone can point out what is wrong with this small example code (see below), > Many thanks, > Dan. > > ! this program MUST be run with NGLOBAL = 10, MY_NO_ROWS = 5 > ! and two MPI processes since row distribution is hard-baked into code > ! > program main > #include > use slepceps > implicit none > > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > ! Declarations > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > ! > ! Variables: > ! A , B double complex operator matrices > ! eps eigenproblem solver context > > Mat A,B > EPS eps > EPSType tname > PetscReal tol, error > PetscScalar kr, ki > Vec xr, xi > PetscInt NGLOBAL , MY_NO_ROWS, NL3, owner > PetscInt nev, maxit, its, nconv > PetscInt i,j,ic,jc > PetscReal reala, imaga, realb, imagb, di, dj > PetscScalar a_entry, b_entry > PetscMPIInt rank > PetscErrorCode ierr > PetscInt,parameter :: zero = 0, one = 1, two = 2, three = 3 > > PetscInt M1, N1, mm1, nn1 > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > ! Beginning of program > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > call SlepcInitialize(PETSC_NULL_CHARACTER,ierr) > call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) > ! make sure you set NGLOBAL = 10, MY_NO_ROWS = 5 and run with two processes > NGLOBAL = 10 > MY_NO_ROWS = 5 > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > ! 
Compute the operator matrices that define the eigensystem, Ax=kBx > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > !!!!!!!!! Setup A matrix > > call MatCreate(PETSC_COMM_WORLD,A,ierr) > call MatSetsizes(A,MY_NO_ROWS, MY_NO_ROWS ,NGLOBAL,NGLOBAL,ierr) > call MatSetFromOptions(A,ierr) > call MatGetSize(A,M1,N1,ierr) > write(*,*)'Rank [',rank,']: global size of A is ',M1, N1 > call MatGetLocalSize(A,mm1,nn1,ierr) > write(*,*)'Rank [',rank,']: my local size of A is ',mm1, nn1 > call MatMPIAIJSetPreallocation(A,three, PETSC_NULL_INTEGER,one, & > & PETSC_NULL_INTEGER,ierr) !parallel (MPI) allocation > > !!!!!!!!! Setup B matrix > call MatCreate(PETSC_COMM_WORLD,B,ierr) > call MatSetsizes(B,MY_NO_ROWS, MY_NO_ROWS ,NGLOBAL,NGLOBAL,ierr) > call MatSetFromOptions(B,ierr) > call MatGetSize(B,M1,N1,ierr) > write(*,*)'Rank [',rank,']: global size of B is ',M1, N1 > call MatGetLocalSize(B,mm1,nn1,ierr) > write(*,*)'Rank [',rank,']: my local size of B is ',mm1, nn1 > > call MatMPIAIJSetPreallocation(B,three, PETSC_NULL_INTEGER,one, & > & PETSC_NULL_INTEGER,ierr) !parallel (MPI) allocation > > ! initalise > call MatZeroEntries(A,ierr) > call MatZeroEntries(B,ierr) > > ! Fill in values of A, B and assemble matrices > ! Both matrices are tridiagonal with > ! Aij = cmplx( (i-j)**2, (i+j)**2) > ! Bij = cmplx( ij/i + j, (i/j)**2) > ! (numbering from 1 ) > > do i = 1, NGLOBAL > ! a rather crude way to distribute rows > if (i < 6) owner = 0 > if (i >= 6) owner = 1 > if (rank /= owner) cycle > do j = 1, NGLOBAL > if ( abs(i-j) < 2 ) then > write(*,*)rank,' : Setting ',i,j > di = dble(i) ; dj = dble(j) > > reala = (di - dj)**2 ; imaga = (di + dj)**2 > a_entry = dcmplx(reala, imaga) > realb = (di*dj)/(di + dj) ; imagb = di**2/dj**2 > b_entry = dcmplx(realb, imagb) > > ic = i -1 ; jc = j-1 ! convert to C indexing > call MatSetValue(A, ic, jc, a_entry, ADD_VALUES,ierr) > call MatSetValue(B, ic, jc, b_entry, ADD_VALUES,ierr) > endif > enddo > enddo > > call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr) > call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr) > call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,ierr) > call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,ierr) > > ! Check matrices > write(*,*)'A matrix ... ' > call MatView(A,PETSC_VIEWER_STDOUT_WORLD,ierr) > write(*,*)'B matrix ... ' > call MatView(B,PETSC_VIEWER_STDOUT_WORLD,ierr) > > call MatCreateVecs(A,PETSC_NULL_VEC,xr) > call MatCreateVecs(A, PETSC_NULL_VEC,xi) > > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > ! Create the eigensolver and display info > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > ! ** Create eigensolver context > call EPSCreate(PETSC_COMM_WORLD,eps,ierr) > > ! ** Set operators.for general problem Ax = lambda B x > call EPSSetOperators(eps,A, B, ierr) > call EPSSetProblemType(eps,EPS_GNHEP,ierr) > > ! ** Set solver parameters at runtime > call EPSSetFromOptions(eps,ierr) > > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > ! Solve the eigensystem > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > write(*,*)'rank',rank, 'entering solver ...' > call EPSSolve(eps,ierr) > > ! 
** Free work space > call EPSDestroy(eps,ierr) > call MatDestroy(A,ierr) > call MatDestroy(B,ierr) > > call VecDestroy(xr,ierr) > call VecDestroy(xi,ierr) > > call SlepcFinalize(ierr) > end From gohardoust at gmail.com Sun Mar 28 12:50:38 2021 From: gohardoust at gmail.com (Mohammad Gohardoust) Date: Sun, 28 Mar 2021 10:50:38 -0700 Subject: [petsc-users] Code speedup after upgrading In-Reply-To: References: Message-ID: Here is the plot of run time in old and new petsc using 1,2,4,8, and 16 CPUs (in logarithmic scale): [image: Screenshot from 2021-03-28 10-48-56.png] On Thu, Mar 25, 2021 at 12:51 PM Mohammad Gohardoust wrote: > That's right, these loops also take roughly half time as well. If I am not > mistaken, petsc (MatSetValue) is called after doing some calculations over > each tetrahedral element. > Thanks for your suggestion. I will try that and will post the results. > > Mohammad > > On Wed, Mar 24, 2021 at 3:23 PM Junchao Zhang > wrote: > >> >> >> >> On Wed, Mar 24, 2021 at 2:17 AM Mohammad Gohardoust >> wrote: >> >>> So the code itself is a finite-element scheme and in stage 1 and 3 there >>> are expensive loops over entire mesh elements which consume a lot of time. >>> >> So these expensive loops must also take half time with newer petsc? And >> these loops do not call petsc routines? >> I think you can build two PETSc versions with the same configuration >> options, then run your code with one MPI rank to see if there is a >> difference. >> If they give the same performance, then scale to 2, 4, ... ranks and see >> what happens. >> >> >> >>> >>> Mohammad >>> >>> On Tue, Mar 23, 2021 at 6:08 PM Junchao Zhang >>> wrote: >>> >>>> In the new log, I saw >>>> >>>> Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- >>>> Avg %Total Avg %Total Count %Total Avg %Total Count %Total >>>> 0: Main Stage: 5.4095e+00 2.3% 4.3700e+03 0.0% 4.764e+05 3.0% 3.135e+02 1.0% 2.244e+04 12.6% 1: Solute_Assembly: 1.3977e+02 59.4% 7.3353e+09 4.6% 3.263e+06 20.7% 1.278e+03 26.9% 1.059e+04 6.0% >>>> >>>> >>>> But I didn't see any event in this stage had a cost close to 140s. What >>>> happened? 
>>>> >>>> --- Event Stage 1: Solute_Assembly >>>> >>>> BuildTwoSided 3531 1.0 2.8025e+0026.3 0.00e+00 0.0 3.6e+05 4.0e+00 3.5e+03 1 0 2 0 2 1 0 11 0 33 0 >>>> BuildTwoSidedF 3531 1.0 2.8678e+0013.2 0.00e+00 0.0 7.1e+05 3.6e+03 3.5e+03 1 0 5 17 2 1 0 22 62 33 0 >>>> VecScatterBegin 7062 1.0 7.1911e-02 1.9 0.00e+00 0.0 7.1e+05 3.5e+02 0.0e+00 0 0 5 2 0 0 0 22 6 0 0 >>>> VecScatterEnd 7062 1.0 2.1248e-01 3.0 1.60e+06 2.7 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 73 >>>> SFBcastOpBegin 3531 1.0 2.6516e-02 2.4 0.00e+00 0.0 3.6e+05 3.5e+02 0.0e+00 0 0 2 1 0 0 0 11 3 0 0 >>>> SFBcastOpEnd 3531 1.0 9.5041e-02 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>> SFReduceBegin 3531 1.0 3.8955e-02 2.1 0.00e+00 0.0 3.6e+05 3.5e+02 0.0e+00 0 0 2 1 0 0 0 11 3 0 0 >>>> SFReduceEnd 3531 1.0 1.3791e-01 3.9 1.60e+06 2.7 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 112 >>>> SFPack 7062 1.0 6.5591e-03 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>> SFUnpack 7062 1.0 7.4186e-03 2.1 1.60e+06 2.7 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2080 >>>> MatAssemblyBegin 3531 1.0 4.7846e+00 1.1 0.00e+00 0.0 7.1e+05 3.6e+03 3.5e+03 2 0 5 17 2 3 0 22 62 33 0 >>>> MatAssemblyEnd 3531 1.0 1.5468e+00 2.7 1.68e+07 2.7 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 2 0 0 0 104 >>>> MatZeroEntries 3531 1.0 3.0998e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>> >>>> >>>> --Junchao Zhang >>>> >>>> >>>> >>>> On Tue, Mar 23, 2021 at 5:24 PM Mohammad Gohardoust < >>>> gohardoust at gmail.com> wrote: >>>> >>>>> Thanks Dave for your reply. >>>>> >>>>> For sure PETSc is awesome :D >>>>> >>>>> Yes, in both cases petsc was configured with --with-debugging=0 and >>>>> fortunately I do have the old and new -log-veiw outputs which I attached. >>>>> >>>>> Best, >>>>> Mohammad >>>>> >>>>> On Tue, Mar 23, 2021 at 1:37 AM Dave May >>>>> wrote: >>>>> >>>>>> Nice to hear! >>>>>> The answer is simple, PETSc is awesome :) >>>>>> >>>>>> Jokes aside, assuming both petsc builds were configured with >>>>>> ?with-debugging=0, I don?t think there is a definitive answer to your >>>>>> question with the information you provided. >>>>>> >>>>>> It could be as simple as one specific implementation you use was >>>>>> improved between petsc releases. Not being an Ubuntu expert, the change >>>>>> might be associated with using a different compiler, and or a more >>>>>> efficient BLAS implementation (non threaded vs threaded). However I doubt >>>>>> this is the origin of your 2x performance increase. >>>>>> >>>>>> If you really want to understand where the performance improvement >>>>>> originated from, you?d need to send to the email list the result of >>>>>> -log_view from both the old and new versions, running the exact same >>>>>> problem. >>>>>> >>>>>> From that info, we can see what implementations in PETSc are being >>>>>> used and where the time reduction is occurring. Knowing that, it should be >>>>>> clearer to provide an explanation for it. >>>>>> >>>>>> >>>>>> Thanks, >>>>>> Dave >>>>>> >>>>>> >>>>>> On Tue 23. Mar 2021 at 06:24, Mohammad Gohardoust < >>>>>> gohardoust at gmail.com> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I am using a code which is based on petsc (and also parmetis). 
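(An aside on the log output above: named stages such as "Solute_Assembly" are registered by the application itself, which is what makes the per-stage timings comparable between the two builds. A minimal sketch, with the stage variable name invented here:)

    PetscLogStage assembly_stage;

    ierr = PetscLogStageRegister("Solute_Assembly", &assembly_stage);CHKERRQ(ierr);
    /* ... */
    ierr = PetscLogStagePush(assembly_stage);CHKERRQ(ierr);
    /* element loop with MatSetValues()/MatAssemblyBegin()/MatAssemblyEnd() */
    ierr = PetscLogStagePop();CHKERRQ(ierr);
    /* -log_view then reports this work under "Solute_Assembly" rather than the main stage */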
>>>>>>> Recently I made the following changes and now the code is running about two >>>>>>> times faster than before: >>>>>>> >>>>>>> - Upgraded Ubuntu 18.04 to 20.04 >>>>>>> - Upgraded petsc 3.13.4 to 3.14.5 >>>>>>> - This time I installed parmetis and metis directly via petsc by >>>>>>> --download-parmetis --download-metis flags instead of installing them >>>>>>> separately and using --with-parmetis-include=... and >>>>>>> --with-parmetis-lib=... (the version of installed parmetis was 4.0.3 before) >>>>>>> >>>>>>> I was wondering what can possibly explain this speedup? Does anyone >>>>>>> have any suggestions? >>>>>>> >>>>>>> Thanks, >>>>>>> Mohammad >>>>>>> >>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Screenshot from 2021-03-28 10-48-56.png Type: image/png Size: 14457 bytes Desc: not available URL: From jed at jedbrown.org Sun Mar 28 15:33:46 2021 From: jed at jedbrown.org (Jed Brown) Date: Sun, 28 Mar 2021 14:33:46 -0600 Subject: [petsc-users] Code speedup after upgrading In-Reply-To: References: Message-ID: <87mtunnhlh.fsf@jedbrown.org> I take it this was using MAT_SUBSET_OFF_PROC_ENTRIES. I implemented that to help performance of PHASTA and other applications that assemble matrices that are relatively cheap to solve (so assembly cost is significant compared to preconditioner setup and KSPSolve) and I'm glad it helps so much here. I don't have an explanation for why you're observing local vector operations like VecScale and VecMAXPY running over twice as fast in the new code. These consist of simple code that has not changed, and which are normally memory bandwidth limited (though some of your problem sizes might fit in cache). Mohammad Gohardoust writes: > Here is the plot of run time in old and new petsc using 1,2,4,8, and 16 > CPUs (in logarithmic scale): > > [image: Screenshot from 2021-03-28 10-48-56.png] > > > > > On Thu, Mar 25, 2021 at 12:51 PM Mohammad Gohardoust > wrote: > >> That's right, these loops also take roughly half time as well. If I am not >> mistaken, petsc (MatSetValue) is called after doing some calculations over >> each tetrahedral element. >> Thanks for your suggestion. I will try that and will post the results. >> >> Mohammad >> >> On Wed, Mar 24, 2021 at 3:23 PM Junchao Zhang >> wrote: >> >>> >>> >>> >>> On Wed, Mar 24, 2021 at 2:17 AM Mohammad Gohardoust >>> wrote: >>> >>>> So the code itself is a finite-element scheme and in stage 1 and 3 there >>>> are expensive loops over entire mesh elements which consume a lot of time. >>>> >>> So these expensive loops must also take half time with newer petsc? And >>> these loops do not call petsc routines? >>> I think you can build two PETSc versions with the same configuration >>> options, then run your code with one MPI rank to see if there is a >>> difference. >>> If they give the same performance, then scale to 2, 4, ... ranks and see >>> what happens. 
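(A short C sketch of the MAT_SUBSET_OFF_PROC_ENTRIES option Jed mentions above; not code from this thread. Roughly, the option promises that every assembly after the first will set at most the off-process entries set during the first one, so the assembly communication setup can be reused across time steps.)

    /* once, after preallocation and before the first assembly */
    ierr = MatSetOption(A, MAT_SUBSET_OFF_PROC_ENTRIES, PETSC_TRUE);CHKERRQ(ierr);

    /* every time step */
    ierr = MatZeroEntries(A);CHKERRQ(ierr);
    /* ... MatSetValues() over the local elements ... */
    ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);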
>>> >>> >>> >>>> >>>> Mohammad >>>> >>>> On Tue, Mar 23, 2021 at 6:08 PM Junchao Zhang >>>> wrote: >>>> >>>>> In the new log, I saw >>>>> >>>>> Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- >>>>> Avg %Total Avg %Total Count %Total Avg %Total Count %Total >>>>> 0: Main Stage: 5.4095e+00 2.3% 4.3700e+03 0.0% 4.764e+05 3.0% 3.135e+02 1.0% 2.244e+04 12.6% 1: Solute_Assembly: 1.3977e+02 59.4% 7.3353e+09 4.6% 3.263e+06 20.7% 1.278e+03 26.9% 1.059e+04 6.0% >>>>> >>>>> >>>>> But I didn't see any event in this stage had a cost close to 140s. What >>>>> happened? >>>>> >>>>> --- Event Stage 1: Solute_Assembly >>>>> >>>>> BuildTwoSided 3531 1.0 2.8025e+0026.3 0.00e+00 0.0 3.6e+05 4.0e+00 3.5e+03 1 0 2 0 2 1 0 11 0 33 0 >>>>> BuildTwoSidedF 3531 1.0 2.8678e+0013.2 0.00e+00 0.0 7.1e+05 3.6e+03 3.5e+03 1 0 5 17 2 1 0 22 62 33 0 >>>>> VecScatterBegin 7062 1.0 7.1911e-02 1.9 0.00e+00 0.0 7.1e+05 3.5e+02 0.0e+00 0 0 5 2 0 0 0 22 6 0 0 >>>>> VecScatterEnd 7062 1.0 2.1248e-01 3.0 1.60e+06 2.7 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 73 >>>>> SFBcastOpBegin 3531 1.0 2.6516e-02 2.4 0.00e+00 0.0 3.6e+05 3.5e+02 0.0e+00 0 0 2 1 0 0 0 11 3 0 0 >>>>> SFBcastOpEnd 3531 1.0 9.5041e-02 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>>> SFReduceBegin 3531 1.0 3.8955e-02 2.1 0.00e+00 0.0 3.6e+05 3.5e+02 0.0e+00 0 0 2 1 0 0 0 11 3 0 0 >>>>> SFReduceEnd 3531 1.0 1.3791e-01 3.9 1.60e+06 2.7 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 112 >>>>> SFPack 7062 1.0 6.5591e-03 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>>> SFUnpack 7062 1.0 7.4186e-03 2.1 1.60e+06 2.7 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2080 >>>>> MatAssemblyBegin 3531 1.0 4.7846e+00 1.1 0.00e+00 0.0 7.1e+05 3.6e+03 3.5e+03 2 0 5 17 2 3 0 22 62 33 0 >>>>> MatAssemblyEnd 3531 1.0 1.5468e+00 2.7 1.68e+07 2.7 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 2 0 0 0 104 >>>>> MatZeroEntries 3531 1.0 3.0998e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>>> >>>>> >>>>> --Junchao Zhang >>>>> >>>>> >>>>> >>>>> On Tue, Mar 23, 2021 at 5:24 PM Mohammad Gohardoust < >>>>> gohardoust at gmail.com> wrote: >>>>> >>>>>> Thanks Dave for your reply. >>>>>> >>>>>> For sure PETSc is awesome :D >>>>>> >>>>>> Yes, in both cases petsc was configured with --with-debugging=0 and >>>>>> fortunately I do have the old and new -log-veiw outputs which I attached. >>>>>> >>>>>> Best, >>>>>> Mohammad >>>>>> >>>>>> On Tue, Mar 23, 2021 at 1:37 AM Dave May >>>>>> wrote: >>>>>> >>>>>>> Nice to hear! >>>>>>> The answer is simple, PETSc is awesome :) >>>>>>> >>>>>>> Jokes aside, assuming both petsc builds were configured with >>>>>>> ?with-debugging=0, I don?t think there is a definitive answer to your >>>>>>> question with the information you provided. >>>>>>> >>>>>>> It could be as simple as one specific implementation you use was >>>>>>> improved between petsc releases. Not being an Ubuntu expert, the change >>>>>>> might be associated with using a different compiler, and or a more >>>>>>> efficient BLAS implementation (non threaded vs threaded). However I doubt >>>>>>> this is the origin of your 2x performance increase. >>>>>>> >>>>>>> If you really want to understand where the performance improvement >>>>>>> originated from, you?d need to send to the email list the result of >>>>>>> -log_view from both the old and new versions, running the exact same >>>>>>> problem. 
>>>>>>> >>>>>>> From that info, we can see what implementations in PETSc are being >>>>>>> used and where the time reduction is occurring. Knowing that, it should be >>>>>>> clearer to provide an explanation for it. >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Dave >>>>>>> >>>>>>> >>>>>>> On Tue 23. Mar 2021 at 06:24, Mohammad Gohardoust < >>>>>>> gohardoust at gmail.com> wrote: >>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> I am using a code which is based on petsc (and also parmetis). >>>>>>>> Recently I made the following changes and now the code is running about two >>>>>>>> times faster than before: >>>>>>>> >>>>>>>> - Upgraded Ubuntu 18.04 to 20.04 >>>>>>>> - Upgraded petsc 3.13.4 to 3.14.5 >>>>>>>> - This time I installed parmetis and metis directly via petsc by >>>>>>>> --download-parmetis --download-metis flags instead of installing them >>>>>>>> separately and using --with-parmetis-include=... and >>>>>>>> --with-parmetis-lib=... (the version of installed parmetis was 4.0.3 before) >>>>>>>> >>>>>>>> I was wondering what can possibly explain this speedup? Does anyone >>>>>>>> have any suggestions? >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Mohammad >>>>>>>> >>>>>>> From cebau.mail at gmail.com Sun Mar 28 15:48:52 2021 From: cebau.mail at gmail.com (C B) Date: Sun, 28 Mar 2021 15:48:52 -0500 Subject: [petsc-users] unsubscribe Message-ID: unsubscribe -------------- next part -------------- An HTML attachment was scrubbed... URL: From snailsoar at hotmail.com Sun Mar 28 17:05:21 2021 From: snailsoar at hotmail.com (feng wang) Date: Sun, 28 Mar 2021 22:05:21 +0000 Subject: [petsc-users] Questions on matrix-free GMRES implementation In-Reply-To: <1599C26D-14C3-4EA7-9CD3-F0526F098AD6@petsc.dev> References: <584E6514-C3C6-469B-A256-5470811D8D52@petsc.dev> <38509E59-D27A-47C6-8D97-EAAEBFC15FBF@petsc.dev> <5B018B57-B679-4015-8097-042B7C6B9D38@petsc.dev> <151FDDB8-2384-4A3E-9B17-45318E2CC7CC@petsc.dev> , <1599C26D-14C3-4EA7-9CD3-F0526F098AD6@petsc.dev> Message-ID: Hi Barry, Thanks for your comments. I will try that. Thanks, Feng ________________________________ From: Barry Smith Sent: 26 March 2021 23:44 To: feng wang Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation On Mar 25, 2021, at 6:39 PM, feng wang > wrote: Hi Barry, Thanks for your comments. I will renumber the cells in the way as you recommended. I went through the manual again and understand how to update the halo elements for my shell matrix routine "mymult(Mat m ,Vec x, Vec y)". I can use the global index of ghost cells for each rank and "Vec x" to get the ghost values for each rank via scattering. It should be similar to the example in page 40 in the manual. One more question, I also have an assembled approximate Jacobian matrix for pre-conditioning GMRES. If I re-number the cells properly as your suggested, I don't need to worry about communication and petsc will handle it properly together with my shell-matrix? If you assembly the approximate Jaocobian using the "new" ordering then it will reflect the same function evaluation and matrix free operators so should be ok. Barry Thanks, Feng ________________________________ From: Barry Smith > Sent: 25 March 2021 0:03 To: feng wang > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation On Mar 24, 2021, at 5:08 AM, feng wang > wrote: Hi Barry, Thanks for your comments. It's very helpful. For your comments, I have a bit more questions 1. for your 1st comment " Yes, in some sense. 
So long as each process ....". * If I understand it correctly (hopefully) a parallel vector in petsc can hold discontinuous rows of data in a global array. If this is true, If I call "VecGetArray", it would create a copy in a continuous space if the data is not continuous, do some operations and petsc will figure out how to put updated values back to the right place in the global array? * This would generate an overhead. If I do the renumbering to make each process hold continuous rows, this overhead can be avoided when I call "VecGetArray"? GetArray does nothing except return the pointer to the data in the vector. It does not copy anything or reorder anything. Whatever order the numbers are in vector they are in the same order as in the array you obtain with VecGetArray. 1. for your 2nd comment " The matrix and vectors the algebraic solvers see DO NOT have......." For the callback function of my shell matrix "mymult(Mat m ,Vec x, Vec y)", I need to get "x" for the halo elements to compute the non-linear function. My code will take care of other halo exchanges, but I am not sure how to use petsc to get the halo elements "x" in the shell matrix, could you please elaborate on this? some related examples or simple pesudo code would be great. Basically all the parallel code in PETSc does this. How you need to set up the halo communication depends on how you are managing the assignment of degrees of freedom on each process and between processes. VecScatterCreate() is the tool you will use to tell PETSc how to get the correct values from one process to their halo-ed location on the process. It like everything in PETSc uses a number in the vectors of 0 ... n_0-1 on the first process, n_0, n_0+1, ... n_1-1 on the second etc. Since you are managing the partitioning and distribution of parallel data you must renumber the vector entry numbering in your data structures to match that shown above. Just do the numbering once after you have setup your distributed data and use it for the rest of the run. You might use the object from AOCreate to do the renumbering for you. Barry Thanks, Feng ________________________________ From: Barry Smith > Sent: 22 March 2021 1:28 To: feng wang > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation On Mar 21, 2021, at 6:22 PM, feng wang > wrote: Hi Barry, Thanks for your help, I really appreciate it. In the end I used a shell matrix to compute the matrix-vector product, it is clearer to me and there are more things under my control. I am now trying to do a parallel implementation, I have some questions on setting up parallel matrices and vectors for a user-defined partition, could you please provide some advice? Suppose I have already got a partition for 2 CPUs. Each cpu is assigned a list of elements and also their halo elements. 1. The global element index for each partition is not necessarily continuous, do I have to I re-order them to make them continuous? Yes, in some sense. So long as each process can march over ITS elements computing the function and Jacobian matrix-vector product it doesn't matter how you have named/numbered entries. But conceptually the first process has the first set of vector entries and the second the second set. 1. 2. When I set up the size of the matrix and vectors for each cpu, should I take into account the halo elements? The matrix and vectors the algebraic solvers see DO NOT have halo elements in their sizes. 
You will likely need a halo-ed work vector to do the matrix-free multiply from. The standard model is use VecScatterBegin/End to get the values from the non-halo-ed algebraic vector input to MatMult into a halo-ed one to do the local product. 1. In my serial version, when I initialize my RHS vector, I am not using VecSetValues, Instead I use VecGetArray/VecRestoreArray to assign the values. VecAssemblyBegin()/VecAssemblyEnd() is never used. would this still work for a parallel version? Yes, you can use Get/Restore but the input vector x will need to be, as noted above, scattered into a haloed version to get all the entries you will need to do the local part of the product. Thanks, Feng ________________________________ From: Barry Smith > Sent: 12 March 2021 23:40 To: feng wang > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation On Mar 12, 2021, at 9:37 AM, feng wang > wrote: Hi Matt, Thanks for your prompt response. Below are my two versions. one is buggy and the 2nd one is working. For the first one, I add the diagonal contribution to the true RHS (variable: rhs) and then set the base point, the callback function is somehow called twice afterwards to compute Jacobian. Do you mean "to compute the Jacobian matrix-vector product?" Is it only in the first computation of the product (for the given base vector) that it calls it twice or every matrix-vector product? It is possible there is a bug in our logic; run in the debugger with a break point in FormFunction_mf and each time the function is hit in the debugger type where or bt to get the stack frames from the calls. Send this. From this we can all see if it is being called excessively and why. For the 2nd one, I just call the callback function manually to recompute everything, the callback function is then called once as expected to compute the Jacobian. For me, both versions should do the same things. but I don't know why in the first one the callback function is called twice after I set the base point. what could possibly go wrong? The logic of how it is suppose to work is shown below. Thanks, Feng //This does not work fld->cnsv( iqs,iqe, q, aux, csv ); //add contribution of time-stepping for(iv=0; ivcnsv( iqs,iqe, q, aux, csv ); ierr = petsc_setcsv(petsc_csv); CHKERRQ(ierr); ierr = FormFunction_mf(this, petsc_csv, petsc_baserhs); //this is my callback function, now call it manually ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); Since you provide petsc_baserhs MatMFFD assumes (naturally) that you will keep the correct values in it. Hence for each new base value YOU need to compute the new values in petsc_baserhs. This approach gives you a bit more control over reusing the information in petsc_baserhs. If you would prefer that MatMFFD recomputes the base values, as needed, then you call FormFunction_mf(this, petsc_csv, NULL); and PETSc will allocate a vector and fill it up as needed by calling your FormFunction_mf() But you need to call MatAssemblyBegin/End each time you the base input vector this, petsc_csv values change. For example MatAssemblyBegin(petsc_A_mf,...) MatAssemblyEnd(petsc_A_mf,...) KSPSolve() ________________________________ From: Matthew Knepley > Sent: 12 March 2021 15:08 To: feng wang > Cc: Barry Smith >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation On Fri, Mar 12, 2021 at 9:55 AM feng wang > wrote: Hi Mat, Thanks for your reply. I will try the parallel implementation. 
I've got a serial matrix-free GMRES working, but I would like to know why my initial version of matrix-free implementation does not work and there is still something I don't understand. I did some debugging and find that the callback function to compute the RHS for the matrix-free matrix is called twice by Petsc when it computes the finite difference Jacobian, but it should only be called once. I don't know why, could you please give some advice? F is called once to calculate the base point and once to get the perturbation. The base point is not recalculated, so if you do many iterates, it is amortized. Thanks, Matt Thanks, Feng ________________________________ From: Matthew Knepley > Sent: 12 March 2021 12:05 To: feng wang > Cc: Barry Smith >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation On Fri, Mar 12, 2021 at 6:02 AM feng wang > wrote: Hi Barry, Thanks for your advice. You are right on this. somehow there is some inconsistency when I compute the right hand side (true RHS + time-stepping contribution to the diagonal matrix) to compute the finite difference Jacobian. If I just use the call back function to recompute my RHS before I call MatMFFDSetBase, then it works like a charm. But now I end up with computing my RHS three times. 1st time is to compute the true RHS, the rest two is for computing finite difference Jacobian. In my previous buggy version, I only compute RHS twice. If possible, could you elaborate on your comments "Also be careful about petsc_baserhs", so I may possibly understand what was going on with my buggy version. Our FD implementation is simple. It approximates the action of the Jacobian as J(b) v = (F(b + h v) - F(b)) / h ||v|| where h is some small parameter and b is the base vector, namely the one that you are linearizing around. In a Newton step, b is the previous solution and v is the proposed solution update. Besides, for a parallel implementation, my code already has its own partition method, is it possible to allow petsc read in a user-defined partition? if not what is a better way to do this? Sure https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecSetSizes.html Thanks, Matt Many thanks, Feng ________________________________ From: Barry Smith > Sent: 11 March 2021 22:15 To: feng wang > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation Feng, The first thing to check is that for each linear solve that involves a new operator (values in the base vector) the MFFD matrix knows it is using a new operator. The easiest way is to call MatMFFDSetBase() before each solve that involves a new operator (new values in the base vector). Also be careful about petsc_baserhs, when you change the base vector's values you also need to change the petsc_baserhs values to the function evaluation at that point. If that is correct I would check with a trivial function evaluator to make sure the infrastructure is all set up correctly. For examples use for the matrix free a 1 4 1 operator applied matrix free. Barry On Mar 11, 2021, at 7:35 AM, feng wang > wrote: Dear All, I am new to petsc and trying to implement a matrix-free GMRES. I have assembled an approximate Jacobian matrix just for preconditioning. 
After reading some previous questions on this topic, my approach is: the matrix-free matrix is created as: ierr = MatCreateMFFD(*A_COMM_WORLD, iqe*blocksize, iqe*blocksize, PETSC_DETERMINE, PETSC_DETERMINE, &petsc_A_mf); CHKERRQ(ierr); ierr = MatMFFDSetFunction(petsc_A_mf, FormFunction_mf, this); CHKERRQ(ierr); KSP linear operator is set up as: ierr = KSPSetOperators(petsc_ksp, petsc_A_mf, petsc_A_pre); CHKERRQ(ierr); //petsc_A_pre is my assembled pre-conditioning matrix Before calling KSPSolve, I do: ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr); //petsc_csv is the flow states, petsc_baserhs is the pre-computed right hand side The call back function is defined as: PetscErrorCode cFdDomain::FormFunction_mf(void *ctx, Vec in_vec, Vec out_vec) { PetscErrorCode ierr; cFdDomain *user_ctx; cout << "FormFunction_mf called\n"; //in_vec: flow states //out_vec: right hand side + diagonal contributions from CFL number user_ctx = (cFdDomain*)ctx; //get perturbed conservative variables from petsc user_ctx->petsc_getcsv(in_vec); //get new right side user_ctx->petsc_fd_rhs(); //set new right hand side to the output vector user_ctx->petsc_setrhs(out_vec); ierr = 0; return ierr; } The linear system I am solving is (J+D)x=RHS. J is the Jacobian matrix. D is a diagonal matrix and it is used to stabilise the solution at the start but reduced gradually when the solution moves on to recover Newton's method. I add D*x to the true right side when non-linear function is computed to work out finite difference Jacobian, so when finite difference is used, it actually computes (J+D)*dx. The code runs but diverges in the end. If I don't do matrix-free and use my approximate Jacobian matrix, GMRES works. So something is wrong with my matrix-free implementation. Have I missed something in my implementation? Besides, is there a way to check if the finite difference Jacobian matrix is computed correctly in a matrix-free implementation? Thanks for your help in advance. Feng -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Sun Mar 28 19:48:20 2021 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Sun, 28 Mar 2021 19:48:20 -0500 Subject: [petsc-users] Code speedup after upgrading In-Reply-To: <87mtunnhlh.fsf@jedbrown.org> References: <87mtunnhlh.fsf@jedbrown.org> Message-ID: Is there an option to turn off MAT_SUBSET_OFF_PROC_ENTRIES for Mohammad to try? --Junchao Zhang On Sun, Mar 28, 2021 at 3:34 PM Jed Brown wrote: > I take it this was using MAT_SUBSET_OFF_PROC_ENTRIES. I implemented that > to help performance of PHASTA and other applications that assemble matrices > that are relatively cheap to solve (so assembly cost is significant > compared to preconditioner setup and KSPSolve) and I'm glad it helps so > much here. > > I don't have an explanation for why you're observing local vector > operations like VecScale and VecMAXPY running over twice as fast in the new > code. 
These consist of simple code that has not changed, and which are > normally memory bandwidth limited (though some of your problem sizes might > fit in cache). > > Mohammad Gohardoust writes: > > > Here is the plot of run time in old and new petsc using 1,2,4,8, and 16 > > CPUs (in logarithmic scale): > > > > [image: Screenshot from 2021-03-28 10-48-56.png] > > > > > > > > > > On Thu, Mar 25, 2021 at 12:51 PM Mohammad Gohardoust < > gohardoust at gmail.com> > > wrote: > > > >> That's right, these loops also take roughly half time as well. If I am > not > >> mistaken, petsc (MatSetValue) is called after doing some calculations > over > >> each tetrahedral element. > >> Thanks for your suggestion. I will try that and will post the results. > >> > >> Mohammad > >> > >> On Wed, Mar 24, 2021 at 3:23 PM Junchao Zhang > >> wrote: > >> > >>> > >>> > >>> > >>> On Wed, Mar 24, 2021 at 2:17 AM Mohammad Gohardoust < > gohardoust at gmail.com> > >>> wrote: > >>> > >>>> So the code itself is a finite-element scheme and in stage 1 and 3 > there > >>>> are expensive loops over entire mesh elements which consume a lot of > time. > >>>> > >>> So these expensive loops must also take half time with newer petsc? > And > >>> these loops do not call petsc routines? > >>> I think you can build two PETSc versions with the same configuration > >>> options, then run your code with one MPI rank to see if there is a > >>> difference. > >>> If they give the same performance, then scale to 2, 4, ... ranks and > see > >>> what happens. > >>> > >>> > >>> > >>>> > >>>> Mohammad > >>>> > >>>> On Tue, Mar 23, 2021 at 6:08 PM Junchao Zhang < > junchao.zhang at gmail.com> > >>>> wrote: > >>>> > >>>>> In the new log, I saw > >>>>> > >>>>> Summary of Stages: ----- Time ------ ----- Flop ------ --- > Messages --- -- Message Lengths -- -- Reductions -- > >>>>> Avg %Total Avg %Total Count > %Total Avg %Total Count %Total > >>>>> 0: Main Stage: 5.4095e+00 2.3% 4.3700e+03 0.0% > 4.764e+05 3.0% 3.135e+02 1.0% 2.244e+04 12.6% 1: > Solute_Assembly: 1.3977e+02 59.4% 7.3353e+09 4.6% 3.263e+06 20.7% > 1.278e+03 26.9% 1.059e+04 6.0% > >>>>> > >>>>> > >>>>> But I didn't see any event in this stage had a cost close to 140s. > What > >>>>> happened? 
> >>>>> > >>>>> --- Event Stage 1: Solute_Assembly > >>>>> > >>>>> BuildTwoSided 3531 1.0 2.8025e+0026.3 0.00e+00 0.0 3.6e+05 > 4.0e+00 3.5e+03 1 0 2 0 2 1 0 11 0 33 0 > >>>>> BuildTwoSidedF 3531 1.0 2.8678e+0013.2 0.00e+00 0.0 7.1e+05 > 3.6e+03 3.5e+03 1 0 5 17 2 1 0 22 62 33 0 > >>>>> VecScatterBegin 7062 1.0 7.1911e-02 1.9 0.00e+00 0.0 7.1e+05 > 3.5e+02 0.0e+00 0 0 5 2 0 0 0 22 6 0 0 > >>>>> VecScatterEnd 7062 1.0 2.1248e-01 3.0 1.60e+06 2.7 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 73 > >>>>> SFBcastOpBegin 3531 1.0 2.6516e-02 2.4 0.00e+00 0.0 3.6e+05 > 3.5e+02 0.0e+00 0 0 2 1 0 0 0 11 3 0 0 > >>>>> SFBcastOpEnd 3531 1.0 9.5041e-02 4.7 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >>>>> SFReduceBegin 3531 1.0 3.8955e-02 2.1 0.00e+00 0.0 3.6e+05 > 3.5e+02 0.0e+00 0 0 2 1 0 0 0 11 3 0 0 > >>>>> SFReduceEnd 3531 1.0 1.3791e-01 3.9 1.60e+06 2.7 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 112 > >>>>> SFPack 7062 1.0 6.5591e-03 2.5 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >>>>> SFUnpack 7062 1.0 7.4186e-03 2.1 1.60e+06 2.7 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2080 > >>>>> MatAssemblyBegin 3531 1.0 4.7846e+00 1.1 0.00e+00 0.0 7.1e+05 > 3.6e+03 3.5e+03 2 0 5 17 2 3 0 22 62 33 0 > >>>>> MatAssemblyEnd 3531 1.0 1.5468e+00 2.7 1.68e+07 2.7 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 1 2 0 0 0 104 > >>>>> MatZeroEntries 3531 1.0 3.0998e-02 1.2 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > >>>>> > >>>>> > >>>>> --Junchao Zhang > >>>>> > >>>>> > >>>>> > >>>>> On Tue, Mar 23, 2021 at 5:24 PM Mohammad Gohardoust < > >>>>> gohardoust at gmail.com> wrote: > >>>>> > >>>>>> Thanks Dave for your reply. > >>>>>> > >>>>>> For sure PETSc is awesome :D > >>>>>> > >>>>>> Yes, in both cases petsc was configured with --with-debugging=0 and > >>>>>> fortunately I do have the old and new -log-veiw outputs which I > attached. > >>>>>> > >>>>>> Best, > >>>>>> Mohammad > >>>>>> > >>>>>> On Tue, Mar 23, 2021 at 1:37 AM Dave May > >>>>>> wrote: > >>>>>> > >>>>>>> Nice to hear! > >>>>>>> The answer is simple, PETSc is awesome :) > >>>>>>> > >>>>>>> Jokes aside, assuming both petsc builds were configured with > >>>>>>> ?with-debugging=0, I don?t think there is a definitive answer to > your > >>>>>>> question with the information you provided. > >>>>>>> > >>>>>>> It could be as simple as one specific implementation you use was > >>>>>>> improved between petsc releases. Not being an Ubuntu expert, the > change > >>>>>>> might be associated with using a different compiler, and or a more > >>>>>>> efficient BLAS implementation (non threaded vs threaded). However > I doubt > >>>>>>> this is the origin of your 2x performance increase. > >>>>>>> > >>>>>>> If you really want to understand where the performance improvement > >>>>>>> originated from, you?d need to send to the email list the result of > >>>>>>> -log_view from both the old and new versions, running the exact > same > >>>>>>> problem. > >>>>>>> > >>>>>>> From that info, we can see what implementations in PETSc are being > >>>>>>> used and where the time reduction is occurring. Knowing that, it > should be > >>>>>>> clearer to provide an explanation for it. > >>>>>>> > >>>>>>> > >>>>>>> Thanks, > >>>>>>> Dave > >>>>>>> > >>>>>>> > >>>>>>> On Tue 23. Mar 2021 at 06:24, Mohammad Gohardoust < > >>>>>>> gohardoust at gmail.com> wrote: > >>>>>>> > >>>>>>>> Hi, > >>>>>>>> > >>>>>>>> I am using a code which is based on petsc (and also parmetis). 
> >>>>>>>> Recently I made the following changes and now the code is running > about two > >>>>>>>> times faster than before: > >>>>>>>> > >>>>>>>> - Upgraded Ubuntu 18.04 to 20.04 > >>>>>>>> - Upgraded petsc 3.13.4 to 3.14.5 > >>>>>>>> - This time I installed parmetis and metis directly via petsc > by > >>>>>>>> --download-parmetis --download-metis flags instead of > installing them > >>>>>>>> separately and using --with-parmetis-include=... and > >>>>>>>> --with-parmetis-lib=... (the version of installed parmetis was > 4.0.3 before) > >>>>>>>> > >>>>>>>> I was wondering what can possibly explain this speedup? Does > anyone > >>>>>>>> have any suggestions? > >>>>>>>> > >>>>>>>> Thanks, > >>>>>>>> Mohammad > >>>>>>>> > >>>>>>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Sun Mar 28 22:11:35 2021 From: jed at jedbrown.org (Jed Brown) Date: Sun, 28 Mar 2021 21:11:35 -0600 Subject: [petsc-users] Code speedup after upgrading In-Reply-To: References: <87mtunnhlh.fsf@jedbrown.org> Message-ID: <87blb27ixk.fsf@jedbrown.org> It's an option that he would have set explicitly via MatSetOption, following Lawrence's suggestion. He can either not call that function or use PETSC_FALSE to unset it. Junchao Zhang writes: > Is there an option to turn off MAT_SUBSET_OFF_PROC_ENTRIES for Mohammad to > try? > > --Junchao Zhang > > > On Sun, Mar 28, 2021 at 3:34 PM Jed Brown wrote: > >> I take it this was using MAT_SUBSET_OFF_PROC_ENTRIES. I implemented that >> to help performance of PHASTA and other applications that assemble matrices >> that are relatively cheap to solve (so assembly cost is significant >> compared to preconditioner setup and KSPSolve) and I'm glad it helps so >> much here. >> >> I don't have an explanation for why you're observing local vector >> operations like VecScale and VecMAXPY running over twice as fast in the new >> code. These consist of simple code that has not changed, and which are >> normally memory bandwidth limited (though some of your problem sizes might >> fit in cache). >> >> Mohammad Gohardoust writes: >> >> > Here is the plot of run time in old and new petsc using 1,2,4,8, and 16 >> > CPUs (in logarithmic scale): >> > >> > [image: Screenshot from 2021-03-28 10-48-56.png] >> > >> > >> > >> > >> > On Thu, Mar 25, 2021 at 12:51 PM Mohammad Gohardoust < >> gohardoust at gmail.com> >> > wrote: >> > >> >> That's right, these loops also take roughly half time as well. If I am >> not >> >> mistaken, petsc (MatSetValue) is called after doing some calculations >> over >> >> each tetrahedral element. >> >> Thanks for your suggestion. I will try that and will post the results. >> >> >> >> Mohammad >> >> >> >> On Wed, Mar 24, 2021 at 3:23 PM Junchao Zhang >> >> wrote: >> >> >> >>> >> >>> >> >>> >> >>> On Wed, Mar 24, 2021 at 2:17 AM Mohammad Gohardoust < >> gohardoust at gmail.com> >> >>> wrote: >> >>> >> >>>> So the code itself is a finite-element scheme and in stage 1 and 3 >> there >> >>>> are expensive loops over entire mesh elements which consume a lot of >> time. >> >>>> >> >>> So these expensive loops must also take half time with newer petsc? >> And >> >>> these loops do not call petsc routines? >> >>> I think you can build two PETSc versions with the same configuration >> >>> options, then run your code with one MPI rank to see if there is a >> >>> difference. >> >>> If they give the same performance, then scale to 2, 4, ... ranks and >> see >> >>> what happens. 
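(For completeness: if part of that comparison is checking whether MAT_SUBSET_OFF_PROC_ENTRIES itself is responsible, it can be turned off explicitly before assembly. A one-line sketch, assuming A is the assembled matrix in the application code:

    MatSetOption(A, MAT_SUBSET_OFF_PROC_ENTRIES, PETSC_FALSE);

or simply do not make the MatSetOption() call that enables it.)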
>> >>> >> >>> >> >>> >> >>>> >> >>>> Mohammad >> >>>> >> >>>> On Tue, Mar 23, 2021 at 6:08 PM Junchao Zhang < >> junchao.zhang at gmail.com> >> >>>> wrote: >> >>>> >> >>>>> In the new log, I saw >> >>>>> >> >>>>> Summary of Stages: ----- Time ------ ----- Flop ------ --- >> Messages --- -- Message Lengths -- -- Reductions -- >> >>>>> Avg %Total Avg %Total Count >> %Total Avg %Total Count %Total >> >>>>> 0: Main Stage: 5.4095e+00 2.3% 4.3700e+03 0.0% >> 4.764e+05 3.0% 3.135e+02 1.0% 2.244e+04 12.6% 1: >> Solute_Assembly: 1.3977e+02 59.4% 7.3353e+09 4.6% 3.263e+06 20.7% >> 1.278e+03 26.9% 1.059e+04 6.0% >> >>>>> >> >>>>> >> >>>>> But I didn't see any event in this stage had a cost close to 140s. >> What >> >>>>> happened? >> >>>>> >> >>>>> --- Event Stage 1: Solute_Assembly >> >>>>> >> >>>>> BuildTwoSided 3531 1.0 2.8025e+0026.3 0.00e+00 0.0 3.6e+05 >> 4.0e+00 3.5e+03 1 0 2 0 2 1 0 11 0 33 0 >> >>>>> BuildTwoSidedF 3531 1.0 2.8678e+0013.2 0.00e+00 0.0 7.1e+05 >> 3.6e+03 3.5e+03 1 0 5 17 2 1 0 22 62 33 0 >> >>>>> VecScatterBegin 7062 1.0 7.1911e-02 1.9 0.00e+00 0.0 7.1e+05 >> 3.5e+02 0.0e+00 0 0 5 2 0 0 0 22 6 0 0 >> >>>>> VecScatterEnd 7062 1.0 2.1248e-01 3.0 1.60e+06 2.7 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 73 >> >>>>> SFBcastOpBegin 3531 1.0 2.6516e-02 2.4 0.00e+00 0.0 3.6e+05 >> 3.5e+02 0.0e+00 0 0 2 1 0 0 0 11 3 0 0 >> >>>>> SFBcastOpEnd 3531 1.0 9.5041e-02 4.7 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >>>>> SFReduceBegin 3531 1.0 3.8955e-02 2.1 0.00e+00 0.0 3.6e+05 >> 3.5e+02 0.0e+00 0 0 2 1 0 0 0 11 3 0 0 >> >>>>> SFReduceEnd 3531 1.0 1.3791e-01 3.9 1.60e+06 2.7 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 112 >> >>>>> SFPack 7062 1.0 6.5591e-03 2.5 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >>>>> SFUnpack 7062 1.0 7.4186e-03 2.1 1.60e+06 2.7 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2080 >> >>>>> MatAssemblyBegin 3531 1.0 4.7846e+00 1.1 0.00e+00 0.0 7.1e+05 >> 3.6e+03 3.5e+03 2 0 5 17 2 3 0 22 62 33 0 >> >>>>> MatAssemblyEnd 3531 1.0 1.5468e+00 2.7 1.68e+07 2.7 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 1 2 0 0 0 104 >> >>>>> MatZeroEntries 3531 1.0 3.0998e-02 1.2 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >>>>> >> >>>>> >> >>>>> --Junchao Zhang >> >>>>> >> >>>>> >> >>>>> >> >>>>> On Tue, Mar 23, 2021 at 5:24 PM Mohammad Gohardoust < >> >>>>> gohardoust at gmail.com> wrote: >> >>>>> >> >>>>>> Thanks Dave for your reply. >> >>>>>> >> >>>>>> For sure PETSc is awesome :D >> >>>>>> >> >>>>>> Yes, in both cases petsc was configured with --with-debugging=0 and >> >>>>>> fortunately I do have the old and new -log-veiw outputs which I >> attached. >> >>>>>> >> >>>>>> Best, >> >>>>>> Mohammad >> >>>>>> >> >>>>>> On Tue, Mar 23, 2021 at 1:37 AM Dave May >> >>>>>> wrote: >> >>>>>> >> >>>>>>> Nice to hear! >> >>>>>>> The answer is simple, PETSc is awesome :) >> >>>>>>> >> >>>>>>> Jokes aside, assuming both petsc builds were configured with >> >>>>>>> ?with-debugging=0, I don?t think there is a definitive answer to >> your >> >>>>>>> question with the information you provided. >> >>>>>>> >> >>>>>>> It could be as simple as one specific implementation you use was >> >>>>>>> improved between petsc releases. Not being an Ubuntu expert, the >> change >> >>>>>>> might be associated with using a different compiler, and or a more >> >>>>>>> efficient BLAS implementation (non threaded vs threaded). However >> I doubt >> >>>>>>> this is the origin of your 2x performance increase. 
>> >>>>>>> >> >>>>>>> If you really want to understand where the performance improvement >> >>>>>>> originated from, you?d need to send to the email list the result of >> >>>>>>> -log_view from both the old and new versions, running the exact >> same >> >>>>>>> problem. >> >>>>>>> >> >>>>>>> From that info, we can see what implementations in PETSc are being >> >>>>>>> used and where the time reduction is occurring. Knowing that, it >> should be >> >>>>>>> clearer to provide an explanation for it. >> >>>>>>> >> >>>>>>> >> >>>>>>> Thanks, >> >>>>>>> Dave >> >>>>>>> >> >>>>>>> >> >>>>>>> On Tue 23. Mar 2021 at 06:24, Mohammad Gohardoust < >> >>>>>>> gohardoust at gmail.com> wrote: >> >>>>>>> >> >>>>>>>> Hi, >> >>>>>>>> >> >>>>>>>> I am using a code which is based on petsc (and also parmetis). >> >>>>>>>> Recently I made the following changes and now the code is running >> about two >> >>>>>>>> times faster than before: >> >>>>>>>> >> >>>>>>>> - Upgraded Ubuntu 18.04 to 20.04 >> >>>>>>>> - Upgraded petsc 3.13.4 to 3.14.5 >> >>>>>>>> - This time I installed parmetis and metis directly via petsc >> by >> >>>>>>>> --download-parmetis --download-metis flags instead of >> installing them >> >>>>>>>> separately and using --with-parmetis-include=... and >> >>>>>>>> --with-parmetis-lib=... (the version of installed parmetis was >> 4.0.3 before) >> >>>>>>>> >> >>>>>>>> I was wondering what can possibly explain this speedup? Does >> anyone >> >>>>>>>> have any suggestions? >> >>>>>>>> >> >>>>>>>> Thanks, >> >>>>>>>> Mohammad >> >>>>>>>> >> >>>>>>> >> From sayosale at hotmail.com Mon Mar 29 04:59:00 2021 From: sayosale at hotmail.com (dazza simplythebest) Date: Mon, 29 Mar 2021 09:59:00 +0000 Subject: [petsc-users] Newbie question: Something is wrong with this Slepc simple example for generalised eigenvalue problem In-Reply-To: <52F02412-EAAE-4A66-97AF-AD97E5A3AED5@dsic.upv.es> References: , <52F02412-EAAE-4A66-97AF-AD97E5A3AED5@dsic.upv.es> Message-ID: Dear Jose, Many thanks for your reply, it enabled me to solve the problem I described below. In particular, I recompiled PETSC and SLEPC with MUMPS and SCALAPACK included (by including the following lines in the configuration file: '--download-mumps', '--download-scalapack', '--download-cmake', ) and then the program (unchanged from the previous attempt) then both compiled and ran straightaway with no hanging. Just in case someone is following this example later, the correct eigenvalues are (I have checked them against an independent lapack zggev code): 0 (15.7283987073479, 79.3812583009335) 1 (10.3657189951037, 65.4935496512632) 2 (20.2726807729152, 60.7235264113338) 3 (15.8693370278539, 54.4403170495817) 4 (8.93270390707530, 42.0676105026967) 5 (18.0161334989426, 31.7976217614629) 6 (16.2219350827311, 26.7999463239364) 7 (6.64653598248233, 19.2535093354505) 8 (7.23494239184217, 4.58606776802574) 9 (3.68158090200136, 1.65838104812904) The problem would then indeed appear to have been - exactly as you suggested - the fact that the generalised eigenvalue problem requires linear systems to be solved as part of the solution process, but in the previous attempt I was trying to solve an MPI-distributed system using a sequential solver. A couple of other quick points / queries: 1) By 'debug mode' you mean compiling the library with '--with-debugging=1'? 2) Out of interest, I am wondering out of interest which LU solver was used - scalapack or MUMPS ? 
I could see both these libraries in the linking command, is there an easy way to find out which solver was actually called ? 2) One slightly strange thing is that I also compiled the library with '--with-64-bit-indices=1' for 64-bit memory addressing, but I noticed in the compilation commands generated by the make command there is no '-i8' flag, which is used with mpiifort to request 64 bit integers. Is it the case that this is not required since everything is somehow taken care of by Petsc custom data type ( PetscInt) ? As an experiment I tried manually inserting the -i8 and got a few errors like: /data/work/rotplane/omega_to_zero/stability/test/tmp10/tmp3/write_and_solve8.F(38): warning #6075: The data type of the actual argument does not match the definition. [PETSC_COMM_WORLD] call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr);if (ierr .ne. 0) th A big thank you once again, Best wishes, Dan. ________________________________ From: Jose E. Roman Sent: Sunday, March 28, 2021 4:42 PM To: dazza simplythebest Cc: PETSc users list Subject: Re: [petsc-users] Newbie question: Something is wrong with this Slepc simple example for generalised eigenvalue problem You should run in debug mode until you get a correct code, otherwise you may no see some error messages. Also, it is recommended to add error checking after every call to PETSc, otherwise the execution continues and may get blocked. See the CHKERRA macro in https://slepc.upv.es/documentation/current/src/eps/tutorials/ex1f90.F90.html The problem you are probably having is that you are running with several MPI processes, so you need a parallel LU solver. See FAQ #10 https://slepc.upv.es/documentation/faq.htm Jose > El 28 mar 2021, a las 11:25, dazza simplythebest escribi?: > > Dear All, > I am seeking to use slepc/petsc to solve a generalised eigenvalue problem > that arises in a hydrodynamic stability problem. The code is a parallelisation of an existing > serial Fortran code. Before I get to grips with this target problem, I would of course like to get > some relevant examples working. I have installed petsc/ slepc seemingly without any problem, > and the provided slepc example fortran program ex1f.F, which solves a regular eigenvalue > problem Ax = lambda x, seemed to compile and run correctly. > I have now written a short program to instead solve the complex generalised > problem Ax = lambda B x (see below) . This code compiles and runs w/out errors but > for some reason hangs when calling EPSSolve - we enter EPSSolve but never leave. > The matrices appear to be correctly assembled -all the values are correct in the Matview > printout, so I am not quite sure where I have gone wrong, can anyone spot my mistake? > ( Note that for the actual problem I wish to solve I have already written the code to construct the matrix, > which distributes the rows across the processes and it is fully tested and working. Hence I want to specify > the distribution of rows and not leave it up to a PETS_DECIDE .) > I would be very grateful if someone can point out what is wrong with this small example code (see below), > Many thanks, > Dan. > > ! this program MUST be run with NGLOBAL = 10, MY_NO_ROWS = 5 > ! and two MPI processes since row distribution is hard-baked into code > ! > program main > #include > use slepceps > implicit none > > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > ! Declarations > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > ! > ! Variables: > ! A , B double complex operator matrices > ! 
eps eigenproblem solver context > > Mat A,B > EPS eps > EPSType tname > PetscReal tol, error > PetscScalar kr, ki > Vec xr, xi > PetscInt NGLOBAL , MY_NO_ROWS, NL3, owner > PetscInt nev, maxit, its, nconv > PetscInt i,j,ic,jc > PetscReal reala, imaga, realb, imagb, di, dj > PetscScalar a_entry, b_entry > PetscMPIInt rank > PetscErrorCode ierr > PetscInt,parameter :: zero = 0, one = 1, two = 2, three = 3 > > PetscInt M1, N1, mm1, nn1 > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > ! Beginning of program > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > call SlepcInitialize(PETSC_NULL_CHARACTER,ierr) > call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) > ! make sure you set NGLOBAL = 10, MY_NO_ROWS = 5 and run with two processes > NGLOBAL = 10 > MY_NO_ROWS = 5 > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > ! Compute the operator matrices that define the eigensystem, Ax=kBx > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > !!!!!!!!! Setup A matrix > > call MatCreate(PETSC_COMM_WORLD,A,ierr) > call MatSetsizes(A,MY_NO_ROWS, MY_NO_ROWS ,NGLOBAL,NGLOBAL,ierr) > call MatSetFromOptions(A,ierr) > call MatGetSize(A,M1,N1,ierr) > write(*,*)'Rank [',rank,']: global size of A is ',M1, N1 > call MatGetLocalSize(A,mm1,nn1,ierr) > write(*,*)'Rank [',rank,']: my local size of A is ',mm1, nn1 > call MatMPIAIJSetPreallocation(A,three, PETSC_NULL_INTEGER,one, & > & PETSC_NULL_INTEGER,ierr) !parallel (MPI) allocation > > !!!!!!!!! Setup B matrix > call MatCreate(PETSC_COMM_WORLD,B,ierr) > call MatSetsizes(B,MY_NO_ROWS, MY_NO_ROWS ,NGLOBAL,NGLOBAL,ierr) > call MatSetFromOptions(B,ierr) > call MatGetSize(B,M1,N1,ierr) > write(*,*)'Rank [',rank,']: global size of B is ',M1, N1 > call MatGetLocalSize(B,mm1,nn1,ierr) > write(*,*)'Rank [',rank,']: my local size of B is ',mm1, nn1 > > call MatMPIAIJSetPreallocation(B,three, PETSC_NULL_INTEGER,one, & > & PETSC_NULL_INTEGER,ierr) !parallel (MPI) allocation > > ! initalise > call MatZeroEntries(A,ierr) > call MatZeroEntries(B,ierr) > > ! Fill in values of A, B and assemble matrices > ! Both matrices are tridiagonal with > ! Aij = cmplx( (i-j)**2, (i+j)**2) > ! Bij = cmplx( ij/i + j, (i/j)**2) > ! (numbering from 1 ) > > do i = 1, NGLOBAL > ! a rather crude way to distribute rows > if (i < 6) owner = 0 > if (i >= 6) owner = 1 > if (rank /= owner) cycle > do j = 1, NGLOBAL > if ( abs(i-j) < 2 ) then > write(*,*)rank,' : Setting ',i,j > di = dble(i) ; dj = dble(j) > > reala = (di - dj)**2 ; imaga = (di + dj)**2 > a_entry = dcmplx(reala, imaga) > realb = (di*dj)/(di + dj) ; imagb = di**2/dj**2 > b_entry = dcmplx(realb, imagb) > > ic = i -1 ; jc = j-1 ! convert to C indexing > call MatSetValue(A, ic, jc, a_entry, ADD_VALUES,ierr) > call MatSetValue(B, ic, jc, b_entry, ADD_VALUES,ierr) > endif > enddo > enddo > > call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr) > call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr) > call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,ierr) > call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,ierr) > > ! Check matrices > write(*,*)'A matrix ... ' > call MatView(A,PETSC_VIEWER_STDOUT_WORLD,ierr) > write(*,*)'B matrix ... ' > call MatView(B,PETSC_VIEWER_STDOUT_WORLD,ierr) > > call MatCreateVecs(A,PETSC_NULL_VEC,xr) > call MatCreateVecs(A, PETSC_NULL_VEC,xi) > > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > ! Create the eigensolver and display info > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > ! 
** Create eigensolver context > call EPSCreate(PETSC_COMM_WORLD,eps,ierr) > > ! ** Set operators.for general problem Ax = lambda B x > call EPSSetOperators(eps,A, B, ierr) > call EPSSetProblemType(eps,EPS_GNHEP,ierr) > > ! ** Set solver parameters at runtime > call EPSSetFromOptions(eps,ierr) > > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > ! Solve the eigensystem > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > write(*,*)'rank',rank, 'entering solver ...' > call EPSSolve(eps,ierr) > > ! ** Free work space > call EPSDestroy(eps,ierr) > call MatDestroy(A,ierr) > call MatDestroy(B,ierr) > > call VecDestroy(xr,ierr) > call VecDestroy(xi,ierr) > > call SlepcFinalize(ierr) > end -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Mon Mar 29 05:15:38 2021 From: jroman at dsic.upv.es (Jose E. Roman) Date: Mon, 29 Mar 2021 12:15:38 +0200 Subject: [petsc-users] Newbie question: Something is wrong with this Slepc simple example for generalised eigenvalue problem In-Reply-To: References: <52F02412-EAAE-4A66-97AF-AD97E5A3AED5@dsic.upv.es> Message-ID: <482ECB3C-20B8-4346-A320-829EDCFA5AE5@dsic.upv.es> > El 29 mar 2021, a las 11:59, dazza simplythebest escribi?: > > Dear Jose, > Many thanks for your reply, it enabled me to solve the problem I described below. > In particular, I recompiled PETSC and SLEPC with MUMPS and SCALAPACK included > (by including the following lines in the configuration file: > '--download-mumps', > '--download-scalapack', > '--download-cmake', > ) > and then the program (unchanged from the previous attempt) then both compiled > and ran straightaway with no hanging. Just in case someone is following this example later, > the correct eigenvalues are (I have checked them against an independent lapack zggev code): > 0 (15.7283987073479, 79.3812583009335) > 1 (10.3657189951037, 65.4935496512632) > 2 (20.2726807729152, 60.7235264113338) > 3 (15.8693370278539, 54.4403170495817) > 4 (8.93270390707530, 42.0676105026967) > 5 (18.0161334989426, 31.7976217614629) > 6 (16.2219350827311, 26.7999463239364) > 7 (6.64653598248233, 19.2535093354505) > 8 (7.23494239184217, 4.58606776802574) > 9 (3.68158090200136, 1.65838104812904) > > The problem would then indeed appear to have been - exactly as you suggested - > the fact that the generalised eigenvalue problem requires linear systems to be > solved as part of the solution process, but in the previous attempt I was trying > to solve an MPI-distributed system using a sequential solver. > > A couple of other quick points / queries: > 1) By 'debug mode' you mean compiling the library with '--with-debugging=1'? Yes. > 2) Out of interest, I am wondering out of interest which LU solver was used - scalapack or > MUMPS ? I could see both these libraries in the linking command, is there an easy way > to find out which solver was actually called ? MUMPS (which uses ScaLAPACK internally). Run with -eps_view and you will see the details of the solver being used. > 2) One slightly strange thing is that I also compiled the library with '--with-64-bit-indices=1' > for 64-bit memory addressing, but I noticed in the compilation commands generated > by the make command there is no '-i8' flag, which is used with mpiifort to request 64 bit > integers. Is it the case that this is not required since everything is somehow taken care of by > Petsc custom data type ( PetscInt) ? 
As an experiment I tried manually inserting the -i8 > and got a few errors like: > /data/work/rotplane/omega_to_zero/stability/test/tmp10/tmp3/write_and_solve8.F(38): warning #6075: The data type of the actual argument does not match the definition. [PETSC_COMM_WORLD] > call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr);if (ierr .ne. 0) th If you use PetscInt throughout your code, then no -i8 flag is required. Note that the argument of MPI functions should be PetscMPIInt, not PetscInt. Jose > > > A big thank you once again, > Best wishes, > Dan. > > From: Jose E. Roman > Sent: Sunday, March 28, 2021 4:42 PM > To: dazza simplythebest > Cc: PETSc users list > Subject: Re: [petsc-users] Newbie question: Something is wrong with this Slepc simple example for generalised eigenvalue problem > > You should run in debug mode until you get a correct code, otherwise you may no see some error messages. Also, it is recommended to add error checking after every call to PETSc, otherwise the execution continues and may get blocked. See the CHKERRA macro in https://slepc.upv.es/documentation/current/src/eps/tutorials/ex1f90.F90.html > > The problem you are probably having is that you are running with several MPI processes, so you need a parallel LU solver. See FAQ #10 https://slepc.upv.es/documentation/faq.htm > > Jose > > > > El 28 mar 2021, a las 11:25, dazza simplythebest escribi?: > > > > Dear All, > > I am seeking to use slepc/petsc to solve a generalised eigenvalue problem > > that arises in a hydrodynamic stability problem. The code is a parallelisation of an existing > > serial Fortran code. Before I get to grips with this target problem, I would of course like to get > > some relevant examples working. I have installed petsc/ slepc seemingly without any problem, > > and the provided slepc example fortran program ex1f.F, which solves a regular eigenvalue > > problem Ax = lambda x, seemed to compile and run correctly. > > I have now written a short program to instead solve the complex generalised > > problem Ax = lambda B x (see below) . This code compiles and runs w/out errors but > > for some reason hangs when calling EPSSolve - we enter EPSSolve but never leave. > > The matrices appear to be correctly assembled -all the values are correct in the Matview > > printout, so I am not quite sure where I have gone wrong, can anyone spot my mistake? > > ( Note that for the actual problem I wish to solve I have already written the code to construct the matrix, > > which distributes the rows across the processes and it is fully tested and working. Hence I want to specify > > the distribution of rows and not leave it up to a PETS_DECIDE .) > > I would be very grateful if someone can point out what is wrong with this small example code (see below), > > Many thanks, > > Dan. > > > > ! this program MUST be run with NGLOBAL = 10, MY_NO_ROWS = 5 > > ! and two MPI processes since row distribution is hard-baked into code > > ! > > program main > > #include > > use slepceps > > implicit none > > > > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > ! Declarations > > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > ! > > ! Variables: > > ! A , B double complex operator matrices > > ! 
eps eigenproblem solver context > > > > Mat A,B > > EPS eps > > EPSType tname > > PetscReal tol, error > > PetscScalar kr, ki > > Vec xr, xi > > PetscInt NGLOBAL , MY_NO_ROWS, NL3, owner > > PetscInt nev, maxit, its, nconv > > PetscInt i,j,ic,jc > > PetscReal reala, imaga, realb, imagb, di, dj > > PetscScalar a_entry, b_entry > > PetscMPIInt rank > > PetscErrorCode ierr > > PetscInt,parameter :: zero = 0, one = 1, two = 2, three = 3 > > > > PetscInt M1, N1, mm1, nn1 > > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > ! Beginning of program > > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > > > call SlepcInitialize(PETSC_NULL_CHARACTER,ierr) > > call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) > > ! make sure you set NGLOBAL = 10, MY_NO_ROWS = 5 and run with two processes > > NGLOBAL = 10 > > MY_NO_ROWS = 5 > > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > ! Compute the operator matrices that define the eigensystem, Ax=kBx > > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > !!!!!!!!! Setup A matrix > > > > call MatCreate(PETSC_COMM_WORLD,A,ierr) > > call MatSetsizes(A,MY_NO_ROWS, MY_NO_ROWS ,NGLOBAL,NGLOBAL,ierr) > > call MatSetFromOptions(A,ierr) > > call MatGetSize(A,M1,N1,ierr) > > write(*,*)'Rank [',rank,']: global size of A is ',M1, N1 > > call MatGetLocalSize(A,mm1,nn1,ierr) > > write(*,*)'Rank [',rank,']: my local size of A is ',mm1, nn1 > > call MatMPIAIJSetPreallocation(A,three, PETSC_NULL_INTEGER,one, & > > & PETSC_NULL_INTEGER,ierr) !parallel (MPI) allocation > > > > !!!!!!!!! Setup B matrix > > call MatCreate(PETSC_COMM_WORLD,B,ierr) > > call MatSetsizes(B,MY_NO_ROWS, MY_NO_ROWS ,NGLOBAL,NGLOBAL,ierr) > > call MatSetFromOptions(B,ierr) > > call MatGetSize(B,M1,N1,ierr) > > write(*,*)'Rank [',rank,']: global size of B is ',M1, N1 > > call MatGetLocalSize(B,mm1,nn1,ierr) > > write(*,*)'Rank [',rank,']: my local size of B is ',mm1, nn1 > > > > call MatMPIAIJSetPreallocation(B,three, PETSC_NULL_INTEGER,one, & > > & PETSC_NULL_INTEGER,ierr) !parallel (MPI) allocation > > > > ! initalise > > call MatZeroEntries(A,ierr) > > call MatZeroEntries(B,ierr) > > > > ! Fill in values of A, B and assemble matrices > > ! Both matrices are tridiagonal with > > ! Aij = cmplx( (i-j)**2, (i+j)**2) > > ! Bij = cmplx( ij/i + j, (i/j)**2) > > ! (numbering from 1 ) > > > > do i = 1, NGLOBAL > > ! a rather crude way to distribute rows > > if (i < 6) owner = 0 > > if (i >= 6) owner = 1 > > if (rank /= owner) cycle > > do j = 1, NGLOBAL > > if ( abs(i-j) < 2 ) then > > write(*,*)rank,' : Setting ',i,j > > di = dble(i) ; dj = dble(j) > > > > reala = (di - dj)**2 ; imaga = (di + dj)**2 > > a_entry = dcmplx(reala, imaga) > > realb = (di*dj)/(di + dj) ; imagb = di**2/dj**2 > > b_entry = dcmplx(realb, imagb) > > > > ic = i -1 ; jc = j-1 ! convert to C indexing > > call MatSetValue(A, ic, jc, a_entry, ADD_VALUES,ierr) > > call MatSetValue(B, ic, jc, b_entry, ADD_VALUES,ierr) > > endif > > enddo > > enddo > > > > call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr) > > call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr) > > call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,ierr) > > call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,ierr) > > > > ! Check matrices > > write(*,*)'A matrix ... ' > > call MatView(A,PETSC_VIEWER_STDOUT_WORLD,ierr) > > write(*,*)'B matrix ... 
' > > call MatView(B,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > call MatCreateVecs(A,PETSC_NULL_VEC,xr) > > call MatCreateVecs(A, PETSC_NULL_VEC,xi) > > > > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > ! Create the eigensolver and display info > > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > ! ** Create eigensolver context > > call EPSCreate(PETSC_COMM_WORLD,eps,ierr) > > > > ! ** Set operators.for general problem Ax = lambda B x > > call EPSSetOperators(eps,A, B, ierr) > > call EPSSetProblemType(eps,EPS_GNHEP,ierr) > > > > ! ** Set solver parameters at runtime > > call EPSSetFromOptions(eps,ierr) > > > > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > ! Solve the eigensystem > > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > > > write(*,*)'rank',rank, 'entering solver ...' > > call EPSSolve(eps,ierr) > > > > ! ** Free work space > > call EPSDestroy(eps,ierr) > > call MatDestroy(A,ierr) > > call MatDestroy(B,ierr) > > > > call VecDestroy(xr,ierr) > > call VecDestroy(xi,ierr) > > > > call SlepcFinalize(ierr) > > end From brardafrancesco at gmail.com Wed Mar 31 02:53:53 2021 From: brardafrancesco at gmail.com (Francesco Brarda) Date: Wed, 31 Mar 2021 09:53:53 +0200 Subject: [petsc-users] Parallel TS for ODE Message-ID: <278D238E-4519-4943-BC8D-607CF62F0025@gmail.com> Hi everyone! I am trying to solve a system of 3 ODEs (a basic SIR model) with TS. Sequentially works pretty well, but I need to switch it into a parallel version. I started working with TS not very long time ago, there are few questions I?d like to share with you and if you have any advices I?d be happy to hear. First of all, do I need to use a DM object even if the model is only time dependent? All the examples I found were using that object for the other variable when solving PDEs. When I preallocate the space for the Jacobian matrix, is it better to decide the local or global space? Best, Francesco From knepley at gmail.com Wed Mar 31 05:06:03 2021 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 Mar 2021 06:06:03 -0400 Subject: [petsc-users] Parallel TS for ODE In-Reply-To: <278D238E-4519-4943-BC8D-607CF62F0025@gmail.com> References: <278D238E-4519-4943-BC8D-607CF62F0025@gmail.com> Message-ID: On Wed, Mar 31, 2021 at 3:54 AM Francesco Brarda wrote: > Hi everyone! > > I am trying to solve a system of 3 ODEs (a basic SIR model) with TS. > Sequentially works pretty well, but I need to switch it into a parallel > version. > I started working with TS not very long time ago, there are few questions > I?d like to share with you and if you have any advices I?d be happy to hear. > First of all, do I need to use a DM object even if the model is only time > dependent? All the examples I found were using that object for the other > variable when solving PDEs. > You do not need one. We use it in examples because it makes it easy to create the data. > When I preallocate the space for the Jacobian matrix, is it better to > decide the local or global space? > Since you are producing all the Jacobian values, it is whatever is easier in your code I think. THanks, Matt > Best, > Francesco -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From brardafrancesco at gmail.com Wed Mar 31 09:15:43 2021 From: brardafrancesco at gmail.com (Francesco Brarda) Date: Wed, 31 Mar 2021 16:15:43 +0200 Subject: [petsc-users] Parallel TS for ODE In-Reply-To: References: <278D238E-4519-4943-BC8D-607CF62F0025@gmail.com> Message-ID: <3C70181A-7F7A-40C6-A27C-EE8F01558975@gmail.com> Thank you for your advices. I wrote what seems to me a very basic code, but I got this error when I run it with more than 1 processor: Clearly the result 299. is wrong but I do not understand what am doing wrong. With 1 processor it works fine. steps 150, ftime 15. Vec Object: 2 MPI processes type: mpi Process [0] 16.5613 2.91405 Process [1] 299. [0]PETSC ERROR: PetscTrFreeDefault() called from VecDestroy_MPI() line 21 in /home/fbrarda/petsc/src/vec/vec/impls/mpi/pdvec.c [0]PETSC ERROR: Block [id=0(16)] at address 0x15812a0 is corrupted (probably write past end of array) [0]PETSC ERROR: Block allocated in VecCreate_MPI_Private() line 514 in /home/fbrarda/petsc/src/vec/vec/impls/mpi/pbvec.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Memory corruption: https://www.mcs.anl.gov/petsc/documentation/installation.html#valgrind [0]PETSC ERROR: Corrupted memory [0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.14.4, unknown [0]PETSC ERROR: ./par_sir_model on a arch-debug named srvulx13 by fbrarda Wed Mar 31 16:05:22 2021 [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-openblas-dir=/opt/packages/openblas/0.2.13-gcc --download-mpich PETSC_ARCH=arch-debug [0]PETSC ERROR: #1 PetscTrFreeDefault() line 310 in /home/fbrarda/petsc/src/sys/memory/mtr.c [0]PETSC ERROR: #2 VecDestroy_MPI() line 21 in /home/fbrarda/petsc/src/vec/vec/impls/mpi/pdvec.c [0]PETSC ERROR: #3 VecDestroy() line 396 in /home/fbrarda/petsc/src/vec/vec/interface/vector.c [0]PETSC ERROR: #4 SNESLineSearchReset() line 284 in /home/fbrarda/petsc/src/snes/linesearch/interface/linesearch.c [0]PETSC ERROR: #5 SNESReset() line 3229 in /home/fbrarda/petsc/src/snes/interface/snes.c [0]PETSC ERROR: #6 TSReset() line 2800 in /home/fbrarda/petsc/src/ts/interface/ts.c [0]PETSC ERROR: #7 TSDestroy() line 2856 in /home/fbrarda/petsc/src/ts/interface/ts.c [0]PETSC ERROR: #8 main() line 256 in par_sir_model.c [0]PETSC ERROR: No PETSc Option Table entries [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- application called MPI_Abort(MPI_COMM_SELF, 256001) - process 0 [1]PETSC ERROR: PetscTrFreeDefault() called from VecDestroy_MPI() line 21 in /home/fbrarda/petsc/src/vec/vec/impls/mpi/pdvec.c [1]PETSC ERROR: Block [id=0(16)] at address 0xbd9520 is corrupted (probably write past end of array) [1]PETSC ERROR: Block allocated in VecCreate_MPI_Private() line 514 in /home/fbrarda/petsc/src/vec/vec/impls/mpi/pbvec.c [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Memory corruption: https://www.mcs.anl.gov/petsc/documentation/installation.html#valgrind [1]PETSC ERROR: Corrupted memory [1]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[1]PETSC ERROR: Petsc Release Version 3.14.4, unknown [1]PETSC ERROR: ./par_sir_model on a arch-debug named srvulx13 by fbrarda Wed Mar 31 16:05:22 2021 [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-openblas-dir=/opt/packages/openblas/0.2.13-gcc --download-mpich PETSC_ARCH=arch-debug [1]PETSC ERROR: #1 PetscTrFreeDefault() line 310 in /home/fbrarda/petsc/src/sys/memory/mtr.c [1]PETSC ERROR: #2 VecDestroy_MPI() line 21 in /home/fbrarda/petsc/src/vec/vec/impls/mpi/pdvec.c [1]PETSC ERROR: #3 VecDestroy() line 396 in /home/fbrarda/petsc/src/vec/vec/interface/vector.c [1]PETSC ERROR: #4 TSReset() line 2806 in /home/fbrarda/petsc/src/ts/interface/ts.c [1]PETSC ERROR: #5 TSDestroy() line 2856 in /home/fbrarda/petsc/src/ts/interface/ts.c [1]PETSC ERROR: #6 main() line 256 in par_sir_model.c [1]PETSC ERROR: No PETSc Option Table entries [1]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- application called MPI_Abort(MPI_COMM_SELF, 256001) - process 0 > Il giorno 31 mar 2021, alle ore 12:06, Matthew Knepley ha scritto: > > On Wed, Mar 31, 2021 at 3:54 AM Francesco Brarda wrote: > Hi everyone! > > I am trying to solve a system of 3 ODEs (a basic SIR model) with TS. Sequentially works pretty well, but I need to switch it into a parallel version. > I started working with TS not very long time ago, there are few questions I?d like to share with you and if you have any advices I?d be happy to hear. > First of all, do I need to use a DM object even if the model is only time dependent? All the examples I found were using that object for the other variable when solving PDEs. > > You do not need one. We use it in examples because it makes it easy to create the data. > > When I preallocate the space for the Jacobian matrix, is it better to decide the local or global space? > > Since you are producing all the Jacobian values, it is whatever is easier in your code I think. > > THanks, > > Matt > > Best, > Francesco > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Mar 31 09:18:53 2021 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 Mar 2021 10:18:53 -0400 Subject: [petsc-users] Parallel TS for ODE In-Reply-To: <3C70181A-7F7A-40C6-A27C-EE8F01558975@gmail.com> References: <278D238E-4519-4943-BC8D-607CF62F0025@gmail.com> <3C70181A-7F7A-40C6-A27C-EE8F01558975@gmail.com> Message-ID: On Wed, Mar 31, 2021 at 10:15 AM Francesco Brarda wrote: > Thank you for your advices. > I wrote what seems to me a very basic code, but I got this error when I > run it with more than 1 processor: > Clearly the result 299. is wrong but I do not understand what am doing > wrong. With 1 processor it works fine. > My guess is that you do VecGetArray() and index the array using global indices rather than local indices, because there memory corruption with a Vec array. Thanks, Matt > steps 150, ftime 15. > Vec Object: 2 MPI processes > type: mpi > Process [0] > 16.5613 > 2.91405 > Process [1] > 299. 
> [0]PETSC ERROR: PetscTrFreeDefault() called from VecDestroy_MPI() line 21 > in /home/fbrarda/petsc/src/vec/vec/impls/mpi/pdvec.c > [0]PETSC ERROR: Block [id=0(16)] at address 0x15812a0 is corrupted > (probably write past end of array) > [0]PETSC ERROR: Block allocated in VecCreate_MPI_Private() line 514 > in /home/fbrarda/petsc/src/vec/vec/impls/mpi/pbvec.c > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Memory corruption: > https://www.mcs.anl.gov/petsc/documentation/installation.html#valgrind > [0]PETSC ERROR: Corrupted memory > [0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.14.4, unknown > [0]PETSC ERROR: ./par_sir_model on a arch-debug named srvulx13 by fbrarda > Wed Mar 31 16:05:22 2021 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-openblas-dir=/opt/packages/openblas/0.2.13-gcc > --download-mpich PETSC_ARCH=arch-debug > [0]PETSC ERROR: #1 PetscTrFreeDefault() line 310 > in /home/fbrarda/petsc/src/sys/memory/mtr.c > [0]PETSC ERROR: #2 VecDestroy_MPI() line 21 > in /home/fbrarda/petsc/src/vec/vec/impls/mpi/pdvec.c > [0]PETSC ERROR: #3 VecDestroy() line 396 > in /home/fbrarda/petsc/src/vec/vec/interface/vector.c > [0]PETSC ERROR: #4 SNESLineSearchReset() line 284 > in /home/fbrarda/petsc/src/snes/linesearch/interface/linesearch.c > [0]PETSC ERROR: #5 SNESReset() line 3229 in > /home/fbrarda/petsc/src/snes/interface/snes.c > [0]PETSC ERROR: #6 TSReset() line 2800 in > /home/fbrarda/petsc/src/ts/interface/ts.c > [0]PETSC ERROR: #7 TSDestroy() line 2856 in > /home/fbrarda/petsc/src/ts/interface/ts.c > [0]PETSC ERROR: #8 main() line 256 in par_sir_model.c > [0]PETSC ERROR: No PETSc Option Table entries > [0]PETSC ERROR: ----------------End of Error Message -------send entire > error message to petsc-maint at mcs.anl.gov---------- > application called MPI_Abort(MPI_COMM_SELF, 256001) - process 0 > [1]PETSC ERROR: PetscTrFreeDefault() called from VecDestroy_MPI() line 21 > in /home/fbrarda/petsc/src/vec/vec/impls/mpi/pdvec.c > [1]PETSC ERROR: Block [id=0(16)] at address 0xbd9520 is corrupted > (probably write past end of array) > [1]PETSC ERROR: Block allocated in VecCreate_MPI_Private() line 514 > in /home/fbrarda/petsc/src/vec/vec/impls/mpi/pbvec.c > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [1]PETSC ERROR: Memory corruption: > https://www.mcs.anl.gov/petsc/documentation/installation.html#valgrind > [1]PETSC ERROR: Corrupted memory > [1]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> [1]PETSC ERROR: Petsc Release Version 3.14.4, unknown > [1]PETSC ERROR: ./par_sir_model on a arch-debug named srvulx13 by fbrarda > Wed Mar 31 16:05:22 2021 > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-openblas-dir=/opt/packages/openblas/0.2.13-gcc > --download-mpich PETSC_ARCH=arch-debug > [1]PETSC ERROR: #1 PetscTrFreeDefault() line 310 > in /home/fbrarda/petsc/src/sys/memory/mtr.c > [1]PETSC ERROR: #2 VecDestroy_MPI() line 21 > in /home/fbrarda/petsc/src/vec/vec/impls/mpi/pdvec.c > [1]PETSC ERROR: #3 VecDestroy() line 396 > in /home/fbrarda/petsc/src/vec/vec/interface/vector.c > [1]PETSC ERROR: #4 TSReset() line 2806 in > /home/fbrarda/petsc/src/ts/interface/ts.c > [1]PETSC ERROR: #5 TSDestroy() line 2856 in > /home/fbrarda/petsc/src/ts/interface/ts.c > [1]PETSC ERROR: #6 main() line 256 in par_sir_model.c > [1]PETSC ERROR: No PETSc Option Table entries > [1]PETSC ERROR: ----------------End of Error Message -------send entire > error message to petsc-maint at mcs.anl.gov---------- > application called MPI_Abort(MPI_COMM_SELF, 256001) - process 0 > > Il giorno 31 mar 2021, alle ore 12:06, Matthew Knepley > ha scritto: > > On Wed, Mar 31, 2021 at 3:54 AM Francesco Brarda < > brardafrancesco at gmail.com> wrote: > Hi everyone! > > I am trying to solve a system of 3 ODEs (a basic SIR model) with TS. > Sequentially works pretty well, but I need to switch it into a > parallel version. > I started working with TS not very long time ago, there are few questions > I?d like to share with you and if you have any advices I?d be happy to hear. > First of all, do I need to use a DM object even if the model is only time > dependent? All the examples I found were using that object for the other > variable when solving PDEs. > > You do not need one. We use it in examples because it makes it easy to > create the data. > > When I preallocate the space for the Jacobian matrix, is it better to > decide the local or global space? > > Since you are producing all the Jacobian values, it is whatever is easier > in your code I think. > > THanks, > > Matt > > Best, > Francesco > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Wed Mar 31 09:43:08 2021 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Wed, 31 Mar 2021 17:43:08 +0300 Subject: [petsc-users] Parallel TS for ODE In-Reply-To: References: <278D238E-4519-4943-BC8D-607CF62F0025@gmail.com> <3C70181A-7F7A-40C6-A27C-EE8F01558975@gmail.com> Message-ID: Are you trying to parallelize a 3 equations system? Or you just use your SIR code to experiment with TS? > On Mar 31, 2021, at 5:18 PM, Matthew Knepley wrote: > > On Wed, Mar 31, 2021 at 10:15 AM Francesco Brarda > wrote: > Thank you for your advices. > I wrote what seems to me a very basic code, but I got this error when I run it with more than 1 processor: > Clearly the result 299. is wrong but I do not understand what am doing wrong. With 1 processor it works fine. 
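(For reference, one way to keep the indexing local in a small ODE system like this is to gather the whole state onto every rank and then write only the locally owned entries of F. The sketch below is illustrative only, with made-up parameter values and error checking omitted for brevity; it is not the actual code from this thread:

    #include <petscts.h>

    PetscErrorCode RHSFunction(TS ts, PetscReal t, Vec X, Vec F, void *ctx)
    {
      const PetscReal   beta = 0.3, gam = 0.1;     /* illustrative SIR rates */
      VecScatter        sc;
      Vec               Xall;                      /* sequential copy of S, I, R on every rank */
      const PetscScalar *x;
      PetscScalar       *f;
      PetscInt          rstart, rend, i;

      VecScatterCreateToAll(X, &sc, &Xall);
      VecScatterBegin(sc, X, Xall, INSERT_VALUES, SCATTER_FORWARD);
      VecScatterEnd(sc, X, Xall, INSERT_VALUES, SCATTER_FORWARD);
      VecGetArrayRead(Xall, &x);                   /* x[0], x[1], x[2] valid on every rank */
      VecGetOwnershipRange(F, &rstart, &rend);
      VecGetArray(F, &f);                          /* f covers only the rend-rstart owned entries */
      for (i = rstart; i < rend; i++) {
        if (i == 0) f[i - rstart] = -beta * x[0] * x[1];                /* dS/dt */
        if (i == 1) f[i - rstart] =  beta * x[0] * x[1] - gam * x[1];   /* dI/dt */
        if (i == 2) f[i - rstart] =  gam * x[1];                        /* dR/dt */
      }
      VecRestoreArray(F, &f);
      VecRestoreArrayRead(Xall, &x);
      VecScatterDestroy(&sc);
      VecDestroy(&Xall);
      return 0;
    }

The raw array from VecGetArray() covers only the locally owned block, so it is indexed with i - rstart rather than with the global index i.)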
> > My guess is that you do VecGetArray() and index the array using global indices rather than local indices, because > there memory corruption with a Vec array. > > Thanks, > > Matt > > steps 150, ftime 15. > Vec Object: 2 MPI processes > type: mpi > Process [0] > 16.5613 > 2.91405 > Process [1] > 299. > [0]PETSC ERROR: PetscTrFreeDefault() called from VecDestroy_MPI() line 21 in /home/fbrarda/petsc/src/vec/vec/impls/mpi/pdvec.c > [0]PETSC ERROR: Block [id=0(16)] at address 0x15812a0 is corrupted (probably write past end of array) > [0]PETSC ERROR: Block allocated in VecCreate_MPI_Private() line 514 in /home/fbrarda/petsc/src/vec/vec/impls/mpi/pbvec.c > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Memory corruption: https://www.mcs.anl.gov/petsc/documentation/installation.html#valgrind > [0]PETSC ERROR: Corrupted memory > [0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.14.4, unknown > [0]PETSC ERROR: ./par_sir_model on a arch-debug named srvulx13 by fbrarda Wed Mar 31 16:05:22 2021 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-openblas-dir=/opt/packages/openblas/0.2.13-gcc --download-mpich PETSC_ARCH=arch-debug > [0]PETSC ERROR: #1 PetscTrFreeDefault() line 310 in /home/fbrarda/petsc/src/sys/memory/mtr.c > [0]PETSC ERROR: #2 VecDestroy_MPI() line 21 in /home/fbrarda/petsc/src/vec/vec/impls/mpi/pdvec.c > [0]PETSC ERROR: #3 VecDestroy() line 396 in /home/fbrarda/petsc/src/vec/vec/interface/vector.c > [0]PETSC ERROR: #4 SNESLineSearchReset() line 284 in /home/fbrarda/petsc/src/snes/linesearch/interface/linesearch.c > [0]PETSC ERROR: #5 SNESReset() line 3229 in /home/fbrarda/petsc/src/snes/interface/snes.c > [0]PETSC ERROR: #6 TSReset() line 2800 in /home/fbrarda/petsc/src/ts/interface/ts.c > [0]PETSC ERROR: #7 TSDestroy() line 2856 in /home/fbrarda/petsc/src/ts/interface/ts.c > [0]PETSC ERROR: #8 main() line 256 in par_sir_model.c > [0]PETSC ERROR: No PETSc Option Table entries > [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- > application called MPI_Abort(MPI_COMM_SELF, 256001) - process 0 > [1]PETSC ERROR: PetscTrFreeDefault() called from VecDestroy_MPI() line 21 in /home/fbrarda/petsc/src/vec/vec/impls/mpi/pdvec.c > [1]PETSC ERROR: Block [id=0(16)] at address 0xbd9520 is corrupted (probably write past end of array) > [1]PETSC ERROR: Block allocated in VecCreate_MPI_Private() line 514 in /home/fbrarda/petsc/src/vec/vec/impls/mpi/pbvec.c > [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [1]PETSC ERROR: Memory corruption: https://www.mcs.anl.gov/petsc/documentation/installation.html#valgrind > [1]PETSC ERROR: Corrupted memory > [1]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [1]PETSC ERROR: Petsc Release Version 3.14.4, unknown > [1]PETSC ERROR: ./par_sir_model on a arch-debug named srvulx13 by fbrarda Wed Mar 31 16:05:22 2021 > [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-openblas-dir=/opt/packages/openblas/0.2.13-gcc --download-mpich PETSC_ARCH=arch-debug > [1]PETSC ERROR: #1 PetscTrFreeDefault() line 310 in /home/fbrarda/petsc/src/sys/memory/mtr.c > [1]PETSC ERROR: #2 VecDestroy_MPI() line 21 in /home/fbrarda/petsc/src/vec/vec/impls/mpi/pdvec.c > [1]PETSC ERROR: #3 VecDestroy() line 396 in /home/fbrarda/petsc/src/vec/vec/interface/vector.c > [1]PETSC ERROR: #4 TSReset() line 2806 in /home/fbrarda/petsc/src/ts/interface/ts.c > [1]PETSC ERROR: #5 TSDestroy() line 2856 in /home/fbrarda/petsc/src/ts/interface/ts.c > [1]PETSC ERROR: #6 main() line 256 in par_sir_model.c > [1]PETSC ERROR: No PETSc Option Table entries > [1]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- > application called MPI_Abort(MPI_COMM_SELF, 256001) - process 0 > >> Il giorno 31 mar 2021, alle ore 12:06, Matthew Knepley > ha scritto: >> >> On Wed, Mar 31, 2021 at 3:54 AM Francesco Brarda > wrote: >> Hi everyone! >> >> I am trying to solve a system of 3 ODEs (a basic SIR model) with TS. Sequentially works pretty well, but I need to switch it into a parallel version. >> I started working with TS not very long time ago, there are few questions I?d like to share with you and if you have any advices I?d be happy to hear. >> First of all, do I need to use a DM object even if the model is only time dependent? All the examples I found were using that object for the other variable when solving PDEs. >> >> You do not need one. We use it in examples because it makes it easy to create the data. >> >> When I preallocate the space for the Jacobian matrix, is it better to decide the local or global space? >> >> Since you are producing all the Jacobian values, it is whatever is easier in your code I think. >> >> THanks, >> >> Matt >> >> Best, >> Francesco >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From brardafrancesco at gmail.com Wed Mar 31 09:57:51 2021 From: brardafrancesco at gmail.com (Francesco Brarda) Date: Wed, 31 Mar 2021 16:57:51 +0200 Subject: [petsc-users] Parallel TS for ODE In-Reply-To: References: <278D238E-4519-4943-BC8D-607CF62F0025@gmail.com> <3C70181A-7F7A-40C6-A27C-EE8F01558975@gmail.com> Message-ID: Right now this is only a toy example. I do not expect to see any actual improvement over the number of processors, or should I? > Il giorno 31 mar 2021, alle ore 16:43, Stefano Zampini ha scritto: > > Are you trying to parallelize a 3 equations system? Or you just use your SIR code to experiment with TS? > > >> On Mar 31, 2021, at 5:18 PM, Matthew Knepley > wrote: >> >> On Wed, Mar 31, 2021 at 10:15 AM Francesco Brarda > wrote: >> Thank you for your advices. 
>> I wrote what seems to me a very basic code, but I got this error when I run it with more than 1 processor: >> Clearly the result 299. is wrong but I do not understand what am doing wrong. With 1 processor it works fine. >> >> My guess is that you do VecGetArray() and index the array using global indices rather than local indices, because >> there memory corruption with a Vec array. >> >> Thanks, >> >> Matt >> >> steps 150, ftime 15. >> Vec Object: 2 MPI processes >> type: mpi >> Process [0] >> 16.5613 >> 2.91405 >> Process [1] >> 299. >> [0]PETSC ERROR: PetscTrFreeDefault() called from VecDestroy_MPI() line 21 in /home/fbrarda/petsc/src/vec/vec/impls/mpi/pdvec.c >> [0]PETSC ERROR: Block [id=0(16)] at address 0x15812a0 is corrupted (probably write past end of array) >> [0]PETSC ERROR: Block allocated in VecCreate_MPI_Private() line 514 in /home/fbrarda/petsc/src/vec/vec/impls/mpi/pbvec.c >> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [0]PETSC ERROR: Memory corruption: https://www.mcs.anl.gov/petsc/documentation/installation.html#valgrind >> [0]PETSC ERROR: Corrupted memory >> [0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [0]PETSC ERROR: Petsc Release Version 3.14.4, unknown >> [0]PETSC ERROR: ./par_sir_model on a arch-debug named srvulx13 by fbrarda Wed Mar 31 16:05:22 2021 >> [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-openblas-dir=/opt/packages/openblas/0.2.13-gcc --download-mpich PETSC_ARCH=arch-debug >> [0]PETSC ERROR: #1 PetscTrFreeDefault() line 310 in /home/fbrarda/petsc/src/sys/memory/mtr.c >> [0]PETSC ERROR: #2 VecDestroy_MPI() line 21 in /home/fbrarda/petsc/src/vec/vec/impls/mpi/pdvec.c >> [0]PETSC ERROR: #3 VecDestroy() line 396 in /home/fbrarda/petsc/src/vec/vec/interface/vector.c >> [0]PETSC ERROR: #4 SNESLineSearchReset() line 284 in /home/fbrarda/petsc/src/snes/linesearch/interface/linesearch.c >> [0]PETSC ERROR: #5 SNESReset() line 3229 in /home/fbrarda/petsc/src/snes/interface/snes.c >> [0]PETSC ERROR: #6 TSReset() line 2800 in /home/fbrarda/petsc/src/ts/interface/ts.c >> [0]PETSC ERROR: #7 TSDestroy() line 2856 in /home/fbrarda/petsc/src/ts/interface/ts.c >> [0]PETSC ERROR: #8 main() line 256 in par_sir_model.c >> [0]PETSC ERROR: No PETSc Option Table entries >> [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov ---------- >> application called MPI_Abort(MPI_COMM_SELF, 256001) - process 0 >> [1]PETSC ERROR: PetscTrFreeDefault() called from VecDestroy_MPI() line 21 in /home/fbrarda/petsc/src/vec/vec/impls/mpi/pdvec.c >> [1]PETSC ERROR: Block [id=0(16)] at address 0xbd9520 is corrupted (probably write past end of array) >> [1]PETSC ERROR: Block allocated in VecCreate_MPI_Private() line 514 in /home/fbrarda/petsc/src/vec/vec/impls/mpi/pbvec.c >> [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [1]PETSC ERROR: Memory corruption: https://www.mcs.anl.gov/petsc/documentation/installation.html#valgrind >> [1]PETSC ERROR: Corrupted memory >> [1]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> [1]PETSC ERROR: Petsc Release Version 3.14.4, unknown >> [1]PETSC ERROR: ./par_sir_model on a arch-debug named srvulx13 by fbrarda Wed Mar 31 16:05:22 2021 >> [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-openblas-dir=/opt/packages/openblas/0.2.13-gcc --download-mpich PETSC_ARCH=arch-debug >> [1]PETSC ERROR: #1 PetscTrFreeDefault() line 310 in /home/fbrarda/petsc/src/sys/memory/mtr.c >> [1]PETSC ERROR: #2 VecDestroy_MPI() line 21 in /home/fbrarda/petsc/src/vec/vec/impls/mpi/pdvec.c >> [1]PETSC ERROR: #3 VecDestroy() line 396 in /home/fbrarda/petsc/src/vec/vec/interface/vector.c >> [1]PETSC ERROR: #4 TSReset() line 2806 in /home/fbrarda/petsc/src/ts/interface/ts.c >> [1]PETSC ERROR: #5 TSDestroy() line 2856 in /home/fbrarda/petsc/src/ts/interface/ts.c >> [1]PETSC ERROR: #6 main() line 256 in par_sir_model.c >> [1]PETSC ERROR: No PETSc Option Table entries >> [1]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov ---------- >> application called MPI_Abort(MPI_COMM_SELF, 256001) - process 0 >> >>> Il giorno 31 mar 2021, alle ore 12:06, Matthew Knepley > ha scritto: >>> >>> On Wed, Mar 31, 2021 at 3:54 AM Francesco Brarda > wrote: >>> Hi everyone! >>> >>> I am trying to solve a system of 3 ODEs (a basic SIR model) with TS. Sequentially works pretty well, but I need to switch it into a parallel version. >>> I started working with TS not very long time ago, there are few questions I?d like to share with you and if you have any advices I?d be happy to hear. >>> First of all, do I need to use a DM object even if the model is only time dependent? All the examples I found were using that object for the other variable when solving PDEs. >>> >>> You do not need one. We use it in examples because it makes it easy to create the data. >>> >>> When I preallocate the space for the Jacobian matrix, is it better to decide the local or global space? >>> >>> Since you are producing all the Jacobian values, it is whatever is easier in your code I think. >>> >>> THanks, >>> >>> Matt >>> >>> Best, >>> Francesco >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Mar 31 10:47:21 2021 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 Mar 2021 11:47:21 -0400 Subject: [petsc-users] DMPlex overlap In-Reply-To: <0f529250-4af3-c5c2-9b9f-8e272ada24e9@math.u-bordeaux.fr> References: <0f529250-4af3-c5c2-9b9f-8e272ada24e9@math.u-bordeaux.fr> Message-ID: On Sat, Mar 27, 2021 at 9:27 AM Nicolas Barral < nicolas.barral at math.u-bordeaux.fr> wrote: > Hi all, > I promise to read this today. While I am doing that, I have a branch https://gitlab.com/petsc/petsc/-/commits/knepley/feature-plex-vertex-mapping which I think does the face labeling you want, but I have not tested it yet. Thanks, Matt > First, I'm not sure I understand what the overlap parameter in > DMPlexDistributeOverlap does. 
I tried the following: generate a small > mesh on 1 rank with DMPlexCreateBoxMesh, then distribute it with > DMPlexDistribute. At this point I have two nice partitions, with shared > vertices and no overlapping cells. Then I call DMPlexDistributeOverlap > with the overlap parameter set to 0 or 1, and get the same resulting > plex in both cases. Why is that ? > > Second, I'm wondering what would be a good way to handle two overlaps > and associated local vectors. In my adaptation code, the remeshing > library requires a non-overlapping mesh, while the refinement criterion > computation is based on hessian computations, which require a layer of > overlap. What I can do is clone the dm before distributing the overlap, > then manage two independent plex objects with their own local sections > etc. and copy/trim local vectors manually. Is there a more automatic way > to do this ? > > Thanks > > -- > Nicolas > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Mar 31 10:51:47 2021 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 Mar 2021 11:51:47 -0400 Subject: [petsc-users] DMPlex overlap In-Reply-To: <0f529250-4af3-c5c2-9b9f-8e272ada24e9@math.u-bordeaux.fr> References: <0f529250-4af3-c5c2-9b9f-8e272ada24e9@math.u-bordeaux.fr> Message-ID: On Sat, Mar 27, 2021 at 9:27 AM Nicolas Barral < nicolas.barral at math.u-bordeaux.fr> wrote: > Hi all, > > First, I'm not sure I understand what the overlap parameter in > DMPlexDistributeOverlap does. I tried the following: generate a small > mesh on 1 rank with DMPlexCreateBoxMesh, then distribute it with > DMPlexDistribute. At this point I have two nice partitions, with shared > vertices and no overlapping cells. Then I call DMPlexDistributeOverlap > with the overlap parameter set to 0 or 1, and get the same resulting > plex in both cases. Why is that ? > The overlap parameter says how many cell adjacencies to go out. You should not get the same mesh out. We have lots of examples that use this. If you send your small example, I can probably tell you what is happening. > Second, I'm wondering what would be a good way to handle two overlaps > and associated local vectors. In my adaptation code, the remeshing > library requires a non-overlapping mesh, while the refinement criterion > computation is based on hessian computations, which require a layer of > overlap. What I can do is clone the dm before distributing the overlap, > then manage two independent plex objects with their own local sections > etc. and copy/trim local vectors manually. Is there a more automatic way > to do this ? > DMClone() is a shallow copy, so that will not work. You would maintain two different Plexes, overlapping and non-overlapping, with their own sections and vecs. Are you sure you need to keep around the non-overlapping one? Maybe if I understood what operations you want to work, I could say something more definitive. Thanks, Matt > Thanks > > -- > Nicolas > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From nicolas.barral at math.u-bordeaux.fr Wed Mar 31 10:59:30 2021 From: nicolas.barral at math.u-bordeaux.fr (Nicolas Barral) Date: Wed, 31 Mar 2021 17:59:30 +0200 Subject: [petsc-users] DMPlex overlap In-Reply-To: References: <0f529250-4af3-c5c2-9b9f-8e272ada24e9@math.u-bordeaux.fr> Message-ID: On 31/03/2021 17:47, Matthew Knepley wrote: > On Sat, Mar 27, 2021 at 9:27 AM Nicolas Barral > > wrote: > > Hi all, > > > I promise?to read this today. While I am doing that, I have a branch > > https://gitlab.com/petsc/petsc/-/commits/knepley/feature-plex-vertex-mapping > > which I think does the face labeling you want, but I have not tested it yet. > Thanks Matt, I'll have a look later today :) I also have working code for what I need - maybe it? time to share my branch even if it's still in progress. Thanks -- Nicolas > ? Thanks, > > ? ? Matt > > First, I'm not sure I understand what the overlap parameter in > DMPlexDistributeOverlap does. I tried the following: generate a small > mesh on 1 rank with DMPlexCreateBoxMesh, then distribute it with > DMPlexDistribute. At this point I have two nice partitions, with shared > vertices and no overlapping cells. Then I call DMPlexDistributeOverlap > with the overlap parameter set to 0 or 1, and get the same resulting > plex in both cases. Why is that ? > > Second, I'm wondering what would be a good way to handle two overlaps > and associated local vectors. In my adaptation code, the remeshing > library requires a non-overlapping mesh, while the refinement criterion > computation is based on hessian computations, which require a layer of > overlap. What I can do is clone the dm before distributing the overlap, > then manage two independent plex objects with their own local sections > etc. and copy/trim local vectors manually. Is there a more automatic > way > to do this ? > > Thanks > > -- > Nicolas > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From nicolas.barral at math.u-bordeaux.fr Wed Mar 31 11:22:52 2021 From: nicolas.barral at math.u-bordeaux.fr (Nicolas Barral) Date: Wed, 31 Mar 2021 18:22:52 +0200 Subject: [petsc-users] DMPlex overlap In-Reply-To: References: <0f529250-4af3-c5c2-9b9f-8e272ada24e9@math.u-bordeaux.fr> Message-ID: <1c9c519d-4d94-66e1-844c-329fb58d3e94@math.u-bordeaux.fr> @+ -- Nicolas On 31/03/2021 17:51, Matthew Knepley wrote: > On Sat, Mar 27, 2021 at 9:27 AM Nicolas Barral > > wrote: > > Hi all, > > First, I'm not sure I understand what the overlap parameter in > DMPlexDistributeOverlap does. I tried the following: generate a small > mesh on 1 rank with DMPlexCreateBoxMesh, then distribute it with > DMPlexDistribute. At this point I have two nice partitions, with shared > vertices and no overlapping cells. Then I call DMPlexDistributeOverlap > with the overlap parameter set to 0 or 1, and get the same resulting > plex in both cases. Why is that ? > > > The overlap parameter says how many cell adjacencies to go out. You > should not get the same > mesh out. We have lots of examples that use this. If you send your small > example, I can probably > tell you what is happening. > Ok so I do have a small example on that and the DMClone thing I set up to understand! I attach it to the email. For the overlap, you can change the overlap constant at the top of the file. 
With OVERLAP=0 or 1, the distributed overlapping mesh (shown using -over_dm_view, it's DMover) is the same, and differs from the mesh before distributing the overlap (shown using -distrib_dm_view). For larger overlap values they're different. The process is: 1/ create a DM dm on 1 rank 2/ clone dm into dm2 3/ distribute dm 4/ clone dm into dm3 5/ distribute dm overlap I print all the DMs after each step. dm has a distributed overlap, dm2 is not distributed, dm3 is distributed but without overlap. Since distribute and distributeOverlap create new DMs, I don't seem to have a problem with the shallow copies. > > Second, I'm wondering what would be a good way to handle two overlaps > and associated local vectors. In my adaptation code, the remeshing > library requires a non-overlapping mesh, while the refinement criterion > computation is based on hessian computations, which require a layer of > overlap. What I can do is clone the dm before distributing the overlap, > then manage two independent plex objects with their own local sections > etc. and copy/trim local vectors manually. Is there a more automatic > way > to do this ? > > > DMClone() is a shallow copy, so that will not work. You would maintain > two different Plexes, overlapping > and non-overlapping, with their own sections and vecs. Are you sure you > need to keep around the non-overlapping one? > Maybe if I understood what operations you want to work, I could say > something more definitive. > I need to be able to pass the non-overlapping mesh to the remesher. I can either maintain 2 plexes, or trim the overlapping plex when I create the arrays I give to the remesher. I'm not sure which is the best/worst? Thanks -- Nicolas > Thanks, > > Matt > > Thanks > > -- > Nicolas > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- A non-text attachment was scrubbed... Name: test_overlap.c Type: text/x-csrc Size: 2429 bytes Desc: not available URL: From knepley at gmail.com Wed Mar 31 11:57:02 2021 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 Mar 2021 12:57:02 -0400 Subject: [petsc-users] DMPlex overlap In-Reply-To: <1c9c519d-4d94-66e1-844c-329fb58d3e94@math.u-bordeaux.fr> References: <0f529250-4af3-c5c2-9b9f-8e272ada24e9@math.u-bordeaux.fr> <1c9c519d-4d94-66e1-844c-329fb58d3e94@math.u-bordeaux.fr> Message-ID: Okay, let me show a really simple example that gives the expected result before I figure out what is going wrong for you. This code static char help[] = "Tests plex distribution and overlaps.\n"; #include int main (int argc, char * argv[]) { DM dm; MPI_Comm comm; PetscErrorCode ierr; ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) return ierr; comm = PETSC_COMM_WORLD; ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, NULL, NULL, NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); ierr = DMSetFromOptions(dm);CHKERRQ(ierr); ierr = PetscObjectSetName((PetscObject) dm, "Initial DM");CHKERRQ(ierr); ierr = DMViewFromOptions(dm, NULL, "-initial_dm_view");CHKERRQ(ierr); ierr = DMDestroy(&dm);CHKERRQ(ierr); ierr = PetscFinalize(); return ierr; } can do all the overlap tests.
For example, you can run it naively and get a serial mesh master *:~/Downloads/tmp/Nicolas$ /PETSc3/petsc/apple/bin/mpiexec -n 2 ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 DM Object: Initial DM 2 MPI processes type: plex Initial DM in 2 dimensions: 0-cells: 36 0 1-cells: 85 0 2-cells: 50 0 Labels: celltype: 3 strata with value/size (0 (36), 3 (50), 1 (85)) depth: 3 strata with value/size (0 (36), 1 (85), 2 (50)) marker: 1 strata with value/size (1 (40)) Face Sets: 1 strata with value/size (1 (20)) Then run it telling Plex to distribute after creating the mesh master *:~/Downloads/tmp/Nicolas$ /PETSc3/petsc/apple/bin/mpiexec -n 2 ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 -dm_distribute DM Object: Initial DM 2 MPI processes type: plex Initial DM in 2 dimensions: 0-cells: 21 21 1-cells: 45 45 2-cells: 25 25 Labels: depth: 3 strata with value/size (0 (21), 1 (45), 2 (25)) celltype: 3 strata with value/size (0 (21), 1 (45), 3 (25)) marker: 1 strata with value/size (1 (21)) Face Sets: 1 strata with value/size (1 (10)) The get the same thing back with overlap = 0 master *:~/Downloads/tmp/Nicolas$ /PETSc3/petsc/apple/bin/mpiexec -n 2 ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 -dm_distribute -dm_distribute_overlap 0 DM Object: Initial DM 2 MPI processes type: plex Initial DM in 2 dimensions: 0-cells: 21 21 1-cells: 45 45 2-cells: 25 25 Labels: depth: 3 strata with value/size (0 (21), 1 (45), 2 (25)) celltype: 3 strata with value/size (0 (21), 1 (45), 3 (25)) marker: 1 strata with value/size (1 (21)) Face Sets: 1 strata with value/size (1 (10)) and get larger local meshes with overlap = 1 master *:~/Downloads/tmp/Nicolas$ /PETSc3/petsc/apple/bin/mpiexec -n 2 ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 -dm_distribute -dm_distribute_overlap 1 DM Object: Initial DM 2 MPI processes type: plex Initial DM in 2 dimensions: 0-cells: 29 29 1-cells: 65 65 2-cells: 37 37 Labels: depth: 3 strata with value/size (0 (29), 1 (65), 2 (37)) celltype: 3 strata with value/size (0 (29), 1 (65), 3 (37)) marker: 1 strata with value/size (1 (27)) Face Sets: 1 strata with value/size (1 (13)) Thanks, Matt On Wed, Mar 31, 2021 at 12:22 PM Nicolas Barral < nicolas.barral at math.u-bordeaux.fr> wrote: > > > @+ > > -- > Nicolas > > On 31/03/2021 17:51, Matthew Knepley wrote: > > On Sat, Mar 27, 2021 at 9:27 AM Nicolas Barral > > > > wrote: > > > > Hi all, > > > > First, I'm not sure I understand what the overlap parameter in > > DMPlexDistributeOverlap does. I tried the following: generate a small > > mesh on 1 rank with DMPlexCreateBoxMesh, then distribute it with > > DMPlexDistribute. At this point I have two nice partitions, with > shared > > vertices and no overlapping cells. Then I call > DMPlexDistributeOverlap > > with the overlap parameter set to 0 or 1, and get the same resulting > > plex in both cases. Why is that ? > > > > > > The overlap parameter says how many cell adjacencies to go out. You > > should not get the same > > mesh out. We have lots of examples that use this. If you send your small > > example, I can probably > > tell you what is happening. > > > > Ok so I do have a small example on that and the DMClone thing I set up > to understand! I attach it to the email. > > For the overlap, you can change the overlap constant at the top of the > file. With OVERLAP=0 or 1, the distributed overlapping mesh (shown using > -over_dm_view, it's DMover) are the same, and different from the mesh > before distributing the overlap (shown using -distrib_dm_view). 
For > larger overlap values they're different. > > The process is: > 1/ create a DM dm on 1 rank > 2/ clone dm into dm2 > 3/ distribute dm > 4/ clone dm into dm3 > 5/ distribute dm overlap > > I print all the DMs after each step. dm has a distributed overlap, dm2 > is not distributed, dm3 is distributed but without overlap. Since > distribute and distributeOverlap create new DMs, I don't seem have a > problem with the shallow copies. > > > > Second, I'm wondering what would be a good way to handle two overlaps > > and associated local vectors. In my adaptation code, the remeshing > > library requires a non-overlapping mesh, while the refinement > criterion > > computation is based on hessian computations, which require a layer > of > > overlap. What I can do is clone the dm before distributing the > overlap, > > then manage two independent plex objects with their own local > sections > > etc. and copy/trim local vectors manually. Is there a more automatic > > way > > to do this ? > > > > > > DMClone() is a shallow copy, so that will not work. You would maintain > > two different Plexes, overlapping > > and non-overlapping, with their own sections and vecs. Are you sure you > > need to keep around the non-overlapping one? > > Maybe if I understood what operations you want to work, I could say > > something more definitive. > > > I need to be able to pass the non-overlapping mesh to the remesher. I > can either maintain 2 plexes, or trim the overlapping plex when I create > the arrays I give to the remesher. I'm not sure which is the best/worst ? > > Thanks > > -- > Nicolas > > > > Thanks, > > > > Matt > > > > Thanks > > > > -- > > Nicolas > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ < > http://www.cse.buffalo.edu/~knepley/> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Mar 31 12:02:31 2021 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 Mar 2021 13:02:31 -0400 Subject: [petsc-users] DMPlex overlap In-Reply-To: References: <0f529250-4af3-c5c2-9b9f-8e272ada24e9@math.u-bordeaux.fr> <1c9c519d-4d94-66e1-844c-329fb58d3e94@math.u-bordeaux.fr> Message-ID: Alright, I think the problem had to do with keeping track of what DM you were looking at. 
This code increases the overlap of an initial DM: static char help[] = "Tests plex distribution and overlaps.\n"; #include int main (int argc, char * argv[]) { DM dm, dm2; PetscInt overlap; MPI_Comm comm; PetscErrorCode ierr; ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) return ierr; comm = PETSC_COMM_WORLD; ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, NULL, NULL, NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); ierr = DMSetFromOptions(dm);CHKERRQ(ierr); ierr = PetscObjectSetName((PetscObject) dm, "Initial DM");CHKERRQ(ierr); ierr = DMViewFromOptions(dm, NULL, "-initial_dm_view");CHKERRQ(ierr); ierr = DMPlexGetOverlap(dm, &overlap);CHKERRQ(ierr); ierr = DMPlexDistributeOverlap(dm, overlap+1, NULL, &dm2);CHKERRQ(ierr); ierr = PetscObjectSetName((PetscObject) dm2, "More Overlap DM");CHKERRQ(ierr); ierr = DMViewFromOptions(dm2, NULL, "-over_dm_view");CHKERRQ(ierr); ierr = DMDestroy(&dm2);CHKERRQ(ierr); ierr = DMDestroy(&dm);CHKERRQ(ierr); ierr = PetscFinalize(); return ierr; } and when we run it we get the expected result master *:~/Downloads/tmp/Nicolas$ /PETSc3/petsc/apple/bin/mpiexec -n 2 ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 -dm_distribute -dm_distribute_overlap 1 -over_dm_view DM Object: Initial DM 2 MPI processes type: plex Initial DM in 2 dimensions: 0-cells: 29 29 1-cells: 65 65 2-cells: 37 37 Labels: depth: 3 strata with value/size (0 (29), 1 (65), 2 (37)) celltype: 3 strata with value/size (0 (29), 1 (65), 3 (37)) marker: 1 strata with value/size (1 (27)) Face Sets: 1 strata with value/size (1 (13)) DM Object: More Overlap DM 2 MPI processes type: plex More Overlap DM in 2 dimensions: 0-cells: 36 36 1-cells: 85 85 2-cells: 50 50 Labels: depth: 3 strata with value/size (0 (36), 1 (85), 2 (50)) celltype: 3 strata with value/size (0 (36), 1 (85), 3 (50)) marker: 1 strata with value/size (1 (40)) Face Sets: 1 strata with value/size (1 (20)) Thanks, Matt On Wed, Mar 31, 2021 at 12:57 PM Matthew Knepley wrote: > Okay, let me show a really simple example that gives the expected result > before I figure out what is going wrong for you. This code > > static char help[] = "Tests plex distribution and overlaps.\n"; > > #include > > int main (int argc, char * argv[]) { > DM dm; > MPI_Comm comm; > PetscErrorCode ierr; > > ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) return ierr; > comm = PETSC_COMM_WORLD; > > ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, NULL, NULL, NULL, > PETSC_TRUE, &dm);CHKERRQ(ierr); > ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > ierr = PetscObjectSetName((PetscObject) dm, "Initial DM");CHKERRQ(ierr); > ierr = DMViewFromOptions(dm, NULL, "-initial_dm_view");CHKERRQ(ierr); > ierr = DMDestroy(&dm);CHKERRQ(ierr); > ierr = PetscFinalize(); > return ierr; > } > > can do all the overlap tests. 
For example, you can run it naively and get > a serial mesh > > master *:~/Downloads/tmp/Nicolas$ /PETSc3/petsc/apple/bin/mpiexec -n 2 > ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 > DM Object: Initial DM 2 MPI processes > type: plex > Initial DM in 2 dimensions: > 0-cells: 36 0 > 1-cells: 85 0 > 2-cells: 50 0 > Labels: > celltype: 3 strata with value/size (0 (36), 3 (50), 1 (85)) > depth: 3 strata with value/size (0 (36), 1 (85), 2 (50)) > marker: 1 strata with value/size (1 (40)) > Face Sets: 1 strata with value/size (1 (20)) > > Then run it telling Plex to distribute after creating the mesh > > master *:~/Downloads/tmp/Nicolas$ /PETSc3/petsc/apple/bin/mpiexec -n 2 > ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 -dm_distribute > DM Object: Initial DM 2 MPI processes > type: plex > Initial DM in 2 dimensions: > 0-cells: 21 21 > 1-cells: 45 45 > 2-cells: 25 25 > Labels: > depth: 3 strata with value/size (0 (21), 1 (45), 2 (25)) > celltype: 3 strata with value/size (0 (21), 1 (45), 3 (25)) > marker: 1 strata with value/size (1 (21)) > Face Sets: 1 strata with value/size (1 (10)) > > The get the same thing back with overlap = 0 > > master *:~/Downloads/tmp/Nicolas$ /PETSc3/petsc/apple/bin/mpiexec -n 2 > ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 -dm_distribute > -dm_distribute_overlap 0 > DM Object: Initial DM 2 MPI processes > type: plex > Initial DM in 2 dimensions: > 0-cells: 21 21 > 1-cells: 45 45 > 2-cells: 25 25 > Labels: > depth: 3 strata with value/size (0 (21), 1 (45), 2 (25)) > celltype: 3 strata with value/size (0 (21), 1 (45), 3 (25)) > marker: 1 strata with value/size (1 (21)) > Face Sets: 1 strata with value/size (1 (10)) > > and get larger local meshes with overlap = 1 > > master *:~/Downloads/tmp/Nicolas$ /PETSc3/petsc/apple/bin/mpiexec -n 2 > ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 -dm_distribute > -dm_distribute_overlap 1 > DM Object: Initial DM 2 MPI processes > type: plex > Initial DM in 2 dimensions: > 0-cells: 29 29 > 1-cells: 65 65 > 2-cells: 37 37 > Labels: > depth: 3 strata with value/size (0 (29), 1 (65), 2 (37)) > celltype: 3 strata with value/size (0 (29), 1 (65), 3 (37)) > marker: 1 strata with value/size (1 (27)) > Face Sets: 1 strata with value/size (1 (13)) > > Thanks, > > Matt > > On Wed, Mar 31, 2021 at 12:22 PM Nicolas Barral < > nicolas.barral at math.u-bordeaux.fr> wrote: > >> >> >> @+ >> >> -- >> Nicolas >> >> On 31/03/2021 17:51, Matthew Knepley wrote: >> > On Sat, Mar 27, 2021 at 9:27 AM Nicolas Barral >> > > > > wrote: >> > >> > Hi all, >> > >> > First, I'm not sure I understand what the overlap parameter in >> > DMPlexDistributeOverlap does. I tried the following: generate a >> small >> > mesh on 1 rank with DMPlexCreateBoxMesh, then distribute it with >> > DMPlexDistribute. At this point I have two nice partitions, with >> shared >> > vertices and no overlapping cells. Then I call >> DMPlexDistributeOverlap >> > with the overlap parameter set to 0 or 1, and get the same resulting >> > plex in both cases. Why is that ? >> > >> > >> > The overlap parameter says how many cell adjacencies to go out. You >> > should not get the same >> > mesh out. We have lots of examples that use this. If you send your >> small >> > example, I can probably >> > tell you what is happening. >> > >> >> Ok so I do have a small example on that and the DMClone thing I set up >> to understand! I attach it to the email. >> >> For the overlap, you can change the overlap constant at the top of the >> file. 
With OVERLAP=0 or 1, the distributed overlapping mesh (shown using >> -over_dm_view, it's DMover) are the same, and different from the mesh >> before distributing the overlap (shown using -distrib_dm_view). For >> larger overlap values they're different. >> >> The process is: >> 1/ create a DM dm on 1 rank >> 2/ clone dm into dm2 >> 3/ distribute dm >> 4/ clone dm into dm3 >> 5/ distribute dm overlap >> >> I print all the DMs after each step. dm has a distributed overlap, dm2 >> is not distributed, dm3 is distributed but without overlap. Since >> distribute and distributeOverlap create new DMs, I don't seem have a >> problem with the shallow copies. >> >> >> > Second, I'm wondering what would be a good way to handle two >> overlaps >> > and associated local vectors. In my adaptation code, the remeshing >> > library requires a non-overlapping mesh, while the refinement >> criterion >> > computation is based on hessian computations, which require a layer >> of >> > overlap. What I can do is clone the dm before distributing the >> overlap, >> > then manage two independent plex objects with their own local >> sections >> > etc. and copy/trim local vectors manually. Is there a more automatic >> > way >> > to do this ? >> > >> > >> > DMClone() is a shallow copy, so that will not work. You would maintain >> > two different Plexes, overlapping >> > and non-overlapping, with their own sections and vecs. Are you sure you >> > need to keep around the non-overlapping one? >> > Maybe if I understood what operations you want to work, I could say >> > something more definitive. >> > >> I need to be able to pass the non-overlapping mesh to the remesher. I >> can either maintain 2 plexes, or trim the overlapping plex when I create >> the arrays I give to the remesher. I'm not sure which is the best/worst ? >> >> Thanks >> >> -- >> Nicolas >> >> >> > Thanks, >> > >> > Matt >> > >> > Thanks >> > >> > -- >> > Nicolas >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> > experiments is infinitely more interesting than any results to which >> > their experiments lead. >> > -- Norbert Wiener >> > >> > https://www.cse.buffalo.edu/~knepley/ < >> http://www.cse.buffalo.edu/~knepley/> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.barral at math.u-bordeaux.fr Wed Mar 31 12:41:52 2021 From: nicolas.barral at math.u-bordeaux.fr (Nicolas Barral) Date: Wed, 31 Mar 2021 19:41:52 +0200 Subject: [petsc-users] DMPlex overlap In-Reply-To: References: <0f529250-4af3-c5c2-9b9f-8e272ada24e9@math.u-bordeaux.fr> <1c9c519d-4d94-66e1-844c-329fb58d3e94@math.u-bordeaux.fr> Message-ID: Thanks Matt, but sorry I still don't get it. 
Why does: static char help[] = "Tests plex distribution and overlaps.\n"; #include int main (int argc, char * argv[]) { DM dm; MPI_Comm comm; PetscErrorCode ierr; ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) return ierr; comm = PETSC_COMM_WORLD; ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, NULL, NULL, NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); ierr = DMSetFromOptions(dm);CHKERRQ(ierr); ierr = PetscObjectSetName((PetscObject) dm, "Initial DM");CHKERRQ(ierr); ierr = DMViewFromOptions(dm, NULL, "-initial_dm_view");CHKERRQ(ierr); ierr = DMDestroy(&dm);CHKERRQ(ierr); ierr = PetscFinalize(); return ierr; } called with mpiexec -n 2 ./test_overlapV2 -initial_dm_view -dm_plex_box_faces 5,5 -dm_distribute -dm_distribute_overlap 0 give DM Object: Initial DM 2 MPI processes type: plex Initial DM in 2 dimensions: 0-cells: 21 21 1-cells: 45 45 2-cells: 25 25 Labels: depth: 3 strata with value/size (0 (21), 1 (45), 2 (25)) celltype: 3 strata with value/size (0 (21), 1 (45), 3 (25)) marker: 1 strata with value/size (1 (21)) Face Sets: 1 strata with value/size (1 (10)) which is what I expect, while static char help[] = "Tests plex distribution and overlaps.\n"; #include int main (int argc, char * argv[]) { DM dm, odm; MPI_Comm comm; PetscErrorCode ierr; ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) return ierr; comm = PETSC_COMM_WORLD; ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, NULL, NULL, NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); ierr = DMSetFromOptions(dm);CHKERRQ(ierr); odm = dm; DMPlexDistributeOverlap(odm, 0, NULL, &dm); if (!dm) {printf("Big problem\n"); dm = odm;} else {DMDestroy(&odm);} ierr = PetscObjectSetName((PetscObject) dm, "Initial DM");CHKERRQ(ierr); ierr = DMViewFromOptions(dm, NULL, "-initial_dm_view");CHKERRQ(ierr); ierr = DMDestroy(&dm);CHKERRQ(ierr); ierr = PetscFinalize(); return ierr; } called with mpiexec -n 2 ./test_overlapV3 -initial_dm_view -dm_plex_box_faces 5,5 -dm_distribute gives: DM Object: Initial DM 2 MPI processes type: plex Initial DM in 2 dimensions: 0-cells: 29 29 1-cells: 65 65 2-cells: 37 37 Labels: depth: 3 strata with value/size (0 (29), 1 (65), 2 (37)) celltype: 3 strata with value/size (0 (29), 1 (65), 3 (37)) marker: 1 strata with value/size (1 (27)) Face Sets: 1 strata with value/size (1 (13)) which is not what I expect ? Thanks, -- Nicolas On 31/03/2021 19:02, Matthew Knepley wrote: > Alright, I think the problem had to do with keeping track of what DM you > were looking at. This code increases the overlap of an initial DM: > > static char help[] = "Tests plex distribution and overlaps.\n"; > > #include > > int main (int argc, char * argv[]) { > > ? DM ? ? ? ? ? ? dm, dm2; > ? PetscInt ? ? ? overlap; > ? MPI_Comm ? ? ? comm; > ? PetscErrorCode ierr; > > ? ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) return ierr; > ? comm = PETSC_COMM_WORLD; > > ? ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, NULL, NULL, > NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); > ? ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > ? ierr = PetscObjectSetName((PetscObject) dm, "Initial DM");CHKERRQ(ierr); > ? ierr = DMViewFromOptions(dm, NULL, "-initial_dm_view");CHKERRQ(ierr); > > ? ierr = DMPlexGetOverlap(dm, &overlap);CHKERRQ(ierr); > ? ierr = DMPlexDistributeOverlap(dm, overlap+1, NULL, &dm2);CHKERRQ(ierr); > ? ierr = PetscObjectSetName((PetscObject) dm2, "More Overlap > DM");CHKERRQ(ierr); > ? ierr = DMViewFromOptions(dm2, NULL, "-over_dm_view");CHKERRQ(ierr); > > ? ierr = DMDestroy(&dm2);CHKERRQ(ierr); > ? 
ierr = DMDestroy(&dm);CHKERRQ(ierr); > ? ierr = PetscFinalize(); > ? return ierr; > } > > and when we run it we get the expected result > > master *:~/Downloads/tmp/Nicolas$ /PETSc3/petsc/apple/bin/mpiexec -n 2 > ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 -dm_distribute > -dm_distribute_overlap 1 -over_dm_view > DM Object: Initial DM 2 MPI processes > ? type: plex > Initial DM in 2 dimensions: > ? 0-cells: 29 29 > ? 1-cells: 65 65 > ? 2-cells: 37 37 > Labels: > ? depth: 3 strata with value/size (0 (29), 1 (65), 2 (37)) > ? celltype: 3 strata with value/size (0 (29), 1 (65), 3 (37)) > ? marker: 1 strata with value/size (1 (27)) > ? Face Sets: 1 strata with value/size (1 (13)) > DM Object: More Overlap DM 2 MPI processes > ? type: plex > More Overlap DM in 2 dimensions: > ? 0-cells: 36 36 > ? 1-cells: 85 85 > ? 2-cells: 50 50 > Labels: > ? depth: 3 strata with value/size (0 (36), 1 (85), 2 (50)) > ? celltype: 3 strata with value/size (0 (36), 1 (85), 3 (50)) > ? marker: 1 strata with value/size (1 (40)) > ? Face Sets: 1 strata with value/size (1 (20)) > > ? Thanks, > > ? ? ?Matt > > On Wed, Mar 31, 2021 at 12:57 PM Matthew Knepley > wrote: > > Okay, let me show a really simple example that gives the expected > result before I figure out what is going wrong for you. This code > > static char help[] = "Tests plex distribution and overlaps.\n"; > > #include > > int main (int argc, char * argv[]) { > ? DM? ? ? ? ? ? ? ? ? ? dm; > ? MPI_Comm? ? ? ?comm; > ? PetscErrorCode ierr; > > ? ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) return > ierr; > ? comm = PETSC_COMM_WORLD; > > ? ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, NULL, NULL, > NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); > ? ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > ? ierr = PetscObjectSetName((PetscObject) dm, "Initial > DM");CHKERRQ(ierr); > ? ierr = DMViewFromOptions(dm, NULL, "-initial_dm_view");CHKERRQ(ierr); > ? ierr = DMDestroy(&dm);CHKERRQ(ierr); > ? ierr = PetscFinalize(); > ? return ierr; > } > > can do all the overlap tests. For example, you can run it naively > and get a serial mesh > > master *:~/Downloads/tmp/Nicolas$ /PETSc3/petsc/apple/bin/mpiexec -n > 2 ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 > DM Object: Initial DM 2 MPI processes > ? type: plex > Initial DM in 2 dimensions: > ? 0-cells: 36 0 > ? 1-cells: 85 0 > ? 2-cells: 50 0 > Labels: > ? celltype: 3 strata with value/size (0 (36), 3 (50), 1 (85)) > ? depth: 3 strata with value/size (0 (36), 1 (85), 2 (50)) > ? marker: 1 strata with value/size (1 (40)) > ? Face Sets: 1 strata with value/size (1 (20)) > > Then run it telling Plex to distribute after creating the mesh > > master *:~/Downloads/tmp/Nicolas$ /PETSc3/petsc/apple/bin/mpiexec -n > 2 ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 -dm_distribute > DM Object: Initial DM 2 MPI processes > ? type: plex > Initial DM in 2 dimensions: > ? 0-cells: 21 21 > ? 1-cells: 45 45 > ? 2-cells: 25 25 > Labels: > ? depth: 3 strata with value/size (0 (21), 1 (45), 2 (25)) > ? celltype: 3 strata with value/size (0 (21), 1 (45), 3 (25)) > ? marker: 1 strata with value/size (1 (21)) > ? Face Sets: 1 strata with value/size (1 (10)) > > The get the same thing back with overlap = 0 > > master *:~/Downloads/tmp/Nicolas$ /PETSc3/petsc/apple/bin/mpiexec -n > 2 ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 > -dm_distribute -dm_distribute_overlap 0 > DM Object: Initial DM 2 MPI processes > ? type: plex > Initial DM in 2 dimensions: > ? 0-cells: 21 21 > ? 1-cells: 45 45 > ? 
2-cells: 25 25 > Labels: > ? depth: 3 strata with value/size (0 (21), 1 (45), 2 (25)) > ? celltype: 3 strata with value/size (0 (21), 1 (45), 3 (25)) > ? marker: 1 strata with value/size (1 (21)) > ? Face Sets: 1 strata with value/size (1 (10)) > > and get larger local meshes with overlap = 1 > > master *:~/Downloads/tmp/Nicolas$ /PETSc3/petsc/apple/bin/mpiexec -n > 2 ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 > -dm_distribute -dm_distribute_overlap 1 > DM Object: Initial DM 2 MPI processes > ? type: plex > Initial DM in 2 dimensions: > ? 0-cells: 29 29 > ? 1-cells: 65 65 > ? 2-cells: 37 37 > Labels: > ? depth: 3 strata with value/size (0 (29), 1 (65), 2 (37)) > ? celltype: 3 strata with value/size (0 (29), 1 (65), 3 (37)) > ? marker: 1 strata with value/size (1 (27)) > ? Face Sets: 1 strata with value/size (1 (13)) > > ? Thanks, > > ? ? ?Matt > > On Wed, Mar 31, 2021 at 12:22 PM Nicolas Barral > > wrote: > > > > @+ > > -- > Nicolas > > On 31/03/2021 17:51, Matthew Knepley wrote: > > On Sat, Mar 27, 2021 at 9:27 AM Nicolas Barral > > > > >> wrote: > > > >? ? ?Hi all, > > > >? ? ?First, I'm not sure I understand what the overlap > parameter in > >? ? ?DMPlexDistributeOverlap does. I tried the following: > generate a small > >? ? ?mesh on 1 rank with DMPlexCreateBoxMesh, then distribute > it with > >? ? ?DMPlexDistribute. At this point I have two nice > partitions, with shared > >? ? ?vertices and no overlapping cells. Then I call > DMPlexDistributeOverlap > >? ? ?with the overlap parameter set to 0 or 1, and get the > same resulting > >? ? ?plex in both cases. Why is that ? > > > > > > The overlap parameter says how many cell adjacencies to go > out. You > > should not get the same > > mesh out. We have lots of examples that use this. If you send > your small > > example, I can probably > > tell you what is happening. > > > > Ok so I do have a small example on that and the DMClone thing I > set up > to understand! I attach it to the email. > > For the overlap, you can change the overlap constant at the top > of the > file. With OVERLAP=0 or 1, the distributed overlapping mesh > (shown using > -over_dm_view, it's DMover) are the same, and different from the > mesh > before distributing the overlap (shown using -distrib_dm_view). For > larger overlap values they're different. > > The process is: > 1/ create a DM dm on 1 rank > 2/ clone dm into dm2 > 3/ distribute dm > 4/ clone dm into dm3 > 5/ distribute dm overlap > > I print all the DMs after each step. dm has a distributed > overlap, dm2 > is not distributed, dm3 is distributed but without overlap. Since > distribute and distributeOverlap create new DMs, I don't seem > have a > problem with the shallow copies. > > > >? ? ?Second, I'm wondering what would be a good way to handle > two overlaps > >? ? ?and associated local vectors. In my adaptation code, the > remeshing > >? ? ?library requires a non-overlapping mesh, while the > refinement criterion > >? ? ?computation is based on hessian computations, which > require a layer of > >? ? ?overlap. What I can do is clone the dm before > distributing the overlap, > >? ? ?then manage two independent plex objects with their own > local sections > >? ? ?etc. and copy/trim local vectors manually. Is there a > more automatic > >? ? ?way > >? ? ?to do this ? > > > > > > DMClone() is a shallow copy, so that will not work. You would > maintain > > two different Plexes, overlapping > > and non-overlapping, with their own sections and vecs. 
Are > you sure you > > need to keep around the non-overlapping one? > > Maybe if I understood what operations you want to work, I > could say > > something more definitive. > > > I need to be able to pass the non-overlapping mesh to the > remesher. I > can either maintain 2 plexes, or trim the overlapping plex when > I create > the arrays I give to the remesher. I'm not sure which is the > best/worst ? > > Thanks > > -- > Nicolas > > > >? ? Thanks, > > > >? ? ? ?Matt > > > >? ? ?Thanks > > > >? ? ?-- > >? ? ?Nicolas > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results > to which > > their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From knepley at gmail.com Wed Mar 31 13:10:05 2021 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 Mar 2021 14:10:05 -0400 Subject: [petsc-users] DMPlex overlap In-Reply-To: References: <0f529250-4af3-c5c2-9b9f-8e272ada24e9@math.u-bordeaux.fr> <1c9c519d-4d94-66e1-844c-329fb58d3e94@math.u-bordeaux.fr> Message-ID: On Wed, Mar 31, 2021 at 1:41 PM Nicolas Barral < nicolas.barral at math.u-bordeaux.fr> wrote: > Thanks Matt, but sorry I still don't get it. Why does: > > static char help[] = "Tests plex distribution and overlaps.\n"; > > #include > > int main (int argc, char * argv[]) { > > DM dm; > MPI_Comm comm; > PetscErrorCode ierr; > > ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) return ierr; > comm = PETSC_COMM_WORLD; > > ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, NULL, NULL, > NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); > ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > ierr = PetscObjectSetName((PetscObject) dm, "Initial DM");CHKERRQ(ierr); > ierr = DMViewFromOptions(dm, NULL, "-initial_dm_view");CHKERRQ(ierr); > > > ierr = DMDestroy(&dm);CHKERRQ(ierr); > ierr = PetscFinalize(); > return ierr; > } > > called with mpiexec -n 2 ./test_overlapV2 -initial_dm_view > -dm_plex_box_faces 5,5 -dm_distribute -dm_distribute_overlap 0 > > give > DM Object: Initial DM 2 MPI processes > type: plex > Initial DM in 2 dimensions: > 0-cells: 21 21 > 1-cells: 45 45 > 2-cells: 25 25 > Labels: > depth: 3 strata with value/size (0 (21), 1 (45), 2 (25)) > celltype: 3 strata with value/size (0 (21), 1 (45), 3 (25)) > marker: 1 strata with value/size (1 (21)) > Face Sets: 1 strata with value/size (1 (10)) > > which is what I expect, while > > static char help[] = "Tests plex distribution and overlaps.\n"; > > #include > > int main (int argc, char * argv[]) { > > DM dm, odm; > MPI_Comm comm; > PetscErrorCode ierr; > > ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) return ierr; > comm = PETSC_COMM_WORLD; > > ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, NULL, NULL, > NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); > ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > odm = dm; > This is just setting pointers, so no copy. > DMPlexDistributeOverlap(odm, 0, NULL, &dm); > Here you have overwritten the DM here. Don't do this. 
Thanks, Matt > if (!dm) {printf("Big problem\n"); dm = odm;} > else {DMDestroy(&odm);} > ierr = PetscObjectSetName((PetscObject) dm, "Initial DM");CHKERRQ(ierr); > ierr = DMViewFromOptions(dm, NULL, "-initial_dm_view");CHKERRQ(ierr); > > > ierr = DMDestroy(&dm);CHKERRQ(ierr); > ierr = PetscFinalize(); > return ierr; > } > > called with mpiexec -n 2 ./test_overlapV3 -initial_dm_view > -dm_plex_box_faces 5,5 -dm_distribute > > gives: > DM Object: Initial DM 2 MPI processes > type: plex > Initial DM in 2 dimensions: > 0-cells: 29 29 > 1-cells: 65 65 > 2-cells: 37 37 > Labels: > depth: 3 strata with value/size (0 (29), 1 (65), 2 (37)) > celltype: 3 strata with value/size (0 (29), 1 (65), 3 (37)) > marker: 1 strata with value/size (1 (27)) > Face Sets: 1 strata with value/size (1 (13)) > > which is not what I expect ? > > Thanks, > > -- > Nicolas > > On 31/03/2021 19:02, Matthew Knepley wrote: > > Alright, I think the problem had to do with keeping track of what DM you > > were looking at. This code increases the overlap of an initial DM: > > > > static char help[] = "Tests plex distribution and overlaps.\n"; > > > > #include > > > > int main (int argc, char * argv[]) { > > > > DM dm, dm2; > > PetscInt overlap; > > MPI_Comm comm; > > PetscErrorCode ierr; > > > > ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) return > ierr; > > comm = PETSC_COMM_WORLD; > > > > ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, NULL, NULL, > > NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); > > ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > > ierr = PetscObjectSetName((PetscObject) dm, "Initial > DM");CHKERRQ(ierr); > > ierr = DMViewFromOptions(dm, NULL, "-initial_dm_view");CHKERRQ(ierr); > > > > ierr = DMPlexGetOverlap(dm, &overlap);CHKERRQ(ierr); > > ierr = DMPlexDistributeOverlap(dm, overlap+1, NULL, > &dm2);CHKERRQ(ierr); > > ierr = PetscObjectSetName((PetscObject) dm2, "More Overlap > > DM");CHKERRQ(ierr); > > ierr = DMViewFromOptions(dm2, NULL, "-over_dm_view");CHKERRQ(ierr); > > > > ierr = DMDestroy(&dm2);CHKERRQ(ierr); > > ierr = DMDestroy(&dm);CHKERRQ(ierr); > > ierr = PetscFinalize(); > > return ierr; > > } > > > > and when we run it we get the expected result > > > > master *:~/Downloads/tmp/Nicolas$ /PETSc3/petsc/apple/bin/mpiexec -n 2 > > ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 -dm_distribute > > -dm_distribute_overlap 1 -over_dm_view > > DM Object: Initial DM 2 MPI processes > > type: plex > > Initial DM in 2 dimensions: > > 0-cells: 29 29 > > 1-cells: 65 65 > > 2-cells: 37 37 > > Labels: > > depth: 3 strata with value/size (0 (29), 1 (65), 2 (37)) > > celltype: 3 strata with value/size (0 (29), 1 (65), 3 (37)) > > marker: 1 strata with value/size (1 (27)) > > Face Sets: 1 strata with value/size (1 (13)) > > DM Object: More Overlap DM 2 MPI processes > > type: plex > > More Overlap DM in 2 dimensions: > > 0-cells: 36 36 > > 1-cells: 85 85 > > 2-cells: 50 50 > > Labels: > > depth: 3 strata with value/size (0 (36), 1 (85), 2 (50)) > > celltype: 3 strata with value/size (0 (36), 1 (85), 3 (50)) > > marker: 1 strata with value/size (1 (40)) > > Face Sets: 1 strata with value/size (1 (20)) > > > > Thanks, > > > > Matt > > > > On Wed, Mar 31, 2021 at 12:57 PM Matthew Knepley > > wrote: > > > > Okay, let me show a really simple example that gives the expected > > result before I figure out what is going wrong for you. 
This code > > > > static char help[] = "Tests plex distribution and overlaps.\n"; > > > > #include > > > > int main (int argc, char * argv[]) { > > DM dm; > > MPI_Comm comm; > > PetscErrorCode ierr; > > > > ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) return > > ierr; > > comm = PETSC_COMM_WORLD; > > > > ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, NULL, NULL, > > NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); > > ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > > ierr = PetscObjectSetName((PetscObject) dm, "Initial > > DM");CHKERRQ(ierr); > > ierr = DMViewFromOptions(dm, NULL, > "-initial_dm_view");CHKERRQ(ierr); > > ierr = DMDestroy(&dm);CHKERRQ(ierr); > > ierr = PetscFinalize(); > > return ierr; > > } > > > > can do all the overlap tests. For example, you can run it naively > > and get a serial mesh > > > > master *:~/Downloads/tmp/Nicolas$ /PETSc3/petsc/apple/bin/mpiexec -n > > 2 ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 > > DM Object: Initial DM 2 MPI processes > > type: plex > > Initial DM in 2 dimensions: > > 0-cells: 36 0 > > 1-cells: 85 0 > > 2-cells: 50 0 > > Labels: > > celltype: 3 strata with value/size (0 (36), 3 (50), 1 (85)) > > depth: 3 strata with value/size (0 (36), 1 (85), 2 (50)) > > marker: 1 strata with value/size (1 (40)) > > Face Sets: 1 strata with value/size (1 (20)) > > > > Then run it telling Plex to distribute after creating the mesh > > > > master *:~/Downloads/tmp/Nicolas$ /PETSc3/petsc/apple/bin/mpiexec -n > > 2 ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 > -dm_distribute > > DM Object: Initial DM 2 MPI processes > > type: plex > > Initial DM in 2 dimensions: > > 0-cells: 21 21 > > 1-cells: 45 45 > > 2-cells: 25 25 > > Labels: > > depth: 3 strata with value/size (0 (21), 1 (45), 2 (25)) > > celltype: 3 strata with value/size (0 (21), 1 (45), 3 (25)) > > marker: 1 strata with value/size (1 (21)) > > Face Sets: 1 strata with value/size (1 (10)) > > > > The get the same thing back with overlap = 0 > > > > master *:~/Downloads/tmp/Nicolas$ /PETSc3/petsc/apple/bin/mpiexec -n > > 2 ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 > > -dm_distribute -dm_distribute_overlap 0 > > DM Object: Initial DM 2 MPI processes > > type: plex > > Initial DM in 2 dimensions: > > 0-cells: 21 21 > > 1-cells: 45 45 > > 2-cells: 25 25 > > Labels: > > depth: 3 strata with value/size (0 (21), 1 (45), 2 (25)) > > celltype: 3 strata with value/size (0 (21), 1 (45), 3 (25)) > > marker: 1 strata with value/size (1 (21)) > > Face Sets: 1 strata with value/size (1 (10)) > > > > and get larger local meshes with overlap = 1 > > > > master *:~/Downloads/tmp/Nicolas$ /PETSc3/petsc/apple/bin/mpiexec -n > > 2 ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 > > -dm_distribute -dm_distribute_overlap 1 > > DM Object: Initial DM 2 MPI processes > > type: plex > > Initial DM in 2 dimensions: > > 0-cells: 29 29 > > 1-cells: 65 65 > > 2-cells: 37 37 > > Labels: > > depth: 3 strata with value/size (0 (29), 1 (65), 2 (37)) > > celltype: 3 strata with value/size (0 (29), 1 (65), 3 (37)) > > marker: 1 strata with value/size (1 (27)) > > Face Sets: 1 strata with value/size (1 (13)) > > > > Thanks, > > > > Matt > > > > On Wed, Mar 31, 2021 at 12:22 PM Nicolas Barral > > > > wrote: > > > > > > > > @+ > > > > -- > > Nicolas > > > > On 31/03/2021 17:51, Matthew Knepley wrote: > > > On Sat, Mar 27, 2021 at 9:27 AM Nicolas Barral > > > > > > > > >> wrote: > > > > > > Hi all, > > > > > > First, I'm not sure I understand what the overlap > > parameter in > > > 
DMPlexDistributeOverlap does. I tried the following: > > generate a small > > > mesh on 1 rank with DMPlexCreateBoxMesh, then distribute > > it with > > > DMPlexDistribute. At this point I have two nice > > partitions, with shared > > > vertices and no overlapping cells. Then I call > > DMPlexDistributeOverlap > > > with the overlap parameter set to 0 or 1, and get the > > same resulting > > > plex in both cases. Why is that ? > > > > > > > > > The overlap parameter says how many cell adjacencies to go > > out. You > > > should not get the same > > > mesh out. We have lots of examples that use this. If you send > > your small > > > example, I can probably > > > tell you what is happening. > > > > > > > Ok so I do have a small example on that and the DMClone thing I > > set up > > to understand! I attach it to the email. > > > > For the overlap, you can change the overlap constant at the top > > of the > > file. With OVERLAP=0 or 1, the distributed overlapping mesh > > (shown using > > -over_dm_view, it's DMover) are the same, and different from the > > mesh > > before distributing the overlap (shown using -distrib_dm_view). > For > > larger overlap values they're different. > > > > The process is: > > 1/ create a DM dm on 1 rank > > 2/ clone dm into dm2 > > 3/ distribute dm > > 4/ clone dm into dm3 > > 5/ distribute dm overlap > > > > I print all the DMs after each step. dm has a distributed > > overlap, dm2 > > is not distributed, dm3 is distributed but without overlap. Since > > distribute and distributeOverlap create new DMs, I don't seem > > have a > > problem with the shallow copies. > > > > > > > Second, I'm wondering what would be a good way to handle > > two overlaps > > > and associated local vectors. In my adaptation code, the > > remeshing > > > library requires a non-overlapping mesh, while the > > refinement criterion > > > computation is based on hessian computations, which > > require a layer of > > > overlap. What I can do is clone the dm before > > distributing the overlap, > > > then manage two independent plex objects with their own > > local sections > > > etc. and copy/trim local vectors manually. Is there a > > more automatic > > > way > > > to do this ? > > > > > > > > > DMClone() is a shallow copy, so that will not work. You would > > maintain > > > two different Plexes, overlapping > > > and non-overlapping, with their own sections and vecs. Are > > you sure you > > > need to keep around the non-overlapping one? > > > Maybe if I understood what operations you want to work, I > > could say > > > something more definitive. > > > > > I need to be able to pass the non-overlapping mesh to the > > remesher. I > > can either maintain 2 plexes, or trim the overlapping plex when > > I create > > the arrays I give to the remesher. I'm not sure which is the > > best/worst ? > > > > Thanks > > > > -- > > Nicolas > > > > > > > Thanks, > > > > > > Matt > > > > > > Thanks > > > > > > -- > > > Nicolas > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin > their > > > experiments is infinitely more interesting than any results > > to which > > > their experiments lead. > > > -- Norbert Wiener > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. 
> > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ < > http://www.cse.buffalo.edu/~knepley/> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.barral at math.u-bordeaux.fr Wed Mar 31 13:59:03 2021 From: nicolas.barral at math.u-bordeaux.fr (Nicolas Barral) Date: Wed, 31 Mar 2021 20:59:03 +0200 Subject: [petsc-users] DMPlex overlap In-Reply-To: References: <0f529250-4af3-c5c2-9b9f-8e272ada24e9@math.u-bordeaux.fr> <1c9c519d-4d94-66e1-844c-329fb58d3e94@math.u-bordeaux.fr> Message-ID: On 31/03/2021 20:10, Matthew Knepley wrote: > On Wed, Mar 31, 2021 at 1:41 PM Nicolas Barral > > wrote: > > Thanks Matt, but sorry I still don't get it. Why does: > > static char help[] = "Tests plex distribution and overlaps.\n"; > > #include > > int main (int argc, char * argv[]) { > > ? ?DM? ? ? ? ? ? ?dm; > ? ?MPI_Comm? ? ? ?comm; > ? ?PetscErrorCode ierr; > > ? ?ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) > return ierr; > ? ?comm = PETSC_COMM_WORLD; > > ? ?ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, NULL, NULL, > NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); > ? ?ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > ? ?ierr = PetscObjectSetName((PetscObject) dm, "Initial > DM");CHKERRQ(ierr); > ? ?ierr = DMViewFromOptions(dm, NULL, > "-initial_dm_view");CHKERRQ(ierr); > > > ? ?ierr = DMDestroy(&dm);CHKERRQ(ierr); > ? ?ierr = PetscFinalize(); > ? ?return ierr; > } > > called with mpiexec -n 2 ./test_overlapV2 -initial_dm_view > -dm_plex_box_faces 5,5 -dm_distribute -dm_distribute_overlap 0 > > give > DM Object: Initial DM 2 MPI processes > ? ?type: plex > Initial DM in 2 dimensions: > ? ?0-cells: 21 21 > ? ?1-cells: 45 45 > ? ?2-cells: 25 25 > Labels: > ? ?depth: 3 strata with value/size (0 (21), 1 (45), 2 (25)) > ? ?celltype: 3 strata with value/size (0 (21), 1 (45), 3 (25)) > ? ?marker: 1 strata with value/size (1 (21)) > ? ?Face Sets: 1 strata with value/size (1 (10)) > > which is what I expect, while > > static char help[] = "Tests plex distribution and overlaps.\n"; > > #include > > int main (int argc, char * argv[]) { > > ? ?DM? ? ? ? ? ? ?dm, odm; > ? ?MPI_Comm? ? ? ?comm; > ? ?PetscErrorCode ierr; > > ? ?ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) > return ierr; > ? ?comm = PETSC_COMM_WORLD; > > ? ?ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, NULL, NULL, > NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); > ? ?ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > ? ?odm = dm; > > > This is just?setting pointers, so no copy. > > ? ?DMPlexDistributeOverlap(odm, 0, NULL, &dm); > > > Here you have overwritten the DM here. Don't do this. I just copied what's in plex ex19! 
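For reference, the pattern I mean is roughly the sketch below, dropped into the same kind of
test program as above (this is from memory with error checks added, not a verbatim copy of
ex19, and the overlap value used there may differ):

   DM odm = dm;                                       /* keep a handle on the original DM              */
   ierr = DMPlexDistributeOverlap(odm, 1, NULL, &dm);CHKERRQ(ierr);
   if (!dm) {dm = odm;}                               /* nothing came back, keep the original          */
   else     {ierr = DMDestroy(&odm);CHKERRQ(ierr);}   /* otherwise replace dm by the overlapping clone */
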
But ok if I change for: static char help[] = "Tests plex distribution and overlaps.\n"; #include int main (int argc, char * argv[]) { DM dm, odm; MPI_Comm comm; PetscErrorCode ierr; ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) return ierr; comm = PETSC_COMM_WORLD; ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, NULL, NULL, NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); ierr = DMSetFromOptions(dm);CHKERRQ(ierr); DMPlexDistributeOverlap(dm, 0, NULL, &odm); ierr = PetscObjectSetName((PetscObject) odm, "Initial DM");CHKERRQ(ierr); ierr = DMViewFromOptions(odm, NULL, "-initial_dm_view");CHKERRQ(ierr); ierr = DMDestroy(&dm);CHKERRQ(ierr); ierr = DMDestroy(&odm);CHKERRQ(ierr); ierr = PetscFinalize(); return ierr; } I still get: DM Object: Initial DM 2 MPI processes type: plex Initial DM in 2 dimensions: 0-cells: 29 29 1-cells: 65 65 2-cells: 37 37 Labels: depth: 3 strata with value/size (0 (29), 1 (65), 2 (37)) celltype: 3 strata with value/size (0 (29), 1 (65), 3 (37)) marker: 1 strata with value/size (1 (27)) Face Sets: 1 strata with value/size (1 (13)) which is the mesh with a 1-overlap. So what's wrong now ? Thanks, -- Nicolas > > ? Thanks, > > ? ? ?Matt > > ? ?if (!dm) {printf("Big problem\n"); dm = odm;} > ? ?else? ? ?{DMDestroy(&odm);} > ? ?ierr = PetscObjectSetName((PetscObject) dm, "Initial > DM");CHKERRQ(ierr); > ? ?ierr = DMViewFromOptions(dm, NULL, > "-initial_dm_view");CHKERRQ(ierr); > > > ? ?ierr = DMDestroy(&dm);CHKERRQ(ierr); > ? ?ierr = PetscFinalize(); > ? ?return ierr; > } > > called with mpiexec -n 2 ./test_overlapV3 -initial_dm_view > -dm_plex_box_faces 5,5 -dm_distribute > > gives: > DM Object: Initial DM 2 MPI processes > ? ?type: plex > Initial DM in 2 dimensions: > ? ?0-cells: 29 29 > ? ?1-cells: 65 65 > ? ?2-cells: 37 37 > Labels: > ? ?depth: 3 strata with value/size (0 (29), 1 (65), 2 (37)) > ? ?celltype: 3 strata with value/size (0 (29), 1 (65), 3 (37)) > ? ?marker: 1 strata with value/size (1 (27)) > ? ?Face Sets: 1 strata with value/size (1 (13)) > > which is not what I expect ? > > Thanks, > > -- > Nicolas > > On 31/03/2021 19:02, Matthew Knepley wrote: > > Alright, I think the problem had to do with keeping track of what > DM you > > were looking at. This code increases the overlap of an initial DM: > > > > static char help[] = "Tests plex distribution and overlaps.\n"; > > > > #include > > > > int main (int argc, char * argv[]) { > > > >? ? DM ? ? ? ? ? ? dm, dm2; > >? ? PetscInt ? ? ? overlap; > >? ? MPI_Comm ? ? ? comm; > >? ? PetscErrorCode ierr; > > > >? ? ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) > return ierr; > >? ? comm = PETSC_COMM_WORLD; > > > >? ? ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, NULL, NULL, > > NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); > >? ? ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > >? ? ierr = PetscObjectSetName((PetscObject) dm, "Initial > DM");CHKERRQ(ierr); > >? ? ierr = DMViewFromOptions(dm, NULL, > "-initial_dm_view");CHKERRQ(ierr); > > > >? ? ierr = DMPlexGetOverlap(dm, &overlap);CHKERRQ(ierr); > >? ? ierr = DMPlexDistributeOverlap(dm, overlap+1, NULL, > &dm2);CHKERRQ(ierr); > >? ? ierr = PetscObjectSetName((PetscObject) dm2, "More Overlap > > DM");CHKERRQ(ierr); > >? ? ierr = DMViewFromOptions(dm2, NULL, > "-over_dm_view");CHKERRQ(ierr); > > > >? ? ierr = DMDestroy(&dm2);CHKERRQ(ierr); > >? ? ierr = DMDestroy(&dm);CHKERRQ(ierr); > >? ? ierr = PetscFinalize(); > >? ? 
return ierr; > > } > > > > and when we run it we get the expected result > > > > master *:~/Downloads/tmp/Nicolas$ /PETSc3/petsc/apple/bin/mpiexec > -n 2 > > ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 > -dm_distribute > > -dm_distribute_overlap 1 -over_dm_view > > DM Object: Initial DM 2 MPI processes > >? ? type: plex > > Initial DM in 2 dimensions: > >? ? 0-cells: 29 29 > >? ? 1-cells: 65 65 > >? ? 2-cells: 37 37 > > Labels: > >? ? depth: 3 strata with value/size (0 (29), 1 (65), 2 (37)) > >? ? celltype: 3 strata with value/size (0 (29), 1 (65), 3 (37)) > >? ? marker: 1 strata with value/size (1 (27)) > >? ? Face Sets: 1 strata with value/size (1 (13)) > > DM Object: More Overlap DM 2 MPI processes > >? ? type: plex > > More Overlap DM in 2 dimensions: > >? ? 0-cells: 36 36 > >? ? 1-cells: 85 85 > >? ? 2-cells: 50 50 > > Labels: > >? ? depth: 3 strata with value/size (0 (36), 1 (85), 2 (50)) > >? ? celltype: 3 strata with value/size (0 (36), 1 (85), 3 (50)) > >? ? marker: 1 strata with value/size (1 (40)) > >? ? Face Sets: 1 strata with value/size (1 (20)) > > > >? ? Thanks, > > > >? ? ? ?Matt > > > > On Wed, Mar 31, 2021 at 12:57 PM Matthew Knepley > > > >> wrote: > > > >? ? ?Okay, let me show a really simple example that gives the expected > >? ? ?result before I figure out what is going wrong for you. This code > > > >? ? ?static char help[] = "Tests plex distribution and overlaps.\n"; > > > >? ? ?#include > > > >? ? ?int main (int argc, char * argv[]) { > >? ? ? ? DM? ? ? ? ? ? ? ? ? ? dm; > >? ? ? ? MPI_Comm? ? ? ?comm; > >? ? ? ? PetscErrorCode ierr; > > > >? ? ? ? ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) > return > >? ? ?ierr; > >? ? ? ? comm = PETSC_COMM_WORLD; > > > >? ? ? ? ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, > NULL, NULL, > >? ? ?NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); > >? ? ? ? ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > >? ? ? ? ierr = PetscObjectSetName((PetscObject) dm, "Initial > >? ? ?DM");CHKERRQ(ierr); > >? ? ? ? ierr = DMViewFromOptions(dm, NULL, > "-initial_dm_view");CHKERRQ(ierr); > >? ? ? ? ierr = DMDestroy(&dm);CHKERRQ(ierr); > >? ? ? ? ierr = PetscFinalize(); > >? ? ? ? return ierr; > >? ? ?} > > > >? ? ?can do all the overlap tests. For example, you can run it naively > >? ? ?and get a serial mesh > > > >? ? ?master *:~/Downloads/tmp/Nicolas$ > /PETSc3/petsc/apple/bin/mpiexec -n > >? ? ?2 ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 > >? ? ?DM Object: Initial DM 2 MPI processes > >? ? ? ? type: plex > >? ? ?Initial DM in 2 dimensions: > >? ? ? ? 0-cells: 36 0 > >? ? ? ? 1-cells: 85 0 > >? ? ? ? 2-cells: 50 0 > >? ? ?Labels: > >? ? ? ? celltype: 3 strata with value/size (0 (36), 3 (50), 1 (85)) > >? ? ? ? depth: 3 strata with value/size (0 (36), 1 (85), 2 (50)) > >? ? ? ? marker: 1 strata with value/size (1 (40)) > >? ? ? ? Face Sets: 1 strata with value/size (1 (20)) > > > >? ? ?Then run it telling Plex to distribute after creating the mesh > > > >? ? ?master *:~/Downloads/tmp/Nicolas$ > /PETSc3/petsc/apple/bin/mpiexec -n > >? ? ?2 ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 > -dm_distribute > >? ? ?DM Object: Initial DM 2 MPI processes > >? ? ? ? type: plex > >? ? ?Initial DM in 2 dimensions: > >? ? ? ? 0-cells: 21 21 > >? ? ? ? 1-cells: 45 45 > >? ? ? ? 2-cells: 25 25 > >? ? ?Labels: > >? ? ? ? depth: 3 strata with value/size (0 (21), 1 (45), 2 (25)) > >? ? ? ? celltype: 3 strata with value/size (0 (21), 1 (45), 3 (25)) > >? ? ? ? marker: 1 strata with value/size (1 (21)) > >? ? ? ? 
Face Sets: 1 strata with value/size (1 (10)) > > > >? ? ?The get the same thing back with overlap = 0 > > > >? ? ?master *:~/Downloads/tmp/Nicolas$ > /PETSc3/petsc/apple/bin/mpiexec -n > >? ? ?2 ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 > >? ? ?-dm_distribute -dm_distribute_overlap 0 > >? ? ?DM Object: Initial DM 2 MPI processes > >? ? ? ? type: plex > >? ? ?Initial DM in 2 dimensions: > >? ? ? ? 0-cells: 21 21 > >? ? ? ? 1-cells: 45 45 > >? ? ? ? 2-cells: 25 25 > >? ? ?Labels: > >? ? ? ? depth: 3 strata with value/size (0 (21), 1 (45), 2 (25)) > >? ? ? ? celltype: 3 strata with value/size (0 (21), 1 (45), 3 (25)) > >? ? ? ? marker: 1 strata with value/size (1 (21)) > >? ? ? ? Face Sets: 1 strata with value/size (1 (10)) > > > >? ? ?and get larger local meshes with overlap = 1 > > > >? ? ?master *:~/Downloads/tmp/Nicolas$ > /PETSc3/petsc/apple/bin/mpiexec -n > >? ? ?2 ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 > >? ? ?-dm_distribute -dm_distribute_overlap 1 > >? ? ?DM Object: Initial DM 2 MPI processes > >? ? ? ? type: plex > >? ? ?Initial DM in 2 dimensions: > >? ? ? ? 0-cells: 29 29 > >? ? ? ? 1-cells: 65 65 > >? ? ? ? 2-cells: 37 37 > >? ? ?Labels: > >? ? ? ? depth: 3 strata with value/size (0 (29), 1 (65), 2 (37)) > >? ? ? ? celltype: 3 strata with value/size (0 (29), 1 (65), 3 (37)) > >? ? ? ? marker: 1 strata with value/size (1 (27)) > >? ? ? ? Face Sets: 1 strata with value/size (1 (13)) > > > >? ? ? ? Thanks, > > > >? ? ? ? ? ?Matt > > > >? ? ?On Wed, Mar 31, 2021 at 12:22 PM Nicolas Barral > >? ? ? > >? ? ? >> wrote: > > > > > > > >? ? ? ? ?@+ > > > >? ? ? ? ?-- > >? ? ? ? ?Nicolas > > > >? ? ? ? ?On 31/03/2021 17:51, Matthew Knepley wrote: > >? ? ? ? ? > On Sat, Mar 27, 2021 at 9:27 AM Nicolas Barral > >? ? ? ? ? > > >? ? ? ? ? > > >? ? ? ? ? > > >? ? ? ? ? >>> wrote: > >? ? ? ? ? > > >? ? ? ? ? >? ? ?Hi all, > >? ? ? ? ? > > >? ? ? ? ? >? ? ?First, I'm not sure I understand what the overlap > >? ? ? ? ?parameter in > >? ? ? ? ? >? ? ?DMPlexDistributeOverlap does. I tried the following: > >? ? ? ? ?generate a small > >? ? ? ? ? >? ? ?mesh on 1 rank with DMPlexCreateBoxMesh, then > distribute > >? ? ? ? ?it with > >? ? ? ? ? >? ? ?DMPlexDistribute. At this point I have two nice > >? ? ? ? ?partitions, with shared > >? ? ? ? ? >? ? ?vertices and no overlapping cells. Then I call > >? ? ? ? ?DMPlexDistributeOverlap > >? ? ? ? ? >? ? ?with the overlap parameter set to 0 or 1, and get the > >? ? ? ? ?same resulting > >? ? ? ? ? >? ? ?plex in both cases. Why is that ? > >? ? ? ? ? > > >? ? ? ? ? > > >? ? ? ? ? > The overlap parameter says how many cell adjacencies to go > >? ? ? ? ?out. You > >? ? ? ? ? > should not get the same > >? ? ? ? ? > mesh out. We have lots of examples that use this. If > you send > >? ? ? ? ?your small > >? ? ? ? ? > example, I can probably > >? ? ? ? ? > tell you what is happening. > >? ? ? ? ? > > > > >? ? ? ? ?Ok so I do have a small example on that and the DMClone > thing I > >? ? ? ? ?set up > >? ? ? ? ?to understand! I attach it to the email. > > > >? ? ? ? ?For the overlap, you can change the overlap constant at > the top > >? ? ? ? ?of the > >? ? ? ? ?file. With OVERLAP=0 or 1, the distributed overlapping mesh > >? ? ? ? ?(shown using > >? ? ? ? ?-over_dm_view, it's DMover) are the same, and different > from the > >? ? ? ? ?mesh > >? ? ? ? ?before distributing the overlap (shown using > -distrib_dm_view). For > >? ? ? ? ?larger overlap values they're different. > > > >? ? ? ? ?The process is: > >? ? ? ? 
?1/ create a DM dm on 1 rank > >? ? ? ? ?2/ clone dm into dm2 > >? ? ? ? ?3/ distribute dm > >? ? ? ? ?4/ clone dm into dm3 > >? ? ? ? ?5/ distribute dm overlap > > > >? ? ? ? ?I print all the DMs after each step. dm has a distributed > >? ? ? ? ?overlap, dm2 > >? ? ? ? ?is not distributed, dm3 is distributed but without > overlap. Since > >? ? ? ? ?distribute and distributeOverlap create new DMs, I don't seem > >? ? ? ? ?have a > >? ? ? ? ?problem with the shallow copies. > > > > > >? ? ? ? ? >? ? ?Second, I'm wondering what would be a good way to > handle > >? ? ? ? ?two overlaps > >? ? ? ? ? >? ? ?and associated local vectors. In my adaptation > code, the > >? ? ? ? ?remeshing > >? ? ? ? ? >? ? ?library requires a non-overlapping mesh, while the > >? ? ? ? ?refinement criterion > >? ? ? ? ? >? ? ?computation is based on hessian computations, which > >? ? ? ? ?require a layer of > >? ? ? ? ? >? ? ?overlap. What I can do is clone the dm before > >? ? ? ? ?distributing the overlap, > >? ? ? ? ? >? ? ?then manage two independent plex objects with > their own > >? ? ? ? ?local sections > >? ? ? ? ? >? ? ?etc. and copy/trim local vectors manually. Is there a > >? ? ? ? ?more automatic > >? ? ? ? ? >? ? ?way > >? ? ? ? ? >? ? ?to do this ? > >? ? ? ? ? > > >? ? ? ? ? > > >? ? ? ? ? > DMClone() is a shallow copy, so that will not work. > You would > >? ? ? ? ?maintain > >? ? ? ? ? > two different Plexes, overlapping > >? ? ? ? ? > and non-overlapping, with their own sections and vecs. Are > >? ? ? ? ?you sure you > >? ? ? ? ? > need to keep around the non-overlapping one? > >? ? ? ? ? > Maybe if I understood what operations you want to work, I > >? ? ? ? ?could say > >? ? ? ? ? > something more definitive. > >? ? ? ? ? > > >? ? ? ? ?I need to be able to pass the non-overlapping mesh to the > >? ? ? ? ?remesher. I > >? ? ? ? ?can either maintain 2 plexes, or trim the overlapping > plex when > >? ? ? ? ?I create > >? ? ? ? ?the arrays I give to the remesher. I'm not sure which is the > >? ? ? ? ?best/worst ? > > > >? ? ? ? ?Thanks > > > >? ? ? ? ?-- > >? ? ? ? ?Nicolas > > > > > >? ? ? ? ? >? ? Thanks, > >? ? ? ? ? > > >? ? ? ? ? >? ? ? ?Matt > >? ? ? ? ? > > >? ? ? ? ? >? ? ?Thanks > >? ? ? ? ? > > >? ? ? ? ? >? ? ?-- > >? ? ? ? ? >? ? ?Nicolas > >? ? ? ? ? > > >? ? ? ? ? > > >? ? ? ? ? > > >? ? ? ? ? > -- > >? ? ? ? ? > What most experimenters take for granted before they > begin their > >? ? ? ? ? > experiments is infinitely more interesting than any > results > >? ? ? ? ?to which > >? ? ? ? ? > their experiments lead. > >? ? ? ? ? > -- Norbert Wiener > >? ? ? ? ? > > >? ? ? ? ? > https://www.cse.buffalo.edu/~knepley/ > >? ? ? ? ? > > > > > > > >? ? ?-- > >? ? ?What most experimenters take for granted before they begin their > >? ? ?experiments is infinitely more interesting than any results > to which > >? ? ?their experiments lead. > >? ? ?-- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > >? ? ? > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. 
> -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From knepley at gmail.com Wed Mar 31 14:01:35 2021 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 Mar 2021 15:01:35 -0400 Subject: [petsc-users] DMPlex overlap In-Reply-To: References: <0f529250-4af3-c5c2-9b9f-8e272ada24e9@math.u-bordeaux.fr> <1c9c519d-4d94-66e1-844c-329fb58d3e94@math.u-bordeaux.fr> Message-ID: On Wed, Mar 31, 2021 at 2:59 PM Nicolas Barral < nicolas.barral at math.u-bordeaux.fr> wrote: > On 31/03/2021 20:10, Matthew Knepley wrote: > > On Wed, Mar 31, 2021 at 1:41 PM Nicolas Barral > > > > wrote: > > > > Thanks Matt, but sorry I still don't get it. Why does: > > > > static char help[] = "Tests plex distribution and overlaps.\n"; > > > > #include > > > > int main (int argc, char * argv[]) { > > > > DM dm; > > MPI_Comm comm; > > PetscErrorCode ierr; > > > > ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) > > return ierr; > > comm = PETSC_COMM_WORLD; > > > > ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, NULL, NULL, > > NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); > > ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > > ierr = PetscObjectSetName((PetscObject) dm, "Initial > > DM");CHKERRQ(ierr); > > ierr = DMViewFromOptions(dm, NULL, > > "-initial_dm_view");CHKERRQ(ierr); > > > > > > ierr = DMDestroy(&dm);CHKERRQ(ierr); > > ierr = PetscFinalize(); > > return ierr; > > } > > > > called with mpiexec -n 2 ./test_overlapV2 -initial_dm_view > > -dm_plex_box_faces 5,5 -dm_distribute -dm_distribute_overlap 0 > > > > give > > DM Object: Initial DM 2 MPI processes > > type: plex > > Initial DM in 2 dimensions: > > 0-cells: 21 21 > > 1-cells: 45 45 > > 2-cells: 25 25 > > Labels: > > depth: 3 strata with value/size (0 (21), 1 (45), 2 (25)) > > celltype: 3 strata with value/size (0 (21), 1 (45), 3 (25)) > > marker: 1 strata with value/size (1 (21)) > > Face Sets: 1 strata with value/size (1 (10)) > > > > which is what I expect, while > > > > static char help[] = "Tests plex distribution and overlaps.\n"; > > > > #include > > > > int main (int argc, char * argv[]) { > > > > DM dm, odm; > > MPI_Comm comm; > > PetscErrorCode ierr; > > > > ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) > > return ierr; > > comm = PETSC_COMM_WORLD; > > > > ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, NULL, NULL, > > NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); > > ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > > odm = dm; > > > > > > This is just setting pointers, so no copy. > > > > DMPlexDistributeOverlap(odm, 0, NULL, &dm); > > > > > > Here you have overwritten the DM here. Don't do this. > > > I just copied what's in plex ex19! 
> > But ok if I change for: > > static char help[] = "Tests plex distribution and overlaps.\n"; > > #include > > int main (int argc, char * argv[]) { > > DM dm, odm; > MPI_Comm comm; > PetscErrorCode ierr; > > ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) return ierr; > comm = PETSC_COMM_WORLD; > > ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, NULL, NULL, > NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); > ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > DMPlexDistributeOverlap(dm, 0, NULL, &odm); > ierr = PetscObjectSetName((PetscObject) odm, "Initial > DM");CHKERRQ(ierr); > ierr = DMViewFromOptions(odm, NULL, "-initial_dm_view");CHKERRQ(ierr); > > > ierr = DMDestroy(&dm);CHKERRQ(ierr); > ierr = DMDestroy(&odm);CHKERRQ(ierr); > ierr = PetscFinalize(); > return ierr; > } > > I still get: > > DM Object: Initial DM 2 MPI processes > type: plex > Initial DM in 2 dimensions: > 0-cells: 29 29 > 1-cells: 65 65 > 2-cells: 37 37 > Labels: > depth: 3 strata with value/size (0 (29), 1 (65), 2 (37)) > celltype: 3 strata with value/size (0 (29), 1 (65), 3 (37)) > marker: 1 strata with value/size (1 (27)) > Face Sets: 1 strata with value/size (1 (13)) > > which is the mesh with a 1-overlap. So what's wrong now ? > Oh, you mean why did passing "0" in DistributeOverlap not reduce your overlap to 0? That function takes an existing mesh and gives you "k" more levels of overlap. It does not set your overlap at level k. Maybe that is not clear from the docs. Thanks, Matt > Thanks, > > -- > Nicolas > > > > > Thanks, > > > > Matt > > > > if (!dm) {printf("Big problem\n"); dm = odm;} > > else {DMDestroy(&odm);} > > ierr = PetscObjectSetName((PetscObject) dm, "Initial > > DM");CHKERRQ(ierr); > > ierr = DMViewFromOptions(dm, NULL, > > "-initial_dm_view");CHKERRQ(ierr); > > > > > > ierr = DMDestroy(&dm);CHKERRQ(ierr); > > ierr = PetscFinalize(); > > return ierr; > > } > > > > called with mpiexec -n 2 ./test_overlapV3 -initial_dm_view > > -dm_plex_box_faces 5,5 -dm_distribute > > > > gives: > > DM Object: Initial DM 2 MPI processes > > type: plex > > Initial DM in 2 dimensions: > > 0-cells: 29 29 > > 1-cells: 65 65 > > 2-cells: 37 37 > > Labels: > > depth: 3 strata with value/size (0 (29), 1 (65), 2 (37)) > > celltype: 3 strata with value/size (0 (29), 1 (65), 3 (37)) > > marker: 1 strata with value/size (1 (27)) > > Face Sets: 1 strata with value/size (1 (13)) > > > > which is not what I expect ? > > > > Thanks, > > > > -- > > Nicolas > > > > On 31/03/2021 19:02, Matthew Knepley wrote: > > > Alright, I think the problem had to do with keeping track of what > > DM you > > > were looking at. 
This code increases the overlap of an initial DM: > > > > > > static char help[] = "Tests plex distribution and overlaps.\n"; > > > > > > #include > > > > > > int main (int argc, char * argv[]) { > > > > > > DM dm, dm2; > > > PetscInt overlap; > > > MPI_Comm comm; > > > PetscErrorCode ierr; > > > > > > ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) > > return ierr; > > > comm = PETSC_COMM_WORLD; > > > > > > ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, NULL, > NULL, > > > NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); > > > ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > > > ierr = PetscObjectSetName((PetscObject) dm, "Initial > > DM");CHKERRQ(ierr); > > > ierr = DMViewFromOptions(dm, NULL, > > "-initial_dm_view");CHKERRQ(ierr); > > > > > > ierr = DMPlexGetOverlap(dm, &overlap);CHKERRQ(ierr); > > > ierr = DMPlexDistributeOverlap(dm, overlap+1, NULL, > > &dm2);CHKERRQ(ierr); > > > ierr = PetscObjectSetName((PetscObject) dm2, "More Overlap > > > DM");CHKERRQ(ierr); > > > ierr = DMViewFromOptions(dm2, NULL, > > "-over_dm_view");CHKERRQ(ierr); > > > > > > ierr = DMDestroy(&dm2);CHKERRQ(ierr); > > > ierr = DMDestroy(&dm);CHKERRQ(ierr); > > > ierr = PetscFinalize(); > > > return ierr; > > > } > > > > > > and when we run it we get the expected result > > > > > > master *:~/Downloads/tmp/Nicolas$ /PETSc3/petsc/apple/bin/mpiexec > > -n 2 > > > ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 > > -dm_distribute > > > -dm_distribute_overlap 1 -over_dm_view > > > DM Object: Initial DM 2 MPI processes > > > type: plex > > > Initial DM in 2 dimensions: > > > 0-cells: 29 29 > > > 1-cells: 65 65 > > > 2-cells: 37 37 > > > Labels: > > > depth: 3 strata with value/size (0 (29), 1 (65), 2 (37)) > > > celltype: 3 strata with value/size (0 (29), 1 (65), 3 (37)) > > > marker: 1 strata with value/size (1 (27)) > > > Face Sets: 1 strata with value/size (1 (13)) > > > DM Object: More Overlap DM 2 MPI processes > > > type: plex > > > More Overlap DM in 2 dimensions: > > > 0-cells: 36 36 > > > 1-cells: 85 85 > > > 2-cells: 50 50 > > > Labels: > > > depth: 3 strata with value/size (0 (36), 1 (85), 2 (50)) > > > celltype: 3 strata with value/size (0 (36), 1 (85), 3 (50)) > > > marker: 1 strata with value/size (1 (40)) > > > Face Sets: 1 strata with value/size (1 (20)) > > > > > > Thanks, > > > > > > Matt > > > > > > On Wed, Mar 31, 2021 at 12:57 PM Matthew Knepley > > > > > >> wrote: > > > > > > Okay, let me show a really simple example that gives the > expected > > > result before I figure out what is going wrong for you. This > code > > > > > > static char help[] = "Tests plex distribution and > overlaps.\n"; > > > > > > #include > > > > > > int main (int argc, char * argv[]) { > > > DM dm; > > > MPI_Comm comm; > > > PetscErrorCode ierr; > > > > > > ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) > > return > > > ierr; > > > comm = PETSC_COMM_WORLD; > > > > > > ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, > > NULL, NULL, > > > NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); > > > ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > > > ierr = PetscObjectSetName((PetscObject) dm, "Initial > > > DM");CHKERRQ(ierr); > > > ierr = DMViewFromOptions(dm, NULL, > > "-initial_dm_view");CHKERRQ(ierr); > > > ierr = DMDestroy(&dm);CHKERRQ(ierr); > > > ierr = PetscFinalize(); > > > return ierr; > > > } > > > > > > can do all the overlap tests. 
For example, you can run it > naively > > > and get a serial mesh > > > > > > master *:~/Downloads/tmp/Nicolas$ > > /PETSc3/petsc/apple/bin/mpiexec -n > > > 2 ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 > > > DM Object: Initial DM 2 MPI processes > > > type: plex > > > Initial DM in 2 dimensions: > > > 0-cells: 36 0 > > > 1-cells: 85 0 > > > 2-cells: 50 0 > > > Labels: > > > celltype: 3 strata with value/size (0 (36), 3 (50), 1 (85)) > > > depth: 3 strata with value/size (0 (36), 1 (85), 2 (50)) > > > marker: 1 strata with value/size (1 (40)) > > > Face Sets: 1 strata with value/size (1 (20)) > > > > > > Then run it telling Plex to distribute after creating the mesh > > > > > > master *:~/Downloads/tmp/Nicolas$ > > /PETSc3/petsc/apple/bin/mpiexec -n > > > 2 ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 > > -dm_distribute > > > DM Object: Initial DM 2 MPI processes > > > type: plex > > > Initial DM in 2 dimensions: > > > 0-cells: 21 21 > > > 1-cells: 45 45 > > > 2-cells: 25 25 > > > Labels: > > > depth: 3 strata with value/size (0 (21), 1 (45), 2 (25)) > > > celltype: 3 strata with value/size (0 (21), 1 (45), 3 (25)) > > > marker: 1 strata with value/size (1 (21)) > > > Face Sets: 1 strata with value/size (1 (10)) > > > > > > The get the same thing back with overlap = 0 > > > > > > master *:~/Downloads/tmp/Nicolas$ > > /PETSc3/petsc/apple/bin/mpiexec -n > > > 2 ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 > > > -dm_distribute -dm_distribute_overlap 0 > > > DM Object: Initial DM 2 MPI processes > > > type: plex > > > Initial DM in 2 dimensions: > > > 0-cells: 21 21 > > > 1-cells: 45 45 > > > 2-cells: 25 25 > > > Labels: > > > depth: 3 strata with value/size (0 (21), 1 (45), 2 (25)) > > > celltype: 3 strata with value/size (0 (21), 1 (45), 3 (25)) > > > marker: 1 strata with value/size (1 (21)) > > > Face Sets: 1 strata with value/size (1 (10)) > > > > > > and get larger local meshes with overlap = 1 > > > > > > master *:~/Downloads/tmp/Nicolas$ > > /PETSc3/petsc/apple/bin/mpiexec -n > > > 2 ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 > > > -dm_distribute -dm_distribute_overlap 1 > > > DM Object: Initial DM 2 MPI processes > > > type: plex > > > Initial DM in 2 dimensions: > > > 0-cells: 29 29 > > > 1-cells: 65 65 > > > 2-cells: 37 37 > > > Labels: > > > depth: 3 strata with value/size (0 (29), 1 (65), 2 (37)) > > > celltype: 3 strata with value/size (0 (29), 1 (65), 3 (37)) > > > marker: 1 strata with value/size (1 (27)) > > > Face Sets: 1 strata with value/size (1 (13)) > > > > > > Thanks, > > > > > > Matt > > > > > > On Wed, Mar 31, 2021 at 12:22 PM Nicolas Barral > > > > > > > > >> wrote: > > > > > > > > > > > > @+ > > > > > > -- > > > Nicolas > > > > > > On 31/03/2021 17:51, Matthew Knepley wrote: > > > > On Sat, Mar 27, 2021 at 9:27 AM Nicolas Barral > > > > > > > > > > > > > > > > > > > >>> wrote: > > > > > > > > Hi all, > > > > > > > > First, I'm not sure I understand what the overlap > > > parameter in > > > > DMPlexDistributeOverlap does. I tried the > following: > > > generate a small > > > > mesh on 1 rank with DMPlexCreateBoxMesh, then > > distribute > > > it with > > > > DMPlexDistribute. At this point I have two nice > > > partitions, with shared > > > > vertices and no overlapping cells. Then I call > > > DMPlexDistributeOverlap > > > > with the overlap parameter set to 0 or 1, and get > the > > > same resulting > > > > plex in both cases. Why is that ? 
> > > > > > > > > > > > The overlap parameter says how many cell adjacencies > to go > > > out. You > > > > should not get the same > > > > mesh out. We have lots of examples that use this. If > > you send > > > your small > > > > example, I can probably > > > > tell you what is happening. > > > > > > > > > > Ok so I do have a small example on that and the DMClone > > thing I > > > set up > > > to understand! I attach it to the email. > > > > > > For the overlap, you can change the overlap constant at > > the top > > > of the > > > file. With OVERLAP=0 or 1, the distributed overlapping > mesh > > > (shown using > > > -over_dm_view, it's DMover) are the same, and different > > from the > > > mesh > > > before distributing the overlap (shown using > > -distrib_dm_view). For > > > larger overlap values they're different. > > > > > > The process is: > > > 1/ create a DM dm on 1 rank > > > 2/ clone dm into dm2 > > > 3/ distribute dm > > > 4/ clone dm into dm3 > > > 5/ distribute dm overlap > > > > > > I print all the DMs after each step. dm has a distributed > > > overlap, dm2 > > > is not distributed, dm3 is distributed but without > > overlap. Since > > > distribute and distributeOverlap create new DMs, I don't > seem > > > have a > > > problem with the shallow copies. > > > > > > > > > > Second, I'm wondering what would be a good way to > > handle > > > two overlaps > > > > and associated local vectors. In my adaptation > > code, the > > > remeshing > > > > library requires a non-overlapping mesh, while the > > > refinement criterion > > > > computation is based on hessian computations, which > > > require a layer of > > > > overlap. What I can do is clone the dm before > > > distributing the overlap, > > > > then manage two independent plex objects with > > their own > > > local sections > > > > etc. and copy/trim local vectors manually. Is > there a > > > more automatic > > > > way > > > > to do this ? > > > > > > > > > > > > DMClone() is a shallow copy, so that will not work. > > You would > > > maintain > > > > two different Plexes, overlapping > > > > and non-overlapping, with their own sections and vecs. > Are > > > you sure you > > > > need to keep around the non-overlapping one? > > > > Maybe if I understood what operations you want to > work, I > > > could say > > > > something more definitive. > > > > > > > I need to be able to pass the non-overlapping mesh to the > > > remesher. I > > > can either maintain 2 plexes, or trim the overlapping > > plex when > > > I create > > > the arrays I give to the remesher. I'm not sure which is > the > > > best/worst ? > > > > > > Thanks > > > > > > -- > > > Nicolas > > > > > > > > > > Thanks, > > > > > > > > Matt > > > > > > > > Thanks > > > > > > > > -- > > > > Nicolas > > > > > > > > > > > > > > > > -- > > > > What most experimenters take for granted before they > > begin their > > > > experiments is infinitely more interesting than any > > results > > > to which > > > > their experiments lead. > > > > -- Norbert Wiener > > > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin > their > > > experiments is infinitely more interesting than any results > > to which > > > their experiments lead. 
> > > -- Norbert Wiener > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > > > experiments is infinitely more interesting than any results to > which > > > their experiments lead. > > > -- Norbert Wiener > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ < > http://www.cse.buffalo.edu/~knepley/> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.barral at math.u-bordeaux.fr Wed Mar 31 14:16:59 2021 From: nicolas.barral at math.u-bordeaux.fr (Nicolas Barral) Date: Wed, 31 Mar 2021 21:16:59 +0200 Subject: [petsc-users] DMPlex overlap In-Reply-To: References: <0f529250-4af3-c5c2-9b9f-8e272ada24e9@math.u-bordeaux.fr> <1c9c519d-4d94-66e1-844c-329fb58d3e94@math.u-bordeaux.fr> Message-ID: On 31/03/2021 21:01, Matthew Knepley wrote: > On Wed, Mar 31, 2021 at 2:59 PM Nicolas Barral > > wrote: > > On 31/03/2021 20:10, Matthew Knepley wrote: > > On Wed, Mar 31, 2021 at 1:41 PM Nicolas Barral > > > > >> wrote: > > > >? ? ?Thanks Matt, but sorry I still don't get it. Why does: > > > >? ? ?static char help[] = "Tests plex distribution and overlaps.\n"; > > > >? ? ?#include > > > >? ? ?int main (int argc, char * argv[]) { > > > >? ? ? ? ?DM? ? ? ? ? ? ?dm; > >? ? ? ? ?MPI_Comm? ? ? ?comm; > >? ? ? ? ?PetscErrorCode ierr; > > > >? ? ? ? ?ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) > >? ? ?return ierr; > >? ? ? ? ?comm = PETSC_COMM_WORLD; > > > >? ? ? ? ?ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, > NULL, NULL, > >? ? ?NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); > >? ? ? ? ?ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > >? ? ? ? ?ierr = PetscObjectSetName((PetscObject) dm, "Initial > >? ? ?DM");CHKERRQ(ierr); > >? ? ? ? ?ierr = DMViewFromOptions(dm, NULL, > >? ? ?"-initial_dm_view");CHKERRQ(ierr); > > > > > >? ? ? ? ?ierr = DMDestroy(&dm);CHKERRQ(ierr); > >? ? ? ? ?ierr = PetscFinalize(); > >? ? ? ? ?return ierr; > >? ? ?} > > > >? ? ?called with mpiexec -n 2 ./test_overlapV2 -initial_dm_view > >? ? ?-dm_plex_box_faces 5,5 -dm_distribute -dm_distribute_overlap 0 > > > >? ? ?give > >? ? ?DM Object: Initial DM 2 MPI processes > >? ? ? ? ?type: plex > >? ? ?Initial DM in 2 dimensions: > >? ? ? ? ?0-cells: 21 21 > >? ? ? ? ?1-cells: 45 45 > >? ? ? ? ?2-cells: 25 25 > >? ? ?Labels: > >? ? ? ? ?depth: 3 strata with value/size (0 (21), 1 (45), 2 (25)) > >? ? ? ? ?celltype: 3 strata with value/size (0 (21), 1 (45), 3 (25)) > >? ? ? ? ?marker: 1 strata with value/size (1 (21)) > >? ? ? ? ?Face Sets: 1 strata with value/size (1 (10)) > > > >? ? ?which is what I expect, while > > > >? ? ?static char help[] = "Tests plex distribution and overlaps.\n"; > > > >? ? ?#include > > > >? ? ?int main (int argc, char * argv[]) { > > > >? ? ? ? ?DM? ? ? ? ? ? ?dm, odm; > >? ? ? ? ?MPI_Comm? ? ? ?comm; > >? ? ? ? ?PetscErrorCode ierr; > > > >? ? ? ? ?ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) > >? ? ?return ierr; > >? ? ? ? ?comm = PETSC_COMM_WORLD; > > > >? ? ? ? 
?ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, > NULL, NULL, > >? ? ?NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); > >? ? ? ? ?ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > >? ? ? ? ?odm = dm; > > > > > > This is just?setting pointers, so no copy. > > > >? ? ? ? ?DMPlexDistributeOverlap(odm, 0, NULL, &dm); > > > > > > Here you have overwritten the DM here. Don't do this. > > > I just copied what's in plex ex19! > > But ok if I change for: > > static char help[] = "Tests plex distribution and overlaps.\n"; > > #include > > int main (int argc, char * argv[]) { > > ? ?DM? ? ? ? ? ? ?dm, odm; > ? ?MPI_Comm? ? ? ?comm; > ? ?PetscErrorCode ierr; > > ? ?ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) > return ierr; > ? ?comm = PETSC_COMM_WORLD; > > ? ?ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, NULL, NULL, > NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); > ? ?ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > ? ?DMPlexDistributeOverlap(dm, 0, NULL, &odm); > ? ?ierr = PetscObjectSetName((PetscObject) odm, "Initial > DM");CHKERRQ(ierr); > ? ?ierr = DMViewFromOptions(odm, NULL, > "-initial_dm_view");CHKERRQ(ierr); > > > ? ?ierr = DMDestroy(&dm);CHKERRQ(ierr); > ? ?ierr = DMDestroy(&odm);CHKERRQ(ierr); > ? ?ierr = PetscFinalize(); > ? ?return ierr; > } > > I still get: > > DM Object: Initial DM 2 MPI processes > ? ?type: plex > Initial DM in 2 dimensions: > ? ?0-cells: 29 29 > ? ?1-cells: 65 65 > ? ?2-cells: 37 37 > Labels: > ? ?depth: 3 strata with value/size (0 (29), 1 (65), 2 (37)) > ? ?celltype: 3 strata with value/size (0 (29), 1 (65), 3 (37)) > ? ?marker: 1 strata with value/size (1 (27)) > ? ?Face Sets: 1 strata with value/size (1 (13)) > > which is the mesh with a 1-overlap. So what's wrong now ? > > > Oh, you mean why did passing "0" in DistributeOverlap not reduce your > overlap to 0? That > function takes an existing mesh and gives you "k" more levels of > overlap. It does not set your > overlap at level k. Maybe that is not clear from the docs. Ok so that's worth clarifying, but my problem is that dm has a 0-overlap before DMPlexDistributeOverlap and odm has a 1-overlap even though I passed "0". If I understand what you just wrote, adding "0" more levels of overlap to 0-overlap, that should still be 0 overlap ? Yet when I look at the code of DMPlexDistributeOverlap, you're flagging points to be added to the overlap whatever "k" is. 
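One quick way to see this from the caller's side is to print the overlap recorded in the DM
around the call. This is a small sketch only, meant to drop into the test program below, and I
am assuming DMPlexGetOverlap reports the value recorded when the overlap was distributed:

   PetscInt ovBefore, ovAfter;

   ierr = DMPlexGetOverlap(dm, &ovBefore);CHKERRQ(ierr);             /* expect 0 after plain -dm_distribute */
   ierr = DMPlexDistributeOverlap(dm, 0, NULL, &odm);CHKERRQ(ierr);  /* ask for 0 extra levels              */
   ierr = DMPlexGetOverlap(odm, &ovAfter);CHKERRQ(ierr);             /* does this report 0 or 1 here?       */
   ierr = PetscPrintf(comm, "overlap before %D, after %D\n", ovBefore, ovAfter);CHKERRQ(ierr);

The full program and the output I get are below.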
Sample code:

static char help[] = "Tests plex distribution and overlaps.\n";

#include <petscdmplex.h>

int main (int argc, char * argv[]) {

   DM             dm, odm;
   MPI_Comm       comm;
   PetscErrorCode ierr;

   ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) return ierr;
   comm = PETSC_COMM_WORLD;

   ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, NULL, NULL, NULL, PETSC_TRUE, &dm);CHKERRQ(ierr);
   ierr = DMSetFromOptions(dm);CHKERRQ(ierr);

   ierr = PetscObjectSetName((PetscObject) dm, "DM before");CHKERRQ(ierr);
   ierr = DMViewFromOptions(dm, NULL, "-before_dm_view");CHKERRQ(ierr);
   ierr = DMPlexDistributeOverlap(dm, 0, NULL, &odm);CHKERRQ(ierr);
   ierr = PetscObjectSetName((PetscObject) odm, "DM after");CHKERRQ(ierr);
   ierr = DMViewFromOptions(odm, NULL, "-after_dm_view");CHKERRQ(ierr);

   ierr = DMDestroy(&dm);CHKERRQ(ierr);
   ierr = DMDestroy(&odm);CHKERRQ(ierr);
   ierr = PetscFinalize();
   return ierr;
}

% mpiexec -n 2 ./test_overlapV3 -dm_plex_box_faces 5,5 -dm_distribute -before_dm_view -after_dm_view
DM Object: DM before 2 MPI processes
   type: plex
DM before in 2 dimensions:
   0-cells: 21 21
   1-cells: 45 45
   2-cells: 25 25
Labels:
   depth: 3 strata with value/size (0 (21), 1 (45), 2 (25))
   celltype: 3 strata with value/size (0 (21), 1 (45), 3 (25))
   marker: 1 strata with value/size (1 (21))
   Face Sets: 1 strata with value/size (1 (10))
DM Object: DM after 2 MPI processes
   type: plex
DM after in 2 dimensions:
   0-cells: 29 29
   1-cells: 65 65
   2-cells: 37 37
Labels:
   depth: 3 strata with value/size (0 (29), 1 (65), 2 (37))
   celltype: 3 strata with value/size (0 (29), 1 (65), 3 (37))
   marker: 1 strata with value/size (1 (27))
   Face Sets: 1 strata with value/size (1 (13))

> >    Thanks,
> >
> >       Matt
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which
> their experiments lead.
> -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From knepley at gmail.com Wed Mar 31 14:24:44 2021 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 Mar 2021 15:24:44 -0400 Subject: [petsc-users] DMPlex overlap In-Reply-To: References: <0f529250-4af3-c5c2-9b9f-8e272ada24e9@math.u-bordeaux.fr> <1c9c519d-4d94-66e1-844c-329fb58d3e94@math.u-bordeaux.fr> Message-ID: > > Ok so that's worth clarifying, but my problem is that dm has a 0-overlap > before DMPlexDistributeOverlap and odm has a 1-overlap even though I > passed "0". If I understand what you just wrote, adding "0" more levels > of overlap to 0-overlap, that should still be 0 overlap ? > > Yet when I look at the code of DMPlexDistributeOverlap, you're flagging > points to be added to the overlap whatever "k" is. > Crap. There is a bug handling 0. Evidently, no one ever asks for overlap 0. Will fix. Thanks, Matt > Sample code: > static char help[] = "Tests plex distribution and overlaps.\n"; > > #include > > int main (int argc, char * argv[]) { > > DM dm, odm; > MPI_Comm comm; > PetscErrorCode ierr; > > ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) return ierr; > comm = PETSC_COMM_WORLD; > > ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, NULL, NULL, > NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); > ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > > ierr = PetscObjectSetName((PetscObject) dm, "DM before");CHKERRQ(ierr); > ierr = DMViewFromOptions(dm, NULL, "-before_dm_view");CHKERRQ(ierr); > DMPlexDistributeOverlap(dm, 0, NULL, &odm); > ierr = PetscObjectSetName((PetscObject) odm, "DM after");CHKERRQ(ierr); > ierr = DMViewFromOptions(odm, NULL, "-after_dm_view");CHKERRQ(ierr); > > > ierr = DMDestroy(&dm);CHKERRQ(ierr); > ierr = DMDestroy(&odm);CHKERRQ(ierr); > ierr = PetscFinalize(); > return ierr; > } > > % mpiexec -n 2 ./test_overlapV3 -dm_plex_box_faces 5,5 -dm_distribute > -before_dm_view -after_dm_view > DM Object: DM before 2 MPI processes > type: plex > DM before in 2 dimensions: > 0-cells: 21 21 > 1-cells: 45 45 > 2-cells: 25 25 > Labels: > depth: 3 strata with value/size (0 (21), 1 (45), 2 (25)) > celltype: 3 strata with value/size (0 (21), 1 (45), 3 (25)) > marker: 1 strata with value/size (1 (21)) > Face Sets: 1 strata with value/size (1 (10)) > DM Object: DM after 2 MPI processes > type: plex > DM after in 2 dimensions: > 0-cells: 29 29 > 1-cells: 65 65 > 2-cells: 37 37 > Labels: > depth: 3 strata with value/size (0 (29), 1 (65), 2 (37)) > celltype: 3 strata with value/size (0 (29), 1 (65), 3 (37)) > marker: 1 strata with value/size (1 (27)) > Face Sets: 1 strata with value/size (1 (13)) > > > > > > > Thanks, > > > > Matt > > > > Thanks, > > > > -- > > Nicolas > > > > > > > > Thanks, > > > > > > Matt > > > > > > if (!dm) {printf("Big problem\n"); dm = odm;} > > > else {DMDestroy(&odm);} > > > ierr = PetscObjectSetName((PetscObject) dm, "Initial > > > DM");CHKERRQ(ierr); > > > ierr = DMViewFromOptions(dm, NULL, > > > "-initial_dm_view");CHKERRQ(ierr); > > > > > > > > > ierr = DMDestroy(&dm);CHKERRQ(ierr); > > > ierr = PetscFinalize(); > > > return ierr; > > > } > > > > > > called with mpiexec -n 2 ./test_overlapV3 -initial_dm_view > > > -dm_plex_box_faces 5,5 -dm_distribute > > > > > > gives: > > > DM Object: Initial DM 2 MPI processes > > > type: plex > > > Initial DM in 2 dimensions: > > > 0-cells: 29 29 > > > 1-cells: 65 65 > > > 2-cells: 37 37 > > > Labels: > > > depth: 3 strata with value/size (0 (29), 1 (65), 2 (37)) > > > celltype: 3 strata with value/size (0 (29), 1 (65), 3 > (37)) > > > 
marker: 1 strata with value/size (1 (27)) > > > Face Sets: 1 strata with value/size (1 (13)) > > > > > > which is not what I expect ? > > > > > > Thanks, > > > > > > -- > > > Nicolas > > > > > > On 31/03/2021 19:02, Matthew Knepley wrote: > > > > Alright, I think the problem had to do with keeping track > > of what > > > DM you > > > > were looking at. This code increases the overlap of an > > initial DM: > > > > > > > > static char help[] = "Tests plex distribution and > > overlaps.\n"; > > > > > > > > #include > > > > > > > > int main (int argc, char * argv[]) { > > > > > > > > DM dm, dm2; > > > > PetscInt overlap; > > > > MPI_Comm comm; > > > > PetscErrorCode ierr; > > > > > > > > ierr = PetscInitialize(&argc, &argv, NULL, help);if > (ierr) > > > return ierr; > > > > comm = PETSC_COMM_WORLD; > > > > > > > > ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, > > NULL, NULL, > > > > NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); > > > > ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > > > > ierr = PetscObjectSetName((PetscObject) dm, "Initial > > > DM");CHKERRQ(ierr); > > > > ierr = DMViewFromOptions(dm, NULL, > > > "-initial_dm_view");CHKERRQ(ierr); > > > > > > > > ierr = DMPlexGetOverlap(dm, &overlap);CHKERRQ(ierr); > > > > ierr = DMPlexDistributeOverlap(dm, overlap+1, NULL, > > > &dm2);CHKERRQ(ierr); > > > > ierr = PetscObjectSetName((PetscObject) dm2, "More > Overlap > > > > DM");CHKERRQ(ierr); > > > > ierr = DMViewFromOptions(dm2, NULL, > > > "-over_dm_view");CHKERRQ(ierr); > > > > > > > > ierr = DMDestroy(&dm2);CHKERRQ(ierr); > > > > ierr = DMDestroy(&dm);CHKERRQ(ierr); > > > > ierr = PetscFinalize(); > > > > return ierr; > > > > } > > > > > > > > and when we run it we get the expected result > > > > > > > > master *:~/Downloads/tmp/Nicolas$ > > /PETSc3/petsc/apple/bin/mpiexec > > > -n 2 > > > > ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 > > > -dm_distribute > > > > -dm_distribute_overlap 1 -over_dm_view > > > > DM Object: Initial DM 2 MPI processes > > > > type: plex > > > > Initial DM in 2 dimensions: > > > > 0-cells: 29 29 > > > > 1-cells: 65 65 > > > > 2-cells: 37 37 > > > > Labels: > > > > depth: 3 strata with value/size (0 (29), 1 (65), 2 (37)) > > > > celltype: 3 strata with value/size (0 (29), 1 (65), 3 > (37)) > > > > marker: 1 strata with value/size (1 (27)) > > > > Face Sets: 1 strata with value/size (1 (13)) > > > > DM Object: More Overlap DM 2 MPI processes > > > > type: plex > > > > More Overlap DM in 2 dimensions: > > > > 0-cells: 36 36 > > > > 1-cells: 85 85 > > > > 2-cells: 50 50 > > > > Labels: > > > > depth: 3 strata with value/size (0 (36), 1 (85), 2 (50)) > > > > celltype: 3 strata with value/size (0 (36), 1 (85), 3 > (50)) > > > > marker: 1 strata with value/size (1 (40)) > > > > Face Sets: 1 strata with value/size (1 (20)) > > > > > > > > Thanks, > > > > > > > > Matt > > > > > > > > On Wed, Mar 31, 2021 at 12:57 PM Matthew Knepley > > > > > > > > > > > > >>> wrote: > > > > > > > > Okay, let me show a really simple example that gives > > the expected > > > > result before I figure out what is going wrong for > > you. 
This code > > > > > > > > static char help[] = "Tests plex distribution and > > overlaps.\n"; > > > > > > > > #include > > > > > > > > int main (int argc, char * argv[]) { > > > > DM dm; > > > > MPI_Comm comm; > > > > PetscErrorCode ierr; > > > > > > > > ierr = PetscInitialize(&argc, &argv, NULL, help);if > > (ierr) > > > return > > > > ierr; > > > > comm = PETSC_COMM_WORLD; > > > > > > > > ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, > NULL, > > > NULL, NULL, > > > > NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); > > > > ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > > > > ierr = PetscObjectSetName((PetscObject) dm, "Initial > > > > DM");CHKERRQ(ierr); > > > > ierr = DMViewFromOptions(dm, NULL, > > > "-initial_dm_view");CHKERRQ(ierr); > > > > ierr = DMDestroy(&dm);CHKERRQ(ierr); > > > > ierr = PetscFinalize(); > > > > return ierr; > > > > } > > > > > > > > can do all the overlap tests. For example, you can run > > it naively > > > > and get a serial mesh > > > > > > > > master *:~/Downloads/tmp/Nicolas$ > > > /PETSc3/petsc/apple/bin/mpiexec -n > > > > 2 ./test_overlap -initial_dm_view -dm_plex_box_faces > 5,5 > > > > DM Object: Initial DM 2 MPI processes > > > > type: plex > > > > Initial DM in 2 dimensions: > > > > 0-cells: 36 0 > > > > 1-cells: 85 0 > > > > 2-cells: 50 0 > > > > Labels: > > > > celltype: 3 strata with value/size (0 (36), 3 (50), > > 1 (85)) > > > > depth: 3 strata with value/size (0 (36), 1 (85), 2 > > (50)) > > > > marker: 1 strata with value/size (1 (40)) > > > > Face Sets: 1 strata with value/size (1 (20)) > > > > > > > > Then run it telling Plex to distribute after creating > > the mesh > > > > > > > > master *:~/Downloads/tmp/Nicolas$ > > > /PETSc3/petsc/apple/bin/mpiexec -n > > > > 2 ./test_overlap -initial_dm_view -dm_plex_box_faces > 5,5 > > > -dm_distribute > > > > DM Object: Initial DM 2 MPI processes > > > > type: plex > > > > Initial DM in 2 dimensions: > > > > 0-cells: 21 21 > > > > 1-cells: 45 45 > > > > 2-cells: 25 25 > > > > Labels: > > > > depth: 3 strata with value/size (0 (21), 1 (45), 2 > > (25)) > > > > celltype: 3 strata with value/size (0 (21), 1 (45), > > 3 (25)) > > > > marker: 1 strata with value/size (1 (21)) > > > > Face Sets: 1 strata with value/size (1 (10)) > > > > > > > > The get the same thing back with overlap = 0 > > > > > > > > master *:~/Downloads/tmp/Nicolas$ > > > /PETSc3/petsc/apple/bin/mpiexec -n > > > > 2 ./test_overlap -initial_dm_view -dm_plex_box_faces > 5,5 > > > > -dm_distribute -dm_distribute_overlap 0 > > > > DM Object: Initial DM 2 MPI processes > > > > type: plex > > > > Initial DM in 2 dimensions: > > > > 0-cells: 21 21 > > > > 1-cells: 45 45 > > > > 2-cells: 25 25 > > > > Labels: > > > > depth: 3 strata with value/size (0 (21), 1 (45), 2 > > (25)) > > > > celltype: 3 strata with value/size (0 (21), 1 (45), > > 3 (25)) > > > > marker: 1 strata with value/size (1 (21)) > > > > Face Sets: 1 strata with value/size (1 (10)) > > > > > > > > and get larger local meshes with overlap = 1 > > > > > > > > master *:~/Downloads/tmp/Nicolas$ > > > /PETSc3/petsc/apple/bin/mpiexec -n > > > > 2 ./test_overlap -initial_dm_view -dm_plex_box_faces > 5,5 > > > > -dm_distribute -dm_distribute_overlap 1 > > > > DM Object: Initial DM 2 MPI processes > > > > type: plex > > > > Initial DM in 2 dimensions: > > > > 0-cells: 29 29 > > > > 1-cells: 65 65 > > > > 2-cells: 37 37 > > > > Labels: > > > > depth: 3 strata with value/size (0 (29), 1 (65), 2 > > (37)) > > > > celltype: 3 strata with value/size (0 (29), 1 (65), > > 3 (37)) > > > > marker: 1 strata 
with value/size (1 (27)) > > > > Face Sets: 1 strata with value/size (1 (13)) > > > > > > > > Thanks, > > > > > > > > Matt > > > > > > > > On Wed, Mar 31, 2021 at 12:22 PM Nicolas Barral > > > > > > > > > > > > > > > > > > > >>> wrote: > > > > > > > > > > > > > > > > @+ > > > > > > > > -- > > > > Nicolas > > > > > > > > On 31/03/2021 17:51, Matthew Knepley wrote: > > > > > On Sat, Mar 27, 2021 at 9:27 AM Nicolas Barral > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > >>>> wrote: > > > > > > > > > > Hi all, > > > > > > > > > > First, I'm not sure I understand what the > > overlap > > > > parameter in > > > > > DMPlexDistributeOverlap does. I tried the > > following: > > > > generate a small > > > > > mesh on 1 rank with DMPlexCreateBoxMesh, > then > > > distribute > > > > it with > > > > > DMPlexDistribute. At this point I have two > nice > > > > partitions, with shared > > > > > vertices and no overlapping cells. Then I > call > > > > DMPlexDistributeOverlap > > > > > with the overlap parameter set to 0 or 1, > > and get the > > > > same resulting > > > > > plex in both cases. Why is that ? > > > > > > > > > > > > > > > The overlap parameter says how many cell > > adjacencies to go > > > > out. You > > > > > should not get the same > > > > > mesh out. We have lots of examples that use > > this. If > > > you send > > > > your small > > > > > example, I can probably > > > > > tell you what is happening. > > > > > > > > > > > > > Ok so I do have a small example on that and the > > DMClone > > > thing I > > > > set up > > > > to understand! I attach it to the email. > > > > > > > > For the overlap, you can change the overlap > > constant at > > > the top > > > > of the > > > > file. With OVERLAP=0 or 1, the distributed > > overlapping mesh > > > > (shown using > > > > -over_dm_view, it's DMover) are the same, and > > different > > > from the > > > > mesh > > > > before distributing the overlap (shown using > > > -distrib_dm_view). For > > > > larger overlap values they're different. > > > > > > > > The process is: > > > > 1/ create a DM dm on 1 rank > > > > 2/ clone dm into dm2 > > > > 3/ distribute dm > > > > 4/ clone dm into dm3 > > > > 5/ distribute dm overlap > > > > > > > > I print all the DMs after each step. dm has a > > distributed > > > > overlap, dm2 > > > > is not distributed, dm3 is distributed but without > > > overlap. Since > > > > distribute and distributeOverlap create new DMs, I > > don't seem > > > > have a > > > > problem with the shallow copies. > > > > > > > > > > > > > Second, I'm wondering what would be a good > > way to > > > handle > > > > two overlaps > > > > > and associated local vectors. In my > adaptation > > > code, the > > > > remeshing > > > > > library requires a non-overlapping mesh, > > while the > > > > refinement criterion > > > > > computation is based on hessian > > computations, which > > > > require a layer of > > > > > overlap. What I can do is clone the dm > before > > > > distributing the overlap, > > > > > then manage two independent plex objects > with > > > their own > > > > local sections > > > > > etc. and copy/trim local vectors manually. > > Is there a > > > > more automatic > > > > > way > > > > > to do this ? > > > > > > > > > > > > > > > DMClone() is a shallow copy, so that will not > work. > > > You would > > > > maintain > > > > > two different Plexes, overlapping > > > > > and non-overlapping, with their own sections > > and vecs. 
Are > > > > you sure you > > > > > need to keep around the non-overlapping one? > > > > > Maybe if I understood what operations you want > > to work, I > > > > could say > > > > > something more definitive. > > > > > > > > > I need to be able to pass the non-overlapping mesh > > to the > > > > remesher. I > > > > can either maintain 2 plexes, or trim the > overlapping > > > plex when > > > > I create > > > > the arrays I give to the remesher. I'm not sure > > which is the > > > > best/worst ? > > > > > > > > Thanks > > > > > > > > -- > > > > Nicolas > > > > > > > > > > > > > Thanks, > > > > > > > > > > Matt > > > > > > > > > > Thanks > > > > > > > > > > -- > > > > > Nicolas > > > > > > > > > > > > > > > > > > > > -- > > > > > What most experimenters take for granted before > > they > > > begin their > > > > > experiments is infinitely more interesting than > any > > > results > > > > to which > > > > > their experiments lead. > > > > > -- Norbert Wiener > > > > > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > > > > > > > > > > -- > > > > What most experimenters take for granted before they > > begin their > > > > experiments is infinitely more interesting than any > > results > > > to which > > > > their experiments lead. > > > > -- Norbert Wiener > > > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > > > > > > > > > > -- > > > > What most experimenters take for granted before they begin > > their > > > > experiments is infinitely more interesting than any > > results to which > > > > their experiments lead. > > > > -- Norbert Wiener > > > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > > > experiments is infinitely more interesting than any results to > which > > > their experiments lead. > > > -- Norbert Wiener > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ < > http://www.cse.buffalo.edu/~knepley/> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.barral at math.u-bordeaux.fr Wed Mar 31 14:35:57 2021 From: nicolas.barral at math.u-bordeaux.fr (Nicolas Barral) Date: Wed, 31 Mar 2021 21:35:57 +0200 Subject: [petsc-users] DMPlex overlap In-Reply-To: References: <0f529250-4af3-c5c2-9b9f-8e272ada24e9@math.u-bordeaux.fr> <1c9c519d-4d94-66e1-844c-329fb58d3e94@math.u-bordeaux.fr> Message-ID: <3d847dd6-f0c6-ff9c-d287-0b56538b2610@math.u-bordeaux.fr> On 31/03/2021 21:24, Matthew Knepley wrote: > Ok so that's worth clarifying, but my problem is that dm has a > 0-overlap > before DMPlexDistributeOverlap and odm has a 1-overlap even though I > passed "0". If I understand what you just wrote, adding "0" more levels > of overlap to 0-overlap, that should still be 0 overlap ? > > Yet when I look at the code of DMPlexDistributeOverlap, you're flagging > points to be added to the overlap whatever "k" is. > > > Crap. There is a bug handling 0. Evidently, no one ever asks for overlap 0. > Will fix. 
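Until that fix lands, the surprise can be avoided by simply not calling DMPlexDistributeOverlap() when the requested overlap is 0. A minimal, untested sketch, reusing only the calls from the test programs above (the OVERLAP constant and the -after_dm_view option name are just placeholders for this sketch, not part of any existing example):

static char help[] = "Sketch: guard against the overlap == 0 surprise.\n";

#include <petsc.h>   /* stands in for the header used in the scrubbed examples above */

#define OVERLAP 0    /* requested number of overlap layers */

int main (int argc, char * argv[]) {
  DM             dm, odm = NULL;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) return ierr;
  ierr = DMPlexCreateBoxMesh(PETSC_COMM_WORLD, 2, PETSC_TRUE, NULL, NULL, NULL, NULL, PETSC_TRUE, &dm);CHKERRQ(ierr);
  ierr = DMSetFromOptions(dm);CHKERRQ(ierr);   /* run with -dm_distribute -dm_plex_box_faces 5,5 */

  if (OVERLAP > 0) {   /* skip the call entirely when no overlap is wanted */
    ierr = DMPlexDistributeOverlap(dm, OVERLAP, NULL, &odm);CHKERRQ(ierr);
  }
  if (odm) {ierr = DMDestroy(&dm);CHKERRQ(ierr); dm = odm;}   /* keep working with the overlapped mesh */

  ierr = PetscObjectSetName((PetscObject) dm, "DM after guard");CHKERRQ(ierr);
  ierr = DMViewFromOptions(dm, NULL, "-after_dm_view");CHKERRQ(ierr);

  ierr = DMDestroy(&dm);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

With OVERLAP set to 0 this leaves the distributed, non-overlapping mesh untouched, which is the behaviour the fix should give anyway.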
> ok now I'm sure I understand what you explained :) Thanks Matt. Now we can look at the other question: I need to be able to pass the non-overlapping mesh to the remesher. I can either maintain 2 plexes, or trim the overlapping plex when I create the arrays I pass to the remesher. I'm not sure which is the best/worst ? Thanks -- Nicolas > ? Thanks, > > ? ? ?Matt > > Sample code: > static char help[] = "Tests plex distribution and overlaps.\n"; > > #include > > int main (int argc, char * argv[]) { > > ? ?DM? ? ? ? ? ? ?dm, odm; > ? ?MPI_Comm? ? ? ?comm; > ? ?PetscErrorCode ierr; > > ? ?ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) > return ierr; > ? ?comm = PETSC_COMM_WORLD; > > ? ?ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, NULL, NULL, > NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); > ? ?ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > > > > ? ?ierr = PetscObjectSetName((PetscObject) dm, "DM > before");CHKERRQ(ierr); > ? ?ierr = DMViewFromOptions(dm, NULL, "-before_dm_view");CHKERRQ(ierr); > ? ?DMPlexDistributeOverlap(dm, 0, NULL, &odm); > ? ?ierr = PetscObjectSetName((PetscObject) odm, "DM > after");CHKERRQ(ierr); > ? ?ierr = DMViewFromOptions(odm, NULL, "-after_dm_view");CHKERRQ(ierr); > > > ? ?ierr = DMDestroy(&dm);CHKERRQ(ierr); > ? ?ierr = DMDestroy(&odm);CHKERRQ(ierr); > ? ?ierr = PetscFinalize(); > ? ?return ierr; > } > > % mpiexec -n 2 ./test_overlapV3 -dm_plex_box_faces 5,5 -dm_distribute > -before_dm_view -after_dm_view > DM Object: DM before 2 MPI processes > ? ?type: plex > DM before in 2 dimensions: > ? ?0-cells: 21 21 > ? ?1-cells: 45 45 > ? ?2-cells: 25 25 > Labels: > ? ?depth: 3 strata with value/size (0 (21), 1 (45), 2 (25)) > ? ?celltype: 3 strata with value/size (0 (21), 1 (45), 3 (25)) > ? ?marker: 1 strata with value/size (1 (21)) > ? ?Face Sets: 1 strata with value/size (1 (10)) > DM Object: DM after 2 MPI processes > ? ?type: plex > DM after in 2 dimensions: > ? ?0-cells: 29 29 > ? ?1-cells: 65 65 > ? ?2-cells: 37 37 > Labels: > ? ?depth: 3 strata with value/size (0 (29), 1 (65), 2 (37)) > ? ?celltype: 3 strata with value/size (0 (29), 1 (65), 3 (37)) > ? ?marker: 1 strata with value/size (1 (27)) > ? ?Face Sets: 1 strata with value/size (1 (13)) > > > > > > >? ? Thanks, > > > >? ? ? ?Matt > > > >? ? ?Thanks, > > > >? ? ?-- > >? ? ?Nicolas > > > >? ? ? > > >? ? ? >? ? Thanks, > >? ? ? > > >? ? ? >? ? ? ?Matt > >? ? ? > > >? ? ? >? ? ? ? ?if (!dm) {printf("Big problem\n"); dm = odm;} > >? ? ? >? ? ? ? ?else? ? ?{DMDestroy(&odm);} > >? ? ? >? ? ? ? ?ierr = PetscObjectSetName((PetscObject) dm, "Initial > >? ? ? >? ? ?DM");CHKERRQ(ierr); > >? ? ? >? ? ? ? ?ierr = DMViewFromOptions(dm, NULL, > >? ? ? >? ? ?"-initial_dm_view");CHKERRQ(ierr); > >? ? ? > > >? ? ? > > >? ? ? >? ? ? ? ?ierr = DMDestroy(&dm);CHKERRQ(ierr); > >? ? ? >? ? ? ? ?ierr = PetscFinalize(); > >? ? ? >? ? ? ? ?return ierr; > >? ? ? >? ? ?} > >? ? ? > > >? ? ? >? ? ?called with mpiexec -n 2 ./test_overlapV3 -initial_dm_view > >? ? ? >? ? ?-dm_plex_box_faces 5,5 -dm_distribute > >? ? ? > > >? ? ? >? ? ?gives: > >? ? ? >? ? ?DM Object: Initial DM 2 MPI processes > >? ? ? >? ? ? ? ?type: plex > >? ? ? >? ? ?Initial DM in 2 dimensions: > >? ? ? >? ? ? ? ?0-cells: 29 29 > >? ? ? >? ? ? ? ?1-cells: 65 65 > >? ? ? >? ? ? ? ?2-cells: 37 37 > >? ? ? >? ? ?Labels: > >? ? ? >? ? ? ? ?depth: 3 strata with value/size (0 (29), 1 (65), 2 > (37)) > >? ? ? >? ? ? ? ?celltype: 3 strata with value/size (0 (29), 1 > (65), 3 (37)) > >? ? ? >? ? ? ? ?marker: 1 strata with value/size (1 (27)) > >? ? ? >? ? ? ? 
?Face Sets: 1 strata with value/size (1 (13)) > >? ? ? > > >? ? ? >? ? ?which is not what I expect ? > >? ? ? > > >? ? ? >? ? ?Thanks, > >? ? ? > > >? ? ? >? ? ?-- > >? ? ? >? ? ?Nicolas > >? ? ? > > >? ? ? >? ? ?On 31/03/2021 19:02, Matthew Knepley wrote: > >? ? ? >? ? ? > Alright, I think the problem had to do with keeping > track > >? ? ?of what > >? ? ? >? ? ?DM you > >? ? ? >? ? ? > were looking at. This code increases the overlap of an > >? ? ?initial DM: > >? ? ? >? ? ? > > >? ? ? >? ? ? > static char help[] = "Tests plex distribution and > >? ? ?overlaps.\n"; > >? ? ? >? ? ? > > >? ? ? >? ? ? > #include > >? ? ? >? ? ? > > >? ? ? >? ? ? > int main (int argc, char * argv[]) { > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? DM ? ? ? ? ? ? dm, dm2; > >? ? ? >? ? ? >? ? PetscInt ? ? ? overlap; > >? ? ? >? ? ? >? ? MPI_Comm ? ? ? comm; > >? ? ? >? ? ? >? ? PetscErrorCode ierr; > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ierr = PetscInitialize(&argc, &argv, NULL, > help);if (ierr) > >? ? ? >? ? ?return ierr; > >? ? ? >? ? ? >? ? comm = PETSC_COMM_WORLD; > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, > NULL, > >? ? ?NULL, NULL, > >? ? ? >? ? ? > NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); > >? ? ? >? ? ? >? ? ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > >? ? ? >? ? ? >? ? ierr = PetscObjectSetName((PetscObject) dm, "Initial > >? ? ? >? ? ?DM");CHKERRQ(ierr); > >? ? ? >? ? ? >? ? ierr = DMViewFromOptions(dm, NULL, > >? ? ? >? ? ?"-initial_dm_view");CHKERRQ(ierr); > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ierr = DMPlexGetOverlap(dm, &overlap);CHKERRQ(ierr); > >? ? ? >? ? ? >? ? ierr = DMPlexDistributeOverlap(dm, overlap+1, NULL, > >? ? ? >? ? ?&dm2);CHKERRQ(ierr); > >? ? ? >? ? ? >? ? ierr = PetscObjectSetName((PetscObject) dm2, > "More Overlap > >? ? ? >? ? ? > DM");CHKERRQ(ierr); > >? ? ? >? ? ? >? ? ierr = DMViewFromOptions(dm2, NULL, > >? ? ? >? ? ?"-over_dm_view");CHKERRQ(ierr); > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ierr = DMDestroy(&dm2);CHKERRQ(ierr); > >? ? ? >? ? ? >? ? ierr = DMDestroy(&dm);CHKERRQ(ierr); > >? ? ? >? ? ? >? ? ierr = PetscFinalize(); > >? ? ? >? ? ? >? ? return ierr; > >? ? ? >? ? ? > } > >? ? ? >? ? ? > > >? ? ? >? ? ? > and when we run it we get the expected result > >? ? ? >? ? ? > > >? ? ? >? ? ? > master *:~/Downloads/tmp/Nicolas$ > >? ? ?/PETSc3/petsc/apple/bin/mpiexec > >? ? ? >? ? ?-n 2 > >? ? ? >? ? ? > ./test_overlap -initial_dm_view -dm_plex_box_faces 5,5 > >? ? ? >? ? ?-dm_distribute > >? ? ? >? ? ? > -dm_distribute_overlap 1 -over_dm_view > >? ? ? >? ? ? > DM Object: Initial DM 2 MPI processes > >? ? ? >? ? ? >? ? type: plex > >? ? ? >? ? ? > Initial DM in 2 dimensions: > >? ? ? >? ? ? >? ? 0-cells: 29 29 > >? ? ? >? ? ? >? ? 1-cells: 65 65 > >? ? ? >? ? ? >? ? 2-cells: 37 37 > >? ? ? >? ? ? > Labels: > >? ? ? >? ? ? >? ? depth: 3 strata with value/size (0 (29), 1 (65), > 2 (37)) > >? ? ? >? ? ? >? ? celltype: 3 strata with value/size (0 (29), 1 > (65), 3 (37)) > >? ? ? >? ? ? >? ? marker: 1 strata with value/size (1 (27)) > >? ? ? >? ? ? >? ? Face Sets: 1 strata with value/size (1 (13)) > >? ? ? >? ? ? > DM Object: More Overlap DM 2 MPI processes > >? ? ? >? ? ? >? ? type: plex > >? ? ? >? ? ? > More Overlap DM in 2 dimensions: > >? ? ? >? ? ? >? ? 0-cells: 36 36 > >? ? ? >? ? ? >? ? 1-cells: 85 85 > >? ? ? >? ? ? >? ? 2-cells: 50 50 > >? ? ? >? ? ? > Labels: > >? ? ? >? ? ? >? ? depth: 3 strata with value/size (0 (36), 1 (85), > 2 (50)) > >? ? ? >? ? ? >? ? celltype: 3 strata with value/size (0 (36), 1 > (85), 3 (50)) > >? ? ? >? ? 
? >? ? marker: 1 strata with value/size (1 (40)) > >? ? ? >? ? ? >? ? Face Sets: 1 strata with value/size (1 (20)) > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? Thanks, > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ? ?Matt > >? ? ? >? ? ? > > >? ? ? >? ? ? > On Wed, Mar 31, 2021 at 12:57 PM Matthew Knepley > >? ? ? >? ? ? > > > >? ? ? > >> > >? ? ? >? ? ? > > > >? ? ? > >>>> wrote: > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?Okay, let me show a really simple example that > gives > >? ? ?the expected > >? ? ? >? ? ? >? ? ?result before I figure out what is going wrong for > >? ? ?you. This code > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?static char help[] = "Tests plex distribution and > >? ? ?overlaps.\n"; > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?#include > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?int main (int argc, char * argv[]) { > >? ? ? >? ? ? >? ? ? ? DM? ? ? ? ? ? ? ? ? ? dm; > >? ? ? >? ? ? >? ? ? ? MPI_Comm? ? ? ?comm; > >? ? ? >? ? ? >? ? ? ? PetscErrorCode ierr; > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ? ? ierr = PetscInitialize(&argc, &argv, NULL, > help);if > >? ? ?(ierr) > >? ? ? >? ? ?return > >? ? ? >? ? ? >? ? ?ierr; > >? ? ? >? ? ? >? ? ? ? comm = PETSC_COMM_WORLD; > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ? ? ierr = DMPlexCreateBoxMesh(comm, 2, > PETSC_TRUE, NULL, > >? ? ? >? ? ?NULL, NULL, > >? ? ? >? ? ? >? ? ?NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); > >? ? ? >? ? ? >? ? ? ? ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > >? ? ? >? ? ? >? ? ? ? ierr = PetscObjectSetName((PetscObject) dm, > "Initial > >? ? ? >? ? ? >? ? ?DM");CHKERRQ(ierr); > >? ? ? >? ? ? >? ? ? ? ierr = DMViewFromOptions(dm, NULL, > >? ? ? >? ? ?"-initial_dm_view");CHKERRQ(ierr); > >? ? ? >? ? ? >? ? ? ? ierr = DMDestroy(&dm);CHKERRQ(ierr); > >? ? ? >? ? ? >? ? ? ? ierr = PetscFinalize(); > >? ? ? >? ? ? >? ? ? ? return ierr; > >? ? ? >? ? ? >? ? ?} > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?can do all the overlap tests. For example, you > can run > >? ? ?it naively > >? ? ? >? ? ? >? ? ?and get a serial mesh > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?master *:~/Downloads/tmp/Nicolas$ > >? ? ? >? ? ?/PETSc3/petsc/apple/bin/mpiexec -n > >? ? ? >? ? ? >? ? ?2 ./test_overlap -initial_dm_view > -dm_plex_box_faces 5,5 > >? ? ? >? ? ? >? ? ?DM Object: Initial DM 2 MPI processes > >? ? ? >? ? ? >? ? ? ? type: plex > >? ? ? >? ? ? >? ? ?Initial DM in 2 dimensions: > >? ? ? >? ? ? >? ? ? ? 0-cells: 36 0 > >? ? ? >? ? ? >? ? ? ? 1-cells: 85 0 > >? ? ? >? ? ? >? ? ? ? 2-cells: 50 0 > >? ? ? >? ? ? >? ? ?Labels: > >? ? ? >? ? ? >? ? ? ? celltype: 3 strata with value/size (0 (36), > 3 (50), > >? ? ?1 (85)) > >? ? ? >? ? ? >? ? ? ? depth: 3 strata with value/size (0 (36), 1 > (85), 2 > >? ? ?(50)) > >? ? ? >? ? ? >? ? ? ? marker: 1 strata with value/size (1 (40)) > >? ? ? >? ? ? >? ? ? ? Face Sets: 1 strata with value/size (1 (20)) > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?Then run it telling Plex to distribute after > creating > >? ? ?the mesh > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?master *:~/Downloads/tmp/Nicolas$ > >? ? ? >? ? ?/PETSc3/petsc/apple/bin/mpiexec -n > >? ? ? >? ? ? >? ? ?2 ./test_overlap -initial_dm_view > -dm_plex_box_faces 5,5 > >? ? ? >? ? ?-dm_distribute > >? ? ? >? ? ? >? ? ?DM Object: Initial DM 2 MPI processes > >? ? ? >? ? ? >? ? ? ? type: plex > >? ? ? >? ? ? >? ? ?Initial DM in 2 dimensions: > >? ? ? >? ? ? >? ? ? ? 0-cells: 21 21 > >? ? ? >? ? ? >? ? ? ? 1-cells: 45 45 > >? ? ? >? ? ? >? ? ? ? 2-cells: 25 25 > >? ? ? >? ? ? >? ? ?Labels: > >? ? ? >? ? ? >? ? ? ? depth: 3 strata with value/size (0 (21), 1 > (45), 2 > >? ? 
?(25)) > >? ? ? >? ? ? >? ? ? ? celltype: 3 strata with value/size (0 (21), > 1 (45), > >? ? ?3 (25)) > >? ? ? >? ? ? >? ? ? ? marker: 1 strata with value/size (1 (21)) > >? ? ? >? ? ? >? ? ? ? Face Sets: 1 strata with value/size (1 (10)) > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?The get the same thing back with overlap = 0 > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?master *:~/Downloads/tmp/Nicolas$ > >? ? ? >? ? ?/PETSc3/petsc/apple/bin/mpiexec -n > >? ? ? >? ? ? >? ? ?2 ./test_overlap -initial_dm_view > -dm_plex_box_faces 5,5 > >? ? ? >? ? ? >? ? ?-dm_distribute -dm_distribute_overlap 0 > >? ? ? >? ? ? >? ? ?DM Object: Initial DM 2 MPI processes > >? ? ? >? ? ? >? ? ? ? type: plex > >? ? ? >? ? ? >? ? ?Initial DM in 2 dimensions: > >? ? ? >? ? ? >? ? ? ? 0-cells: 21 21 > >? ? ? >? ? ? >? ? ? ? 1-cells: 45 45 > >? ? ? >? ? ? >? ? ? ? 2-cells: 25 25 > >? ? ? >? ? ? >? ? ?Labels: > >? ? ? >? ? ? >? ? ? ? depth: 3 strata with value/size (0 (21), 1 > (45), 2 > >? ? ?(25)) > >? ? ? >? ? ? >? ? ? ? celltype: 3 strata with value/size (0 (21), > 1 (45), > >? ? ?3 (25)) > >? ? ? >? ? ? >? ? ? ? marker: 1 strata with value/size (1 (21)) > >? ? ? >? ? ? >? ? ? ? Face Sets: 1 strata with value/size (1 (10)) > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?and get larger local meshes with overlap = 1 > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?master *:~/Downloads/tmp/Nicolas$ > >? ? ? >? ? ?/PETSc3/petsc/apple/bin/mpiexec -n > >? ? ? >? ? ? >? ? ?2 ./test_overlap -initial_dm_view > -dm_plex_box_faces 5,5 > >? ? ? >? ? ? >? ? ?-dm_distribute -dm_distribute_overlap 1 > >? ? ? >? ? ? >? ? ?DM Object: Initial DM 2 MPI processes > >? ? ? >? ? ? >? ? ? ? type: plex > >? ? ? >? ? ? >? ? ?Initial DM in 2 dimensions: > >? ? ? >? ? ? >? ? ? ? 0-cells: 29 29 > >? ? ? >? ? ? >? ? ? ? 1-cells: 65 65 > >? ? ? >? ? ? >? ? ? ? 2-cells: 37 37 > >? ? ? >? ? ? >? ? ?Labels: > >? ? ? >? ? ? >? ? ? ? depth: 3 strata with value/size (0 (29), 1 > (65), 2 > >? ? ?(37)) > >? ? ? >? ? ? >? ? ? ? celltype: 3 strata with value/size (0 (29), > 1 (65), > >? ? ?3 (37)) > >? ? ? >? ? ? >? ? ? ? marker: 1 strata with value/size (1 (27)) > >? ? ? >? ? ? >? ? ? ? Face Sets: 1 strata with value/size (1 (13)) > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ? ? Thanks, > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ? ? ? ?Matt > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?On Wed, Mar 31, 2021 at 12:22 PM Nicolas Barral > >? ? ? >? ? ? >? ? ? > >? ? ? > > >? ? ? >? ? ? > >? ? ? >> > >? ? ? >? ? ? >? ? ? > >? ? ? > > >? ? ? >? ? ? > >? ? ? >>>> wrote: > >? ? ? >? ? ? > > >? ? ? >? ? ? > > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ? ? ?@+ > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ? ? ?-- > >? ? ? >? ? ? >? ? ? ? ?Nicolas > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ? ? ?On 31/03/2021 17:51, Matthew Knepley wrote: > >? ? ? >? ? ? >? ? ? ? ? > On Sat, Mar 27, 2021 at 9:27 AM Nicolas > Barral > >? ? ? >? ? ? >? ? ? ? ? > > >? ? ? > > >? ? ? >? ? ? > >? ? ? >> > >? ? ? >? ? ? >? ? ? ? ? > >? ? ? > > >? ? ? >? ? ? > >? ? ? >>> > >? ? ? >? ? ? >? ? ? ? ? > > > >? ? ? > > >? ? ? >? ? ? > >? ? ? >> > >? ? ? >? ? ? >? ? ? ? ? > >? ? ? > > >? ? ? >? ? ? > >? ? ? >>>>> wrote: > >? ? ? >? ? ? >? ? ? ? ? > > >? ? ? >? ? ? >? ? ? ? ? >? ? ?Hi all, > >? ? ? >? ? ? >? ? ? ? ? > > >? ? ? >? ? ? >? ? ? ? ? >? ? ?First, I'm not sure I understand > what the > >? ? ?overlap > >? ? ? >? ? ? >? ? ? ? ?parameter in > >? ? ? >? ? ? >? ? ? ? ? >? ? ?DMPlexDistributeOverlap does. I > tried the > >? ? ?following: > >? ? ? >? ? ? >? ? ? ? ?generate a small > >? ? ? >? ? ? >? ? ? ? ? >? ? 
?mesh on 1 rank with > DMPlexCreateBoxMesh, then > >? ? ? >? ? ?distribute > >? ? ? >? ? ? >? ? ? ? ?it with > >? ? ? >? ? ? >? ? ? ? ? >? ? ?DMPlexDistribute. At this point I > have two nice > >? ? ? >? ? ? >? ? ? ? ?partitions, with shared > >? ? ? >? ? ? >? ? ? ? ? >? ? ?vertices and no overlapping cells. > Then I call > >? ? ? >? ? ? >? ? ? ? ?DMPlexDistributeOverlap > >? ? ? >? ? ? >? ? ? ? ? >? ? ?with the overlap parameter set to 0 > or 1, > >? ? ?and get the > >? ? ? >? ? ? >? ? ? ? ?same resulting > >? ? ? >? ? ? >? ? ? ? ? >? ? ?plex in both cases. Why is that ? > >? ? ? >? ? ? >? ? ? ? ? > > >? ? ? >? ? ? >? ? ? ? ? > > >? ? ? >? ? ? >? ? ? ? ? > The overlap parameter says how many cell > >? ? ?adjacencies to go > >? ? ? >? ? ? >? ? ? ? ?out. You > >? ? ? >? ? ? >? ? ? ? ? > should not get the same > >? ? ? >? ? ? >? ? ? ? ? > mesh out. We have lots of examples that use > >? ? ?this. If > >? ? ? >? ? ?you send > >? ? ? >? ? ? >? ? ? ? ?your small > >? ? ? >? ? ? >? ? ? ? ? > example, I can probably > >? ? ? >? ? ? >? ? ? ? ? > tell you what is happening. > >? ? ? >? ? ? >? ? ? ? ? > > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ? ? ?Ok so I do have a small example on that and the > >? ? ?DMClone > >? ? ? >? ? ?thing I > >? ? ? >? ? ? >? ? ? ? ?set up > >? ? ? >? ? ? >? ? ? ? ?to understand! I attach it to the email. > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ? ? ?For the overlap, you can change the overlap > >? ? ?constant at > >? ? ? >? ? ?the top > >? ? ? >? ? ? >? ? ? ? ?of the > >? ? ? >? ? ? >? ? ? ? ?file. With OVERLAP=0 or 1, the distributed > >? ? ?overlapping mesh > >? ? ? >? ? ? >? ? ? ? ?(shown using > >? ? ? >? ? ? >? ? ? ? ?-over_dm_view, it's DMover) are the same, and > >? ? ?different > >? ? ? >? ? ?from the > >? ? ? >? ? ? >? ? ? ? ?mesh > >? ? ? >? ? ? >? ? ? ? ?before distributing the overlap (shown using > >? ? ? >? ? ?-distrib_dm_view). For > >? ? ? >? ? ? >? ? ? ? ?larger overlap values they're different. > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ? ? ?The process is: > >? ? ? >? ? ? >? ? ? ? ?1/ create a DM dm on 1 rank > >? ? ? >? ? ? >? ? ? ? ?2/ clone dm into dm2 > >? ? ? >? ? ? >? ? ? ? ?3/ distribute dm > >? ? ? >? ? ? >? ? ? ? ?4/ clone dm into dm3 > >? ? ? >? ? ? >? ? ? ? ?5/ distribute dm overlap > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ? ? ?I print all the DMs after each step. dm has a > >? ? ?distributed > >? ? ? >? ? ? >? ? ? ? ?overlap, dm2 > >? ? ? >? ? ? >? ? ? ? ?is not distributed, dm3 is distributed but > without > >? ? ? >? ? ?overlap. Since > >? ? ? >? ? ? >? ? ? ? ?distribute and distributeOverlap create new > DMs, I > >? ? ?don't seem > >? ? ? >? ? ? >? ? ? ? ?have a > >? ? ? >? ? ? >? ? ? ? ?problem with the shallow copies. > >? ? ? >? ? ? > > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ? ? ? >? ? ?Second, I'm wondering what would be > a good > >? ? ?way to > >? ? ? >? ? ?handle > >? ? ? >? ? ? >? ? ? ? ?two overlaps > >? ? ? >? ? ? >? ? ? ? ? >? ? ?and associated local vectors. In my > adaptation > >? ? ? >? ? ?code, the > >? ? ? >? ? ? >? ? ? ? ?remeshing > >? ? ? >? ? ? >? ? ? ? ? >? ? ?library requires a non-overlapping mesh, > >? ? ?while the > >? ? ? >? ? ? >? ? ? ? ?refinement criterion > >? ? ? >? ? ? >? ? ? ? ? >? ? ?computation is based on hessian > >? ? ?computations, which > >? ? ? >? ? ? >? ? ? ? ?require a layer of > >? ? ? >? ? ? >? ? ? ? ? >? ? ?overlap. What I can do is clone the > dm before > >? ? ? >? ? ? >? ? ? ? ?distributing the overlap, > >? ? ? >? ? ? >? ? ? ? ? >? ? ?then manage two independent plex > objects with > >? ? ? >? ? 
?their own > >? ? ? >? ? ? >? ? ? ? ?local sections > >? ? ? >? ? ? >? ? ? ? ? >? ? ?etc. and copy/trim local vectors > manually. > >? ? ?Is there a > >? ? ? >? ? ? >? ? ? ? ?more automatic > >? ? ? >? ? ? >? ? ? ? ? >? ? ?way > >? ? ? >? ? ? >? ? ? ? ? >? ? ?to do this ? > >? ? ? >? ? ? >? ? ? ? ? > > >? ? ? >? ? ? >? ? ? ? ? > > >? ? ? >? ? ? >? ? ? ? ? > DMClone() is a shallow copy, so that > will not work. > >? ? ? >? ? ?You would > >? ? ? >? ? ? >? ? ? ? ?maintain > >? ? ? >? ? ? >? ? ? ? ? > two different Plexes, overlapping > >? ? ? >? ? ? >? ? ? ? ? > and non-overlapping, with their own sections > >? ? ?and vecs. Are > >? ? ? >? ? ? >? ? ? ? ?you sure you > >? ? ? >? ? ? >? ? ? ? ? > need to keep around the non-overlapping one? > >? ? ? >? ? ? >? ? ? ? ? > Maybe if I understood what operations > you want > >? ? ?to work, I > >? ? ? >? ? ? >? ? ? ? ?could say > >? ? ? >? ? ? >? ? ? ? ? > something more definitive. > >? ? ? >? ? ? >? ? ? ? ? > > >? ? ? >? ? ? >? ? ? ? ?I need to be able to pass the > non-overlapping mesh > >? ? ?to the > >? ? ? >? ? ? >? ? ? ? ?remesher. I > >? ? ? >? ? ? >? ? ? ? ?can either maintain 2 plexes, or trim the > overlapping > >? ? ? >? ? ?plex when > >? ? ? >? ? ? >? ? ? ? ?I create > >? ? ? >? ? ? >? ? ? ? ?the arrays I give to the remesher. I'm not sure > >? ? ?which is the > >? ? ? >? ? ? >? ? ? ? ?best/worst ? > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ? ? ?Thanks > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ? ? ?-- > >? ? ? >? ? ? >? ? ? ? ?Nicolas > >? ? ? >? ? ? > > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ? ? ? >? ? Thanks, > >? ? ? >? ? ? >? ? ? ? ? > > >? ? ? >? ? ? >? ? ? ? ? >? ? ? ?Matt > >? ? ? >? ? ? >? ? ? ? ? > > >? ? ? >? ? ? >? ? ? ? ? >? ? ?Thanks > >? ? ? >? ? ? >? ? ? ? ? > > >? ? ? >? ? ? >? ? ? ? ? >? ? ?-- > >? ? ? >? ? ? >? ? ? ? ? >? ? ?Nicolas > >? ? ? >? ? ? >? ? ? ? ? > > >? ? ? >? ? ? >? ? ? ? ? > > >? ? ? >? ? ? >? ? ? ? ? > > >? ? ? >? ? ? >? ? ? ? ? > -- > >? ? ? >? ? ? >? ? ? ? ? > What most experimenters take for granted > before > >? ? ?they > >? ? ? >? ? ?begin their > >? ? ? >? ? ? >? ? ? ? ? > experiments is infinitely more > interesting than any > >? ? ? >? ? ?results > >? ? ? >? ? ? >? ? ? ? ?to which > >? ? ? >? ? ? >? ? ? ? ? > their experiments lead. > >? ? ? >? ? ? >? ? ? ? ? > -- Norbert Wiener > >? ? ? >? ? ? >? ? ? ? ? > > >? ? ? >? ? ? >? ? ? ? ? > https://www.cse.buffalo.edu/~knepley/ > >? ? ? >? ? ? >? ? ? ? ? > >? ? ? >? ? ? > > >? ? ? >? ? ? > > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?-- > >? ? ? >? ? ? >? ? ?What most experimenters take for granted before > they > >? ? ?begin their > >? ? ? >? ? ? >? ? ?experiments is infinitely more interesting than any > >? ? ?results > >? ? ? >? ? ?to which > >? ? ? >? ? ? >? ? ?their experiments lead. > >? ? ? >? ? ? >? ? ?-- Norbert Wiener > >? ? ? >? ? ? > > >? ? ? >? ? ? > https://www.cse.buffalo.edu/~knepley/ > >? ? ? >? ? ? >? ? ? > >? ? ? >? ? ? > > >? ? ? >? ? ? > > >? ? ? >? ? ? > > >? ? ? >? ? ? > -- > >? ? ? >? ? ? > What most experimenters take for granted before > they begin > >? ? ?their > >? ? ? >? ? ? > experiments is infinitely more interesting than any > >? ? ?results to which > >? ? ? >? ? ? > their experiments lead. > >? ? ? >? ? ? > -- Norbert Wiener > >? ? ? >? ? ? > > >? ? ? >? ? ? > https://www.cse.buffalo.edu/~knepley/ > >? ? ? >? ? ? > >? ? ? > > >? ? ? > > >? ? ? > > >? ? ? > -- > >? ? ? > What most experimenters take for granted before they begin > their > >? ? ? > experiments is infinitely more interesting than any > results to which > >? ? ? 
> their experiments lead. > >? ? ? > -- Norbert Wiener > >? ? ? > > >? ? ? > https://www.cse.buffalo.edu/~knepley/ > >? ? ? > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From knepley at gmail.com Wed Mar 31 14:44:10 2021 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 Mar 2021 15:44:10 -0400 Subject: [petsc-users] DMPlex overlap In-Reply-To: <3d847dd6-f0c6-ff9c-d287-0b56538b2610@math.u-bordeaux.fr> References: <0f529250-4af3-c5c2-9b9f-8e272ada24e9@math.u-bordeaux.fr> <1c9c519d-4d94-66e1-844c-329fb58d3e94@math.u-bordeaux.fr> <3d847dd6-f0c6-ff9c-d287-0b56538b2610@math.u-bordeaux.fr> Message-ID: On Wed, Mar 31, 2021 at 3:36 PM Nicolas Barral < nicolas.barral at math.u-bordeaux.fr> wrote: > On 31/03/2021 21:24, Matthew Knepley wrote: > > Ok so that's worth clarifying, but my problem is that dm has a > > 0-overlap > > before DMPlexDistributeOverlap and odm has a 1-overlap even though I > > passed "0". If I understand what you just wrote, adding "0" more > levels > > of overlap to 0-overlap, that should still be 0 overlap ? > > > > Yet when I look at the code of DMPlexDistributeOverlap, you're > flagging > > points to be added to the overlap whatever "k" is. > > > > > > Crap. There is a bug handling 0. Evidently, no one ever asks for overlap > 0. > > Will fix. > > > ok now I'm sure I understand what you explained :) Thanks Matt. > > Now we can look at the other question: I need to be able to pass the > non-overlapping mesh to the remesher. I can either maintain 2 plexes, or > trim the overlapping plex when I create the arrays I pass to the > remesher. I'm not sure which is the best/worst ? > I would start with two plexes since it is very easy. Trimming the plex would essentially make another plex anyway, but it is not hard using DMPlexFilter(). 
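For illustration, a minimal, untested sketch of the two-plex approach, reusing only calls that already appear in this thread; the overlap width of 1 and the viewer option names are assumptions for the sketch, not a drop-in implementation:

static char help[] = "Sketch: keep a non-overlapping and an overlapping Plex side by side.\n";

#include <petsc.h>

int main (int argc, char * argv[]) {
  DM             dm, odm;   /* dm: no overlap, for the remesher; odm: one layer of overlap, for the Hessian */
  MPI_Comm       comm;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) return ierr;
  comm = PETSC_COMM_WORLD;

  /* Base mesh, distributed without overlap (run with e.g. mpiexec -n 2 ... -dm_distribute) */
  ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, NULL, NULL, NULL, PETSC_TRUE, &dm);CHKERRQ(ierr);
  ierr = DMSetFromOptions(dm);CHKERRQ(ierr);
  ierr = PetscObjectSetName((PetscObject) dm, "No-overlap DM");CHKERRQ(ierr);
  ierr = DMViewFromOptions(dm, NULL, "-nooverlap_dm_view");CHKERRQ(ierr);

  /* Second, independent Plex with one layer of overlap; dm is left untouched */
  ierr = DMPlexDistributeOverlap(dm, 1, NULL, &odm);CHKERRQ(ierr);
  if (!odm) {   /* serial run: nothing to overlap */
    ierr = PetscPrintf(comm, "Run in parallel with -dm_distribute to get an overlapped DM\n");CHKERRQ(ierr);
  } else {
    /* compute the Hessian/metric here with odm's own section and local vectors,
       copy the locally owned values into a vector on dm, and pass dm's arrays to the remesher */
    ierr = PetscObjectSetName((PetscObject) odm, "Overlap DM");CHKERRQ(ierr);
    ierr = DMViewFromOptions(odm, NULL, "-overlap_dm_view");CHKERRQ(ierr);
  }

  ierr = DMDestroy(&odm);CHKERRQ(ierr);
  ierr = DMDestroy(&dm);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

The point of keeping two DMs is that each one carries its own section and local/global vectors, so the arrays handed to the remesher come straight from the non-overlapping dm and nothing has to be trimmed by hand.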
Thanks, Matt > Thanks > > -- > Nicolas > > > > Thanks, > > > > Matt > > > > Sample code: > > static char help[] = "Tests plex distribution and overlaps.\n"; > > > > #include > > > > int main (int argc, char * argv[]) { > > > > DM dm, odm; > > MPI_Comm comm; > > PetscErrorCode ierr; > > > > ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) > > return ierr; > > comm = PETSC_COMM_WORLD; > > > > ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, NULL, NULL, > > NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); > > ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > > > > > > > > ierr = PetscObjectSetName((PetscObject) dm, "DM > > before");CHKERRQ(ierr); > > ierr = DMViewFromOptions(dm, NULL, > "-before_dm_view");CHKERRQ(ierr); > > DMPlexDistributeOverlap(dm, 0, NULL, &odm); > > ierr = PetscObjectSetName((PetscObject) odm, "DM > > after");CHKERRQ(ierr); > > ierr = DMViewFromOptions(odm, NULL, > "-after_dm_view");CHKERRQ(ierr); > > > > > > ierr = DMDestroy(&dm);CHKERRQ(ierr); > > ierr = DMDestroy(&odm);CHKERRQ(ierr); > > ierr = PetscFinalize(); > > return ierr; > > } > > > > % mpiexec -n 2 ./test_overlapV3 -dm_plex_box_faces 5,5 -dm_distribute > > -before_dm_view -after_dm_view > > DM Object: DM before 2 MPI processes > > type: plex > > DM before in 2 dimensions: > > 0-cells: 21 21 > > 1-cells: 45 45 > > 2-cells: 25 25 > > Labels: > > depth: 3 strata with value/size (0 (21), 1 (45), 2 (25)) > > celltype: 3 strata with value/size (0 (21), 1 (45), 3 (25)) > > marker: 1 strata with value/size (1 (21)) > > Face Sets: 1 strata with value/size (1 (10)) > > DM Object: DM after 2 MPI processes > > type: plex > > DM after in 2 dimensions: > > 0-cells: 29 29 > > 1-cells: 65 65 > > 2-cells: 37 37 > > Labels: > > depth: 3 strata with value/size (0 (29), 1 (65), 2 (37)) > > celltype: 3 strata with value/size (0 (29), 1 (65), 3 (37)) > > marker: 1 strata with value/size (1 (27)) > > Face Sets: 1 strata with value/size (1 (13)) > > > > > > > > > > > > Thanks, > > > > > > Matt > > > > > > Thanks, > > > > > > -- > > > Nicolas > > > > > > > > > > > Thanks, > > > > > > > > Matt > > > > > > > > if (!dm) {printf("Big problem\n"); dm = odm;} > > > > else {DMDestroy(&odm);} > > > > ierr = PetscObjectSetName((PetscObject) dm, > "Initial > > > > DM");CHKERRQ(ierr); > > > > ierr = DMViewFromOptions(dm, NULL, > > > > "-initial_dm_view");CHKERRQ(ierr); > > > > > > > > > > > > ierr = DMDestroy(&dm);CHKERRQ(ierr); > > > > ierr = PetscFinalize(); > > > > return ierr; > > > > } > > > > > > > > called with mpiexec -n 2 ./test_overlapV3 > -initial_dm_view > > > > -dm_plex_box_faces 5,5 -dm_distribute > > > > > > > > gives: > > > > DM Object: Initial DM 2 MPI processes > > > > type: plex > > > > Initial DM in 2 dimensions: > > > > 0-cells: 29 29 > > > > 1-cells: 65 65 > > > > 2-cells: 37 37 > > > > Labels: > > > > depth: 3 strata with value/size (0 (29), 1 (65), 2 > > (37)) > > > > celltype: 3 strata with value/size (0 (29), 1 > > (65), 3 (37)) > > > > marker: 1 strata with value/size (1 (27)) > > > > Face Sets: 1 strata with value/size (1 (13)) > > > > > > > > which is not what I expect ? > > > > > > > > Thanks, > > > > > > > > -- > > > > Nicolas > > > > > > > > On 31/03/2021 19:02, Matthew Knepley wrote: > > > > > Alright, I think the problem had to do with keeping > > track > > > of what > > > > DM you > > > > > were looking at. 
This code increases the overlap of > an > > > initial DM: > > > > > > > > > > static char help[] = "Tests plex distribution and > > > overlaps.\n"; > > > > > > > > > > #include > > > > > > > > > > int main (int argc, char * argv[]) { > > > > > > > > > > DM dm, dm2; > > > > > PetscInt overlap; > > > > > MPI_Comm comm; > > > > > PetscErrorCode ierr; > > > > > > > > > > ierr = PetscInitialize(&argc, &argv, NULL, > > help);if (ierr) > > > > return ierr; > > > > > comm = PETSC_COMM_WORLD; > > > > > > > > > > ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, > > NULL, > > > NULL, NULL, > > > > > NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); > > > > > ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > > > > > ierr = PetscObjectSetName((PetscObject) dm, > "Initial > > > > DM");CHKERRQ(ierr); > > > > > ierr = DMViewFromOptions(dm, NULL, > > > > "-initial_dm_view");CHKERRQ(ierr); > > > > > > > > > > ierr = DMPlexGetOverlap(dm, > &overlap);CHKERRQ(ierr); > > > > > ierr = DMPlexDistributeOverlap(dm, overlap+1, > NULL, > > > > &dm2);CHKERRQ(ierr); > > > > > ierr = PetscObjectSetName((PetscObject) dm2, > > "More Overlap > > > > > DM");CHKERRQ(ierr); > > > > > ierr = DMViewFromOptions(dm2, NULL, > > > > "-over_dm_view");CHKERRQ(ierr); > > > > > > > > > > ierr = DMDestroy(&dm2);CHKERRQ(ierr); > > > > > ierr = DMDestroy(&dm);CHKERRQ(ierr); > > > > > ierr = PetscFinalize(); > > > > > return ierr; > > > > > } > > > > > > > > > > and when we run it we get the expected result > > > > > > > > > > master *:~/Downloads/tmp/Nicolas$ > > > /PETSc3/petsc/apple/bin/mpiexec > > > > -n 2 > > > > > ./test_overlap -initial_dm_view -dm_plex_box_faces > 5,5 > > > > -dm_distribute > > > > > -dm_distribute_overlap 1 -over_dm_view > > > > > DM Object: Initial DM 2 MPI processes > > > > > type: plex > > > > > Initial DM in 2 dimensions: > > > > > 0-cells: 29 29 > > > > > 1-cells: 65 65 > > > > > 2-cells: 37 37 > > > > > Labels: > > > > > depth: 3 strata with value/size (0 (29), 1 (65), > > 2 (37)) > > > > > celltype: 3 strata with value/size (0 (29), 1 > > (65), 3 (37)) > > > > > marker: 1 strata with value/size (1 (27)) > > > > > Face Sets: 1 strata with value/size (1 (13)) > > > > > DM Object: More Overlap DM 2 MPI processes > > > > > type: plex > > > > > More Overlap DM in 2 dimensions: > > > > > 0-cells: 36 36 > > > > > 1-cells: 85 85 > > > > > 2-cells: 50 50 > > > > > Labels: > > > > > depth: 3 strata with value/size (0 (36), 1 (85), > > 2 (50)) > > > > > celltype: 3 strata with value/size (0 (36), 1 > > (85), 3 (50)) > > > > > marker: 1 strata with value/size (1 (40)) > > > > > Face Sets: 1 strata with value/size (1 (20)) > > > > > > > > > > Thanks, > > > > > > > > > > Matt > > > > > > > > > > On Wed, Mar 31, 2021 at 12:57 PM Matthew Knepley > > > > > > > > > > > > >> > > > > > > > > > > > > > >>>> wrote: > > > > > > > > > > Okay, let me show a really simple example that > > gives > > > the expected > > > > > result before I figure out what is going wrong > for > > > you. 
This code > > > > > > > > > > static char help[] = "Tests plex distribution > and > > > overlaps.\n"; > > > > > > > > > > #include > > > > > > > > > > int main (int argc, char * argv[]) { > > > > > DM dm; > > > > > MPI_Comm comm; > > > > > PetscErrorCode ierr; > > > > > > > > > > ierr = PetscInitialize(&argc, &argv, NULL, > > help);if > > > (ierr) > > > > return > > > > > ierr; > > > > > comm = PETSC_COMM_WORLD; > > > > > > > > > > ierr = DMPlexCreateBoxMesh(comm, 2, > > PETSC_TRUE, NULL, > > > > NULL, NULL, > > > > > NULL, PETSC_TRUE, &dm);CHKERRQ(ierr); > > > > > ierr = DMSetFromOptions(dm);CHKERRQ(ierr); > > > > > ierr = PetscObjectSetName((PetscObject) dm, > > "Initial > > > > > DM");CHKERRQ(ierr); > > > > > ierr = DMViewFromOptions(dm, NULL, > > > > "-initial_dm_view");CHKERRQ(ierr); > > > > > ierr = DMDestroy(&dm);CHKERRQ(ierr); > > > > > ierr = PetscFinalize(); > > > > > return ierr; > > > > > } > > > > > > > > > > can do all the overlap tests. For example, you > > can run > > > it naively > > > > > and get a serial mesh > > > > > > > > > > master *:~/Downloads/tmp/Nicolas$ > > > > /PETSc3/petsc/apple/bin/mpiexec -n > > > > > 2 ./test_overlap -initial_dm_view > > -dm_plex_box_faces 5,5 > > > > > DM Object: Initial DM 2 MPI processes > > > > > type: plex > > > > > Initial DM in 2 dimensions: > > > > > 0-cells: 36 0 > > > > > 1-cells: 85 0 > > > > > 2-cells: 50 0 > > > > > Labels: > > > > > celltype: 3 strata with value/size (0 (36), > > 3 (50), > > > 1 (85)) > > > > > depth: 3 strata with value/size (0 (36), 1 > > (85), 2 > > > (50)) > > > > > marker: 1 strata with value/size (1 (40)) > > > > > Face Sets: 1 strata with value/size (1 (20)) > > > > > > > > > > Then run it telling Plex to distribute after > > creating > > > the mesh > > > > > > > > > > master *:~/Downloads/tmp/Nicolas$ > > > > /PETSc3/petsc/apple/bin/mpiexec -n > > > > > 2 ./test_overlap -initial_dm_view > > -dm_plex_box_faces 5,5 > > > > -dm_distribute > > > > > DM Object: Initial DM 2 MPI processes > > > > > type: plex > > > > > Initial DM in 2 dimensions: > > > > > 0-cells: 21 21 > > > > > 1-cells: 45 45 > > > > > 2-cells: 25 25 > > > > > Labels: > > > > > depth: 3 strata with value/size (0 (21), 1 > > (45), 2 > > > (25)) > > > > > celltype: 3 strata with value/size (0 (21), > > 1 (45), > > > 3 (25)) > > > > > marker: 1 strata with value/size (1 (21)) > > > > > Face Sets: 1 strata with value/size (1 (10)) > > > > > > > > > > The get the same thing back with overlap = 0 > > > > > > > > > > master *:~/Downloads/tmp/Nicolas$ > > > > /PETSc3/petsc/apple/bin/mpiexec -n > > > > > 2 ./test_overlap -initial_dm_view > > -dm_plex_box_faces 5,5 > > > > > -dm_distribute -dm_distribute_overlap 0 > > > > > DM Object: Initial DM 2 MPI processes > > > > > type: plex > > > > > Initial DM in 2 dimensions: > > > > > 0-cells: 21 21 > > > > > 1-cells: 45 45 > > > > > 2-cells: 25 25 > > > > > Labels: > > > > > depth: 3 strata with value/size (0 (21), 1 > > (45), 2 > > > (25)) > > > > > celltype: 3 strata with value/size (0 (21), > > 1 (45), > > > 3 (25)) > > > > > marker: 1 strata with value/size (1 (21)) > > > > > Face Sets: 1 strata with value/size (1 (10)) > > > > > > > > > > and get larger local meshes with overlap = 1 > > > > > > > > > > master *:~/Downloads/tmp/Nicolas$ > > > > /PETSc3/petsc/apple/bin/mpiexec -n > > > > > 2 ./test_overlap -initial_dm_view > > -dm_plex_box_faces 5,5 > > > > > -dm_distribute -dm_distribute_overlap 1 > > > > > DM Object: Initial DM 2 MPI processes > > > > > type: plex > > > > > Initial DM in 2 
dimensions: > > > > > 0-cells: 29 29 > > > > > 1-cells: 65 65 > > > > > 2-cells: 37 37 > > > > > Labels: > > > > > depth: 3 strata with value/size (0 (29), 1 > > (65), 2 > > > (37)) > > > > > celltype: 3 strata with value/size (0 (29), > > 1 (65), > > > 3 (37)) > > > > > marker: 1 strata with value/size (1 (27)) > > > > > Face Sets: 1 strata with value/size (1 (13)) > > > > > > > > > > Thanks, > > > > > > > > > > Matt > > > > > > > > > > On Wed, Mar 31, 2021 at 12:22 PM Nicolas Barral > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > >>>> wrote: > > > > > > > > > > > > > > > > > > > > @+ > > > > > > > > > > -- > > > > > Nicolas > > > > > > > > > > On 31/03/2021 17:51, Matthew Knepley wrote: > > > > > > On Sat, Mar 27, 2021 at 9:27 AM Nicolas > > Barral > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > >>>>> wrote: > > > > > > > > > > > > Hi all, > > > > > > > > > > > > First, I'm not sure I understand > > what the > > > overlap > > > > > parameter in > > > > > > DMPlexDistributeOverlap does. I > > tried the > > > following: > > > > > generate a small > > > > > > mesh on 1 rank with > > DMPlexCreateBoxMesh, then > > > > distribute > > > > > it with > > > > > > DMPlexDistribute. At this point I > > have two nice > > > > > partitions, with shared > > > > > > vertices and no overlapping cells. > > Then I call > > > > > DMPlexDistributeOverlap > > > > > > with the overlap parameter set to 0 > > or 1, > > > and get the > > > > > same resulting > > > > > > plex in both cases. Why is that ? > > > > > > > > > > > > > > > > > > The overlap parameter says how many cell > > > adjacencies to go > > > > > out. You > > > > > > should not get the same > > > > > > mesh out. We have lots of examples that > use > > > this. If > > > > you send > > > > > your small > > > > > > example, I can probably > > > > > > tell you what is happening. > > > > > > > > > > > > > > > > Ok so I do have a small example on that and > the > > > DMClone > > > > thing I > > > > > set up > > > > > to understand! I attach it to the email. > > > > > > > > > > For the overlap, you can change the overlap > > > constant at > > > > the top > > > > > of the > > > > > file. With OVERLAP=0 or 1, the distributed > > > overlapping mesh > > > > > (shown using > > > > > -over_dm_view, it's DMover) are the same, > and > > > different > > > > from the > > > > > mesh > > > > > before distributing the overlap (shown using > > > > -distrib_dm_view). For > > > > > larger overlap values they're different. > > > > > > > > > > The process is: > > > > > 1/ create a DM dm on 1 rank > > > > > 2/ clone dm into dm2 > > > > > 3/ distribute dm > > > > > 4/ clone dm into dm3 > > > > > 5/ distribute dm overlap > > > > > > > > > > I print all the DMs after each step. dm has > a > > > distributed > > > > > overlap, dm2 > > > > > is not distributed, dm3 is distributed but > > without > > > > overlap. Since > > > > > distribute and distributeOverlap create new > > DMs, I > > > don't seem > > > > > have a > > > > > problem with the shallow copies. > > > > > > > > > > > > > > > > Second, I'm wondering what would be > > a good > > > way to > > > > handle > > > > > two overlaps > > > > > > and associated local vectors. 
In my > > adaptation > > > > code, the > > > > > remeshing > > > > > > library requires a non-overlapping > mesh, > > > while the > > > > > refinement criterion > > > > > > computation is based on hessian > > > computations, which > > > > > require a layer of > > > > > > overlap. What I can do is clone the > > dm before > > > > > distributing the overlap, > > > > > > then manage two independent plex > > objects with > > > > their own > > > > > local sections > > > > > > etc. and copy/trim local vectors > > manually. > > > Is there a > > > > > more automatic > > > > > > way > > > > > > to do this ? > > > > > > > > > > > > > > > > > > DMClone() is a shallow copy, so that > > will not work. > > > > You would > > > > > maintain > > > > > > two different Plexes, overlapping > > > > > > and non-overlapping, with their own > sections > > > and vecs. Are > > > > > you sure you > > > > > > need to keep around the non-overlapping > one? > > > > > > Maybe if I understood what operations > > you want > > > to work, I > > > > > could say > > > > > > something more definitive. > > > > > > > > > > > I need to be able to pass the > > non-overlapping mesh > > > to the > > > > > remesher. I > > > > > can either maintain 2 plexes, or trim the > > overlapping > > > > plex when > > > > > I create > > > > > the arrays I give to the remesher. I'm not > sure > > > which is the > > > > > best/worst ? > > > > > > > > > > Thanks > > > > > > > > > > -- > > > > > Nicolas > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > Matt > > > > > > > > > > > > Thanks > > > > > > > > > > > > -- > > > > > > Nicolas > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > What most experimenters take for granted > > before > > > they > > > > begin their > > > > > > experiments is infinitely more > > interesting than any > > > > results > > > > > to which > > > > > > their experiments lead. > > > > > > -- Norbert Wiener > > > > > > > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > What most experimenters take for granted before > > they > > > begin their > > > > > experiments is infinitely more interesting than > any > > > results > > > > to which > > > > > their experiments lead. > > > > > -- Norbert Wiener > > > > > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > What most experimenters take for granted before > > they begin > > > their > > > > > experiments is infinitely more interesting than any > > > results to which > > > > > their experiments lead. > > > > > -- Norbert Wiener > > > > > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > > > > > > > > > > -- > > > > What most experimenters take for granted before they begin > > their > > > > experiments is infinitely more interesting than any > > results to which > > > > their experiments lead. > > > > -- Norbert Wiener > > > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > > > experiments is infinitely more interesting than any results to > which > > > their experiments lead. > > > -- Norbert Wiener > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. 
From nicolas.barral at math.u-bordeaux.fr  Wed Mar 31 15:00:46 2021
From: nicolas.barral at math.u-bordeaux.fr (Nicolas Barral)
Date: Wed, 31 Mar 2021 22:00:46 +0200
Subject: [petsc-users] DMPlex overlap
In-Reply-To: 
References: <0f529250-4af3-c5c2-9b9f-8e272ada24e9@math.u-bordeaux.fr>
 <1c9c519d-4d94-66e1-844c-329fb58d3e94@math.u-bordeaux.fr>
 <3d847dd6-f0c6-ff9c-d287-0b56538b2610@math.u-bordeaux.fr>
Message-ID: 

On 31/03/2021 21:44, Matthew Knepley wrote:
> On Wed, Mar 31, 2021 at 3:36 PM Nicolas Barral
> <nicolas.barral at math.u-bordeaux.fr> wrote:
>
> >     On 31/03/2021 21:24, Matthew Knepley wrote:
> >
> >     Ok so that's worth clarifying, but my problem is that dm has a
> >     0-overlap before DMPlexDistributeOverlap and odm has a 1-overlap even
> >     though I passed "0". If I understand what you just wrote, adding "0"
> >     more levels of overlap to 0-overlap, that should still be 0 overlap ?
> >
> >     Yet when I look at the code of DMPlexDistributeOverlap, you're
> >     flagging points to be added to the overlap whatever "k" is.
>
> Crap. There is a bug handling 0. Evidently, no one ever asks for overlap 0.
> Will fix.
>
ok now I'm sure I understand what you explained :) Thanks Matt.

> >     Now we can look at the other question: I need to be able to pass the
> >     non-overlapping mesh to the remesher. I can either maintain 2 plexes,
> >     or trim the overlapping plex when I create the arrays I pass to the
> >     remesher. I'm not sure which is the best/worst ?
>
> I would start with two plexes since it is very easy.
>
I'll try that then, hopefully it should keep me busy for a while before
the next round of questions. Thanks Matt!

-- 
Nicolas

> Trimming the plex would essentially make another plex anyway, but it is
> not hard using DMPlexFilter().
>
>    Thanks,
>
>       Matt
>
> >     Sample code:
> >
> >     static char help[] = "Tests plex distribution and overlaps.\n";
> >
> >     #include <petscdmplex.h>
> >
> >     int main (int argc, char * argv[]) {
> >
> >         DM             dm, odm;
> >         MPI_Comm       comm;
> >         PetscErrorCode ierr;
> >
> >         ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) return ierr;
> >         comm = PETSC_COMM_WORLD;
> >
> >         ierr = DMPlexCreateBoxMesh(comm, 2, PETSC_TRUE, NULL, NULL, NULL,
> >                                    NULL, PETSC_TRUE, &dm);CHKERRQ(ierr);
> >         ierr = DMSetFromOptions(dm);CHKERRQ(ierr);
> >
> >         ierr = PetscObjectSetName((PetscObject) dm, "DM before");CHKERRQ(ierr);
> >         ierr = DMViewFromOptions(dm, NULL, "-before_dm_view");CHKERRQ(ierr);
> >         DMPlexDistributeOverlap(dm, 0, NULL, &odm);
> >         ierr = PetscObjectSetName((PetscObject) odm, "DM after");CHKERRQ(ierr);
> >         ierr = DMViewFromOptions(odm, NULL, "-after_dm_view");CHKERRQ(ierr);
> >
> >         ierr = DMDestroy(&dm);CHKERRQ(ierr);
> >         ierr = DMDestroy(&odm);CHKERRQ(ierr);
> >         ierr = PetscFinalize();
> >         return ierr;
> >     }
> >
> >     % mpiexec -n 2 ./test_overlapV3 -dm_plex_box_faces 5,5 -dm_distribute
> >       -before_dm_view -after_dm_view
> >     DM Object: DM before 2 MPI processes
> >       type: plex
> >     DM before in 2 dimensions:
> >       0-cells: 21 21
> >       1-cells: 45 45
> >       2-cells: 25 25
> >     Labels:
> >       depth: 3 strata with value/size (0 (21), 1 (45), 2 (25))
> >       celltype: 3 strata with value/size (0 (21), 1 (45), 3 (25))
> >       marker: 1 strata with value/size (1 (21))
> >       Face Sets: 1 strata with value/size (1 (10))
> >     DM Object: DM after 2 MPI processes
> >       type: plex
> >     DM after in 2 dimensions:
> >       0-cells: 29 29
> >       1-cells: 65 65
> >       2-cells: 37 37
> >     Labels:
> >       depth: 3 strata with value/size (0 (29), 1 (65), 2 (37))
> >       celltype: 3 strata with value/size (0 (29), 1 (65), 3 (37))
> >       marker: 1 strata with value/size (1 (27))
> >       Face Sets: 1 strata with value/size (1 (13))
> >
> >     Thanks
> >
> >     --
> >     Nicolas

From knepley at gmail.com  Wed Mar 31 15:28:30 2021
From: knepley at gmail.com (Matthew Knepley)
Date: Wed, 31 Mar 2021 16:28:30 -0400
Subject: [petsc-users] DMPlex overlap
In-Reply-To: 
References: <0f529250-4af3-c5c2-9b9f-8e272ada24e9@math.u-bordeaux.fr>
 <1c9c519d-4d94-66e1-844c-329fb58d3e94@math.u-bordeaux.fr>
 <3d847dd6-f0c6-ff9c-d287-0b56538b2610@math.u-bordeaux.fr>
Message-ID: 

On Wed, Mar 31, 2021 at 4:00 PM Nicolas Barral
<nicolas.barral at math.u-bordeaux.fr> wrote:

> On 31/03/2021 21:44, Matthew Knepley wrote:
> > On Wed, Mar 31, 2021 at 3:36 PM Nicolas Barral
> > <nicolas.barral at math.u-bordeaux.fr> wrote:
> >
> > >     On 31/03/2021 21:24, Matthew Knepley wrote:
> > >
> > >     Ok so that's worth clarifying, but my problem is that dm has a
> > >     0-overlap before DMPlexDistributeOverlap and odm has a 1-overlap
> > >     even though I passed "0". If I understand what you just wrote,
> > >     adding "0" more levels of overlap to 0-overlap, that should still
> > >     be 0 overlap ?
> > >
> > >     Yet when I look at the code of DMPlexDistributeOverlap, you're
> > >     flagging points to be added to the overlap whatever "k" is.
> >
> > Crap. There is a bug handling 0. Evidently, no one ever asks for
> > overlap 0. Will fix.
> >
> ok now I'm sure I understand what you explained :) Thanks Matt.
>
> > >     Now we can look at the other question: I need to be able to pass
> > >     the non-overlapping mesh to the remesher. I can either maintain 2
> > >     plexes, or trim the overlapping plex when I create the arrays I
> > >     pass to the remesher. I'm not sure which is the best/worst ?
> >
> > I would start with two plexes since it is very easy.
> >
> I'll try that then, hopefully it should keep me busy for a while before
> the next round of questions. Thanks Matt!
>
Excellent. I have your fix in: https://gitlab.com/petsc/petsc/-/merge_requests/3796

  Thanks,

     Matt

> --
> Nicolas
>
> > Trimming the plex would essentially make another plex anyway, but it is
> > not hard using DMPlexFilter().
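For the trimming route mentioned just above, here is a rough sketch of one way DMPlexFilter() could be used. It is an illustration, not code from this thread: it assumes that after DMPlexDistributeOverlap() the overlap (ghost) cells are exactly the cells that arrive as leaves of the point SF, it uses the four-argument DMPlexFilter() signature of the PETSc releases current at the time (check the manual page for your version), and the names TrimOverlap, odm and keep are made up.

    #include <petscdmplex.h>

    /* Sketch: build a label that keeps only locally owned cells and filter
       the overlapping DM down to a non-overlapping one. */
    static PetscErrorCode TrimOverlap(DM odm, DM *dmTrimmed)
    {
      DMLabel         keep;
      PetscSF         sf;
      const PetscInt *ilocal;
      PetscInt        nleaves, cStart, cEnd, c, l;
      PetscErrorCode  ierr;

      PetscFunctionBeginUser;
      ierr = DMLabelCreate(PETSC_COMM_SELF, "keep", &keep);CHKERRQ(ierr);
      ierr = DMPlexGetHeightStratum(odm, 0, &cStart, &cEnd);CHKERRQ(ierr);
      for (c = cStart; c < cEnd; ++c) {ierr = DMLabelSetValue(keep, c, 1);CHKERRQ(ierr);}
      /* Cells that are leaves of the point SF were received from another rank,
         i.e. they form the overlap: unmark them. */
      ierr = DMGetPointSF(odm, &sf);CHKERRQ(ierr);
      ierr = PetscSFGetGraph(sf, NULL, &nleaves, &ilocal, NULL);CHKERRQ(ierr);
      for (l = 0; l < nleaves; ++l) {
        const PetscInt p = ilocal ? ilocal[l] : l;
        if (p >= cStart && p < cEnd) {ierr = DMLabelSetValue(keep, p, 0);CHKERRQ(ierr);}
      }
      ierr = DMPlexFilter(odm, keep, 1, dmTrimmed);CHKERRQ(ierr);
      ierr = DMLabelDestroy(&keep);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }

As the reply above says, filtering essentially builds another Plex anyway, so keeping two Plexes from the start is the simpler option.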
> >
> >    Thanks,
> >
> >       Matt

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/
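As a small companion to the thread, here is a sketch of how the composition of overlap levels can be checked directly, by querying DMPlexGetOverlap() before and after a call to DMPlexDistributeOverlap(). Only calls that already appear in the thread are used; the program and binary name (check_overlap), the mesh size, and the added overlap of 1 are illustrative, and the 0-overlap case is exactly what the merge request above addresses.

    #include <petscdmplex.h>

    int main(int argc, char **argv)
    {
      DM             dm, odm;
      PetscInt       before, after;
      PetscErrorCode ierr;

      ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
      ierr = DMPlexCreateBoxMesh(PETSC_COMM_WORLD, 2, PETSC_TRUE, NULL, NULL, NULL,
                                 NULL, PETSC_TRUE, &dm);CHKERRQ(ierr);
      ierr = DMSetFromOptions(dm);CHKERRQ(ierr);          /* honours -dm_distribute */
      ierr = DMPlexGetOverlap(dm, &before);CHKERRQ(ierr);
      /* Adds one more layer of cell overlap on top of whatever dm already has */
      ierr = DMPlexDistributeOverlap(dm, 1, NULL, &odm);CHKERRQ(ierr);
      if (odm) {
        ierr = DMPlexGetOverlap(odm, &after);CHKERRQ(ierr);
        ierr = PetscPrintf(PETSC_COMM_WORLD, "overlap before %D, after %D\n",
                           before, after);CHKERRQ(ierr);
        ierr = DMDestroy(&odm);CHKERRQ(ierr);
      }
      ierr = DMDestroy(&dm);CHKERRQ(ierr);
      ierr = PetscFinalize();
      return ierr;
    }

Run, for example, as: mpiexec -n 2 ./check_overlap -dm_plex_box_faces 5,5 -dm_distribute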