From ksi2443 at gmail.com Thu Dec 1 01:31:45 2022
From: ksi2443 at gmail.com (김성익)
Date: Thu, 1 Dec 2022 16:31:45 +0900
Subject: [petsc-users] Speed of matrix assembly
Message-ID:

Hello,

I set up a large global matrix with MatSetSizes and then assemble multiple
local-sized matrices into that global matrix (by using MatSetValue).

However, when I only use MatSetSizes, the assembly performance is much lower
than expected.
In this case, what can be done to speed up assembly?

Thanks,
Hyung Kim
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From knepley at gmail.com Thu Dec 1 07:23:27 2022
From: knepley at gmail.com (Matthew Knepley)
Date: Thu, 1 Dec 2022 08:23:27 -0500
Subject: [petsc-users] Speed of matrix assembly
In-Reply-To:
References:
Message-ID:

On Thu, Dec 1, 2022 at 2:32 AM 김성익 wrote:

> Hello,
>
> I set up a large global matrix with MatSetSizes and then assemble multiple
> local-sized matrices into that global matrix (by using MatSetValue).
>
> However, when I only use MatSetSizes, the assembly performance is much
> lower than expected.
> In this case, what can be done to speed up assembly?
>

https://petsc.org/main/faq/#assembling-large-sparse-matrices-takes-a-long-time-what-can-i-do-to-make-this-process-faster-or-matsetvalues-is-so-slow-what-can-i-do-to-speed-it-up

  Thanks,

     Matt

> Thanks,
> Hyung Kim
>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mfadams at lbl.gov Thu Dec 1 07:23:46 2022
From: mfadams at lbl.gov (Mark Adams)
Date: Thu, 1 Dec 2022 08:23:46 -0500
Subject: [petsc-users] Speed of matrix assembly
In-Reply-To:
References:
Message-ID:

This is the recommended call sequence to create a matrix:

  PetscCall(MatCreate(comm, &A));
  PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, N, N));
  PetscCall(MatSetFromOptions(A));
  PetscCall(MatSeqAIJSetPreallocation(A, 3, NULL));
  PetscCall(MatMPIAIJSetPreallocation(A, 3, NULL, 2, NULL));

You need to look at
https://petsc.org/main/docs/manualpages/Mat/MatMPIAIJSetPreallocation
and set the preallocation correctly.

Note, you can run with -info (very noisy, so filter with grep or something)
and look for messages about the number of mallocs in the matrix assembly.
You want to preallocate enough to have zero mallocs.

Mark

On Thu, Dec 1, 2022 at 2:32 AM 김성익 wrote:

> Hello,
>
> I set up a large global matrix with MatSetSizes and then assemble multiple
> local-sized matrices into that global matrix (by using MatSetValue).
>
> However, when I only use MatSetSizes, the assembly performance is much
> lower than expected.
> In this case, what can be done to speed up assembly?
>
> Thanks,
> Hyung Kim
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From leejaku at gmail.com Thu Dec 1 08:42:29 2022
From: leejaku at gmail.com (Jake Lee)
Date: Thu, 1 Dec 2022 23:42:29 +0900
Subject: [petsc-users] Compile error on petsc-3.17
Message-ID:

Dear Petsc Project Manager,

I am Jake Lee with the Advanced Institute of Convergence Technology in
South Korea.
I'm programming with Petsc-3.17 and I encountered the following two errors.
PETSC_FUNCTION_NAME_CXX and PETSC_RESTRICT macros are not defined?

1. ./petsc-3.17.0/include/petscmacros.h:10:31: error: ?PETSC_FUNCTION_NAME_CXX?
was not declared in this scope; did you mean ?PETSC_FUNCTION_NAME_C?? 10 | # define PETSC_FUNCTION_NAME PETSC_FUNCTION_NAME_CXX 2. In file included from ./slepc-3.17.0/include/slepcsys.h:18, from ./slepc-3.17.0/include/slepcst.h:16, from ./slepc-3.17.0/include/slepceps.h:16, from main.cpp:15: ./petsc-3.17.0/include/petscsys.h:2548:108: error: expected ?,? or ?...? before ?*? token 2548 | static inline PetscErrorCode PetscSegBufferGetInts(PetscSegBuffer seg,size_t count,PetscInt *PETSC_RESTRICT*slot) {return PetscSegBufferGet(seg,count,(void**)slot);} | ^ ./petsc-3.17.0/include/petscsys.h: In function ?PetscErrorCode PetscSegBufferGetInts(PetscSegBuffer, size_t, PetscInt*)?: ./petsc-3.17.0/include/petscsys.h:2548:159: error: ?slot? was not declared in this scope 2548 | rrorCode PetscSegBufferGetInts(PetscSegBuffer seg,size_t count,PetscInt *PETSC_RESTRICT*slot) {return PetscSegBufferGet(seg,count,(void**)slot);} | ^~~~ It seems to be a simple problem. Can I get some hints for them? My environment is... - CentOS 7 - gcc 9.3.1 - compile option g++ -std=c++17 -I./eigen-3.4.0 -I./boost_1_77_0 -I./petsc-3.17.0/include -I./petsc-3.17.0/arch-linux-c-debug/include -I./slepc-3.17.0/include -I./slepc-3.17.0/arch-linux-c-debug/include -c main.cpp Thank you! Jake Lee (???) -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Dec 1 09:06:43 2022 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 1 Dec 2022 10:06:43 -0500 Subject: [petsc-users] Compile error on petsc-3.17 In-Reply-To: References: Message-ID: On Thu, Dec 1, 2022 at 9:53 AM Jake Lee wrote: > Dear Petsc Project Manager, > > I am Jake Lee with the Advanced Institute of Convergence Technology in > South Korea. > I'm programming with Petsc-3.17 and I encountered the following two > errors. > PETSC_FUNCTION_NAME_CXX and PETSC_RESTRICT macros are not defined? > > 1. ./petsc-3.17.0/include/petscmacros.h:10:31: error: > ?PETSC_FUNCTION_NAME_CXX? was not declared in this scope; did you mean > ?PETSC_FUNCTION_NAME_C?? > 10 | # define PETSC_FUNCTION_NAME PETSC_FUNCTION_NAME_CXX > > 2. In file included from ./slepc-3.17.0/include/slepcsys.h:18, > from ./slepc-3.17.0/include/slepcst.h:16, > from ./slepc-3.17.0/include/slepceps.h:16, > from main.cpp:15: > ./petsc-3.17.0/include/petscsys.h:2548:108: error: expected ?,? or ?...? > before ?*? token > 2548 | static inline PetscErrorCode PetscSegBufferGetInts(PetscSegBuffer > seg,size_t count,PetscInt *PETSC_RESTRICT*slot) {return > PetscSegBufferGet(seg,count,(void**)slot);} > | > ^ > ./petsc-3.17.0/include/petscsys.h: In function ?PetscErrorCode > PetscSegBufferGetInts(PetscSegBuffer, size_t, PetscInt*)?: > ./petsc-3.17.0/include/petscsys.h:2548:159: error: ?slot? was not declared > in this scope > 2548 | rrorCode PetscSegBufferGetInts(PetscSegBuffer seg,size_t > count,PetscInt *PETSC_RESTRICT*slot) {return > PetscSegBufferGet(seg,count,(void**)slot);} > | > ^~~~ > > 1. PETSC_FUNCTION_NAME_CXX is defined in petscconf.h which is created by configure. Perhaps you need to rerun your configure if you updated from an earlier version. You can check in petsc-3.17.0/arch-linux-c-debug/include/petscconf.h for the macro 2. The PETSC_RESTRICT is a knock-on error from the last error. Thanks, Matt > It seems to be a simple problem. Can I get some hints for them? > > My environment is... 
> - CentOS 7 > - gcc 9.3.1 > - compile option > g++ -std=c++17 -I./eigen-3.4.0 -I./boost_1_77_0 -I./petsc-3.17.0/include > -I./petsc-3.17.0/arch-linux-c-debug/include -I./slepc-3.17.0/include > -I./slepc-3.17.0/arch-linux-c-debug/include -c main.cpp > > Thank you! > > Jake Lee (???) > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Dec 1 09:22:08 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 1 Dec 2022 09:22:08 -0600 (CST) Subject: [petsc-users] Compile error on petsc-3.17 In-Reply-To: References: Message-ID: <282e770e-82d4-9f8e-6548-986c7d7786ea@mcs.anl.gov> Do you get these errors when you compile petsc/slepc examples - with corresponding makefiles? Likely the issue is the difference in g++ compiler options [between these example builds - and your build] Satish On Thu, 1 Dec 2022, Jake Lee wrote: > Dear Petsc Project Manager, > > I am Jake Lee with the Advanced Institute of Convergence Technology in > South Korea. > I'm programming with Petsc-3.17 and I encountered the following two errors. > PETSC_FUNCTION_NAME_CXX and PETSC_RESTRICT macros are not defined? > > 1. ./petsc-3.17.0/include/petscmacros.h:10:31: error: > ?PETSC_FUNCTION_NAME_CXX? was not declared in this scope; did you mean > ?PETSC_FUNCTION_NAME_C?? > 10 | # define PETSC_FUNCTION_NAME PETSC_FUNCTION_NAME_CXX > > 2. In file included from ./slepc-3.17.0/include/slepcsys.h:18, > from ./slepc-3.17.0/include/slepcst.h:16, > from ./slepc-3.17.0/include/slepceps.h:16, > from main.cpp:15: > ./petsc-3.17.0/include/petscsys.h:2548:108: error: expected ?,? or ?...? > before ?*? token > 2548 | static inline PetscErrorCode PetscSegBufferGetInts(PetscSegBuffer > seg,size_t count,PetscInt *PETSC_RESTRICT*slot) {return > PetscSegBufferGet(seg,count,(void**)slot);} > | > ^ > ./petsc-3.17.0/include/petscsys.h: In function ?PetscErrorCode > PetscSegBufferGetInts(PetscSegBuffer, size_t, PetscInt*)?: > ./petsc-3.17.0/include/petscsys.h:2548:159: error: ?slot? was not declared > in this scope > 2548 | rrorCode PetscSegBufferGetInts(PetscSegBuffer seg,size_t > count,PetscInt *PETSC_RESTRICT*slot) {return > PetscSegBufferGet(seg,count,(void**)slot);} > | > ^~~~ > > > It seems to be a simple problem. Can I get some hints for them? > > My environment is... > - CentOS 7 > - gcc 9.3.1 > - compile option > g++ -std=c++17 -I./eigen-3.4.0 -I./boost_1_77_0 -I./petsc-3.17.0/include > -I./petsc-3.17.0/arch-linux-c-debug/include -I./slepc-3.17.0/include > -I./slepc-3.17.0/arch-linux-c-debug/include -c main.cpp > > Thank you! > > Jake Lee (???) > From leejaku at gmail.com Thu Dec 1 10:03:53 2022 From: leejaku at gmail.com (Jake Lee) Date: Fri, 2 Dec 2022 01:03:53 +0900 Subject: [petsc-users] Compile error on petsc-3.17 In-Reply-To: References: Message-ID: Dear Matt, It was my "configure" mistake. I was set --with-cxx=0 not --with-cxx=g++. This issue is solved now. Thank you very much. Jake Lee (???) 2022? 12? 2? (?) ?? 12:06, Matthew Knepley ?? ??: > On Thu, Dec 1, 2022 at 9:53 AM Jake Lee wrote: > >> Dear Petsc Project Manager, >> >> I am Jake Lee with the Advanced Institute of Convergence Technology in >> South Korea. >> I'm programming with Petsc-3.17 and I encountered the following two >> errors. >> PETSC_FUNCTION_NAME_CXX and PETSC_RESTRICT macros are not defined? 
>> >> 1. ./petsc-3.17.0/include/petscmacros.h:10:31: error: >> ?PETSC_FUNCTION_NAME_CXX? was not declared in this scope; did you mean >> ?PETSC_FUNCTION_NAME_C?? >> 10 | # define PETSC_FUNCTION_NAME PETSC_FUNCTION_NAME_CXX >> >> 2. In file included from ./slepc-3.17.0/include/slepcsys.h:18, >> from ./slepc-3.17.0/include/slepcst.h:16, >> from ./slepc-3.17.0/include/slepceps.h:16, >> from main.cpp:15: >> ./petsc-3.17.0/include/petscsys.h:2548:108: error: expected ?,? or ?...? >> before ?*? token >> 2548 | static inline PetscErrorCode PetscSegBufferGetInts(PetscSegBuffer >> seg,size_t count,PetscInt *PETSC_RESTRICT*slot) {return >> PetscSegBufferGet(seg,count,(void**)slot);} >> | >> ^ >> ./petsc-3.17.0/include/petscsys.h: In function ?PetscErrorCode >> PetscSegBufferGetInts(PetscSegBuffer, size_t, PetscInt*)?: >> ./petsc-3.17.0/include/petscsys.h:2548:159: error: ?slot? was not >> declared in this scope >> 2548 | rrorCode PetscSegBufferGetInts(PetscSegBuffer seg,size_t >> count,PetscInt *PETSC_RESTRICT*slot) {return >> PetscSegBufferGet(seg,count,(void**)slot);} >> | >> ^~~~ >> >> > 1. PETSC_FUNCTION_NAME_CXX is defined in petscconf.h which is created by > configure. Perhaps you need to rerun your configure > if you updated from an earlier version. You can check in > > petsc-3.17.0/arch-linux-c-debug/include/petscconf.h > > for the macro > > 2. The PETSC_RESTRICT is a knock-on error from the last error. > > Thanks, > > Matt > > >> It seems to be a simple problem. Can I get some hints for them? >> >> My environment is... >> - CentOS 7 >> - gcc 9.3.1 >> - compile option >> g++ -std=c++17 -I./eigen-3.4.0 -I./boost_1_77_0 >> -I./petsc-3.17.0/include -I./petsc-3.17.0/arch-linux-c-debug/include >> -I./slepc-3.17.0/include -I./slepc-3.17.0/arch-linux-c-debug/include -c >> main.cpp >> >> Thank you! >> >> Jake Lee (???) >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Dec 1 11:55:08 2022 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 1 Dec 2022 12:55:08 -0500 Subject: [petsc-users] Report Bug TaoALMM class In-Reply-To: References: <4eec06f9-d534-7a02-9abe-6d1415f663f0@math.tu-freiberg.de> <14f2cdd6-9cbe-20a6-0c7d-3006b2ee4dc1@math.tu-freiberg.de> <5E53FE56-5C68-4F06-8A48-54ACBDC800C7@petsc.dev> <892a51c2-17f7-ac1f-f55d-05981978a4f4@math.tu-freiberg.de> <2E5E8937-D739-4CFB-9A21-28DFBF5791B5@petsc.dev> Message-ID: Stephan, The Tao author has kindly fixed the bug in the code you found in https://gitlab.com/petsc/petsc/-/merge_requests/5890 Please let us know if this works and resolves your problem. Thanks Barry > On Nov 22, 2022, at 4:03 AM, Stephan K?hler wrote: > > Yeah, I also read this. But for me as a reader "highly recommended" does not mean that Armijo does not work correct with ALMM :). > > And I think that Armijo is easier than trust-region in the sense that if Armijo does not work, than I did something wrong in the Hessian or the gradient. In trust-region methods, there are more parameters and maybe I set one of them wrong or something like that. > At least, this is my view. But I'm also not an expert on trust-region methods. > > On 12.11.22 06:00, Barry Smith wrote: >> >> I noticed this in the TAOALMM manual page. 
>> >> It is also highly recommended that the subsolver chosen by the user utilize a trust-region >> strategy for globalization (default: TAOBQNKTR) especially if the outer problem features bound constraints. >> >> I am far from an expert on these topics. >> >>> On Nov 4, 2022, at 7:43 AM, Stephan K?hler wrote: >>> >>> Barry, >>> >>> this is a nonartificial code. This is a problem in the ALMM subsolver. I want to solve a problem with a TaoALMM solver what then happens is: >>> >>> TaoSolve(tao) /* TaoALMM solver */ >>> | >>> | >>> |--------> This calls the TaoALMM subsolver routine >>> >>> TaoSolve(subsolver) >>> | >>> | >>> |-----------> The subsolver does not correctly work, at least with an Armijo line search, since the solution is overwritten within the line search. >>> In my case, the subsolver does not make any progress although it is possible. >>> >>> To get to my real problem you can simply change line 268 to if(0) (from if(1) -----> if(0)) and line 317 from // ierr = TaoSolve(tao); CHKERRQ(ierr); -------> ierr = TaoSolve(tao); CHKERRQ(ierr); >>> What you can see is that the solver does not make any progress, but it should make progress. >>> >>> To be honest, I do not really know why the option -tao_almm_subsolver_tao_ls_monitor has know effect if the ALMM solver is called and not the subsolver. I also do not know why -tao_almm_subsolver_tao_view prints as termination reason for the subsolver >>> >>> Solution converged: ||g(X)|| <= gatol >>> >>> This is obviously not the case. I set the tolerance >>> -tao_almm_subsolver_tao_gatol 1e-8 \ >>> -tao_almm_subsolver_tao_grtol 1e-8 \ >>> >>> I encountered this and then I looked into the ALMM class and therefore I tried to call the subsolver (previous example). >>> >>> I attach the updated programm and also the options. >>> >>> Stephan >>> >>> >>> >>> >>> >>> On 03.11.22 22:15, Barry Smith wrote: >>>> >>>> Thanks for your response and the code. I understand the potential problem and how your code demonstrates a bug if the TaoALMMSubsolverObjective() is used in the manner you use in the example where you directly call TaoComputeObjective() multiple times line a line search code might. >>>> >>>> What I don't have or understand is how to reproduce the problem in a real code that uses Tao. That is where the Tao Armijo line search code has a problem when it is used (somehow) in a Tao solver with ALMM. You suggest "If you have an example for your own, you can switch the Armijo line search by the option -tao_ls_type armijo. The thing is that it will cause no problems if the line search accepts the steps with step length one." I don't see how to do this if I use -tao_type almm I cannot use -tao_ls_type armijo; that is the option -tao_ls_type doesn't seem to me to be usable in the context of almm (since almm internally does directly its own trust region approach for globalization). If we remove the if (1) code from your example, is there some Tao options I can use to get the bug to appear inside the Tao solve? >>>> >>>> I'll try to explain again, I agree that the fact that the Tao solution is aliased (within the ALMM solver) is a problem with repeated calls to TaoComputeObjective() but I cannot see how these repeated calls could ever happen in the use of TaoSolve() with the ALMM solver. That is when is this "design problem" a true problem as opposed to just a potential problem that can be demonstrated in artificial code? 
>>>> >>>> The reason I need to understand the non-artificial situation it breaks things is to come up with an appropriate correction for the current code. >>>> >>>> Barry >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>>> On Nov 3, 2022, at 12:46 PM, Stephan K?hler wrote: >>>>> >>>>> Barry, >>>>> >>>>> so far, I have not experimented with trust-region methods, but I can imagine that this "design feature" causes no problem for trust-region methods, if the old point is saved and after the trust-region check fails the old point is copied to the actual point. But the implementation of the Armijo line search method does not work that way. Here, the actual point will always be overwritten. Only if the line search fails, then the old point is restored, but then the TaoSolve method ends with a line search failure. >>>>> >>>>> If you have an example for your own, you can switch the Armijo line search by the option -tao_ls_type armijo. The thing is that it will cause no problems if the line search accepts the steps with step length one. >>>>> It is also possible that, by luck, it will cause no problems, if the "excessive" step brings a reduction of the objective >>>>> >>>>> Otherwise, I attach my example, which is not minimal, but here you can see that it causes problems. You need to set the paths to the PETSc library in the makefile. You find the options for this problem in the run_test_tao_neohooke.sh script. >>>>> The import part begins at line 292 in test_tao_neohooke.cpp >>>>> >>>>> Stephan >>>>> >>>>> On 02.11.22 19:04, Barry Smith wrote: >>>>>> Stephan, >>>>>> >>>>>> I have located the troublesome line in TaoSetUp_ALMM() it has the line >>>>>> >>>>>> auglag->Px = tao->solution; >>>>>> >>>>>> and in alma.h it has >>>>>> >>>>>> Vec Px, LgradX, Ce, Ci, G; /* aliased vectors (do not destroy!) */ >>>>>> >>>>>> Now auglag->P in some situations alias auglag->P and in some cases auglag->Px serves to hold a portion of auglag->P. So then in TaoALMMSubsolverObjective_Private() >>>>>> the lines >>>>>> >>>>>> PetscCall(VecCopy(P, auglag->P)); >>>>>> PetscCall((*auglag->sub_obj)(auglag->parent)); >>>>>> >>>>>> causes, just as you said, tao->solution to be overwritten by the P at which the objective function is being computed. In other words, the solution of the outer Tao is aliased with the solution of the inner Tao, by design. >>>>>> >>>>>> You are definitely correct, the use of TaoALMMSubsolverObjective_Private and TaoALMMSubsolverObjectiveAndGradient_Private in a line search would be problematic. >>>>>> >>>>>> I am not an expert at these methods or their implementations. Could you point to an actual use case within Tao that triggers the problem. Is there a set of command line options or code calls to Tao that fail due to this "design feature". Within the standard use of ALMM I do not see how the objective function would be used within a line search. The TaoSolve_ALMM() code is self-correcting in that if a trust region check fails it automatically rolls back the solution. >>>>>> >>>>>> Barry >>>>>> >>>>>> >>>>>> >>>>>> >>>>>>> On Oct 28, 2022, at 4:27 AM, Stephan K?hler wrote: >>>>>>> >>>>>>> Dear PETSc/Tao team, >>>>>>> >>>>>>> it seems to be that there is a bug in the TaoALMM class: >>>>>>> >>>>>>> In the methods TaoALMMSubsolverObjective_Private and TaoALMMSubsolverObjectiveAndGradient_Private the vector where the function value for the augmented Lagrangian is evaluate >>>>>>> is copied into the current solution, see, e.g., https://petsc.org/release/src/tao/constrained/impls/almm/almm.c.html line 672 or 682. 
This causes subsolver routine to not converge if the line search for the subsolver rejects the step length 1. for some >>>>>>> update. In detail: >>>>>>> >>>>>>> Suppose the current iterate is xk and the current update is dxk. The line search evaluates the augmented Lagrangian now at (xk + dxk). This causes that the value (xk + dxk) is copied in the current solution. If the point (xk + dxk) is rejected, the line search should >>>>>>> try the point (xk + alpha * dxk), where alpha < 1. But due to the copying, what happens is that the point ((xk + dxk) + alpha * dxk) is evaluated, see, e.g., https://petsc.org/release/src/tao/linesearch/impls/armijo/armijo.c.html line 191. >>>>>>> >>>>>>> Best regards >>>>>>> Stephan K?hler >>>>>>> >>>>>>> -- >>>>>>> Stephan K?hler >>>>>>> TU Bergakademie Freiberg >>>>>>> Institut f?r numerische Mathematik und Optimierung >>>>>>> >>>>>>> Akademiestra?e 6 >>>>>>> 09599 Freiberg >>>>>>> Geb?udeteil Mittelbau, Zimmer 2.07 >>>>>>> >>>>>>> Telefon: +49 (0)3731 39-3173 (B?ro) >>>>>>> >>>>>>> >>>>> >>>>> -- >>>>> Stephan K?hler >>>>> TU Bergakademie Freiberg >>>>> Institut f?r numerische Mathematik und Optimierung >>>>> >>>>> Akademiestra?e 6 >>>>> 09599 Freiberg >>>>> Geb?udeteil Mittelbau, Zimmer 2.07 >>>>> >>>>> Telefon: +49 (0)3731 39-3173 (B?ro) >>>>> >>>> >>> >>> -- >>> Stephan K?hler >>> TU Bergakademie Freiberg >>> Institut f?r numerische Mathematik und Optimierung >>> >>> Akademiestra?e 6 >>> 09599 Freiberg >>> Geb?udeteil Mittelbau, Zimmer 2.07 >>> >>> Telefon: +49 (0)3731 39-3173 (B?ro) >>> >> > > -- > Stephan K?hler > TU Bergakademie Freiberg > Institut f?r numerische Mathematik und Optimierung > > Akademiestra?e 6 > 09599 Freiberg > Geb?udeteil Mittelbau, Zimmer 2.07 > > Telefon: +49 (0)3731 39-3173 (B?ro) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Thu Dec 1 13:54:30 2022 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Thu, 1 Dec 2022 13:54:30 -0600 Subject: [petsc-users] [EXTERNAL] Re: Using multiple MPI ranks with COO interface crashes in some cases In-Reply-To: References: Message-ID: Hi, Philip, The petsc bug is fixed in https://gitlab.com/petsc/petsc/-/merge_requests/5892, which is now in petsc/release and will be merged to petsc/main Thanks. --Junchao Zhang On Tue, Nov 22, 2022 at 11:56 AM Fackler, Philip wrote: > Great! Thank you! > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > ------------------------------ > *From:* Junchao Zhang > *Sent:* Tuesday, November 22, 2022 12:02 > *To:* Fackler, Philip > *Cc:* petsc-users at mcs.anl.gov ; Blondel, Sophie < > sblondel at utk.edu> > *Subject:* Re: [EXTERNAL] Re: [petsc-users] Using multiple MPI ranks with > COO interface crashes in some cases > > > > > On Tue, Nov 22, 2022 at 10:14 AM Fackler, Philip > wrote: > > Yes, that one is. I haven't updated the tests. So just build the > SystemTester target or the xolotl target. > > OK, I see. I reproduced the petsc error and am looking into it. Thanks a > lot. 
> > > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > ------------------------------ > *From:* Junchao Zhang > *Sent:* Monday, November 21, 2022 15:36 > *To:* Fackler, Philip > *Cc:* petsc-users at mcs.anl.gov ; Blondel, Sophie < > sblondel at utk.edu> > *Subject:* Re: [EXTERNAL] Re: [petsc-users] Using multiple MPI ranks with > COO interface crashes in some cases > > > > > On Mon, Nov 21, 2022 at 9:31 AM Fackler, Philip > wrote: > > Not sure why. I'm using the same compiler. But you can try constructing > the object explicitly on that line: > > idPairs.push_back(core::RowColPair{i, i}); > > > WIth your change, I continued but met another error: > > /home/jczhang/xolotl/test/core/diffusion/Diffusion2DHandlerTester.cpp(79): > error: class "xolotl::core::diffusion::Diffusion2DHandler" has no member > "initializeOFill" > > it seems all these problems are related to the branch > * feature-petsc-kokkos, *instead of the compiler etc. When I switched to > origin/stable, I could build xolotl. > > > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > ------------------------------ > *From:* Junchao Zhang > *Sent:* Sunday, November 20, 2022 13:25 > *To:* Fackler, Philip > *Cc:* petsc-users at mcs.anl.gov ; Blondel, Sophie < > sblondel at utk.edu> > *Subject:* Re: [EXTERNAL] Re: [petsc-users] Using multiple MPI ranks with > COO interface crashes in some cases > > > > On Tue, Nov 15, 2022 at 10:55 AM Fackler, Philip > wrote: > > I built petsc with: > > $ ./configure PETSC_DIR=$PWD PETSC_ARCH=arch-kokkos-serial-debug > --with-cc=mpicc --with-cxx=mpicxx --with-fc=0 --with-debugging=0 > --prefix=$HOME/build/petsc/debug/install --with-64-bit-indices > --with-shared-libraries --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 --download-kokkos > --download-kokkos-kernels > > $ make PETSC_DIR=$PWD PETSC_ARCH=arch-kokkos-serial-debug all > > $ make PETSC_DIR=$PWD PETSC_ARCH=arch-kokkos-serial-debug install > > > Then I build xolotl in a separate build directory (after checking out the > "feature-petsc-kokkos" branch) with: > > $ cmake -DCMAKE_BUILD_TYPE=Debug > -DKokkos_DIR=$HOME/build/petsc/debug/install > -DPETSC_DIR=$HOME/build/petsc/debug/install > > $ make -j4 SystemTester > > Hi, Philip, I tried multiple times and still failed at building xolotl. > I installed boost-1.74 and HDF5, and used gcc-11.3. > > make -j4 SystemTester > ... > [ 9%] Building CXX object > xolotl/core/CMakeFiles/xolotlCore.dir/src/diffusion/DiffusionHandler.cpp.o > /home/jczhang/xolotl/xolotl/core/src/diffusion/DiffusionHandler.cpp(55): > error: no instance of overloaded function "std::vector<_Tp, > _Alloc>::push_back [with _Tp=xolotl::core::RowColPair, > _Alloc=std::allocator]" matches the argument list > argument types are: ({...}) > object type is: std::vector std::allocator> > > 1 error detected in the compilation of > "/home/jczhang/xolotl/xolotl/core/src/diffusion/DiffusionHandler.cpp". > > > > > Then, from the xolotl build directory, run (for example): > > $ mpirun -n 2 ./test/system/SystemTester -t System/NE_4 -- -v > > Note that this test case will use the parameter file > '/benchmarks/params_system_NE_4.txt' which has the command-line > arguments for petsc in its "petscArgs=..." line. 
If you look at > '/test/system/SystemTester.cpp' all the system test cases > follow the same naming convention with their corresponding parameter files > under '/benchmarks'. > > The failure happens with the NE_4 case (which is 2D) and the PSI_3 case > (which is 1D). > > Let me know if this is still unclear. > > Thanks, > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > ------------------------------ > *From:* Junchao Zhang > *Sent:* Tuesday, November 15, 2022 00:16 > *To:* Fackler, Philip > *Cc:* petsc-users at mcs.anl.gov ; Blondel, Sophie < > sblondel at utk.edu> > *Subject:* [EXTERNAL] Re: [petsc-users] Using multiple MPI ranks with COO > interface crashes in some cases > > Hi, Philip, > Can you tell me instructions to build Xolotl to reproduce the error? > --Junchao Zhang > > > On Mon, Nov 14, 2022 at 12:24 PM Fackler, Philip via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > In Xolotl's "feature-petsc-kokkos" branch, I have moved our code to use > the COO interface for preallocating and setting values in the Jacobian > matrix. I have found that with some of our test cases, using more than one > MPI rank results in a crash. Way down in the preconditioner code in petsc a > Mat gets computed that has "null" for the "productsymbolic" member of its > "ops". It's pretty far removed from where we compute the Jacobian entries, > so I haven't been able (so far) to track it back to an error in my code. > I'd appreciate some help with this from someone who is more familiar with > the petsc guts so we can figure out what I'm doing wrong. (I'm assuming > it's a bug in Xolotl.) > > Note that this is using the kokkos backend for Mat and Vec in petsc, but > with a serial-only build of kokkos and kokkos-kernels. So, it's a CPU-only > multiple MPI rank run. > > Here's a paste of the error output showing the relevant parts of the call > stack: > > [ERROR] [0]PETSC ERROR: > [ERROR] --------------------- Error Message > -------------------------------------------------------------- > [ERROR] [1]PETSC ERROR: > [ERROR] --------------------- Error Message > -------------------------------------------------------------- > [ERROR] [1]PETSC ERROR: > [ERROR] [0]PETSC ERROR: > [ERROR] No support for this operation for this object type > [ERROR] [1]PETSC ERROR: > [ERROR] No support for this operation for this object type > [ERROR] [0]PETSC ERROR: > [ERROR] No method productsymbolic for Mat of type (null) > [ERROR] No method productsymbolic for Mat of type (null) > [ERROR] [0]PETSC ERROR: > [ERROR] [1]PETSC ERROR: > [ERROR] See hxxps://petsc.org/release/faq/ for trouble shooting. > [ERROR] See hxxps://petsc.org/release/faq/ for trouble shooting. 
> [ERROR] [0]PETSC ERROR: > [ERROR] [1]PETSC ERROR: > [ERROR] Petsc Development GIT revision: v3.18.1-115-gdca010e0e9a GIT > Date: 2022-10-28 14:39:41 +0000 > [ERROR] Petsc Development GIT revision: v3.18.1-115-gdca010e0e9a GIT > Date: 2022-10-28 14:39:41 +0000 > [ERROR] [1]PETSC ERROR: > [ERROR] [0]PETSC ERROR: > [ERROR] Unknown Name on a named PC0115427 by 4pf Mon Nov 14 13:22:01 2022 > [ERROR] Unknown Name on a named PC0115427 by 4pf Mon Nov 14 13:22:01 2022 > [ERROR] [1]PETSC ERROR: > [ERROR] [0]PETSC ERROR: > [ERROR] Configure options PETSC_DIR=/home/4pf/repos/petsc > PETSC_ARCH=arch-kokkos-serial-debug --with-debugging=1 --with-cc=mpicc > --with-cxx=mpicxx --with-fc=0 --with-cudac=0 > --prefix=/home/4pf/build/petsc/serial-debug/install --with-64-bit-indices > --with-shared-libraries > --with-kokkos-dir=/home/4pf/build/kokkos/serial/install > --with-kokkos-kernels-dir=/home/4pf/build/kokkos-kernels/serial/install > [ERROR] Configure options PETSC_DIR=/home/4pf/repos/petsc > PETSC_ARCH=arch-kokkos-serial-debug --with-debugging=1 --with-cc=mpicc > --with-cxx=mpicxx --with-fc=0 --with-cudac=0 > --prefix=/home/4pf/build/petsc/serial-debug/install --with-64-bit-indices > --with-shared-libraries > --with-kokkos-dir=/home/4pf/build/kokkos/serial/install > --with-kokkos-kernels-dir=/home/4pf/build/kokkos-kernels/serial/install > [ERROR] [1]PETSC ERROR: > [ERROR] [0]PETSC ERROR: > [ERROR] #1 MatProductSymbolic_MPIAIJKokkos_AB() at > /home/4pf/repos/petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx:918 > [ERROR] #1 MatProductSymbolic_MPIAIJKokkos_AB() at > /home/4pf/repos/petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx:918 > [ERROR] [1]PETSC ERROR: > [ERROR] [0]PETSC ERROR: > [ERROR] #2 MatProductSymbolic_MPIAIJKokkos() at > /home/4pf/repos/petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx:1138 > [ERROR] #2 MatProductSymbolic_MPIAIJKokkos() at > /home/4pf/repos/petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx:1138 > [ERROR] [1]PETSC ERROR: > [ERROR] [0]PETSC ERROR: > [ERROR] #3 MatProductSymbolic() at > /home/4pf/repos/petsc/src/mat/interface/matproduct.c:793 > [ERROR] #3 MatProductSymbolic() at > /home/4pf/repos/petsc/src/mat/interface/matproduct.c:793 > [ERROR] [1]PETSC ERROR: > [ERROR] [0]PETSC ERROR: > [ERROR] #4 MatProduct_Private() at > /home/4pf/repos/petsc/src/mat/interface/matrix.c:9820 > [ERROR] #4 MatProduct_Private() at > /home/4pf/repos/petsc/src/mat/interface/matrix.c:9820 > [ERROR] [0]PETSC ERROR: > [ERROR] [1]PETSC ERROR: > [ERROR] #5 MatMatMult() at > /home/4pf/repos/petsc/src/mat/interface/matrix.c:9897 > [ERROR] #5 MatMatMult() at > /home/4pf/repos/petsc/src/mat/interface/matrix.c:9897 > [ERROR] [0]PETSC ERROR: > [ERROR] [1]PETSC ERROR: > [ERROR] #6 PCGAMGOptProlongator_AGG() at > /home/4pf/repos/petsc/src/ksp/pc/impls/gamg/agg.c:769 > [ERROR] #6 PCGAMGOptProlongator_AGG() at > /home/4pf/repos/petsc/src/ksp/pc/impls/gamg/agg.c:769 > [ERROR] [0]PETSC ERROR: > [ERROR] [1]PETSC ERROR: > [ERROR] #7 PCSetUp_GAMG() at > /home/4pf/repos/petsc/src/ksp/pc/impls/gamg/gamg.c:639 > [ERROR] #7 PCSetUp_GAMG() at > /home/4pf/repos/petsc/src/ksp/pc/impls/gamg/gamg.c:639 > [ERROR] [1]PETSC ERROR: > [ERROR] [0]PETSC ERROR: > [ERROR] #8 PCSetUp() at > /home/4pf/repos/petsc/src/ksp/pc/interface/precon.c:994 > [ERROR] #8 PCSetUp() at > /home/4pf/repos/petsc/src/ksp/pc/interface/precon.c:994 > [ERROR] [1]PETSC ERROR: > [ERROR] [0]PETSC ERROR: > [ERROR] #9 KSPSetUp() at > /home/4pf/repos/petsc/src/ksp/ksp/interface/itfunc.c:406 > [ERROR] #9 KSPSetUp() at > 
/home/4pf/repos/petsc/src/ksp/ksp/interface/itfunc.c:406 > [ERROR] [1]PETSC ERROR: > [ERROR] [0]PETSC ERROR: > [ERROR] #10 KSPSolve_Private() at > /home/4pf/repos/petsc/src/ksp/ksp/interface/itfunc.c:825 > [ERROR] #10 KSPSolve_Private() at > /home/4pf/repos/petsc/src/ksp/ksp/interface/itfunc.c:825 > [ERROR] [0]PETSC ERROR: > [ERROR] [1]PETSC ERROR: > [ERROR] #11 KSPSolve() at > /home/4pf/repos/petsc/src/ksp/ksp/interface/itfunc.c:1071 > [ERROR] #11 KSPSolve() at > /home/4pf/repos/petsc/src/ksp/ksp/interface/itfunc.c:1071 > [ERROR] [1]PETSC ERROR: > [ERROR] [0]PETSC ERROR: > [ERROR] #12 PCApply_FieldSplit() at > /home/4pf/repos/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c:1246 > [ERROR] #12 PCApply_FieldSplit() at > /home/4pf/repos/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c:1246 > [ERROR] [1]PETSC ERROR: > [ERROR] [0]PETSC ERROR: > [ERROR] #13 PCApply() at > /home/4pf/repos/petsc/src/ksp/pc/interface/precon.c:441 > [ERROR] #13 PCApply() at > /home/4pf/repos/petsc/src/ksp/pc/interface/precon.c:441 > [ERROR] [1]PETSC ERROR: > [ERROR] [0]PETSC ERROR: > [ERROR] #14 KSP_PCApply() at > /home/4pf/repos/petsc/include/petsc/private/kspimpl.h:380 > [ERROR] #14 KSP_PCApply() at > /home/4pf/repos/petsc/include/petsc/private/kspimpl.h:380 > [ERROR] [1]PETSC ERROR: > [ERROR] [0]PETSC ERROR: > [ERROR] #15 KSPFGMRESCycle() at > /home/4pf/repos/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c:152 > [ERROR] #15 KSPFGMRESCycle() at > /home/4pf/repos/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c:152 > [ERROR] [1]PETSC ERROR: > [ERROR] [0]PETSC ERROR: > [ERROR] #16 KSPSolve_FGMRES() at > /home/4pf/repos/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c:273 > [ERROR] #16 KSPSolve_FGMRES() at > /home/4pf/repos/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c:273 > [ERROR] [1]PETSC ERROR: > [ERROR] [0]PETSC ERROR: > [ERROR] #17 KSPSolve_Private() at > /home/4pf/repos/petsc/src/ksp/ksp/interface/itfunc.c:899 > [ERROR] #17 KSPSolve_Private() at > /home/4pf/repos/petsc/src/ksp/ksp/interface/itfunc.c:899 > [ERROR] [0]PETSC ERROR: > [ERROR] [1]PETSC ERROR: > [ERROR] #18 KSPSolve() at > /home/4pf/repos/petsc/src/ksp/ksp/interface/itfunc.c:1071 > [ERROR] #18 KSPSolve() at > /home/4pf/repos/petsc/src/ksp/ksp/interface/itfunc.c:1071 > [ERROR] [0]PETSC ERROR: > [ERROR] [1]PETSC ERROR: > [ERROR] #19 SNESSolve_NEWTONLS() at > /home/4pf/repos/petsc/src/snes/impls/ls/ls.c:210 > [ERROR] #19 SNESSolve_NEWTONLS() at > /home/4pf/repos/petsc/src/snes/impls/ls/ls.c:210 > [ERROR] [1]PETSC ERROR: > [ERROR] [0]PETSC ERROR: > [ERROR] #20 SNESSolve() at > /home/4pf/repos/petsc/src/snes/interface/snes.c:4689 > [ERROR] #20 SNESSolve() at > /home/4pf/repos/petsc/src/snes/interface/snes.c:4689 > [ERROR] [1]PETSC ERROR: > [ERROR] [0]PETSC ERROR: > [ERROR] #21 TSStep_ARKIMEX() at > /home/4pf/repos/petsc/src/ts/impls/arkimex/arkimex.c:791 > [ERROR] #21 TSStep_ARKIMEX() at > /home/4pf/repos/petsc/src/ts/impls/arkimex/arkimex.c:791 > [ERROR] [1]PETSC ERROR: > [ERROR] [0]PETSC ERROR: > [ERROR] #22 TSStep() at /home/4pf/repos/petsc/src/ts/interface/ts.c:3445 > [ERROR] #22 TSStep() at /home/4pf/repos/petsc/src/ts/interface/ts.c:3445 > [ERROR] [1]PETSC ERROR: > [ERROR] [0]PETSC ERROR: > [ERROR] #23 TSSolve() at /home/4pf/repos/petsc/src/ts/interface/ts.c:3836 > [ERROR] #23 TSSolve() at /home/4pf/repos/petsc/src/ts/interface/ts.c:3836 > [ERROR] PetscSolver::solve: TSSolve failed. > [ERROR] PetscSolver::solve: TSSolve failed. > Aborting. > Aborting. 
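
For context on the interface being exercised in the trace above: the COO assembly
path boils down to two calls, MatSetPreallocationCOO to fix the sparsity pattern
once and MatSetValuesCOO to (re)fill the values. Below is a minimal, self-contained
sketch using a hypothetical diagonal pattern rather than Xolotl's actual Jacobian;
the matrix type is left to the options database (e.g. -mat_type aijkokkos).

  #include <petscmat.h>

  int main(int argc, char **argv)
  {
    Mat          A;
    PetscInt     n = 100, rstart, rend, nloc, k;
    PetscInt    *coo_i, *coo_j;
    PetscScalar *coo_v;

    PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
    PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
    PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n));
    PetscCall(MatSetFromOptions(A));                  /* e.g. -mat_type aijkokkos */
    PetscCall(MatGetOwnershipRange(A, &rstart, &rend));
    nloc = rend - rstart;
    PetscCall(PetscMalloc3(nloc, &coo_i, nloc, &coo_j, nloc, &coo_v));
    for (k = 0; k < nloc; ++k) { /* hypothetical pattern: one diagonal entry per local row */
      coo_i[k] = rstart + k;
      coo_j[k] = rstart + k;
      coo_v[k] = 1.0;
    }
    PetscCall(MatSetPreallocationCOO(A, nloc, coo_i, coo_j)); /* pattern: set once */
    PetscCall(MatSetValuesCOO(A, coo_v, INSERT_VALUES));      /* values: can be refilled repeatedly */
    PetscCall(PetscFree3(coo_i, coo_j, coo_v));
    PetscCall(MatDestroy(&A));
    PetscCall(PetscFinalize());
    return 0;
  }

The Xolotl branch discussed in this thread uses the same two-call structure, only
with its own index and value arrays; the sketch is illustrative, not its actual code.
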
> > > > Thanks for the help, > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aduarteg at utexas.edu Thu Dec 1 14:43:44 2022 From: aduarteg at utexas.edu (Alfredo J Duarte Gomez) Date: Thu, 1 Dec 2022 14:43:44 -0600 Subject: [petsc-users] PETSc geometric multigrid use Message-ID: Good morning, Good afternoon, I am testing the performance of some preconditioners on a problem I have and I wanted to try geometric multigrid. I am using a DMDA object and respective matrices (the mesh is structured, but not rectangular, it is curvilinear) I have used the default version of algebraic multigrid (that is -pc_gamg_type agg) very easily. When I try to use -pc_gamg_type geo, it returns an error. What additional lines of code do I have to add to use the geo option? Or does this require some sort of fast remeshing routine, given that I am using curvilinear grids? Thank you and have a good day. -Alfredo -- Alfredo Duarte Graduate Research Assistant The University of Texas at Austin -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Dec 1 15:53:31 2022 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 1 Dec 2022 16:53:31 -0500 Subject: [petsc-users] PETSc geometric multigrid use In-Reply-To: References: Message-ID: I don't think you want to use -pc_type gamg if you want to use geometric multigrid. You can can use -pc_type mg and the DMDA. The only thing I think you need to change from the default use of DMDA and -pc_type mg is to provide custom code that computes the interpolation between levels to take into account the curvilinear grids. Barry > On Dec 1, 2022, at 3:43 PM, Alfredo J Duarte Gomez wrote: > > Good morning, > > Good afternoon, > > I am testing the performance of some preconditioners on a problem I have and I wanted to try geometric multigrid. I am using a DMDA object and respective matrices (the mesh is structured, but not rectangular, it is curvilinear) > > I have used the default version of algebraic multigrid (that is -pc_gamg_type agg) very easily. > > When I try to use -pc_gamg_type geo, it returns an error. > > What additional lines of code do I have to add to use the geo option? > > Or does this require some sort of fast remeshing routine, given that I am using curvilinear grids? > > Thank you and have a good day. > > -Alfredo > > -- > Alfredo Duarte > Graduate Research Assistant > The University of Texas at Austin -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Thu Dec 1 16:05:50 2022 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Thu, 1 Dec 2022 16:05:50 -0600 Subject: [petsc-users] [EXTERNAL] Re: Kokkos backend for Mat and Vec diverging when running on CUDA device. In-Reply-To: References: Message-ID: Hi, Philip, Sorry for the long delay. I could not get something useful from the -log_view output. Since I have already built xolotl, could you give me instructions on how to do a xolotl test to reproduce the divergence with petsc GPU backends (but fine on CPU)? Thank you. 
--Junchao Zhang On Wed, Nov 16, 2022 at 1:38 PM Fackler, Philip wrote: > ------------------------------------------------------------------ PETSc > Performance Summary: > ------------------------------------------------------------------ > > Unknown Name on a named PC0115427 with 1 processor, by 4pf Wed Nov 16 > 14:36:46 2022 > Using Petsc Development GIT revision: v3.18.1-115-gdca010e0e9a GIT Date: > 2022-10-28 14:39:41 +0000 > > Max Max/Min Avg Total > Time (sec): 6.023e+00 1.000 6.023e+00 > Objects: 1.020e+02 1.000 1.020e+02 > Flops: 1.080e+09 1.000 1.080e+09 1.080e+09 > Flops/sec: 1.793e+08 1.000 1.793e+08 1.793e+08 > MPI Msg Count: 0.000e+00 0.000 0.000e+00 0.000e+00 > MPI Msg Len (bytes): 0.000e+00 0.000 0.000e+00 0.000e+00 > MPI Reductions: 0.000e+00 0.000 > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N > --> 2N flops > and VecAXPY() for complex vectors of length N > --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages > --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total Count > %Total Avg %Total Count %Total > 0: Main Stage: 6.0226e+00 100.0% 1.0799e+09 100.0% 0.000e+00 > 0.0% 0.000e+00 0.0% 0.000e+00 0.0% > > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flop: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > AvgLen: average message length (bytes) > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and > PetscLogStagePop(). 
> %T - percent time in this phase %F - percent flop in this > phase > %M - percent messages in this phase %L - percent message lengths > in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over > all processors) > GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU > time over all processors) > CpuToGpu Count: total number of CPU to GPU copies per processor > CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per > processor) > GpuToCpu Count: total number of GPU to CPU copies per processor > GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per > processor) > GPU %F: percent flops on GPU in this event > > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flop > --- Global --- --- Stage ---- Total > GPU - CpuToGpu - - GpuToCpu - GPU > > Max Ratio Max Ratio Max Ratio Mess AvgLen > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > Mflop/s Count Size Count Size %F > > > ------------------------------------------------------------------------------------------------------------------------ > --------------------------------------- > > > --- Event Stage 0: Main Stage > > BuildTwoSided 3 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > DMCreateMat 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > SFSetGraph 3 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > SFSetUp 3 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > SFPack 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > SFUnpack 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > VecDot 190 1.0 nan nan 2.11e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > VecMDot 775 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > VecNorm 1728 1.0 nan nan 1.92e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 2 0 0 0 0 2 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > VecScale 1983 1.0 nan nan 6.24e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > VecCopy 780 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > VecSet 4955 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 2 0 0 0 0 2 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > VecAXPY 190 1.0 nan nan 2.11e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > VecAYPX 597 1.0 nan nan 6.64e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > VecAXPBYCZ 643 1.0 nan nan 1.79e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 2 0 0 0 0 2 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > VecWAXPY 502 1.0 nan nan 5.58e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > VecMAXPY 1159 1.0 nan nan 3.68e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 3 0 0 0 0 3 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > VecScatterBegin 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan > -nan 2 5.14e-03 0 0.00e+00 0 > > 
VecScatterEnd 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > VecReduceArith 380 1.0 nan nan 4.23e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > VecReduceComm 190 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > VecNormalize 965 1.0 nan nan 1.61e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > TSStep 20 1.0 5.8699e+00 1.0 1.08e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 97100 0 0 0 97100 0 0 0 184 > -nan 2 5.14e-03 0 0.00e+00 54 > > TSFunctionEval 597 1.0 nan nan 6.64e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 63 1 0 0 0 63 1 0 0 0 -nan > -nan 1 3.36e-04 0 0.00e+00 100 > > TSJacobianEval 190 1.0 nan nan 3.37e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 24 3 0 0 0 24 3 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 97 > > MatMult 1930 1.0 nan nan 4.46e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 41 0 0 0 1 41 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > MatMultTranspose 1 1.0 nan nan 3.44e+05 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > MatSolve 965 1.0 nan nan 5.04e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 5 0 0 0 1 5 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatSOR 965 1.0 nan nan 3.33e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 4 31 0 0 0 4 31 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatLUFactorSym 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatLUFactorNum 190 1.0 nan nan 1.16e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 11 0 0 0 1 11 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatScale 190 1.0 nan nan 3.26e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 3 0 0 0 0 3 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > MatAssemblyBegin 761 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatAssemblyEnd 761 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatGetRowIJ 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatCreateSubMats 380 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatGetOrdering 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatZeroEntries 379 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatSetPreallCOO 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatSetValuesCOO 190 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > KSPSetUp 760 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > KSPSolve 190 1.0 5.8052e-01 1.0 9.30e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 10 86 0 0 0 10 86 0 0 0 1602 > -nan 1 4.80e-03 0 0.00e+00 46 > > KSPGMRESOrthog 775 1.0 nan nan 2.27e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 2 0 0 0 1 2 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > SNESSolve 71 1.0 5.7117e+00 1.0 1.07e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 95 99 0 0 0 95 99 0 0 0 188 > -nan 1 4.80e-03 0 0.00e+00 53 > > SNESSetUp 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > SNESFunctionEval 573 1.0 nan nan 2.23e+07 1.0 0.0e+00 
0.0e+00 > 0.0e+00 60 2 0 0 0 60 2 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > SNESJacobianEval 190 1.0 nan nan 3.37e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 24 3 0 0 0 24 3 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 97 > > SNESLineSearch 190 1.0 nan nan 1.05e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 53 10 0 0 0 53 10 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > PCSetUp 570 1.0 nan nan 1.16e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 2 11 0 0 0 2 11 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > PCApply 965 1.0 nan nan 6.14e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 8 57 0 0 0 8 57 0 0 0 -nan > -nan 1 4.80e-03 0 0.00e+00 19 > > KSPSolve_FS_0 965 1.0 nan nan 3.33e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 4 31 0 0 0 4 31 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > KSPSolve_FS_1 965 1.0 nan nan 1.66e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 2 15 0 0 0 2 15 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > > --- Event Stage 1: Unknown > > > ------------------------------------------------------------------------------------------------------------------------ > --------------------------------------- > > > Object Type Creations Destructions. Reports information only > for process 0. > > --- Event Stage 0: Main Stage > > Container 5 5 > Distributed Mesh 2 2 > Index Set 11 11 > IS L to G Mapping 1 1 > Star Forest Graph 7 7 > Discrete System 2 2 > Weak Form 2 2 > Vector 49 49 > TSAdapt 1 1 > TS 1 1 > DMTS 1 1 > SNES 1 1 > DMSNES 3 3 > SNESLineSearch 1 1 > Krylov Solver 4 4 > DMKSP interface 1 1 > Matrix 4 4 > Preconditioner 4 4 > Viewer 2 1 > > --- Event Stage 1: Unknown > > > ======================================================================================================================== > Average time to get PetscTime(): 3.14e-08 > #PETSc Option Table entries: > -log_view > -log_view_gpu_times > #End of PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with 64 bit PetscInt > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 sizeof(PetscInt) 8 > Configure options: PETSC_DIR=/home/4pf/repos/petsc > PETSC_ARCH=arch-kokkos-cuda-no-tpls --with-cc=mpicc --with-cxx=mpicxx > --with-fc=0 --with-cuda --with-debugging=0 --with-shared-libraries > --prefix=/home/4pf/build/petsc/cuda-no-tpls/install --with-64-bit-indices > --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 --CUDAOPTFLAGS=-O3 > --with-kokkos-dir=/home/4pf/build/kokkos/cuda/install > --with-kokkos-kernels-dir=/home/4pf/build/kokkos-kernels/cuda-no-tpls/install > > ----------------------------------------- > Libraries compiled on 2022-11-01 21:01:08 on PC0115427 > Machine characteristics: Linux-5.15.0-52-generic-x86_64-with-glibc2.35 > Using PETSc directory: /home/4pf/build/petsc/cuda-no-tpls/install > Using PETSc arch: > ----------------------------------------- > > Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas > -Wno-lto-type-mismatch -Wno-stringop-overflow -fstack-protector > -fvisibility=hidden -O3 > ----------------------------------------- > > Using include paths: -I/home/4pf/build/petsc/cuda-no-tpls/install/include > -I/home/4pf/build/kokkos-kernels/cuda-no-tpls/install/include > -I/home/4pf/build/kokkos/cuda/install/include -I/usr/local/cuda-11.8/include > ----------------------------------------- > > Using C linker: mpicc > Using libraries: -Wl,-rpath,/home/4pf/build/petsc/cuda-no-tpls/install/lib > -L/home/4pf/build/petsc/cuda-no-tpls/install/lib -lpetsc > -Wl,-rpath,/home/4pf/build/kokkos-kernels/cuda-no-tpls/install/lib > 
-L/home/4pf/build/kokkos-kernels/cuda-no-tpls/install/lib > -Wl,-rpath,/home/4pf/build/kokkos/cuda/install/lib > -L/home/4pf/build/kokkos/cuda/install/lib > -Wl,-rpath,/usr/local/cuda-11.8/lib64 -L/usr/local/cuda-11.8/lib64 > -L/usr/local/cuda-11.8/lib64/stubs -lkokkoskernels -lkokkoscontainers > -lkokkoscore -llapack -lblas -lm -lcudart -lnvToolsExt -lcufft -lcublas > -lcusparse -lcusolver -lcurand -lcuda -lquadmath -lstdc++ -ldl > ----------------------------------------- > > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > ------------------------------ > *From:* Junchao Zhang > *Sent:* Tuesday, November 15, 2022 13:03 > *To:* Fackler, Philip > *Cc:* xolotl-psi-development at lists.sourceforge.net < > xolotl-psi-development at lists.sourceforge.net>; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov>; Blondel, Sophie ; Roth, > Philip > *Subject:* Re: [EXTERNAL] Re: [petsc-users] Kokkos backend for Mat and > Vec diverging when running on CUDA device. > > Can you paste -log_view result so I can see what functions are used? > > --Junchao Zhang > > > On Tue, Nov 15, 2022 at 10:24 AM Fackler, Philip > wrote: > > Yes, most (but not all) of our system test cases fail with the kokkos/cuda > or cuda backends. All of them pass with the CPU-only kokkos backend. > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > ------------------------------ > *From:* Junchao Zhang > *Sent:* Monday, November 14, 2022 19:34 > *To:* Fackler, Philip > *Cc:* xolotl-psi-development at lists.sourceforge.net < > xolotl-psi-development at lists.sourceforge.net>; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov>; Blondel, Sophie ; Zhang, > Junchao ; Roth, Philip > *Subject:* [EXTERNAL] Re: [petsc-users] Kokkos backend for Mat and Vec > diverging when running on CUDA device. > > Hi, Philip, > Sorry to hear that. It seems you could run the same code on CPUs but > not no GPUs (with either petsc/Kokkos backend or petsc/cuda backend, is it > right? > > --Junchao Zhang > > > On Mon, Nov 14, 2022 at 12:13 PM Fackler, Philip via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > This is an issue I've brought up before (and discussed in-person with > Richard). I wanted to bring it up again because I'm hitting the limits of > what I know to do, and I need help figuring this out. > > The problem can be reproduced using Xolotl's "develop" branch built > against a petsc build with kokkos and kokkos-kernels enabled. Then, either > add the relevant kokkos options to the "petscArgs=" line in the system test > parameter file(s), or just replace the system test parameter files with the > ones from the "feature-petsc-kokkos" branch. See here the files that > begin with "params_system_". > > Note that those files use the "kokkos" options, but the problem is similar > using the corresponding cuda/cusparse options. I've already tried building > kokkos-kernels with no TPLs and got slightly different results, but the > same problem. > > Any help would be appreciated. 
> > Thanks, > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From narnoldm at umich.edu Fri Dec 2 04:16:03 2022 From: narnoldm at umich.edu (Nicholas Arnold-Medabalimi) Date: Fri, 2 Dec 2022 05:16:03 -0500 Subject: [petsc-users] DMPlex Traversal Message-ID: Hi Petsc Users I have become familiar with usage of DMPlexGet(Restore)TransitiveClosure as well as general traversal of the graph using DMPlexGetCone and GetSupport. I was wondering if there is a more trivial way to get access to adjacent points of the same Stratum. For example, I have an 2D mesh. I have identified the point value of a cell of interest and I would like to get the point value of the surrounding cells. So far the way I get that is I descend a level using GetCone and then for each constituent point (face in this case) call GetSupport (which yields the adjacent cell and the original cell) and discard the original cell. Is there a direct function call that achieves this? I've looked a bit into the Adjacency commands but they seem to be more for variable influence (FV or FE) and using them seems to throw an error. Sincerely Nicholas -- Nicholas Arnold-Medabalimi Ph.D. Candidate Computational Aeroscience Lab University of Michigan -------------- next part -------------- An HTML attachment was scrubbed... URL: From matteo.semplice at uninsubria.it Fri Dec 2 05:48:41 2022 From: matteo.semplice at uninsubria.it (Matteo Semplice) Date: Fri, 2 Dec 2022 12:48:41 +0100 Subject: [petsc-users] locate DMSwarm particles with respect to a background DMDA mesh In-Reply-To: References: <8a853d0e-b856-5dc2-5439-d25911d672e4@uninsubria.it> Message-ID: <35ebfa58-eed5-39fa-8b3d-918ff9d7e633@uninsubria.it> Hi. I am sorry to take this up again, but further tests show that it's not right yet. Il 04/11/22 12:48, Matthew Knepley ha scritto: > On Fri, Nov 4, 2022 at 7:46 AM Matteo Semplice > wrote: > > On 04/11/2022 02:43, Matthew Knepley wrote: >> On Thu, Nov 3, 2022 at 8:36 PM Matthew Knepley >> wrote: >> >> On Thu, Oct 27, 2022 at 11:57 AM Semplice Matteo >> wrote: >> >> Dear Petsc developers, >> I am trying to use a DMSwarm to locate a cloud of points >> with respect to a background mesh. In the real >> application the points will be loaded from disk, but I >> have created a small demo in which >> >> * each processor creates Npart particles, all within >> the domain covered by the mesh, but not all in the >> local portion of the mesh >> * migrate the particles >> >> After migration most particles are not any more in the >> DMSwarm (how many and which ones seems to depend on the >> number of cpus, but it never happens that all particle >> survive the migration process). >> >> I am clearly missing some step, since I'd expect that a >> DMDA would be able to locate particles without the need >> to go through a DMShell as it is done in >> src/dm/tutorials/swarm_ex3.c.html >> >> >> I attach my demo code. >> >> Could someone give me a hint? >> >> >> Thanks for sending this. I found the problem. Someone has >> some overly fancy code inside DMDA to figure out the local >> bounding box from the coordinates. >> It is broken for DM_BOUNDARY_GHOSTED, but we never tested >> with this. I will fix it. 
>> >> >> Okay, I think this fix is correct >> >> https://gitlab.com/petsc/petsc/-/merge_requests/5802 >> >> >> I incorporated your test as src/dm/impls/da/tests/ex1.c. Can you >> take a look and see if this fixes your issue? > > Yes, we have tested 2d and 3d, with various combinations of > DM_BOUNDARY_* along different directions and it works like a charm. > > On a side note, neither DMSwarmViewXDMF nor DMSwarmMigrate seem to > be implemented for 1d: I get > > [0]PETSC ERROR: No support for this operation for this object > type[0]PETSC ERROR: Support not provided for 1D > > However, currently I have no need for this feature. > > Finally, if the test is meant to stay in the source, you may > remove the call to DMSwarmRegisterPetscDatatypeField as in the > attached patch. > > Thanks a lot!! > > Thanks! Glad it works. > > ? ?Matt > There are still problems when not using 1,2 or 4 cpus. Any other number of cpus that I've tested does not work corectly. I have modified src/dm/impls/da/tests/ex1.c to show the problem (see attached patch). In particular, I have introduced a -Nx option to allow grid sizes higher than the standard 4x4 and also suppressed by default particle coordinates output (but there's an option -part_view that recreates the old behaviour). Here's the output on my machine: $ mpirun -np 1 ./da_test_ex1 -Nx 40 Total of 8 particles ... calling DMSwarmMigrate ... Total of 8 particles $ mpirun -np 2 ./da_test_ex1 -Nx 40 Total of 16 particles ... calling DMSwarmMigrate ... Total of 16 particles $ mpirun -np 4 ./da_test_ex1 -Nx 40 Total of 32 particles ... calling DMSwarmMigrate ... Total of 32 particles $ mpirun -np 3 ./da_test_ex1 -Nx 40 Total of 24 particles ... calling DMSwarmMigrate ... Total of 22 particles $ mpirun -oversubscribe -np 5 ./da_test_ex1 -Nx 40 Total of 40 particles ... calling DMSwarmMigrate ... Total of 22 particles $mpirun -oversubscribe -np 8 ./da_test_ex1 -Nx 40 Total of 64 particles ... calling DMSwarmMigrate ... Total of 46 particles As you see, only 1,2,4 cpus do not lose particles. (The test could be easily modified to return nTotPart/predistribution - nTotPart/postdistribution so that it will have a nonzero exit code in case it loses particles) Best ??? Matteo -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: da-tests-ex1.patch Type: text/x-patch Size: 2079 bytes Desc: not available URL: From knepley at gmail.com Fri Dec 2 05:48:53 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 2 Dec 2022 06:48:53 -0500 Subject: [petsc-users] DMPlex Traversal In-Reply-To: References: Message-ID: On Fri, Dec 2, 2022 at 5:16 AM Nicholas Arnold-Medabalimi < narnoldm at umich.edu> wrote: > Hi Petsc Users > > I have become familiar with usage of DMPlexGet(Restore)TransitiveClosure > as well as general traversal of the graph using DMPlexGetCone and > GetSupport. > > I was wondering if there is a more trivial way to get access to adjacent > points of the same Stratum. > > For example, I have an 2D mesh. I have identified the point value of a > cell of interest and I would like to get the point value of the surrounding > cells. So far the way I get that is I descend a level using GetCone and > then for each constituent point (face in this case) call GetSupport (which > yields the adjacent cell and the original cell) and discard the original > cell. > > Is there a direct function call that achieves this? 
I've looked a bit into > the Adjacency commands but they seem to be more for variable influence (FV > or FE) and using them seems to throw an error. > It is important to lay out why you want to do this, because it influences the choice of implementation. If you just want to discover topology, then doing it using Cone and Support is fine. This is what GetAdjacency is doing underneath. However, if you want to repeatedly do this, which most people do for dof traversal, then you should make an index. There are two cases of this in Plexx right now, but we could easily have more. GetAdjacency is used to construct the sparsity pattern for the Jacobian, which serves as an index. Also GetClosure is used to construct the closure index which we use when setting values since this is a common operation in FE. What do you want to do with this? Thanks, Matt > Sincerely > Nicholas > > -- > Nicholas Arnold-Medabalimi > > Ph.D. Candidate > Computational Aeroscience Lab > University of Michigan > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From narnoldm at umich.edu Fri Dec 2 06:02:35 2022 From: narnoldm at umich.edu (Nicholas Arnold-Medabalimi) Date: Fri, 2 Dec 2022 07:02:35 -0500 Subject: [petsc-users] DMPlex Traversal In-Reply-To: References: Message-ID: Hi Currently, I am just trying to replace the mesh management in an existing code using DMPlex. So I am distributing the DMPlex and then passing the mesh information and translating the PetscSF information off into the existing MPI solver. So I think using the cone and support should be fine. It does happen repeatedly during the run time but with a different distribution each time so I think GetAdjacency will cover my use case. Thank you for the clarification Sincerely Nicholas On Fri, Dec 2, 2022 at 6:49 AM Matthew Knepley wrote: > On Fri, Dec 2, 2022 at 5:16 AM Nicholas Arnold-Medabalimi < > narnoldm at umich.edu> wrote: > >> Hi Petsc Users >> >> I have become familiar with usage of DMPlexGet(Restore)TransitiveClosure >> as well as general traversal of the graph using DMPlexGetCone and >> GetSupport. >> >> I was wondering if there is a more trivial way to get access to adjacent >> points of the same Stratum. >> >> For example, I have an 2D mesh. I have identified the point value of a >> cell of interest and I would like to get the point value of the surrounding >> cells. So far the way I get that is I descend a level using GetCone and >> then for each constituent point (face in this case) call GetSupport (which >> yields the adjacent cell and the original cell) and discard the original >> cell. >> >> Is there a direct function call that achieves this? I've looked a bit >> into the Adjacency commands but they seem to be more for variable influence >> (FV or FE) and using them seems to throw an error. >> > > It is important to lay out why you want to do this, because it influences > the choice of implementation. > > If you just want to discover topology, then doing it using Cone and > Support is fine. This is what GetAdjacency is doing underneath. > > However, if you want to repeatedly do this, which most people do for > dof traversal, then you should make an index. There are two cases > of this in Plexx right now, but we could easily have more. 
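A minimal sketch of that cone/support neighbor walk for a single cell, in case it helps to see it spelled out (dm and the cell point c are assumed to already exist, and error checking is shortened):

  PetscInt        nfaces, nsupp, f, s;
  const PetscInt *faces, *supp;

  PetscCall(DMPlexGetConeSize(dm, c, &nfaces));
  PetscCall(DMPlexGetCone(dm, c, &faces));              /* faces bounding cell c */
  for (f = 0; f < nfaces; ++f) {
    PetscCall(DMPlexGetSupportSize(dm, faces[f], &nsupp));
    PetscCall(DMPlexGetSupport(dm, faces[f], &supp));   /* cells sharing this face */
    for (s = 0; s < nsupp; ++s) {
      if (supp[s] != c) {
        /* supp[s] is a face-neighbor of c; a physical-boundary face has only
           one support cell, so it contributes no neighbor */
      }
    }
  }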
GetAdjacency is > used to construct the sparsity pattern for the Jacobian, which > serves as an index. Also GetClosure is used to construct the closure index > which we use when setting values since this is a common > operation in FE. > > What do you want to do with this? > > Thanks, > > Matt > > >> Sincerely >> Nicholas >> >> -- >> Nicholas Arnold-Medabalimi >> >> Ph.D. Candidate >> Computational Aeroscience Lab >> University of Michigan >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- Nicholas Arnold-Medabalimi Ph.D. Candidate Computational Aeroscience Lab University of Michigan -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Dec 2 06:24:20 2022 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 2 Dec 2022 07:24:20 -0500 Subject: [petsc-users] [EXTERNAL] Re: Kokkos backend for Mat and Vec diverging when running on CUDA device. In-Reply-To: References: Message-ID: Maybe Philip could narrow this down by using not GMRES/SOR solvers? Try GMRES/jacobi Try bicg/sor If one of those fixes the problem it might help or at least get Philip moving. Mark On Thu, Dec 1, 2022 at 5:06 PM Junchao Zhang wrote: > Hi, Philip, > Sorry for the long delay. I could not get something useful from the > -log_view output. Since I have already built xolotl, could you give me > instructions on how to do a xolotl test to reproduce the divergence with > petsc GPU backends (but fine on CPU)? > Thank you. > --Junchao Zhang > > > On Wed, Nov 16, 2022 at 1:38 PM Fackler, Philip > wrote: > >> ------------------------------------------------------------------ PETSc >> Performance Summary: >> ------------------------------------------------------------------ >> >> Unknown Name on a named PC0115427 with 1 processor, by 4pf Wed Nov 16 >> 14:36:46 2022 >> Using Petsc Development GIT revision: v3.18.1-115-gdca010e0e9a GIT Date: >> 2022-10-28 14:39:41 +0000 >> >> Max Max/Min Avg Total >> Time (sec): 6.023e+00 1.000 6.023e+00 >> Objects: 1.020e+02 1.000 1.020e+02 >> Flops: 1.080e+09 1.000 1.080e+09 1.080e+09 >> Flops/sec: 1.793e+08 1.000 1.793e+08 1.793e+08 >> MPI Msg Count: 0.000e+00 0.000 0.000e+00 0.000e+00 >> MPI Msg Len (bytes): 0.000e+00 0.000 0.000e+00 0.000e+00 >> MPI Reductions: 0.000e+00 0.000 >> >> Flop counting convention: 1 flop = 1 real number operation of type >> (multiply/divide/add/subtract) >> e.g., VecAXPY() for real vectors of length N >> --> 2N flops >> and VecAXPY() for complex vectors of length N >> --> 8N flops >> >> Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages >> --- -- Message Lengths -- -- Reductions -- >> Avg %Total Avg %Total Count >> %Total Avg %Total Count %Total >> 0: Main Stage: 6.0226e+00 100.0% 1.0799e+09 100.0% 0.000e+00 >> 0.0% 0.000e+00 0.0% 0.000e+00 0.0% >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> See the 'Profiling' chapter of the users' manual for details on >> interpreting output. 
>> Phase summary info: >> Count: number of times phase was executed >> Time and Flop: Max - maximum over all processors >> Ratio - ratio of maximum to minimum over all processors >> Mess: number of messages sent >> AvgLen: average message length (bytes) >> Reduct: number of global reductions >> Global: entire computation >> Stage: stages of a computation. Set stages with PetscLogStagePush() >> and PetscLogStagePop(). >> %T - percent time in this phase %F - percent flop in this >> phase >> %M - percent messages in this phase %L - percent message >> lengths in this phase >> %R - percent reductions in this phase >> Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time >> over all processors) >> GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU >> time over all processors) >> CpuToGpu Count: total number of CPU to GPU copies per processor >> CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per >> processor) >> GpuToCpu Count: total number of GPU to CPU copies per processor >> GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per >> processor) >> GPU %F: percent flops on GPU in this event >> >> ------------------------------------------------------------------------------------------------------------------------ >> Event Count Time (sec) Flop >> --- Global --- --- Stage ---- Total >> GPU - CpuToGpu - - GpuToCpu - GPU >> >> Max Ratio Max Ratio Max Ratio Mess AvgLen >> Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> Mflop/s Count Size Count Size %F >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> --------------------------------------- >> >> >> --- Event Stage 0: Main Stage >> >> BuildTwoSided 3 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> DMCreateMat 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> SFSetGraph 3 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> SFSetUp 3 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> SFPack 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> SFUnpack 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> VecDot 190 1.0 nan nan 2.11e+06 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> VecMDot 775 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> VecNorm 1728 1.0 nan nan 1.92e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 2 0 0 0 0 2 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> VecScale 1983 1.0 nan nan 6.24e+06 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> VecCopy 780 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> VecSet 4955 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 2 0 0 0 0 2 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> VecAXPY 190 1.0 nan nan 2.11e+06 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> VecAYPX 597 1.0 nan nan 6.64e+06 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> 
VecAXPBYCZ 643 1.0 nan nan 1.79e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 2 0 0 0 0 2 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> VecWAXPY 502 1.0 nan nan 5.58e+06 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> VecMAXPY 1159 1.0 nan nan 3.68e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 3 0 0 0 0 3 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> VecScatterBegin 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan >> -nan 2 5.14e-03 0 0.00e+00 0 >> >> VecScatterEnd 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> VecReduceArith 380 1.0 nan nan 4.23e+06 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> VecReduceComm 190 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> VecNormalize 965 1.0 nan nan 1.61e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> TSStep 20 1.0 5.8699e+00 1.0 1.08e+09 1.0 0.0e+00 0.0e+00 >> 0.0e+00 97100 0 0 0 97100 0 0 0 184 >> -nan 2 5.14e-03 0 0.00e+00 54 >> >> TSFunctionEval 597 1.0 nan nan 6.64e+06 1.0 0.0e+00 0.0e+00 >> 0.0e+00 63 1 0 0 0 63 1 0 0 0 -nan >> -nan 1 3.36e-04 0 0.00e+00 100 >> >> TSJacobianEval 190 1.0 nan nan 3.37e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 24 3 0 0 0 24 3 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 97 >> >> MatMult 1930 1.0 nan nan 4.46e+08 1.0 0.0e+00 0.0e+00 >> 0.0e+00 1 41 0 0 0 1 41 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> MatMultTranspose 1 1.0 nan nan 3.44e+05 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> MatSolve 965 1.0 nan nan 5.04e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 1 5 0 0 0 1 5 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> MatSOR 965 1.0 nan nan 3.33e+08 1.0 0.0e+00 0.0e+00 >> 0.0e+00 4 31 0 0 0 4 31 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> MatLUFactorSym 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> MatLUFactorNum 190 1.0 nan nan 1.16e+08 1.0 0.0e+00 0.0e+00 >> 0.0e+00 1 11 0 0 0 1 11 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> MatScale 190 1.0 nan nan 3.26e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 3 0 0 0 0 3 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> MatAssemblyBegin 761 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> MatAssemblyEnd 761 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> MatGetRowIJ 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> MatCreateSubMats 380 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> MatGetOrdering 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> MatZeroEntries 379 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> MatSetPreallCOO 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> MatSetValuesCOO 190 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> KSPSetUp 760 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 
-nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> KSPSolve 190 1.0 5.8052e-01 1.0 9.30e+08 1.0 0.0e+00 0.0e+00 >> 0.0e+00 10 86 0 0 0 10 86 0 0 0 1602 >> -nan 1 4.80e-03 0 0.00e+00 46 >> >> KSPGMRESOrthog 775 1.0 nan nan 2.27e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 1 2 0 0 0 1 2 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> SNESSolve 71 1.0 5.7117e+00 1.0 1.07e+09 1.0 0.0e+00 0.0e+00 >> 0.0e+00 95 99 0 0 0 95 99 0 0 0 188 >> -nan 1 4.80e-03 0 0.00e+00 53 >> >> SNESSetUp 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> SNESFunctionEval 573 1.0 nan nan 2.23e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 60 2 0 0 0 60 2 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> SNESJacobianEval 190 1.0 nan nan 3.37e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 24 3 0 0 0 24 3 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 97 >> >> SNESLineSearch 190 1.0 nan nan 1.05e+08 1.0 0.0e+00 0.0e+00 >> 0.0e+00 53 10 0 0 0 53 10 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> PCSetUp 570 1.0 nan nan 1.16e+08 1.0 0.0e+00 0.0e+00 >> 0.0e+00 2 11 0 0 0 2 11 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> PCApply 965 1.0 nan nan 6.14e+08 1.0 0.0e+00 0.0e+00 >> 0.0e+00 8 57 0 0 0 8 57 0 0 0 -nan >> -nan 1 4.80e-03 0 0.00e+00 19 >> >> KSPSolve_FS_0 965 1.0 nan nan 3.33e+08 1.0 0.0e+00 0.0e+00 >> 0.0e+00 4 31 0 0 0 4 31 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> KSPSolve_FS_1 965 1.0 nan nan 1.66e+08 1.0 0.0e+00 0.0e+00 >> 0.0e+00 2 15 0 0 0 2 15 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> >> --- Event Stage 1: Unknown >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> --------------------------------------- >> >> >> Object Type Creations Destructions. Reports information only >> for process 0. 
>> >> --- Event Stage 0: Main Stage >> >> Container 5 5 >> Distributed Mesh 2 2 >> Index Set 11 11 >> IS L to G Mapping 1 1 >> Star Forest Graph 7 7 >> Discrete System 2 2 >> Weak Form 2 2 >> Vector 49 49 >> TSAdapt 1 1 >> TS 1 1 >> DMTS 1 1 >> SNES 1 1 >> DMSNES 3 3 >> SNESLineSearch 1 1 >> Krylov Solver 4 4 >> DMKSP interface 1 1 >> Matrix 4 4 >> Preconditioner 4 4 >> Viewer 2 1 >> >> --- Event Stage 1: Unknown >> >> >> ======================================================================================================================== >> Average time to get PetscTime(): 3.14e-08 >> #PETSc Option Table entries: >> -log_view >> -log_view_gpu_times >> #End of PETSc Option Table entries >> Compiled without FORTRAN kernels >> Compiled with 64 bit PetscInt >> Compiled with full precision matrices (default) >> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 >> sizeof(PetscScalar) 8 sizeof(PetscInt) 8 >> Configure options: PETSC_DIR=/home/4pf/repos/petsc >> PETSC_ARCH=arch-kokkos-cuda-no-tpls --with-cc=mpicc --with-cxx=mpicxx >> --with-fc=0 --with-cuda --with-debugging=0 --with-shared-libraries >> --prefix=/home/4pf/build/petsc/cuda-no-tpls/install --with-64-bit-indices >> --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 --CUDAOPTFLAGS=-O3 >> --with-kokkos-dir=/home/4pf/build/kokkos/cuda/install >> --with-kokkos-kernels-dir=/home/4pf/build/kokkos-kernels/cuda-no-tpls/install >> >> ----------------------------------------- >> Libraries compiled on 2022-11-01 21:01:08 on PC0115427 >> Machine characteristics: Linux-5.15.0-52-generic-x86_64-with-glibc2.35 >> Using PETSc directory: /home/4pf/build/petsc/cuda-no-tpls/install >> Using PETSc arch: >> ----------------------------------------- >> >> Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas >> -Wno-lto-type-mismatch -Wno-stringop-overflow -fstack-protector >> -fvisibility=hidden -O3 >> ----------------------------------------- >> >> Using include paths: -I/home/4pf/build/petsc/cuda-no-tpls/install/include >> -I/home/4pf/build/kokkos-kernels/cuda-no-tpls/install/include >> -I/home/4pf/build/kokkos/cuda/install/include -I/usr/local/cuda-11.8/include >> ----------------------------------------- >> >> Using C linker: mpicc >> Using libraries: >> -Wl,-rpath,/home/4pf/build/petsc/cuda-no-tpls/install/lib >> -L/home/4pf/build/petsc/cuda-no-tpls/install/lib -lpetsc >> -Wl,-rpath,/home/4pf/build/kokkos-kernels/cuda-no-tpls/install/lib >> -L/home/4pf/build/kokkos-kernels/cuda-no-tpls/install/lib >> -Wl,-rpath,/home/4pf/build/kokkos/cuda/install/lib >> -L/home/4pf/build/kokkos/cuda/install/lib >> -Wl,-rpath,/usr/local/cuda-11.8/lib64 -L/usr/local/cuda-11.8/lib64 >> -L/usr/local/cuda-11.8/lib64/stubs -lkokkoskernels -lkokkoscontainers >> -lkokkoscore -llapack -lblas -lm -lcudart -lnvToolsExt -lcufft -lcublas >> -lcusparse -lcusolver -lcurand -lcuda -lquadmath -lstdc++ -ldl >> ----------------------------------------- >> >> >> >> *Philip Fackler * >> Research Software Engineer, Application Engineering Group >> Advanced Computing Systems Research Section >> Computer Science and Mathematics Division >> *Oak Ridge National Laboratory* >> ------------------------------ >> *From:* Junchao Zhang >> *Sent:* Tuesday, November 15, 2022 13:03 >> *To:* Fackler, Philip >> *Cc:* xolotl-psi-development at lists.sourceforge.net < >> xolotl-psi-development at lists.sourceforge.net>; petsc-users at mcs.anl.gov < >> petsc-users at mcs.anl.gov>; Blondel, Sophie ; Roth, >> Philip >> *Subject:* Re: [EXTERNAL] Re: [petsc-users] Kokkos backend for Mat and >> 
Vec diverging when running on CUDA device. >> >> Can you paste -log_view result so I can see what functions are used? >> >> --Junchao Zhang >> >> >> On Tue, Nov 15, 2022 at 10:24 AM Fackler, Philip >> wrote: >> >> Yes, most (but not all) of our system test cases fail with the >> kokkos/cuda or cuda backends. All of them pass with the CPU-only kokkos >> backend. >> >> >> *Philip Fackler * >> Research Software Engineer, Application Engineering Group >> Advanced Computing Systems Research Section >> Computer Science and Mathematics Division >> *Oak Ridge National Laboratory* >> ------------------------------ >> *From:* Junchao Zhang >> *Sent:* Monday, November 14, 2022 19:34 >> *To:* Fackler, Philip >> *Cc:* xolotl-psi-development at lists.sourceforge.net < >> xolotl-psi-development at lists.sourceforge.net>; petsc-users at mcs.anl.gov < >> petsc-users at mcs.anl.gov>; Blondel, Sophie ; Zhang, >> Junchao ; Roth, Philip >> *Subject:* [EXTERNAL] Re: [petsc-users] Kokkos backend for Mat and Vec >> diverging when running on CUDA device. >> >> Hi, Philip, >> Sorry to hear that. It seems you could run the same code on CPUs but >> not no GPUs (with either petsc/Kokkos backend or petsc/cuda backend, is it >> right? >> >> --Junchao Zhang >> >> >> On Mon, Nov 14, 2022 at 12:13 PM Fackler, Philip via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >> This is an issue I've brought up before (and discussed in-person with >> Richard). I wanted to bring it up again because I'm hitting the limits of >> what I know to do, and I need help figuring this out. >> >> The problem can be reproduced using Xolotl's "develop" branch built >> against a petsc build with kokkos and kokkos-kernels enabled. Then, either >> add the relevant kokkos options to the "petscArgs=" line in the system test >> parameter file(s), or just replace the system test parameter files with the >> ones from the "feature-petsc-kokkos" branch. See here the files that >> begin with "params_system_". >> >> Note that those files use the "kokkos" options, but the problem is >> similar using the corresponding cuda/cusparse options. I've already tried >> building kokkos-kernels with no TPLs and got slightly different results, >> but the same problem. >> >> Any help would be appreciated. >> >> Thanks, >> >> >> *Philip Fackler * >> Research Software Engineer, Application Engineering Group >> Advanced Computing Systems Research Section >> Computer Science and Mathematics Division >> *Oak Ridge National Laboratory* >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From haubnerj at simula.no Fri Dec 2 06:32:28 2022 From: haubnerj at simula.no (Johannes Haubner) Date: Fri, 2 Dec 2022 13:32:28 +0100 Subject: [petsc-users] Action of inverse of Cholesky factors in parallel Message-ID: Given a s.p.d. matrix A and a right-hand-side b, we want to compute (L^T)^{-1}b and L^{-1}b, where A= L L^T is the Cholesky decomposition of A. In serial, we are able to do this using pc_factor_mat_solver_type petsc and using "solveForward" and "solveBackward" (see https://github.com/JohannesHaubner/petsc-cholesky/blob/main/main.py). In parallel, we are not able to do this yet. According to p. 52 https://slepc.upv.es/documentation/slepc.pdf , PETSc's Cholesky factorization is not parallel. However, using "mumps" or "superlu_dist" prevents us from using "solveForward" or "solveBackward". Is there a way of using solveBackward in parallel with a distributed matrix (MPI)? 
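For what it is worth, the serial path that works today looks roughly like this on the C side. This is only a sketch: A is assumed to be an assembled symmetric SeqAIJ matrix, b the given right-hand side, the natural ordering is chosen arbitrarily, and the exact handling of the factor's diagonal in the two triangular solves should be checked against MatSolve().

  Mat           F;
  MatFactorInfo info;
  IS            rowperm, colperm;   /* colperm is unused for Cholesky */
  Vec           y, x;               /* y = L^{-1} b, x = L^{-T} y     */

  PetscCall(VecDuplicate(b, &y));
  PetscCall(VecDuplicate(b, &x));
  PetscCall(MatFactorInfoInitialize(&info));
  PetscCall(MatGetOrdering(A, MATORDERINGNATURAL, &rowperm, &colperm));
  PetscCall(MatGetFactor(A, MATSOLVERPETSC, MAT_FACTOR_CHOLESKY, &F));
  PetscCall(MatCholeskyFactorSymbolic(F, A, rowperm, &info));
  PetscCall(MatCholeskyFactorNumeric(F, A, &info));
  PetscCall(MatForwardSolve(F, b, y));   /* action of L^{-1}  */
  PetscCall(MatBackwardSolve(F, y, x));  /* action of L^{-T}  */

The parallel question above is precisely that these two calls are not available once the factorization comes from MUMPS or SuperLU_DIST.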
From karthikeyan.chockalingam at stfc.ac.uk Fri Dec 2 06:47:55 2022 From: karthikeyan.chockalingam at stfc.ac.uk (Karthikeyan Chockalingam - STFC UKRI) Date: Fri, 2 Dec 2022 12:47:55 +0000 Subject: [petsc-users] Using MatCreateSBAIJ and MatZeroRowsColumns in parallel Message-ID: Hello, I have system matrix which is symmetric and thought I could make use of MatCreateSBAIJ. I don?t understand how to set the blocksize bs. I believe it has taken into account, when having multiple components/dofs per node. (i) Currently, I have only one scalar field to solve so I set bs = 1 as below. Is it correct? ierr = MatCreateSBAIJ(PETSC_COMM_WORLD, 1, PETSC_DECIDE, PETSC_DECIDE, N, N, d_nz, PETSC_NULL, o_nz, PETSC_NULL, &A); CHKERRQ(ierr); ierr = MatCreateVecs(A, &right, &left);CHKERRQ(ierr); (ii) If you multiple components/dofs per node, say 2, is the then block size = 2? (iii) To apply homogenous Dirichlet boundary condition, I make use of MatZeroRowsColumns. It works in serial but while applying in parallel the following error is thrown, [0]PETSC ERROR: No method zerorowscolumns for Mat of type mpisbaij How do I fix it? (iv) Is there performance again, when using MatCreateSBAIJ for large symmetric system matrix? I read there is more communication involved. Best, Karthik. This email and any attachments are intended solely for the use of the named recipients. If you are not the intended recipient you must not use, disclose, copy or distribute this email or any of its attachments and should notify the sender immediately and delete this email from your system. UK Research and Innovation (UKRI) has taken every reasonable precaution to minimise risk of this email or any attachments containing viruses or malware but the recipient should carry out its own virus and malware checks before opening the attachments. UKRI does not accept any liability for any losses or damages which the recipient may sustain due to presence of any viruses. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Fri Dec 2 06:48:42 2022 From: jroman at dsic.upv.es (Jose E. Roman) Date: Fri, 2 Dec 2022 13:48:42 +0100 Subject: [petsc-users] Action of inverse of Cholesky factors in parallel In-Reply-To: References: Message-ID: As far as I know, the solveForward/Backward operations are not implemented for external packages. What are you trying to do? One may be tempted to use the Cholesky decomposition to transform a symmetric-definite generalized eigenvalue problem into a standard eigenvalue problem, as is done e.g. in LAPACK https://netlib.org/lapack/lug/node54.html - But note that in SLEPc you do not need to do this, as EPSSolve() will take care of symmetry using Cholesky without having to apply solveForward/Backward separately. Jose > El 2 dic 2022, a las 13:32, Johannes Haubner escribi?: > > Given a s.p.d. matrix A and a right-hand-side b, we want to compute (L^T)^{-1}b and L^{-1}b, where A= L L^T is the Cholesky decomposition of A. In serial, we are able to do this using pc_factor_mat_solver_type petsc and using "solveForward" and "solveBackward" (see https://github.com/JohannesHaubner/petsc-cholesky/blob/main/main.py). In parallel, we are not able to do this yet. According to p. 52 https://slepc.upv.es/documentation/slepc.pdf , PETSc's Cholesky factorization is not parallel. However, using "mumps" or "superlu_dist" prevents us from using "solveForward" or "solveBackward". Is there a way of using solveBackward in parallel with a distributed matrix (MPI)? 
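On the MatZeroRowsColumns question above: the symmetric elimination of Dirichlet rows and columns does work in parallel for the AIJ format, so one workable sketch looks as follows (N, d_nz, o_nz, nbc, and bcrows are placeholders for the problem-specific sizes and the list of constrained global rows):

  Mat       A;
  Vec       x, b;
  PetscInt  nbc;        /* number of locally owned Dirichlet rows       */
  PetscInt *bcrows;     /* their global indices, filled by the program  */

  PetscCall(MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, N, N,
                         d_nz, NULL, o_nz, NULL, &A));
  PetscCall(MatSetOption(A, MAT_SYMMETRIC, PETSC_TRUE)); /* record symmetry without SBAIJ storage */
  /* ... MatSetValues() assembly, MatAssemblyBegin/End ... */
  PetscCall(MatCreateVecs(A, &x, &b));
  /* zero the rows and columns, put 1.0 on the diagonal, and correct b using
     the boundary values held in x (all zero for a homogeneous condition) */
  PetscCall(MatZeroRowsColumns(A, nbc, bcrows, 1.0, x, b));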
From knepley at gmail.com Fri Dec 2 07:17:06 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 2 Dec 2022 08:17:06 -0500 Subject: [petsc-users] Using MatCreateSBAIJ and MatZeroRowsColumns in parallel In-Reply-To: References: Message-ID: On Fri, Dec 2, 2022 at 7:48 AM Karthikeyan Chockalingam - STFC UKRI via petsc-users wrote: > Hello, > > > > I have system matrix which is symmetric and thought I could make use of MatCreateSBAIJ. > I don?t understand how to set the blocksize bs. I believe it has taken > into account, when having multiple components/dofs per node. > > > > (i) Currently, I have only one scalar field to solve so I set bs > = 1 as below. Is it correct? > > > > ierr = MatCreateSBAIJ(PETSC_COMM_WORLD, 1, PETSC_DECIDE, > PETSC_DECIDE, N, N, d_nz, PETSC_NULL, o_nz, PETSC_NULL, &A); CHKERRQ > (ierr); > > > > ierr = MatCreateVecs(A, &right, &left);CHKERRQ(ierr); > > > > > > (ii) If you multiple components/dofs per node, say 2, is the then > block size = 2? > > > > (iii) To apply homogenous Dirichlet boundary condition, I make use > of MatZeroRowsColumns. It works in serial but while applying in parallel > the following error is thrown, > > > > [0]PETSC ERROR: No method zerorowscolumns for Mat of type mpisbaij > > > > How do I fix it? > > > > (iv) Is there performance again, when using MatCreateSBAIJ > for large symmetric system matrix? I read there is more communication > involved. > For block size 1, there is no improvement. Even for 2 I would say it is small. The SBAIJ matrix will do the same number of flops and use about the same bandwidth, but it will save on storage. Unless storage is a big deal for you, I would use AIJ, get everything working, and profile to see how important various operations are. Thanks, Matt > Best, > > Karthik. > > > > > > This email and any attachments are intended solely for the use of the > named recipients. If you are not the intended recipient you must not use, > disclose, copy or distribute this email or any of its attachments and > should notify the sender immediately and delete this email from your > system. UK Research and Innovation (UKRI) has taken every reasonable > precaution to minimise risk of this email or any attachments containing > viruses or malware but the recipient should carry out its own virus and > malware checks before opening the attachments. UKRI does not accept any > liability for any losses or damages which the recipient may sustain due to > presence of any viruses. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From haubnerj at simula.no Mon Dec 5 01:26:35 2022 From: haubnerj at simula.no (Johannes Haubner) Date: Mon, 5 Dec 2022 08:26:35 +0100 Subject: [petsc-users] Action of inverse of Cholesky factors in parallel In-Reply-To: References: Message-ID: <4EE2BE54-3CFC-4130-B36A-76A31EEC1A12@simula.no> We are trying to write an interface to general purpose optimization solvers for PDE constrained optimal control problems in Hilbert spaces based on PETSc. Ipopt or the scipy implementation of L-BFGS work with the Euclidean inner product. For discretizations of PDE constrained optimization problems in Hilbert spaces one typically wants to work with inner products of the form x^T A x (A s.p.d, A typically not equal to the identity matrix). 
Not working with the correct inner product typically leads to mesh dependent behavior of the optimization algorithm, which can be motivated by considering the optimization problem min_{x in X} 1/2 ||x||^2_X =: f(x) where X denotes a Hilbert space. We choose x_0 != 0 and apply a gradient descent algorithm with step size 1. Case 1: X = R^d, d in N, || x ||_X ^2 = x^T x: It holds nabla f(x) = x. Applying gradient descent with step size 1: x_1 = x_0 - x_0 = 0. Hence, we have convergence to the optimal solution in 1 iteration. Case 2: X = H^1(Omega), Omega subset R^d, d in {2, 3}, (as a specific example for X being a infinite dimensional Hilbert space) || x ||_X ^2 = int x(xi) x(xi) dxi + int nabla x(xi) . nabla x(xi) *dxi Derivative of f(x) is an element of the dual space of X and given by Df(x) (delta x) = int x (xi) delta_x(xi) dx + int nabla x(xi) . nabla delta_x(xi) *dxi for delta_x in X Gradient (Riesz representation of the derivative) is given by nabla f = x Hence, gradient descent with step size 1: x_1 = x_0 - x_0 = 0 Case 3: Discretization of Case 2 using finite elements: f(x) := 1/2 x^T (M + S) x, where M denotes the mass matrix, S the stiffness matrix, and x the degrees of freedom of the discretization Neglecting (by working with the Euclidean inner product) that x encodes a H^1-function and just solving the discretized optimization problem with gradient descent with step size 1 yields nabla f(x) = (M + S) x and x_1 = x_0 - (M + S) x_0 where the condition number of (M + S) depends on the mesh size, i.e. mesh size dependent convergence to optimal solution (and not in 1 iteration). One can either respect the correct inner product (in the example x^T (M + S) x) in the implementation of the optimization algorithms (see e.g. https://github.com/funsim/moola) or by doing a "discrete hack". The latter consists of introducing a auxiliary variable y, applying a linear transformation y --> (L^T)^{-1} y =: x, where (M + S) = L L^T (that is why we need the action of the inverse of the Cholesky factors (L^{-1} y is needed for applying the chain rule when computing the derivative)), and optimizing for y instead of x. Doing this trick gives for the above example: tilde f(y) := f(L^-T y) = 1/2 y^T L^{-1} (M + S) L^{-T} y = 1/2 y^T y and gradient descent in y with step size 1 yields y_1 = y_0 - y_0 = 0. Therefore, also x_1 = L^{-T} y_1 = 0 (i.e., convergence in 1 iteration). Johannes > On 2. Dec 2022, at 13:48, Jose E. Roman wrote: > > As far as I know, the solveForward/Backward operations are not implemented for external packages. > > What are you trying to do? One may be tempted to use the Cholesky decomposition to transform a symmetric-definite generalized eigenvalue problem into a standard eigenvalue problem, as is done e.g. in LAPACK https://netlib.org/lapack/lug/node54.html - But note that in SLEPc you do not need to do this, as EPSSolve() will take care of symmetry using Cholesky without having to apply solveForward/Backward separately. > > Jose > > > > >> El 2 dic 2022, a las 13:32, Johannes Haubner escribi?: >> >> Given a s.p.d. matrix A and a right-hand-side b, we want to compute (L^T)^{-1}b and L^{-1}b, where A= L L^T is the Cholesky decomposition of A. In serial, we are able to do this using pc_factor_mat_solver_type petsc and using "solveForward" and "solveBackward" (see https://github.com/JohannesHaubner/petsc-cholesky/blob/main/main.py). In parallel, we are not able to do this yet. According to p. 
52 https://slepc.upv.es/documentation/slepc.pdf , PETSc's Cholesky factorization is not parallel. However, using "mumps" or "superlu_dist" prevents us from using "solveForward" or "solveBackward". Is there a way of using solveBackward in parallel with a distributed matrix (MPI)? > From yuanxi at advancesoft.jp Mon Dec 5 02:08:17 2022 From: yuanxi at advancesoft.jp (=?UTF-8?B?6KKB54WV?=) Date: Mon, 5 Dec 2022 17:08:17 +0900 Subject: [petsc-users] How to introduce external iterative solver into PETSc Message-ID: Dear PETSc developers, I have my own linear solver and am trying to put it into PETSc as an external solver. Following the implementation of mumps, mkl_cpardiso, supelu etc, I think I should do the follow: 1. Add my solver name into MatSolverType. 2. Register my solver by calling MatSolverTypeRegister to let petsc record the existence of a new solver. The problem is that the above external solvers are all direct solvers, and a MatFactorType parameter should be set to indicate its factorization type, such as LU, QR, or Cholesky. But my solver is an iterative one, that means I cannot specify its MatFactorType. I wish to understand 1. Am I doing it the right way? And if so 2. How to set such parameters as MatFactorType, lufactorsymbolic, lufactornumeric. Many thanks, Yuan Ph.D. in Solid Mechanics -------------- next part -------------- An HTML attachment was scrubbed... URL: From yyang85 at alumni.stanford.edu Mon Dec 5 02:12:39 2022 From: yyang85 at alumni.stanford.edu (Yuyun Yang) Date: Mon, 5 Dec 2022 08:12:39 +0000 Subject: [petsc-users] Example for MatSeqAIJKron? Message-ID: Dear PETSc team, Is there an example for using MatSeqAIJKron? I?m using MatCreate for all matrices in the code, and wonder if I need to switch to MatCreateSeqAIJ in order to use this function? Just want to compute simple Kronecker products of a sparse matrix with an identity matrix. Thank you, Yuyun -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Mon Dec 5 02:34:22 2022 From: pierre at joliv.et (Pierre Jolivet) Date: Mon, 5 Dec 2022 09:34:22 +0100 Subject: [petsc-users] Example for MatSeqAIJKron? In-Reply-To: References: Message-ID: Dear Yuyun, Here is the simple example that I wrote to test the function: https://petsc.org/release/src/mat/tests/ex248.c.html You can stick to MatCreate(), but this will only work if the type of your Mat is indeed MatSeqAIJ. If you need this for other types, let us know. Thanks, Pierre > On 5 Dec 2022, at 9:12 AM, Yuyun Yang wrote: > > Dear PETSc team, > > Is there an example for using MatSeqAIJKron? I?m using MatCreate for all matrices in the code, and wonder if I need to switch to MatCreateSeqAIJ in order to use this function? Just want to compute simple Kronecker products of a sparse matrix with an identity matrix. > > Thank you, > Yuyun -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Mon Dec 5 02:40:07 2022 From: jroman at dsic.upv.es (Jose E. Roman) Date: Mon, 5 Dec 2022 09:40:07 +0100 Subject: [petsc-users] Action of inverse of Cholesky factors in parallel In-Reply-To: <4EE2BE54-3CFC-4130-B36A-76A31EEC1A12@simula.no> References: <4EE2BE54-3CFC-4130-B36A-76A31EEC1A12@simula.no> Message-ID: <9CEDC0D9-8010-404C-B9D1-27B1BD7DEEF1@dsic.upv.es> MUMPS does not support doing the forward or backward solve only. You should ask MUMPS developers to add an option for this, then we would be able to interface it in PETSc. 
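Returning briefly to the MatSeqAIJKron question above, a minimal sketch with an explicitly assembled identity factor (the size n of the identity and the existing SeqAIJ matrix A are illustrative):

  Mat      A, Id, C;   /* C = A kron Id */
  PetscInt i, n = 3;

  PetscCall(MatCreateSeqAIJ(PETSC_COMM_SELF, n, n, 1, NULL, &Id));
  for (i = 0; i < n; ++i) PetscCall(MatSetValue(Id, i, i, 1.0, INSERT_VALUES));
  PetscCall(MatAssemblyBegin(Id, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(Id, MAT_FINAL_ASSEMBLY));
  PetscCall(MatSeqAIJKron(A, Id, MAT_INITIAL_MATRIX, &C));
  /* later updates with the same sparsity can reuse C via MAT_REUSE_MATRIX */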
Regarding the other Cholesky options (see https://petsc.org/release/overview/linear_solve_table/), PaStiX has the same problem as MUMPS; CHOLMOD is sequential; SuperLU_DIST implements LU so it would not be helpful for you if the forward/backward solves were available. If you want to go the inner-product route, SLEPc provides some support via its BV object, including basis orthogonalization, see https://slepc.upv.es/documentation/current/docs/manualpages/BV/BVSetMatrix.html Jose > El 5 dic 2022, a las 8:26, Johannes Haubner escribi?: > > We are trying to write an interface to general purpose optimization solvers for PDE constrained optimal control problems in Hilbert spaces based on PETSc. Ipopt or the scipy implementation of L-BFGS work with the Euclidean inner product. For discretizations of PDE constrained optimization problems in Hilbert spaces one typically wants to work with inner products of the form x^T A x (A s.p.d, A typically not equal to the identity matrix). > > Not working with the correct inner product typically leads to mesh dependent behavior of the optimization algorithm, which can be motivated by considering the optimization problem > > min_{x in X} 1/2 ||x||^2_X =: f(x) > > where X denotes a Hilbert space. We choose x_0 != 0 and apply a gradient descent algorithm with step size 1. > > Case 1: X = R^d, d in N, || x ||_X ^2 = x^T x: > It holds nabla f(x) = x. > Applying gradient descent with step size 1: x_1 = x_0 - x_0 = 0. > Hence, we have convergence to the optimal solution in 1 iteration. > > Case 2: X = H^1(Omega), Omega subset R^d, d in {2, 3}, > (as a specific example for X being a infinite dimensional Hilbert space) > || x ||_X ^2 = int x(xi) x(xi) dxi + int nabla x(xi) . nabla x(xi) *dxi > Derivative of f(x) is an element of the dual space of X and given by > Df(x) (delta x) = int x (xi) delta_x(xi) dx + int nabla x(xi) . nabla delta_x(xi) *dxi > for delta_x in X > Gradient (Riesz representation of the derivative) is given by nabla f = x > Hence, gradient descent with step size 1: x_1 = x_0 - x_0 = 0 > > Case 3: Discretization of Case 2 using finite elements: > f(x) := 1/2 x^T (M + S) x, where M denotes the mass matrix, S the stiffness matrix, > and x the degrees of freedom of the discretization > Neglecting (by working with the Euclidean inner product) that x encodes a H^1-function > and just solving the discretized optimization problem with gradient descent with step size 1 yields > nabla f(x) = (M + S) x > and x_1 = x_0 - (M + S) x_0 > where the condition number of (M + S) depends on the mesh size, > i.e. mesh size dependent convergence to optimal solution (and not in 1 iteration). > > One can either respect the correct inner product (in the example x^T (M + S) x) in the implementation of the optimization algorithms (see e.g. https://github.com/funsim/moola) or by doing a "discrete hack". The latter consists of introducing a auxiliary variable y, applying a linear transformation y --> (L^T)^{-1} y =: x, where (M + S) = L L^T (that is why we need the action of the inverse of the Cholesky factors (L^{-1} y is needed for applying the chain rule when computing the derivative)), and optimizing for y instead of x. > > Doing this trick gives for the above example: > tilde f(y) := f(L^-T y) = 1/2 y^T L^{-1} (M + S) L^{-T} y = 1/2 y^T y > and gradient descent in y with step size 1 yields y_1 = y_0 - y_0 = 0. > Therefore, also x_1 = L^{-T} y_1 = 0 (i.e., convergence in 1 iteration). > > Johannes > >> On 2. Dec 2022, at 13:48, Jose E. 
Roman wrote: >> >> As far as I know, the solveForward/Backward operations are not implemented for external packages. >> >> What are you trying to do? One may be tempted to use the Cholesky decomposition to transform a symmetric-definite generalized eigenvalue problem into a standard eigenvalue problem, as is done e.g. in LAPACK https://netlib.org/lapack/lug/node54.html - But note that in SLEPc you do not need to do this, as EPSSolve() will take care of symmetry using Cholesky without having to apply solveForward/Backward separately. >> >> Jose >> >> >> >> >>> El 2 dic 2022, a las 13:32, Johannes Haubner escribi?: >>> >>> Given a s.p.d. matrix A and a right-hand-side b, we want to compute (L^T)^{-1}b and L^{-1}b, where A= L L^T is the Cholesky decomposition of A. In serial, we are able to do this using pc_factor_mat_solver_type petsc and using "solveForward" and "solveBackward" (see https://github.com/JohannesHaubner/petsc-cholesky/blob/main/main.py). In parallel, we are not able to do this yet. According to p. 52 https://slepc.upv.es/documentation/slepc.pdf , PETSc's Cholesky factorization is not parallel. However, using "mumps" or "superlu_dist" prevents us from using "solveForward" or "solveBackward". Is there a way of using solveBackward in parallel with a distributed matrix (MPI)? >> > From yyang85 at alumni.stanford.edu Mon Dec 5 02:48:35 2022 From: yyang85 at alumni.stanford.edu (Yuyun Yang) Date: Mon, 5 Dec 2022 08:48:35 +0000 Subject: [petsc-users] Example for MatSeqAIJKron? In-Reply-To: References: Message-ID: Great, thank you! From: Pierre Jolivet Date: Monday, December 5, 2022 at 4:34 PM To: Yuyun Yang Cc: petsc-users Subject: Re: [petsc-users] Example for MatSeqAIJKron? Dear Yuyun, Here is the simple example that I wrote to test the function: https://petsc.org/release/src/mat/tests/ex248.c.html You can stick to MatCreate(), but this will only work if the type of your Mat is indeed MatSeqAIJ. If you need this for other types, let us know. Thanks, Pierre On 5 Dec 2022, at 9:12 AM, Yuyun Yang wrote: Dear PETSc team, Is there an example for using MatSeqAIJKron? I?m using MatCreate for all matrices in the code, and wonder if I need to switch to MatCreateSeqAIJ in order to use this function? Just want to compute simple Kronecker products of a sparse matrix with an identity matrix. Thank you, Yuyun -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksi2443 at gmail.com Mon Dec 5 05:47:59 2022 From: ksi2443 at gmail.com (=?UTF-8?B?6rmA7ISx7J21?=) Date: Mon, 5 Dec 2022 20:47:59 +0900 Subject: [petsc-users] About matrix assembly Message-ID: Hello, In matrix preallocation procedure, I tried 2 options to preallocate global matrix. The first is ?MatSeqAIJSetPreallocation? and the second is ?MatSetPreallocationCOO?. When I adopt the first option ?MatSeqAIJSetPreallocation(Mat, nz, nnz)?, I just put overestimated nz for getting enough memory space and also getting nice performance. However, It couldn?t run without ?MatSetOption(Mat, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE);? And also, there are no speed up compare with no preallocation case. When in the second option ?MatSetPreallocationCOO(Mat,ncoo,coo_i,coo_j)?, I put correct size parameters(ncoo, coo_i, coo_j). However, It couldn?t run without ?MatSetOption(Mat, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE);? 
Regarding this problem, I suspect that it is a problem caused by mapping a small-sized local matrix with a different order from the order of coo_i and coo_j to the global matrix by using ?matsetvalue?. And also, there are no speed up compare with no preallocation case. 1. How can I do proper preallocation procedure? 2. Why in my cases there are no speed up? Thanks, Hyung Kim -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Dec 5 07:11:07 2022 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 5 Dec 2022 08:11:07 -0500 Subject: [petsc-users] How to introduce external iterative solver into PETSc In-Reply-To: References: Message-ID: On Mon, Dec 5, 2022 at 3:08 AM ?? wrote: > Dear PETSc developers, > > I have my own linear solver and am trying to put it into PETSc as an > external solver. Following the implementation of mumps, mkl_cpardiso, > supelu etc, I think I should do the follow: > > 1. Add my solver name into MatSolverType. > 2. Register my solver by calling MatSolverTypeRegister > > to let petsc record the existence of a new solver. The problem is that the > above external solvers are all direct solvers, and a MatFactorType > parameter should be set to indicate its factorization type, such as LU, QR, > or Cholesky. But my solver is an iterative one, that means I cannot specify > its MatFactorType. I wish to understand > > 1. Am I doing it the right way? And if so > 2. How to set such parameters as MatFactorType, > lufactorsymbolic, lufactornumeric. > This was a misunderstanding. If it is an iterative solver, you can follow the template for Jacobi: https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/pc/impls/jacobi/jacobi.c There are comments throughout this file showing you how to register your own preconditioner. You should not need to alter PETSc source. I have done this myself with the BAMG preconditioner. Thanks, Matt > Many thanks, > > Yuan > Ph.D. in Solid Mechanics > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Dec 5 07:16:31 2022 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 5 Dec 2022 08:16:31 -0500 Subject: [petsc-users] About matrix assembly In-Reply-To: References: Message-ID: On Mon, Dec 5, 2022 at 6:48 AM ??? wrote: > Hello, > > In matrix preallocation procedure, I tried 2 options to preallocate global > matrix. > The first is ?MatSeqAIJSetPreallocation? and the second is > ?MatSetPreallocationCOO?. > > > > When I adopt the first option ?MatSeqAIJSetPreallocation(Mat, nz, nnz)?, I > just put overestimated nz for getting enough memory space and also getting > nice performance. > > However, It couldn?t run without ?MatSetOption(Mat, > MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE);? And also, there are no speed > up compare with no preallocation case. > 1. This means your nz was not big enough. 2. I suggest computing the correct length for every row. The easiest way to do this is to use https://petsc.org/main/docs/manualpages/Mat/MatPreallocatorPreallocate/ There re instructions on that page. This is how I do it in the library now. > > > When in the second option ?MatSetPreallocationCOO(Mat,ncoo,coo_i,coo_j)?, > I put correct size parameters(ncoo, coo_i, coo_j). 
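A bare-bones sketch of that two-pass MATPREALLOCATOR pattern, in case it is useful (N is a placeholder for the global size, and the two "..." loops must be the identical MatSetValues() loop):

  Mat prealloc, A;

  /* pass 1: run the insertion loop on a MATPREALLOCATOR matrix, which only counts */
  PetscCall(MatCreate(PETSC_COMM_WORLD, &prealloc));
  PetscCall(MatSetSizes(prealloc, PETSC_DECIDE, PETSC_DECIDE, N, N));
  PetscCall(MatSetType(prealloc, MATPREALLOCATOR));
  PetscCall(MatSetUp(prealloc));
  /* ... same MatSetValues() calls as the real assembly ... */
  PetscCall(MatAssemblyBegin(prealloc, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(prealloc, MAT_FINAL_ASSEMBLY));

  /* pass 2: let it preallocate the real matrix, then assemble for real */
  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, N, N));
  PetscCall(MatSetType(A, MATAIJ));
  PetscCall(MatPreallocatorPreallocate(prealloc, PETSC_TRUE, A));
  PetscCall(MatDestroy(&prealloc));
  /* ... identical MatSetValues() calls on A, then MatAssemblyBegin/End ... */

With an exact preallocation like this, no MAT_NEW_NONZERO_ALLOCATION_ERR override should be needed.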
However, It couldn?t run > without ?MatSetOption(Mat, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE);? > Regarding this problem, I suspect that it is a problem caused by mapping a > small-sized local matrix with a different order from the order of coo_i and > coo_j to the global matrix by using ?matsetvalue?. And also, there are no > speed up compare with no preallocation case. > If you cannot run without that flag, it means that you are inserting different nonzeros then you did with MatSetPreallocationCOO(). You need to tell it _exactly_ the same nonzeros as you input with MatSetValues(). Thanks, Matt > 1. How can I do proper preallocation procedure? > > 2. Why in my cases there are no speed up? > > > > Thanks, > > Hyung Kim > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Mon Dec 5 09:01:50 2022 From: mfadams at lbl.gov (Mark Adams) Date: Mon, 5 Dec 2022 10:01:50 -0500 Subject: [petsc-users] About matrix assembly In-Reply-To: References: Message-ID: I don't think you understand COO. Look at the example https://petsc.org/release/src/mat/tutorials/ex18.c.html "MatSetPreallocationCOO" is not a great name because you are giving it the (i,j) indices not just the sizes for memory allocation. A MatSetPreallocationCOO that is consistent with MatSeqAIJSetPreallocation would just take one integer: "ncoo" . MatSetIndicesCOO might be a better name. Mark On Mon, Dec 5, 2022 at 8:16 AM Matthew Knepley wrote: > On Mon, Dec 5, 2022 at 6:48 AM ??? wrote: > >> Hello, >> >> In matrix preallocation procedure, I tried 2 options to preallocate >> global matrix. >> The first is ?MatSeqAIJSetPreallocation? and the second is >> ?MatSetPreallocationCOO?. >> >> >> >> When I adopt the first option ?MatSeqAIJSetPreallocation(Mat, nz, nnz)?, >> I just put overestimated nz for getting enough memory space and also >> getting nice performance. >> >> However, It couldn?t run without ?MatSetOption(Mat, >> MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE);? And also, there are no speed >> up compare with no preallocation case. >> > 1. This means your nz was not big enough. > > 2. I suggest computing the correct length for every row. The easiest way > to do this is to use > > https://petsc.org/main/docs/manualpages/Mat/MatPreallocatorPreallocate/ > > There re instructions on that page. This is how I do it in the library now. > > >> >> >> When in the second option ?MatSetPreallocationCOO(Mat,ncoo,coo_i,coo_j)?, >> I put correct size parameters(ncoo, coo_i, coo_j). However, It couldn?t run >> without ?MatSetOption(Mat, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE);? >> Regarding this problem, I suspect that it is a problem caused by mapping a >> small-sized local matrix with a different order from the order of coo_i and >> coo_j to the global matrix by using ?matsetvalue?. And also, there are no >> speed up compare with no preallocation case. >> > > If you cannot run without that flag, it means that you are inserting > different nonzeros then you did with MatSetPreallocationCOO(). You need to > tell > it _exactly_ the same nonzeros as you input with MatSetValues(). > > Thanks, > > Matt > > >> 1. How can I do proper preallocation procedure? >> >> 2. Why in my cases there are no speed up? 
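In code, the COO pair then looks roughly like this (ncoo, coo_i, coo_j, and vals are placeholders built by the element loop; repeated (i,j) entries in the list are summed):

  /* once: the full list of (i,j) pairs, in the order the values will be supplied */
  PetscCall(MatSetPreallocationCOO(A, ncoo, coo_i, coo_j));
  /* every assembly: compute vals[k] for entry (coo_i[k], coo_j[k]) and hand them over in one call */
  PetscCall(MatSetValuesCOO(A, vals, INSERT_VALUES));   /* or ADD_VALUES */

Note that the COO interface is designed to be fed its values through MatSetValuesCOO() rather than through a MatSetValue() loop.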
>> >> >> >> Thanks, >> >> Hyung Kim >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From facklerpw at ornl.gov Mon Dec 5 11:08:03 2022 From: facklerpw at ornl.gov (Fackler, Philip) Date: Mon, 5 Dec 2022 17:08:03 +0000 Subject: [petsc-users] [EXTERNAL] Re: Kokkos backend for Mat and Vec diverging when running on CUDA device. In-Reply-To: References: Message-ID: Junchao, Thank you for working on this. If you open the parameter file for, say, the PSI_2 system test case (benchmarks/params_system_PSI_2.txt), simply add -dm_mat_type aijkokkos -dm_vec_type kokkos?` to the "petscArgs=" field (or the corresponding cusparse/cuda option). Thanks, Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory ________________________________ From: Junchao Zhang Sent: Thursday, December 1, 2022 17:05 To: Fackler, Philip Cc: xolotl-psi-development at lists.sourceforge.net ; petsc-users at mcs.anl.gov ; Blondel, Sophie ; Roth, Philip Subject: Re: [EXTERNAL] Re: [petsc-users] Kokkos backend for Mat and Vec diverging when running on CUDA device. Hi, Philip, Sorry for the long delay. I could not get something useful from the -log_view output. Since I have already built xolotl, could you give me instructions on how to do a xolotl test to reproduce the divergence with petsc GPU backends (but fine on CPU)? Thank you. --Junchao Zhang On Wed, Nov 16, 2022 at 1:38 PM Fackler, Philip > wrote: ------------------------------------------------------------------ PETSc Performance Summary: ------------------------------------------------------------------ Unknown Name on a named PC0115427 with 1 processor, by 4pf Wed Nov 16 14:36:46 2022 Using Petsc Development GIT revision: v3.18.1-115-gdca010e0e9a GIT Date: 2022-10-28 14:39:41 +0000 Max Max/Min Avg Total Time (sec): 6.023e+00 1.000 6.023e+00 Objects: 1.020e+02 1.000 1.020e+02 Flops: 1.080e+09 1.000 1.080e+09 1.080e+09 Flops/sec: 1.793e+08 1.000 1.793e+08 1.793e+08 MPI Msg Count: 0.000e+00 0.000 0.000e+00 0.000e+00 MPI Msg Len (bytes): 0.000e+00 0.000 0.000e+00 0.000e+00 MPI Reductions: 0.000e+00 0.000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total Count %Total Avg %Total Count %Total 0: Main Stage: 6.0226e+00 100.0% 1.0799e+09 100.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent AvgLen: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors) CpuToGpu Count: total number of CPU to GPU copies per processor CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor) GpuToCpu Count: total number of GPU to CPU copies per processor GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor) GPU %F: percent flops on GPU in this event ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F ------------------------------------------------------------------------------------------------------------------------ --------------------------------------- --- Event Stage 0: Main Stage BuildTwoSided 3 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 DMCreateMat 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 SFSetGraph 3 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 SFSetUp 3 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 SFPack 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 SFUnpack 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecDot 190 1.0 nan nan 2.11e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecMDot 775 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecNorm 1728 1.0 nan nan 1.92e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecScale 1983 1.0 nan nan 6.24e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecCopy 780 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecSet 4955 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecAXPY 190 1.0 nan nan 2.11e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecAYPX 597 1.0 nan nan 6.64e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecAXPBYCZ 643 1.0 nan nan 1.79e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecWAXPY 502 1.0 nan nan 5.58e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecMAXPY 1159 1.0 nan nan 3.68e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecScatterBegin 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan -nan 2 5.14e-03 0 0.00e+00 0 VecScatterEnd 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecReduceArith 380 1.0 nan nan 4.23e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan 
-nan 0 0.00e+00 0 0.00e+00 100 VecReduceComm 190 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecNormalize 965 1.0 nan nan 1.61e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 TSStep 20 1.0 5.8699e+00 1.0 1.08e+09 1.0 0.0e+00 0.0e+00 0.0e+00 97100 0 0 0 97100 0 0 0 184 -nan 2 5.14e-03 0 0.00e+00 54 TSFunctionEval 597 1.0 nan nan 6.64e+06 1.0 0.0e+00 0.0e+00 0.0e+00 63 1 0 0 0 63 1 0 0 0 -nan -nan 1 3.36e-04 0 0.00e+00 100 TSJacobianEval 190 1.0 nan nan 3.37e+07 1.0 0.0e+00 0.0e+00 0.0e+00 24 3 0 0 0 24 3 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 97 MatMult 1930 1.0 nan nan 4.46e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 41 0 0 0 1 41 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 MatMultTranspose 1 1.0 nan nan 3.44e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 MatSolve 965 1.0 nan nan 5.04e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatSOR 965 1.0 nan nan 3.33e+08 1.0 0.0e+00 0.0e+00 0.0e+00 4 31 0 0 0 4 31 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatLUFactorSym 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatLUFactorNum 190 1.0 nan nan 1.16e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 11 0 0 0 1 11 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatScale 190 1.0 nan nan 3.26e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 MatAssemblyBegin 761 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatAssemblyEnd 761 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatGetRowIJ 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatCreateSubMats 380 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatGetOrdering 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatZeroEntries 379 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatSetPreallCOO 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatSetValuesCOO 190 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 KSPSetUp 760 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 KSPSolve 190 1.0 5.8052e-01 1.0 9.30e+08 1.0 0.0e+00 0.0e+00 0.0e+00 10 86 0 0 0 10 86 0 0 0 1602 -nan 1 4.80e-03 0 0.00e+00 46 KSPGMRESOrthog 775 1.0 nan nan 2.27e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 2 0 0 0 1 2 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 SNESSolve 71 1.0 5.7117e+00 1.0 1.07e+09 1.0 0.0e+00 0.0e+00 0.0e+00 95 99 0 0 0 95 99 0 0 0 188 -nan 1 4.80e-03 0 0.00e+00 53 SNESSetUp 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 SNESFunctionEval 573 1.0 nan nan 2.23e+07 1.0 0.0e+00 0.0e+00 0.0e+00 60 2 0 0 0 60 2 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 SNESJacobianEval 190 1.0 nan nan 3.37e+07 1.0 0.0e+00 0.0e+00 0.0e+00 24 3 0 0 0 24 3 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 97 SNESLineSearch 190 1.0 nan nan 1.05e+08 1.0 0.0e+00 0.0e+00 0.0e+00 53 10 0 0 0 53 10 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 PCSetUp 570 1.0 nan nan 1.16e+08 1.0 0.0e+00 0.0e+00 0.0e+00 2 11 0 0 0 2 11 0 0 0 -nan -nan 0 
0.00e+00 0 0.00e+00 0 PCApply 965 1.0 nan nan 6.14e+08 1.0 0.0e+00 0.0e+00 0.0e+00 8 57 0 0 0 8 57 0 0 0 -nan -nan 1 4.80e-03 0 0.00e+00 19 KSPSolve_FS_0 965 1.0 nan nan 3.33e+08 1.0 0.0e+00 0.0e+00 0.0e+00 4 31 0 0 0 4 31 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 KSPSolve_FS_1 965 1.0 nan nan 1.66e+08 1.0 0.0e+00 0.0e+00 0.0e+00 2 15 0 0 0 2 15 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 --- Event Stage 1: Unknown ------------------------------------------------------------------------------------------------------------------------ --------------------------------------- Object Type Creations Destructions. Reports information only for process 0. --- Event Stage 0: Main Stage Container 5 5 Distributed Mesh 2 2 Index Set 11 11 IS L to G Mapping 1 1 Star Forest Graph 7 7 Discrete System 2 2 Weak Form 2 2 Vector 49 49 TSAdapt 1 1 TS 1 1 DMTS 1 1 SNES 1 1 DMSNES 3 3 SNESLineSearch 1 1 Krylov Solver 4 4 DMKSP interface 1 1 Matrix 4 4 Preconditioner 4 4 Viewer 2 1 --- Event Stage 1: Unknown ======================================================================================================================== Average time to get PetscTime(): 3.14e-08 #PETSc Option Table entries: -log_view -log_view_gpu_times #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with 64 bit PetscInt Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8 Configure options: PETSC_DIR=/home/4pf/repos/petsc PETSC_ARCH=arch-kokkos-cuda-no-tpls --with-cc=mpicc --with-cxx=mpicxx --with-fc=0 --with-cuda --with-debugging=0 --with-shared-libraries --prefix=/home/4pf/build/petsc/cuda-no-tpls/install --with-64-bit-indices --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 --CUDAOPTFLAGS=-O3 --with-kokkos-dir=/home/4pf/build/kokkos/cuda/install --with-kokkos-kernels-dir=/home/4pf/build/kokkos-kernels/cuda-no-tpls/install ----------------------------------------- Libraries compiled on 2022-11-01 21:01:08 on PC0115427 Machine characteristics: Linux-5.15.0-52-generic-x86_64-with-glibc2.35 Using PETSc directory: /home/4pf/build/petsc/cuda-no-tpls/install Using PETSc arch: ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-stringop-overflow -fstack-protector -fvisibility=hidden -O3 ----------------------------------------- Using include paths: -I/home/4pf/build/petsc/cuda-no-tpls/install/include -I/home/4pf/build/kokkos-kernels/cuda-no-tpls/install/include -I/home/4pf/build/kokkos/cuda/install/include -I/usr/local/cuda-11.8/include ----------------------------------------- Using C linker: mpicc Using libraries: -Wl,-rpath,/home/4pf/build/petsc/cuda-no-tpls/install/lib -L/home/4pf/build/petsc/cuda-no-tpls/install/lib -lpetsc -Wl,-rpath,/home/4pf/build/kokkos-kernels/cuda-no-tpls/install/lib -L/home/4pf/build/kokkos-kernels/cuda-no-tpls/install/lib -Wl,-rpath,/home/4pf/build/kokkos/cuda/install/lib -L/home/4pf/build/kokkos/cuda/install/lib -Wl,-rpath,/usr/local/cuda-11.8/lib64 -L/usr/local/cuda-11.8/lib64 -L/usr/local/cuda-11.8/lib64/stubs -lkokkoskernels -lkokkoscontainers -lkokkoscore -llapack -lblas -lm -lcudart -lnvToolsExt -lcufft -lcublas -lcusparse -lcusolver -lcurand -lcuda -lquadmath -lstdc++ -ldl ----------------------------------------- Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory 
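For concreteness, the change Philip describes at the top of this message would look roughly like this in a system-test parameter file; the placeholder stands for whatever PETSc options the line already contains, which are not shown in the thread:

petscArgs=<existing PETSc options> -dm_mat_type aijkokkos -dm_vec_type kokkos

or, for the CUDA-native backend,

petscArgs=<existing PETSc options> -dm_mat_type aijcusparse -dm_vec_type cuda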
________________________________ From: Junchao Zhang > Sent: Tuesday, November 15, 2022 13:03 To: Fackler, Philip > Cc: xolotl-psi-development at lists.sourceforge.net >; petsc-users at mcs.anl.gov >; Blondel, Sophie >; Roth, Philip > Subject: Re: [EXTERNAL] Re: [petsc-users] Kokkos backend for Mat and Vec diverging when running on CUDA device. Can you paste -log_view result so I can see what functions are used? --Junchao Zhang On Tue, Nov 15, 2022 at 10:24 AM Fackler, Philip > wrote: Yes, most (but not all) of our system test cases fail with the kokkos/cuda or cuda backends. All of them pass with the CPU-only kokkos backend. Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory ________________________________ From: Junchao Zhang > Sent: Monday, November 14, 2022 19:34 To: Fackler, Philip > Cc: xolotl-psi-development at lists.sourceforge.net >; petsc-users at mcs.anl.gov >; Blondel, Sophie >; Zhang, Junchao >; Roth, Philip > Subject: [EXTERNAL] Re: [petsc-users] Kokkos backend for Mat and Vec diverging when running on CUDA device. Hi, Philip, Sorry to hear that. It seems you could run the same code on CPUs but not no GPUs (with either petsc/Kokkos backend or petsc/cuda backend, is it right? --Junchao Zhang On Mon, Nov 14, 2022 at 12:13 PM Fackler, Philip via petsc-users > wrote: This is an issue I've brought up before (and discussed in-person with Richard). I wanted to bring it up again because I'm hitting the limits of what I know to do, and I need help figuring this out. The problem can be reproduced using Xolotl's "develop" branch built against a petsc build with kokkos and kokkos-kernels enabled. Then, either add the relevant kokkos options to the "petscArgs=" line in the system test parameter file(s), or just replace the system test parameter files with the ones from the "feature-petsc-kokkos" branch. See here the files that begin with "params_system_". Note that those files use the "kokkos" options, but the problem is similar using the corresponding cuda/cusparse options. I've already tried building kokkos-kernels with no TPLs and got slightly different results, but the same problem. Any help would be appreciated. Thanks, Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: From rajesh.singh at pnnl.gov Mon Dec 5 14:22:34 2022 From: rajesh.singh at pnnl.gov (Singh, Rajesh K) Date: Mon, 5 Dec 2022 20:22:34 +0000 Subject: [petsc-users] Installation With MSYS2 and MinGW Compilers Message-ID: Dear All: I am having diffisulty to install PETSC on the window system. I went through the the steps esplained in the web site and got following error. [cid:image001.png at 01D908A4.407860F0] Help for resolving this issue would be really appricaited. Thanks, Rajesh -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 47894 bytes Desc: image001.png URL: From junchao.zhang at gmail.com Mon Dec 5 14:22:59 2022 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Mon, 5 Dec 2022 14:22:59 -0600 Subject: [petsc-users] [EXTERNAL] Re: Kokkos backend for Mat and Vec diverging when running on CUDA device. 
In-Reply-To: References: Message-ID: Hello, Philip, Do I still need to use the feature-petsc-kokkos branch? --Junchao Zhang On Mon, Dec 5, 2022 at 11:08 AM Fackler, Philip wrote: > Junchao, > > Thank you for working on this. If you open the parameter file for, say, > the PSI_2 system test case (benchmarks/params_system_PSI_2.txt), simply add -dm_mat_type > aijkokkos -dm_vec_type kokkos?` to the "petscArgs=" field (or the > corresponding cusparse/cuda option). > > Thanks, > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > ------------------------------ > *From:* Junchao Zhang > *Sent:* Thursday, December 1, 2022 17:05 > *To:* Fackler, Philip > *Cc:* xolotl-psi-development at lists.sourceforge.net < > xolotl-psi-development at lists.sourceforge.net>; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov>; Blondel, Sophie ; Roth, > Philip > *Subject:* Re: [EXTERNAL] Re: [petsc-users] Kokkos backend for Mat and > Vec diverging when running on CUDA device. > > Hi, Philip, > Sorry for the long delay. I could not get something useful from the > -log_view output. Since I have already built xolotl, could you give me > instructions on how to do a xolotl test to reproduce the divergence with > petsc GPU backends (but fine on CPU)? > Thank you. > --Junchao Zhang > > > On Wed, Nov 16, 2022 at 1:38 PM Fackler, Philip > wrote: > > ------------------------------------------------------------------ PETSc > Performance Summary: > ------------------------------------------------------------------ > > Unknown Name on a named PC0115427 with 1 processor, by 4pf Wed Nov 16 > 14:36:46 2022 > Using Petsc Development GIT revision: v3.18.1-115-gdca010e0e9a GIT Date: > 2022-10-28 14:39:41 +0000 > > Max Max/Min Avg Total > Time (sec): 6.023e+00 1.000 6.023e+00 > Objects: 1.020e+02 1.000 1.020e+02 > Flops: 1.080e+09 1.000 1.080e+09 1.080e+09 > Flops/sec: 1.793e+08 1.000 1.793e+08 1.793e+08 > MPI Msg Count: 0.000e+00 0.000 0.000e+00 0.000e+00 > MPI Msg Len (bytes): 0.000e+00 0.000 0.000e+00 0.000e+00 > MPI Reductions: 0.000e+00 0.000 > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N > --> 2N flops > and VecAXPY() for complex vectors of length N > --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages > --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total Count > %Total Avg %Total Count %Total > 0: Main Stage: 6.0226e+00 100.0% 1.0799e+09 100.0% 0.000e+00 > 0.0% 0.000e+00 0.0% 0.000e+00 0.0% > > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flop: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > AvgLen: average message length (bytes) > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and > PetscLogStagePop(). 
> %T - percent time in this phase %F - percent flop in this > phase > %M - percent messages in this phase %L - percent message lengths > in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over > all processors) > GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU > time over all processors) > CpuToGpu Count: total number of CPU to GPU copies per processor > CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per > processor) > GpuToCpu Count: total number of GPU to CPU copies per processor > GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per > processor) > GPU %F: percent flops on GPU in this event > > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flop > --- Global --- --- Stage ---- Total > GPU - CpuToGpu - - GpuToCpu - GPU > > Max Ratio Max Ratio Max Ratio Mess AvgLen > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > Mflop/s Count Size Count Size %F > > > ------------------------------------------------------------------------------------------------------------------------ > --------------------------------------- > > > --- Event Stage 0: Main Stage > > BuildTwoSided 3 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > DMCreateMat 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > SFSetGraph 3 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > SFSetUp 3 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > SFPack 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > SFUnpack 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > VecDot 190 1.0 nan nan 2.11e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > VecMDot 775 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > VecNorm 1728 1.0 nan nan 1.92e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 2 0 0 0 0 2 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > VecScale 1983 1.0 nan nan 6.24e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > VecCopy 780 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > VecSet 4955 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 2 0 0 0 0 2 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > VecAXPY 190 1.0 nan nan 2.11e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > VecAYPX 597 1.0 nan nan 6.64e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > VecAXPBYCZ 643 1.0 nan nan 1.79e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 2 0 0 0 0 2 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > VecWAXPY 502 1.0 nan nan 5.58e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > VecMAXPY 1159 1.0 nan nan 3.68e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 3 0 0 0 0 3 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > VecScatterBegin 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan > -nan 2 5.14e-03 0 0.00e+00 0 > > 
VecScatterEnd 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > VecReduceArith 380 1.0 nan nan 4.23e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > VecReduceComm 190 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > VecNormalize 965 1.0 nan nan 1.61e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > TSStep 20 1.0 5.8699e+00 1.0 1.08e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 97100 0 0 0 97100 0 0 0 184 > -nan 2 5.14e-03 0 0.00e+00 54 > > TSFunctionEval 597 1.0 nan nan 6.64e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 63 1 0 0 0 63 1 0 0 0 -nan > -nan 1 3.36e-04 0 0.00e+00 100 > > TSJacobianEval 190 1.0 nan nan 3.37e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 24 3 0 0 0 24 3 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 97 > > MatMult 1930 1.0 nan nan 4.46e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 41 0 0 0 1 41 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > MatMultTranspose 1 1.0 nan nan 3.44e+05 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > MatSolve 965 1.0 nan nan 5.04e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 5 0 0 0 1 5 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatSOR 965 1.0 nan nan 3.33e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 4 31 0 0 0 4 31 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatLUFactorSym 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatLUFactorNum 190 1.0 nan nan 1.16e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 11 0 0 0 1 11 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatScale 190 1.0 nan nan 3.26e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 3 0 0 0 0 3 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > MatAssemblyBegin 761 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatAssemblyEnd 761 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatGetRowIJ 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatCreateSubMats 380 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatGetOrdering 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatZeroEntries 379 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatSetPreallCOO 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatSetValuesCOO 190 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > KSPSetUp 760 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > KSPSolve 190 1.0 5.8052e-01 1.0 9.30e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 10 86 0 0 0 10 86 0 0 0 1602 > -nan 1 4.80e-03 0 0.00e+00 46 > > KSPGMRESOrthog 775 1.0 nan nan 2.27e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 2 0 0 0 1 2 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > SNESSolve 71 1.0 5.7117e+00 1.0 1.07e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 95 99 0 0 0 95 99 0 0 0 188 > -nan 1 4.80e-03 0 0.00e+00 53 > > SNESSetUp 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > SNESFunctionEval 573 1.0 nan nan 2.23e+07 1.0 0.0e+00 
0.0e+00 > 0.0e+00 60 2 0 0 0 60 2 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > SNESJacobianEval 190 1.0 nan nan 3.37e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 24 3 0 0 0 24 3 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 97 > > SNESLineSearch 190 1.0 nan nan 1.05e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 53 10 0 0 0 53 10 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > PCSetUp 570 1.0 nan nan 1.16e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 2 11 0 0 0 2 11 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > PCApply 965 1.0 nan nan 6.14e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 8 57 0 0 0 8 57 0 0 0 -nan > -nan 1 4.80e-03 0 0.00e+00 19 > > KSPSolve_FS_0 965 1.0 nan nan 3.33e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 4 31 0 0 0 4 31 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > KSPSolve_FS_1 965 1.0 nan nan 1.66e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 2 15 0 0 0 2 15 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > > --- Event Stage 1: Unknown > > > ------------------------------------------------------------------------------------------------------------------------ > --------------------------------------- > > > Object Type Creations Destructions. Reports information only > for process 0. > > --- Event Stage 0: Main Stage > > Container 5 5 > Distributed Mesh 2 2 > Index Set 11 11 > IS L to G Mapping 1 1 > Star Forest Graph 7 7 > Discrete System 2 2 > Weak Form 2 2 > Vector 49 49 > TSAdapt 1 1 > TS 1 1 > DMTS 1 1 > SNES 1 1 > DMSNES 3 3 > SNESLineSearch 1 1 > Krylov Solver 4 4 > DMKSP interface 1 1 > Matrix 4 4 > Preconditioner 4 4 > Viewer 2 1 > > --- Event Stage 1: Unknown > > > ======================================================================================================================== > Average time to get PetscTime(): 3.14e-08 > #PETSc Option Table entries: > -log_view > -log_view_gpu_times > #End of PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with 64 bit PetscInt > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 sizeof(PetscInt) 8 > Configure options: PETSC_DIR=/home/4pf/repos/petsc > PETSC_ARCH=arch-kokkos-cuda-no-tpls --with-cc=mpicc --with-cxx=mpicxx > --with-fc=0 --with-cuda --with-debugging=0 --with-shared-libraries > --prefix=/home/4pf/build/petsc/cuda-no-tpls/install --with-64-bit-indices > --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 --CUDAOPTFLAGS=-O3 > --with-kokkos-dir=/home/4pf/build/kokkos/cuda/install > --with-kokkos-kernels-dir=/home/4pf/build/kokkos-kernels/cuda-no-tpls/install > > ----------------------------------------- > Libraries compiled on 2022-11-01 21:01:08 on PC0115427 > Machine characteristics: Linux-5.15.0-52-generic-x86_64-with-glibc2.35 > Using PETSc directory: /home/4pf/build/petsc/cuda-no-tpls/install > Using PETSc arch: > ----------------------------------------- > > Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas > -Wno-lto-type-mismatch -Wno-stringop-overflow -fstack-protector > -fvisibility=hidden -O3 > ----------------------------------------- > > Using include paths: -I/home/4pf/build/petsc/cuda-no-tpls/install/include > -I/home/4pf/build/kokkos-kernels/cuda-no-tpls/install/include > -I/home/4pf/build/kokkos/cuda/install/include -I/usr/local/cuda-11.8/include > ----------------------------------------- > > Using C linker: mpicc > Using libraries: -Wl,-rpath,/home/4pf/build/petsc/cuda-no-tpls/install/lib > -L/home/4pf/build/petsc/cuda-no-tpls/install/lib -lpetsc > -Wl,-rpath,/home/4pf/build/kokkos-kernels/cuda-no-tpls/install/lib > 
-L/home/4pf/build/kokkos-kernels/cuda-no-tpls/install/lib > -Wl,-rpath,/home/4pf/build/kokkos/cuda/install/lib > -L/home/4pf/build/kokkos/cuda/install/lib > -Wl,-rpath,/usr/local/cuda-11.8/lib64 -L/usr/local/cuda-11.8/lib64 > -L/usr/local/cuda-11.8/lib64/stubs -lkokkoskernels -lkokkoscontainers > -lkokkoscore -llapack -lblas -lm -lcudart -lnvToolsExt -lcufft -lcublas > -lcusparse -lcusolver -lcurand -lcuda -lquadmath -lstdc++ -ldl > ----------------------------------------- > > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > ------------------------------ > *From:* Junchao Zhang > *Sent:* Tuesday, November 15, 2022 13:03 > *To:* Fackler, Philip > *Cc:* xolotl-psi-development at lists.sourceforge.net < > xolotl-psi-development at lists.sourceforge.net>; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov>; Blondel, Sophie ; Roth, > Philip > *Subject:* Re: [EXTERNAL] Re: [petsc-users] Kokkos backend for Mat and > Vec diverging when running on CUDA device. > > Can you paste -log_view result so I can see what functions are used? > > --Junchao Zhang > > > On Tue, Nov 15, 2022 at 10:24 AM Fackler, Philip > wrote: > > Yes, most (but not all) of our system test cases fail with the kokkos/cuda > or cuda backends. All of them pass with the CPU-only kokkos backend. > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > ------------------------------ > *From:* Junchao Zhang > *Sent:* Monday, November 14, 2022 19:34 > *To:* Fackler, Philip > *Cc:* xolotl-psi-development at lists.sourceforge.net < > xolotl-psi-development at lists.sourceforge.net>; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov>; Blondel, Sophie ; Zhang, > Junchao ; Roth, Philip > *Subject:* [EXTERNAL] Re: [petsc-users] Kokkos backend for Mat and Vec > diverging when running on CUDA device. > > Hi, Philip, > Sorry to hear that. It seems you could run the same code on CPUs but > not no GPUs (with either petsc/Kokkos backend or petsc/cuda backend, is it > right? > > --Junchao Zhang > > > On Mon, Nov 14, 2022 at 12:13 PM Fackler, Philip via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > This is an issue I've brought up before (and discussed in-person with > Richard). I wanted to bring it up again because I'm hitting the limits of > what I know to do, and I need help figuring this out. > > The problem can be reproduced using Xolotl's "develop" branch built > against a petsc build with kokkos and kokkos-kernels enabled. Then, either > add the relevant kokkos options to the "petscArgs=" line in the system test > parameter file(s), or just replace the system test parameter files with the > ones from the "feature-petsc-kokkos" branch. See here the files that > begin with "params_system_". > > Note that those files use the "kokkos" options, but the problem is similar > using the corresponding cuda/cusparse options. I've already tried building > kokkos-kernels with no TPLs and got slightly different results, but the > same problem. > > Any help would be appreciated. 
> > Thanks, > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Mon Dec 5 14:40:11 2022 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Mon, 5 Dec 2022 14:40:11 -0600 Subject: [petsc-users] [EXTERNAL] Re: Kokkos backend for Mat and Vec diverging when running on CUDA device. In-Reply-To: References: Message-ID: I configured with xolotl branch feature-petsc-kokkos, and typed `make` under ~/xolotl-build/. Though there were errors, a lot of *Tester were built. [ 62%] Built target xolotlViz [ 63%] Linking CXX executable TemperatureProfileHandlerTester [ 64%] Linking CXX executable TemperatureGradientHandlerTester [ 64%] Built target TemperatureProfileHandlerTester [ 64%] Built target TemperatureConstantHandlerTester [ 64%] Built target TemperatureGradientHandlerTester [ 65%] Linking CXX executable HeatEquationHandlerTester [ 65%] Built target HeatEquationHandlerTester [ 66%] Linking CXX executable FeFitFluxHandlerTester [ 66%] Linking CXX executable W111FitFluxHandlerTester [ 67%] Linking CXX executable FuelFitFluxHandlerTester [ 67%] Linking CXX executable W211FitFluxHandlerTester Which Tester should I use to run with the parameter file benchmarks/params_system_PSI_2.txt? And how many ranks should I use? Could you give an example command line? Thanks. --Junchao Zhang On Mon, Dec 5, 2022 at 2:22 PM Junchao Zhang wrote: > Hello, Philip, > Do I still need to use the feature-petsc-kokkos branch? > --Junchao Zhang > > > On Mon, Dec 5, 2022 at 11:08 AM Fackler, Philip > wrote: > >> Junchao, >> >> Thank you for working on this. If you open the parameter file for, say, >> the PSI_2 system test case (benchmarks/params_system_PSI_2.txt), simply add -dm_mat_type >> aijkokkos -dm_vec_type kokkos?` to the "petscArgs=" field (or the >> corresponding cusparse/cuda option). >> >> Thanks, >> >> >> *Philip Fackler * >> Research Software Engineer, Application Engineering Group >> Advanced Computing Systems Research Section >> Computer Science and Mathematics Division >> *Oak Ridge National Laboratory* >> ------------------------------ >> *From:* Junchao Zhang >> *Sent:* Thursday, December 1, 2022 17:05 >> *To:* Fackler, Philip >> *Cc:* xolotl-psi-development at lists.sourceforge.net < >> xolotl-psi-development at lists.sourceforge.net>; petsc-users at mcs.anl.gov < >> petsc-users at mcs.anl.gov>; Blondel, Sophie ; Roth, >> Philip >> *Subject:* Re: [EXTERNAL] Re: [petsc-users] Kokkos backend for Mat and >> Vec diverging when running on CUDA device. >> >> Hi, Philip, >> Sorry for the long delay. I could not get something useful from the >> -log_view output. Since I have already built xolotl, could you give me >> instructions on how to do a xolotl test to reproduce the divergence with >> petsc GPU backends (but fine on CPU)? >> Thank you. 
>> --Junchao Zhang >> >> >> On Wed, Nov 16, 2022 at 1:38 PM Fackler, Philip >> wrote: >> >> ------------------------------------------------------------------ PETSc >> Performance Summary: >> ------------------------------------------------------------------ >> >> Unknown Name on a named PC0115427 with 1 processor, by 4pf Wed Nov 16 >> 14:36:46 2022 >> Using Petsc Development GIT revision: v3.18.1-115-gdca010e0e9a GIT Date: >> 2022-10-28 14:39:41 +0000 >> >> Max Max/Min Avg Total >> Time (sec): 6.023e+00 1.000 6.023e+00 >> Objects: 1.020e+02 1.000 1.020e+02 >> Flops: 1.080e+09 1.000 1.080e+09 1.080e+09 >> Flops/sec: 1.793e+08 1.000 1.793e+08 1.793e+08 >> MPI Msg Count: 0.000e+00 0.000 0.000e+00 0.000e+00 >> MPI Msg Len (bytes): 0.000e+00 0.000 0.000e+00 0.000e+00 >> MPI Reductions: 0.000e+00 0.000 >> >> Flop counting convention: 1 flop = 1 real number operation of type >> (multiply/divide/add/subtract) >> e.g., VecAXPY() for real vectors of length N >> --> 2N flops >> and VecAXPY() for complex vectors of length N >> --> 8N flops >> >> Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages >> --- -- Message Lengths -- -- Reductions -- >> Avg %Total Avg %Total Count >> %Total Avg %Total Count %Total >> 0: Main Stage: 6.0226e+00 100.0% 1.0799e+09 100.0% 0.000e+00 >> 0.0% 0.000e+00 0.0% 0.000e+00 0.0% >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> See the 'Profiling' chapter of the users' manual for details on >> interpreting output. >> Phase summary info: >> Count: number of times phase was executed >> Time and Flop: Max - maximum over all processors >> Ratio - ratio of maximum to minimum over all processors >> Mess: number of messages sent >> AvgLen: average message length (bytes) >> Reduct: number of global reductions >> Global: entire computation >> Stage: stages of a computation. Set stages with PetscLogStagePush() >> and PetscLogStagePop(). 
>> %T - percent time in this phase %F - percent flop in this >> phase >> %M - percent messages in this phase %L - percent message >> lengths in this phase >> %R - percent reductions in this phase >> Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time >> over all processors) >> GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU >> time over all processors) >> CpuToGpu Count: total number of CPU to GPU copies per processor >> CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per >> processor) >> GpuToCpu Count: total number of GPU to CPU copies per processor >> GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per >> processor) >> GPU %F: percent flops on GPU in this event >> >> ------------------------------------------------------------------------------------------------------------------------ >> Event Count Time (sec) Flop >> --- Global --- --- Stage ---- Total >> GPU - CpuToGpu - - GpuToCpu - GPU >> >> Max Ratio Max Ratio Max Ratio Mess AvgLen >> Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> Mflop/s Count Size Count Size %F >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> --------------------------------------- >> >> >> --- Event Stage 0: Main Stage >> >> BuildTwoSided 3 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> DMCreateMat 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> SFSetGraph 3 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> SFSetUp 3 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> SFPack 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> SFUnpack 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> VecDot 190 1.0 nan nan 2.11e+06 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> VecMDot 775 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> VecNorm 1728 1.0 nan nan 1.92e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 2 0 0 0 0 2 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> VecScale 1983 1.0 nan nan 6.24e+06 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> VecCopy 780 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> VecSet 4955 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 2 0 0 0 0 2 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> VecAXPY 190 1.0 nan nan 2.11e+06 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> VecAYPX 597 1.0 nan nan 6.64e+06 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> VecAXPBYCZ 643 1.0 nan nan 1.79e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 2 0 0 0 0 2 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> VecWAXPY 502 1.0 nan nan 5.58e+06 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> VecMAXPY 1159 1.0 nan nan 3.68e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 3 0 0 0 0 3 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> VecScatterBegin 4647 1.0 nan nan 
0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan >> -nan 2 5.14e-03 0 0.00e+00 0 >> >> VecScatterEnd 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> VecReduceArith 380 1.0 nan nan 4.23e+06 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> VecReduceComm 190 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> VecNormalize 965 1.0 nan nan 1.61e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> TSStep 20 1.0 5.8699e+00 1.0 1.08e+09 1.0 0.0e+00 0.0e+00 >> 0.0e+00 97100 0 0 0 97100 0 0 0 184 >> -nan 2 5.14e-03 0 0.00e+00 54 >> >> TSFunctionEval 597 1.0 nan nan 6.64e+06 1.0 0.0e+00 0.0e+00 >> 0.0e+00 63 1 0 0 0 63 1 0 0 0 -nan >> -nan 1 3.36e-04 0 0.00e+00 100 >> >> TSJacobianEval 190 1.0 nan nan 3.37e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 24 3 0 0 0 24 3 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 97 >> >> MatMult 1930 1.0 nan nan 4.46e+08 1.0 0.0e+00 0.0e+00 >> 0.0e+00 1 41 0 0 0 1 41 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> MatMultTranspose 1 1.0 nan nan 3.44e+05 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> MatSolve 965 1.0 nan nan 5.04e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 1 5 0 0 0 1 5 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> MatSOR 965 1.0 nan nan 3.33e+08 1.0 0.0e+00 0.0e+00 >> 0.0e+00 4 31 0 0 0 4 31 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> MatLUFactorSym 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> MatLUFactorNum 190 1.0 nan nan 1.16e+08 1.0 0.0e+00 0.0e+00 >> 0.0e+00 1 11 0 0 0 1 11 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> MatScale 190 1.0 nan nan 3.26e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 0 3 0 0 0 0 3 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> MatAssemblyBegin 761 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> MatAssemblyEnd 761 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> MatGetRowIJ 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> MatCreateSubMats 380 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> MatGetOrdering 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> MatZeroEntries 379 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> MatSetPreallCOO 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> MatSetValuesCOO 190 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> KSPSetUp 760 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> KSPSolve 190 1.0 5.8052e-01 1.0 9.30e+08 1.0 0.0e+00 0.0e+00 >> 0.0e+00 10 86 0 0 0 10 86 0 0 0 1602 >> -nan 1 4.80e-03 0 0.00e+00 46 >> >> KSPGMRESOrthog 775 1.0 nan nan 2.27e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 1 2 0 0 0 1 2 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> SNESSolve 71 1.0 5.7117e+00 1.0 1.07e+09 1.0 0.0e+00 0.0e+00 >> 0.0e+00 95 99 0 0 0 95 99 0 0 0 188 >> -nan 
1 4.80e-03 0 0.00e+00 53 >> >> SNESSetUp 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> SNESFunctionEval 573 1.0 nan nan 2.23e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 60 2 0 0 0 60 2 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> SNESJacobianEval 190 1.0 nan nan 3.37e+07 1.0 0.0e+00 0.0e+00 >> 0.0e+00 24 3 0 0 0 24 3 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 97 >> >> SNESLineSearch 190 1.0 nan nan 1.05e+08 1.0 0.0e+00 0.0e+00 >> 0.0e+00 53 10 0 0 0 53 10 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 100 >> >> PCSetUp 570 1.0 nan nan 1.16e+08 1.0 0.0e+00 0.0e+00 >> 0.0e+00 2 11 0 0 0 2 11 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> PCApply 965 1.0 nan nan 6.14e+08 1.0 0.0e+00 0.0e+00 >> 0.0e+00 8 57 0 0 0 8 57 0 0 0 -nan >> -nan 1 4.80e-03 0 0.00e+00 19 >> >> KSPSolve_FS_0 965 1.0 nan nan 3.33e+08 1.0 0.0e+00 0.0e+00 >> 0.0e+00 4 31 0 0 0 4 31 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> KSPSolve_FS_1 965 1.0 nan nan 1.66e+08 1.0 0.0e+00 0.0e+00 >> 0.0e+00 2 15 0 0 0 2 15 0 0 0 -nan >> -nan 0 0.00e+00 0 0.00e+00 0 >> >> >> --- Event Stage 1: Unknown >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> --------------------------------------- >> >> >> Object Type Creations Destructions. Reports information only >> for process 0. >> >> --- Event Stage 0: Main Stage >> >> Container 5 5 >> Distributed Mesh 2 2 >> Index Set 11 11 >> IS L to G Mapping 1 1 >> Star Forest Graph 7 7 >> Discrete System 2 2 >> Weak Form 2 2 >> Vector 49 49 >> TSAdapt 1 1 >> TS 1 1 >> DMTS 1 1 >> SNES 1 1 >> DMSNES 3 3 >> SNESLineSearch 1 1 >> Krylov Solver 4 4 >> DMKSP interface 1 1 >> Matrix 4 4 >> Preconditioner 4 4 >> Viewer 2 1 >> >> --- Event Stage 1: Unknown >> >> >> ======================================================================================================================== >> Average time to get PetscTime(): 3.14e-08 >> #PETSc Option Table entries: >> -log_view >> -log_view_gpu_times >> #End of PETSc Option Table entries >> Compiled without FORTRAN kernels >> Compiled with 64 bit PetscInt >> Compiled with full precision matrices (default) >> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 >> sizeof(PetscScalar) 8 sizeof(PetscInt) 8 >> Configure options: PETSC_DIR=/home/4pf/repos/petsc >> PETSC_ARCH=arch-kokkos-cuda-no-tpls --with-cc=mpicc --with-cxx=mpicxx >> --with-fc=0 --with-cuda --with-debugging=0 --with-shared-libraries >> --prefix=/home/4pf/build/petsc/cuda-no-tpls/install --with-64-bit-indices >> --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 --CUDAOPTFLAGS=-O3 >> --with-kokkos-dir=/home/4pf/build/kokkos/cuda/install >> --with-kokkos-kernels-dir=/home/4pf/build/kokkos-kernels/cuda-no-tpls/install >> >> ----------------------------------------- >> Libraries compiled on 2022-11-01 21:01:08 on PC0115427 >> Machine characteristics: Linux-5.15.0-52-generic-x86_64-with-glibc2.35 >> Using PETSc directory: /home/4pf/build/petsc/cuda-no-tpls/install >> Using PETSc arch: >> ----------------------------------------- >> >> Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas >> -Wno-lto-type-mismatch -Wno-stringop-overflow -fstack-protector >> -fvisibility=hidden -O3 >> ----------------------------------------- >> >> Using include paths: -I/home/4pf/build/petsc/cuda-no-tpls/install/include >> -I/home/4pf/build/kokkos-kernels/cuda-no-tpls/install/include >> -I/home/4pf/build/kokkos/cuda/install/include -I/usr/local/cuda-11.8/include >> 
----------------------------------------- >> >> Using C linker: mpicc >> Using libraries: >> -Wl,-rpath,/home/4pf/build/petsc/cuda-no-tpls/install/lib >> -L/home/4pf/build/petsc/cuda-no-tpls/install/lib -lpetsc >> -Wl,-rpath,/home/4pf/build/kokkos-kernels/cuda-no-tpls/install/lib >> -L/home/4pf/build/kokkos-kernels/cuda-no-tpls/install/lib >> -Wl,-rpath,/home/4pf/build/kokkos/cuda/install/lib >> -L/home/4pf/build/kokkos/cuda/install/lib >> -Wl,-rpath,/usr/local/cuda-11.8/lib64 -L/usr/local/cuda-11.8/lib64 >> -L/usr/local/cuda-11.8/lib64/stubs -lkokkoskernels -lkokkoscontainers >> -lkokkoscore -llapack -lblas -lm -lcudart -lnvToolsExt -lcufft -lcublas >> -lcusparse -lcusolver -lcurand -lcuda -lquadmath -lstdc++ -ldl >> ----------------------------------------- >> >> >> >> *Philip Fackler * >> Research Software Engineer, Application Engineering Group >> Advanced Computing Systems Research Section >> Computer Science and Mathematics Division >> *Oak Ridge National Laboratory* >> ------------------------------ >> *From:* Junchao Zhang >> *Sent:* Tuesday, November 15, 2022 13:03 >> *To:* Fackler, Philip >> *Cc:* xolotl-psi-development at lists.sourceforge.net < >> xolotl-psi-development at lists.sourceforge.net>; petsc-users at mcs.anl.gov < >> petsc-users at mcs.anl.gov>; Blondel, Sophie ; Roth, >> Philip >> *Subject:* Re: [EXTERNAL] Re: [petsc-users] Kokkos backend for Mat and >> Vec diverging when running on CUDA device. >> >> Can you paste -log_view result so I can see what functions are used? >> >> --Junchao Zhang >> >> >> On Tue, Nov 15, 2022 at 10:24 AM Fackler, Philip >> wrote: >> >> Yes, most (but not all) of our system test cases fail with the >> kokkos/cuda or cuda backends. All of them pass with the CPU-only kokkos >> backend. >> >> >> *Philip Fackler * >> Research Software Engineer, Application Engineering Group >> Advanced Computing Systems Research Section >> Computer Science and Mathematics Division >> *Oak Ridge National Laboratory* >> ------------------------------ >> *From:* Junchao Zhang >> *Sent:* Monday, November 14, 2022 19:34 >> *To:* Fackler, Philip >> *Cc:* xolotl-psi-development at lists.sourceforge.net < >> xolotl-psi-development at lists.sourceforge.net>; petsc-users at mcs.anl.gov < >> petsc-users at mcs.anl.gov>; Blondel, Sophie ; Zhang, >> Junchao ; Roth, Philip >> *Subject:* [EXTERNAL] Re: [petsc-users] Kokkos backend for Mat and Vec >> diverging when running on CUDA device. >> >> Hi, Philip, >> Sorry to hear that. It seems you could run the same code on CPUs but >> not no GPUs (with either petsc/Kokkos backend or petsc/cuda backend, is it >> right? >> >> --Junchao Zhang >> >> >> On Mon, Nov 14, 2022 at 12:13 PM Fackler, Philip via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >> This is an issue I've brought up before (and discussed in-person with >> Richard). I wanted to bring it up again because I'm hitting the limits of >> what I know to do, and I need help figuring this out. >> >> The problem can be reproduced using Xolotl's "develop" branch built >> against a petsc build with kokkos and kokkos-kernels enabled. Then, either >> add the relevant kokkos options to the "petscArgs=" line in the system test >> parameter file(s), or just replace the system test parameter files with the >> ones from the "feature-petsc-kokkos" branch. See here the files that >> begin with "params_system_". >> >> Note that those files use the "kokkos" options, but the problem is >> similar using the corresponding cuda/cusparse options. 
I've already tried >> building kokkos-kernels with no TPLs and got slightly different results, >> but the same problem. >> >> Any help would be appreciated. >> >> Thanks, >> >> >> *Philip Fackler * >> Research Software Engineer, Application Engineering Group >> Advanced Computing Systems Research Section >> Computer Science and Mathematics Division >> *Oak Ridge National Laboratory* >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Mon Dec 5 14:40:39 2022 From: pierre at joliv.et (Pierre Jolivet) Date: Mon, 5 Dec 2022 21:40:39 +0100 Subject: [petsc-users] Installation With MSYS2 and MinGW Compilers In-Reply-To: References: Message-ID: <8457B691-CBD6-4C3B-B46F-306F6F985600@joliv.et> Hello Rajesh, Do you need Fortran bindings? Otherwise, ./configure --with-fortran-bindings=0 should do the trick. Sowing compilation is broken with MinGW compiler. If you need Fortran bindings, we could try to fix it. Thanks, Pierre > On 5 Dec 2022, at 9:22 PM, Singh, Rajesh K via petsc-users wrote: > > Dear All: > > I am having diffisulty to install PETSC on the window system. I went through the the steps esplained in the web site and got following error. > > > > Help for resolving this issue would be really appricaited. > > Thanks, > Rajesh -------------- next part -------------- An HTML attachment was scrubbed... URL: From rajesh.singh at pnnl.gov Mon Dec 5 14:50:02 2022 From: rajesh.singh at pnnl.gov (Singh, Rajesh K) Date: Mon, 5 Dec 2022 20:50:02 +0000 Subject: [petsc-users] Installation With MSYS2 and MinGW Compilers In-Reply-To: <8457B691-CBD6-4C3B-B46F-306F6F985600@joliv.et> References: <8457B691-CBD6-4C3B-B46F-306F6F985600@joliv.et> Message-ID: Hi Pierre, Thank you so much for prompt response. I will run FORTRAN based code with PETSc. Therefore I guess I will need Fortran binding. Regards Rajesh From: Pierre Jolivet Sent: Monday, December 5, 2022 12:41 PM To: Singh, Rajesh K Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Installation With MSYS2 and MinGW Compilers Check twice before you click! This email originated from outside PNNL. Hello Rajesh, Do you need Fortran bindings? Otherwise, ./configure --with-fortran-bindings=0 should do the trick. Sowing compilation is broken with MinGW compiler. If you need Fortran bindings, we could try to fix it. Thanks, Pierre On 5 Dec 2022, at 9:22 PM, Singh, Rajesh K via petsc-users > wrote: Dear All: I am having diffisulty to install PETSC on the window system. I went through the the steps esplained in the web site and got following error. Help for resolving this issue would be really appricaited. Thanks, Rajesh -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Mon Dec 5 15:15:21 2022 From: pierre at joliv.et (Pierre Jolivet) Date: Mon, 5 Dec 2022 22:15:21 +0100 Subject: [petsc-users] Installation With MSYS2 and MinGW Compilers In-Reply-To: References: <8457B691-CBD6-4C3B-B46F-306F6F985600@joliv.et> Message-ID: <79287ACA-B977-43A1-A42A-B5A26023935C@joliv.et> > On 5 Dec 2022, at 9:50 PM, Singh, Rajesh K wrote: > > Hi Pierre, > > Thank you so much for prompt response. I will run FORTRAN based code with PETSc. Therefore I guess I will need Fortran binding. 
OK, so, two things: 1) as said earlier, Sowing is broken with MinGW, but I?m sadly one of the few PETSc people using this environment, so I?m one of the few who can fix it, but I can?t tell you when I?ll be able to deliver 2) if you stick to an official tarball, the Fortran bindings should be shipped in. While I work on 1), could you stick to, e.g., https://ftp.mcs.anl.gov/pub/petsc/release-snapshots/petsc-3.18.2.tar.gz? Thanks, Pierre > Regards > Rajesh > > From: Pierre Jolivet > > Sent: Monday, December 5, 2022 12:41 PM > To: Singh, Rajesh K > > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Installation With MSYS2 and MinGW Compilers > > Check twice before you click! This email originated from outside PNNL. > > Hello Rajesh, > Do you need Fortran bindings? > Otherwise, ./configure --with-fortran-bindings=0 should do the trick. > Sowing compilation is broken with MinGW compiler. > If you need Fortran bindings, we could try to fix it. > > Thanks, > Pierre > > > On 5 Dec 2022, at 9:22 PM, Singh, Rajesh K via petsc-users > wrote: > > Dear All: > > I am having diffisulty to install PETSC on the window system. I went through the the steps esplained in the web site and got following error. > > > > Help for resolving this issue would be really appricaited. > > Thanks, > Rajesh -------------- next part -------------- An HTML attachment was scrubbed... URL: From rajesh.singh at pnnl.gov Mon Dec 5 17:51:34 2022 From: rajesh.singh at pnnl.gov (Singh, Rajesh K) Date: Mon, 5 Dec 2022 23:51:34 +0000 Subject: [petsc-users] Installation With MSYS2 and MinGW Compilers In-Reply-To: <79287ACA-B977-43A1-A42A-B5A26023935C@joliv.et> References: <8457B691-CBD6-4C3B-B46F-306F6F985600@joliv.et> <79287ACA-B977-43A1-A42A-B5A26023935C@joliv.et> Message-ID: Hi Pierre, I got following error while compiling PETSc. make all check [cid:image001.png at 01D908C1.7267DCE0] Help for this would be appreciated. Thanks, Rajesh From: Pierre Jolivet Sent: Monday, December 5, 2022 1:15 PM To: Singh, Rajesh K Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Installation With MSYS2 and MinGW Compilers On 5 Dec 2022, at 9:50 PM, Singh, Rajesh K > wrote: Hi Pierre, Thank you so much for prompt response. I will run FORTRAN based code with PETSc. Therefore I guess I will need Fortran binding. OK, so, two things: 1) as said earlier, Sowing is broken with MinGW, but I'm sadly one of the few PETSc people using this environment, so I'm one of the few who can fix it, but I can't tell you when I'll be able to deliver 2) if you stick to an official tarball, the Fortran bindings should be shipped in. While I work on 1), could you stick to, e.g., https://ftp.mcs.anl.gov/pub/petsc/release-snapshots/petsc-3.18.2.tar.gz? Thanks, Pierre Regards Rajesh From: Pierre Jolivet > Sent: Monday, December 5, 2022 12:41 PM To: Singh, Rajesh K > Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Installation With MSYS2 and MinGW Compilers Check twice before you click! This email originated from outside PNNL. Hello Rajesh, Do you need Fortran bindings? Otherwise, ./configure --with-fortran-bindings=0 should do the trick. Sowing compilation is broken with MinGW compiler. If you need Fortran bindings, we could try to fix it. Thanks, Pierre On 5 Dec 2022, at 9:22 PM, Singh, Rajesh K via petsc-users > wrote: Dear All: I am having diffisulty to install PETSC on the window system. I went through the the steps esplained in the web site and got following error. Help for resolving this issue would be really appricaited. 
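To make the tarball route above concrete, under the MSYS2/MinGW64 shell the sequence might look roughly like the following; the configure options themselves are whatever you used before and are only a placeholder here:

curl -LO https://ftp.mcs.anl.gov/pub/petsc/release-snapshots/petsc-3.18.2.tar.gz
tar xzf petsc-3.18.2.tar.gz
cd petsc-3.18.2
./configure <your usual configure options>
make all check

The release tarball already ships the generated Fortran interface files, so Sowing never needs to be built.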
Thanks, Rajesh -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 64580 bytes Desc: image001.png URL: From yuanxi at advancesoft.jp Mon Dec 5 19:55:22 2022 From: yuanxi at advancesoft.jp (=?UTF-8?B?6KKB54WV?=) Date: Tue, 6 Dec 2022 10:55:22 +0900 Subject: [petsc-users] How to introduce external iterative solver into PETSc In-Reply-To: References: Message-ID: Dear Matt It seems that I am in the wrong way! I will retry following your example. Thanks a lot! Yuan 2022?12?5?(?) 22:11 Matthew Knepley : > On Mon, Dec 5, 2022 at 3:08 AM ?? wrote: > >> Dear PETSc developers, >> >> I have my own linear solver and am trying to put it into PETSc as an >> external solver. Following the implementation of mumps, mkl_cpardiso, >> supelu etc, I think I should do the follow: >> >> 1. Add my solver name into MatSolverType. >> 2. Register my solver by calling MatSolverTypeRegister >> >> to let petsc record the existence of a new solver. The problem is that >> the above external solvers are all direct solvers, and a MatFactorType >> parameter should be set to indicate its factorization type, such as LU, QR, >> or Cholesky. But my solver is an iterative one, that means I cannot specify >> its MatFactorType. I wish to understand >> >> 1. Am I doing it the right way? And if so >> 2. How to set such parameters as MatFactorType, >> lufactorsymbolic, lufactornumeric. >> > > This was a misunderstanding. If it is an iterative solver, you can follow > the template for Jacobi: > > > https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/pc/impls/jacobi/jacobi.c > > There are comments throughout this file showing you how to register your > own preconditioner. You should not need to alter PETSc source. > I have done this myself with the BAMG preconditioner. > > Thanks, > > Matt > > >> Many thanks, >> >> Yuan >> Ph.D. in Solid Mechanics >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Mon Dec 5 20:52:30 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 5 Dec 2022 20:52:30 -0600 (CST) Subject: [petsc-users] Installation With MSYS2 and MinGW Compilers In-Reply-To: References: <8457B691-CBD6-4C3B-B46F-306F6F985600@joliv.et> <79287ACA-B977-43A1-A42A-B5A26023935C@joliv.et> Message-ID: <26ad1924-bea0-4c37-71fb-a164214902e6@mcs.anl.gov> Can you send corresponding configure.log and make.log? [should be in PETSC_DIR/PETSC_ARCH/lib/petsc/conf Also what is your requirement wrt using PETSc on windows? - need to link with other MS compiler libraries? - Can you use WSL? Our primary instructions for windows usage is with MS/Intel compilers [with cygwin tools] not MSYS2. https://petsc.org/release/install/windows/ But As Pierre mentioned - MSYS2 should also work. Satish On Mon, 5 Dec 2022, Singh, Rajesh K via petsc-users wrote: > Hi Pierre, > > I got following error while compiling PETSc. > > make all check > > [cid:image001.png at 01D908C1.7267DCE0] > > Help for this would be appreciated. 
> > Thanks, > Rajesh > > From: Pierre Jolivet > Sent: Monday, December 5, 2022 1:15 PM > To: Singh, Rajesh K > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Installation With MSYS2 and MinGW Compilers > > > On 5 Dec 2022, at 9:50 PM, Singh, Rajesh K > wrote: > > Hi Pierre, > > Thank you so much for prompt response. I will run FORTRAN based code with PETSc. Therefore I guess I will need Fortran binding. > > OK, so, two things: > 1) as said earlier, Sowing is broken with MinGW, but I'm sadly one of the few PETSc people using this environment, so I'm one of the few who can fix it, but I can't tell you when I'll be able to deliver > 2) if you stick to an official tarball, the Fortran bindings should be shipped in. While I work on 1), could you stick to, e.g., https://ftp.mcs.anl.gov/pub/petsc/release-snapshots/petsc-3.18.2.tar.gz? > > Thanks, > Pierre > > > Regards > Rajesh > > From: Pierre Jolivet > > Sent: Monday, December 5, 2022 12:41 PM > To: Singh, Rajesh K > > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Installation With MSYS2 and MinGW Compilers > > Check twice before you click! This email originated from outside PNNL. > > Hello Rajesh, > Do you need Fortran bindings? > Otherwise, ./configure --with-fortran-bindings=0 should do the trick. > Sowing compilation is broken with MinGW compiler. > If you need Fortran bindings, we could try to fix it. > > Thanks, > Pierre > > > > On 5 Dec 2022, at 9:22 PM, Singh, Rajesh K via petsc-users > wrote: > > Dear All: > > I am having diffisulty to install PETSC on the window system. I went through the the steps esplained in the web site and got following error. > > > > Help for resolving this issue would be really appricaited. > > Thanks, > Rajesh > > From rajesh.singh at pnnl.gov Mon Dec 5 22:11:52 2022 From: rajesh.singh at pnnl.gov (Singh, Rajesh K) Date: Tue, 6 Dec 2022 04:11:52 +0000 Subject: [petsc-users] Installation With MSYS2 and MinGW Compilers In-Reply-To: <26ad1924-bea0-4c37-71fb-a164214902e6@mcs.anl.gov> References: <8457B691-CBD6-4C3B-B46F-306F6F985600@joliv.et> <79287ACA-B977-43A1-A42A-B5A26023935C@joliv.et> <26ad1924-bea0-4c37-71fb-a164214902e6@mcs.anl.gov> Message-ID: Hi Satish, Thank you so much for offering help for installing PETSc in MSYS2 and mingw64. Attached are configure.log and make.log files. I am newb in this. Please let me know if you need further information. I will be glad to provide. Regards, Rajesh -----Original Message----- From: Satish Balay Sent: Monday, December 5, 2022 6:53 PM To: Singh, Rajesh K Cc: Pierre Jolivet ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Installation With MSYS2 and MinGW Compilers Can you send corresponding configure.log and make.log? [should be in PETSC_DIR/PETSC_ARCH/lib/petsc/conf Also what is your requirement wrt using PETSc on windows? - need to link with other MS compiler libraries? - Can you use WSL? Our primary instructions for windows usage is with MS/Intel compilers [with cygwin tools] not MSYS2. https://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpetsc.org%2Frelease%2Finstall%2Fwindows%2F&data=05%7C01%7Crajesh.singh%40pnnl.gov%7Ccf1975f0691a4a6f77a808dad734eec8%7Cd6faa5f90ae240338c0130048a38deeb%7C0%7C0%7C638058922771517675%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=hpyNWLZfFlS6NgV5ehRZ0EwyCnkrJw4yYT0HBvrVJF8%3D&reserved=0 But As Pierre mentioned - MSYS2 should also work. 
Satish On Mon, 5 Dec 2022, Singh, Rajesh K via petsc-users wrote: > Hi Pierre, > > I got following error while compiling PETSc. > > make all check > > [cid:image001.png at 01D908C1.7267DCE0] > > Help for this would be appreciated. > > Thanks, > Rajesh > > From: Pierre Jolivet > Sent: Monday, December 5, 2022 1:15 PM > To: Singh, Rajesh K > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Installation With MSYS2 and MinGW Compilers > > > On 5 Dec 2022, at 9:50 PM, Singh, Rajesh K > wrote: > > Hi Pierre, > > Thank you so much for prompt response. I will run FORTRAN based code with PETSc. Therefore I guess I will need Fortran binding. > > OK, so, two things: > 1) as said earlier, Sowing is broken with MinGW, but I'm sadly one of > the few PETSc people using this environment, so I'm one of the few who > can fix it, but I can't tell you when I'll be able to deliver > 2) if you stick to an official tarball, the Fortran bindings should be shipped in. While I work on 1), could you stick to, e.g., https://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fftp.mcs.anl.gov%2Fpub%2Fpetsc%2Frelease-snapshots%2Fpetsc-3.18.2.tar.gz&data=05%7C01%7Crajesh.singh%40pnnl.gov%7Ccf1975f0691a4a6f77a808dad734eec8%7Cd6faa5f90ae240338c0130048a38deeb%7C0%7C0%7C638058922771517675%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=3gU1u2WRKKvNlBMbq%2FL6RZwvp%2FcRY%2BcLx5i9gJbp9nI%3D&reserved=0? > > Thanks, > Pierre > > > Regards > Rajesh > > From: Pierre Jolivet > > Sent: Monday, December 5, 2022 12:41 PM > To: Singh, Rajesh K > > > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Installation With MSYS2 and MinGW Compilers > > Check twice before you click! This email originated from outside PNNL. > > Hello Rajesh, > Do you need Fortran bindings? > Otherwise, ./configure --with-fortran-bindings=0 should do the trick. > Sowing compilation is broken with MinGW compiler. > If you need Fortran bindings, we could try to fix it. > > Thanks, > Pierre > > > > On 5 Dec 2022, at 9:22 PM, Singh, Rajesh K via petsc-users > wrote: > > Dear All: > > I am having diffisulty to install PETSC on the window system. I went through the the steps esplained in the web site and got following error. > > > > Help for resolving this issue would be really appricaited. > > Thanks, > Rajesh > > -------------- next part -------------- A non-text attachment was scrubbed... Name: make.log Type: application/octet-stream Size: 60155 bytes Desc: make.log URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 1120478 bytes Desc: configure.log URL: From balay at mcs.anl.gov Mon Dec 5 23:43:31 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 5 Dec 2022 23:43:31 -0600 (CST) Subject: [petsc-users] Installation With MSYS2 and MinGW Compilers In-Reply-To: References: <8457B691-CBD6-4C3B-B46F-306F6F985600@joliv.et> <79287ACA-B977-43A1-A42A-B5A26023935C@joliv.et> <26ad1924-bea0-4c37-71fb-a164214902e6@mcs.anl.gov> Message-ID: <1ec10cfe-54e8-cfcd-c504-04740412d014@mcs.anl.gov> The build log looks fine. Its not clear why these warnings [and errors] are coming up in make check [while there are no such warnings in the build]. Since you are using PETSc fromfortran - Can you try compiling/running a test manually and see if that works. 
cd src/snes/tutorials make ex5f90 ./ex5f90 mpiexec -n 2 ./ex5f90 Satish On Tue, 6 Dec 2022, Singh, Rajesh K wrote: > Hi Satish, > > Thank you so much for offering help for installing PETSc in MSYS2 and mingw64. Attached are configure.log and make.log files. I am newb in this. Please let me know if you need further information. I will be glad to provide. > > Regards, > Rajesh > > -----Original Message----- > From: Satish Balay > Sent: Monday, December 5, 2022 6:53 PM > To: Singh, Rajesh K > Cc: Pierre Jolivet ; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Installation With MSYS2 and MinGW Compilers > > Can you send corresponding configure.log and make.log? [should be in PETSC_DIR/PETSC_ARCH/lib/petsc/conf > > Also what is your requirement wrt using PETSc on windows? > - need to link with other MS compiler libraries? > - Can you use WSL? > > Our primary instructions for windows usage is with MS/Intel compilers [with cygwin tools] not MSYS2. > > https://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpetsc.org%2Frelease%2Finstall%2Fwindows%2F&data=05%7C01%7Crajesh.singh%40pnnl.gov%7Ccf1975f0691a4a6f77a808dad734eec8%7Cd6faa5f90ae240338c0130048a38deeb%7C0%7C0%7C638058922771517675%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=hpyNWLZfFlS6NgV5ehRZ0EwyCnkrJw4yYT0HBvrVJF8%3D&reserved=0 > > But As Pierre mentioned - MSYS2 should also work. > > Satish > > On Mon, 5 Dec 2022, Singh, Rajesh K via petsc-users wrote: > > > Hi Pierre, > > > > I got following error while compiling PETSc. > > > > make all check > > > > [cid:image001.png at 01D908C1.7267DCE0] > > > > Help for this would be appreciated. > > > > Thanks, > > Rajesh > > > > From: Pierre Jolivet > > Sent: Monday, December 5, 2022 1:15 PM > > To: Singh, Rajesh K > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Installation With MSYS2 and MinGW Compilers > > > > > > On 5 Dec 2022, at 9:50 PM, Singh, Rajesh K > wrote: > > > > Hi Pierre, > > > > Thank you so much for prompt response. I will run FORTRAN based code with PETSc. Therefore I guess I will need Fortran binding. > > > > OK, so, two things: > > 1) as said earlier, Sowing is broken with MinGW, but I'm sadly one of > > the few PETSc people using this environment, so I'm one of the few who > > can fix it, but I can't tell you when I'll be able to deliver > > 2) if you stick to an official tarball, the Fortran bindings should be shipped in. While I work on 1), could you stick to, e.g., https://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fftp.mcs.anl.gov%2Fpub%2Fpetsc%2Frelease-snapshots%2Fpetsc-3.18.2.tar.gz&data=05%7C01%7Crajesh.singh%40pnnl.gov%7Ccf1975f0691a4a6f77a808dad734eec8%7Cd6faa5f90ae240338c0130048a38deeb%7C0%7C0%7C638058922771517675%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=3gU1u2WRKKvNlBMbq%2FL6RZwvp%2FcRY%2BcLx5i9gJbp9nI%3D&reserved=0? > > > > Thanks, > > Pierre > > > > > > Regards > > Rajesh > > > > From: Pierre Jolivet > > > Sent: Monday, December 5, 2022 12:41 PM > > To: Singh, Rajesh K > > > > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Installation With MSYS2 and MinGW Compilers > > > > Check twice before you click! This email originated from outside PNNL. > > > > Hello Rajesh, > > Do you need Fortran bindings? > > Otherwise, ./configure --with-fortran-bindings=0 should do the trick. > > Sowing compilation is broken with MinGW compiler. 
> > If you need Fortran bindings, we could try to fix it. > > > > Thanks, > > Pierre > > > > > > > > On 5 Dec 2022, at 9:22 PM, Singh, Rajesh K via petsc-users > wrote: > > > > Dear All: > > > > I am having diffisulty to install PETSC on the window system. I went through the the steps esplained in the web site and got following error. > > > > > > > > Help for resolving this issue would be really appricaited. > > > > Thanks, > > Rajesh > > > > > > From pierre at joliv.et Tue Dec 6 02:01:26 2022 From: pierre at joliv.et (Pierre Jolivet) Date: Tue, 6 Dec 2022 09:01:26 +0100 Subject: [petsc-users] Installation With MSYS2 and MinGW Compilers In-Reply-To: <1ec10cfe-54e8-cfcd-c504-04740412d014@mcs.anl.gov> References: <8457B691-CBD6-4C3B-B46F-306F6F985600@joliv.et> <79287ACA-B977-43A1-A42A-B5A26023935C@joliv.et> <26ad1924-bea0-4c37-71fb-a164214902e6@mcs.anl.gov> <1ec10cfe-54e8-cfcd-c504-04740412d014@mcs.anl.gov> Message-ID: <7D050C4D-7DB9-4DD9-A621-E54FCC66D8D1@joliv.et> > On 6 Dec 2022, at 6:43 AM, Satish Balay wrote: > > The build log looks fine. Its not clear why these warnings [and > errors] are coming up in make check [while there are no such warnings > in the build]. 1) MinGW does not handle %td and %zu, so a default build triggers tons of warnings. Rajesh, you can add CFLAGS="-Wno-format-extra-args -Wno-stringop-overflow -Wformat=0" CXXFLAGS="-Wno-format-extra-args -Wno-stringop-overflow -Wformat=0" to have much fewer gibberish printed on screen 2) I?m able to reproduce the ?Circular [?] dependency dropped.? warnings after doing two successive make, but I know too little about Fortran (or Sowing?) to understand what could trigger this. But I agree with Satish, the build looks good otherwise and should work. I made the necessary changes to Sowing, I hope I?ll be able to finish the MR (https://gitlab.com/petsc/petsc/-/merge_requests/5903/) by the end of the day so you can switch back to using the repository, if need be. Thanks, Pierre > Since you are using PETSc fromfortran - Can you try compiling/running > a test manually and see if that works. > > cd src/snes/tutorials > make ex5f90 > ./ex5f90 > mpiexec -n 2 ./ex5f90 > > Satish > > On Tue, 6 Dec 2022, Singh, Rajesh K wrote: > >> Hi Satish, >> >> Thank you so much for offering help for installing PETSc in MSYS2 and mingw64. Attached are configure.log and make.log files. I am newb in this. Please let me know if you need further information. I will be glad to provide. >> >> Regards, >> Rajesh >> >> -----Original Message----- >> From: Satish Balay >> Sent: Monday, December 5, 2022 6:53 PM >> To: Singh, Rajesh K >> Cc: Pierre Jolivet ; petsc-users at mcs.anl.gov >> Subject: Re: [petsc-users] Installation With MSYS2 and MinGW Compilers >> >> Can you send corresponding configure.log and make.log? [should be in PETSC_DIR/PETSC_ARCH/lib/petsc/conf >> >> Also what is your requirement wrt using PETSc on windows? >> - need to link with other MS compiler libraries? >> - Can you use WSL? >> >> Our primary instructions for windows usage is with MS/Intel compilers [with cygwin tools] not MSYS2. 
>> >> https://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpetsc.org%2Frelease%2Finstall%2Fwindows%2F&data=05%7C01%7Crajesh.singh%40pnnl.gov%7Ccf1975f0691a4a6f77a808dad734eec8%7Cd6faa5f90ae240338c0130048a38deeb%7C0%7C0%7C638058922771517675%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=hpyNWLZfFlS6NgV5ehRZ0EwyCnkrJw4yYT0HBvrVJF8%3D&reserved=0 >> >> But As Pierre mentioned - MSYS2 should also work. >> >> Satish >> >> On Mon, 5 Dec 2022, Singh, Rajesh K via petsc-users wrote: >> >>> Hi Pierre, >>> >>> I got following error while compiling PETSc. >>> >>> make all check >>> >>> [cid:image001.png at 01D908C1.7267DCE0] >>> >>> Help for this would be appreciated. >>> >>> Thanks, >>> Rajesh >>> >>> From: Pierre Jolivet >>> Sent: Monday, December 5, 2022 1:15 PM >>> To: Singh, Rajesh K >>> Cc: petsc-users at mcs.anl.gov >>> Subject: Re: [petsc-users] Installation With MSYS2 and MinGW Compilers >>> >>> >>> On 5 Dec 2022, at 9:50 PM, Singh, Rajesh K > wrote: >>> >>> Hi Pierre, >>> >>> Thank you so much for prompt response. I will run FORTRAN based code with PETSc. Therefore I guess I will need Fortran binding. >>> >>> OK, so, two things: >>> 1) as said earlier, Sowing is broken with MinGW, but I'm sadly one of >>> the few PETSc people using this environment, so I'm one of the few who >>> can fix it, but I can't tell you when I'll be able to deliver >>> 2) if you stick to an official tarball, the Fortran bindings should be shipped in. While I work on 1), could you stick to, e.g., https://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fftp.mcs.anl.gov%2Fpub%2Fpetsc%2Frelease-snapshots%2Fpetsc-3.18.2.tar.gz&data=05%7C01%7Crajesh.singh%40pnnl.gov%7Ccf1975f0691a4a6f77a808dad734eec8%7Cd6faa5f90ae240338c0130048a38deeb%7C0%7C0%7C638058922771517675%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=3gU1u2WRKKvNlBMbq%2FL6RZwvp%2FcRY%2BcLx5i9gJbp9nI%3D&reserved=0 1u2WRKKvNlBMbq%2FL6RZwvp%2FcRY%2BcLx5i9gJbp9nI%3D&reserved=0>? >>> >>> Thanks, >>> Pierre >>> >>> >>> Regards >>> Rajesh >>> >>> From: Pierre Jolivet > >>> Sent: Monday, December 5, 2022 12:41 PM >>> To: Singh, Rajesh K >>> > >>> Cc: petsc-users at mcs.anl.gov >>> Subject: Re: [petsc-users] Installation With MSYS2 and MinGW Compilers >>> >>> Check twice before you click! This email originated from outside PNNL. >>> >>> Hello Rajesh, >>> Do you need Fortran bindings? >>> Otherwise, ./configure --with-fortran-bindings=0 should do the trick. >>> Sowing compilation is broken with MinGW compiler. >>> If you need Fortran bindings, we could try to fix it. >>> >>> Thanks, >>> Pierre >>> >>> >>> >>> On 5 Dec 2022, at 9:22 PM, Singh, Rajesh K via petsc-users > wrote: >>> >>> Dear All: >>> >>> I am having diffisulty to install PETSC on the window system. I went through the the steps esplained in the web site and got following error. >>> >>> >>> >>> Help for resolving this issue would be really appricaited. >>> >>> Thanks, >>> Rajesh >>> >>> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksi2443 at gmail.com Tue Dec 6 04:15:46 2022 From: ksi2443 at gmail.com (=?UTF-8?B?6rmA7ISx7J21?=) Date: Tue, 6 Dec 2022 19:15:46 +0900 Subject: [petsc-users] About Preconditioner and MUMPS Message-ID: Hello, I have some questions about pc and mumps_icntl. 1. What?s the difference between adopt preconditioner by code (for example, PetscCall(PCSetType(pc,PCLU)) and option -pc_type lu?? 
And also, What?s the priority between code pcsettype and option -pc_type ?? 2. When I tried to use METIS in MUMPS, I adopted metis by option (for example, -mat_mumps_icntl_7 5). In this situation, it is impossible to use metis without pc_type lu. However, in my case pc type lu makes the performance poor. So I don?t want to use lu preconditioner. How can I do this? Thanks, Hyung Kim -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksi2443 at gmail.com Tue Dec 6 05:45:16 2022 From: ksi2443 at gmail.com (=?UTF-8?B?6rmA7ISx7J21?=) Date: Tue, 6 Dec 2022 20:45:16 +0900 Subject: [petsc-users] About MPIRUN Message-ID: Hello, There is a code which can run in not mpirun and also it can run in mpi_linear_solver_server. However, it has an error in just mpirun case such as mpirun -np ./program. The error output is as below. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Object is in wrong state [0]PETSC ERROR: Not for unassembled vector [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.18.1, unknown [0]PETSC ERROR: ./app on a arch-linux-c-debug named ubuntu by ksi2443 Tue Dec 6 03:39:13 2022 [0]PETSC ERROR: Configure options -download-mumps -download-scalapack -download-parmetis -download-metis [0]PETSC ERROR: #1 VecCopy() at /home/ksi2443/petsc/src/vec/vec/interface/vector.c:1625 [0]PETSC ERROR: #2 KSPInitialResidual() at /home/ksi2443/petsc/src/ksp/ksp/interface/itres.c:60 [0]PETSC ERROR: #3 KSPSolve_GMRES() at /home/ksi2443/petsc/src/ksp/ksp/impls/gmres/gmres.c:227 [0]PETSC ERROR: #4 KSPSolve_Private() at /home/ksi2443/petsc/src/ksp/ksp/interface/itfunc.c:899 [0]PETSC ERROR: #5 KSPSolve() at /home/ksi2443/petsc/src/ksp/ksp/interface/itfunc.c:1071 [0]PETSC ERROR: #6 main() at /home/ksi2443/Downloads/coding/a1.c:450 [0]PETSC ERROR: No PETSc Option Table entries [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- How can I fix this?? Thanks, Hyung Kim -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Dec 6 07:56:28 2022 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 6 Dec 2022 08:56:28 -0500 Subject: [petsc-users] About Preconditioner and MUMPS In-Reply-To: References: Message-ID: On Tue, Dec 6, 2022 at 5:16 AM ??? wrote: > Hello, > > > > I have some questions about pc and mumps_icntl. > > 1. What?s the difference between adopt preconditioner by code (for > example, PetscCall(PCSetType(pc,PCLU)) and option -pc_type lu?? > If you call KSPSetFromOptions(), then they are the same. > And also, What?s the priority between code pcsettype and option -pc_type ?? > PCSetType() will force the change. If you call KSPSetFromOptions() after that, -pc_type will override the change. 2. When I tried to use METIS in MUMPS, I adopted metis by option (for > example, -mat_mumps_icntl_7 5). In this situation, it is impossible to use > metis without pc_type lu. > This is incorrect. You might still call PCSetType() if you want, but you must call KSPSetFromOptions() after that. > However, in my case pc type lu makes the performance poor. So I don?t want > to use lu preconditioner. How can I do this? > This statement is incoherent. MUMPS is LU preconditioning, so if you use MUMPS you are using LU. 
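For example, a minimal sketch of this call sequence (assuming a KSP named ksp whose operators are already set; the variable names here are only illustrative) is:

  PC pc;
  PetscCall(KSPGetPC(ksp, &pc));
  PetscCall(PCSetType(pc, PCLU));                          /* direct solve, i.e. LU */
  PetscCall(PCFactorSetMatSolverType(pc, MATSOLVERMUMPS)); /* have MUMPS carry out the factorization */
  PetscCall(KSPSetFromOptions(ksp));                       /* called last, so options can still override */
  PetscCall(KSPSolve(ksp, b, x));

Because KSPSetFromOptions() is called after PCSetType(), command-line options such as -pc_type lu -pc_factor_mat_solver_type mumps -mat_mumps_icntl_7 5 are still honored, and the MUMPS ordering option only has an effect when MUMPS is actually doing the factorization.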
Thanks, Matt > Thanks, > > Hyung Kim > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Dec 6 07:58:03 2022 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 6 Dec 2022 08:58:03 -0500 Subject: [petsc-users] About MPIRUN In-Reply-To: References: Message-ID: On Tue, Dec 6, 2022 at 6:45 AM ??? wrote: > Hello, > > > There is a code which can run in not mpirun and also it can run in > mpi_linear_solver_server. > However, it has an error in just mpirun case such as mpirun -np ./program. > The error output is as below. > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Object is in wrong state > [0]PETSC ERROR: Not for unassembled vector > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.18.1, unknown > [0]PETSC ERROR: ./app on a arch-linux-c-debug named ubuntu by ksi2443 Tue > Dec 6 03:39:13 2022 > [0]PETSC ERROR: Configure options -download-mumps -download-scalapack > -download-parmetis -download-metis > [0]PETSC ERROR: #1 VecCopy() at > /home/ksi2443/petsc/src/vec/vec/interface/vector.c:1625 > [0]PETSC ERROR: #2 KSPInitialResidual() at > /home/ksi2443/petsc/src/ksp/ksp/interface/itres.c:60 > [0]PETSC ERROR: #3 KSPSolve_GMRES() at > /home/ksi2443/petsc/src/ksp/ksp/impls/gmres/gmres.c:227 > [0]PETSC ERROR: #4 KSPSolve_Private() at > /home/ksi2443/petsc/src/ksp/ksp/interface/itfunc.c:899 > [0]PETSC ERROR: #5 KSPSolve() at > /home/ksi2443/petsc/src/ksp/ksp/interface/itfunc.c:1071 > [0]PETSC ERROR: #6 main() at /home/ksi2443/Downloads/coding/a1.c:450 > [0]PETSC ERROR: No PETSc Option Table entries > [0]PETSC ERROR: ----------------End of Error Message -------send entire > error message to petsc-maint at mcs.anl.gov---------- > > How can I fix this?? > It looks like we do not check the assembled state in parallel, since it cannot cause a problem, but every time you update values with VecSetValues(), you should call VecAssemblyBegin/End(). Thanks Matt > Thanks, > Hyung Kim > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Dec 6 09:24:33 2022 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 6 Dec 2022 10:24:33 -0500 Subject: [petsc-users] About Preconditioner and MUMPS In-Reply-To: References: Message-ID: > On Dec 6, 2022, at 5:15 AM, ??? wrote: > > Hello, > > > I have some questions about pc and mumps_icntl. > > 1. What?s the difference between adopt preconditioner by code (for example, PetscCall(PCSetType(pc,PCLU)) and option -pc_type lu?? > And also, What?s the priority between code pcsettype and option -pc_type ?? > > > 2. When I tried to use METIS in MUMPS, I adopted metis by option (for example, -mat_mumps_icntl_7 5). In this situation, it is impossible to use metis without pc_type lu. However, in my case pc type lu makes the performance poor. So I don?t want to use lu preconditioner. How can I do this? 
> The package MUMPS has an option to use metis in its ordering process which can be turned on as indicated while using MUMPS. Most preconditioners that PETSc can use do not use metis for any purpose hence there is no option to turn on its use. For what purpose do you wish to use metis? Partitioning, ordering, ? > > Thanks, > > Hyung Kim > -------------- next part -------------- An HTML attachment was scrubbed... URL: From facklerpw at ornl.gov Tue Dec 6 10:10:07 2022 From: facklerpw at ornl.gov (Fackler, Philip) Date: Tue, 6 Dec 2022 16:10:07 +0000 Subject: [petsc-users] [EXTERNAL] Re: Kokkos backend for Mat and Vec diverging when running on CUDA device. In-Reply-To: References: Message-ID: I think it would be simpler to use the develop branch for this issue. But you can still just build the SystemTester. Then (if you changed the PSI_1 case) run: ./test/system/SystemTester -t System/PSI_1 -- -v? (No need for multiple MPI ranks) Thanks, Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory ________________________________ From: Junchao Zhang Sent: Monday, December 5, 2022 15:40 To: Fackler, Philip Cc: xolotl-psi-development at lists.sourceforge.net ; petsc-users at mcs.anl.gov ; Blondel, Sophie ; Roth, Philip Subject: Re: [EXTERNAL] Re: [petsc-users] Kokkos backend for Mat and Vec diverging when running on CUDA device. I configured with xolotl branch feature-petsc-kokkos, and typed `make` under ~/xolotl-build/. Though there were errors, a lot of *Tester were built. [ 62%] Built target xolotlViz [ 63%] Linking CXX executable TemperatureProfileHandlerTester [ 64%] Linking CXX executable TemperatureGradientHandlerTester [ 64%] Built target TemperatureProfileHandlerTester [ 64%] Built target TemperatureConstantHandlerTester [ 64%] Built target TemperatureGradientHandlerTester [ 65%] Linking CXX executable HeatEquationHandlerTester [ 65%] Built target HeatEquationHandlerTester [ 66%] Linking CXX executable FeFitFluxHandlerTester [ 66%] Linking CXX executable W111FitFluxHandlerTester [ 67%] Linking CXX executable FuelFitFluxHandlerTester [ 67%] Linking CXX executable W211FitFluxHandlerTester Which Tester should I use to run with the parameter file benchmarks/params_system_PSI_2.txt? And how many ranks should I use? Could you give an example command line? Thanks. --Junchao Zhang On Mon, Dec 5, 2022 at 2:22 PM Junchao Zhang > wrote: Hello, Philip, Do I still need to use the feature-petsc-kokkos branch? --Junchao Zhang On Mon, Dec 5, 2022 at 11:08 AM Fackler, Philip > wrote: Junchao, Thank you for working on this. If you open the parameter file for, say, the PSI_2 system test case (benchmarks/params_system_PSI_2.txt), simply add -dm_mat_type aijkokkos -dm_vec_type kokkos?` to the "petscArgs=" field (or the corresponding cusparse/cuda option). Thanks, Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory ________________________________ From: Junchao Zhang > Sent: Thursday, December 1, 2022 17:05 To: Fackler, Philip > Cc: xolotl-psi-development at lists.sourceforge.net >; petsc-users at mcs.anl.gov >; Blondel, Sophie >; Roth, Philip > Subject: Re: [EXTERNAL] Re: [petsc-users] Kokkos backend for Mat and Vec diverging when running on CUDA device. Hi, Philip, Sorry for the long delay. 
I could not get something useful from the -log_view output. Since I have already built xolotl, could you give me instructions on how to do a xolotl test to reproduce the divergence with petsc GPU backends (but fine on CPU)? Thank you. --Junchao Zhang On Wed, Nov 16, 2022 at 1:38 PM Fackler, Philip > wrote: ------------------------------------------------------------------ PETSc Performance Summary: ------------------------------------------------------------------ Unknown Name on a named PC0115427 with 1 processor, by 4pf Wed Nov 16 14:36:46 2022 Using Petsc Development GIT revision: v3.18.1-115-gdca010e0e9a GIT Date: 2022-10-28 14:39:41 +0000 Max Max/Min Avg Total Time (sec): 6.023e+00 1.000 6.023e+00 Objects: 1.020e+02 1.000 1.020e+02 Flops: 1.080e+09 1.000 1.080e+09 1.080e+09 Flops/sec: 1.793e+08 1.000 1.793e+08 1.793e+08 MPI Msg Count: 0.000e+00 0.000 0.000e+00 0.000e+00 MPI Msg Len (bytes): 0.000e+00 0.000 0.000e+00 0.000e+00 MPI Reductions: 0.000e+00 0.000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total Count %Total Avg %Total Count %Total 0: Main Stage: 6.0226e+00 100.0% 1.0799e+09 100.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent AvgLen: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors) CpuToGpu Count: total number of CPU to GPU copies per processor CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor) GpuToCpu Count: total number of GPU to CPU copies per processor GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor) GPU %F: percent flops on GPU in this event ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F ------------------------------------------------------------------------------------------------------------------------ --------------------------------------- --- Event Stage 0: Main Stage BuildTwoSided 3 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 DMCreateMat 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 SFSetGraph 3 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 SFSetUp 3 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 SFPack 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 SFUnpack 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecDot 190 1.0 nan nan 2.11e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecMDot 775 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecNorm 1728 1.0 nan nan 1.92e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecScale 1983 1.0 nan nan 6.24e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecCopy 780 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecSet 4955 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecAXPY 190 1.0 nan nan 2.11e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecAYPX 597 1.0 nan nan 6.64e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecAXPBYCZ 643 1.0 nan nan 1.79e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecWAXPY 502 1.0 nan nan 5.58e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecMAXPY 1159 1.0 nan nan 3.68e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecScatterBegin 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan -nan 2 5.14e-03 0 0.00e+00 0 VecScatterEnd 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecReduceArith 380 1.0 nan nan 4.23e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan 
-nan 0 0.00e+00 0 0.00e+00 100 VecReduceComm 190 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecNormalize 965 1.0 nan nan 1.61e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 TSStep 20 1.0 5.8699e+00 1.0 1.08e+09 1.0 0.0e+00 0.0e+00 0.0e+00 97100 0 0 0 97100 0 0 0 184 -nan 2 5.14e-03 0 0.00e+00 54 TSFunctionEval 597 1.0 nan nan 6.64e+06 1.0 0.0e+00 0.0e+00 0.0e+00 63 1 0 0 0 63 1 0 0 0 -nan -nan 1 3.36e-04 0 0.00e+00 100 TSJacobianEval 190 1.0 nan nan 3.37e+07 1.0 0.0e+00 0.0e+00 0.0e+00 24 3 0 0 0 24 3 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 97 MatMult 1930 1.0 nan nan 4.46e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 41 0 0 0 1 41 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 MatMultTranspose 1 1.0 nan nan 3.44e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 MatSolve 965 1.0 nan nan 5.04e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatSOR 965 1.0 nan nan 3.33e+08 1.0 0.0e+00 0.0e+00 0.0e+00 4 31 0 0 0 4 31 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatLUFactorSym 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatLUFactorNum 190 1.0 nan nan 1.16e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 11 0 0 0 1 11 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatScale 190 1.0 nan nan 3.26e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 MatAssemblyBegin 761 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatAssemblyEnd 761 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatGetRowIJ 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatCreateSubMats 380 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatGetOrdering 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatZeroEntries 379 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatSetPreallCOO 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatSetValuesCOO 190 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 KSPSetUp 760 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 KSPSolve 190 1.0 5.8052e-01 1.0 9.30e+08 1.0 0.0e+00 0.0e+00 0.0e+00 10 86 0 0 0 10 86 0 0 0 1602 -nan 1 4.80e-03 0 0.00e+00 46 KSPGMRESOrthog 775 1.0 nan nan 2.27e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 2 0 0 0 1 2 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 SNESSolve 71 1.0 5.7117e+00 1.0 1.07e+09 1.0 0.0e+00 0.0e+00 0.0e+00 95 99 0 0 0 95 99 0 0 0 188 -nan 1 4.80e-03 0 0.00e+00 53 SNESSetUp 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 SNESFunctionEval 573 1.0 nan nan 2.23e+07 1.0 0.0e+00 0.0e+00 0.0e+00 60 2 0 0 0 60 2 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 SNESJacobianEval 190 1.0 nan nan 3.37e+07 1.0 0.0e+00 0.0e+00 0.0e+00 24 3 0 0 0 24 3 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 97 SNESLineSearch 190 1.0 nan nan 1.05e+08 1.0 0.0e+00 0.0e+00 0.0e+00 53 10 0 0 0 53 10 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 PCSetUp 570 1.0 nan nan 1.16e+08 1.0 0.0e+00 0.0e+00 0.0e+00 2 11 0 0 0 2 11 0 0 0 -nan -nan 0 
0.00e+00 0 0.00e+00 0 PCApply 965 1.0 nan nan 6.14e+08 1.0 0.0e+00 0.0e+00 0.0e+00 8 57 0 0 0 8 57 0 0 0 -nan -nan 1 4.80e-03 0 0.00e+00 19 KSPSolve_FS_0 965 1.0 nan nan 3.33e+08 1.0 0.0e+00 0.0e+00 0.0e+00 4 31 0 0 0 4 31 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 KSPSolve_FS_1 965 1.0 nan nan 1.66e+08 1.0 0.0e+00 0.0e+00 0.0e+00 2 15 0 0 0 2 15 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 --- Event Stage 1: Unknown ------------------------------------------------------------------------------------------------------------------------ --------------------------------------- Object Type Creations Destructions. Reports information only for process 0. --- Event Stage 0: Main Stage Container 5 5 Distributed Mesh 2 2 Index Set 11 11 IS L to G Mapping 1 1 Star Forest Graph 7 7 Discrete System 2 2 Weak Form 2 2 Vector 49 49 TSAdapt 1 1 TS 1 1 DMTS 1 1 SNES 1 1 DMSNES 3 3 SNESLineSearch 1 1 Krylov Solver 4 4 DMKSP interface 1 1 Matrix 4 4 Preconditioner 4 4 Viewer 2 1 --- Event Stage 1: Unknown ======================================================================================================================== Average time to get PetscTime(): 3.14e-08 #PETSc Option Table entries: -log_view -log_view_gpu_times #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with 64 bit PetscInt Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8 Configure options: PETSC_DIR=/home/4pf/repos/petsc PETSC_ARCH=arch-kokkos-cuda-no-tpls --with-cc=mpicc --with-cxx=mpicxx --with-fc=0 --with-cuda --with-debugging=0 --with-shared-libraries --prefix=/home/4pf/build/petsc/cuda-no-tpls/install --with-64-bit-indices --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 --CUDAOPTFLAGS=-O3 --with-kokkos-dir=/home/4pf/build/kokkos/cuda/install --with-kokkos-kernels-dir=/home/4pf/build/kokkos-kernels/cuda-no-tpls/install ----------------------------------------- Libraries compiled on 2022-11-01 21:01:08 on PC0115427 Machine characteristics: Linux-5.15.0-52-generic-x86_64-with-glibc2.35 Using PETSc directory: /home/4pf/build/petsc/cuda-no-tpls/install Using PETSc arch: ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-stringop-overflow -fstack-protector -fvisibility=hidden -O3 ----------------------------------------- Using include paths: -I/home/4pf/build/petsc/cuda-no-tpls/install/include -I/home/4pf/build/kokkos-kernels/cuda-no-tpls/install/include -I/home/4pf/build/kokkos/cuda/install/include -I/usr/local/cuda-11.8/include ----------------------------------------- Using C linker: mpicc Using libraries: -Wl,-rpath,/home/4pf/build/petsc/cuda-no-tpls/install/lib -L/home/4pf/build/petsc/cuda-no-tpls/install/lib -lpetsc -Wl,-rpath,/home/4pf/build/kokkos-kernels/cuda-no-tpls/install/lib -L/home/4pf/build/kokkos-kernels/cuda-no-tpls/install/lib -Wl,-rpath,/home/4pf/build/kokkos/cuda/install/lib -L/home/4pf/build/kokkos/cuda/install/lib -Wl,-rpath,/usr/local/cuda-11.8/lib64 -L/usr/local/cuda-11.8/lib64 -L/usr/local/cuda-11.8/lib64/stubs -lkokkoskernels -lkokkoscontainers -lkokkoscore -llapack -lblas -lm -lcudart -lnvToolsExt -lcufft -lcublas -lcusparse -lcusolver -lcurand -lcuda -lquadmath -lstdc++ -ldl ----------------------------------------- Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory 
________________________________ From: Junchao Zhang > Sent: Tuesday, November 15, 2022 13:03 To: Fackler, Philip > Cc: xolotl-psi-development at lists.sourceforge.net >; petsc-users at mcs.anl.gov >; Blondel, Sophie >; Roth, Philip > Subject: Re: [EXTERNAL] Re: [petsc-users] Kokkos backend for Mat and Vec diverging when running on CUDA device. Can you paste -log_view result so I can see what functions are used? --Junchao Zhang On Tue, Nov 15, 2022 at 10:24 AM Fackler, Philip > wrote: Yes, most (but not all) of our system test cases fail with the kokkos/cuda or cuda backends. All of them pass with the CPU-only kokkos backend. Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory ________________________________ From: Junchao Zhang > Sent: Monday, November 14, 2022 19:34 To: Fackler, Philip > Cc: xolotl-psi-development at lists.sourceforge.net >; petsc-users at mcs.anl.gov >; Blondel, Sophie >; Zhang, Junchao >; Roth, Philip > Subject: [EXTERNAL] Re: [petsc-users] Kokkos backend for Mat and Vec diverging when running on CUDA device. Hi, Philip, Sorry to hear that. It seems you could run the same code on CPUs but not no GPUs (with either petsc/Kokkos backend or petsc/cuda backend, is it right? --Junchao Zhang On Mon, Nov 14, 2022 at 12:13 PM Fackler, Philip via petsc-users > wrote: This is an issue I've brought up before (and discussed in-person with Richard). I wanted to bring it up again because I'm hitting the limits of what I know to do, and I need help figuring this out. The problem can be reproduced using Xolotl's "develop" branch built against a petsc build with kokkos and kokkos-kernels enabled. Then, either add the relevant kokkos options to the "petscArgs=" line in the system test parameter file(s), or just replace the system test parameter files with the ones from the "feature-petsc-kokkos" branch. See here the files that begin with "params_system_". Note that those files use the "kokkos" options, but the problem is similar using the corresponding cuda/cusparse options. I've already tried building kokkos-kernels with no TPLs and got slightly different results, but the same problem. Any help would be appreciated. Thanks, Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: From rajesh.singh at pnnl.gov Tue Dec 6 12:50:54 2022 From: rajesh.singh at pnnl.gov (Singh, Rajesh K) Date: Tue, 6 Dec 2022 18:50:54 +0000 Subject: [petsc-users] Installation With MSYS2 and MinGW Compilers In-Reply-To: <7D050C4D-7DB9-4DD9-A621-E54FCC66D8D1@joliv.et> References: <8457B691-CBD6-4C3B-B46F-306F6F985600@joliv.et> <79287ACA-B977-43A1-A42A-B5A26023935C@joliv.et> <26ad1924-bea0-4c37-71fb-a164214902e6@mcs.anl.gov> <1ec10cfe-54e8-cfcd-c504-04740412d014@mcs.anl.gov> <7D050C4D-7DB9-4DD9-A621-E54FCC66D8D1@joliv.et> Message-ID: Hi Pierre, Thank you again for prompt response. 1) MinGW does not handle %td and %zu, so a default build triggers tons of warnings. Rajesh, you can add CFLAGS="-Wno-format-extra-args -Wno-stringop-overflow -Wformat=0" CXXFLAGS="-Wno-format-extra-args -Wno-stringop-overflow -Wformat=0" to have much fewer gibberish printed on screen Where should I add these FLAGS? In make file or somewhere else. I also followed steps suggested by Sathish. 
I got errors. Sing***@WE***** MINGW64 ~/petsc-3.18.2/src/snes/tutorials $ mpiexe -n 2 ex5f90.exe bash: mpiexe: command not found Then I tried command mpif90.exe -n 2 ./ex5f90.exe I received many errors. Your helps for fixing these issue would be appreciated. Thanks, Rajesh From: Pierre Jolivet Sent: Tuesday, December 6, 2022 12:01 AM To: petsc-users Cc: Singh, Rajesh K Subject: Re: [petsc-users] Installation With MSYS2 and MinGW Compilers On 6 Dec 2022, at 6:43 AM, Satish Balay > wrote: The build log looks fine. Its not clear why these warnings [and errors] are coming up in make check [while there are no such warnings in the build]. 1) MinGW does not handle %td and %zu, so a default build triggers tons of warnings. Rajesh, you can add CFLAGS="-Wno-format-extra-args -Wno-stringop-overflow -Wformat=0" CXXFLAGS="-Wno-format-extra-args -Wno-stringop-overflow -Wformat=0" to have much fewer gibberish printed on screen 2) I'm able to reproduce the "Circular [...] dependency dropped." warnings after doing two successive make, but I know too little about Fortran (or Sowing?) to understand what could trigger this. But I agree with Satish, the build looks good otherwise and should work. I made the necessary changes to Sowing, I hope I'll be able to finish the MR (https://gitlab.com/petsc/petsc/-/merge_requests/5903/) by the end of the day so you can switch back to using the repository, if need be. Thanks, Pierre Since you are using PETSc fromfortran - Can you try compiling/running a test manually and see if that works. cd src/snes/tutorials make ex5f90 ./ex5f90 mpiexec -n 2 ./ex5f90 Satish On Tue, 6 Dec 2022, Singh, Rajesh K wrote: Hi Satish, Thank you so much for offering help for installing PETSc in MSYS2 and mingw64. Attached are configure.log and make.log files. I am newb in this. Please let me know if you need further information. I will be glad to provide. Regards, Rajesh -----Original Message----- From: Satish Balay > Sent: Monday, December 5, 2022 6:53 PM To: Singh, Rajesh K > Cc: Pierre Jolivet >; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Installation With MSYS2 and MinGW Compilers Can you send corresponding configure.log and make.log? [should be in PETSC_DIR/PETSC_ARCH/lib/petsc/conf Also what is your requirement wrt using PETSc on windows? - need to link with other MS compiler libraries? - Can you use WSL? Our primary instructions for windows usage is with MS/Intel compilers [with cygwin tools] not MSYS2. https://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpetsc.org%2Frelease%2Finstall%2Fwindows%2F&data=05%7C01%7Crajesh.singh%40pnnl.gov%7Ccf1975f0691a4a6f77a808dad734eec8%7Cd6faa5f90ae240338c0130048a38deeb%7C0%7C0%7C638058922771517675%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=hpyNWLZfFlS6NgV5ehRZ0EwyCnkrJw4yYT0HBvrVJF8%3D&reserved=0 But As Pierre mentioned - MSYS2 should also work. Satish On Mon, 5 Dec 2022, Singh, Rajesh K via petsc-users wrote: Hi Pierre, I got following error while compiling PETSc. make all check [cid:image001.png at 01D908C1.7267DCE0] Help for this would be appreciated. Thanks, Rajesh From: Pierre Jolivet > Sent: Monday, December 5, 2022 1:15 PM To: Singh, Rajesh K > Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Installation With MSYS2 and MinGW Compilers On 5 Dec 2022, at 9:50 PM, Singh, Rajesh K >> wrote: Hi Pierre, Thank you so much for prompt response. I will run FORTRAN based code with PETSc. Therefore I guess I will need Fortran binding. 
OK, so, two things: 1) as said earlier, Sowing is broken with MinGW, but I'm sadly one of the few PETSc people using this environment, so I'm one of the few who can fix it, but I can't tell you when I'll be able to deliver 2) if you stick to an official tarball, the Fortran bindings should be shipped in. While I work on 1), could you stick to, e.g., https://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fftp.mcs.anl.gov%2Fpub%2Fpetsc%2Frelease-snapshots%2Fpetsc-3.18.2.tar.gz&data=05%7C01%7Crajesh.singh%40pnnl.gov%7Ccf1975f0691a4a6f77a808dad734eec8%7Cd6faa5f90ae240338c0130048a38deeb%7C0%7C0%7C638058922771517675%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=3gU1u2WRKKvNlBMbq%2FL6RZwvp%2FcRY%2BcLx5i9gJbp9nI%3D&reserved=0 1u2WRKKvNlBMbq%2FL6RZwvp%2FcRY%2BcLx5i9gJbp9nI%3D&reserved=0>? Thanks, Pierre Regards Rajesh From: Pierre Jolivet >> Sent: Monday, December 5, 2022 12:41 PM To: Singh, Rajesh K >> Cc: petsc-users at mcs.anl.gov> Subject: Re: [petsc-users] Installation With MSYS2 and MinGW Compilers Check twice before you click! This email originated from outside PNNL. Hello Rajesh, Do you need Fortran bindings? Otherwise, ./configure --with-fortran-bindings=0 should do the trick. Sowing compilation is broken with MinGW compiler. If you need Fortran bindings, we could try to fix it. Thanks, Pierre On 5 Dec 2022, at 9:22 PM, Singh, Rajesh K via petsc-users >> wrote: Dear All: I am having diffisulty to install PETSC on the window system. I went through the the steps esplained in the web site and got following error. Help for resolving this issue would be really appricaited. Thanks, Rajesh -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Tue Dec 6 13:03:39 2022 From: pierre at joliv.et (Pierre Jolivet) Date: Tue, 6 Dec 2022 20:03:39 +0100 Subject: [petsc-users] Installation With MSYS2 and MinGW Compilers In-Reply-To: References: <8457B691-CBD6-4C3B-B46F-306F6F985600@joliv.et> <79287ACA-B977-43A1-A42A-B5A26023935C@joliv.et> <26ad1924-bea0-4c37-71fb-a164214902e6@mcs.anl.gov> <1ec10cfe-54e8-cfcd-c504-04740412d014@mcs.anl.gov> <7D050C4D-7DB9-4DD9-A621-E54FCC66D8D1@joliv.et> Message-ID: > On 6 Dec 2022, at 7:50 PM, Singh, Rajesh K wrote: > > Hi Pierre, > > Thank you again for prompt response. > > 1) MinGW does not handle %td and %zu, so a default build triggers tons of warnings. Rajesh, you can add CFLAGS="-Wno-format-extra-args -Wno-stringop-overflow -Wformat=0" CXXFLAGS="-Wno-format-extra-args -Wno-stringop-overflow -Wformat=0" to have much fewer gibberish printed on screen > > Where should I add these FLAGS? In make file or somewhere else. On the ./configure command line. > I also followed steps suggested by Sathish. I got errors. > > Sing***@WE***** MINGW64 ~/petsc-3.18.2/src/snes/tutorials > $ mpiexe -n 2 ex5f90.exe > bash: mpiexe: command not found Either add Microsoft MPI to your path, or use /C/Program\ Files/Microsoft\ MPI/Bin/mpiexec.exe instead of mpiexe Thanks, Pierre > Then I tried command > > mpif90.exe -n 2 ./ex5f90.exe > > I received many errors. > > Your helps for fixing these issue would be appreciated. > > Thanks, > Rajesh > > > > > > From: Pierre Jolivet > Sent: Tuesday, December 6, 2022 12:01 AM > To: petsc-users > Cc: Singh, Rajesh K > Subject: Re: [petsc-users] Installation With MSYS2 and MinGW Compilers > > > On 6 Dec 2022, at 6:43 AM, Satish Balay > wrote: > > The build log looks fine. 
Its not clear why these warnings [and > errors] are coming up in make check [while there are no such warnings > in the build]. > > 1) MinGW does not handle %td and %zu, so a default build triggers tons of warnings. Rajesh, you can add CFLAGS="-Wno-format-extra-args -Wno-stringop-overflow -Wformat=0" CXXFLAGS="-Wno-format-extra-args -Wno-stringop-overflow -Wformat=0" to have much fewer gibberish printed on screen > 2) I?m able to reproduce the ?Circular [?] dependency dropped.? warnings after doing two successive make, but I know too little about Fortran (or Sowing?) to understand what could trigger this. > But I agree with Satish, the build looks good otherwise and should work. > I made the necessary changes to Sowing, I hope I?ll be able to finish the MR (https://gitlab.com/petsc/petsc/-/merge_requests/5903/ ) by the end of the day so you can switch back to using the repository, if need be. > > Thanks, > Pierre > > > Since you are using PETSc fromfortran - Can you try compiling/running > a test manually and see if that works. > > cd src/snes/tutorials > make ex5f90 > ./ex5f90 > mpiexec -n 2 ./ex5f90 > > Satish > > On Tue, 6 Dec 2022, Singh, Rajesh K wrote: > > > Hi Satish, > > Thank you so much for offering help for installing PETSc in MSYS2 and mingw64. Attached are configure.log and make.log files. I am newb in this. Please let me know if you need further information. I will be glad to provide. > > Regards, > Rajesh > > -----Original Message----- > From: Satish Balay > > Sent: Monday, December 5, 2022 6:53 PM > To: Singh, Rajesh K > > Cc: Pierre Jolivet >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Installation With MSYS2 and MinGW Compilers > > Can you send corresponding configure.log and make.log? [should be in PETSC_DIR/PETSC_ARCH/lib/petsc/conf > > Also what is your requirement wrt using PETSc on windows? > - need to link with other MS compiler libraries? > - Can you use WSL? > > Our primary instructions for windows usage is with MS/Intel compilers [with cygwin tools] not MSYS2. > > https://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpetsc.org%2Frelease%2Finstall%2Fwindows%2F&data=05%7C01%7Crajesh.singh%40pnnl.gov%7Ccf1975f0691a4a6f77a808dad734eec8%7Cd6faa5f90ae240338c0130048a38deeb%7C0%7C0%7C638058922771517675%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=hpyNWLZfFlS6NgV5ehRZ0EwyCnkrJw4yYT0HBvrVJF8%3D&reserved=0 > > But As Pierre mentioned - MSYS2 should also work. > > Satish > > On Mon, 5 Dec 2022, Singh, Rajesh K via petsc-users wrote: > > > Hi Pierre, > > I got following error while compiling PETSc. > > make all check > > [cid:image001.png at 01D908C1.7267DCE0] > > Help for this would be appreciated. > > Thanks, > Rajesh > > From: Pierre Jolivet > > Sent: Monday, December 5, 2022 1:15 PM > To: Singh, Rajesh K > > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Installation With MSYS2 and MinGW Compilers > > > On 5 Dec 2022, at 9:50 PM, Singh, Rajesh K >> wrote: > > Hi Pierre, > > Thank you so much for prompt response. I will run FORTRAN based code with PETSc. Therefore I guess I will need Fortran binding. > > OK, so, two things: > 1) as said earlier, Sowing is broken with MinGW, but I'm sadly one of > the few PETSc people using this environment, so I'm one of the few who > can fix it, but I can't tell you when I'll be able to deliver > 2) if you stick to an official tarball, the Fortran bindings should be shipped in. 
While I work on 1), could you stick to, e.g., https://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fftp.mcs.anl.gov%2Fpub%2Fpetsc%2Frelease-snapshots%2Fpetsc-3.18.2.tar.gz&data=05%7C01%7Crajesh.singh%40pnnl.gov%7Ccf1975f0691a4a6f77a808dad734eec8%7Cd6faa5f90ae240338c0130048a38deeb%7C0%7C0%7C638058922771517675%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=3gU1u2WRKKvNlBMbq%2FL6RZwvp%2FcRY%2BcLx5i9gJbp9nI%3D&reserved=0 > 1u2WRKKvNlBMbq%2FL6RZwvp%2FcRY%2BcLx5i9gJbp9nI%3D&reserved=0>? > > > Thanks, > Pierre > > > Regards > Rajesh > > From: Pierre Jolivet >> > Sent: Monday, December 5, 2022 12:41 PM > To: Singh, Rajesh K > >> > Cc: petsc-users at mcs.anl.gov> > Subject: Re: [petsc-users] Installation With MSYS2 and MinGW Compilers > > Check twice before you click! This email originated from outside PNNL. > > Hello Rajesh, > Do you need Fortran bindings? > Otherwise, ./configure --with-fortran-bindings=0 should do the trick. > Sowing compilation is broken with MinGW compiler. > If you need Fortran bindings, we could try to fix it. > > Thanks, > Pierre > > > > On 5 Dec 2022, at 9:22 PM, Singh, Rajesh K via petsc-users >> wrote: > > Dear All: > > I am having diffisulty to install PETSC on the window system. I went through the the steps esplained in the web site and got following error. > > > > Help for resolving this issue would be really appricaited. > > Thanks, > Rajesh > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksi2443 at gmail.com Tue Dec 6 22:34:57 2022 From: ksi2443 at gmail.com (=?UTF-8?B?6rmA7ISx7J21?=) Date: Wed, 7 Dec 2022 13:34:57 +0900 Subject: [petsc-users] About MPIRUN In-Reply-To: References: Message-ID: I already done VecAssemblyBegin/End(). However, only mpirun case these outputs are represented. There are more error outputs as below. -------------------------------------------------------------------------- MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_SELF with errorcode 73. NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them. -------------------------------------------------------------------------- [ubuntu:02473] PMIX ERROR: UNREACHABLE in file ../../../src/server/pmix_server.c at line 2193 [ubuntu:02473] PMIX ERROR: UNREACHABLE in file ../../../src/server/pmix_server.c at line 2193 [ubuntu:02473] PMIX ERROR: UNREACHABLE in file ../../../src/server/pmix_server.c at line 2193 [ubuntu:02473] 3 more processes have sent help message help-mpi-api.txt / mpi-abort [ubuntu:02473] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages Could this be the cause of the former petsc error?? Thanks, Hyung Kim 2022? 12? 6? (?) ?? 10:58, Matthew Knepley ?? ??: > On Tue, Dec 6, 2022 at 6:45 AM ??? wrote: > >> Hello, >> >> >> There is a code which can run in not mpirun and also it can run in >> mpi_linear_solver_server. >> However, it has an error in just mpirun case such as mpirun -np ./program. >> The error output is as below. >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [0]PETSC ERROR: Object is in wrong state >> [0]PETSC ERROR: Not for unassembled vector >> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
>> [0]PETSC ERROR: Petsc Release Version 3.18.1, unknown >> [0]PETSC ERROR: ./app on a arch-linux-c-debug named ubuntu by ksi2443 Tue >> Dec 6 03:39:13 2022 >> [0]PETSC ERROR: Configure options -download-mumps -download-scalapack >> -download-parmetis -download-metis >> [0]PETSC ERROR: #1 VecCopy() at >> /home/ksi2443/petsc/src/vec/vec/interface/vector.c:1625 >> [0]PETSC ERROR: #2 KSPInitialResidual() at >> /home/ksi2443/petsc/src/ksp/ksp/interface/itres.c:60 >> [0]PETSC ERROR: #3 KSPSolve_GMRES() at >> /home/ksi2443/petsc/src/ksp/ksp/impls/gmres/gmres.c:227 >> [0]PETSC ERROR: #4 KSPSolve_Private() at >> /home/ksi2443/petsc/src/ksp/ksp/interface/itfunc.c:899 >> [0]PETSC ERROR: #5 KSPSolve() at >> /home/ksi2443/petsc/src/ksp/ksp/interface/itfunc.c:1071 >> [0]PETSC ERROR: #6 main() at /home/ksi2443/Downloads/coding/a1.c:450 >> [0]PETSC ERROR: No PETSc Option Table entries >> [0]PETSC ERROR: ----------------End of Error Message -------send entire >> error message to petsc-maint at mcs.anl.gov---------- >> >> How can I fix this?? >> > > It looks like we do not check the assembled state in parallel, since it > cannot cause a problem, but every time you > update values with VecSetValues(), you should call VecAssemblyBegin/End(). > > Thanks > > Matt > > >> Thanks, >> Hyung Kim >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From narnoldm at umich.edu Wed Dec 7 02:34:43 2022 From: narnoldm at umich.edu (Nicholas Arnold-Medabalimi) Date: Wed, 7 Dec 2022 03:34:43 -0500 Subject: [petsc-users] Petsc Section in DMPlex In-Reply-To: References: Message-ID: Hi Matthew Thank you for the help. This clarified a great deal. I have a follow-up question related to DMPlexFilter. It may be better to describe what I'm trying to achieve. I have a general mesh I am solving which has a section with cell center finite volume states, as described in my initial email. After calculating some metrics, I tag a bunch of cells with an identifying Label and use DMFilter to generate a new DM which is only that subset of cells. Generally, this leads to a pretty unbalanced DM so I then plan to use DMPlexDIstribute to balance that DM across the processors. The coordinates pass along fine, but the state(or I should say Section) does not at least as far as I can tell. Assuming I can get a filtered DM I then distribute the DM and state using the method you described above and it seems to be working ok. The last connection I have to make is the transfer of information from the full mesh to the "sampled" filtered mesh. From what I can gather I would need to get the mapping of points using DMPlexGetSubpointIS and then manually copy the values from the full DM section to the filtered DM? I have the process from full->filtered->distributed all working for the coordinates so its just a matter of transferring the section correctly. I appreciate all the help you have provided. Sincerely Nicholas On Mon, Nov 28, 2022 at 6:19 AM Matthew Knepley wrote: > On Sun, Nov 27, 2022 at 10:22 PM Nicholas Arnold-Medabalimi < > narnoldm at umich.edu> wrote: > >> Hi Petsc Users >> >> I have a question about properly using PetscSection to assign state >> variables to a DM. I have an existing DMPlex mesh distributed on 2 >> processors. 
My goal is to have state variables set to the cell centers. I >> then want to call DMPlexDistribute, which I hope will balance the mesh >> elements and hopefully transport the state variables to the hosting >> processors as the cells are distributed to a different processor count or >> simply just redistributing after doing mesh adaption. >> >> Looking at the DMPlex User guide, I should be able to achieve this with a >> single field section using SetDof and assigning the DOF to the points >> corresponding to cells. >> > > Note that if you want several different fields, you can clone the DM > first for this field > > call DMClone(dm,dmState,ierr) > > and use dmState in your calls below. > > >> >> call DMPlexGetHeightStratum(dm,0,c0,c1,ierr) >> call DMPlexGetChart(dm,p0,p1,ierr) >> call PetscSectionCreate(PETSC_COMM_WORLD,section,ierr) >> call PetscSectionSetNumFields(section,1,ierr) call >> PetscSectionSetChart(section,p0,p1,ierr) >> do i = c0, (c1-1) >> call PetscSectionSetDof(section,i,nvar,ierr) >> end do >> call PetscSectionSetup(section,ierr) >> call DMSetLocalSection(dm,section,ierr) >> > > In the loop, I would add a call to > > call PetscSectionSetFieldDof(section,i,0,nvar,ierr) > > This also puts in the field breakdown. It is not essential, but nicer. > > >> From here, it looks like I can access and set the state vars using >> >> call DMGetGlobalVector(dmplex,state,ierr) >> call DMGetGlobalSection(dmplex,section,ierr) >> call VecGetArrayF90(state,stateVec,ierr) >> do i = c0, (c1-1) >> call PetscSectionGetOffset(section,i,offset,ierr) >> stateVec(offset:(offset+nvar))=state_i(:) !simplified assignment >> end do >> call VecRestoreArrayF90(state,stateVec,ierr) >> call DMRestoreGlobalVector(dmplex,state,ierr) >> >> To my understanding, I should be using Global vector since this is a pure >> assignment operation and I don't need the ghost cells. >> > > Yes. > > But the behavior I am seeing isn't exactly what I'd expect. >> >> To be honest, I'm somewhat unclear on a few things >> >> 1) Should be using nvar fields with 1 DOF each or 1 field with nvar >> DOFs or what the distinction between the two methods are? >> > > We have two divisions in a Section. A field can have a number of > components. This is intended to model a vector or tensor field. > Then a Section can have a number of fields, such as velocity and pressure > for a Stokes problem. The division is mainly to help the > user, so I would use the most natural one. > > >> 2) Adding a print statement after the offset assignment I get (on rank 0 >> of 2) >> cell 1 offset 0 >> cell 2 offset 18 >> cell 3 offset 36 >> which is expected and works but on rank 1 I get >> cell 1 offset 9000 >> cell 2 offset 9018 >> cell 3 offset 9036 >> >> which isn't exactly what I would expect. Shouldn't the offsets reset at 0 >> for the next rank? >> > > The local and global sections hold different information. This is the > source of the confusion. The local section does describe a local > vector, and thus includes overlap or "ghost" dofs. The global section > describes a global vector. However, it is intended to deliver > global indices, and thus the offsets give back global indices. When you > use VecGetArray*() you are getting out the local array, and > thus you have to subtract the first index on this process. You can get > that from > > VecGetOwnershipRange(v, &rstart, &rEnd); > > This is the same whether you are using DMDA or DMPlex or any other DM. 
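A minimal C sketch of the indexing described just above, with dm and nvar taken from the question and the remaining names assumed for illustration (a sketch, not tested code):

Vec          state;
PetscSection gsec;
PetscScalar *a;
PetscInt     cStart, cEnd, c, d, off, rStart, rEnd;

PetscCall(DMGetGlobalVector(dm, &state));
PetscCall(DMGetGlobalSection(dm, &gsec));
PetscCall(VecGetOwnershipRange(state, &rStart, &rEnd));
PetscCall(DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd));
PetscCall(VecGetArray(state, &a));
for (c = cStart; c < cEnd; ++c) {
  PetscCall(PetscSectionGetOffset(gsec, c, &off));
  /* the global section returns a global offset; a negative offset marks a
     point this process does not own, so skip it */
  if (off < 0) continue;
  for (d = 0; d < nvar; ++d) a[off - rStart + d] = 0.0; /* put the cell values here */
}
PetscCall(VecRestoreArray(state, &a));
PetscCall(DMRestoreGlobalVector(dm, &state));

The subtraction of rStart is the only difference from indexing a purely local array.
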
> > >> 3) Does calling DMPlexDistribute also distribute the section data >> associated with the DOF, based on the description in DMPlexDistribute it >> looks like it should? >> > > No. By default, DMPlexDistribute() only distributes coordinate data. I you > want to distribute your field, it would look something like this: > > DMPlexDistribute(dm, 0, &sfDist, &dmDist); > VecCreate(comm, &stateDist); > VecSetDM(sateDist, dmDist); > PetscSectionCreate(comm §ionDist); > DMSetLocalSection(dmDist, sectionDist); > DMPlexDistributeField(dmDist, sfDist, section, state, sectionDist, > stateDist); > > We do this in src/dm/impls/plex/tests/ex36.c > > THanks, > > Matt > > I'd appreciate any insight into the specifics of this usage. I expect I >> have a misconception on the local vs global section. Thank you. >> >> Sincerely >> Nicholas >> >> -- >> Nicholas Arnold-Medabalimi >> >> Ph.D. Candidate >> Computational Aeroscience Lab >> University of Michigan >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- Nicholas Arnold-Medabalimi Ph.D. Candidate Computational Aeroscience Lab University of Michigan -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksi2443 at gmail.com Wed Dec 7 04:02:52 2022 From: ksi2443 at gmail.com (=?UTF-8?B?6rmA7ISx7J21?=) Date: Wed, 7 Dec 2022 19:02:52 +0900 Subject: [petsc-users] About MPIRUN In-Reply-To: References: Message-ID: This error was caused by the inconsistent index of vecgetvalues in the mpirun case. For example, for the problem that the global vector size is 4, when mpirun -np 2, the value obtained from each process with vecgetvalues should be 2, but in my code tried to get 4 values, so it became a problem. How to solve this problem? I want to get a scalar array so that all process array has the same value with global vector size and values. Thanks, Hyung Kim 2022? 12? 7? (?) ?? 1:34, ??? ?? ??: > I already done VecAssemblyBegin/End(). > > However, only mpirun case these outputs are represented. > There are more error outputs as below. > > -------------------------------------------------------------------------- > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_SELF > with errorcode 73. > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > You may or may not see output from other processes, depending on > exactly when Open MPI kills them. > -------------------------------------------------------------------------- > [ubuntu:02473] PMIX ERROR: UNREACHABLE in file > ../../../src/server/pmix_server.c at line 2193 > [ubuntu:02473] PMIX ERROR: UNREACHABLE in file > ../../../src/server/pmix_server.c at line 2193 > [ubuntu:02473] PMIX ERROR: UNREACHABLE in file > ../../../src/server/pmix_server.c at line 2193 > [ubuntu:02473] 3 more processes have sent help message help-mpi-api.txt / > mpi-abort > [ubuntu:02473] Set MCA parameter "orte_base_help_aggregate" to 0 to see > all help / error messages > > Could this be the cause of the former petsc error?? > > > Thanks, > Hyung Kim > > 2022? 12? 6? (?) ?? 10:58, Matthew Knepley ?? ??: > >> On Tue, Dec 6, 2022 at 6:45 AM ??? wrote: >> >>> Hello, >>> >>> >>> There is a code which can run in not mpirun and also it can run in >>> mpi_linear_solver_server. >>> However, it has an error in just mpirun case such as mpirun -np >>> ./program. >>> The error output is as below. 
>>> [0]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> [0]PETSC ERROR: Object is in wrong state >>> [0]PETSC ERROR: Not for unassembled vector >>> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. >>> [0]PETSC ERROR: Petsc Release Version 3.18.1, unknown >>> [0]PETSC ERROR: ./app on a arch-linux-c-debug named ubuntu by ksi2443 >>> Tue Dec 6 03:39:13 2022 >>> [0]PETSC ERROR: Configure options -download-mumps -download-scalapack >>> -download-parmetis -download-metis >>> [0]PETSC ERROR: #1 VecCopy() at >>> /home/ksi2443/petsc/src/vec/vec/interface/vector.c:1625 >>> [0]PETSC ERROR: #2 KSPInitialResidual() at >>> /home/ksi2443/petsc/src/ksp/ksp/interface/itres.c:60 >>> [0]PETSC ERROR: #3 KSPSolve_GMRES() at >>> /home/ksi2443/petsc/src/ksp/ksp/impls/gmres/gmres.c:227 >>> [0]PETSC ERROR: #4 KSPSolve_Private() at >>> /home/ksi2443/petsc/src/ksp/ksp/interface/itfunc.c:899 >>> [0]PETSC ERROR: #5 KSPSolve() at >>> /home/ksi2443/petsc/src/ksp/ksp/interface/itfunc.c:1071 >>> [0]PETSC ERROR: #6 main() at /home/ksi2443/Downloads/coding/a1.c:450 >>> [0]PETSC ERROR: No PETSc Option Table entries >>> [0]PETSC ERROR: ----------------End of Error Message -------send entire >>> error message to petsc-maint at mcs.anl.gov---------- >>> >>> How can I fix this?? >>> >> >> It looks like we do not check the assembled state in parallel, since it >> cannot cause a problem, but every time you >> update values with VecSetValues(), you should call VecAssemblyBegin/End(). >> >> Thanks >> >> Matt >> >> >>> Thanks, >>> Hyung Kim >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksi2443 at gmail.com Wed Dec 7 04:13:15 2022 From: ksi2443 at gmail.com (=?UTF-8?B?6rmA7ISx7J21?=) Date: Wed, 7 Dec 2022 19:13:15 +0900 Subject: [petsc-users] About Preconditioner and MUMPS In-Reply-To: References: Message-ID: I want to use METIS for ordering. I heard the MUMPS has good performance with METIS ordering. However there are some wonder things. 1. With option "-mpi_linear_solver_server -ksp_type preonly -pc_type mpi -mpi_pc_type lu " the MUMPS solving is slower than with option "-mpi_linear_solver_server -pc_type mpi -ksp_type preonly". Why does this result happen? 2. (MPIRUN case (actually, mpi_linear_solver_server case))) In my code, there is already has "PetscCall(PCSetType(pc,PCLU))" . However, to use METIS by using "-mpi_mat_mumps_icntl_7 5" I must append this option "-mpi_pc_type pu". If I don't apply "-mpi_pc_type lu", the metis option ("-mpi_mat_mumps_icntl_7 5"). Can I get some information about this? Thanks, Hyung Kim 2022? 12? 7? (?) ?? 12:24, Barry Smith ?? ??: > > > On Dec 6, 2022, at 5:15 AM, ??? wrote: > > Hello, > > > I have some questions about pc and mumps_icntl. > > 1. What?s the difference between adopt preconditioner by code (for > example, PetscCall(PCSetType(pc,PCLU)) and option -pc_type lu?? > And also, What?s the priority between code pcsettype and option -pc_type ?? > > 2. When I tried to use METIS in MUMPS, I adopted metis by option (for > example, -mat_mumps_icntl_7 5). In this situation, it is impossible to use > metis without pc_type lu. However, in my case pc type lu makes the > performance poor. 
So I don?t want to use lu preconditioner. How can I do > this? > > The package MUMPS has an option to use metis in its ordering process > which can be turned on as indicated while using MUMPS. Most > preconditioners that PETSc can use do not use metis for any purpose hence > there is no option to turn on its use. For what purpose do you wish to use > metis? Partitioning, ordering, ? > > > > > > > Thanks, > > Hyung Kim > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Dec 7 04:50:33 2022 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 7 Dec 2022 05:50:33 -0500 Subject: [petsc-users] About MPIRUN In-Reply-To: References: Message-ID: On Wed, Dec 7, 2022 at 5:03 AM ??? wrote: > This error was caused by the inconsistent index of vecgetvalues in the > mpirun case. > > For example, for the problem that the global vector size is 4, when mpirun > -np 2, the value obtained from each process with vecgetvalues should be 2, > but in my code tried to get 4 values, so it became a problem. > > How to solve this problem? > I want to get a scalar array so that all process array has the same value > with global vector size and values. > This is a fundamentally nonscalable operation. Are you sure you want to do this? If so, you can use https://petsc.org/main/docs/manualpages/PetscSF/VecScatterCreateToZero/ Thanks Matt > Thanks, > Hyung Kim > > 2022? 12? 7? (?) ?? 1:34, ??? ?? ??: > >> I already done VecAssemblyBegin/End(). >> >> However, only mpirun case these outputs are represented. >> There are more error outputs as below. >> >> -------------------------------------------------------------------------- >> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_SELF >> with errorcode 73. >> >> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >> You may or may not see output from other processes, depending on >> exactly when Open MPI kills them. >> -------------------------------------------------------------------------- >> [ubuntu:02473] PMIX ERROR: UNREACHABLE in file >> ../../../src/server/pmix_server.c at line 2193 >> [ubuntu:02473] PMIX ERROR: UNREACHABLE in file >> ../../../src/server/pmix_server.c at line 2193 >> [ubuntu:02473] PMIX ERROR: UNREACHABLE in file >> ../../../src/server/pmix_server.c at line 2193 >> [ubuntu:02473] 3 more processes have sent help message help-mpi-api.txt / >> mpi-abort >> [ubuntu:02473] Set MCA parameter "orte_base_help_aggregate" to 0 to see >> all help / error messages >> >> Could this be the cause of the former petsc error?? >> >> >> Thanks, >> Hyung Kim >> >> 2022? 12? 6? (?) ?? 10:58, Matthew Knepley ?? ??: >> >>> On Tue, Dec 6, 2022 at 6:45 AM ??? wrote: >>> >>>> Hello, >>>> >>>> >>>> There is a code which can run in not mpirun and also it can run in >>>> mpi_linear_solver_server. >>>> However, it has an error in just mpirun case such as mpirun -np >>>> ./program. >>>> The error output is as below. >>>> [0]PETSC ERROR: --------------------- Error Message >>>> -------------------------------------------------------------- >>>> [0]PETSC ERROR: Object is in wrong state >>>> [0]PETSC ERROR: Not for unassembled vector >>>> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble >>>> shooting. 
>>>> [0]PETSC ERROR: Petsc Release Version 3.18.1, unknown >>>> [0]PETSC ERROR: ./app on a arch-linux-c-debug named ubuntu by ksi2443 >>>> Tue Dec 6 03:39:13 2022 >>>> [0]PETSC ERROR: Configure options -download-mumps -download-scalapack >>>> -download-parmetis -download-metis >>>> [0]PETSC ERROR: #1 VecCopy() at >>>> /home/ksi2443/petsc/src/vec/vec/interface/vector.c:1625 >>>> [0]PETSC ERROR: #2 KSPInitialResidual() at >>>> /home/ksi2443/petsc/src/ksp/ksp/interface/itres.c:60 >>>> [0]PETSC ERROR: #3 KSPSolve_GMRES() at >>>> /home/ksi2443/petsc/src/ksp/ksp/impls/gmres/gmres.c:227 >>>> [0]PETSC ERROR: #4 KSPSolve_Private() at >>>> /home/ksi2443/petsc/src/ksp/ksp/interface/itfunc.c:899 >>>> [0]PETSC ERROR: #5 KSPSolve() at >>>> /home/ksi2443/petsc/src/ksp/ksp/interface/itfunc.c:1071 >>>> [0]PETSC ERROR: #6 main() at /home/ksi2443/Downloads/coding/a1.c:450 >>>> [0]PETSC ERROR: No PETSc Option Table entries >>>> [0]PETSC ERROR: ----------------End of Error Message -------send entire >>>> error message to petsc-maint at mcs.anl.gov---------- >>>> >>>> How can I fix this?? >>>> >>> >>> It looks like we do not check the assembled state in parallel, since it >>> cannot cause a problem, but every time you >>> update values with VecSetValues(), you should call >>> VecAssemblyBegin/End(). >>> >>> Thanks >>> >>> Matt >>> >>> >>>> Thanks, >>>> Hyung Kim >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Dec 7 04:59:49 2022 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 7 Dec 2022 05:59:49 -0500 Subject: [petsc-users] Petsc Section in DMPlex In-Reply-To: References: Message-ID: On Wed, Dec 7, 2022 at 3:35 AM Nicholas Arnold-Medabalimi < narnoldm at umich.edu> wrote: > Hi Matthew > > Thank you for the help. This clarified a great deal. > > I have a follow-up question related to DMPlexFilter. It may be better to > describe what I'm trying to achieve. > > I have a general mesh I am solving which has a section with cell center > finite volume states, as described in my initial email. After calculating > some metrics, I tag a bunch of cells with an identifying Label and use > DMFilter to generate a new DM which is only that subset of cells. > Generally, this leads to a pretty unbalanced DM so I then plan to use > DMPlexDIstribute to balance that DM across the processors. The coordinates > pass along fine, but the state(or I should say Section) does not at least > as far as I can tell. > > Assuming I can get a filtered DM I then distribute the DM and state using > the method you described above and it seems to be working ok. > > The last connection I have to make is the transfer of information from the > full mesh to the "sampled" filtered mesh. From what I can gather I would > need to get the mapping of points using DMPlexGetSubpointIS and then > manually copy the values from the full DM section to the filtered DM? I > have the process from full->filtered->distributed all working for the > coordinates so its just a matter of transferring the section correctly. 
> > I appreciate all the help you have provided. > Let's do this in two steps, which makes it easier to debug. First, do not redistribute the submesh. Just use DMPlexGetSubpointIS() to get the mapping of filtered points to points in the original mesh. Then create an expanded IS using the Section which makes dofs in the filtered mesh to dofs in the original mesh. From this use https://petsc.org/main/docs/manualpages/Vec/VecISCopy/ to move values between the original vector and the filtered vector. Once that works, you can try redistributing the filtered mesh. Before calling DMPlexDistribute() on the filtered mesh, you need to call https://petsc.org/main/docs/manualpages/DM/DMSetUseNatural/ When you redistribute, it will compute a mapping back to the original layout. Now when you want to transfer values, you 1) Create a natural vector with DMCreateNaturalVec() 2) Use DMGlobalToNaturalBegin/End() to move values from the filtered vector to the natural vector 3) Use VecISCopy() to move values from the natural vector to the original vector Let me know if you have any problems. Thanks, Matt > Sincerely > Nicholas > > > > On Mon, Nov 28, 2022 at 6:19 AM Matthew Knepley wrote: > >> On Sun, Nov 27, 2022 at 10:22 PM Nicholas Arnold-Medabalimi < >> narnoldm at umich.edu> wrote: >> >>> Hi Petsc Users >>> >>> I have a question about properly using PetscSection to assign state >>> variables to a DM. I have an existing DMPlex mesh distributed on 2 >>> processors. My goal is to have state variables set to the cell centers. I >>> then want to call DMPlexDistribute, which I hope will balance the mesh >>> elements and hopefully transport the state variables to the hosting >>> processors as the cells are distributed to a different processor count or >>> simply just redistributing after doing mesh adaption. >>> >>> Looking at the DMPlex User guide, I should be able to achieve this with >>> a single field section using SetDof and assigning the DOF to the points >>> corresponding to cells. >>> >> >> Note that if you want several different fields, you can clone the DM >> first for this field >> >> call DMClone(dm,dmState,ierr) >> >> and use dmState in your calls below. >> >> >>> >>> call DMPlexGetHeightStratum(dm,0,c0,c1,ierr) >>> call DMPlexGetChart(dm,p0,p1,ierr) >>> call PetscSectionCreate(PETSC_COMM_WORLD,section,ierr) >>> call PetscSectionSetNumFields(section,1,ierr) call >>> PetscSectionSetChart(section,p0,p1,ierr) >>> do i = c0, (c1-1) >>> call PetscSectionSetDof(section,i,nvar,ierr) >>> end do >>> call PetscSectionSetup(section,ierr) >>> call DMSetLocalSection(dm,section,ierr) >>> >> >> In the loop, I would add a call to >> >> call PetscSectionSetFieldDof(section,i,0,nvar,ierr) >> >> This also puts in the field breakdown. It is not essential, but nicer. >> >> >>> From here, it looks like I can access and set the state vars using >>> >>> call DMGetGlobalVector(dmplex,state,ierr) >>> call DMGetGlobalSection(dmplex,section,ierr) >>> call VecGetArrayF90(state,stateVec,ierr) >>> do i = c0, (c1-1) >>> call PetscSectionGetOffset(section,i,offset,ierr) >>> stateVec(offset:(offset+nvar))=state_i(:) !simplified assignment >>> end do >>> call VecRestoreArrayF90(state,stateVec,ierr) >>> call DMRestoreGlobalVector(dmplex,state,ierr) >>> >>> To my understanding, I should be using Global vector since this is a >>> pure assignment operation and I don't need the ghost cells. >>> >> >> Yes. >> >> But the behavior I am seeing isn't exactly what I'd expect. 
>>> >>> To be honest, I'm somewhat unclear on a few things >>> >>> 1) Should be using nvar fields with 1 DOF each or 1 field with nvar >>> DOFs or what the distinction between the two methods are? >>> >> >> We have two divisions in a Section. A field can have a number of >> components. This is intended to model a vector or tensor field. >> Then a Section can have a number of fields, such as velocity and pressure >> for a Stokes problem. The division is mainly to help the >> user, so I would use the most natural one. >> >> >>> 2) Adding a print statement after the offset assignment I get (on rank 0 >>> of 2) >>> cell 1 offset 0 >>> cell 2 offset 18 >>> cell 3 offset 36 >>> which is expected and works but on rank 1 I get >>> cell 1 offset 9000 >>> cell 2 offset 9018 >>> cell 3 offset 9036 >>> >>> which isn't exactly what I would expect. Shouldn't the offsets reset at >>> 0 for the next rank? >>> >> >> The local and global sections hold different information. This is the >> source of the confusion. The local section does describe a local >> vector, and thus includes overlap or "ghost" dofs. The global section >> describes a global vector. However, it is intended to deliver >> global indices, and thus the offsets give back global indices. When you >> use VecGetArray*() you are getting out the local array, and >> thus you have to subtract the first index on this process. You can get >> that from >> >> VecGetOwnershipRange(v, &rstart, &rEnd); >> >> This is the same whether you are using DMDA or DMPlex or any other DM. >> >> >>> 3) Does calling DMPlexDistribute also distribute the section data >>> associated with the DOF, based on the description in DMPlexDistribute it >>> looks like it should? >>> >> >> No. By default, DMPlexDistribute() only distributes coordinate data. I >> you want to distribute your field, it would look something like this: >> >> DMPlexDistribute(dm, 0, &sfDist, &dmDist); >> VecCreate(comm, &stateDist); >> VecSetDM(sateDist, dmDist); >> PetscSectionCreate(comm §ionDist); >> DMSetLocalSection(dmDist, sectionDist); >> DMPlexDistributeField(dmDist, sfDist, section, state, sectionDist, >> stateDist); >> >> We do this in src/dm/impls/plex/tests/ex36.c >> >> THanks, >> >> Matt >> >> I'd appreciate any insight into the specifics of this usage. I expect I >>> have a misconception on the local vs global section. Thank you. >>> >>> Sincerely >>> Nicholas >>> >>> -- >>> Nicholas Arnold-Medabalimi >>> >>> Ph.D. Candidate >>> Computational Aeroscience Lab >>> University of Michigan >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > > > -- > Nicholas Arnold-Medabalimi > > Ph.D. Candidate > Computational Aeroscience Lab > University of Michigan > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Dec 7 05:04:50 2022 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 7 Dec 2022 06:04:50 -0500 Subject: [petsc-users] About Preconditioner and MUMPS In-Reply-To: References: Message-ID: On Wed, Dec 7, 2022 at 5:13 AM ??? wrote: > I want to use METIS for ordering. 
> I heard the MUMPS has good performance with METIS ordering. > > However there are some wonder things. > 1. With option "-mpi_linear_solver_server -ksp_type preonly -pc_type mpi > -mpi_pc_type lu " the MUMPS solving is slower than with option > "-mpi_linear_solver_server -pc_type mpi -ksp_type preonly". > Why does this result happen? > You are probably not using MUMPS. Always always always use -ksp_view to see exactly what solver you are using. > 2. (MPIRUN case (actually, mpi_linear_solver_server case))) In my code, > there is already has "PetscCall(PCSetType(pc,PCLU))" . However, to use > METIS by using "-mpi_mat_mumps_icntl_7 5" I must append this option > "-mpi_pc_type pu". > If I don't apply "-mpi_pc_type lu", the metis option > ("-mpi_mat_mumps_icntl_7 5"). Can I get some information about this? > Again, it seems like the solver configuration is not what you think it is. Thanks, Matt > Thanks, > Hyung Kim > > 2022? 12? 7? (?) ?? 12:24, Barry Smith ?? ??: > >> >> >> On Dec 6, 2022, at 5:15 AM, ??? wrote: >> >> Hello, >> >> >> I have some questions about pc and mumps_icntl. >> >> 1. What?s the difference between adopt preconditioner by code (for >> example, PetscCall(PCSetType(pc,PCLU)) and option -pc_type lu?? >> And also, What?s the priority between code pcsettype and option -pc_type >> ?? >> >> 2. When I tried to use METIS in MUMPS, I adopted metis by option >> (for example, -mat_mumps_icntl_7 5). In this situation, it is impossible to >> use metis without pc_type lu. However, in my case pc type lu makes the >> performance poor. So I don?t want to use lu preconditioner. How can I do >> this? >> >> The package MUMPS has an option to use metis in its ordering process >> which can be turned on as indicated while using MUMPS. Most >> preconditioners that PETSc can use do not use metis for any purpose hence >> there is no option to turn on its use. For what purpose do you wish to use >> metis? Partitioning, ordering, ? >> >> >> >> >> >> >> Thanks, >> >> Hyung Kim >> >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksi2443 at gmail.com Wed Dec 7 05:15:15 2022 From: ksi2443 at gmail.com (=?UTF-8?B?6rmA7ISx7J21?=) Date: Wed, 7 Dec 2022 20:15:15 +0900 Subject: [petsc-users] About Preconditioner and MUMPS In-Reply-To: References: Message-ID: I think I don't understand the meaning of -pc_type mpi -mpi_pc_type lu What's the exact meaning of -pc_type mpi and -mpi_pc_type lu?? Is this difference coming from 'mpi_linear_solver_server' option?? Thanks, Hyung Kim 2022? 12? 7? (?) ?? 8:05, Matthew Knepley ?? ??: > On Wed, Dec 7, 2022 at 5:13 AM ??? wrote: > >> I want to use METIS for ordering. >> I heard the MUMPS has good performance with METIS ordering. >> >> However there are some wonder things. >> 1. With option "-mpi_linear_solver_server -ksp_type preonly -pc_type >> mpi -mpi_pc_type lu " the MUMPS solving is slower than with option >> "-mpi_linear_solver_server -pc_type mpi -ksp_type preonly". >> Why does this result happen? >> > > You are probably not using MUMPS. Always always always use -ksp_view to > see exactly what solver you are using. > > >> 2. (MPIRUN case (actually, mpi_linear_solver_server case))) In my code, >> there is already has "PetscCall(PCSetType(pc,PCLU))" . 
However, to use >> METIS by using "-mpi_mat_mumps_icntl_7 5" I must append this option >> "-mpi_pc_type pu". >> If I don't apply "-mpi_pc_type lu", the metis option >> ("-mpi_mat_mumps_icntl_7 5"). Can I get some information about this? >> > > Again, it seems like the solver configuration is not what you think it is. > > Thanks, > > Matt > > >> Thanks, >> Hyung Kim >> >> 2022? 12? 7? (?) ?? 12:24, Barry Smith ?? ??: >> >>> >>> >>> On Dec 6, 2022, at 5:15 AM, ??? wrote: >>> >>> Hello, >>> >>> >>> I have some questions about pc and mumps_icntl. >>> >>> 1. What?s the difference between adopt preconditioner by code (for >>> example, PetscCall(PCSetType(pc,PCLU)) and option -pc_type lu?? >>> And also, What?s the priority between code pcsettype and option -pc_type >>> ?? >>> >>> 2. When I tried to use METIS in MUMPS, I adopted metis by option >>> (for example, -mat_mumps_icntl_7 5). In this situation, it is impossible to >>> use metis without pc_type lu. However, in my case pc type lu makes the >>> performance poor. So I don?t want to use lu preconditioner. How can I do >>> this? >>> >>> The package MUMPS has an option to use metis in its ordering process >>> which can be turned on as indicated while using MUMPS. Most >>> preconditioners that PETSc can use do not use metis for any purpose hence >>> there is no option to turn on its use. For what purpose do you wish to use >>> metis? Partitioning, ordering, ? >>> >>> >>> >>> >>> >>> >>> Thanks, >>> >>> Hyung Kim >>> >>> >>> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Dec 7 05:40:59 2022 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 7 Dec 2022 06:40:59 -0500 Subject: [petsc-users] About Preconditioner and MUMPS In-Reply-To: References: Message-ID: On Wed, Dec 7, 2022 at 6:15 AM ??? wrote: > I think I don't understand the meaning of > -pc_type mpi > This option says to use the PCMPI preconditioner. This allows you to parallelize the solver in what is otherwise a serial code. > -mpi_pc_type lu > This tells the underlying solver in PCMPI to use the LU preconditioner. > What's the exact meaning of -pc_type mpi and -mpi_pc_type lu?? > Is this difference coming from 'mpi_linear_solver_server' option?? > Please use -ksp_view as I asked to look at the entire solver. Send it anytime you mail about solver questions. Thanks Matt > Thanks, > Hyung Kim > > 2022? 12? 7? (?) ?? 8:05, Matthew Knepley ?? ??: > >> On Wed, Dec 7, 2022 at 5:13 AM ??? wrote: >> >>> I want to use METIS for ordering. >>> I heard the MUMPS has good performance with METIS ordering. >>> >>> However there are some wonder things. >>> 1. With option "-mpi_linear_solver_server -ksp_type preonly -pc_type >>> mpi -mpi_pc_type lu " the MUMPS solving is slower than with option >>> "-mpi_linear_solver_server -pc_type mpi -ksp_type preonly". >>> Why does this result happen? >>> >> >> You are probably not using MUMPS. Always always always use -ksp_view to >> see exactly what solver you are using. >> >> >>> 2. (MPIRUN case (actually, mpi_linear_solver_server case))) In my >>> code, there is already has "PetscCall(PCSetType(pc,PCLU))" . However, to >>> use METIS by using "-mpi_mat_mumps_icntl_7 5" I must append this option >>> "-mpi_pc_type pu". 
>>> If I don't apply "-mpi_pc_type lu", the metis option >>> ("-mpi_mat_mumps_icntl_7 5"). Can I get some information about this? >>> >> >> Again, it seems like the solver configuration is not what you think it is. >> >> Thanks, >> >> Matt >> >> >>> Thanks, >>> Hyung Kim >>> >>> 2022? 12? 7? (?) ?? 12:24, Barry Smith ?? ??: >>> >>>> >>>> >>>> On Dec 6, 2022, at 5:15 AM, ??? wrote: >>>> >>>> Hello, >>>> >>>> >>>> I have some questions about pc and mumps_icntl. >>>> >>>> 1. What?s the difference between adopt preconditioner by code (for >>>> example, PetscCall(PCSetType(pc,PCLU)) and option -pc_type lu?? >>>> And also, What?s the priority between code pcsettype and option >>>> -pc_type ?? >>>> >>>> 2. When I tried to use METIS in MUMPS, I adopted metis by option >>>> (for example, -mat_mumps_icntl_7 5). In this situation, it is impossible to >>>> use metis without pc_type lu. However, in my case pc type lu makes the >>>> performance poor. So I don?t want to use lu preconditioner. How can I do >>>> this? >>>> >>>> The package MUMPS has an option to use metis in its ordering process >>>> which can be turned on as indicated while using MUMPS. Most >>>> preconditioners that PETSc can use do not use metis for any purpose hence >>>> there is no option to turn on its use. For what purpose do you wish to use >>>> metis? Partitioning, ordering, ? >>>> >>>> >>>> >>>> >>>> >>>> >>>> Thanks, >>>> >>>> Hyung Kim >>>> >>>> >>>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From narnoldm at umich.edu Wed Dec 7 05:51:20 2022 From: narnoldm at umich.edu (Nicholas Arnold-Medabalimi) Date: Wed, 7 Dec 2022 06:51:20 -0500 Subject: [petsc-users] Petsc Section in DMPlex In-Reply-To: References: Message-ID: Hi Thank you so much for your patience. One thing to note: I don't have any need to go back from the filtered distributed mapping back to the full but it is good to know. One aside question. 1) Is natural and global ordering the same in this context? As far as implementing what you have described. When I call ISView on the generated SubpointIS, I get an unusual error which I'm not sure how to interpret. (this case is running on 2 ranks and the filter label has points located on both ranks of the original DM. However, if I manually get the indices (the commented lines), it seems to not have any issues. call DMPlexFilter(dmplex_full, iBlankLabel, 1, dmplex_filtered,ierr) call DMPlexGetSubpointIS(dmplex_filtered, subpointsIS,ierr) !call ISGetIndicesF90(subpointsIS, subPointKey,ierr) !write(*,*) subPointKey !call ISRestoreIndicesF90(subpointsIS, subPointKey,ierr) call ISView(subpointsIS,PETSC_VIEWER_STDOUT_WORLD,ierr) [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Arguments must have same communicators [1]PETSC ERROR: Different communicators in the two objects: Argument # 1 and 2 flag 3 [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
[1]PETSC ERROR: Petsc Development GIT revision: v3.18.1-320-g7810d690132 GIT Date: 2022-11-20 20:25:41 -0600 [1]PETSC ERROR: Configure options with-fc=mpiifort with-mpi-f90=mpiifort --download-triangle --download-parmetis --download-metis --with-debugging=1 --download-hdf5 --prefix=/home/narnoldm/packages/petsc_install [1]PETSC ERROR: #1 ISView() at /home/narnoldm/packages/petsc/src/vec/is/is/interface/index.c:1629 As far as the overall process you have described my question on first glance is do I have to allocate/create the vector that is output by VecISCopy before calling it, or does it create the vector automatically? I think I would need to create it first using a section and Setting the Vec in the filtered DM? And I presume in this case I would be using the scatter reverse option to go from the full set to the reduced set? Sincerely Nicholas Sincerely Nick On Wed, Dec 7, 2022 at 6:00 AM Matthew Knepley wrote: > On Wed, Dec 7, 2022 at 3:35 AM Nicholas Arnold-Medabalimi < > narnoldm at umich.edu> wrote: > >> Hi Matthew >> >> Thank you for the help. This clarified a great deal. >> >> I have a follow-up question related to DMPlexFilter. It may be better to >> describe what I'm trying to achieve. >> >> I have a general mesh I am solving which has a section with cell center >> finite volume states, as described in my initial email. After calculating >> some metrics, I tag a bunch of cells with an identifying Label and use >> DMFilter to generate a new DM which is only that subset of cells. >> Generally, this leads to a pretty unbalanced DM so I then plan to use >> DMPlexDIstribute to balance that DM across the processors. The coordinates >> pass along fine, but the state(or I should say Section) does not at least >> as far as I can tell. >> >> Assuming I can get a filtered DM I then distribute the DM and state using >> the method you described above and it seems to be working ok. >> >> The last connection I have to make is the transfer of information from >> the full mesh to the "sampled" filtered mesh. From what I can gather I >> would need to get the mapping of points using DMPlexGetSubpointIS and then >> manually copy the values from the full DM section to the filtered DM? I >> have the process from full->filtered->distributed all working for the >> coordinates so its just a matter of transferring the section correctly. >> >> I appreciate all the help you have provided. >> > > Let's do this in two steps, which makes it easier to debug. First, do not > redistribute the submesh. Just use DMPlexGetSubpointIS() > to get the mapping of filtered points to points in the original mesh. Then > create an expanded IS using the Section which makes > dofs in the filtered mesh to dofs in the original mesh. From this use > > https://petsc.org/main/docs/manualpages/Vec/VecISCopy/ > > to move values between the original vector and the filtered vector. > > Once that works, you can try redistributing the filtered mesh. Before > calling DMPlexDistribute() on the filtered mesh, you need to call > > https://petsc.org/main/docs/manualpages/DM/DMSetUseNatural/ > > When you redistribute, it will compute a mapping back to the original > layout. Now when you want to transfer values, you > > 1) Create a natural vector with DMCreateNaturalVec() > > 2) Use DMGlobalToNaturalBegin/End() to move values from the filtered > vector to the natural vector > > 3) Use VecISCopy() to move values from the natural vector to the > original vector > > Let me know if you have any problems. 
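For the first (non-redistributed) step listed above, a sketch of the expanded IS and the VecISCopy() call could look like the following in C. The names dmFull, dmFiltered, stateFull, stateFiltered and nvar are assumed for illustration, and it is assumed that all filtered cells are locally owned (which DMPlexFilter should give you); this is a sketch, not tested code:

IS              subpointIS, dofIS;
const PetscInt *subpoint;
PetscSection    gsec;            /* global section of the full DM */
PetscInt        cStart, cEnd, c, d, dof, off, n = 0, *idx;

PetscCall(DMPlexGetSubpointIS(dmFiltered, &subpointIS));
PetscCall(ISGetIndices(subpointIS, &subpoint));
PetscCall(DMGetGlobalSection(dmFull, &gsec));
PetscCall(DMPlexGetHeightStratum(dmFiltered, 0, &cStart, &cEnd));
PetscCall(PetscMalloc1((cEnd - cStart) * nvar, &idx));
for (c = cStart; c < cEnd; ++c) {
  /* subpoint[c] is the point in the full mesh behind filtered cell c */
  PetscCall(PetscSectionGetDof(gsec, subpoint[c], &dof));
  PetscCall(PetscSectionGetOffset(gsec, subpoint[c], &off));
  for (d = 0; d < dof; ++d) idx[n++] = off + d; /* global dof indices in stateFull */
}
PetscCall(ISRestoreIndices(subpointIS, &subpoint));
PetscCall(ISCreateGeneral(PETSC_COMM_SELF, n, idx, PETSC_OWN_POINTER, &dofIS));
/* SCATTER_REVERSE copies from the full vector into the reduced (filtered) one */
PetscCall(VecISCopy(stateFull, dofIS, SCATTER_REVERSE, stateFiltered));
PetscCall(ISDestroy(&dofIS));

The entries of dofIS have to appear in the same order as the local dofs of stateFiltered, which should hold here because the filtered section only has cell unknowns laid out in ascending cell order.
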
> > Thanks, > > Matt > > >> Sincerely >> Nicholas >> >> >> >> On Mon, Nov 28, 2022 at 6:19 AM Matthew Knepley >> wrote: >> >>> On Sun, Nov 27, 2022 at 10:22 PM Nicholas Arnold-Medabalimi < >>> narnoldm at umich.edu> wrote: >>> >>>> Hi Petsc Users >>>> >>>> I have a question about properly using PetscSection to assign state >>>> variables to a DM. I have an existing DMPlex mesh distributed on 2 >>>> processors. My goal is to have state variables set to the cell centers. I >>>> then want to call DMPlexDistribute, which I hope will balance the mesh >>>> elements and hopefully transport the state variables to the hosting >>>> processors as the cells are distributed to a different processor count or >>>> simply just redistributing after doing mesh adaption. >>>> >>>> Looking at the DMPlex User guide, I should be able to achieve this with >>>> a single field section using SetDof and assigning the DOF to the points >>>> corresponding to cells. >>>> >>> >>> Note that if you want several different fields, you can clone the DM >>> first for this field >>> >>> call DMClone(dm,dmState,ierr) >>> >>> and use dmState in your calls below. >>> >>> >>>> >>>> call DMPlexGetHeightStratum(dm,0,c0,c1,ierr) >>>> call DMPlexGetChart(dm,p0,p1,ierr) >>>> call PetscSectionCreate(PETSC_COMM_WORLD,section,ierr) >>>> call PetscSectionSetNumFields(section,1,ierr) call >>>> PetscSectionSetChart(section,p0,p1,ierr) >>>> do i = c0, (c1-1) >>>> call PetscSectionSetDof(section,i,nvar,ierr) >>>> end do >>>> call PetscSectionSetup(section,ierr) >>>> call DMSetLocalSection(dm,section,ierr) >>>> >>> >>> In the loop, I would add a call to >>> >>> call PetscSectionSetFieldDof(section,i,0,nvar,ierr) >>> >>> This also puts in the field breakdown. It is not essential, but nicer. >>> >>> >>>> From here, it looks like I can access and set the state vars using >>>> >>>> call DMGetGlobalVector(dmplex,state,ierr) >>>> call DMGetGlobalSection(dmplex,section,ierr) >>>> call VecGetArrayF90(state,stateVec,ierr) >>>> do i = c0, (c1-1) >>>> call PetscSectionGetOffset(section,i,offset,ierr) >>>> stateVec(offset:(offset+nvar))=state_i(:) !simplified assignment >>>> end do >>>> call VecRestoreArrayF90(state,stateVec,ierr) >>>> call DMRestoreGlobalVector(dmplex,state,ierr) >>>> >>>> To my understanding, I should be using Global vector since this is a >>>> pure assignment operation and I don't need the ghost cells. >>>> >>> >>> Yes. >>> >>> But the behavior I am seeing isn't exactly what I'd expect. >>>> >>>> To be honest, I'm somewhat unclear on a few things >>>> >>>> 1) Should be using nvar fields with 1 DOF each or 1 field with nvar >>>> DOFs or what the distinction between the two methods are? >>>> >>> >>> We have two divisions in a Section. A field can have a number of >>> components. This is intended to model a vector or tensor field. >>> Then a Section can have a number of fields, such as velocity and >>> pressure for a Stokes problem. The division is mainly to help the >>> user, so I would use the most natural one. >>> >>> >>>> 2) Adding a print statement after the offset assignment I get (on rank >>>> 0 of 2) >>>> cell 1 offset 0 >>>> cell 2 offset 18 >>>> cell 3 offset 36 >>>> which is expected and works but on rank 1 I get >>>> cell 1 offset 9000 >>>> cell 2 offset 9018 >>>> cell 3 offset 9036 >>>> >>>> which isn't exactly what I would expect. Shouldn't the offsets reset at >>>> 0 for the next rank? >>>> >>> >>> The local and global sections hold different information. This is the >>> source of the confusion. 
The local section does describe a local >>> vector, and thus includes overlap or "ghost" dofs. The global section >>> describes a global vector. However, it is intended to deliver >>> global indices, and thus the offsets give back global indices. When you >>> use VecGetArray*() you are getting out the local array, and >>> thus you have to subtract the first index on this process. You can get >>> that from >>> >>> VecGetOwnershipRange(v, &rstart, &rEnd); >>> >>> This is the same whether you are using DMDA or DMPlex or any other DM. >>> >>> >>>> 3) Does calling DMPlexDistribute also distribute the section data >>>> associated with the DOF, based on the description in DMPlexDistribute it >>>> looks like it should? >>>> >>> >>> No. By default, DMPlexDistribute() only distributes coordinate data. I >>> you want to distribute your field, it would look something like this: >>> >>> DMPlexDistribute(dm, 0, &sfDist, &dmDist); >>> VecCreate(comm, &stateDist); >>> VecSetDM(sateDist, dmDist); >>> PetscSectionCreate(comm §ionDist); >>> DMSetLocalSection(dmDist, sectionDist); >>> DMPlexDistributeField(dmDist, sfDist, section, state, sectionDist, >>> stateDist); >>> >>> We do this in src/dm/impls/plex/tests/ex36.c >>> >>> THanks, >>> >>> Matt >>> >>> I'd appreciate any insight into the specifics of this usage. I expect I >>>> have a misconception on the local vs global section. Thank you. >>>> >>>> Sincerely >>>> Nicholas >>>> >>>> -- >>>> Nicholas Arnold-Medabalimi >>>> >>>> Ph.D. Candidate >>>> Computational Aeroscience Lab >>>> University of Michigan >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> >> >> -- >> Nicholas Arnold-Medabalimi >> >> Ph.D. Candidate >> Computational Aeroscience Lab >> University of Michigan >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- Nicholas Arnold-Medabalimi Ph.D. Candidate Computational Aeroscience Lab University of Michigan -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Dec 7 06:05:39 2022 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 7 Dec 2022 07:05:39 -0500 Subject: [petsc-users] Petsc Section in DMPlex In-Reply-To: References: Message-ID: On Wed, Dec 7, 2022 at 6:51 AM Nicholas Arnold-Medabalimi < narnoldm at umich.edu> wrote: > Hi > > Thank you so much for your patience. One thing to note: I don't have any > need to go back from the filtered distributed mapping back to the full but > it is good to know. > > One aside question. > 1) Is natural and global ordering the same in this context? > No. > As far as implementing what you have described. > > When I call ISView on the generated SubpointIS, I get an unusual error > which I'm not sure how to interpret. (this case is running on 2 ranks and > the filter label has points located on both ranks of the original DM. > However, if I manually get the indices (the commented lines), it seems to > not have any issues. 
> call DMPlexFilter(dmplex_full, iBlankLabel, 1, dmplex_filtered,ierr) > call DMPlexGetSubpointIS(dmplex_filtered, subpointsIS,ierr) > !call ISGetIndicesF90(subpointsIS, subPointKey,ierr) > !write(*,*) subPointKey > !call ISRestoreIndicesF90(subpointsIS, subPointKey,ierr) > call ISView(subpointsIS,PETSC_VIEWER_STDOUT_WORLD,ierr) > > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [1]PETSC ERROR: Arguments must have same communicators > [1]PETSC ERROR: Different communicators in the two objects: Argument # 1 > and 2 flag 3 > [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [1]PETSC ERROR: Petsc Development GIT revision: v3.18.1-320-g7810d690132 > GIT Date: 2022-11-20 20:25:41 -0600 > [1]PETSC ERROR: Configure options with-fc=mpiifort with-mpi-f90=mpiifort > --download-triangle --download-parmetis --download-metis --with-debugging=1 > --download-hdf5 --prefix=/home/narnoldm/packages/petsc_install > [1]PETSC ERROR: #1 ISView() at > /home/narnoldm/packages/petsc/src/vec/is/is/interface/index.c:1629 > The problem here is the subpointsIS is a _serial_ object, and you are using a parallel viewer. You can use PETSC_VIEWER_STDOUT_SELF, or you can pull out the singleton viewer from STDOUT_WORLD if you want them all to print in order. > As far as the overall process you have described my question on first > glance is do I have to allocate/create the vector that is output by > VecISCopy before calling it, or does it create the vector automatically? > You create both vectors. I would do it using DMCreateGlobalVector() from both DMs. > I think I would need to create it first using a section and Setting the > Vec in the filtered DM? > Setting the Section in the filtered DM. > And I presume in this case I would be using the scatter reverse option to > go from the full set to the reduced set? > Yes Thanks Matt > Sincerely > Nicholas > > > > > > > Sincerely > Nick > > On Wed, Dec 7, 2022 at 6:00 AM Matthew Knepley wrote: > >> On Wed, Dec 7, 2022 at 3:35 AM Nicholas Arnold-Medabalimi < >> narnoldm at umich.edu> wrote: >> >>> Hi Matthew >>> >>> Thank you for the help. This clarified a great deal. >>> >>> I have a follow-up question related to DMPlexFilter. It may be better to >>> describe what I'm trying to achieve. >>> >>> I have a general mesh I am solving which has a section with cell center >>> finite volume states, as described in my initial email. After calculating >>> some metrics, I tag a bunch of cells with an identifying Label and use >>> DMFilter to generate a new DM which is only that subset of cells. >>> Generally, this leads to a pretty unbalanced DM so I then plan to use >>> DMPlexDIstribute to balance that DM across the processors. The coordinates >>> pass along fine, but the state(or I should say Section) does not at least >>> as far as I can tell. >>> >>> Assuming I can get a filtered DM I then distribute the DM and state >>> using the method you described above and it seems to be working ok. >>> >>> The last connection I have to make is the transfer of information from >>> the full mesh to the "sampled" filtered mesh. From what I can gather I >>> would need to get the mapping of points using DMPlexGetSubpointIS and then >>> manually copy the values from the full DM section to the filtered DM? I >>> have the process from full->filtered->distributed all working for the >>> coordinates so its just a matter of transferring the section correctly. >>> >>> I appreciate all the help you have provided. 
>>> >> >> Let's do this in two steps, which makes it easier to debug. First, do not >> redistribute the submesh. Just use DMPlexGetSubpointIS() >> to get the mapping of filtered points to points in the original mesh. >> Then create an expanded IS using the Section which makes >> dofs in the filtered mesh to dofs in the original mesh. From this use >> >> https://petsc.org/main/docs/manualpages/Vec/VecISCopy/ >> >> to move values between the original vector and the filtered vector. >> >> Once that works, you can try redistributing the filtered mesh. Before >> calling DMPlexDistribute() on the filtered mesh, you need to call >> >> https://petsc.org/main/docs/manualpages/DM/DMSetUseNatural/ >> >> When you redistribute, it will compute a mapping back to the original >> layout. Now when you want to transfer values, you >> >> 1) Create a natural vector with DMCreateNaturalVec() >> >> 2) Use DMGlobalToNaturalBegin/End() to move values from the filtered >> vector to the natural vector >> >> 3) Use VecISCopy() to move values from the natural vector to the >> original vector >> >> Let me know if you have any problems. >> >> Thanks, >> >> Matt >> >> >>> Sincerely >>> Nicholas >>> >>> >>> >>> On Mon, Nov 28, 2022 at 6:19 AM Matthew Knepley >>> wrote: >>> >>>> On Sun, Nov 27, 2022 at 10:22 PM Nicholas Arnold-Medabalimi < >>>> narnoldm at umich.edu> wrote: >>>> >>>>> Hi Petsc Users >>>>> >>>>> I have a question about properly using PetscSection to assign state >>>>> variables to a DM. I have an existing DMPlex mesh distributed on 2 >>>>> processors. My goal is to have state variables set to the cell centers. I >>>>> then want to call DMPlexDistribute, which I hope will balance the mesh >>>>> elements and hopefully transport the state variables to the hosting >>>>> processors as the cells are distributed to a different processor count or >>>>> simply just redistributing after doing mesh adaption. >>>>> >>>>> Looking at the DMPlex User guide, I should be able to achieve this >>>>> with a single field section using SetDof and assigning the DOF to the >>>>> points corresponding to cells. >>>>> >>>> >>>> Note that if you want several different fields, you can clone the DM >>>> first for this field >>>> >>>> call DMClone(dm,dmState,ierr) >>>> >>>> and use dmState in your calls below. >>>> >>>> >>>>> >>>>> call DMPlexGetHeightStratum(dm,0,c0,c1,ierr) >>>>> call DMPlexGetChart(dm,p0,p1,ierr) >>>>> call PetscSectionCreate(PETSC_COMM_WORLD,section,ierr) >>>>> call PetscSectionSetNumFields(section,1,ierr) call >>>>> PetscSectionSetChart(section,p0,p1,ierr) >>>>> do i = c0, (c1-1) >>>>> call PetscSectionSetDof(section,i,nvar,ierr) >>>>> end do >>>>> call PetscSectionSetup(section,ierr) >>>>> call DMSetLocalSection(dm,section,ierr) >>>>> >>>> >>>> In the loop, I would add a call to >>>> >>>> call PetscSectionSetFieldDof(section,i,0,nvar,ierr) >>>> >>>> This also puts in the field breakdown. It is not essential, but nicer. 
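Put together, the cell-only section set-up being discussed here looks roughly like this in C, with the suggested per-field call included (dm and nvar assumed, mirroring the Fortran in the question):

PetscSection s;
PetscInt     pStart, pEnd, cStart, cEnd, c;

PetscCall(DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd));
PetscCall(DMPlexGetChart(dm, &pStart, &pEnd));
PetscCall(PetscSectionCreate(PetscObjectComm((PetscObject)dm), &s));
PetscCall(PetscSectionSetNumFields(s, 1));
PetscCall(PetscSectionSetFieldComponents(s, 0, nvar)); /* nvar components in field 0 */
PetscCall(PetscSectionSetChart(s, pStart, pEnd));
for (c = cStart; c < cEnd; ++c) {
  PetscCall(PetscSectionSetDof(s, c, nvar));
  PetscCall(PetscSectionSetFieldDof(s, c, 0, nvar)); /* the extra per-field bookkeeping */
}
PetscCall(PetscSectionSetUp(s));
PetscCall(DMSetLocalSection(dm, s));
PetscCall(PetscSectionDestroy(&s));

DMSetLocalSection() keeps its own reference, so the local handle can be destroyed right away.
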
>>>> >>>> >>>>> From here, it looks like I can access and set the state vars using >>>>> >>>>> call DMGetGlobalVector(dmplex,state,ierr) >>>>> call DMGetGlobalSection(dmplex,section,ierr) >>>>> call VecGetArrayF90(state,stateVec,ierr) >>>>> do i = c0, (c1-1) >>>>> call PetscSectionGetOffset(section,i,offset,ierr) >>>>> stateVec(offset:(offset+nvar))=state_i(:) !simplified assignment >>>>> end do >>>>> call VecRestoreArrayF90(state,stateVec,ierr) >>>>> call DMRestoreGlobalVector(dmplex,state,ierr) >>>>> >>>>> To my understanding, I should be using Global vector since this is a >>>>> pure assignment operation and I don't need the ghost cells. >>>>> >>>> >>>> Yes. >>>> >>>> But the behavior I am seeing isn't exactly what I'd expect. >>>>> >>>>> To be honest, I'm somewhat unclear on a few things >>>>> >>>>> 1) Should be using nvar fields with 1 DOF each or 1 field with nvar >>>>> DOFs or what the distinction between the two methods are? >>>>> >>>> >>>> We have two divisions in a Section. A field can have a number of >>>> components. This is intended to model a vector or tensor field. >>>> Then a Section can have a number of fields, such as velocity and >>>> pressure for a Stokes problem. The division is mainly to help the >>>> user, so I would use the most natural one. >>>> >>>> >>>>> 2) Adding a print statement after the offset assignment I get (on rank >>>>> 0 of 2) >>>>> cell 1 offset 0 >>>>> cell 2 offset 18 >>>>> cell 3 offset 36 >>>>> which is expected and works but on rank 1 I get >>>>> cell 1 offset 9000 >>>>> cell 2 offset 9018 >>>>> cell 3 offset 9036 >>>>> >>>>> which isn't exactly what I would expect. Shouldn't the offsets reset >>>>> at 0 for the next rank? >>>>> >>>> >>>> The local and global sections hold different information. This is the >>>> source of the confusion. The local section does describe a local >>>> vector, and thus includes overlap or "ghost" dofs. The global section >>>> describes a global vector. However, it is intended to deliver >>>> global indices, and thus the offsets give back global indices. When you >>>> use VecGetArray*() you are getting out the local array, and >>>> thus you have to subtract the first index on this process. You can get >>>> that from >>>> >>>> VecGetOwnershipRange(v, &rstart, &rEnd); >>>> >>>> This is the same whether you are using DMDA or DMPlex or any other DM. >>>> >>>> >>>>> 3) Does calling DMPlexDistribute also distribute the section data >>>>> associated with the DOF, based on the description in DMPlexDistribute it >>>>> looks like it should? >>>>> >>>> >>>> No. By default, DMPlexDistribute() only distributes coordinate data. I >>>> you want to distribute your field, it would look something like this: >>>> >>>> DMPlexDistribute(dm, 0, &sfDist, &dmDist); >>>> VecCreate(comm, &stateDist); >>>> VecSetDM(sateDist, dmDist); >>>> PetscSectionCreate(comm §ionDist); >>>> DMSetLocalSection(dmDist, sectionDist); >>>> DMPlexDistributeField(dmDist, sfDist, section, state, sectionDist, >>>> stateDist); >>>> >>>> We do this in src/dm/impls/plex/tests/ex36.c >>>> >>>> THanks, >>>> >>>> Matt >>>> >>>> I'd appreciate any insight into the specifics of this usage. I expect I >>>>> have a misconception on the local vs global section. Thank you. >>>>> >>>>> Sincerely >>>>> Nicholas >>>>> >>>>> -- >>>>> Nicholas Arnold-Medabalimi >>>>> >>>>> Ph.D. 
Candidate >>>>> Computational Aeroscience Lab >>>>> University of Michigan >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >>> >>> -- >>> Nicholas Arnold-Medabalimi >>> >>> Ph.D. Candidate >>> Computational Aeroscience Lab >>> University of Michigan >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > > > -- > Nicholas Arnold-Medabalimi > > Ph.D. Candidate > Computational Aeroscience Lab > University of Michigan > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksi2443 at gmail.com Wed Dec 7 07:15:26 2022 From: ksi2443 at gmail.com (=?UTF-8?B?6rmA7ISx7J21?=) Date: Wed, 7 Dec 2022 22:15:26 +0900 Subject: [petsc-users] About Preconditioner and MUMPS In-Reply-To: References: Message-ID: Following your comments, I used below command mpirun -np 4 ./app -ksp_type preonly -pc_type mpi -mpi_linear_solver_server -mpi_pc_type lu -mpi_pc_factor_mat_solver_type mumps -mpi_mat_mumps_icntl_7 5 -mpi_ksp_view so the output is as below KSP Object: (mpi_) 1 MPI process type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (mpi_) 1 MPI process type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: external factor fill ratio given 0., needed 0. Factored matrix follows: Mat Object: (mpi_) 1 MPI process type: mumps rows=192, cols=192 package used to perform factorization: mumps total: nonzeros=17334, allocated nonzeros=17334 MUMPS run parameters: Use -mpi_ksp_view ::ascii_info_detail to display information for all processes RINFOG(1) (global estimated flops for the elimination after analysis): 949441. RINFOG(2) (global estimated flops for the assembly after factorization): 18774. RINFOG(3) (global estimated flops for the elimination after factorization): 949441. 
(RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0.,0.)*(2^0) INFOG(3) (estimated real workspace for factors on all processors after analysis): 17334 INFOG(4) (estimated integer workspace for factors on all processors after analysis): 1724 INFOG(5) (estimated maximum front size in the complete tree): 96 INFOG(6) (number of nodes in the complete tree): 16 INFOG(7) (ordering option effectively used after analysis): 5 INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100 INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 17334 INFOG(10) (total integer space store the matrix factors after factorization): 1724 INFOG(11) (order of largest frontal matrix after factorization): 96 INFOG(12) (number of off-diagonal pivots): 0 INFOG(13) (number of delayed pivots after factorization): 0 INFOG(14) (number of memory compress after factorization): 0 INFOG(15) (number of steps of iterative refinement after solution): 0 INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 1 INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 1 INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 1 INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 1 INFOG(20) (estimated number of entries in the factors): 17334 INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 1 INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 1 INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0 INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1 INFOG(25) (after factorization: number of pivots modified by static pivoting): 0 INFOG(28) (after factorization: number of null pivots encountered): 0 INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 17334 INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 0, 0 INFOG(32) (after analysis: type of analysis done): 1 INFOG(33) (value used for ICNTL(8)): 7 INFOG(34) (exponent of the determinant if determinant is requested): 0 INFOG(35) (after factorization: number of entries taking into account BLR factor compression - sum over all processors): 17334 INFOG(36) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - value on the most memory consuming processor): 0 INFOG(37) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - sum over all processors): 0 INFOG(38) (after analysis: estimated size of all MUMPS internal data for running BLR out-of-core - value on the most memory consuming processor): 0 INFOG(39) (after analysis: estimated size of all MUMPS internal data for running BLR out-of-core - sum over all processors): 0 linear system matrix = precond matrix: Mat Object: 1 MPI process type: seqaij rows=192, cols=192 total: nonzeros=9000, allocated nonzeros=36864 total number of mallocs used during MatSetValues calls=0 using I-node routines: found 64 nodes, limit used is 5 Is it correct that I successfully computed with mumps by using metis(icntl_7 5)? Thanks, Hyung Kim 2022? 12? 7? (?) ?? 8:41, Matthew Knepley ?? ??: > On Wed, Dec 7, 2022 at 6:15 AM ??? 
wrote: > >> I think I don't understand the meaning of >> -pc_type mpi >> > > This option says to use the PCMPI preconditioner. This allows you to > parallelize the > solver in what is otherwise a serial code. > > >> -mpi_pc_type lu >> > > This tells the underlying solver in PCMPI to use the LU preconditioner. > > >> What's the exact meaning of -pc_type mpi and -mpi_pc_type lu?? >> Is this difference coming from 'mpi_linear_solver_server' option?? >> > > Please use -ksp_view as I asked to look at the entire solver. Send it > anytime you mail about solver questions. > > Thanks > > Matt > > >> Thanks, >> Hyung Kim >> >> 2022? 12? 7? (?) ?? 8:05, Matthew Knepley ?? ??: >> >>> On Wed, Dec 7, 2022 at 5:13 AM ??? wrote: >>> >>>> I want to use METIS for ordering. >>>> I heard the MUMPS has good performance with METIS ordering. >>>> >>>> However there are some wonder things. >>>> 1. With option "-mpi_linear_solver_server -ksp_type preonly -pc_type >>>> mpi -mpi_pc_type lu " the MUMPS solving is slower than with option >>>> "-mpi_linear_solver_server -pc_type mpi -ksp_type preonly". >>>> Why does this result happen? >>>> >>> >>> You are probably not using MUMPS. Always always always use -ksp_view to >>> see exactly what solver you are using. >>> >>> >>>> 2. (MPIRUN case (actually, mpi_linear_solver_server case))) In my >>>> code, there is already has "PetscCall(PCSetType(pc,PCLU))" . However, to >>>> use METIS by using "-mpi_mat_mumps_icntl_7 5" I must append this option >>>> "-mpi_pc_type pu". >>>> If I don't apply "-mpi_pc_type lu", the metis option >>>> ("-mpi_mat_mumps_icntl_7 5"). Can I get some information about this? >>>> >>> >>> Again, it seems like the solver configuration is not what you think it >>> is. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks, >>>> Hyung Kim >>>> >>>> 2022? 12? 7? (?) ?? 12:24, Barry Smith ?? ??: >>>> >>>>> >>>>> >>>>> On Dec 6, 2022, at 5:15 AM, ??? wrote: >>>>> >>>>> Hello, >>>>> >>>>> >>>>> I have some questions about pc and mumps_icntl. >>>>> >>>>> 1. What?s the difference between adopt preconditioner by code >>>>> (for example, PetscCall(PCSetType(pc,PCLU)) and option -pc_type lu?? >>>>> And also, What?s the priority between code pcsettype and option >>>>> -pc_type ?? >>>>> >>>>> 2. When I tried to use METIS in MUMPS, I adopted metis by option >>>>> (for example, -mat_mumps_icntl_7 5). In this situation, it is impossible to >>>>> use metis without pc_type lu. However, in my case pc type lu makes the >>>>> performance poor. So I don?t want to use lu preconditioner. How can I do >>>>> this? >>>>> >>>>> The package MUMPS has an option to use metis in its ordering >>>>> process which can be turned on as indicated while using MUMPS. Most >>>>> preconditioners that PETSc can use do not use metis for any purpose hence >>>>> there is no option to turn on its use. For what purpose do you wish to use >>>>> metis? Partitioning, ordering, ? >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Hyung Kim >>>>> >>>>> >>>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Dec 7 07:38:36 2022 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 7 Dec 2022 08:38:36 -0500 Subject: [petsc-users] About Preconditioner and MUMPS In-Reply-To: References: Message-ID: On Wed, Dec 7, 2022 at 8:15 AM ??? wrote: > Following your comments, > I used below command > mpirun -np 4 ./app -ksp_type preonly -pc_type mpi > -mpi_linear_solver_server -mpi_pc_type lu -mpi_pc_factor_mat_solver_type > mumps -mpi_mat_mumps_icntl_7 5 -mpi_ksp_view > > > so the output is as below > > KSP Object: (mpi_) 1 MPI process > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: (mpi_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: external > factor fill ratio given 0., needed 0. > Factored matrix follows: > Mat Object: (mpi_) 1 MPI process > type: mumps > rows=192, cols=192 > package used to perform factorization: mumps > total: nonzeros=17334, allocated nonzeros=17334 > MUMPS run parameters: > Use -mpi_ksp_view ::ascii_info_detail to display information > for all processes > RINFOG(1) (global estimated flops for the elimination after > analysis): 949441. > RINFOG(2) (global estimated flops for the assembly after > factorization): 18774. > RINFOG(3) (global estimated flops for the elimination after > factorization): 949441. 
> (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): > (0.,0.)*(2^0) > INFOG(3) (estimated real workspace for factors on all > processors after analysis): 17334 > INFOG(4) (estimated integer workspace for factors on all > processors after analysis): 1724 > INFOG(5) (estimated maximum front size in the complete > tree): 96 > INFOG(6) (number of nodes in the complete tree): 16 > INFOG(7) (ordering option effectively used after analysis): 5 > INFOG(8) (structural symmetry in percent of the permuted > matrix after analysis): 100 > INFOG(9) (total real/complex workspace to store the matrix > factors after factorization): 17334 > INFOG(10) (total integer space store the matrix factors > after factorization): 1724 > INFOG(11) (order of largest frontal matrix after > factorization): 96 > INFOG(12) (number of off-diagonal pivots): 0 > INFOG(13) (number of delayed pivots after factorization): 0 > INFOG(14) (number of memory compress after factorization): 0 > INFOG(15) (number of steps of iterative refinement after > solution): 0 > INFOG(16) (estimated size (in MB) of all MUMPS internal data > for factorization after analysis: value on the most memory consuming > processor): 1 > INFOG(17) (estimated size of all MUMPS internal data for > factorization after analysis: sum over all processors): 1 > INFOG(18) (size of all MUMPS internal data allocated during > factorization: value on the most memory consuming processor): 1 > INFOG(19) (size of all MUMPS internal data allocated during > factorization: sum over all processors): 1 > INFOG(20) (estimated number of entries in the factors): 17334 > INFOG(21) (size in MB of memory effectively used during > factorization - value on the most memory consuming processor): 1 > INFOG(22) (size in MB of memory effectively used during > factorization - sum over all processors): 1 > INFOG(23) (after analysis: value of ICNTL(6) effectively > used): 0 > INFOG(24) (after analysis: value of ICNTL(12) effectively > used): 1 > INFOG(25) (after factorization: number of pivots modified by > static pivoting): 0 > INFOG(28) (after factorization: number of null pivots > encountered): 0 > INFOG(29) (after factorization: effective number of entries > in the factors (sum over all processors)): 17334 > INFOG(30, 31) (after solution: size in Mbytes of memory used > during solution phase): 0, 0 > INFOG(32) (after analysis: type of analysis done): 1 > INFOG(33) (value used for ICNTL(8)): 7 > INFOG(34) (exponent of the determinant if determinant is > requested): 0 > INFOG(35) (after factorization: number of entries taking > into account BLR factor compression - sum over all processors): 17334 > INFOG(36) (after analysis: estimated size of all MUMPS > internal data for running BLR in-core - value on the most memory consuming > processor): 0 > INFOG(37) (after analysis: estimated size of all MUMPS > internal data for running BLR in-core - sum over all processors): 0 > INFOG(38) (after analysis: estimated size of all MUMPS > internal data for running BLR out-of-core - value on the most memory > consuming processor): 0 > INFOG(39) (after analysis: estimated size of all MUMPS > internal data for running BLR out-of-core - sum over all processors): 0 > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaij > rows=192, cols=192 > total: nonzeros=9000, allocated nonzeros=36864 > total number of mallocs used during MatSetValues calls=0 > using I-node routines: found 64 nodes, limit used is 5 > > Is it correct that I successfully computed with mumps by using > 
metis(icntl_7 5)? > Yes, first you know it is MUMPS Mat Object: (mpi_) 1 MPI process type: mumps Second you know METIS was used INFOG(7) (ordering option effectively used after analysis): 5 I would not expect much speedup on 2 processes with such a small matrix rows=192, cols=192 Thanks Matt > Thanks, > Hyung Kim > > 2022? 12? 7? (?) ?? 8:41, Matthew Knepley ?? ??: > >> On Wed, Dec 7, 2022 at 6:15 AM ??? wrote: >> >>> I think I don't understand the meaning of >>> -pc_type mpi >>> >> >> This option says to use the PCMPI preconditioner. This allows you to >> parallelize the >> solver in what is otherwise a serial code. >> >> >>> -mpi_pc_type lu >>> >> >> This tells the underlying solver in PCMPI to use the LU preconditioner. >> >> >>> What's the exact meaning of -pc_type mpi and -mpi_pc_type lu?? >>> Is this difference coming from 'mpi_linear_solver_server' option?? >>> >> >> Please use -ksp_view as I asked to look at the entire solver. Send it >> anytime you mail about solver questions. >> >> Thanks >> >> Matt >> >> >>> Thanks, >>> Hyung Kim >>> >>> 2022? 12? 7? (?) ?? 8:05, Matthew Knepley ?? ??: >>> >>>> On Wed, Dec 7, 2022 at 5:13 AM ??? wrote: >>>> >>>>> I want to use METIS for ordering. >>>>> I heard the MUMPS has good performance with METIS ordering. >>>>> >>>>> However there are some wonder things. >>>>> 1. With option "-mpi_linear_solver_server -ksp_type preonly -pc_type >>>>> mpi -mpi_pc_type lu " the MUMPS solving is slower than with option >>>>> "-mpi_linear_solver_server -pc_type mpi -ksp_type preonly". >>>>> Why does this result happen? >>>>> >>>> >>>> You are probably not using MUMPS. Always always always use -ksp_view to >>>> see exactly what solver you are using. >>>> >>>> >>>>> 2. (MPIRUN case (actually, mpi_linear_solver_server case))) In my >>>>> code, there is already has "PetscCall(PCSetType(pc,PCLU))" . However, to >>>>> use METIS by using "-mpi_mat_mumps_icntl_7 5" I must append this option >>>>> "-mpi_pc_type pu". >>>>> If I don't apply "-mpi_pc_type lu", the metis option >>>>> ("-mpi_mat_mumps_icntl_7 5"). Can I get some information about this? >>>>> >>>> >>>> Again, it seems like the solver configuration is not what you think it >>>> is. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thanks, >>>>> Hyung Kim >>>>> >>>>> 2022? 12? 7? (?) ?? 12:24, Barry Smith ?? ??: >>>>> >>>>>> >>>>>> >>>>>> On Dec 6, 2022, at 5:15 AM, ??? wrote: >>>>>> >>>>>> Hello, >>>>>> >>>>>> >>>>>> I have some questions about pc and mumps_icntl. >>>>>> >>>>>> 1. What?s the difference between adopt preconditioner by code >>>>>> (for example, PetscCall(PCSetType(pc,PCLU)) and option -pc_type lu?? >>>>>> And also, What?s the priority between code pcsettype and option >>>>>> -pc_type ?? >>>>>> >>>>>> 2. When I tried to use METIS in MUMPS, I adopted metis by option >>>>>> (for example, -mat_mumps_icntl_7 5). In this situation, it is impossible to >>>>>> use metis without pc_type lu. However, in my case pc type lu makes the >>>>>> performance poor. So I don?t want to use lu preconditioner. How can I do >>>>>> this? >>>>>> >>>>>> The package MUMPS has an option to use metis in its ordering >>>>>> process which can be turned on as indicated while using MUMPS. Most >>>>>> preconditioners that PETSc can use do not use metis for any purpose hence >>>>>> there is no option to turn on its use. For what purpose do you wish to use >>>>>> metis? Partitioning, ordering, ? 
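For completeness, a minimal sketch in C of requesting MUMPS with the METIS ordering (ICNTL(7) = 5) from code rather than from the command line, following the usual PETSc sequence of setting the solver type before pulling out the factored matrix; A, b, and x are placeholder objects and error checking is omitted.

KSP ksp;
PC  pc;
Mat F;  /* the factored matrix handled by MUMPS */

KSPCreate(PETSC_COMM_WORLD, &ksp);
KSPSetOperators(ksp, A, A);
KSPSetType(ksp, KSPPREONLY);         /* direct solve: no Krylov iterations */
KSPGetPC(ksp, &pc);
PCSetType(pc, PCLU);
PCFactorSetMatSolverType(pc, MATSOLVERMUMPS);
PCFactorSetUpMatSolverType(pc);      /* create the MUMPS factor so ICNTL values can be set */
PCFactorGetMatrix(pc, &F);
MatMumpsSetIcntl(F, 7, 5);           /* ICNTL(7) = 5 selects METIS for the ordering */
KSPSetFromOptions(ksp);              /* command-line options such as -ksp_view still apply */
KSPSolve(ksp, b, x);

In the -mpi_linear_solver_server configuration shown above, the same ICNTL value is what -mpi_mat_mumps_icntl_7 5 sets from the command line, as confirmed by INFOG(7) = 5 in the -mpi_ksp_view output.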
>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Hyung Kim >>>>>> >>>>>> >>>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mlohry at gmail.com Wed Dec 7 15:21:24 2022 From: mlohry at gmail.com (Mark Lohry) Date: Wed, 7 Dec 2022 16:21:24 -0500 Subject: [petsc-users] prevent linking to multithreaded BLAS? Message-ID: I ran into an unexpected issue -- on an NP-core machine, each MPI rank of my application was launching NP threads, such that when running with multiple ranks the machine was quickly oversubscribed and performance tanked. The root cause of this was petsc linking against the system-provided library (libopenblas0-pthread in this case) set by the update-alternatives in ubuntu. At some point this machine got updated to using the threaded blas implementation instead of serial; not sure how, and I wouldn't have noticed if I weren't running interactively. Is there any mechanism in petsc or its build system to prevent linking against an inappropriate BLAS, or do I need to be diligent about manually setting the BLAS library in the configuration stage? Thanks, Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Dec 7 15:33:37 2022 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 7 Dec 2022 16:33:37 -0500 Subject: [petsc-users] prevent linking to multithreaded BLAS? In-Reply-To: References: Message-ID: We don't have configure code to detect if the BLAS is thread parallel, nor do we have code to tell it not to use a thread parallel version. Except if it is using MKL then we do force it to not use the threaded BLAS. A "cheat" would be for you to just set the environmental variable BLAS uses for number of threads to 1 always, then you would not need to worry about checking to avoid the "bad" library. Barry > On Dec 7, 2022, at 4:21 PM, Mark Lohry wrote: > > I ran into an unexpected issue -- on an NP-core machine, each MPI rank of my application was launching NP threads, such that when running with multiple ranks the machine was quickly oversubscribed and performance tanked. > > The root cause of this was petsc linking against the system-provided library (libopenblas0-pthread in this case) set by the update-alternatives in ubuntu. At some point this machine got updated to using the threaded blas implementation instead of serial; not sure how, and I wouldn't have noticed if I weren't running interactively. > > Is there any mechanism in petsc or its build system to prevent linking against an inappropriate BLAS, or do I need to be diligent about manually setting the BLAS library in the configuration stage? 
> > Thanks, > Mark From balay at mcs.anl.gov Wed Dec 7 15:35:08 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 7 Dec 2022 15:35:08 -0600 (CST) Subject: [petsc-users] prevent linking to multithreaded BLAS? In-Reply-To: References: Message-ID: <05539ceb-d9ec-7b03-4344-1f0cbfe57bd3@mcs.anl.gov> If you don't specify a blas to use - petsc configure will guess and use what it can find. So only way to force it use a particular blas is to specify one [one way is --download-fblaslapack] Wrt multi-thread openblas - you can force it run single threaded [by one of these 2 env variables] # Use single thread openblas export OPENBLAS_NUM_THREADS=1 export OMP_NUM_THREADS=1 Satish On Wed, 7 Dec 2022, Mark Lohry wrote: > I ran into an unexpected issue -- on an NP-core machine, each MPI rank of > my application was launching NP threads, such that when running with > multiple ranks the machine was quickly oversubscribed and performance > tanked. > > The root cause of this was petsc linking against the system-provided > library (libopenblas0-pthread in this case) set by the update-alternatives > in ubuntu. At some point this machine got updated to using the threaded > blas implementation instead of serial; not sure how, and I wouldn't have > noticed if I weren't running interactively. > > Is there any mechanism in petsc or its build system to prevent linking > against an inappropriate BLAS, or do I need to be diligent about manually > setting the BLAS library in the configuration stage? > > Thanks, > Mark > From mlohry at gmail.com Wed Dec 7 15:47:34 2022 From: mlohry at gmail.com (Mark Lohry) Date: Wed, 7 Dec 2022 16:47:34 -0500 Subject: [petsc-users] prevent linking to multithreaded BLAS? In-Reply-To: <05539ceb-d9ec-7b03-4344-1f0cbfe57bd3@mcs.anl.gov> References: <05539ceb-d9ec-7b03-4344-1f0cbfe57bd3@mcs.anl.gov> Message-ID: Thanks, yes, I figured out the OMP_NUM_THREADS=1 way while triaging it, and the --download-fblaslapack way occurred to me. I was hoping for something that "just worked" (refuse to build in this case) but I don't know if it's programmatically possible for petsc to tell whether or not it's linking to a threaded BLAS? On Wed, Dec 7, 2022 at 4:35 PM Satish Balay wrote: > If you don't specify a blas to use - petsc configure will guess and use > what it can find. > > So only way to force it use a particular blas is to specify one [one way > is --download-fblaslapack] > > Wrt multi-thread openblas - you can force it run single threaded [by one > of these 2 env variables] > > # Use single thread openblas > export OPENBLAS_NUM_THREADS=1 > export OMP_NUM_THREADS=1 > > Satish > > > On Wed, 7 Dec 2022, Mark Lohry wrote: > > > I ran into an unexpected issue -- on an NP-core machine, each MPI rank of > > my application was launching NP threads, such that when running with > > multiple ranks the machine was quickly oversubscribed and performance > > tanked. > > > > The root cause of this was petsc linking against the system-provided > > library (libopenblas0-pthread in this case) set by the > update-alternatives > > in ubuntu. At some point this machine got updated to using the threaded > > blas implementation instead of serial; not sure how, and I wouldn't have > > noticed if I weren't running interactively. > > > > Is there any mechanism in petsc or its build system to prevent linking > > against an inappropriate BLAS, or do I need to be diligent about manually > > setting the BLAS library in the configuration stage? 
> > > > Thanks, > > Mark > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Dec 7 17:32:48 2022 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 7 Dec 2022 18:32:48 -0500 Subject: [petsc-users] prevent linking to multithreaded BLAS? In-Reply-To: References: <05539ceb-d9ec-7b03-4344-1f0cbfe57bd3@mcs.anl.gov> Message-ID: <35A089AA-4AB6-4450-A348-95DBFAA4F68E@petsc.dev> There would need to be, for example, some symbol in all the threaded BLAS libraries that is not in the unthreaded libraries. Of at least in some of the threaded libraries but never in the unthreaded. BlasLapack.py could check for the special symbol(s) to determine. Barry > On Dec 7, 2022, at 4:47 PM, Mark Lohry wrote: > > Thanks, yes, I figured out the OMP_NUM_THREADS=1 way while triaging it, and the --download-fblaslapack way occurred to me. > > I was hoping for something that "just worked" (refuse to build in this case) but I don't know if it's programmatically possible for petsc to tell whether or not it's linking to a threaded BLAS? > > On Wed, Dec 7, 2022 at 4:35 PM Satish Balay > wrote: >> If you don't specify a blas to use - petsc configure will guess and use what it can find. >> >> So only way to force it use a particular blas is to specify one [one way is --download-fblaslapack] >> >> Wrt multi-thread openblas - you can force it run single threaded [by one of these 2 env variables] >> >> # Use single thread openblas >> export OPENBLAS_NUM_THREADS=1 >> export OMP_NUM_THREADS=1 >> >> Satish >> >> >> On Wed, 7 Dec 2022, Mark Lohry wrote: >> >> > I ran into an unexpected issue -- on an NP-core machine, each MPI rank of >> > my application was launching NP threads, such that when running with >> > multiple ranks the machine was quickly oversubscribed and performance >> > tanked. >> > >> > The root cause of this was petsc linking against the system-provided >> > library (libopenblas0-pthread in this case) set by the update-alternatives >> > in ubuntu. At some point this machine got updated to using the threaded >> > blas implementation instead of serial; not sure how, and I wouldn't have >> > noticed if I weren't running interactively. >> > >> > Is there any mechanism in petsc or its build system to prevent linking >> > against an inappropriate BLAS, or do I need to be diligent about manually >> > setting the BLAS library in the configuration stage? >> > >> > Thanks, >> > Mark >> > >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Dec 7 17:32:48 2022 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 7 Dec 2022 18:32:48 -0500 Subject: [petsc-users] prevent linking to multithreaded BLAS? In-Reply-To: References: <05539ceb-d9ec-7b03-4344-1f0cbfe57bd3@mcs.anl.gov> Message-ID: <35A089AA-4AB6-4450-A348-95DBFAA4F68E@petsc.dev> There would need to be, for example, some symbol in all the threaded BLAS libraries that is not in the unthreaded libraries. Of at least in some of the threaded libraries but never in the unthreaded. BlasLapack.py could check for the special symbol(s) to determine. Barry > On Dec 7, 2022, at 4:47 PM, Mark Lohry wrote: > > Thanks, yes, I figured out the OMP_NUM_THREADS=1 way while triaging it, and the --download-fblaslapack way occurred to me. > > I was hoping for something that "just worked" (refuse to build in this case) but I don't know if it's programmatically possible for petsc to tell whether or not it's linking to a threaded BLAS? 
> > On Wed, Dec 7, 2022 at 4:35 PM Satish Balay > wrote: >> If you don't specify a blas to use - petsc configure will guess and use what it can find. >> >> So only way to force it use a particular blas is to specify one [one way is --download-fblaslapack] >> >> Wrt multi-thread openblas - you can force it run single threaded [by one of these 2 env variables] >> >> # Use single thread openblas >> export OPENBLAS_NUM_THREADS=1 >> export OMP_NUM_THREADS=1 >> >> Satish >> >> >> On Wed, 7 Dec 2022, Mark Lohry wrote: >> >> > I ran into an unexpected issue -- on an NP-core machine, each MPI rank of >> > my application was launching NP threads, such that when running with >> > multiple ranks the machine was quickly oversubscribed and performance >> > tanked. >> > >> > The root cause of this was petsc linking against the system-provided >> > library (libopenblas0-pthread in this case) set by the update-alternatives >> > in ubuntu. At some point this machine got updated to using the threaded >> > blas implementation instead of serial; not sure how, and I wouldn't have >> > noticed if I weren't running interactively. >> > >> > Is there any mechanism in petsc or its build system to prevent linking >> > against an inappropriate BLAS, or do I need to be diligent about manually >> > setting the BLAS library in the configuration stage? >> > >> > Thanks, >> > Mark >> > >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Wed Dec 7 17:47:15 2022 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Wed, 7 Dec 2022 17:47:15 -0600 Subject: [petsc-users] [EXTERNAL] Re: Kokkos backend for Mat and Vec diverging when running on CUDA device. In-Reply-To: References: Message-ID: Hi, Philip, I could reproduce the error. I need to find a way to debug it. Thanks. /home/jczhang/xolotl/test/system/SystemTestCase.cpp(317): fatal error: in "System/PSI_1": absolute value of diffNorm{0.19704848134353209} exceeds 1e-10 *** 1 failure is detected in the test module "Regression" --Junchao Zhang On Tue, Dec 6, 2022 at 10:10 AM Fackler, Philip wrote: > I think it would be simpler to use the develop branch for this issue. But > you can still just build the SystemTester. Then (if you changed the PSI_1 > case) run: > > ./test/system/SystemTester -t System/PSI_1 -- -v? > > (No need for multiple MPI ranks) > > Thanks, > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > ------------------------------ > *From:* Junchao Zhang > *Sent:* Monday, December 5, 2022 15:40 > *To:* Fackler, Philip > *Cc:* xolotl-psi-development at lists.sourceforge.net < > xolotl-psi-development at lists.sourceforge.net>; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov>; Blondel, Sophie ; Roth, > Philip > *Subject:* Re: [EXTERNAL] Re: [petsc-users] Kokkos backend for Mat and > Vec diverging when running on CUDA device. > > I configured with xolotl branch feature-petsc-kokkos, and typed `make` > under ~/xolotl-build/. Though there were errors, a lot of *Tester were > built. 
> > [ 62%] Built target xolotlViz > [ 63%] Linking CXX executable TemperatureProfileHandlerTester > [ 64%] Linking CXX executable TemperatureGradientHandlerTester > [ 64%] Built target TemperatureProfileHandlerTester > [ 64%] Built target TemperatureConstantHandlerTester > [ 64%] Built target TemperatureGradientHandlerTester > [ 65%] Linking CXX executable HeatEquationHandlerTester > [ 65%] Built target HeatEquationHandlerTester > [ 66%] Linking CXX executable FeFitFluxHandlerTester > [ 66%] Linking CXX executable W111FitFluxHandlerTester > [ 67%] Linking CXX executable FuelFitFluxHandlerTester > [ 67%] Linking CXX executable W211FitFluxHandlerTester > > Which Tester should I use to run with the parameter file > benchmarks/params_system_PSI_2.txt? And how many ranks should I use? > Could you give an example command line? > Thanks. > > --Junchao Zhang > > > On Mon, Dec 5, 2022 at 2:22 PM Junchao Zhang > wrote: > > Hello, Philip, > Do I still need to use the feature-petsc-kokkos branch? > --Junchao Zhang > > > On Mon, Dec 5, 2022 at 11:08 AM Fackler, Philip > wrote: > > Junchao, > > Thank you for working on this. If you open the parameter file for, say, > the PSI_2 system test case (benchmarks/params_system_PSI_2.txt), simply add -dm_mat_type > aijkokkos -dm_vec_type kokkos?` to the "petscArgs=" field (or the > corresponding cusparse/cuda option). > > Thanks, > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > ------------------------------ > *From:* Junchao Zhang > *Sent:* Thursday, December 1, 2022 17:05 > *To:* Fackler, Philip > *Cc:* xolotl-psi-development at lists.sourceforge.net < > xolotl-psi-development at lists.sourceforge.net>; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov>; Blondel, Sophie ; Roth, > Philip > *Subject:* Re: [EXTERNAL] Re: [petsc-users] Kokkos backend for Mat and > Vec diverging when running on CUDA device. > > Hi, Philip, > Sorry for the long delay. I could not get something useful from the > -log_view output. Since I have already built xolotl, could you give me > instructions on how to do a xolotl test to reproduce the divergence with > petsc GPU backends (but fine on CPU)? > Thank you. 
> --Junchao Zhang > > > On Wed, Nov 16, 2022 at 1:38 PM Fackler, Philip > wrote: > > ------------------------------------------------------------------ PETSc > Performance Summary: > ------------------------------------------------------------------ > > Unknown Name on a named PC0115427 with 1 processor, by 4pf Wed Nov 16 > 14:36:46 2022 > Using Petsc Development GIT revision: v3.18.1-115-gdca010e0e9a GIT Date: > 2022-10-28 14:39:41 +0000 > > Max Max/Min Avg Total > Time (sec): 6.023e+00 1.000 6.023e+00 > Objects: 1.020e+02 1.000 1.020e+02 > Flops: 1.080e+09 1.000 1.080e+09 1.080e+09 > Flops/sec: 1.793e+08 1.000 1.793e+08 1.793e+08 > MPI Msg Count: 0.000e+00 0.000 0.000e+00 0.000e+00 > MPI Msg Len (bytes): 0.000e+00 0.000 0.000e+00 0.000e+00 > MPI Reductions: 0.000e+00 0.000 > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N > --> 2N flops > and VecAXPY() for complex vectors of length N > --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages > --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total Count > %Total Avg %Total Count %Total > 0: Main Stage: 6.0226e+00 100.0% 1.0799e+09 100.0% 0.000e+00 > 0.0% 0.000e+00 0.0% 0.000e+00 0.0% > > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flop: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > AvgLen: average message length (bytes) > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and > PetscLogStagePop(). 
> %T - percent time in this phase %F - percent flop in this > phase > %M - percent messages in this phase %L - percent message lengths > in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over > all processors) > GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU > time over all processors) > CpuToGpu Count: total number of CPU to GPU copies per processor > CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per > processor) > GpuToCpu Count: total number of GPU to CPU copies per processor > GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per > processor) > GPU %F: percent flops on GPU in this event > > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flop > --- Global --- --- Stage ---- Total > GPU - CpuToGpu - - GpuToCpu - GPU > > Max Ratio Max Ratio Max Ratio Mess AvgLen > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > Mflop/s Count Size Count Size %F > > > ------------------------------------------------------------------------------------------------------------------------ > --------------------------------------- > > > --- Event Stage 0: Main Stage > > BuildTwoSided 3 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > DMCreateMat 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > SFSetGraph 3 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > SFSetUp 3 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > SFPack 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > SFUnpack 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > VecDot 190 1.0 nan nan 2.11e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > VecMDot 775 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > VecNorm 1728 1.0 nan nan 1.92e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 2 0 0 0 0 2 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > VecScale 1983 1.0 nan nan 6.24e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > VecCopy 780 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > VecSet 4955 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 2 0 0 0 0 2 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > VecAXPY 190 1.0 nan nan 2.11e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > VecAYPX 597 1.0 nan nan 6.64e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > VecAXPBYCZ 643 1.0 nan nan 1.79e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 2 0 0 0 0 2 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > VecWAXPY 502 1.0 nan nan 5.58e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > VecMAXPY 1159 1.0 nan nan 3.68e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 3 0 0 0 0 3 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > VecScatterBegin 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan > -nan 2 5.14e-03 0 0.00e+00 0 > > 
VecScatterEnd 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > VecReduceArith 380 1.0 nan nan 4.23e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > VecReduceComm 190 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > VecNormalize 965 1.0 nan nan 1.61e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > TSStep 20 1.0 5.8699e+00 1.0 1.08e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 97100 0 0 0 97100 0 0 0 184 > -nan 2 5.14e-03 0 0.00e+00 54 > > TSFunctionEval 597 1.0 nan nan 6.64e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 63 1 0 0 0 63 1 0 0 0 -nan > -nan 1 3.36e-04 0 0.00e+00 100 > > TSJacobianEval 190 1.0 nan nan 3.37e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 24 3 0 0 0 24 3 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 97 > > MatMult 1930 1.0 nan nan 4.46e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 41 0 0 0 1 41 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > MatMultTranspose 1 1.0 nan nan 3.44e+05 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > MatSolve 965 1.0 nan nan 5.04e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 5 0 0 0 1 5 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatSOR 965 1.0 nan nan 3.33e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 4 31 0 0 0 4 31 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatLUFactorSym 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatLUFactorNum 190 1.0 nan nan 1.16e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 11 0 0 0 1 11 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatScale 190 1.0 nan nan 3.26e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 3 0 0 0 0 3 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > MatAssemblyBegin 761 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatAssemblyEnd 761 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatGetRowIJ 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatCreateSubMats 380 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatGetOrdering 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatZeroEntries 379 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatSetPreallCOO 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > MatSetValuesCOO 190 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > KSPSetUp 760 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > KSPSolve 190 1.0 5.8052e-01 1.0 9.30e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 10 86 0 0 0 10 86 0 0 0 1602 > -nan 1 4.80e-03 0 0.00e+00 46 > > KSPGMRESOrthog 775 1.0 nan nan 2.27e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 2 0 0 0 1 2 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > SNESSolve 71 1.0 5.7117e+00 1.0 1.07e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 95 99 0 0 0 95 99 0 0 0 188 > -nan 1 4.80e-03 0 0.00e+00 53 > > SNESSetUp 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > SNESFunctionEval 573 1.0 nan nan 2.23e+07 1.0 0.0e+00 
0.0e+00 > 0.0e+00 60 2 0 0 0 60 2 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > SNESJacobianEval 190 1.0 nan nan 3.37e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 24 3 0 0 0 24 3 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 97 > > SNESLineSearch 190 1.0 nan nan 1.05e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 53 10 0 0 0 53 10 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 100 > > PCSetUp 570 1.0 nan nan 1.16e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 2 11 0 0 0 2 11 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > PCApply 965 1.0 nan nan 6.14e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 8 57 0 0 0 8 57 0 0 0 -nan > -nan 1 4.80e-03 0 0.00e+00 19 > > KSPSolve_FS_0 965 1.0 nan nan 3.33e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 4 31 0 0 0 4 31 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > KSPSolve_FS_1 965 1.0 nan nan 1.66e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 2 15 0 0 0 2 15 0 0 0 -nan > -nan 0 0.00e+00 0 0.00e+00 0 > > > --- Event Stage 1: Unknown > > > ------------------------------------------------------------------------------------------------------------------------ > --------------------------------------- > > > Object Type Creations Destructions. Reports information only > for process 0. > > --- Event Stage 0: Main Stage > > Container 5 5 > Distributed Mesh 2 2 > Index Set 11 11 > IS L to G Mapping 1 1 > Star Forest Graph 7 7 > Discrete System 2 2 > Weak Form 2 2 > Vector 49 49 > TSAdapt 1 1 > TS 1 1 > DMTS 1 1 > SNES 1 1 > DMSNES 3 3 > SNESLineSearch 1 1 > Krylov Solver 4 4 > DMKSP interface 1 1 > Matrix 4 4 > Preconditioner 4 4 > Viewer 2 1 > > --- Event Stage 1: Unknown > > > ======================================================================================================================== > Average time to get PetscTime(): 3.14e-08 > #PETSc Option Table entries: > -log_view > -log_view_gpu_times > #End of PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with 64 bit PetscInt > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 sizeof(PetscInt) 8 > Configure options: PETSC_DIR=/home/4pf/repos/petsc > PETSC_ARCH=arch-kokkos-cuda-no-tpls --with-cc=mpicc --with-cxx=mpicxx > --with-fc=0 --with-cuda --with-debugging=0 --with-shared-libraries > --prefix=/home/4pf/build/petsc/cuda-no-tpls/install --with-64-bit-indices > --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 --CUDAOPTFLAGS=-O3 > --with-kokkos-dir=/home/4pf/build/kokkos/cuda/install > --with-kokkos-kernels-dir=/home/4pf/build/kokkos-kernels/cuda-no-tpls/install > > ----------------------------------------- > Libraries compiled on 2022-11-01 21:01:08 on PC0115427 > Machine characteristics: Linux-5.15.0-52-generic-x86_64-with-glibc2.35 > Using PETSc directory: /home/4pf/build/petsc/cuda-no-tpls/install > Using PETSc arch: > ----------------------------------------- > > Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas > -Wno-lto-type-mismatch -Wno-stringop-overflow -fstack-protector > -fvisibility=hidden -O3 > ----------------------------------------- > > Using include paths: -I/home/4pf/build/petsc/cuda-no-tpls/install/include > -I/home/4pf/build/kokkos-kernels/cuda-no-tpls/install/include > -I/home/4pf/build/kokkos/cuda/install/include -I/usr/local/cuda-11.8/include > ----------------------------------------- > > Using C linker: mpicc > Using libraries: -Wl,-rpath,/home/4pf/build/petsc/cuda-no-tpls/install/lib > -L/home/4pf/build/petsc/cuda-no-tpls/install/lib -lpetsc > -Wl,-rpath,/home/4pf/build/kokkos-kernels/cuda-no-tpls/install/lib > 
-L/home/4pf/build/kokkos-kernels/cuda-no-tpls/install/lib > -Wl,-rpath,/home/4pf/build/kokkos/cuda/install/lib > -L/home/4pf/build/kokkos/cuda/install/lib > -Wl,-rpath,/usr/local/cuda-11.8/lib64 -L/usr/local/cuda-11.8/lib64 > -L/usr/local/cuda-11.8/lib64/stubs -lkokkoskernels -lkokkoscontainers > -lkokkoscore -llapack -lblas -lm -lcudart -lnvToolsExt -lcufft -lcublas > -lcusparse -lcusolver -lcurand -lcuda -lquadmath -lstdc++ -ldl > ----------------------------------------- > > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > ------------------------------ > *From:* Junchao Zhang > *Sent:* Tuesday, November 15, 2022 13:03 > *To:* Fackler, Philip > *Cc:* xolotl-psi-development at lists.sourceforge.net < > xolotl-psi-development at lists.sourceforge.net>; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov>; Blondel, Sophie ; Roth, > Philip > *Subject:* Re: [EXTERNAL] Re: [petsc-users] Kokkos backend for Mat and > Vec diverging when running on CUDA device. > > Can you paste -log_view result so I can see what functions are used? > > --Junchao Zhang > > > On Tue, Nov 15, 2022 at 10:24 AM Fackler, Philip > wrote: > > Yes, most (but not all) of our system test cases fail with the kokkos/cuda > or cuda backends. All of them pass with the CPU-only kokkos backend. > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > ------------------------------ > *From:* Junchao Zhang > *Sent:* Monday, November 14, 2022 19:34 > *To:* Fackler, Philip > *Cc:* xolotl-psi-development at lists.sourceforge.net < > xolotl-psi-development at lists.sourceforge.net>; petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov>; Blondel, Sophie ; Zhang, > Junchao ; Roth, Philip > *Subject:* [EXTERNAL] Re: [petsc-users] Kokkos backend for Mat and Vec > diverging when running on CUDA device. > > Hi, Philip, > Sorry to hear that. It seems you could run the same code on CPUs but > not no GPUs (with either petsc/Kokkos backend or petsc/cuda backend, is it > right? > > --Junchao Zhang > > > On Mon, Nov 14, 2022 at 12:13 PM Fackler, Philip via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > This is an issue I've brought up before (and discussed in-person with > Richard). I wanted to bring it up again because I'm hitting the limits of > what I know to do, and I need help figuring this out. > > The problem can be reproduced using Xolotl's "develop" branch built > against a petsc build with kokkos and kokkos-kernels enabled. Then, either > add the relevant kokkos options to the "petscArgs=" line in the system test > parameter file(s), or just replace the system test parameter files with the > ones from the "feature-petsc-kokkos" branch. See here the files that > begin with "params_system_". > > Note that those files use the "kokkos" options, but the problem is similar > using the corresponding cuda/cusparse options. I've already tried building > kokkos-kernels with no TPLs and got slightly different results, but the > same problem. > > Any help would be appreciated. 
> > Thanks, > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From narnoldm at umich.edu Wed Dec 7 20:21:32 2022 From: narnoldm at umich.edu (Nicholas Arnold-Medabalimi) Date: Wed, 7 Dec 2022 21:21:32 -0500 Subject: [petsc-users] Petsc Section in DMPlex In-Reply-To: References: Message-ID: Thank you for the help. I think the last piece of the puzzle is how do I create the "expanded IS" from the subpoint IS using the section? Sincerely Nicholas On Wed, Dec 7, 2022 at 7:06 AM Matthew Knepley wrote: > On Wed, Dec 7, 2022 at 6:51 AM Nicholas Arnold-Medabalimi < > narnoldm at umich.edu> wrote: > >> Hi >> >> Thank you so much for your patience. One thing to note: I don't have any >> need to go back from the filtered distributed mapping back to the full but >> it is good to know. >> >> One aside question. >> 1) Is natural and global ordering the same in this context? >> > > No. > > >> As far as implementing what you have described. >> >> When I call ISView on the generated SubpointIS, I get an unusual error >> which I'm not sure how to interpret. (this case is running on 2 ranks and >> the filter label has points located on both ranks of the original DM. >> However, if I manually get the indices (the commented lines), it seems to >> not have any issues. >> call DMPlexFilter(dmplex_full, iBlankLabel, 1, dmplex_filtered,ierr) >> call DMPlexGetSubpointIS(dmplex_filtered, subpointsIS,ierr) >> !call ISGetIndicesF90(subpointsIS, subPointKey,ierr) >> !write(*,*) subPointKey >> !call ISRestoreIndicesF90(subpointsIS, subPointKey,ierr) >> call ISView(subpointsIS,PETSC_VIEWER_STDOUT_WORLD,ierr) >> >> [1]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [1]PETSC ERROR: Arguments must have same communicators >> [1]PETSC ERROR: Different communicators in the two objects: Argument # 1 >> and 2 flag 3 >> [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. >> [1]PETSC ERROR: Petsc Development GIT revision: v3.18.1-320-g7810d690132 >> GIT Date: 2022-11-20 20:25:41 -0600 >> [1]PETSC ERROR: Configure options with-fc=mpiifort with-mpi-f90=mpiifort >> --download-triangle --download-parmetis --download-metis --with-debugging=1 >> --download-hdf5 --prefix=/home/narnoldm/packages/petsc_install >> [1]PETSC ERROR: #1 ISView() at >> /home/narnoldm/packages/petsc/src/vec/is/is/interface/index.c:1629 >> > > The problem here is the subpointsIS is a _serial_ object, and you are > using a parallel viewer. You can use PETSC_VIEWER_STDOUT_SELF, > or you can pull out the singleton viewer from STDOUT_WORLD if you want > them all to print in order. > > >> As far as the overall process you have described my question on first >> glance is do I have to allocate/create the vector that is output by >> VecISCopy before calling it, or does it create the vector automatically? >> > > You create both vectors. I would do it using DMCreateGlobalVector() from > both DMs. > > >> I think I would need to create it first using a section and Setting the >> Vec in the filtered DM? >> > > Setting the Section in the filtered DM. > > >> And I presume in this case I would be using the scatter reverse option to >> go from the full set to the reduced set? 
>> > > Yes > > Thanks > > Matt > > >> Sincerely >> Nicholas >> >> >> >> >> >> >> Sincerely >> Nick >> >> On Wed, Dec 7, 2022 at 6:00 AM Matthew Knepley wrote: >> >>> On Wed, Dec 7, 2022 at 3:35 AM Nicholas Arnold-Medabalimi < >>> narnoldm at umich.edu> wrote: >>> >>>> Hi Matthew >>>> >>>> Thank you for the help. This clarified a great deal. >>>> >>>> I have a follow-up question related to DMPlexFilter. It may be better >>>> to describe what I'm trying to achieve. >>>> >>>> I have a general mesh I am solving which has a section with cell center >>>> finite volume states, as described in my initial email. After calculating >>>> some metrics, I tag a bunch of cells with an identifying Label and use >>>> DMFilter to generate a new DM which is only that subset of cells. >>>> Generally, this leads to a pretty unbalanced DM so I then plan to use >>>> DMPlexDIstribute to balance that DM across the processors. The coordinates >>>> pass along fine, but the state(or I should say Section) does not at least >>>> as far as I can tell. >>>> >>>> Assuming I can get a filtered DM I then distribute the DM and state >>>> using the method you described above and it seems to be working ok. >>>> >>>> The last connection I have to make is the transfer of information from >>>> the full mesh to the "sampled" filtered mesh. From what I can gather I >>>> would need to get the mapping of points using DMPlexGetSubpointIS and then >>>> manually copy the values from the full DM section to the filtered DM? I >>>> have the process from full->filtered->distributed all working for the >>>> coordinates so its just a matter of transferring the section correctly. >>>> >>>> I appreciate all the help you have provided. >>>> >>> >>> Let's do this in two steps, which makes it easier to debug. First, do >>> not redistribute the submesh. Just use DMPlexGetSubpointIS() >>> to get the mapping of filtered points to points in the original mesh. >>> Then create an expanded IS using the Section which makes >>> dofs in the filtered mesh to dofs in the original mesh. From this use >>> >>> https://petsc.org/main/docs/manualpages/Vec/VecISCopy/ >>> >>> to move values between the original vector and the filtered vector. >>> >>> Once that works, you can try redistributing the filtered mesh. Before >>> calling DMPlexDistribute() on the filtered mesh, you need to call >>> >>> https://petsc.org/main/docs/manualpages/DM/DMSetUseNatural/ >>> >>> When you redistribute, it will compute a mapping back to the original >>> layout. Now when you want to transfer values, you >>> >>> 1) Create a natural vector with DMCreateNaturalVec() >>> >>> 2) Use DMGlobalToNaturalBegin/End() to move values from the filtered >>> vector to the natural vector >>> >>> 3) Use VecISCopy() to move values from the natural vector to the >>> original vector >>> >>> Let me know if you have any problems. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Sincerely >>>> Nicholas >>>> >>>> >>>> >>>> On Mon, Nov 28, 2022 at 6:19 AM Matthew Knepley >>>> wrote: >>>> >>>>> On Sun, Nov 27, 2022 at 10:22 PM Nicholas Arnold-Medabalimi < >>>>> narnoldm at umich.edu> wrote: >>>>> >>>>>> Hi Petsc Users >>>>>> >>>>>> I have a question about properly using PetscSection to assign state >>>>>> variables to a DM. I have an existing DMPlex mesh distributed on 2 >>>>>> processors. My goal is to have state variables set to the cell centers. 
I >>>>>> then want to call DMPlexDistribute, which I hope will balance the mesh >>>>>> elements and hopefully transport the state variables to the hosting >>>>>> processors as the cells are distributed to a different processor count or >>>>>> simply just redistributing after doing mesh adaption. >>>>>> >>>>>> Looking at the DMPlex User guide, I should be able to achieve this >>>>>> with a single field section using SetDof and assigning the DOF to the >>>>>> points corresponding to cells. >>>>>> >>>>> >>>>> Note that if you want several different fields, you can clone the DM >>>>> first for this field >>>>> >>>>> call DMClone(dm,dmState,ierr) >>>>> >>>>> and use dmState in your calls below. >>>>> >>>>> >>>>>> >>>>>> call DMPlexGetHeightStratum(dm,0,c0,c1,ierr) >>>>>> call DMPlexGetChart(dm,p0,p1,ierr) >>>>>> call PetscSectionCreate(PETSC_COMM_WORLD,section,ierr) >>>>>> call PetscSectionSetNumFields(section,1,ierr) call >>>>>> PetscSectionSetChart(section,p0,p1,ierr) >>>>>> do i = c0, (c1-1) >>>>>> call PetscSectionSetDof(section,i,nvar,ierr) >>>>>> end do >>>>>> call PetscSectionSetup(section,ierr) >>>>>> call DMSetLocalSection(dm,section,ierr) >>>>>> >>>>> >>>>> In the loop, I would add a call to >>>>> >>>>> call PetscSectionSetFieldDof(section,i,0,nvar,ierr) >>>>> >>>>> This also puts in the field breakdown. It is not essential, but nicer. >>>>> >>>>> >>>>>> From here, it looks like I can access and set the state vars using >>>>>> >>>>>> call DMGetGlobalVector(dmplex,state,ierr) >>>>>> call DMGetGlobalSection(dmplex,section,ierr) >>>>>> call VecGetArrayF90(state,stateVec,ierr) >>>>>> do i = c0, (c1-1) >>>>>> call PetscSectionGetOffset(section,i,offset,ierr) >>>>>> stateVec(offset:(offset+nvar))=state_i(:) !simplified assignment >>>>>> end do >>>>>> call VecRestoreArrayF90(state,stateVec,ierr) >>>>>> call DMRestoreGlobalVector(dmplex,state,ierr) >>>>>> >>>>>> To my understanding, I should be using Global vector since this is a >>>>>> pure assignment operation and I don't need the ghost cells. >>>>>> >>>>> >>>>> Yes. >>>>> >>>>> But the behavior I am seeing isn't exactly what I'd expect. >>>>>> >>>>>> To be honest, I'm somewhat unclear on a few things >>>>>> >>>>>> 1) Should be using nvar fields with 1 DOF each or 1 field with nvar >>>>>> DOFs or what the distinction between the two methods are? >>>>>> >>>>> >>>>> We have two divisions in a Section. A field can have a number of >>>>> components. This is intended to model a vector or tensor field. >>>>> Then a Section can have a number of fields, such as velocity and >>>>> pressure for a Stokes problem. The division is mainly to help the >>>>> user, so I would use the most natural one. >>>>> >>>>> >>>>>> 2) Adding a print statement after the offset assignment I get (on >>>>>> rank 0 of 2) >>>>>> cell 1 offset 0 >>>>>> cell 2 offset 18 >>>>>> cell 3 offset 36 >>>>>> which is expected and works but on rank 1 I get >>>>>> cell 1 offset 9000 >>>>>> cell 2 offset 9018 >>>>>> cell 3 offset 9036 >>>>>> >>>>>> which isn't exactly what I would expect. Shouldn't the offsets reset >>>>>> at 0 for the next rank? >>>>>> >>>>> >>>>> The local and global sections hold different information. This is the >>>>> source of the confusion. The local section does describe a local >>>>> vector, and thus includes overlap or "ghost" dofs. The global section >>>>> describes a global vector. However, it is intended to deliver >>>>> global indices, and thus the offsets give back global indices. 
When >>>>> you use VecGetArray*() you are getting out the local array, and >>>>> thus you have to subtract the first index on this process. You can get >>>>> that from >>>>> >>>>> VecGetOwnershipRange(v, &rstart, &rEnd); >>>>> >>>>> This is the same whether you are using DMDA or DMPlex or any other DM. >>>>> >>>>> >>>>>> 3) Does calling DMPlexDistribute also distribute the section data >>>>>> associated with the DOF, based on the description in DMPlexDistribute it >>>>>> looks like it should? >>>>>> >>>>> >>>>> No. By default, DMPlexDistribute() only distributes coordinate data. I >>>>> you want to distribute your field, it would look something like this: >>>>> >>>>> DMPlexDistribute(dm, 0, &sfDist, &dmDist); >>>>> VecCreate(comm, &stateDist); >>>>> VecSetDM(sateDist, dmDist); >>>>> PetscSectionCreate(comm §ionDist); >>>>> DMSetLocalSection(dmDist, sectionDist); >>>>> DMPlexDistributeField(dmDist, sfDist, section, state, sectionDist, >>>>> stateDist); >>>>> >>>>> We do this in src/dm/impls/plex/tests/ex36.c >>>>> >>>>> THanks, >>>>> >>>>> Matt >>>>> >>>>> I'd appreciate any insight into the specifics of this usage. I expect >>>>>> I have a misconception on the local vs global section. Thank you. >>>>>> >>>>>> Sincerely >>>>>> Nicholas >>>>>> >>>>>> -- >>>>>> Nicholas Arnold-Medabalimi >>>>>> >>>>>> Ph.D. Candidate >>>>>> Computational Aeroscience Lab >>>>>> University of Michigan >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>>> >>>> -- >>>> Nicholas Arnold-Medabalimi >>>> >>>> Ph.D. Candidate >>>> Computational Aeroscience Lab >>>> University of Michigan >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> >> >> -- >> Nicholas Arnold-Medabalimi >> >> Ph.D. Candidate >> Computational Aeroscience Lab >> University of Michigan >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- Nicholas Arnold-Medabalimi Ph.D. Candidate Computational Aeroscience Lab University of Michigan -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Dec 7 20:28:36 2022 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 7 Dec 2022 21:28:36 -0500 Subject: [petsc-users] Petsc Section in DMPlex In-Reply-To: References: Message-ID: On Wed, Dec 7, 2022 at 9:21 PM Nicholas Arnold-Medabalimi < narnoldm at umich.edu> wrote: > Thank you for the help. > > I think the last piece of the puzzle is how do I create the "expanded IS" > from the subpoint IS using the section? > Loop over the points in the IS. For each point, get the dof and offset from the Section. Make a new IS that has all the dogs, namely each run [offset, offset+dof). Thanks, Matt > Sincerely > Nicholas > > > On Wed, Dec 7, 2022 at 7:06 AM Matthew Knepley wrote: > >> On Wed, Dec 7, 2022 at 6:51 AM Nicholas Arnold-Medabalimi < >> narnoldm at umich.edu> wrote: >> >>> Hi >>> >>> Thank you so much for your patience. 
One thing to note: I don't have any >>> need to go back from the filtered distributed mapping back to the full but >>> it is good to know. >>> >>> One aside question. >>> 1) Is natural and global ordering the same in this context? >>> >> >> No. >> >> >>> As far as implementing what you have described. >>> >>> When I call ISView on the generated SubpointIS, I get an unusual error >>> which I'm not sure how to interpret. (this case is running on 2 ranks and >>> the filter label has points located on both ranks of the original DM. >>> However, if I manually get the indices (the commented lines), it seems to >>> not have any issues. >>> call DMPlexFilter(dmplex_full, iBlankLabel, 1, dmplex_filtered,ierr) >>> call DMPlexGetSubpointIS(dmplex_filtered, subpointsIS,ierr) >>> !call ISGetIndicesF90(subpointsIS, subPointKey,ierr) >>> !write(*,*) subPointKey >>> !call ISRestoreIndicesF90(subpointsIS, subPointKey,ierr) >>> call ISView(subpointsIS,PETSC_VIEWER_STDOUT_WORLD,ierr) >>> >>> [1]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> [1]PETSC ERROR: Arguments must have same communicators >>> [1]PETSC ERROR: Different communicators in the two objects: Argument # 1 >>> and 2 flag 3 >>> [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. >>> [1]PETSC ERROR: Petsc Development GIT revision: v3.18.1-320-g7810d690132 >>> GIT Date: 2022-11-20 20:25:41 -0600 >>> [1]PETSC ERROR: Configure options with-fc=mpiifort with-mpi-f90=mpiifort >>> --download-triangle --download-parmetis --download-metis --with-debugging=1 >>> --download-hdf5 --prefix=/home/narnoldm/packages/petsc_install >>> [1]PETSC ERROR: #1 ISView() at >>> /home/narnoldm/packages/petsc/src/vec/is/is/interface/index.c:1629 >>> >> >> The problem here is the subpointsIS is a _serial_ object, and you are >> using a parallel viewer. You can use PETSC_VIEWER_STDOUT_SELF, >> or you can pull out the singleton viewer from STDOUT_WORLD if you want >> them all to print in order. >> >> >>> As far as the overall process you have described my question on first >>> glance is do I have to allocate/create the vector that is output by >>> VecISCopy before calling it, or does it create the vector automatically? >>> >> >> You create both vectors. I would do it using DMCreateGlobalVector() from >> both DMs. >> >> >>> I think I would need to create it first using a section and Setting the >>> Vec in the filtered DM? >>> >> >> Setting the Section in the filtered DM. >> >> >>> And I presume in this case I would be using the scatter reverse option >>> to go from the full set to the reduced set? >>> >> >> Yes >> >> Thanks >> >> Matt >> >> >>> Sincerely >>> Nicholas >>> >>> >>> >>> >>> >>> >>> Sincerely >>> Nick >>> >>> On Wed, Dec 7, 2022 at 6:00 AM Matthew Knepley >>> wrote: >>> >>>> On Wed, Dec 7, 2022 at 3:35 AM Nicholas Arnold-Medabalimi < >>>> narnoldm at umich.edu> wrote: >>>> >>>>> Hi Matthew >>>>> >>>>> Thank you for the help. This clarified a great deal. >>>>> >>>>> I have a follow-up question related to DMPlexFilter. It may be better >>>>> to describe what I'm trying to achieve. >>>>> >>>>> I have a general mesh I am solving which has a section with cell >>>>> center finite volume states, as described in my initial email. After >>>>> calculating some metrics, I tag a bunch of cells with an identifying Label >>>>> and use DMFilter to generate a new DM which is only that subset of cells. 
>>>>> Generally, this leads to a pretty unbalanced DM so I then plan to use >>>>> DMPlexDIstribute to balance that DM across the processors. The coordinates >>>>> pass along fine, but the state(or I should say Section) does not at least >>>>> as far as I can tell. >>>>> >>>>> Assuming I can get a filtered DM I then distribute the DM and state >>>>> using the method you described above and it seems to be working ok. >>>>> >>>>> The last connection I have to make is the transfer of information from >>>>> the full mesh to the "sampled" filtered mesh. From what I can gather I >>>>> would need to get the mapping of points using DMPlexGetSubpointIS and then >>>>> manually copy the values from the full DM section to the filtered DM? I >>>>> have the process from full->filtered->distributed all working for the >>>>> coordinates so its just a matter of transferring the section correctly. >>>>> >>>>> I appreciate all the help you have provided. >>>>> >>>> >>>> Let's do this in two steps, which makes it easier to debug. First, do >>>> not redistribute the submesh. Just use DMPlexGetSubpointIS() >>>> to get the mapping of filtered points to points in the original mesh. >>>> Then create an expanded IS using the Section which makes >>>> dofs in the filtered mesh to dofs in the original mesh. From this use >>>> >>>> https://petsc.org/main/docs/manualpages/Vec/VecISCopy/ >>>> >>>> to move values between the original vector and the filtered vector. >>>> >>>> Once that works, you can try redistributing the filtered mesh. Before >>>> calling DMPlexDistribute() on the filtered mesh, you need to call >>>> >>>> https://petsc.org/main/docs/manualpages/DM/DMSetUseNatural/ >>>> >>>> When you redistribute, it will compute a mapping back to the original >>>> layout. Now when you want to transfer values, you >>>> >>>> 1) Create a natural vector with DMCreateNaturalVec() >>>> >>>> 2) Use DMGlobalToNaturalBegin/End() to move values from the filtered >>>> vector to the natural vector >>>> >>>> 3) Use VecISCopy() to move values from the natural vector to the >>>> original vector >>>> >>>> Let me know if you have any problems. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Sincerely >>>>> Nicholas >>>>> >>>>> >>>>> >>>>> On Mon, Nov 28, 2022 at 6:19 AM Matthew Knepley >>>>> wrote: >>>>> >>>>>> On Sun, Nov 27, 2022 at 10:22 PM Nicholas Arnold-Medabalimi < >>>>>> narnoldm at umich.edu> wrote: >>>>>> >>>>>>> Hi Petsc Users >>>>>>> >>>>>>> I have a question about properly using PetscSection to assign state >>>>>>> variables to a DM. I have an existing DMPlex mesh distributed on 2 >>>>>>> processors. My goal is to have state variables set to the cell centers. I >>>>>>> then want to call DMPlexDistribute, which I hope will balance the mesh >>>>>>> elements and hopefully transport the state variables to the hosting >>>>>>> processors as the cells are distributed to a different processor count or >>>>>>> simply just redistributing after doing mesh adaption. >>>>>>> >>>>>>> Looking at the DMPlex User guide, I should be able to achieve this >>>>>>> with a single field section using SetDof and assigning the DOF to the >>>>>>> points corresponding to cells. >>>>>>> >>>>>> >>>>>> Note that if you want several different fields, you can clone the DM >>>>>> first for this field >>>>>> >>>>>> call DMClone(dm,dmState,ierr) >>>>>> >>>>>> and use dmState in your calls below. 
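In C, the expanded-IS construction described above might look roughly like the sketch below (dmFull, dmFiltered, vecFull, and vecFiltered are illustrative names, not anything from the code in this thread; points that the global section marks as unowned, i.e. with a negative dof, contribute nothing):

  PetscSection    gsec;
  IS              subpointIS, expandedIS;
  const PetscInt *points;
  PetscInt       *idx, npoints, nidx = 0, dof, off, p, d, k = 0;

  PetscCall(DMPlexGetSubpointIS(dmFiltered, &subpointIS));
  PetscCall(DMGetGlobalSection(dmFull, &gsec));
  PetscCall(ISGetLocalSize(subpointIS, &npoints));
  PetscCall(ISGetIndices(subpointIS, &points));
  /* first pass: count the dofs carried by the filtered points */
  for (p = 0; p < npoints; ++p) {
    PetscCall(PetscSectionGetDof(gsec, points[p], &dof));
    if (dof > 0) nidx += dof;
  }
  PetscCall(PetscMalloc1(nidx, &idx));
  /* second pass: record the run [off, off+dof) of 0-based global indices for each point */
  for (p = 0; p < npoints; ++p) {
    PetscCall(PetscSectionGetDof(gsec, points[p], &dof));
    PetscCall(PetscSectionGetOffset(gsec, points[p], &off));
    for (d = 0; d < dof; ++d) idx[k++] = off + d;
  }
  PetscCall(ISRestoreIndices(subpointIS, &points));
  PetscCall(ISCreateGeneral(PetscObjectComm((PetscObject)vecFull), nidx, idx, PETSC_OWN_POINTER, &expandedIS));
  /* SCATTER_REVERSE pulls values from the full vector into the filtered vector */
  PetscCall(VecISCopy(vecFull, expandedIS, SCATTER_REVERSE, vecFiltered));

Passing SCATTER_FORWARD instead would push the filtered values back into the full vector.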
>>>>>> >>>>>> >>>>>>> >>>>>>> call DMPlexGetHeightStratum(dm,0,c0,c1,ierr) >>>>>>> call DMPlexGetChart(dm,p0,p1,ierr) >>>>>>> call PetscSectionCreate(PETSC_COMM_WORLD,section,ierr) >>>>>>> call PetscSectionSetNumFields(section,1,ierr) call >>>>>>> PetscSectionSetChart(section,p0,p1,ierr) >>>>>>> do i = c0, (c1-1) >>>>>>> call PetscSectionSetDof(section,i,nvar,ierr) >>>>>>> end do >>>>>>> call PetscSectionSetup(section,ierr) >>>>>>> call DMSetLocalSection(dm,section,ierr) >>>>>>> >>>>>> >>>>>> In the loop, I would add a call to >>>>>> >>>>>> call PetscSectionSetFieldDof(section,i,0,nvar,ierr) >>>>>> >>>>>> This also puts in the field breakdown. It is not essential, but nicer. >>>>>> >>>>>> >>>>>>> From here, it looks like I can access and set the state vars using >>>>>>> >>>>>>> call DMGetGlobalVector(dmplex,state,ierr) >>>>>>> call DMGetGlobalSection(dmplex,section,ierr) >>>>>>> call VecGetArrayF90(state,stateVec,ierr) >>>>>>> do i = c0, (c1-1) >>>>>>> call PetscSectionGetOffset(section,i,offset,ierr) >>>>>>> stateVec(offset:(offset+nvar))=state_i(:) !simplified assignment >>>>>>> end do >>>>>>> call VecRestoreArrayF90(state,stateVec,ierr) >>>>>>> call DMRestoreGlobalVector(dmplex,state,ierr) >>>>>>> >>>>>>> To my understanding, I should be using Global vector since this is a >>>>>>> pure assignment operation and I don't need the ghost cells. >>>>>>> >>>>>> >>>>>> Yes. >>>>>> >>>>>> But the behavior I am seeing isn't exactly what I'd expect. >>>>>>> >>>>>>> To be honest, I'm somewhat unclear on a few things >>>>>>> >>>>>>> 1) Should be using nvar fields with 1 DOF each or 1 field with nvar >>>>>>> DOFs or what the distinction between the two methods are? >>>>>>> >>>>>> >>>>>> We have two divisions in a Section. A field can have a number of >>>>>> components. This is intended to model a vector or tensor field. >>>>>> Then a Section can have a number of fields, such as velocity and >>>>>> pressure for a Stokes problem. The division is mainly to help the >>>>>> user, so I would use the most natural one. >>>>>> >>>>>> >>>>>>> 2) Adding a print statement after the offset assignment I get (on >>>>>>> rank 0 of 2) >>>>>>> cell 1 offset 0 >>>>>>> cell 2 offset 18 >>>>>>> cell 3 offset 36 >>>>>>> which is expected and works but on rank 1 I get >>>>>>> cell 1 offset 9000 >>>>>>> cell 2 offset 9018 >>>>>>> cell 3 offset 9036 >>>>>>> >>>>>>> which isn't exactly what I would expect. Shouldn't the offsets reset >>>>>>> at 0 for the next rank? >>>>>>> >>>>>> >>>>>> The local and global sections hold different information. This is the >>>>>> source of the confusion. The local section does describe a local >>>>>> vector, and thus includes overlap or "ghost" dofs. The global section >>>>>> describes a global vector. However, it is intended to deliver >>>>>> global indices, and thus the offsets give back global indices. When >>>>>> you use VecGetArray*() you are getting out the local array, and >>>>>> thus you have to subtract the first index on this process. You can >>>>>> get that from >>>>>> >>>>>> VecGetOwnershipRange(v, &rstart, &rEnd); >>>>>> >>>>>> This is the same whether you are using DMDA or DMPlex or any other DM. >>>>>> >>>>>> >>>>>>> 3) Does calling DMPlexDistribute also distribute the section data >>>>>>> associated with the DOF, based on the description in DMPlexDistribute it >>>>>>> looks like it should? >>>>>>> >>>>>> >>>>>> No. By default, DMPlexDistribute() only distributes coordinate data. 
>>>>>> I you want to distribute your field, it would look something like this: >>>>>> >>>>>> DMPlexDistribute(dm, 0, &sfDist, &dmDist); >>>>>> VecCreate(comm, &stateDist); >>>>>> VecSetDM(sateDist, dmDist); >>>>>> PetscSectionCreate(comm §ionDist); >>>>>> DMSetLocalSection(dmDist, sectionDist); >>>>>> DMPlexDistributeField(dmDist, sfDist, section, state, sectionDist, >>>>>> stateDist); >>>>>> >>>>>> We do this in src/dm/impls/plex/tests/ex36.c >>>>>> >>>>>> THanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> I'd appreciate any insight into the specifics of this usage. I expect >>>>>>> I have a misconception on the local vs global section. Thank you. >>>>>>> >>>>>>> Sincerely >>>>>>> Nicholas >>>>>>> >>>>>>> -- >>>>>>> Nicholas Arnold-Medabalimi >>>>>>> >>>>>>> Ph.D. Candidate >>>>>>> Computational Aeroscience Lab >>>>>>> University of Michigan >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> Nicholas Arnold-Medabalimi >>>>> >>>>> Ph.D. Candidate >>>>> Computational Aeroscience Lab >>>>> University of Michigan >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >>> >>> -- >>> Nicholas Arnold-Medabalimi >>> >>> Ph.D. Candidate >>> Computational Aeroscience Lab >>> University of Michigan >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > > > -- > Nicholas Arnold-Medabalimi > > Ph.D. Candidate > Computational Aeroscience Lab > University of Michigan > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Dec 7 22:56:07 2022 From: jed at jedbrown.org (Jed Brown) Date: Wed, 07 Dec 2022 21:56:07 -0700 Subject: [petsc-users] prevent linking to multithreaded BLAS? In-Reply-To: <35A089AA-4AB6-4450-A348-95DBFAA4F68E@petsc.dev> References: <05539ceb-d9ec-7b03-4344-1f0cbfe57bd3@mcs.anl.gov> <35A089AA-4AB6-4450-A348-95DBFAA4F68E@petsc.dev> Message-ID: <87edtaze6w.fsf@jedbrown.org> It isn't always wrong to link threaded BLAS. For example, a user might need to call threaded BLAS on the side (but the application can only link one) or a sparse direct solver might want threading for the supernode. We could test at runtime whether child threads exist/are created when calling BLAS and deliver a warning. Barry Smith writes: > There would need to be, for example, some symbol in all the threaded BLAS libraries that is not in the unthreaded libraries. Of at least in some of the threaded libraries but never in the unthreaded. > > BlasLapack.py could check for the special symbol(s) to determine. 
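As a rough illustration of the runtime idea only, here is a Linux-specific sketch that compares the Threads: count in /proc/self/status before and after a dgemm call; a pthread OpenBLAS typically spins up its worker pool on the first parallel call, but this is a heuristic rather than a guaranteed test, and the matrix size is arbitrary:

  #include <stdio.h>
  #include <stdlib.h>

  /* Fortran BLAS symbol exported by most BLAS libraries */
  extern void dgemm_(const char *, const char *, const int *, const int *, const int *,
                     const double *, const double *, const int *, const double *, const int *,
                     const double *, double *, const int *);

  /* number of threads in this process, read from /proc/self/status (Linux only) */
  static int thread_count(void)
  {
    FILE *f = fopen("/proc/self/status", "r");
    char  line[256];
    int   n = -1;
    if (!f) return -1;
    while (fgets(line, sizeof line, f))
      if (sscanf(line, "Threads: %d", &n) == 1) break;
    fclose(f);
    return n;
  }

  int main(void)
  {
    int     n = 512, before, after;
    double  one = 1.0, zero = 0.0;
    double *A = calloc((size_t)n * n, sizeof *A);
    double *B = calloc((size_t)n * n, sizeof *B);
    double *C = calloc((size_t)n * n, sizeof *C);

    before = thread_count();
    dgemm_("N", "N", &n, &n, &n, &one, A, &n, B, &n, &zero, C, &n);
    after = thread_count();
    if (after > before)
      fprintf(stderr, "warning: BLAS spawned threads (%d -> %d)\n", before, after);
    free(A); free(B); free(C);
    return 0;
  }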
> > Barry > > >> On Dec 7, 2022, at 4:47 PM, Mark Lohry wrote: >> >> Thanks, yes, I figured out the OMP_NUM_THREADS=1 way while triaging it, and the --download-fblaslapack way occurred to me. >> >> I was hoping for something that "just worked" (refuse to build in this case) but I don't know if it's programmatically possible for petsc to tell whether or not it's linking to a threaded BLAS? >> >> On Wed, Dec 7, 2022 at 4:35 PM Satish Balay > wrote: >>> If you don't specify a blas to use - petsc configure will guess and use what it can find. >>> >>> So only way to force it use a particular blas is to specify one [one way is --download-fblaslapack] >>> >>> Wrt multi-thread openblas - you can force it run single threaded [by one of these 2 env variables] >>> >>> # Use single thread openblas >>> export OPENBLAS_NUM_THREADS=1 >>> export OMP_NUM_THREADS=1 >>> >>> Satish >>> >>> >>> On Wed, 7 Dec 2022, Mark Lohry wrote: >>> >>> > I ran into an unexpected issue -- on an NP-core machine, each MPI rank of >>> > my application was launching NP threads, such that when running with >>> > multiple ranks the machine was quickly oversubscribed and performance >>> > tanked. >>> > >>> > The root cause of this was petsc linking against the system-provided >>> > library (libopenblas0-pthread in this case) set by the update-alternatives >>> > in ubuntu. At some point this machine got updated to using the threaded >>> > blas implementation instead of serial; not sure how, and I wouldn't have >>> > noticed if I weren't running interactively. >>> > >>> > Is there any mechanism in petsc or its build system to prevent linking >>> > against an inappropriate BLAS, or do I need to be diligent about manually >>> > setting the BLAS library in the configuration stage? >>> > >>> > Thanks, >>> > Mark >>> > >>> From narnoldm at umich.edu Thu Dec 8 02:04:16 2022 From: narnoldm at umich.edu (Nicholas Arnold-Medabalimi) Date: Thu, 8 Dec 2022 03:04:16 -0500 Subject: [petsc-users] Petsc Section in DMPlex In-Reply-To: References: Message-ID: Hi Matt I think I've gotten it just about there. I'm just having an issue with the VecISCopy. I have an IS built that matches size correctly to map from the full state to the filtered state. The core issue I think, is should the expanded IS the ownership range of the vector subtracted out. Looking at the implementation, it looks like VecISCopy takes care of that for me. (Line 573 in src/vec/vec/utils/projection.c) But I could be mistaken. The error I am getting is: [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: No support for this operation for this object type [0]PETSC ERROR: Only owned values supported Here is what I am currently doing. call DMPlexFilter(dmplex_full, iBlankLabel, 1, dmplex_filtered,ierr) call DMPlexGetSubpointIS(dmplex_filtered, subpointsIS,ierr) ! adds section to dmplex_filtered and allocates vec_filtered using DMCreateGlobalVector call addSectionToDMPlex(dmplex_filtered,vec_filtered) ! 
Get Sections for dmplex_filtered and dmplex_full call DMGetGlobalSection(dmplex_filtered,filteredfieldSection,ierr) call DMGetGlobalSection(dmplex_full,fullfieldSection,ierr) call ISGetIndicesF90(subpointsIS, subPointKey,ierr) ExpandedIndexSize = 0 do i = 1, size(subPointKey) call PetscSectionGetDof(fullfieldSection, subPointKey(i), dof,ierr) ExpandedIndexSize = ExpandedIndexSize + dof enddo !Create expandedIS from offset sections of full and filtered sections allocate(ExpandedIndex(ExpandedIndexSize)) call VecGetOwnershipRange(vec_full,oStart,oEnd,ierr) do i = 1, size(subPointKey) call PetscSectionGetOffset(fullfieldSection, subPointKey(i), offset,ierr) call PetscSectionGetDof(fullfieldSection, subPointKey(i), dof,ierr) !offset=offset-oStart !looking at VecIScopy it takes care of this subtraction (not sure) do j = 1, (dof) ExpandedIndex((i-1)*dof+j) = offset+j end do enddo call ISCreateGeneral(PETSC_COMM_WORLD, ExpandedIndexSize, ExpandedIndex, PETSC_COPY_VALUES, expandedIS,ierr) call ISRestoreIndicesF90(subpointsIS, subPointKey,ierr) deallocate(ExpandedIndex) call VecGetLocalSize(vec_full,sizeVec,ierr) write(*,*) sizeVec call VecGetLocalSize(vec_filtered,sizeVec,ierr) write(*,*) sizeVec call ISGetLocalSize(expandedIS,sizeVec,ierr) write(*,*) sizeVec call PetscSynchronizedFlush(PETSC_COMM_WORLD,ierr) call VecISCopy(vec_full,expandedIS,SCATTER_REVERSE,vec_filtered,ierr) Thanks again for the great help. Sincerely Nicholas On Wed, Dec 7, 2022 at 9:29 PM Matthew Knepley wrote: > On Wed, Dec 7, 2022 at 9:21 PM Nicholas Arnold-Medabalimi < > narnoldm at umich.edu> wrote: > >> Thank you for the help. >> >> I think the last piece of the puzzle is how do I create the "expanded IS" >> from the subpoint IS using the section? >> > > Loop over the points in the IS. For each point, get the dof and offset > from the Section. Make a new IS that has all the > dogs, namely each run [offset, offset+dof). > > Thanks, > > Matt > > >> Sincerely >> Nicholas >> >> >> On Wed, Dec 7, 2022 at 7:06 AM Matthew Knepley wrote: >> >>> On Wed, Dec 7, 2022 at 6:51 AM Nicholas Arnold-Medabalimi < >>> narnoldm at umich.edu> wrote: >>> >>>> Hi >>>> >>>> Thank you so much for your patience. One thing to note: I don't have >>>> any need to go back from the filtered distributed mapping back to the full >>>> but it is good to know. >>>> >>>> One aside question. >>>> 1) Is natural and global ordering the same in this context? >>>> >>> >>> No. >>> >>> >>>> As far as implementing what you have described. >>>> >>>> When I call ISView on the generated SubpointIS, I get an unusual error >>>> which I'm not sure how to interpret. (this case is running on 2 ranks and >>>> the filter label has points located on both ranks of the original DM. >>>> However, if I manually get the indices (the commented lines), it seems to >>>> not have any issues. 
>>>> call DMPlexFilter(dmplex_full, iBlankLabel, 1, dmplex_filtered,ierr) >>>> call DMPlexGetSubpointIS(dmplex_filtered, subpointsIS,ierr) >>>> !call ISGetIndicesF90(subpointsIS, subPointKey,ierr) >>>> !write(*,*) subPointKey >>>> !call ISRestoreIndicesF90(subpointsIS, subPointKey,ierr) >>>> call ISView(subpointsIS,PETSC_VIEWER_STDOUT_WORLD,ierr) >>>> >>>> [1]PETSC ERROR: --------------------- Error Message >>>> -------------------------------------------------------------- >>>> [1]PETSC ERROR: Arguments must have same communicators >>>> [1]PETSC ERROR: Different communicators in the two objects: Argument # >>>> 1 and 2 flag 3 >>>> [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble >>>> shooting. >>>> [1]PETSC ERROR: Petsc Development GIT revision: >>>> v3.18.1-320-g7810d690132 GIT Date: 2022-11-20 20:25:41 -0600 >>>> [1]PETSC ERROR: Configure options with-fc=mpiifort >>>> with-mpi-f90=mpiifort --download-triangle --download-parmetis >>>> --download-metis --with-debugging=1 --download-hdf5 >>>> --prefix=/home/narnoldm/packages/petsc_install >>>> [1]PETSC ERROR: #1 ISView() at >>>> /home/narnoldm/packages/petsc/src/vec/is/is/interface/index.c:1629 >>>> >>> >>> The problem here is the subpointsIS is a _serial_ object, and you are >>> using a parallel viewer. You can use PETSC_VIEWER_STDOUT_SELF, >>> or you can pull out the singleton viewer from STDOUT_WORLD if you want >>> them all to print in order. >>> >>> >>>> As far as the overall process you have described my question on first >>>> glance is do I have to allocate/create the vector that is output by >>>> VecISCopy before calling it, or does it create the vector automatically? >>>> >>> >>> You create both vectors. I would do it using DMCreateGlobalVector() from >>> both DMs. >>> >>> >>>> I think I would need to create it first using a section and Setting the >>>> Vec in the filtered DM? >>>> >>> >>> Setting the Section in the filtered DM. >>> >>> >>>> And I presume in this case I would be using the scatter reverse option >>>> to go from the full set to the reduced set? >>>> >>> >>> Yes >>> >>> Thanks >>> >>> Matt >>> >>> >>>> Sincerely >>>> Nicholas >>>> >>>> >>>> >>>> >>>> >>>> >>>> Sincerely >>>> Nick >>>> >>>> On Wed, Dec 7, 2022 at 6:00 AM Matthew Knepley >>>> wrote: >>>> >>>>> On Wed, Dec 7, 2022 at 3:35 AM Nicholas Arnold-Medabalimi < >>>>> narnoldm at umich.edu> wrote: >>>>> >>>>>> Hi Matthew >>>>>> >>>>>> Thank you for the help. This clarified a great deal. >>>>>> >>>>>> I have a follow-up question related to DMPlexFilter. It may be better >>>>>> to describe what I'm trying to achieve. >>>>>> >>>>>> I have a general mesh I am solving which has a section with cell >>>>>> center finite volume states, as described in my initial email. After >>>>>> calculating some metrics, I tag a bunch of cells with an identifying Label >>>>>> and use DMFilter to generate a new DM which is only that subset of cells. >>>>>> Generally, this leads to a pretty unbalanced DM so I then plan to use >>>>>> DMPlexDIstribute to balance that DM across the processors. The coordinates >>>>>> pass along fine, but the state(or I should say Section) does not at least >>>>>> as far as I can tell. >>>>>> >>>>>> Assuming I can get a filtered DM I then distribute the DM and state >>>>>> using the method you described above and it seems to be working ok. >>>>>> >>>>>> The last connection I have to make is the transfer of information >>>>>> from the full mesh to the "sampled" filtered mesh. 
From what I can gather I >>>>>> would need to get the mapping of points using DMPlexGetSubpointIS and then >>>>>> manually copy the values from the full DM section to the filtered DM? I >>>>>> have the process from full->filtered->distributed all working for the >>>>>> coordinates so its just a matter of transferring the section correctly. >>>>>> >>>>>> I appreciate all the help you have provided. >>>>>> >>>>> >>>>> Let's do this in two steps, which makes it easier to debug. First, do >>>>> not redistribute the submesh. Just use DMPlexGetSubpointIS() >>>>> to get the mapping of filtered points to points in the original mesh. >>>>> Then create an expanded IS using the Section which makes >>>>> dofs in the filtered mesh to dofs in the original mesh. From this use >>>>> >>>>> https://petsc.org/main/docs/manualpages/Vec/VecISCopy/ >>>>> >>>>> to move values between the original vector and the filtered vector. >>>>> >>>>> Once that works, you can try redistributing the filtered mesh. Before >>>>> calling DMPlexDistribute() on the filtered mesh, you need to call >>>>> >>>>> https://petsc.org/main/docs/manualpages/DM/DMSetUseNatural/ >>>>> >>>>> When you redistribute, it will compute a mapping back to the original >>>>> layout. Now when you want to transfer values, you >>>>> >>>>> 1) Create a natural vector with DMCreateNaturalVec() >>>>> >>>>> 2) Use DMGlobalToNaturalBegin/End() to move values from the filtered >>>>> vector to the natural vector >>>>> >>>>> 3) Use VecISCopy() to move values from the natural vector to the >>>>> original vector >>>>> >>>>> Let me know if you have any problems. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Sincerely >>>>>> Nicholas >>>>>> >>>>>> >>>>>> >>>>>> On Mon, Nov 28, 2022 at 6:19 AM Matthew Knepley >>>>>> wrote: >>>>>> >>>>>>> On Sun, Nov 27, 2022 at 10:22 PM Nicholas Arnold-Medabalimi < >>>>>>> narnoldm at umich.edu> wrote: >>>>>>> >>>>>>>> Hi Petsc Users >>>>>>>> >>>>>>>> I have a question about properly using PetscSection to assign state >>>>>>>> variables to a DM. I have an existing DMPlex mesh distributed on 2 >>>>>>>> processors. My goal is to have state variables set to the cell centers. I >>>>>>>> then want to call DMPlexDistribute, which I hope will balance the mesh >>>>>>>> elements and hopefully transport the state variables to the hosting >>>>>>>> processors as the cells are distributed to a different processor count or >>>>>>>> simply just redistributing after doing mesh adaption. >>>>>>>> >>>>>>>> Looking at the DMPlex User guide, I should be able to achieve this >>>>>>>> with a single field section using SetDof and assigning the DOF to the >>>>>>>> points corresponding to cells. >>>>>>>> >>>>>>> >>>>>>> Note that if you want several different fields, you can clone the DM >>>>>>> first for this field >>>>>>> >>>>>>> call DMClone(dm,dmState,ierr) >>>>>>> >>>>>>> and use dmState in your calls below. 
>>>>>>> >>>>>>> >>>>>>>> >>>>>>>> call DMPlexGetHeightStratum(dm,0,c0,c1,ierr) >>>>>>>> call DMPlexGetChart(dm,p0,p1,ierr) >>>>>>>> call PetscSectionCreate(PETSC_COMM_WORLD,section,ierr) >>>>>>>> call PetscSectionSetNumFields(section,1,ierr) call >>>>>>>> PetscSectionSetChart(section,p0,p1,ierr) >>>>>>>> do i = c0, (c1-1) >>>>>>>> call PetscSectionSetDof(section,i,nvar,ierr) >>>>>>>> end do >>>>>>>> call PetscSectionSetup(section,ierr) >>>>>>>> call DMSetLocalSection(dm,section,ierr) >>>>>>>> >>>>>>> >>>>>>> In the loop, I would add a call to >>>>>>> >>>>>>> call PetscSectionSetFieldDof(section,i,0,nvar,ierr) >>>>>>> >>>>>>> This also puts in the field breakdown. It is not essential, but >>>>>>> nicer. >>>>>>> >>>>>>> >>>>>>>> From here, it looks like I can access and set the state vars using >>>>>>>> >>>>>>>> call DMGetGlobalVector(dmplex,state,ierr) >>>>>>>> call DMGetGlobalSection(dmplex,section,ierr) >>>>>>>> call VecGetArrayF90(state,stateVec,ierr) >>>>>>>> do i = c0, (c1-1) >>>>>>>> call PetscSectionGetOffset(section,i,offset,ierr) >>>>>>>> stateVec(offset:(offset+nvar))=state_i(:) !simplified assignment >>>>>>>> end do >>>>>>>> call VecRestoreArrayF90(state,stateVec,ierr) >>>>>>>> call DMRestoreGlobalVector(dmplex,state,ierr) >>>>>>>> >>>>>>>> To my understanding, I should be using Global vector since this is >>>>>>>> a pure assignment operation and I don't need the ghost cells. >>>>>>>> >>>>>>> >>>>>>> Yes. >>>>>>> >>>>>>> But the behavior I am seeing isn't exactly what I'd expect. >>>>>>>> >>>>>>>> To be honest, I'm somewhat unclear on a few things >>>>>>>> >>>>>>>> 1) Should be using nvar fields with 1 DOF each or 1 field with nvar >>>>>>>> DOFs or what the distinction between the two methods are? >>>>>>>> >>>>>>> >>>>>>> We have two divisions in a Section. A field can have a number of >>>>>>> components. This is intended to model a vector or tensor field. >>>>>>> Then a Section can have a number of fields, such as velocity and >>>>>>> pressure for a Stokes problem. The division is mainly to help the >>>>>>> user, so I would use the most natural one. >>>>>>> >>>>>>> >>>>>>>> 2) Adding a print statement after the offset assignment I get (on >>>>>>>> rank 0 of 2) >>>>>>>> cell 1 offset 0 >>>>>>>> cell 2 offset 18 >>>>>>>> cell 3 offset 36 >>>>>>>> which is expected and works but on rank 1 I get >>>>>>>> cell 1 offset 9000 >>>>>>>> cell 2 offset 9018 >>>>>>>> cell 3 offset 9036 >>>>>>>> >>>>>>>> which isn't exactly what I would expect. Shouldn't the offsets >>>>>>>> reset at 0 for the next rank? >>>>>>>> >>>>>>> >>>>>>> The local and global sections hold different information. This is >>>>>>> the source of the confusion. The local section does describe a local >>>>>>> vector, and thus includes overlap or "ghost" dofs. The global >>>>>>> section describes a global vector. However, it is intended to deliver >>>>>>> global indices, and thus the offsets give back global indices. When >>>>>>> you use VecGetArray*() you are getting out the local array, and >>>>>>> thus you have to subtract the first index on this process. You can >>>>>>> get that from >>>>>>> >>>>>>> VecGetOwnershipRange(v, &rstart, &rEnd); >>>>>>> >>>>>>> This is the same whether you are using DMDA or DMPlex or any other >>>>>>> DM. >>>>>>> >>>>>>> >>>>>>>> 3) Does calling DMPlexDistribute also distribute the section data >>>>>>>> associated with the DOF, based on the description in DMPlexDistribute it >>>>>>>> looks like it should? >>>>>>>> >>>>>>> >>>>>>> No. 
By default, DMPlexDistribute() only distributes coordinate data. >>>>>>> I you want to distribute your field, it would look something like this: >>>>>>> >>>>>>> DMPlexDistribute(dm, 0, &sfDist, &dmDist); >>>>>>> VecCreate(comm, &stateDist); >>>>>>> VecSetDM(sateDist, dmDist); >>>>>>> PetscSectionCreate(comm §ionDist); >>>>>>> DMSetLocalSection(dmDist, sectionDist); >>>>>>> DMPlexDistributeField(dmDist, sfDist, section, state, sectionDist, >>>>>>> stateDist); >>>>>>> >>>>>>> We do this in src/dm/impls/plex/tests/ex36.c >>>>>>> >>>>>>> THanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> I'd appreciate any insight into the specifics of this usage. I >>>>>>>> expect I have a misconception on the local vs global section. Thank you. >>>>>>>> >>>>>>>> Sincerely >>>>>>>> Nicholas >>>>>>>> >>>>>>>> -- >>>>>>>> Nicholas Arnold-Medabalimi >>>>>>>> >>>>>>>> Ph.D. Candidate >>>>>>>> Computational Aeroscience Lab >>>>>>>> University of Michigan >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Nicholas Arnold-Medabalimi >>>>>> >>>>>> Ph.D. Candidate >>>>>> Computational Aeroscience Lab >>>>>> University of Michigan >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>>> >>>> -- >>>> Nicholas Arnold-Medabalimi >>>> >>>> Ph.D. Candidate >>>> Computational Aeroscience Lab >>>> University of Michigan >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> >> >> -- >> Nicholas Arnold-Medabalimi >> >> Ph.D. Candidate >> Computational Aeroscience Lab >> University of Michigan >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- Nicholas Arnold-Medabalimi Ph.D. Candidate Computational Aeroscience Lab University of Michigan -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Dec 8 06:27:02 2022 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 8 Dec 2022 07:27:02 -0500 Subject: [petsc-users] Petsc Section in DMPlex In-Reply-To: References: Message-ID: On Thu, Dec 8, 2022 at 3:04 AM Nicholas Arnold-Medabalimi < narnoldm at umich.edu> wrote: > Hi Matt > > I think I've gotten it just about there. I'm just having an issue with the > VecISCopy. I have an IS built that matches size correctly to map from the > full state to the filtered state. The core issue I think, is should the > expanded IS the ownership range of the vector subtracted out. Looking at > the implementation, it looks like VecISCopy takes care of that for me. > (Line 573 in src/vec/vec/utils/projection.c) But I could be mistaken. > It is a good question. We have tried to give guidance on the manpage: The index set identifies entries in the global vector. 
Negative indices are skipped; indices outside the ownership range of vfull will raise an error. which means that it expects _global_ indices, and you have retrieved the global section, so that matches. The calculation of the index size looks right to me, and so does the index calculation. I would put a check in the loop, making sure that the calculated indices lie within [oStart, oEnd). The global section is designed to ensure that. It is not clear why one would lie outside. When I am debugging, I run a very small problem, and print out all the sections. Thanks, Matt > The error I am getting is: > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: No support for this operation for this object type > [0]PETSC ERROR: Only owned values supported > > > Here is what I am currently doing. > > call DMPlexFilter(dmplex_full, iBlankLabel, 1, dmplex_filtered,ierr) > call DMPlexGetSubpointIS(dmplex_filtered, subpointsIS,ierr) > > ! adds section to dmplex_filtered and allocates vec_filtered using > DMCreateGlobalVector > call addSectionToDMPlex(dmplex_filtered,vec_filtered) > > ! Get Sections for dmplex_filtered and dmplex_full > call DMGetGlobalSection(dmplex_filtered,filteredfieldSection,ierr) > call DMGetGlobalSection(dmplex_full,fullfieldSection,ierr) > > > > call ISGetIndicesF90(subpointsIS, subPointKey,ierr) > ExpandedIndexSize = 0 > do i = 1, size(subPointKey) > call PetscSectionGetDof(fullfieldSection, subPointKey(i), dof,ierr) > ExpandedIndexSize = ExpandedIndexSize + dof > enddo > > > !Create expandedIS from offset sections of full and filtered sections > allocate(ExpandedIndex(ExpandedIndexSize)) > call VecGetOwnershipRange(vec_full,oStart,oEnd,ierr) > do i = 1, size(subPointKey) > call PetscSectionGetOffset(fullfieldSection, subPointKey(i), > offset,ierr) > call PetscSectionGetDof(fullfieldSection, subPointKey(i), dof,ierr) > !offset=offset-oStart !looking at VecIScopy it takes care of this > subtraction (not sure) > do j = 1, (dof) > ExpandedIndex((i-1)*dof+j) = offset+j > end do > enddo > > call ISCreateGeneral(PETSC_COMM_WORLD, ExpandedIndexSize, ExpandedIndex, > PETSC_COPY_VALUES, expandedIS,ierr) > call ISRestoreIndicesF90(subpointsIS, subPointKey,ierr) > deallocate(ExpandedIndex) > > > call VecGetLocalSize(vec_full,sizeVec,ierr) > write(*,*) sizeVec > call VecGetLocalSize(vec_filtered,sizeVec,ierr) > write(*,*) sizeVec > call ISGetLocalSize(expandedIS,sizeVec,ierr) > write(*,*) sizeVec > call PetscSynchronizedFlush(PETSC_COMM_WORLD,ierr) > > > call VecISCopy(vec_full,expandedIS,SCATTER_REVERSE,vec_filtered,ierr) > > > Thanks again for the great help. > > Sincerely > Nicholas > > > On Wed, Dec 7, 2022 at 9:29 PM Matthew Knepley wrote: > >> On Wed, Dec 7, 2022 at 9:21 PM Nicholas Arnold-Medabalimi < >> narnoldm at umich.edu> wrote: >> >>> Thank you for the help. >>> >>> I think the last piece of the puzzle is how do I create the "expanded >>> IS" from the subpoint IS using the section? >>> >> >> Loop over the points in the IS. For each point, get the dof and offset >> from the Section. Make a new IS that has all the >> dogs, namely each run [offset, offset+dof). >> >> Thanks, >> >> Matt >> >> >>> Sincerely >>> Nicholas >>> >>> >>> On Wed, Dec 7, 2022 at 7:06 AM Matthew Knepley >>> wrote: >>> >>>> On Wed, Dec 7, 2022 at 6:51 AM Nicholas Arnold-Medabalimi < >>>> narnoldm at umich.edu> wrote: >>>> >>>>> Hi >>>>> >>>>> Thank you so much for your patience. 
One thing to note: I don't have >>>>> any need to go back from the filtered distributed mapping back to the full >>>>> but it is good to know. >>>>> >>>>> One aside question. >>>>> 1) Is natural and global ordering the same in this context? >>>>> >>>> >>>> No. >>>> >>>> >>>>> As far as implementing what you have described. >>>>> >>>>> When I call ISView on the generated SubpointIS, I get an unusual error >>>>> which I'm not sure how to interpret. (this case is running on 2 ranks and >>>>> the filter label has points located on both ranks of the original DM. >>>>> However, if I manually get the indices (the commented lines), it seems to >>>>> not have any issues. >>>>> call DMPlexFilter(dmplex_full, iBlankLabel, 1, dmplex_filtered,ierr) >>>>> call DMPlexGetSubpointIS(dmplex_filtered, subpointsIS,ierr) >>>>> !call ISGetIndicesF90(subpointsIS, subPointKey,ierr) >>>>> !write(*,*) subPointKey >>>>> !call ISRestoreIndicesF90(subpointsIS, subPointKey,ierr) >>>>> call ISView(subpointsIS,PETSC_VIEWER_STDOUT_WORLD,ierr) >>>>> >>>>> [1]PETSC ERROR: --------------------- Error Message >>>>> -------------------------------------------------------------- >>>>> [1]PETSC ERROR: Arguments must have same communicators >>>>> [1]PETSC ERROR: Different communicators in the two objects: Argument # >>>>> 1 and 2 flag 3 >>>>> [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble >>>>> shooting. >>>>> [1]PETSC ERROR: Petsc Development GIT revision: >>>>> v3.18.1-320-g7810d690132 GIT Date: 2022-11-20 20:25:41 -0600 >>>>> [1]PETSC ERROR: Configure options with-fc=mpiifort >>>>> with-mpi-f90=mpiifort --download-triangle --download-parmetis >>>>> --download-metis --with-debugging=1 --download-hdf5 >>>>> --prefix=/home/narnoldm/packages/petsc_install >>>>> [1]PETSC ERROR: #1 ISView() at >>>>> /home/narnoldm/packages/petsc/src/vec/is/is/interface/index.c:1629 >>>>> >>>> >>>> The problem here is the subpointsIS is a _serial_ object, and you are >>>> using a parallel viewer. You can use PETSC_VIEWER_STDOUT_SELF, >>>> or you can pull out the singleton viewer from STDOUT_WORLD if you want >>>> them all to print in order. >>>> >>>> >>>>> As far as the overall process you have described my question on first >>>>> glance is do I have to allocate/create the vector that is output by >>>>> VecISCopy before calling it, or does it create the vector automatically? >>>>> >>>> >>>> You create both vectors. I would do it using DMCreateGlobalVector() >>>> from both DMs. >>>> >>>> >>>>> I think I would need to create it first using a section and Setting >>>>> the Vec in the filtered DM? >>>>> >>>> >>>> Setting the Section in the filtered DM. >>>> >>>> >>>>> And I presume in this case I would be using the scatter reverse option >>>>> to go from the full set to the reduced set? >>>>> >>>> >>>> Yes >>>> >>>> Thanks >>>> >>>> Matt >>>> >>>> >>>>> Sincerely >>>>> Nicholas >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> Sincerely >>>>> Nick >>>>> >>>>> On Wed, Dec 7, 2022 at 6:00 AM Matthew Knepley >>>>> wrote: >>>>> >>>>>> On Wed, Dec 7, 2022 at 3:35 AM Nicholas Arnold-Medabalimi < >>>>>> narnoldm at umich.edu> wrote: >>>>>> >>>>>>> Hi Matthew >>>>>>> >>>>>>> Thank you for the help. This clarified a great deal. >>>>>>> >>>>>>> I have a follow-up question related to DMPlexFilter. It may be >>>>>>> better to describe what I'm trying to achieve. >>>>>>> >>>>>>> I have a general mesh I am solving which has a section with cell >>>>>>> center finite volume states, as described in my initial email. 
After >>>>>>> calculating some metrics, I tag a bunch of cells with an identifying Label >>>>>>> and use DMFilter to generate a new DM which is only that subset of cells. >>>>>>> Generally, this leads to a pretty unbalanced DM so I then plan to use >>>>>>> DMPlexDIstribute to balance that DM across the processors. The coordinates >>>>>>> pass along fine, but the state(or I should say Section) does not at least >>>>>>> as far as I can tell. >>>>>>> >>>>>>> Assuming I can get a filtered DM I then distribute the DM and state >>>>>>> using the method you described above and it seems to be working ok. >>>>>>> >>>>>>> The last connection I have to make is the transfer of information >>>>>>> from the full mesh to the "sampled" filtered mesh. From what I can gather I >>>>>>> would need to get the mapping of points using DMPlexGetSubpointIS and then >>>>>>> manually copy the values from the full DM section to the filtered DM? I >>>>>>> have the process from full->filtered->distributed all working for the >>>>>>> coordinates so its just a matter of transferring the section correctly. >>>>>>> >>>>>>> I appreciate all the help you have provided. >>>>>>> >>>>>> >>>>>> Let's do this in two steps, which makes it easier to debug. First, do >>>>>> not redistribute the submesh. Just use DMPlexGetSubpointIS() >>>>>> to get the mapping of filtered points to points in the original mesh. >>>>>> Then create an expanded IS using the Section which makes >>>>>> dofs in the filtered mesh to dofs in the original mesh. From this use >>>>>> >>>>>> https://petsc.org/main/docs/manualpages/Vec/VecISCopy/ >>>>>> >>>>>> to move values between the original vector and the filtered vector. >>>>>> >>>>>> Once that works, you can try redistributing the filtered mesh. Before >>>>>> calling DMPlexDistribute() on the filtered mesh, you need to call >>>>>> >>>>>> https://petsc.org/main/docs/manualpages/DM/DMSetUseNatural/ >>>>>> >>>>>> When you redistribute, it will compute a mapping back to the original >>>>>> layout. Now when you want to transfer values, you >>>>>> >>>>>> 1) Create a natural vector with DMCreateNaturalVec() >>>>>> >>>>>> 2) Use DMGlobalToNaturalBegin/End() to move values from the >>>>>> filtered vector to the natural vector >>>>>> >>>>>> 3) Use VecISCopy() to move values from the natural vector to the >>>>>> original vector >>>>>> >>>>>> Let me know if you have any problems. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Sincerely >>>>>>> Nicholas >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Mon, Nov 28, 2022 at 6:19 AM Matthew Knepley >>>>>>> wrote: >>>>>>> >>>>>>>> On Sun, Nov 27, 2022 at 10:22 PM Nicholas Arnold-Medabalimi < >>>>>>>> narnoldm at umich.edu> wrote: >>>>>>>> >>>>>>>>> Hi Petsc Users >>>>>>>>> >>>>>>>>> I have a question about properly using PetscSection to >>>>>>>>> assign state variables to a DM. I have an existing DMPlex mesh distributed >>>>>>>>> on 2 processors. My goal is to have state variables set to the cell >>>>>>>>> centers. I then want to call DMPlexDistribute, which I hope will balance >>>>>>>>> the mesh elements and hopefully transport the state variables to the >>>>>>>>> hosting processors as the cells are distributed to a different processor >>>>>>>>> count or simply just redistributing after doing mesh adaption. >>>>>>>>> >>>>>>>>> Looking at the DMPlex User guide, I should be able to achieve this >>>>>>>>> with a single field section using SetDof and assigning the DOF to the >>>>>>>>> points corresponding to cells. 
>>>>>>>>> >>>>>>>> >>>>>>>> Note that if you want several different fields, you can clone the >>>>>>>> DM first for this field >>>>>>>> >>>>>>>> call DMClone(dm,dmState,ierr) >>>>>>>> >>>>>>>> and use dmState in your calls below. >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> call DMPlexGetHeightStratum(dm,0,c0,c1,ierr) >>>>>>>>> call DMPlexGetChart(dm,p0,p1,ierr) >>>>>>>>> call PetscSectionCreate(PETSC_COMM_WORLD,section,ierr) >>>>>>>>> call PetscSectionSetNumFields(section,1,ierr) call >>>>>>>>> PetscSectionSetChart(section,p0,p1,ierr) >>>>>>>>> do i = c0, (c1-1) >>>>>>>>> call PetscSectionSetDof(section,i,nvar,ierr) >>>>>>>>> end do >>>>>>>>> call PetscSectionSetup(section,ierr) >>>>>>>>> call DMSetLocalSection(dm,section,ierr) >>>>>>>>> >>>>>>>> >>>>>>>> In the loop, I would add a call to >>>>>>>> >>>>>>>> call PetscSectionSetFieldDof(section,i,0,nvar,ierr) >>>>>>>> >>>>>>>> This also puts in the field breakdown. It is not essential, but >>>>>>>> nicer. >>>>>>>> >>>>>>>> >>>>>>>>> From here, it looks like I can access and set the state vars using >>>>>>>>> >>>>>>>>> call DMGetGlobalVector(dmplex,state,ierr) >>>>>>>>> call DMGetGlobalSection(dmplex,section,ierr) >>>>>>>>> call VecGetArrayF90(state,stateVec,ierr) >>>>>>>>> do i = c0, (c1-1) >>>>>>>>> call PetscSectionGetOffset(section,i,offset,ierr) >>>>>>>>> stateVec(offset:(offset+nvar))=state_i(:) !simplified assignment >>>>>>>>> end do >>>>>>>>> call VecRestoreArrayF90(state,stateVec,ierr) >>>>>>>>> call DMRestoreGlobalVector(dmplex,state,ierr) >>>>>>>>> >>>>>>>>> To my understanding, I should be using Global vector since this is >>>>>>>>> a pure assignment operation and I don't need the ghost cells. >>>>>>>>> >>>>>>>> >>>>>>>> Yes. >>>>>>>> >>>>>>>> But the behavior I am seeing isn't exactly what I'd expect. >>>>>>>>> >>>>>>>>> To be honest, I'm somewhat unclear on a few things >>>>>>>>> >>>>>>>>> 1) Should be using nvar fields with 1 DOF each or 1 field with >>>>>>>>> nvar DOFs or what the distinction between the two methods are? >>>>>>>>> >>>>>>>> >>>>>>>> We have two divisions in a Section. A field can have a number of >>>>>>>> components. This is intended to model a vector or tensor field. >>>>>>>> Then a Section can have a number of fields, such as velocity and >>>>>>>> pressure for a Stokes problem. The division is mainly to help the >>>>>>>> user, so I would use the most natural one. >>>>>>>> >>>>>>>> >>>>>>>>> 2) Adding a print statement after the offset assignment I get (on >>>>>>>>> rank 0 of 2) >>>>>>>>> cell 1 offset 0 >>>>>>>>> cell 2 offset 18 >>>>>>>>> cell 3 offset 36 >>>>>>>>> which is expected and works but on rank 1 I get >>>>>>>>> cell 1 offset 9000 >>>>>>>>> cell 2 offset 9018 >>>>>>>>> cell 3 offset 9036 >>>>>>>>> >>>>>>>>> which isn't exactly what I would expect. Shouldn't the offsets >>>>>>>>> reset at 0 for the next rank? >>>>>>>>> >>>>>>>> >>>>>>>> The local and global sections hold different information. This is >>>>>>>> the source of the confusion. The local section does describe a local >>>>>>>> vector, and thus includes overlap or "ghost" dofs. The global >>>>>>>> section describes a global vector. However, it is intended to deliver >>>>>>>> global indices, and thus the offsets give back global indices. When >>>>>>>> you use VecGetArray*() you are getting out the local array, and >>>>>>>> thus you have to subtract the first index on this process. 
You can >>>>>>>> get that from >>>>>>>> >>>>>>>> VecGetOwnershipRange(v, &rstart, &rEnd); >>>>>>>> >>>>>>>> This is the same whether you are using DMDA or DMPlex or any other >>>>>>>> DM. >>>>>>>> >>>>>>>> >>>>>>>>> 3) Does calling DMPlexDistribute also distribute the section data >>>>>>>>> associated with the DOF, based on the description in DMPlexDistribute it >>>>>>>>> looks like it should? >>>>>>>>> >>>>>>>> >>>>>>>> No. By default, DMPlexDistribute() only distributes coordinate >>>>>>>> data. I you want to distribute your field, it would look something like >>>>>>>> this: >>>>>>>> >>>>>>>> DMPlexDistribute(dm, 0, &sfDist, &dmDist); >>>>>>>> VecCreate(comm, &stateDist); >>>>>>>> VecSetDM(sateDist, dmDist); >>>>>>>> PetscSectionCreate(comm §ionDist); >>>>>>>> DMSetLocalSection(dmDist, sectionDist); >>>>>>>> DMPlexDistributeField(dmDist, sfDist, section, state, sectionDist, >>>>>>>> stateDist); >>>>>>>> >>>>>>>> We do this in src/dm/impls/plex/tests/ex36.c >>>>>>>> >>>>>>>> THanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> I'd appreciate any insight into the specifics of this usage. I >>>>>>>>> expect I have a misconception on the local vs global section. Thank you. >>>>>>>>> >>>>>>>>> Sincerely >>>>>>>>> Nicholas >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Nicholas Arnold-Medabalimi >>>>>>>>> >>>>>>>>> Ph.D. Candidate >>>>>>>>> Computational Aeroscience Lab >>>>>>>>> University of Michigan >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Nicholas Arnold-Medabalimi >>>>>>> >>>>>>> Ph.D. Candidate >>>>>>> Computational Aeroscience Lab >>>>>>> University of Michigan >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> Nicholas Arnold-Medabalimi >>>>> >>>>> Ph.D. Candidate >>>>> Computational Aeroscience Lab >>>>> University of Michigan >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >>> >>> -- >>> Nicholas Arnold-Medabalimi >>> >>> Ph.D. Candidate >>> Computational Aeroscience Lab >>> University of Michigan >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > > > -- > Nicholas Arnold-Medabalimi > > Ph.D. Candidate > Computational Aeroscience Lab > University of Michigan > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
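To make the offset bookkeeping discussed in this thread concrete, here is a small illustrative C sketch of writing nvar values per cell through the global section. SetCellState and the zero placeholder value are invented for the example; the pattern assumes the section was built as in the Fortran snippet earlier in the thread, with nvar dofs on each cell.

#include <petscdmplex.h>

static PetscErrorCode SetCellState(DM dm, PetscInt nvar)
{
  Vec          state;
  PetscSection gsec;
  PetscScalar *a;
  PetscInt     cStart, cEnd, rStart, rEnd;

  PetscFunctionBeginUser;
  PetscCall(DMGetGlobalVector(dm, &state));
  PetscCall(DMGetGlobalSection(dm, &gsec));
  PetscCall(VecGetOwnershipRange(state, &rStart, &rEnd));
  PetscCall(DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd));
  PetscCall(VecGetArray(state, &a));
  for (PetscInt c = cStart; c < cEnd; ++c) {
    PetscInt off;
    PetscCall(PetscSectionGetOffset(gsec, c, &off));
    if (off < 0) continue; /* in a global section, unowned points carry negative offsets */
    for (PetscInt d = 0; d < nvar; ++d) a[off - rStart + d] = 0.0; /* placeholder value */
  }
  PetscCall(VecRestoreArray(state, &a));
  PetscCall(DMRestoreGlobalVector(dm, &state));
  PetscFunctionReturn(0);
}

The local section's offsets, by contrast, index directly into the array obtained from a local vector, with no subtraction.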
URL: From facklerpw at ornl.gov Thu Dec 8 08:07:24 2022 From: facklerpw at ornl.gov (Fackler, Philip) Date: Thu, 8 Dec 2022 14:07:24 +0000 Subject: [petsc-users] [EXTERNAL] Re: Kokkos backend for Mat and Vec diverging when running on CUDA device. In-Reply-To: References: Message-ID: Great! Thank you! Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory ________________________________ From: Junchao Zhang Sent: Wednesday, December 7, 2022 18:47 To: Fackler, Philip Cc: xolotl-psi-development at lists.sourceforge.net ; petsc-users at mcs.anl.gov ; Blondel, Sophie ; Roth, Philip Subject: Re: [EXTERNAL] Re: [petsc-users] Kokkos backend for Mat and Vec diverging when running on CUDA device. Hi, Philip, I could reproduce the error. I need to find a way to debug it. Thanks. /home/jczhang/xolotl/test/system/SystemTestCase.cpp(317): fatal error: in "System/PSI_1": absolute value of diffNorm{0.19704848134353209} exceeds 1e-10 *** 1 failure is detected in the test module "Regression" --Junchao Zhang On Tue, Dec 6, 2022 at 10:10 AM Fackler, Philip > wrote: I think it would be simpler to use the develop branch for this issue. But you can still just build the SystemTester. Then (if you changed the PSI_1 case) run: ./test/system/SystemTester -t System/PSI_1 -- -v? (No need for multiple MPI ranks) Thanks, Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory ________________________________ From: Junchao Zhang > Sent: Monday, December 5, 2022 15:40 To: Fackler, Philip > Cc: xolotl-psi-development at lists.sourceforge.net >; petsc-users at mcs.anl.gov >; Blondel, Sophie >; Roth, Philip > Subject: Re: [EXTERNAL] Re: [petsc-users] Kokkos backend for Mat and Vec diverging when running on CUDA device. I configured with xolotl branch feature-petsc-kokkos, and typed `make` under ~/xolotl-build/. Though there were errors, a lot of *Tester were built. [ 62%] Built target xolotlViz [ 63%] Linking CXX executable TemperatureProfileHandlerTester [ 64%] Linking CXX executable TemperatureGradientHandlerTester [ 64%] Built target TemperatureProfileHandlerTester [ 64%] Built target TemperatureConstantHandlerTester [ 64%] Built target TemperatureGradientHandlerTester [ 65%] Linking CXX executable HeatEquationHandlerTester [ 65%] Built target HeatEquationHandlerTester [ 66%] Linking CXX executable FeFitFluxHandlerTester [ 66%] Linking CXX executable W111FitFluxHandlerTester [ 67%] Linking CXX executable FuelFitFluxHandlerTester [ 67%] Linking CXX executable W211FitFluxHandlerTester Which Tester should I use to run with the parameter file benchmarks/params_system_PSI_2.txt? And how many ranks should I use? Could you give an example command line? Thanks. --Junchao Zhang On Mon, Dec 5, 2022 at 2:22 PM Junchao Zhang > wrote: Hello, Philip, Do I still need to use the feature-petsc-kokkos branch? --Junchao Zhang On Mon, Dec 5, 2022 at 11:08 AM Fackler, Philip > wrote: Junchao, Thank you for working on this. If you open the parameter file for, say, the PSI_2 system test case (benchmarks/params_system_PSI_2.txt), simply add -dm_mat_type aijkokkos -dm_vec_type kokkos?` to the "petscArgs=" field (or the corresponding cusparse/cuda option). 
Thanks, Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory ________________________________ From: Junchao Zhang > Sent: Thursday, December 1, 2022 17:05 To: Fackler, Philip > Cc: xolotl-psi-development at lists.sourceforge.net >; petsc-users at mcs.anl.gov >; Blondel, Sophie >; Roth, Philip > Subject: Re: [EXTERNAL] Re: [petsc-users] Kokkos backend for Mat and Vec diverging when running on CUDA device. Hi, Philip, Sorry for the long delay. I could not get something useful from the -log_view output. Since I have already built xolotl, could you give me instructions on how to do a xolotl test to reproduce the divergence with petsc GPU backends (but fine on CPU)? Thank you. --Junchao Zhang On Wed, Nov 16, 2022 at 1:38 PM Fackler, Philip > wrote: ------------------------------------------------------------------ PETSc Performance Summary: ------------------------------------------------------------------ Unknown Name on a named PC0115427 with 1 processor, by 4pf Wed Nov 16 14:36:46 2022 Using Petsc Development GIT revision: v3.18.1-115-gdca010e0e9a GIT Date: 2022-10-28 14:39:41 +0000 Max Max/Min Avg Total Time (sec): 6.023e+00 1.000 6.023e+00 Objects: 1.020e+02 1.000 1.020e+02 Flops: 1.080e+09 1.000 1.080e+09 1.080e+09 Flops/sec: 1.793e+08 1.000 1.793e+08 1.793e+08 MPI Msg Count: 0.000e+00 0.000 0.000e+00 0.000e+00 MPI Msg Len (bytes): 0.000e+00 0.000 0.000e+00 0.000e+00 MPI Reductions: 0.000e+00 0.000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total Count %Total Avg %Total Count %Total 0: Main Stage: 6.0226e+00 100.0% 1.0799e+09 100.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent AvgLen: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors) CpuToGpu Count: total number of CPU to GPU copies per processor CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor) GpuToCpu Count: total number of GPU to CPU copies per processor GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor) GPU %F: percent flops on GPU in this event ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F ------------------------------------------------------------------------------------------------------------------------ --------------------------------------- --- Event Stage 0: Main Stage BuildTwoSided 3 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 DMCreateMat 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 SFSetGraph 3 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 SFSetUp 3 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 SFPack 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 SFUnpack 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecDot 190 1.0 nan nan 2.11e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecMDot 775 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecNorm 1728 1.0 nan nan 1.92e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecScale 1983 1.0 nan nan 6.24e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecCopy 780 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecSet 4955 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecAXPY 190 1.0 nan nan 2.11e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecAYPX 597 1.0 nan nan 6.64e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecAXPBYCZ 643 1.0 nan nan 1.79e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecWAXPY 502 1.0 nan nan 5.58e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecMAXPY 1159 1.0 nan nan 3.68e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 VecScatterBegin 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan -nan 2 5.14e-03 0 0.00e+00 0 VecScatterEnd 4647 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecReduceArith 380 1.0 nan nan 4.23e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan 
-nan 0 0.00e+00 0 0.00e+00 100 VecReduceComm 190 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 VecNormalize 965 1.0 nan nan 1.61e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 TSStep 20 1.0 5.8699e+00 1.0 1.08e+09 1.0 0.0e+00 0.0e+00 0.0e+00 97100 0 0 0 97100 0 0 0 184 -nan 2 5.14e-03 0 0.00e+00 54 TSFunctionEval 597 1.0 nan nan 6.64e+06 1.0 0.0e+00 0.0e+00 0.0e+00 63 1 0 0 0 63 1 0 0 0 -nan -nan 1 3.36e-04 0 0.00e+00 100 TSJacobianEval 190 1.0 nan nan 3.37e+07 1.0 0.0e+00 0.0e+00 0.0e+00 24 3 0 0 0 24 3 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 97 MatMult 1930 1.0 nan nan 4.46e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 41 0 0 0 1 41 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 MatMultTranspose 1 1.0 nan nan 3.44e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 MatSolve 965 1.0 nan nan 5.04e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatSOR 965 1.0 nan nan 3.33e+08 1.0 0.0e+00 0.0e+00 0.0e+00 4 31 0 0 0 4 31 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatLUFactorSym 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatLUFactorNum 190 1.0 nan nan 1.16e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 11 0 0 0 1 11 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatScale 190 1.0 nan nan 3.26e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 MatAssemblyBegin 761 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatAssemblyEnd 761 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatGetRowIJ 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatCreateSubMats 380 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatGetOrdering 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatZeroEntries 379 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatSetPreallCOO 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 MatSetValuesCOO 190 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 KSPSetUp 760 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 KSPSolve 190 1.0 5.8052e-01 1.0 9.30e+08 1.0 0.0e+00 0.0e+00 0.0e+00 10 86 0 0 0 10 86 0 0 0 1602 -nan 1 4.80e-03 0 0.00e+00 46 KSPGMRESOrthog 775 1.0 nan nan 2.27e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 2 0 0 0 1 2 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 SNESSolve 71 1.0 5.7117e+00 1.0 1.07e+09 1.0 0.0e+00 0.0e+00 0.0e+00 95 99 0 0 0 95 99 0 0 0 188 -nan 1 4.80e-03 0 0.00e+00 53 SNESSetUp 1 1.0 nan nan 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 SNESFunctionEval 573 1.0 nan nan 2.23e+07 1.0 0.0e+00 0.0e+00 0.0e+00 60 2 0 0 0 60 2 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 SNESJacobianEval 190 1.0 nan nan 3.37e+07 1.0 0.0e+00 0.0e+00 0.0e+00 24 3 0 0 0 24 3 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 97 SNESLineSearch 190 1.0 nan nan 1.05e+08 1.0 0.0e+00 0.0e+00 0.0e+00 53 10 0 0 0 53 10 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 100 PCSetUp 570 1.0 nan nan 1.16e+08 1.0 0.0e+00 0.0e+00 0.0e+00 2 11 0 0 0 2 11 0 0 0 -nan -nan 0 
0.00e+00 0 0.00e+00 0 PCApply 965 1.0 nan nan 6.14e+08 1.0 0.0e+00 0.0e+00 0.0e+00 8 57 0 0 0 8 57 0 0 0 -nan -nan 1 4.80e-03 0 0.00e+00 19 KSPSolve_FS_0 965 1.0 nan nan 3.33e+08 1.0 0.0e+00 0.0e+00 0.0e+00 4 31 0 0 0 4 31 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 KSPSolve_FS_1 965 1.0 nan nan 1.66e+08 1.0 0.0e+00 0.0e+00 0.0e+00 2 15 0 0 0 2 15 0 0 0 -nan -nan 0 0.00e+00 0 0.00e+00 0 --- Event Stage 1: Unknown ------------------------------------------------------------------------------------------------------------------------ --------------------------------------- Object Type Creations Destructions. Reports information only for process 0. --- Event Stage 0: Main Stage Container 5 5 Distributed Mesh 2 2 Index Set 11 11 IS L to G Mapping 1 1 Star Forest Graph 7 7 Discrete System 2 2 Weak Form 2 2 Vector 49 49 TSAdapt 1 1 TS 1 1 DMTS 1 1 SNES 1 1 DMSNES 3 3 SNESLineSearch 1 1 Krylov Solver 4 4 DMKSP interface 1 1 Matrix 4 4 Preconditioner 4 4 Viewer 2 1 --- Event Stage 1: Unknown ======================================================================================================================== Average time to get PetscTime(): 3.14e-08 #PETSc Option Table entries: -log_view -log_view_gpu_times #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with 64 bit PetscInt Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8 Configure options: PETSC_DIR=/home/4pf/repos/petsc PETSC_ARCH=arch-kokkos-cuda-no-tpls --with-cc=mpicc --with-cxx=mpicxx --with-fc=0 --with-cuda --with-debugging=0 --with-shared-libraries --prefix=/home/4pf/build/petsc/cuda-no-tpls/install --with-64-bit-indices --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 --CUDAOPTFLAGS=-O3 --with-kokkos-dir=/home/4pf/build/kokkos/cuda/install --with-kokkos-kernels-dir=/home/4pf/build/kokkos-kernels/cuda-no-tpls/install ----------------------------------------- Libraries compiled on 2022-11-01 21:01:08 on PC0115427 Machine characteristics: Linux-5.15.0-52-generic-x86_64-with-glibc2.35 Using PETSc directory: /home/4pf/build/petsc/cuda-no-tpls/install Using PETSc arch: ----------------------------------------- Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-stringop-overflow -fstack-protector -fvisibility=hidden -O3 ----------------------------------------- Using include paths: -I/home/4pf/build/petsc/cuda-no-tpls/install/include -I/home/4pf/build/kokkos-kernels/cuda-no-tpls/install/include -I/home/4pf/build/kokkos/cuda/install/include -I/usr/local/cuda-11.8/include ----------------------------------------- Using C linker: mpicc Using libraries: -Wl,-rpath,/home/4pf/build/petsc/cuda-no-tpls/install/lib -L/home/4pf/build/petsc/cuda-no-tpls/install/lib -lpetsc -Wl,-rpath,/home/4pf/build/kokkos-kernels/cuda-no-tpls/install/lib -L/home/4pf/build/kokkos-kernels/cuda-no-tpls/install/lib -Wl,-rpath,/home/4pf/build/kokkos/cuda/install/lib -L/home/4pf/build/kokkos/cuda/install/lib -Wl,-rpath,/usr/local/cuda-11.8/lib64 -L/usr/local/cuda-11.8/lib64 -L/usr/local/cuda-11.8/lib64/stubs -lkokkoskernels -lkokkoscontainers -lkokkoscore -llapack -lblas -lm -lcudart -lnvToolsExt -lcufft -lcublas -lcusparse -lcusolver -lcurand -lcuda -lquadmath -lstdc++ -ldl ----------------------------------------- Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory 
________________________________ From: Junchao Zhang > Sent: Tuesday, November 15, 2022 13:03 To: Fackler, Philip > Cc: xolotl-psi-development at lists.sourceforge.net >; petsc-users at mcs.anl.gov >; Blondel, Sophie >; Roth, Philip > Subject: Re: [EXTERNAL] Re: [petsc-users] Kokkos backend for Mat and Vec diverging when running on CUDA device. Can you paste -log_view result so I can see what functions are used? --Junchao Zhang On Tue, Nov 15, 2022 at 10:24 AM Fackler, Philip > wrote: Yes, most (but not all) of our system test cases fail with the kokkos/cuda or cuda backends. All of them pass with the CPU-only kokkos backend. Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory ________________________________ From: Junchao Zhang > Sent: Monday, November 14, 2022 19:34 To: Fackler, Philip > Cc: xolotl-psi-development at lists.sourceforge.net >; petsc-users at mcs.anl.gov >; Blondel, Sophie >; Zhang, Junchao >; Roth, Philip > Subject: [EXTERNAL] Re: [petsc-users] Kokkos backend for Mat and Vec diverging when running on CUDA device. Hi, Philip, Sorry to hear that. It seems you could run the same code on CPUs but not no GPUs (with either petsc/Kokkos backend or petsc/cuda backend, is it right? --Junchao Zhang On Mon, Nov 14, 2022 at 12:13 PM Fackler, Philip via petsc-users > wrote: This is an issue I've brought up before (and discussed in-person with Richard). I wanted to bring it up again because I'm hitting the limits of what I know to do, and I need help figuring this out. The problem can be reproduced using Xolotl's "develop" branch built against a petsc build with kokkos and kokkos-kernels enabled. Then, either add the relevant kokkos options to the "petscArgs=" line in the system test parameter file(s), or just replace the system test parameter files with the ones from the "feature-petsc-kokkos" branch. See here the files that begin with "params_system_". Note that those files use the "kokkos" options, but the problem is similar using the corresponding cuda/cusparse options. I've already tried building kokkos-kernels with no TPLs and got slightly different results, but the same problem. Any help would be appreciated. Thanks, Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: From narnoldm at umich.edu Thu Dec 8 09:48:27 2022 From: narnoldm at umich.edu (Nicholas Arnold-Medabalimi) Date: Thu, 8 Dec 2022 10:48:27 -0500 Subject: [petsc-users] Petsc Section in DMPlex In-Reply-To: References: Message-ID: Hi Matt Thanks. Found the issue just messed up the Fortran to C indexing. Another question. I have been using the Petsc VTK output to view things. In some previous efforts, I used the PetscFVM object to set up my section data. When I output vectors using that method in ParaView, I could view the rank information and the state vector to visualize. As far as I can tell when I do the same with Vectors that were created with my manually created section the VecView using PETSCVIEWERVTK only mesh is output with the Rank distribution is viewable (basically as if I had done a DM output instead of a Vec output). My guess is this is because I'm not setting something in my section Field setup properly for the VTK viewer to output it? 
For my section set up, I am calling PetscSectionCreate PetscSectionSetChart PetscSectionSetFieldName and then the appropriate PetscSectionSetDof PetscSectionSetup DMSetLocalSection Thanks again for all the help. Sincerely Nicholas On Thu, Dec 8, 2022 at 7:27 AM Matthew Knepley wrote: > On Thu, Dec 8, 2022 at 3:04 AM Nicholas Arnold-Medabalimi < > narnoldm at umich.edu> wrote: > >> Hi Matt >> >> I think I've gotten it just about there. I'm just having an issue with >> the VecISCopy. I have an IS built that matches size correctly to map from >> the full state to the filtered state. The core issue I think, is should the >> expanded IS the ownership range of the vector subtracted out. Looking at >> the implementation, it looks like VecISCopy takes care of that for me. >> (Line 573 in src/vec/vec/utils/projection.c) But I could be mistaken. >> > > It is a good question. We have tried to give guidance on the manpage: > > The index set identifies entries in the global vector. Negative indices > are skipped; indices outside the ownership range of vfull will raise an > error. > > which means that it expects _global_ indices, and you have retrieved the > global section, so that matches. > The calculation of the index size looks right to me, and so does the index > calculation. > > I would put a check in the loop, making sure that the calculated indices > lie within [oStart, oEnd). The global > section is designed to ensure that. It is not clear why one would lie > outside. > > When I am debugging, I run a very small problem, and print out all the > sections. > > Thanks, > > Matt > > >> The error I am getting is: >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [0]PETSC ERROR: No support for this operation for this object type >> [0]PETSC ERROR: Only owned values supported >> >> >> Here is what I am currently doing. >> >> call DMPlexFilter(dmplex_full, iBlankLabel, 1, dmplex_filtered,ierr) >> call DMPlexGetSubpointIS(dmplex_filtered, subpointsIS,ierr) >> >> ! adds section to dmplex_filtered and allocates vec_filtered using >> DMCreateGlobalVector >> call addSectionToDMPlex(dmplex_filtered,vec_filtered) >> >> ! 
Get Sections for dmplex_filtered and dmplex_full >> call DMGetGlobalSection(dmplex_filtered,filteredfieldSection,ierr) >> call DMGetGlobalSection(dmplex_full,fullfieldSection,ierr) >> >> >> >> call ISGetIndicesF90(subpointsIS, subPointKey,ierr) >> ExpandedIndexSize = 0 >> do i = 1, size(subPointKey) >> call PetscSectionGetDof(fullfieldSection, subPointKey(i), dof,ierr) >> ExpandedIndexSize = ExpandedIndexSize + dof >> enddo >> >> >> !Create expandedIS from offset sections of full and filtered sections >> allocate(ExpandedIndex(ExpandedIndexSize)) >> call VecGetOwnershipRange(vec_full,oStart,oEnd,ierr) >> do i = 1, size(subPointKey) >> call PetscSectionGetOffset(fullfieldSection, subPointKey(i), >> offset,ierr) >> call PetscSectionGetDof(fullfieldSection, subPointKey(i), dof,ierr) >> !offset=offset-oStart !looking at VecIScopy it takes care of this >> subtraction (not sure) >> do j = 1, (dof) >> ExpandedIndex((i-1)*dof+j) = offset+j >> end do >> enddo >> >> call ISCreateGeneral(PETSC_COMM_WORLD, ExpandedIndexSize, ExpandedIndex, >> PETSC_COPY_VALUES, expandedIS,ierr) >> call ISRestoreIndicesF90(subpointsIS, subPointKey,ierr) >> deallocate(ExpandedIndex) >> >> >> call VecGetLocalSize(vec_full,sizeVec,ierr) >> write(*,*) sizeVec >> call VecGetLocalSize(vec_filtered,sizeVec,ierr) >> write(*,*) sizeVec >> call ISGetLocalSize(expandedIS,sizeVec,ierr) >> write(*,*) sizeVec >> call PetscSynchronizedFlush(PETSC_COMM_WORLD,ierr) >> >> >> call VecISCopy(vec_full,expandedIS,SCATTER_REVERSE,vec_filtered,ierr) >> >> >> Thanks again for the great help. >> >> Sincerely >> Nicholas >> >> >> On Wed, Dec 7, 2022 at 9:29 PM Matthew Knepley wrote: >> >>> On Wed, Dec 7, 2022 at 9:21 PM Nicholas Arnold-Medabalimi < >>> narnoldm at umich.edu> wrote: >>> >>>> Thank you for the help. >>>> >>>> I think the last piece of the puzzle is how do I create the "expanded >>>> IS" from the subpoint IS using the section? >>>> >>> >>> Loop over the points in the IS. For each point, get the dof and offset >>> from the Section. Make a new IS that has all the >>> dogs, namely each run [offset, offset+dof). >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Sincerely >>>> Nicholas >>>> >>>> >>>> On Wed, Dec 7, 2022 at 7:06 AM Matthew Knepley >>>> wrote: >>>> >>>>> On Wed, Dec 7, 2022 at 6:51 AM Nicholas Arnold-Medabalimi < >>>>> narnoldm at umich.edu> wrote: >>>>> >>>>>> Hi >>>>>> >>>>>> Thank you so much for your patience. One thing to note: I don't have >>>>>> any need to go back from the filtered distributed mapping back to the full >>>>>> but it is good to know. >>>>>> >>>>>> One aside question. >>>>>> 1) Is natural and global ordering the same in this context? >>>>>> >>>>> >>>>> No. >>>>> >>>>> >>>>>> As far as implementing what you have described. >>>>>> >>>>>> When I call ISView on the generated SubpointIS, I get an unusual >>>>>> error which I'm not sure how to interpret. (this case is running on 2 ranks >>>>>> and the filter label has points located on both ranks of the original DM. >>>>>> However, if I manually get the indices (the commented lines), it seems to >>>>>> not have any issues. 
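Following the suggestion above to check the computed indices, here is a small illustrative guard of the kind Matt describes, run just before VecISCopy(); CheckExpandedIS is a made-up helper name. The "Only owned values supported" error and the "Fortran to C indexing" fix mentioned elsewhere in the thread appear to point at the same thing: PETSc offsets are 0-based, so the expanded entries should be offset + 0 .. offset + dof-1 (in Fortran, offset + (j-1) for j = 1 .. dof).

#include <petscvec.h>

static PetscErrorCode CheckExpandedIS(Vec vecFull, IS expandedIS)
{
  const PetscInt *idx;
  PetscInt        n, oStart, oEnd;

  PetscFunctionBeginUser;
  PetscCall(VecGetOwnershipRange(vecFull, &oStart, &oEnd));
  PetscCall(ISGetLocalSize(expandedIS, &n));
  PetscCall(ISGetIndices(expandedIS, &idx));
  for (PetscInt i = 0; i < n; ++i) {
    /* every entry must be a global index owned by this rank */
    PetscCheck(idx[i] >= oStart && idx[i] < oEnd, PETSC_COMM_SELF, PETSC_ERR_ARG_OUTOFRANGE,
               "entry %" PetscInt_FMT " = %" PetscInt_FMT " lies outside [%" PetscInt_FMT ", %" PetscInt_FMT ")",
               i, idx[i], oStart, oEnd);
  }
  PetscCall(ISRestoreIndices(expandedIS, &idx));
  PetscFunctionReturn(0);
}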
>>>>>> call DMPlexFilter(dmplex_full, iBlankLabel, 1, dmplex_filtered,ierr) >>>>>> call DMPlexGetSubpointIS(dmplex_filtered, subpointsIS,ierr) >>>>>> !call ISGetIndicesF90(subpointsIS, subPointKey,ierr) >>>>>> !write(*,*) subPointKey >>>>>> !call ISRestoreIndicesF90(subpointsIS, subPointKey,ierr) >>>>>> call ISView(subpointsIS,PETSC_VIEWER_STDOUT_WORLD,ierr) >>>>>> >>>>>> [1]PETSC ERROR: --------------------- Error Message >>>>>> -------------------------------------------------------------- >>>>>> [1]PETSC ERROR: Arguments must have same communicators >>>>>> [1]PETSC ERROR: Different communicators in the two objects: Argument >>>>>> # 1 and 2 flag 3 >>>>>> [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble >>>>>> shooting. >>>>>> [1]PETSC ERROR: Petsc Development GIT revision: >>>>>> v3.18.1-320-g7810d690132 GIT Date: 2022-11-20 20:25:41 -0600 >>>>>> [1]PETSC ERROR: Configure options with-fc=mpiifort >>>>>> with-mpi-f90=mpiifort --download-triangle --download-parmetis >>>>>> --download-metis --with-debugging=1 --download-hdf5 >>>>>> --prefix=/home/narnoldm/packages/petsc_install >>>>>> [1]PETSC ERROR: #1 ISView() at >>>>>> /home/narnoldm/packages/petsc/src/vec/is/is/interface/index.c:1629 >>>>>> >>>>> >>>>> The problem here is the subpointsIS is a _serial_ object, and you are >>>>> using a parallel viewer. You can use PETSC_VIEWER_STDOUT_SELF, >>>>> or you can pull out the singleton viewer from STDOUT_WORLD if you want >>>>> them all to print in order. >>>>> >>>>> >>>>>> As far as the overall process you have described my question on first >>>>>> glance is do I have to allocate/create the vector that is output by >>>>>> VecISCopy before calling it, or does it create the vector automatically? >>>>>> >>>>> >>>>> You create both vectors. I would do it using DMCreateGlobalVector() >>>>> from both DMs. >>>>> >>>>> >>>>>> I think I would need to create it first using a section and Setting >>>>>> the Vec in the filtered DM? >>>>>> >>>>> >>>>> Setting the Section in the filtered DM. >>>>> >>>>> >>>>>> And I presume in this case I would be using the scatter reverse >>>>>> option to go from the full set to the reduced set? >>>>>> >>>>> >>>>> Yes >>>>> >>>>> Thanks >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Sincerely >>>>>> Nicholas >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> Sincerely >>>>>> Nick >>>>>> >>>>>> On Wed, Dec 7, 2022 at 6:00 AM Matthew Knepley >>>>>> wrote: >>>>>> >>>>>>> On Wed, Dec 7, 2022 at 3:35 AM Nicholas Arnold-Medabalimi < >>>>>>> narnoldm at umich.edu> wrote: >>>>>>> >>>>>>>> Hi Matthew >>>>>>>> >>>>>>>> Thank you for the help. This clarified a great deal. >>>>>>>> >>>>>>>> I have a follow-up question related to DMPlexFilter. It may be >>>>>>>> better to describe what I'm trying to achieve. >>>>>>>> >>>>>>>> I have a general mesh I am solving which has a section with cell >>>>>>>> center finite volume states, as described in my initial email. After >>>>>>>> calculating some metrics, I tag a bunch of cells with an identifying Label >>>>>>>> and use DMFilter to generate a new DM which is only that subset of cells. >>>>>>>> Generally, this leads to a pretty unbalanced DM so I then plan to use >>>>>>>> DMPlexDIstribute to balance that DM across the processors. The coordinates >>>>>>>> pass along fine, but the state(or I should say Section) does not at least >>>>>>>> as far as I can tell. >>>>>>>> >>>>>>>> Assuming I can get a filtered DM I then distribute the DM and state >>>>>>>> using the method you described above and it seems to be working ok. 
>>>>>>>> >>>>>>>> The last connection I have to make is the transfer of information >>>>>>>> from the full mesh to the "sampled" filtered mesh. From what I can gather I >>>>>>>> would need to get the mapping of points using DMPlexGetSubpointIS and then >>>>>>>> manually copy the values from the full DM section to the filtered DM? I >>>>>>>> have the process from full->filtered->distributed all working for the >>>>>>>> coordinates so its just a matter of transferring the section correctly. >>>>>>>> >>>>>>>> I appreciate all the help you have provided. >>>>>>>> >>>>>>> >>>>>>> Let's do this in two steps, which makes it easier to debug. First, >>>>>>> do not redistribute the submesh. Just use DMPlexGetSubpointIS() >>>>>>> to get the mapping of filtered points to points in the original >>>>>>> mesh. Then create an expanded IS using the Section which makes >>>>>>> dofs in the filtered mesh to dofs in the original mesh. From this use >>>>>>> >>>>>>> https://petsc.org/main/docs/manualpages/Vec/VecISCopy/ >>>>>>> >>>>>>> to move values between the original vector and the filtered vector. >>>>>>> >>>>>>> Once that works, you can try redistributing the filtered mesh. >>>>>>> Before calling DMPlexDistribute() on the filtered mesh, you need to call >>>>>>> >>>>>>> https://petsc.org/main/docs/manualpages/DM/DMSetUseNatural/ >>>>>>> >>>>>>> When you redistribute, it will compute a mapping back to the >>>>>>> original layout. Now when you want to transfer values, you >>>>>>> >>>>>>> 1) Create a natural vector with DMCreateNaturalVec() >>>>>>> >>>>>>> 2) Use DMGlobalToNaturalBegin/End() to move values from the >>>>>>> filtered vector to the natural vector >>>>>>> >>>>>>> 3) Use VecISCopy() to move values from the natural vector to the >>>>>>> original vector >>>>>>> >>>>>>> Let me know if you have any problems. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> Sincerely >>>>>>>> Nicholas >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Mon, Nov 28, 2022 at 6:19 AM Matthew Knepley >>>>>>>> wrote: >>>>>>>> >>>>>>>>> On Sun, Nov 27, 2022 at 10:22 PM Nicholas Arnold-Medabalimi < >>>>>>>>> narnoldm at umich.edu> wrote: >>>>>>>>> >>>>>>>>>> Hi Petsc Users >>>>>>>>>> >>>>>>>>>> I have a question about properly using PetscSection to >>>>>>>>>> assign state variables to a DM. I have an existing DMPlex mesh distributed >>>>>>>>>> on 2 processors. My goal is to have state variables set to the cell >>>>>>>>>> centers. I then want to call DMPlexDistribute, which I hope will balance >>>>>>>>>> the mesh elements and hopefully transport the state variables to the >>>>>>>>>> hosting processors as the cells are distributed to a different processor >>>>>>>>>> count or simply just redistributing after doing mesh adaption. >>>>>>>>>> >>>>>>>>>> Looking at the DMPlex User guide, I should be able to achieve >>>>>>>>>> this with a single field section using SetDof and assigning the DOF to the >>>>>>>>>> points corresponding to cells. >>>>>>>>>> >>>>>>>>> >>>>>>>>> Note that if you want several different fields, you can clone the >>>>>>>>> DM first for this field >>>>>>>>> >>>>>>>>> call DMClone(dm,dmState,ierr) >>>>>>>>> >>>>>>>>> and use dmState in your calls below. 
>>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> call DMPlexGetHeightStratum(dm,0,c0,c1,ierr) >>>>>>>>>> call DMPlexGetChart(dm,p0,p1,ierr) >>>>>>>>>> call PetscSectionCreate(PETSC_COMM_WORLD,section,ierr) >>>>>>>>>> call PetscSectionSetNumFields(section,1,ierr) call >>>>>>>>>> PetscSectionSetChart(section,p0,p1,ierr) >>>>>>>>>> do i = c0, (c1-1) >>>>>>>>>> call PetscSectionSetDof(section,i,nvar,ierr) >>>>>>>>>> end do >>>>>>>>>> call PetscSectionSetup(section,ierr) >>>>>>>>>> call DMSetLocalSection(dm,section,ierr) >>>>>>>>>> >>>>>>>>> >>>>>>>>> In the loop, I would add a call to >>>>>>>>> >>>>>>>>> call PetscSectionSetFieldDof(section,i,0,nvar,ierr) >>>>>>>>> >>>>>>>>> This also puts in the field breakdown. It is not essential, but >>>>>>>>> nicer. >>>>>>>>> >>>>>>>>> >>>>>>>>>> From here, it looks like I can access and set the state vars >>>>>>>>>> using >>>>>>>>>> >>>>>>>>>> call DMGetGlobalVector(dmplex,state,ierr) >>>>>>>>>> call DMGetGlobalSection(dmplex,section,ierr) >>>>>>>>>> call VecGetArrayF90(state,stateVec,ierr) >>>>>>>>>> do i = c0, (c1-1) >>>>>>>>>> call PetscSectionGetOffset(section,i,offset,ierr) >>>>>>>>>> stateVec(offset:(offset+nvar))=state_i(:) !simplified assignment >>>>>>>>>> end do >>>>>>>>>> call VecRestoreArrayF90(state,stateVec,ierr) >>>>>>>>>> call DMRestoreGlobalVector(dmplex,state,ierr) >>>>>>>>>> >>>>>>>>>> To my understanding, I should be using Global vector since this >>>>>>>>>> is a pure assignment operation and I don't need the ghost cells. >>>>>>>>>> >>>>>>>>> >>>>>>>>> Yes. >>>>>>>>> >>>>>>>>> But the behavior I am seeing isn't exactly what I'd expect. >>>>>>>>>> >>>>>>>>>> To be honest, I'm somewhat unclear on a few things >>>>>>>>>> >>>>>>>>>> 1) Should be using nvar fields with 1 DOF each or 1 field with >>>>>>>>>> nvar DOFs or what the distinction between the two methods are? >>>>>>>>>> >>>>>>>>> >>>>>>>>> We have two divisions in a Section. A field can have a number of >>>>>>>>> components. This is intended to model a vector or tensor field. >>>>>>>>> Then a Section can have a number of fields, such as velocity and >>>>>>>>> pressure for a Stokes problem. The division is mainly to help the >>>>>>>>> user, so I would use the most natural one. >>>>>>>>> >>>>>>>>> >>>>>>>>>> 2) Adding a print statement after the offset assignment I get (on >>>>>>>>>> rank 0 of 2) >>>>>>>>>> cell 1 offset 0 >>>>>>>>>> cell 2 offset 18 >>>>>>>>>> cell 3 offset 36 >>>>>>>>>> which is expected and works but on rank 1 I get >>>>>>>>>> cell 1 offset 9000 >>>>>>>>>> cell 2 offset 9018 >>>>>>>>>> cell 3 offset 9036 >>>>>>>>>> >>>>>>>>>> which isn't exactly what I would expect. Shouldn't the offsets >>>>>>>>>> reset at 0 for the next rank? >>>>>>>>>> >>>>>>>>> >>>>>>>>> The local and global sections hold different information. This is >>>>>>>>> the source of the confusion. The local section does describe a local >>>>>>>>> vector, and thus includes overlap or "ghost" dofs. The global >>>>>>>>> section describes a global vector. However, it is intended to deliver >>>>>>>>> global indices, and thus the offsets give back global indices. >>>>>>>>> When you use VecGetArray*() you are getting out the local array, and >>>>>>>>> thus you have to subtract the first index on this process. You can >>>>>>>>> get that from >>>>>>>>> >>>>>>>>> VecGetOwnershipRange(v, &rstart, &rEnd); >>>>>>>>> >>>>>>>>> This is the same whether you are using DMDA or DMPlex or any other >>>>>>>>> DM. 
>>>>>>>>> >>>>>>>>> >>>>>>>>>> 3) Does calling DMPlexDistribute also distribute the section data >>>>>>>>>> associated with the DOF, based on the description in DMPlexDistribute it >>>>>>>>>> looks like it should? >>>>>>>>>> >>>>>>>>> >>>>>>>>> No. By default, DMPlexDistribute() only distributes coordinate >>>>>>>>> data. I you want to distribute your field, it would look something like >>>>>>>>> this: >>>>>>>>> >>>>>>>>> DMPlexDistribute(dm, 0, &sfDist, &dmDist); >>>>>>>>> VecCreate(comm, &stateDist); >>>>>>>>> VecSetDM(sateDist, dmDist); >>>>>>>>> PetscSectionCreate(comm §ionDist); >>>>>>>>> DMSetLocalSection(dmDist, sectionDist); >>>>>>>>> DMPlexDistributeField(dmDist, sfDist, section, state, sectionDist, >>>>>>>>> stateDist); >>>>>>>>> >>>>>>>>> We do this in src/dm/impls/plex/tests/ex36.c >>>>>>>>> >>>>>>>>> THanks, >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>>> I'd appreciate any insight into the specifics of this usage. I >>>>>>>>>> expect I have a misconception on the local vs global section. Thank you. >>>>>>>>>> >>>>>>>>>> Sincerely >>>>>>>>>> Nicholas >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Nicholas Arnold-Medabalimi >>>>>>>>>> >>>>>>>>>> Ph.D. Candidate >>>>>>>>>> Computational Aeroscience Lab >>>>>>>>>> University of Michigan >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>> experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Nicholas Arnold-Medabalimi >>>>>>>> >>>>>>>> Ph.D. Candidate >>>>>>>> Computational Aeroscience Lab >>>>>>>> University of Michigan >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Nicholas Arnold-Medabalimi >>>>>> >>>>>> Ph.D. Candidate >>>>>> Computational Aeroscience Lab >>>>>> University of Michigan >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>>> >>>> -- >>>> Nicholas Arnold-Medabalimi >>>> >>>> Ph.D. Candidate >>>> Computational Aeroscience Lab >>>> University of Michigan >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> >> >> -- >> Nicholas Arnold-Medabalimi >> >> Ph.D. Candidate >> Computational Aeroscience Lab >> University of Michigan >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- Nicholas Arnold-Medabalimi Ph.D. Candidate Computational Aeroscience Lab University of Michigan -------------- next part -------------- An HTML attachment was scrubbed... 
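The DMPlexDistributeField() sequence quoted above appears to carry a couple of transcription garbles ("sateDist" for stateDist, "§ionDist" for &sectionDist). A cleaned-up illustrative version of the same sequence, wrapped in a made-up helper; see src/dm/impls/plex/tests/ex36.c for the real usage:

#include <petscdmplex.h>

static PetscErrorCode DistributeState(DM dm, PetscSection section, Vec state, DM *dmDist, Vec *stateDist)
{
  PetscSF      sfDist;
  PetscSection sectionDist;
  MPI_Comm     comm;

  PetscFunctionBeginUser;
  PetscCall(PetscObjectGetComm((PetscObject)dm, &comm));
  PetscCall(DMPlexDistribute(dm, 0, &sfDist, dmDist));
  PetscCall(VecCreate(comm, stateDist));
  PetscCall(VecSetDM(*stateDist, *dmDist));
  PetscCall(PetscSectionCreate(comm, &sectionDist));
  PetscCall(DMSetLocalSection(*dmDist, sectionDist));
  PetscCall(DMPlexDistributeField(*dmDist, sfDist, section, state, sectionDist, *stateDist));
  PetscCall(PetscSFDestroy(&sfDist));
  PetscFunctionReturn(0);
}

As the original snippet implies, DMPlexDistributeField() fills in the layout of sectionDist and the values of stateDist from the originals; the caller only creates the empty shells.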
URL: From bsmith at petsc.dev Thu Dec 8 10:49:52 2022 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 8 Dec 2022 11:49:52 -0500 Subject: [petsc-users] prevent linking to multithreaded BLAS? In-Reply-To: <87edtaze6w.fsf@jedbrown.org> References: <05539ceb-d9ec-7b03-4344-1f0cbfe57bd3@mcs.anl.gov> <35A089AA-4AB6-4450-A348-95DBFAA4F68E@petsc.dev> <87edtaze6w.fsf@jedbrown.org> Message-ID: <93CA8288-F666-4902-9289-6B20BD9CE249@petsc.dev> > On Dec 7, 2022, at 11:56 PM, Jed Brown wrote: > > It isn't always wrong to link threaded BLAS. For example, a user might need to call threaded BLAS on the side (but the application can only link one) or a sparse direct solver might want threading for the supernode. Indeed, the user asked specifically for their work flow if configure could, based on additional configure argument, ensure that they did not get a threaded BLAS; they did not ask that configure never give a threaded BLAS or even that it give a non-threaded BLAS by default. > We could test at runtime whether child threads exist/are created when calling BLAS and deliver a warning. How does one test for this? Some standard Unix API for checking this? > > Barry Smith writes: > >> There would need to be, for example, some symbol in all the threaded BLAS libraries that is not in the unthreaded libraries. Of at least in some of the threaded libraries but never in the unthreaded. >> >> BlasLapack.py could check for the special symbol(s) to determine. >> >> Barry >> >> >>> On Dec 7, 2022, at 4:47 PM, Mark Lohry wrote: >>> >>> Thanks, yes, I figured out the OMP_NUM_THREADS=1 way while triaging it, and the --download-fblaslapack way occurred to me. >>> >>> I was hoping for something that "just worked" (refuse to build in this case) but I don't know if it's programmatically possible for petsc to tell whether or not it's linking to a threaded BLAS? >>> >>> On Wed, Dec 7, 2022 at 4:35 PM Satish Balay > wrote: >>>> If you don't specify a blas to use - petsc configure will guess and use what it can find. >>>> >>>> So only way to force it use a particular blas is to specify one [one way is --download-fblaslapack] >>>> >>>> Wrt multi-thread openblas - you can force it run single threaded [by one of these 2 env variables] >>>> >>>> # Use single thread openblas >>>> export OPENBLAS_NUM_THREADS=1 >>>> export OMP_NUM_THREADS=1 >>>> >>>> Satish >>>> >>>> >>>> On Wed, 7 Dec 2022, Mark Lohry wrote: >>>> >>>>> I ran into an unexpected issue -- on an NP-core machine, each MPI rank of >>>>> my application was launching NP threads, such that when running with >>>>> multiple ranks the machine was quickly oversubscribed and performance >>>>> tanked. >>>>> >>>>> The root cause of this was petsc linking against the system-provided >>>>> library (libopenblas0-pthread in this case) set by the update-alternatives >>>>> in ubuntu. At some point this machine got updated to using the threaded >>>>> blas implementation instead of serial; not sure how, and I wouldn't have >>>>> noticed if I weren't running interactively. >>>>> >>>>> Is there any mechanism in petsc or its build system to prevent linking >>>>> against an inappropriate BLAS, or do I need to be diligent about manually >>>>> setting the BLAS library in the configuration stage? 
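On the runtime-detection question above: there does not appear to be a single standard Unix API for this, but on Linux the number of entries in /proc/self/task is the current thread count, so comparing it before and after a BLAS call exposes a persistent worker pool. A rough, Linux-only sketch, assuming an LP64 (32-bit integer) BLAS and purely as an illustration of the idea; a BLAS that tears its threads down immediately after each call could still evade it.

#include <dirent.h>
#include <stdio.h>

/* dgemm_ from whatever BLAS the application is linked against */
extern void dgemm_(const char *, const char *, const int *, const int *, const int *,
                   const double *, const double *, const int *, const double *, const int *,
                   const double *, double *, const int *);

static int CountThreads(void)
{
  DIR           *d = opendir("/proc/self/task"); /* one entry per thread on Linux */
  struct dirent *e;
  int            n = 0;

  if (!d) return -1;
  while ((e = readdir(d))) {
    if (e->d_name[0] != '.') ++n;
  }
  closedir(d);
  return n;
}

int main(void)
{
  const int     n   = 256;
  const double  one = 1.0;
  static double A[256 * 256], B[256 * 256], C[256 * 256];
  int           before, after;

  before = CountThreads();
  dgemm_("N", "N", &n, &n, &n, &one, A, &n, B, &n, &one, C, &n);
  after = CountThreads();
  printf("threads before dgemm: %d, after: %d\n", before, after);
  return 0;
}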
>>>>> >>>>> Thanks, >>>>> Mark >>>>> >>>> From narnoldm at umich.edu Thu Dec 8 11:50:36 2022 From: narnoldm at umich.edu (Nicholas Arnold-Medabalimi) Date: Thu, 8 Dec 2022 12:50:36 -0500 Subject: [petsc-users] Petsc Section in DMPlex In-Reply-To: References: Message-ID: I think I've figured out the issue. In previous efforts, I used DMAddField, which I think was key for the output to work properly. On Thu, Dec 8, 2022 at 10:48 AM Nicholas Arnold-Medabalimi < narnoldm at umich.edu> wrote: > Hi Matt > > Thanks. Found the issue just messed up the Fortran to C indexing. > > Another question. I have been using the Petsc VTK output to view things. > In some previous efforts, I used the PetscFVM object to set up my section > data. When I output vectors using that method in ParaView, I could view the > rank information and the state vector to visualize. As far as I can tell > when I do the same with Vectors that were created with my manually created > section the VecView using PETSCVIEWERVTK only mesh is output with the Rank > distribution is viewable (basically as if I had done a DM output instead of > a Vec output). My guess is this is because I'm not setting something in my > section Field setup properly for the VTK viewer to output it? > > For my section set up, I am calling > PetscSectionCreate > PetscSectionSetChart > PetscSectionSetFieldName > and then the appropriate PetscSectionSetDof > PetscSectionSetup > DMSetLocalSection > > Thanks again for all the help. > > Sincerely > Nicholas > > > On Thu, Dec 8, 2022 at 7:27 AM Matthew Knepley wrote: > >> On Thu, Dec 8, 2022 at 3:04 AM Nicholas Arnold-Medabalimi < >> narnoldm at umich.edu> wrote: >> >>> Hi Matt >>> >>> I think I've gotten it just about there. I'm just having an issue with >>> the VecISCopy. I have an IS built that matches size correctly to map from >>> the full state to the filtered state. The core issue I think, is should the >>> expanded IS the ownership range of the vector subtracted out. Looking at >>> the implementation, it looks like VecISCopy takes care of that for me. >>> (Line 573 in src/vec/vec/utils/projection.c) But I could be mistaken. >>> >> >> It is a good question. We have tried to give guidance on the manpage: >> >> The index set identifies entries in the global vector. Negative indices >> are skipped; indices outside the ownership range of vfull will raise an >> error. >> >> which means that it expects _global_ indices, and you have retrieved the >> global section, so that matches. >> The calculation of the index size looks right to me, and so does the >> index calculation. >> >> I would put a check in the loop, making sure that the calculated indices >> lie within [oStart, oEnd). The global >> section is designed to ensure that. It is not clear why one would lie >> outside. >> >> When I am debugging, I run a very small problem, and print out all the >> sections. >> >> Thanks, >> >> Matt >> >> >>> The error I am getting is: >>> [0]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> [0]PETSC ERROR: No support for this operation for this object type >>> [0]PETSC ERROR: Only owned values supported >>> >>> >>> Here is what I am currently doing. >>> >>> call DMPlexFilter(dmplex_full, iBlankLabel, 1, dmplex_filtered,ierr) >>> call DMPlexGetSubpointIS(dmplex_filtered, subpointsIS,ierr) >>> >>> ! adds section to dmplex_filtered and allocates vec_filtered using >>> DMCreateGlobalVector >>> call addSectionToDMPlex(dmplex_filtered,vec_filtered) >>> >>> ! 
Get Sections for dmplex_filtered and dmplex_full >>> call DMGetGlobalSection(dmplex_filtered,filteredfieldSection,ierr) >>> call DMGetGlobalSection(dmplex_full,fullfieldSection,ierr) >>> >>> >>> >>> call ISGetIndicesF90(subpointsIS, subPointKey,ierr) >>> ExpandedIndexSize = 0 >>> do i = 1, size(subPointKey) >>> call PetscSectionGetDof(fullfieldSection, subPointKey(i), dof,ierr) >>> ExpandedIndexSize = ExpandedIndexSize + dof >>> enddo >>> >>> >>> !Create expandedIS from offset sections of full and filtered sections >>> allocate(ExpandedIndex(ExpandedIndexSize)) >>> call VecGetOwnershipRange(vec_full,oStart,oEnd,ierr) >>> do i = 1, size(subPointKey) >>> call PetscSectionGetOffset(fullfieldSection, subPointKey(i), >>> offset,ierr) >>> call PetscSectionGetDof(fullfieldSection, subPointKey(i), dof,ierr) >>> !offset=offset-oStart !looking at VecIScopy it takes care of this >>> subtraction (not sure) >>> do j = 1, (dof) >>> ExpandedIndex((i-1)*dof+j) = offset+j >>> end do >>> enddo >>> >>> call ISCreateGeneral(PETSC_COMM_WORLD, ExpandedIndexSize, >>> ExpandedIndex, PETSC_COPY_VALUES, expandedIS,ierr) >>> call ISRestoreIndicesF90(subpointsIS, subPointKey,ierr) >>> deallocate(ExpandedIndex) >>> >>> >>> call VecGetLocalSize(vec_full,sizeVec,ierr) >>> write(*,*) sizeVec >>> call VecGetLocalSize(vec_filtered,sizeVec,ierr) >>> write(*,*) sizeVec >>> call ISGetLocalSize(expandedIS,sizeVec,ierr) >>> write(*,*) sizeVec >>> call PetscSynchronizedFlush(PETSC_COMM_WORLD,ierr) >>> >>> >>> call VecISCopy(vec_full,expandedIS,SCATTER_REVERSE,vec_filtered,ierr) >>> >>> >>> Thanks again for the great help. >>> >>> Sincerely >>> Nicholas >>> >>> >>> On Wed, Dec 7, 2022 at 9:29 PM Matthew Knepley >>> wrote: >>> >>>> On Wed, Dec 7, 2022 at 9:21 PM Nicholas Arnold-Medabalimi < >>>> narnoldm at umich.edu> wrote: >>>> >>>>> Thank you for the help. >>>>> >>>>> I think the last piece of the puzzle is how do I create the "expanded >>>>> IS" from the subpoint IS using the section? >>>>> >>>> >>>> Loop over the points in the IS. For each point, get the dof and offset >>>> from the Section. Make a new IS that has all the >>>> dogs, namely each run [offset, offset+dof). >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Sincerely >>>>> Nicholas >>>>> >>>>> >>>>> On Wed, Dec 7, 2022 at 7:06 AM Matthew Knepley >>>>> wrote: >>>>> >>>>>> On Wed, Dec 7, 2022 at 6:51 AM Nicholas Arnold-Medabalimi < >>>>>> narnoldm at umich.edu> wrote: >>>>>> >>>>>>> Hi >>>>>>> >>>>>>> Thank you so much for your patience. One thing to note: I don't have >>>>>>> any need to go back from the filtered distributed mapping back to the full >>>>>>> but it is good to know. >>>>>>> >>>>>>> One aside question. >>>>>>> 1) Is natural and global ordering the same in this context? >>>>>>> >>>>>> >>>>>> No. >>>>>> >>>>>> >>>>>>> As far as implementing what you have described. >>>>>>> >>>>>>> When I call ISView on the generated SubpointIS, I get an unusual >>>>>>> error which I'm not sure how to interpret. (this case is running on 2 ranks >>>>>>> and the filter label has points located on both ranks of the original DM. >>>>>>> However, if I manually get the indices (the commented lines), it seems to >>>>>>> not have any issues. 
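Returning to the VTK question at the top of this message: the DMAddField() remark suggests a setup along the following lines, where the field discretization is registered on the DM before the section and vector are created, so that VecView() with a VTK viewer has field metadata to write. This is an illustrative reconstruction (ViewCellStateVTK and the file name are invented), not code from the thread.

#include <petscdmplex.h>
#include <petscfv.h>
#include <petscviewer.h>

static PetscErrorCode ViewCellStateVTK(DM dm, PetscInt nvar, const char *filename)
{
  PetscFV     fv;
  Vec         state;
  PetscViewer viewer;
  PetscInt    dim;

  PetscFunctionBeginUser;
  PetscCall(DMGetDimension(dm, &dim));
  PetscCall(PetscFVCreate(PetscObjectComm((PetscObject)dm), &fv));
  PetscCall(PetscFVSetSpatialDimension(fv, dim));
  PetscCall(PetscFVSetNumComponents(fv, nvar));
  PetscCall(PetscObjectSetName((PetscObject)fv, "state"));
  PetscCall(DMAddField(dm, NULL, (PetscObject)fv));
  PetscCall(DMCreateDS(dm));                    /* builds the default cell-centered section */
  PetscCall(PetscFVDestroy(&fv));

  PetscCall(DMCreateGlobalVector(dm, &state));
  PetscCall(PetscObjectSetName((PetscObject)state, "state"));
  /* ... fill state as in the earlier snippets ... */
  PetscCall(PetscViewerVTKOpen(PetscObjectComm((PetscObject)dm), filename, FILE_MODE_WRITE, &viewer));
  PetscCall(VecView(state, viewer));
  PetscCall(PetscViewerDestroy(&viewer));
  PetscCall(VecDestroy(&state));
  PetscFunctionReturn(0);
}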
>>>>>>> call DMPlexFilter(dmplex_full, iBlankLabel, 1, dmplex_filtered,ierr) >>>>>>> call DMPlexGetSubpointIS(dmplex_filtered, subpointsIS,ierr) >>>>>>> !call ISGetIndicesF90(subpointsIS, subPointKey,ierr) >>>>>>> !write(*,*) subPointKey >>>>>>> !call ISRestoreIndicesF90(subpointsIS, subPointKey,ierr) >>>>>>> call ISView(subpointsIS,PETSC_VIEWER_STDOUT_WORLD,ierr) >>>>>>> >>>>>>> [1]PETSC ERROR: --------------------- Error Message >>>>>>> -------------------------------------------------------------- >>>>>>> [1]PETSC ERROR: Arguments must have same communicators >>>>>>> [1]PETSC ERROR: Different communicators in the two objects: Argument >>>>>>> # 1 and 2 flag 3 >>>>>>> [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble >>>>>>> shooting. >>>>>>> [1]PETSC ERROR: Petsc Development GIT revision: >>>>>>> v3.18.1-320-g7810d690132 GIT Date: 2022-11-20 20:25:41 -0600 >>>>>>> [1]PETSC ERROR: Configure options with-fc=mpiifort >>>>>>> with-mpi-f90=mpiifort --download-triangle --download-parmetis >>>>>>> --download-metis --with-debugging=1 --download-hdf5 >>>>>>> --prefix=/home/narnoldm/packages/petsc_install >>>>>>> [1]PETSC ERROR: #1 ISView() at >>>>>>> /home/narnoldm/packages/petsc/src/vec/is/is/interface/index.c:1629 >>>>>>> >>>>>> >>>>>> The problem here is the subpointsIS is a _serial_ object, and you are >>>>>> using a parallel viewer. You can use PETSC_VIEWER_STDOUT_SELF, >>>>>> or you can pull out the singleton viewer from STDOUT_WORLD if you >>>>>> want them all to print in order. >>>>>> >>>>>> >>>>>>> As far as the overall process you have described my question on >>>>>>> first glance is do I have to allocate/create the vector that is output by >>>>>>> VecISCopy before calling it, or does it create the vector automatically? >>>>>>> >>>>>> >>>>>> You create both vectors. I would do it using DMCreateGlobalVector() >>>>>> from both DMs. >>>>>> >>>>>> >>>>>>> I think I would need to create it first using a section and Setting >>>>>>> the Vec in the filtered DM? >>>>>>> >>>>>> >>>>>> Setting the Section in the filtered DM. >>>>>> >>>>>> >>>>>>> And I presume in this case I would be using the scatter reverse >>>>>>> option to go from the full set to the reduced set? >>>>>>> >>>>>> >>>>>> Yes >>>>>> >>>>>> Thanks >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Sincerely >>>>>>> Nicholas >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> Sincerely >>>>>>> Nick >>>>>>> >>>>>>> On Wed, Dec 7, 2022 at 6:00 AM Matthew Knepley >>>>>>> wrote: >>>>>>> >>>>>>>> On Wed, Dec 7, 2022 at 3:35 AM Nicholas Arnold-Medabalimi < >>>>>>>> narnoldm at umich.edu> wrote: >>>>>>>> >>>>>>>>> Hi Matthew >>>>>>>>> >>>>>>>>> Thank you for the help. This clarified a great deal. >>>>>>>>> >>>>>>>>> I have a follow-up question related to DMPlexFilter. It may be >>>>>>>>> better to describe what I'm trying to achieve. >>>>>>>>> >>>>>>>>> I have a general mesh I am solving which has a section with cell >>>>>>>>> center finite volume states, as described in my initial email. After >>>>>>>>> calculating some metrics, I tag a bunch of cells with an identifying Label >>>>>>>>> and use DMFilter to generate a new DM which is only that subset of cells. >>>>>>>>> Generally, this leads to a pretty unbalanced DM so I then plan to use >>>>>>>>> DMPlexDIstribute to balance that DM across the processors. The coordinates >>>>>>>>> pass along fine, but the state(or I should say Section) does not at least >>>>>>>>> as far as I can tell. 
>>>>>>>>> >>>>>>>>> Assuming I can get a filtered DM I then distribute the DM and >>>>>>>>> state using the method you described above and it seems to be working ok. >>>>>>>>> >>>>>>>>> The last connection I have to make is the transfer of information >>>>>>>>> from the full mesh to the "sampled" filtered mesh. From what I can gather I >>>>>>>>> would need to get the mapping of points using DMPlexGetSubpointIS and then >>>>>>>>> manually copy the values from the full DM section to the filtered DM? I >>>>>>>>> have the process from full->filtered->distributed all working for the >>>>>>>>> coordinates so its just a matter of transferring the section correctly. >>>>>>>>> >>>>>>>>> I appreciate all the help you have provided. >>>>>>>>> >>>>>>>> >>>>>>>> Let's do this in two steps, which makes it easier to debug. First, >>>>>>>> do not redistribute the submesh. Just use DMPlexGetSubpointIS() >>>>>>>> to get the mapping of filtered points to points in the original >>>>>>>> mesh. Then create an expanded IS using the Section which makes >>>>>>>> dofs in the filtered mesh to dofs in the original mesh. From this >>>>>>>> use >>>>>>>> >>>>>>>> https://petsc.org/main/docs/manualpages/Vec/VecISCopy/ >>>>>>>> >>>>>>>> to move values between the original vector and the filtered vector. >>>>>>>> >>>>>>>> Once that works, you can try redistributing the filtered mesh. >>>>>>>> Before calling DMPlexDistribute() on the filtered mesh, you need to call >>>>>>>> >>>>>>>> https://petsc.org/main/docs/manualpages/DM/DMSetUseNatural/ >>>>>>>> >>>>>>>> When you redistribute, it will compute a mapping back to the >>>>>>>> original layout. Now when you want to transfer values, you >>>>>>>> >>>>>>>> 1) Create a natural vector with DMCreateNaturalVec() >>>>>>>> >>>>>>>> 2) Use DMGlobalToNaturalBegin/End() to move values from the >>>>>>>> filtered vector to the natural vector >>>>>>>> >>>>>>>> 3) Use VecISCopy() to move values from the natural vector to the >>>>>>>> original vector >>>>>>>> >>>>>>>> Let me know if you have any problems. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> >>>>>>>>> Sincerely >>>>>>>>> Nicholas >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Mon, Nov 28, 2022 at 6:19 AM Matthew Knepley >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> On Sun, Nov 27, 2022 at 10:22 PM Nicholas Arnold-Medabalimi < >>>>>>>>>> narnoldm at umich.edu> wrote: >>>>>>>>>> >>>>>>>>>>> Hi Petsc Users >>>>>>>>>>> >>>>>>>>>>> I have a question about properly using PetscSection to >>>>>>>>>>> assign state variables to a DM. I have an existing DMPlex mesh distributed >>>>>>>>>>> on 2 processors. My goal is to have state variables set to the cell >>>>>>>>>>> centers. I then want to call DMPlexDistribute, which I hope will balance >>>>>>>>>>> the mesh elements and hopefully transport the state variables to the >>>>>>>>>>> hosting processors as the cells are distributed to a different processor >>>>>>>>>>> count or simply just redistributing after doing mesh adaption. >>>>>>>>>>> >>>>>>>>>>> Looking at the DMPlex User guide, I should be able to achieve >>>>>>>>>>> this with a single field section using SetDof and assigning the DOF to the >>>>>>>>>>> points corresponding to cells. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Note that if you want several different fields, you can clone the >>>>>>>>>> DM first for this field >>>>>>>>>> >>>>>>>>>> call DMClone(dm,dmState,ierr) >>>>>>>>>> >>>>>>>>>> and use dmState in your calls below. 
>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> call DMPlexGetHeightStratum(dm,0,c0,c1,ierr) >>>>>>>>>>> call DMPlexGetChart(dm,p0,p1,ierr) >>>>>>>>>>> call PetscSectionCreate(PETSC_COMM_WORLD,section,ierr) >>>>>>>>>>> call PetscSectionSetNumFields(section,1,ierr) call >>>>>>>>>>> PetscSectionSetChart(section,p0,p1,ierr) >>>>>>>>>>> do i = c0, (c1-1) >>>>>>>>>>> call PetscSectionSetDof(section,i,nvar,ierr) >>>>>>>>>>> end do >>>>>>>>>>> call PetscSectionSetup(section,ierr) >>>>>>>>>>> call DMSetLocalSection(dm,section,ierr) >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> In the loop, I would add a call to >>>>>>>>>> >>>>>>>>>> call PetscSectionSetFieldDof(section,i,0,nvar,ierr) >>>>>>>>>> >>>>>>>>>> This also puts in the field breakdown. It is not essential, but >>>>>>>>>> nicer. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> From here, it looks like I can access and set the state vars >>>>>>>>>>> using >>>>>>>>>>> >>>>>>>>>>> call DMGetGlobalVector(dmplex,state,ierr) >>>>>>>>>>> call DMGetGlobalSection(dmplex,section,ierr) >>>>>>>>>>> call VecGetArrayF90(state,stateVec,ierr) >>>>>>>>>>> do i = c0, (c1-1) >>>>>>>>>>> call PetscSectionGetOffset(section,i,offset,ierr) >>>>>>>>>>> stateVec(offset:(offset+nvar))=state_i(:) !simplified assignment >>>>>>>>>>> end do >>>>>>>>>>> call VecRestoreArrayF90(state,stateVec,ierr) >>>>>>>>>>> call DMRestoreGlobalVector(dmplex,state,ierr) >>>>>>>>>>> >>>>>>>>>>> To my understanding, I should be using Global vector since this >>>>>>>>>>> is a pure assignment operation and I don't need the ghost cells. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Yes. >>>>>>>>>> >>>>>>>>>> But the behavior I am seeing isn't exactly what I'd expect. >>>>>>>>>>> >>>>>>>>>>> To be honest, I'm somewhat unclear on a few things >>>>>>>>>>> >>>>>>>>>>> 1) Should be using nvar fields with 1 DOF each or 1 field with >>>>>>>>>>> nvar DOFs or what the distinction between the two methods are? >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> We have two divisions in a Section. A field can have a number of >>>>>>>>>> components. This is intended to model a vector or tensor field. >>>>>>>>>> Then a Section can have a number of fields, such as velocity and >>>>>>>>>> pressure for a Stokes problem. The division is mainly to help the >>>>>>>>>> user, so I would use the most natural one. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> 2) Adding a print statement after the offset assignment I get >>>>>>>>>>> (on rank 0 of 2) >>>>>>>>>>> cell 1 offset 0 >>>>>>>>>>> cell 2 offset 18 >>>>>>>>>>> cell 3 offset 36 >>>>>>>>>>> which is expected and works but on rank 1 I get >>>>>>>>>>> cell 1 offset 9000 >>>>>>>>>>> cell 2 offset 9018 >>>>>>>>>>> cell 3 offset 9036 >>>>>>>>>>> >>>>>>>>>>> which isn't exactly what I would expect. Shouldn't the offsets >>>>>>>>>>> reset at 0 for the next rank? >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> The local and global sections hold different information. This is >>>>>>>>>> the source of the confusion. The local section does describe a local >>>>>>>>>> vector, and thus includes overlap or "ghost" dofs. The global >>>>>>>>>> section describes a global vector. However, it is intended to deliver >>>>>>>>>> global indices, and thus the offsets give back global indices. >>>>>>>>>> When you use VecGetArray*() you are getting out the local array, and >>>>>>>>>> thus you have to subtract the first index on this process. You >>>>>>>>>> can get that from >>>>>>>>>> >>>>>>>>>> VecGetOwnershipRange(v, &rstart, &rEnd); >>>>>>>>>> >>>>>>>>>> This is the same whether you are using DMDA or DMPlex or any >>>>>>>>>> other DM. 
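A small sketch (in C for brevity, names illustrative) of the indexing rule explained above: the global section returns global offsets, so subtract the start of the rank's ownership range before indexing the array obtained from VecGetArray():

  Vec          state;
  PetscSection gsec;
  PetscScalar *a;
  PetscInt     cStart, cEnd, c, dof, off, d, rStart, rEnd;

  PetscCall(DMGetGlobalVector(dm, &state));
  PetscCall(DMGetGlobalSection(dm, &gsec));
  PetscCall(DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd));
  PetscCall(VecGetOwnershipRange(state, &rStart, &rEnd));
  PetscCall(VecGetArray(state, &a));
  for (c = cStart; c < cEnd; ++c) {
    PetscCall(PetscSectionGetDof(gsec, c, &dof));
    PetscCall(PetscSectionGetOffset(gsec, c, &off));
    if (dof <= 0) continue;                               /* cell not owned by this rank */
    for (d = 0; d < dof; ++d) a[off - rStart + d] = 1.0;  /* off is global, a[] is local */
  }
  PetscCall(VecRestoreArray(state, &a));
  PetscCall(DMRestoreGlobalVector(dm, &state));

The 1.0 is a placeholder for the actual cell state; the Fortran version discussed in this thread follows the same pattern.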
>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> 3) Does calling DMPlexDistribute also distribute the section >>>>>>>>>>> data associated with the DOF, based on the description in DMPlexDistribute >>>>>>>>>>> it looks like it should? >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> No. By default, DMPlexDistribute() only distributes coordinate >>>>>>>>>> data. I you want to distribute your field, it would look something like >>>>>>>>>> this: >>>>>>>>>> >>>>>>>>>> DMPlexDistribute(dm, 0, &sfDist, &dmDist); >>>>>>>>>> VecCreate(comm, &stateDist); >>>>>>>>>> VecSetDM(sateDist, dmDist); >>>>>>>>>> PetscSectionCreate(comm §ionDist); >>>>>>>>>> DMSetLocalSection(dmDist, sectionDist); >>>>>>>>>> DMPlexDistributeField(dmDist, sfDist, section, state, >>>>>>>>>> sectionDist, stateDist); >>>>>>>>>> >>>>>>>>>> We do this in src/dm/impls/plex/tests/ex36.c >>>>>>>>>> >>>>>>>>>> THanks, >>>>>>>>>> >>>>>>>>>> Matt >>>>>>>>>> >>>>>>>>>> I'd appreciate any insight into the specifics of this usage. I >>>>>>>>>>> expect I have a misconception on the local vs global section. Thank you. >>>>>>>>>>> >>>>>>>>>>> Sincerely >>>>>>>>>>> Nicholas >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Nicholas Arnold-Medabalimi >>>>>>>>>>> >>>>>>>>>>> Ph.D. Candidate >>>>>>>>>>> Computational Aeroscience Lab >>>>>>>>>>> University of Michigan >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>> experiments lead. >>>>>>>>>> -- Norbert Wiener >>>>>>>>>> >>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Nicholas Arnold-Medabalimi >>>>>>>>> >>>>>>>>> Ph.D. Candidate >>>>>>>>> Computational Aeroscience Lab >>>>>>>>> University of Michigan >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Nicholas Arnold-Medabalimi >>>>>>> >>>>>>> Ph.D. Candidate >>>>>>> Computational Aeroscience Lab >>>>>>> University of Michigan >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> Nicholas Arnold-Medabalimi >>>>> >>>>> Ph.D. Candidate >>>>> Computational Aeroscience Lab >>>>> University of Michigan >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >>> >>> -- >>> Nicholas Arnold-Medabalimi >>> >>> Ph.D. Candidate >>> Computational Aeroscience Lab >>> University of Michigan >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > > > -- > Nicholas Arnold-Medabalimi > > Ph.D. 
Candidate > Computational Aeroscience Lab > University of Michigan > -- Nicholas Arnold-Medabalimi Ph.D. Candidate Computational Aeroscience Lab University of Michigan -------------- next part -------------- An HTML attachment was scrubbed... URL: From narnoldm at umich.edu Thu Dec 8 12:05:57 2022 From: narnoldm at umich.edu (Nicholas Arnold-Medabalimi) Date: Thu, 8 Dec 2022 13:05:57 -0500 Subject: [petsc-users] Fortran Interface NULL object / Casting Message-ID: Hi Petsc Users I am trying to use DMAddField in a Fortran code. I had some questions on casting/passing NULL. I follow how to pass NULL for standard types (INT, CHAR, etc). Is there a method/best practice for passing NULL for Petsc type arguments? (In this case DMAddLabel I'd want to pass NULL to a DMLabel Parameter not an intrinsic type) For the 2nd question, In some cases in C we would cast a more specific object to Petsc Object DMAddField(dm, NULL, (PetscObject)fvm); (where fvm is a PetscFVM type) Is there a method or practice to do the same in Fortran. Thanks Nicholas -- Nicholas Arnold-Medabalimi Ph.D. Candidate Computational Aeroscience Lab University of Michigan -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu Dec 8 12:12:24 2022 From: jed at jedbrown.org (Jed Brown) Date: Thu, 08 Dec 2022 11:12:24 -0700 Subject: [petsc-users] prevent linking to multithreaded BLAS? In-Reply-To: <93CA8288-F666-4902-9289-6B20BD9CE249@petsc.dev> References: <05539ceb-d9ec-7b03-4344-1f0cbfe57bd3@mcs.anl.gov> <35A089AA-4AB6-4450-A348-95DBFAA4F68E@petsc.dev> <87edtaze6w.fsf@jedbrown.org> <93CA8288-F666-4902-9289-6B20BD9CE249@petsc.dev> Message-ID: <875yelzrw7.fsf@jedbrown.org> Barry Smith writes: >> We could test at runtime whether child threads exist/are created when calling BLAS and deliver a warning. > > How does one test for this? Some standard Unix API for checking this? I'm not sure, the ids of child threads are in /proc/$pid/task/ and (when opened by a process) /proc/self/task/. See man procfs for details. From bsmith at petsc.dev Thu Dec 8 13:56:16 2022 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 8 Dec 2022 14:56:16 -0500 Subject: [petsc-users] Fortran Interface NULL object / Casting In-Reply-To: References: Message-ID: <350B3325-38F9-41E5-8176-C4014DE44C81@petsc.dev> You would use PETSC_NULL_DMLABEL but Matt needs to customize the PETSc Fortran stub for DMAddField() for you to handle accepting the NULL from PETSc. Barry > On Dec 8, 2022, at 1:05 PM, Nicholas Arnold-Medabalimi wrote: > > Hi Petsc Users > > I am trying to use DMAddField in a Fortran code. I had some questions on casting/passing NULL. I follow how to pass NULL for standard types (INT, CHAR, etc). > > Is there a method/best practice for passing NULL for Petsc type arguments? (In this case DMAddLabel I'd want to pass NULL to a DMLabel Parameter not an intrinsic type) > > For the 2nd question, In some cases in C we would cast a more specific object to Petsc Object > DMAddField(dm, NULL, (PetscObject)fvm); (where fvm is a PetscFVM type) > Is there a method or practice to do the same in Fortran. > > Thanks > Nicholas > > -- > Nicholas Arnold-Medabalimi > > Ph.D. Candidate > Computational Aeroscience Lab > University of Michigan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From yuhongrui at utexas.edu Thu Dec 8 17:06:43 2022 From: yuhongrui at utexas.edu (Hongrui Yu) Date: Thu, 8 Dec 2022 17:06:43 -0600 Subject: [petsc-users] Modifying Entries Tied to Specific Points in Finite Element Stiffness Matrix using DMPlex Message-ID: <004801d90b59$bd4aa180$37dfe480$@utexas.edu> Hello! I'm trying to adapt a serial Finite Element code using PETSc. In this code it reads in special stiffness terms between the boundary DoFs from an input file, and add them to corresponding locations in the global Jacobian matrix. I currently use a DM Plex object to store the mesh information. My understanding is that once the DM is distributed its points are renumbered across different ranks. I wonder if there is a good way to find the corresponding entries that needs to be modified in the global Jacobian matrix? For Vectors I'm currently creating a Natural Vector and simply do DMPlexNaturalToGlobal. Is there a way to create a "Natural Mat" just like "Natural Vector" and then do some sort of NaturalToGlobal for this Mat? Any help would be highly appreciated! Kevin -------------- next part -------------- An HTML attachment was scrubbed... URL: From narnoldm at umich.edu Thu Dec 8 20:32:48 2022 From: narnoldm at umich.edu (Nicholas Arnold-Medabalimi) Date: Thu, 8 Dec 2022 21:32:48 -0500 Subject: [petsc-users] Petsc Section in DMPlex In-Reply-To: References: Message-ID: Hi Thank you for your help with this barrage of questions. The AddField, while nice for visualization, isn't the critical path for my development. I've gotten the Filtered DM with its corresponding state vector sorted out. Now I'm moving on to the mesh distribution. From your 2nd email in this thread I've added call DMPlexDistribute(dmplex_filtered, 1, distrib_sf, dmplex_distrib,ierr) ! adds section to dmplex_filtered and allocates vec_distrib using DMCreateGlobalVector call addSectionToDMPlex(dmplex_distrib,vec_distrib) call DMGetGlobalSection(dmplex_distrib,distributedSection,ierr) call DMPlexDistributeField(dmplex_disb, distrib_sf, filteredfieldSection, vec_filtered, distributedSection, vec_distrib,ierr) I'm not entirely clear if the DM I should feed into Distribute field should be the starting or end DM but based on doc I think it should be the destination. But I'm getting this error on run. [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: No support for this operation for this object type [1]PETSC ERROR: PetscSectionSetUp is currently unsupported for includesConstraints = PETSC_TRUE [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. I suspect this is occurring because I am incorrectly handling the 3 sets(full, filtered, distributed) of DMPLEX, Section, and Vec's. Thanks Nicholas On Thu, Dec 8, 2022 at 12:50 PM Nicholas Arnold-Medabalimi < narnoldm at umich.edu> wrote: > I think I've figured out the issue. In previous efforts, I used > DMAddField, which I think was key for the output to work properly. > > > On Thu, Dec 8, 2022 at 10:48 AM Nicholas Arnold-Medabalimi < > narnoldm at umich.edu> wrote: > >> Hi Matt >> >> Thanks. Found the issue just messed up the Fortran to C indexing. >> >> Another question. I have been using the Petsc VTK output to view things. >> In some previous efforts, I used the PetscFVM object to set up my section >> data. When I output vectors using that method in ParaView, I could view the >> rank information and the state vector to visualize. 
As far as I can tell >> when I do the same with Vectors that were created with my manually created >> section the VecView using PETSCVIEWERVTK only mesh is output with the Rank >> distribution is viewable (basically as if I had done a DM output instead of >> a Vec output). My guess is this is because I'm not setting something in my >> section Field setup properly for the VTK viewer to output it? >> >> For my section set up, I am calling >> PetscSectionCreate >> PetscSectionSetChart >> PetscSectionSetFieldName >> and then the appropriate PetscSectionSetDof >> PetscSectionSetup >> DMSetLocalSection >> >> Thanks again for all the help. >> >> Sincerely >> Nicholas >> >> >> On Thu, Dec 8, 2022 at 7:27 AM Matthew Knepley wrote: >> >>> On Thu, Dec 8, 2022 at 3:04 AM Nicholas Arnold-Medabalimi < >>> narnoldm at umich.edu> wrote: >>> >>>> Hi Matt >>>> >>>> I think I've gotten it just about there. I'm just having an issue with >>>> the VecISCopy. I have an IS built that matches size correctly to map from >>>> the full state to the filtered state. The core issue I think, is should the >>>> expanded IS the ownership range of the vector subtracted out. Looking at >>>> the implementation, it looks like VecISCopy takes care of that for me. >>>> (Line 573 in src/vec/vec/utils/projection.c) But I could be mistaken. >>>> >>> >>> It is a good question. We have tried to give guidance on the manpage: >>> >>> The index set identifies entries in the global vector. Negative >>> indices are skipped; indices outside the ownership range of vfull will >>> raise an error. >>> >>> which means that it expects _global_ indices, and you have retrieved the >>> global section, so that matches. >>> The calculation of the index size looks right to me, and so does the >>> index calculation. >>> >>> I would put a check in the loop, making sure that the calculated indices >>> lie within [oStart, oEnd). The global >>> section is designed to ensure that. It is not clear why one would lie >>> outside. >>> >>> When I am debugging, I run a very small problem, and print out all the >>> sections. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> The error I am getting is: >>>> [0]PETSC ERROR: --------------------- Error Message >>>> -------------------------------------------------------------- >>>> [0]PETSC ERROR: No support for this operation for this object type >>>> [0]PETSC ERROR: Only owned values supported >>>> >>>> >>>> Here is what I am currently doing. >>>> >>>> call DMPlexFilter(dmplex_full, iBlankLabel, 1, dmplex_filtered,ierr) >>>> call DMPlexGetSubpointIS(dmplex_filtered, subpointsIS,ierr) >>>> >>>> ! adds section to dmplex_filtered and allocates vec_filtered using >>>> DMCreateGlobalVector >>>> call addSectionToDMPlex(dmplex_filtered,vec_filtered) >>>> >>>> ! 
Get Sections for dmplex_filtered and dmplex_full >>>> call DMGetGlobalSection(dmplex_filtered,filteredfieldSection,ierr) >>>> call DMGetGlobalSection(dmplex_full,fullfieldSection,ierr) >>>> >>>> >>>> >>>> call ISGetIndicesF90(subpointsIS, subPointKey,ierr) >>>> ExpandedIndexSize = 0 >>>> do i = 1, size(subPointKey) >>>> call PetscSectionGetDof(fullfieldSection, subPointKey(i), dof,ierr) >>>> ExpandedIndexSize = ExpandedIndexSize + dof >>>> enddo >>>> >>>> >>>> !Create expandedIS from offset sections of full and filtered sections >>>> allocate(ExpandedIndex(ExpandedIndexSize)) >>>> call VecGetOwnershipRange(vec_full,oStart,oEnd,ierr) >>>> do i = 1, size(subPointKey) >>>> call PetscSectionGetOffset(fullfieldSection, subPointKey(i), >>>> offset,ierr) >>>> call PetscSectionGetDof(fullfieldSection, subPointKey(i), dof,ierr) >>>> !offset=offset-oStart !looking at VecIScopy it takes care of >>>> this subtraction (not sure) >>>> do j = 1, (dof) >>>> ExpandedIndex((i-1)*dof+j) = offset+j >>>> end do >>>> enddo >>>> >>>> call ISCreateGeneral(PETSC_COMM_WORLD, ExpandedIndexSize, >>>> ExpandedIndex, PETSC_COPY_VALUES, expandedIS,ierr) >>>> call ISRestoreIndicesF90(subpointsIS, subPointKey,ierr) >>>> deallocate(ExpandedIndex) >>>> >>>> >>>> call VecGetLocalSize(vec_full,sizeVec,ierr) >>>> write(*,*) sizeVec >>>> call VecGetLocalSize(vec_filtered,sizeVec,ierr) >>>> write(*,*) sizeVec >>>> call ISGetLocalSize(expandedIS,sizeVec,ierr) >>>> write(*,*) sizeVec >>>> call PetscSynchronizedFlush(PETSC_COMM_WORLD,ierr) >>>> >>>> >>>> call VecISCopy(vec_full,expandedIS,SCATTER_REVERSE,vec_filtered,ierr) >>>> >>>> >>>> Thanks again for the great help. >>>> >>>> Sincerely >>>> Nicholas >>>> >>>> >>>> On Wed, Dec 7, 2022 at 9:29 PM Matthew Knepley >>>> wrote: >>>> >>>>> On Wed, Dec 7, 2022 at 9:21 PM Nicholas Arnold-Medabalimi < >>>>> narnoldm at umich.edu> wrote: >>>>> >>>>>> Thank you for the help. >>>>>> >>>>>> I think the last piece of the puzzle is how do I create the "expanded >>>>>> IS" from the subpoint IS using the section? >>>>>> >>>>> >>>>> Loop over the points in the IS. For each point, get the dof and offset >>>>> from the Section. Make a new IS that has all the >>>>> dogs, namely each run [offset, offset+dof). >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Sincerely >>>>>> Nicholas >>>>>> >>>>>> >>>>>> On Wed, Dec 7, 2022 at 7:06 AM Matthew Knepley >>>>>> wrote: >>>>>> >>>>>>> On Wed, Dec 7, 2022 at 6:51 AM Nicholas Arnold-Medabalimi < >>>>>>> narnoldm at umich.edu> wrote: >>>>>>> >>>>>>>> Hi >>>>>>>> >>>>>>>> Thank you so much for your patience. One thing to note: I don't >>>>>>>> have any need to go back from the filtered distributed mapping back to the >>>>>>>> full but it is good to know. >>>>>>>> >>>>>>>> One aside question. >>>>>>>> 1) Is natural and global ordering the same in this context? >>>>>>>> >>>>>>> >>>>>>> No. >>>>>>> >>>>>>> >>>>>>>> As far as implementing what you have described. >>>>>>>> >>>>>>>> When I call ISView on the generated SubpointIS, I get an unusual >>>>>>>> error which I'm not sure how to interpret. (this case is running on 2 ranks >>>>>>>> and the filter label has points located on both ranks of the original DM. >>>>>>>> However, if I manually get the indices (the commented lines), it seems to >>>>>>>> not have any issues. 
>>>>>>>> call DMPlexFilter(dmplex_full, iBlankLabel, 1, >>>>>>>> dmplex_filtered,ierr) >>>>>>>> call DMPlexGetSubpointIS(dmplex_filtered, subpointsIS,ierr) >>>>>>>> !call ISGetIndicesF90(subpointsIS, subPointKey,ierr) >>>>>>>> !write(*,*) subPointKey >>>>>>>> !call ISRestoreIndicesF90(subpointsIS, subPointKey,ierr) >>>>>>>> call ISView(subpointsIS,PETSC_VIEWER_STDOUT_WORLD,ierr) >>>>>>>> >>>>>>>> [1]PETSC ERROR: --------------------- Error Message >>>>>>>> -------------------------------------------------------------- >>>>>>>> [1]PETSC ERROR: Arguments must have same communicators >>>>>>>> [1]PETSC ERROR: Different communicators in the two objects: >>>>>>>> Argument # 1 and 2 flag 3 >>>>>>>> [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble >>>>>>>> shooting. >>>>>>>> [1]PETSC ERROR: Petsc Development GIT revision: >>>>>>>> v3.18.1-320-g7810d690132 GIT Date: 2022-11-20 20:25:41 -0600 >>>>>>>> [1]PETSC ERROR: Configure options with-fc=mpiifort >>>>>>>> with-mpi-f90=mpiifort --download-triangle --download-parmetis >>>>>>>> --download-metis --with-debugging=1 --download-hdf5 >>>>>>>> --prefix=/home/narnoldm/packages/petsc_install >>>>>>>> [1]PETSC ERROR: #1 ISView() at >>>>>>>> /home/narnoldm/packages/petsc/src/vec/is/is/interface/index.c:1629 >>>>>>>> >>>>>>> >>>>>>> The problem here is the subpointsIS is a _serial_ object, and you >>>>>>> are using a parallel viewer. You can use PETSC_VIEWER_STDOUT_SELF, >>>>>>> or you can pull out the singleton viewer from STDOUT_WORLD if you >>>>>>> want them all to print in order. >>>>>>> >>>>>>> >>>>>>>> As far as the overall process you have described my question on >>>>>>>> first glance is do I have to allocate/create the vector that is output by >>>>>>>> VecISCopy before calling it, or does it create the vector automatically? >>>>>>>> >>>>>>> >>>>>>> You create both vectors. I would do it using DMCreateGlobalVector() >>>>>>> from both DMs. >>>>>>> >>>>>>> >>>>>>>> I think I would need to create it first using a section and Setting >>>>>>>> the Vec in the filtered DM? >>>>>>>> >>>>>>> >>>>>>> Setting the Section in the filtered DM. >>>>>>> >>>>>>> >>>>>>>> And I presume in this case I would be using the scatter reverse >>>>>>>> option to go from the full set to the reduced set? >>>>>>>> >>>>>>> >>>>>>> Yes >>>>>>> >>>>>>> Thanks >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> Sincerely >>>>>>>> Nicholas >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Sincerely >>>>>>>> Nick >>>>>>>> >>>>>>>> On Wed, Dec 7, 2022 at 6:00 AM Matthew Knepley >>>>>>>> wrote: >>>>>>>> >>>>>>>>> On Wed, Dec 7, 2022 at 3:35 AM Nicholas Arnold-Medabalimi < >>>>>>>>> narnoldm at umich.edu> wrote: >>>>>>>>> >>>>>>>>>> Hi Matthew >>>>>>>>>> >>>>>>>>>> Thank you for the help. This clarified a great deal. >>>>>>>>>> >>>>>>>>>> I have a follow-up question related to DMPlexFilter. It may be >>>>>>>>>> better to describe what I'm trying to achieve. >>>>>>>>>> >>>>>>>>>> I have a general mesh I am solving which has a section with cell >>>>>>>>>> center finite volume states, as described in my initial email. After >>>>>>>>>> calculating some metrics, I tag a bunch of cells with an identifying Label >>>>>>>>>> and use DMFilter to generate a new DM which is only that subset of cells. >>>>>>>>>> Generally, this leads to a pretty unbalanced DM so I then plan to use >>>>>>>>>> DMPlexDIstribute to balance that DM across the processors. The coordinates >>>>>>>>>> pass along fine, but the state(or I should say Section) does not at least >>>>>>>>>> as far as I can tell. 
>>>>>>>>>> >>>>>>>>>> Assuming I can get a filtered DM I then distribute the DM and >>>>>>>>>> state using the method you described above and it seems to be working ok. >>>>>>>>>> >>>>>>>>>> The last connection I have to make is the transfer of information >>>>>>>>>> from the full mesh to the "sampled" filtered mesh. From what I can gather I >>>>>>>>>> would need to get the mapping of points using DMPlexGetSubpointIS and then >>>>>>>>>> manually copy the values from the full DM section to the filtered DM? I >>>>>>>>>> have the process from full->filtered->distributed all working for the >>>>>>>>>> coordinates so its just a matter of transferring the section correctly. >>>>>>>>>> >>>>>>>>>> I appreciate all the help you have provided. >>>>>>>>>> >>>>>>>>> >>>>>>>>> Let's do this in two steps, which makes it easier to debug. First, >>>>>>>>> do not redistribute the submesh. Just use DMPlexGetSubpointIS() >>>>>>>>> to get the mapping of filtered points to points in the original >>>>>>>>> mesh. Then create an expanded IS using the Section which makes >>>>>>>>> dofs in the filtered mesh to dofs in the original mesh. From this >>>>>>>>> use >>>>>>>>> >>>>>>>>> https://petsc.org/main/docs/manualpages/Vec/VecISCopy/ >>>>>>>>> >>>>>>>>> to move values between the original vector and the filtered vector. >>>>>>>>> >>>>>>>>> Once that works, you can try redistributing the filtered mesh. >>>>>>>>> Before calling DMPlexDistribute() on the filtered mesh, you need to call >>>>>>>>> >>>>>>>>> https://petsc.org/main/docs/manualpages/DM/DMSetUseNatural/ >>>>>>>>> >>>>>>>>> When you redistribute, it will compute a mapping back to the >>>>>>>>> original layout. Now when you want to transfer values, you >>>>>>>>> >>>>>>>>> 1) Create a natural vector with DMCreateNaturalVec() >>>>>>>>> >>>>>>>>> 2) Use DMGlobalToNaturalBegin/End() to move values from the >>>>>>>>> filtered vector to the natural vector >>>>>>>>> >>>>>>>>> 3) Use VecISCopy() to move values from the natural vector to the >>>>>>>>> original vector >>>>>>>>> >>>>>>>>> Let me know if you have any problems. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>>> >>>>>>>>>> Sincerely >>>>>>>>>> Nicholas >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Mon, Nov 28, 2022 at 6:19 AM Matthew Knepley < >>>>>>>>>> knepley at gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> On Sun, Nov 27, 2022 at 10:22 PM Nicholas Arnold-Medabalimi < >>>>>>>>>>> narnoldm at umich.edu> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hi Petsc Users >>>>>>>>>>>> >>>>>>>>>>>> I have a question about properly using PetscSection to >>>>>>>>>>>> assign state variables to a DM. I have an existing DMPlex mesh distributed >>>>>>>>>>>> on 2 processors. My goal is to have state variables set to the cell >>>>>>>>>>>> centers. I then want to call DMPlexDistribute, which I hope will balance >>>>>>>>>>>> the mesh elements and hopefully transport the state variables to the >>>>>>>>>>>> hosting processors as the cells are distributed to a different processor >>>>>>>>>>>> count or simply just redistributing after doing mesh adaption. >>>>>>>>>>>> >>>>>>>>>>>> Looking at the DMPlex User guide, I should be able to achieve >>>>>>>>>>>> this with a single field section using SetDof and assigning the DOF to the >>>>>>>>>>>> points corresponding to cells. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Note that if you want several different fields, you can >>>>>>>>>>> clone the DM first for this field >>>>>>>>>>> >>>>>>>>>>> call DMClone(dm,dmState,ierr) >>>>>>>>>>> >>>>>>>>>>> and use dmState in your calls below. 
>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> call DMPlexGetHeightStratum(dm,0,c0,c1,ierr) >>>>>>>>>>>> call DMPlexGetChart(dm,p0,p1,ierr) >>>>>>>>>>>> call PetscSectionCreate(PETSC_COMM_WORLD,section,ierr) >>>>>>>>>>>> call PetscSectionSetNumFields(section,1,ierr) call >>>>>>>>>>>> PetscSectionSetChart(section,p0,p1,ierr) >>>>>>>>>>>> do i = c0, (c1-1) >>>>>>>>>>>> call PetscSectionSetDof(section,i,nvar,ierr) >>>>>>>>>>>> end do >>>>>>>>>>>> call PetscSectionSetup(section,ierr) >>>>>>>>>>>> call DMSetLocalSection(dm,section,ierr) >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> In the loop, I would add a call to >>>>>>>>>>> >>>>>>>>>>> call PetscSectionSetFieldDof(section,i,0,nvar,ierr) >>>>>>>>>>> >>>>>>>>>>> This also puts in the field breakdown. It is not essential, but >>>>>>>>>>> nicer. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> From here, it looks like I can access and set the state vars >>>>>>>>>>>> using >>>>>>>>>>>> >>>>>>>>>>>> call DMGetGlobalVector(dmplex,state,ierr) >>>>>>>>>>>> call DMGetGlobalSection(dmplex,section,ierr) >>>>>>>>>>>> call VecGetArrayF90(state,stateVec,ierr) >>>>>>>>>>>> do i = c0, (c1-1) >>>>>>>>>>>> call PetscSectionGetOffset(section,i,offset,ierr) >>>>>>>>>>>> stateVec(offset:(offset+nvar))=state_i(:) !simplified >>>>>>>>>>>> assignment >>>>>>>>>>>> end do >>>>>>>>>>>> call VecRestoreArrayF90(state,stateVec,ierr) >>>>>>>>>>>> call DMRestoreGlobalVector(dmplex,state,ierr) >>>>>>>>>>>> >>>>>>>>>>>> To my understanding, I should be using Global vector since this >>>>>>>>>>>> is a pure assignment operation and I don't need the ghost cells. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Yes. >>>>>>>>>>> >>>>>>>>>>> But the behavior I am seeing isn't exactly what I'd expect. >>>>>>>>>>>> >>>>>>>>>>>> To be honest, I'm somewhat unclear on a few things >>>>>>>>>>>> >>>>>>>>>>>> 1) Should be using nvar fields with 1 DOF each or 1 field with >>>>>>>>>>>> nvar DOFs or what the distinction between the two methods are? >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> We have two divisions in a Section. A field can have a number of >>>>>>>>>>> components. This is intended to model a vector or tensor field. >>>>>>>>>>> Then a Section can have a number of fields, such as velocity and >>>>>>>>>>> pressure for a Stokes problem. The division is mainly to help the >>>>>>>>>>> user, so I would use the most natural one. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> 2) Adding a print statement after the offset assignment I get >>>>>>>>>>>> (on rank 0 of 2) >>>>>>>>>>>> cell 1 offset 0 >>>>>>>>>>>> cell 2 offset 18 >>>>>>>>>>>> cell 3 offset 36 >>>>>>>>>>>> which is expected and works but on rank 1 I get >>>>>>>>>>>> cell 1 offset 9000 >>>>>>>>>>>> cell 2 offset 9018 >>>>>>>>>>>> cell 3 offset 9036 >>>>>>>>>>>> >>>>>>>>>>>> which isn't exactly what I would expect. Shouldn't the offsets >>>>>>>>>>>> reset at 0 for the next rank? >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> The local and global sections hold different information. This >>>>>>>>>>> is the source of the confusion. The local section does describe a local >>>>>>>>>>> vector, and thus includes overlap or "ghost" dofs. The global >>>>>>>>>>> section describes a global vector. However, it is intended to deliver >>>>>>>>>>> global indices, and thus the offsets give back global indices. >>>>>>>>>>> When you use VecGetArray*() you are getting out the local array, and >>>>>>>>>>> thus you have to subtract the first index on this process. 
You >>>>>>>>>>> can get that from >>>>>>>>>>> >>>>>>>>>>> VecGetOwnershipRange(v, &rstart, &rEnd); >>>>>>>>>>> >>>>>>>>>>> This is the same whether you are using DMDA or DMPlex or any >>>>>>>>>>> other DM. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> 3) Does calling DMPlexDistribute also distribute the section >>>>>>>>>>>> data associated with the DOF, based on the description in DMPlexDistribute >>>>>>>>>>>> it looks like it should? >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> No. By default, DMPlexDistribute() only distributes coordinate >>>>>>>>>>> data. I you want to distribute your field, it would look something like >>>>>>>>>>> this: >>>>>>>>>>> >>>>>>>>>>> DMPlexDistribute(dm, 0, &sfDist, &dmDist); >>>>>>>>>>> VecCreate(comm, &stateDist); >>>>>>>>>>> VecSetDM(sateDist, dmDist); >>>>>>>>>>> PetscSectionCreate(comm §ionDist); >>>>>>>>>>> DMSetLocalSection(dmDist, sectionDist); >>>>>>>>>>> DMPlexDistributeField(dmDist, sfDist, section, state, >>>>>>>>>>> sectionDist, stateDist); >>>>>>>>>>> >>>>>>>>>>> We do this in src/dm/impls/plex/tests/ex36.c >>>>>>>>>>> >>>>>>>>>>> THanks, >>>>>>>>>>> >>>>>>>>>>> Matt >>>>>>>>>>> >>>>>>>>>>> I'd appreciate any insight into the specifics of this usage. I >>>>>>>>>>>> expect I have a misconception on the local vs global section. Thank you. >>>>>>>>>>>> >>>>>>>>>>>> Sincerely >>>>>>>>>>>> Nicholas >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> Nicholas Arnold-Medabalimi >>>>>>>>>>>> >>>>>>>>>>>> Ph.D. Candidate >>>>>>>>>>>> Computational Aeroscience Lab >>>>>>>>>>>> University of Michigan >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>>> experiments lead. >>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>> >>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Nicholas Arnold-Medabalimi >>>>>>>>>> >>>>>>>>>> Ph.D. Candidate >>>>>>>>>> Computational Aeroscience Lab >>>>>>>>>> University of Michigan >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>> experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Nicholas Arnold-Medabalimi >>>>>>>> >>>>>>>> Ph.D. Candidate >>>>>>>> Computational Aeroscience Lab >>>>>>>> University of Michigan >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Nicholas Arnold-Medabalimi >>>>>> >>>>>> Ph.D. Candidate >>>>>> Computational Aeroscience Lab >>>>>> University of Michigan >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>>> >>>> -- >>>> Nicholas Arnold-Medabalimi >>>> >>>> Ph.D. 
Candidate >>>> Computational Aeroscience Lab >>>> University of Michigan >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> >> >> -- >> Nicholas Arnold-Medabalimi >> >> Ph.D. Candidate >> Computational Aeroscience Lab >> University of Michigan >> > > > -- > Nicholas Arnold-Medabalimi > > Ph.D. Candidate > Computational Aeroscience Lab > University of Michigan > -- Nicholas Arnold-Medabalimi Ph.D. Candidate Computational Aeroscience Lab University of Michigan -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Dec 9 09:10:03 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 9 Dec 2022 10:10:03 -0500 Subject: [petsc-users] Modifying Entries Tied to Specific Points in Finite Element Stiffness Matrix using DMPlex In-Reply-To: <004801d90b59$bd4aa180$37dfe480$@utexas.edu> References: <004801d90b59$bd4aa180$37dfe480$@utexas.edu> Message-ID: On Thu, Dec 8, 2022 at 6:06 PM Hongrui Yu wrote: > Hello! I?m trying to adapt a serial Finite Element code using PETSc. In > this code it reads in special stiffness terms between the boundary DoFs > from an input file, and add them to corresponding locations in the global > Jacobian matrix. > Hmm, so in completely general locations, or on the diagonal? > I currently use a DM Plex object to store the mesh information. My > understanding is that once the DM is distributed its points are renumbered > across different ranks. > That is true. > I wonder if there is a good way to find the corresponding entries that > needs to be modified in the global Jacobian matrix? > > > > For Vectors I?m currently creating a Natural Vector and simply do > DMPlexNaturalToGlobal. Is there a way to create a ?Natural Mat? just like > ?Natural Vector? and then do some sort of NaturalToGlobal for this Mat? > > > > Any help would be highly appreciated! > If it is completely general, this will take some coding. Thanks, Matt > Kevin > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yuhongrui at utexas.edu Fri Dec 9 09:51:55 2022 From: yuhongrui at utexas.edu (Hongrui Yu) Date: Fri, 9 Dec 2022 09:51:55 -0600 Subject: [petsc-users] Modifying Entries Tied to Specific Points in Finite Element Stiffness Matrix using DMPlex In-Reply-To: References: Message-ID: Thank you for your reply! Unfortunately yes.. I?ll need to modify stiffness between nodes on the boundary so most of them are going to be in completely general location. I can create an IS after distribution using DMPlexCreatePointNumbering() but they are Global numbering. Is there a way to get a map from Natural numbering to Global numbering? I assume this is used somewhere in DMPlexNaturalToGlobal()? This way I can find the correct entry to modify. Thanks, Kevin > On Dec 9, 2022, at 09:10, Matthew Knepley wrote: > > ? >> On Thu, Dec 8, 2022 at 6:06 PM Hongrui Yu wrote: > >> Hello! I?m trying to adapt a serial Finite Element code using PETSc. In this code it reads in special stiffness terms between the boundary DoFs from an input file, and add them to corresponding locations in the global Jacobian matrix. 
>> > > Hmm, so in completely general locations, or on the diagonal? > >> I currently use a DM Plex object to store the mesh information. My understanding is that once the DM is distributed its points are renumbered across different ranks. >> > > That is true. > >> I wonder if there is a good way to find the corresponding entries that needs to be modified in the global Jacobian matrix? >> >> >> >> For Vectors I?m currently creating a Natural Vector and simply do DMPlexNaturalToGlobal. Is there a way to create a ?Natural Mat? just like ?Natural Vector? and then do some sort of NaturalToGlobal for this Mat? >> >> >> >> Any help would be highly appreciated! >> > > If it is completely general, this will take some coding. > > Thanks, > > Matt > >> Kevin >> > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Dec 9 10:10:15 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 9 Dec 2022 11:10:15 -0500 Subject: [petsc-users] Modifying Entries Tied to Specific Points in Finite Element Stiffness Matrix using DMPlex In-Reply-To: References: Message-ID: On Fri, Dec 9, 2022 at 10:51 AM Hongrui Yu wrote: > Thank you for your reply! Unfortunately yes.. I?ll need to modify > stiffness between nodes on the boundary so most of them are going to be in > completely general location. > Hmm, there is usually a better way to do this. This is a mesh and discretization dependent way to impose boundary conditions. However, it is possible. > I can create an IS after distribution using DMPlexCreatePointNumbering() > but they are Global numbering. Is there a way to get a map from Natural > numbering to Global numbering? I assume this is used somewhere in > DMPlexNaturalToGlobal()? This way I can find the correct entry to modify. > You can get the SF that maps global vectors to natural vectors: https://petsc.org/main/docs/manualpages/DM/DMGetNaturalSF/ and pull out the information https://petsc.org/main/docs/manualpages/PetscSF/PetscSFGetGraph/ The roots should be global dofs and the leaves should be natural dofs. So you look through the ilocal leaves for your natural dof and the corresponding remote will be your global dof (but the local number, so you would have to add the rStart to make it a true global dof). Thanks, MAtt > Thanks, > Kevin > > On Dec 9, 2022, at 09:10, Matthew Knepley wrote: > > ? > On Thu, Dec 8, 2022 at 6:06 PM Hongrui Yu wrote: > >> Hello! I?m trying to adapt a serial Finite Element code using PETSc. In >> this code it reads in special stiffness terms between the boundary DoFs >> from an input file, and add them to corresponding locations in the global >> Jacobian matrix. >> > > Hmm, so in completely general locations, or on the diagonal? > > >> I currently use a DM Plex object to store the mesh information. My >> understanding is that once the DM is distributed its points are renumbered >> across different ranks. >> > > That is true. > > >> I wonder if there is a good way to find the corresponding entries that >> needs to be modified in the global Jacobian matrix? >> >> >> >> For Vectors I?m currently creating a Natural Vector and simply do >> DMPlexNaturalToGlobal. Is there a way to create a ?Natural Mat? just like >> ?Natural Vector? and then do some sort of NaturalToGlobal for this Mat? 
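A rough C sketch of the lookup described above; dm and globalVec stand for the distributed DM and a global vector on it, and the sketch only handles leaves whose root lives on the same rank (an off-rank root would need the owning rank's offset instead of this rank's rStart):

  PetscSF            sfNat;
  const PetscInt    *ilocal;
  const PetscSFNode *iremote;
  PetscInt           nroots, nleaves, i, rStart, rEnd;
  PetscMPIInt        rank;

  PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));
  PetscCall(DMGetNaturalSF(dm, &sfNat));                        /* requires DMSetUseNatural() before distribution */
  PetscCall(PetscSFGetGraph(sfNat, &nroots, &nleaves, &ilocal, &iremote));
  PetscCall(VecGetOwnershipRange(globalVec, &rStart, &rEnd));
  for (i = 0; i < nleaves; ++i) {
    PetscInt natDof = ilocal ? ilocal[i] : i;                   /* leaf: dof in the natural ordering */
    if (iremote[i].rank == rank) {
      PetscInt gDof = iremote[i].index + rStart;                /* root: local dof, shifted to a true global dof */
      PetscCall(PetscPrintf(PETSC_COMM_SELF, "natural %" PetscInt_FMT " -> global %" PetscInt_FMT "\n", natDof, gDof));
    }
  }

With such a table in hand, stiffness terms read in the original (natural) numbering can be added to the Jacobian with MatSetValues() and ADD_VALUES using the translated global indices.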
>> >> >> >> Any help would be highly appreciated! >> > > If it is completely general, this will take some coding. > > Thanks, > > Matt > > >> Kevin >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From karthikeyan.chockalingam at stfc.ac.uk Fri Dec 9 13:28:09 2022 From: karthikeyan.chockalingam at stfc.ac.uk (Karthikeyan Chockalingam - STFC UKRI) Date: Fri, 9 Dec 2022 19:28:09 +0000 Subject: [petsc-users] Union of sequential vecs Message-ID: Hi, I want to take the union of a set of sequential vectors, each living in a different processor. Say, Vec_Seq1 = {2,5,7} Vec_Seq2 = {5,8,10,11} Vec_Seq3 = {5,2,12}. Finally, get the union of all them Vec = {2,5,7,8,10,11,12}. I initially wanted to create a parallel vector and insert the (sequential vector) values but I do not know, to which index to insert the values to. But I do know the total size of Vec (which in this case is 7). Any help is much appreciated. Kind regards, Karthik. This email and any attachments are intended solely for the use of the named recipients. If you are not the intended recipient you must not use, disclose, copy or distribute this email or any of its attachments and should notify the sender immediately and delete this email from your system. UK Research and Innovation (UKRI) has taken every reasonable precaution to minimise risk of this email or any attachments containing viruses or malware but the recipient should carry out its own virus and malware checks before opening the attachments. UKRI does not accept any liability for any losses or damages which the recipient may sustain due to presence of any viruses. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Dec 9 14:03:45 2022 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 9 Dec 2022 15:03:45 -0500 Subject: [petsc-users] Union of sequential vecs In-Reply-To: References: Message-ID: <4C906088-C6AB-47DE-BD77-23F4F77E9B8E@petsc.dev> How are you combining them to get Vec = {2,5,7,8,10,11,12}? Do you want the values to remain on the same MPI rank as before, just in an MPI vector? > On Dec 9, 2022, at 2:28 PM, Karthikeyan Chockalingam - STFC UKRI via petsc-users wrote: > > Hi, > > I want to take the union of a set of sequential vectors, each living in a different processor. > > Say, > Vec_Seq1 = {2,5,7} > Vec_Seq2 = {5,8,10,11} > Vec_Seq3 = {5,2,12}. > > Finally, get the union of all them Vec = {2,5,7,8,10,11,12}. > > I initially wanted to create a parallel vector and insert the (sequential vector) values but I do not know, to which index to insert the values to. But I do know the total size of Vec (which in this case is 7). > > Any help is much appreciated. > > Kind regards, > Karthik. > > > > This email and any attachments are intended solely for the use of the named recipients. If you are not the intended recipient you must not use, disclose, copy or distribute this email or any of its attachments and should notify the sender immediately and delete this email from your system. 
UK Research and Innovation (UKRI) has taken every reasonable precaution to minimise risk of this email or any attachments containing viruses or malware but the recipient should carry out its own virus and malware checks before opening the attachments. UKRI does not accept any liability for any losses or damages which the recipient may sustain due to presence of any viruses. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From karthikeyan.chockalingam at stfc.ac.uk Fri Dec 9 14:24:51 2022 From: karthikeyan.chockalingam at stfc.ac.uk (Karthikeyan Chockalingam - STFC UKRI) Date: Fri, 9 Dec 2022 20:24:51 +0000 Subject: [petsc-users] Union of sequential vecs In-Reply-To: <4C906088-C6AB-47DE-BD77-23F4F77E9B8E@petsc.dev> References: <4C906088-C6AB-47DE-BD77-23F4F77E9B8E@petsc.dev> Message-ID: That is where I am stuck, I don?t know who to combine them to get Vec = {2,5,7,8,10,11,12}. I just want them in an MPI vector. I finally plan to call VecScatterCreateToAll so that all processor gets a copy. Thank you. Kind regards, Karthik. From: Barry Smith Date: Friday, 9 December 2022 at 20:04 To: Chockalingam, Karthikeyan (STFC,DL,HC) Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Union of sequential vecs How are you combining them to get Vec = {2,5,7,8,10,11,12}? Do you want the values to remain on the same MPI rank as before, just in an MPI vector? On Dec 9, 2022, at 2:28 PM, Karthikeyan Chockalingam - STFC UKRI via petsc-users wrote: Hi, I want to take the union of a set of sequential vectors, each living in a different processor. Say, Vec_Seq1 = {2,5,7} Vec_Seq2 = {5,8,10,11} Vec_Seq3 = {5,2,12}. Finally, get the union of all them Vec = {2,5,7,8,10,11,12}. I initially wanted to create a parallel vector and insert the (sequential vector) values but I do not know, to which index to insert the values to. But I do know the total size of Vec (which in this case is 7). Any help is much appreciated. Kind regards, Karthik. This email and any attachments are intended solely for the use of the named recipients. If you are not the intended recipient you must not use, disclose, copy or distribute this email or any of its attachments and should notify the sender immediately and delete this email from your system. UK Research and Innovation (UKRI) has taken every reasonable precaution to minimise risk of this email or any attachments containing viruses or malware but the recipient should carry out its own virus and malware checks before opening the attachments. UKRI does not accept any liability for any losses or damages which the recipient may sustain due to presence of any viruses. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Dec 9 15:08:06 2022 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 9 Dec 2022 16:08:06 -0500 Subject: [petsc-users] Union of sequential vecs In-Reply-To: References: <4C906088-C6AB-47DE-BD77-23F4F77E9B8E@petsc.dev> Message-ID: If your space is pretty compact, eg, (0,12), you could create an MPI vector Q of size 13, say, and each processor can add 1.0 to Q[Vec[i]], for all "i" in my local "Vec". Then each processor can count the number of local nonzeros in Q, call it n, create a new vector, R, with local size n, then set R[i] = global index of the nonzero for each nonzero in Q, i=0:n. Do some sort of vec-scatter-to-all with R to get what you want. Does that work? 
Mark On Fri, Dec 9, 2022 at 3:25 PM Karthikeyan Chockalingam - STFC UKRI via petsc-users wrote: > That is where I am stuck, *I don?t know* who to combine them to get Vec = > {2,5,7,8,10,11,12}. > > I just want them in an MPI vector. > > > > I finally plan to call VecScatterCreateToAll so that all processor gets a > copy. > > > > Thank you. > > > > Kind regards, > > Karthik. > > > > *From: *Barry Smith > *Date: *Friday, 9 December 2022 at 20:04 > *To: *Chockalingam, Karthikeyan (STFC,DL,HC) < > karthikeyan.chockalingam at stfc.ac.uk> > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] Union of sequential vecs > > > > How are you combining them to get Vec = {2,5,7,8,10,11,12}? > > > > Do you want the values to remain on the same MPI rank as before, just in > an MPI vector? > > > > > > > > On Dec 9, 2022, at 2:28 PM, Karthikeyan Chockalingam - STFC UKRI via > petsc-users wrote: > > > > Hi, > > > > I want to take the union of a set of sequential vectors, each living in a > different processor. > > > > Say, > > Vec_Seq1 = {2,5,7} > > Vec_Seq2 = {5,8,10,11} > > Vec_Seq3 = {5,2,12}. > > > > Finally, get the union of all them Vec = {2,5,7,8,10,11,12}. > > > > I initially wanted to create a parallel vector and insert the (sequential > vector) values but I do not know, to which index to insert the values to. > But I do know the total size of Vec (which in this case is 7). > > > > Any help is much appreciated. > > > > Kind regards, > > Karthik. > > > > > > > > This email and any attachments are intended solely for the use of the > named recipients. If you are not the intended recipient you must not use, > disclose, copy or distribute this email or any of its attachments and > should notify the sender immediately and delete this email from your > system. UK Research and Innovation (UKRI) has taken every reasonable > precaution to minimise risk of this email or any attachments containing > viruses or malware but the recipient should carry out its own virus and > malware checks before opening the attachments. UKRI does not accept any > liability for any losses or damages which the recipient may sustain due to > presence of any viruses. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Dec 9 15:14:26 2022 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 9 Dec 2022 16:14:26 -0500 Subject: [petsc-users] Union of sequential vecs In-Reply-To: References: <4C906088-C6AB-47DE-BD77-23F4F77E9B8E@petsc.dev> Message-ID: Ok, so you want the unique list of integers sorted from all the seq vectors on ever MPI rank? VecScatterCreateToAll() to get all values on all ranks (make the sequential vectors MPI vectors instead). create an integer array long enough to hold all of them Use VecGetArray() and a for loop to copy all the values to the integer array, Use PetscSortRemoveDupsInt on the integer array Now each rank has all the desired values. > On Dec 9, 2022, at 3:24 PM, Karthikeyan Chockalingam - STFC UKRI wrote: > > That is where I am stuck, I don?t know who to combine them to get Vec = {2,5,7,8,10,11,12}. > I just want them in an MPI vector. > > I finally plan to call VecScatterCreateToAll so that all processor gets a copy. > > Thank you. > > Kind regards, > Karthik. > > From: Barry Smith > > Date: Friday, 9 December 2022 at 20:04 > To: Chockalingam, Karthikeyan (STFC,DL,HC) > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Union of sequential vecs > > > How are you combining them to get Vec = {2,5,7,8,10,11,12}? 
> > Do you want the values to remain on the same MPI rank as before, just in an MPI vector? > > > > > On Dec 9, 2022, at 2:28 PM, Karthikeyan Chockalingam - STFC UKRI via petsc-users > wrote: > > Hi, > > I want to take the union of a set of sequential vectors, each living in a different processor. > > Say, > Vec_Seq1 = {2,5,7} > Vec_Seq2 = {5,8,10,11} > Vec_Seq3 = {5,2,12}. > > Finally, get the union of all them Vec = {2,5,7,8,10,11,12}. > > I initially wanted to create a parallel vector and insert the (sequential vector) values but I do not know, to which index to insert the values to. But I do know the total size of Vec (which in this case is 7). > > Any help is much appreciated. > > Kind regards, > Karthik. > > > > This email and any attachments are intended solely for the use of the named recipients. If you are not the intended recipient you must not use, disclose, copy or distribute this email or any of its attachments and should notify the sender immediately and delete this email from your system. UK Research and Innovation (UKRI) has taken every reasonable precaution to minimise risk of this email or any attachments containing viruses or malware but the recipient should carry out its own virus and malware checks before opening the attachments. UKRI does not accept any liability for any losses or damages which the recipient may sustain due to presence of any viruses. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bourdin at mcmaster.ca Fri Dec 9 15:26:26 2022 From: bourdin at mcmaster.ca (Blaise Bourdin) Date: Fri, 9 Dec 2022 21:26:26 +0000 Subject: [petsc-users] Union of sequential vecs In-Reply-To: References: <4C906088-C6AB-47DE-BD77-23F4F77E9B8E@petsc.dev> Message-ID: <5629B006-DC76-4002-965D-401D7907B026@mcmaster.ca> An HTML attachment was scrubbed... URL: From karthikeyan.chockalingam at stfc.ac.uk Fri Dec 9 17:50:26 2022 From: karthikeyan.chockalingam at stfc.ac.uk (Karthikeyan Chockalingam - STFC UKRI) Date: Fri, 9 Dec 2022 23:50:26 +0000 Subject: [petsc-users] Union of sequential vecs In-Reply-To: References: <4C906088-C6AB-47DE-BD77-23F4F77E9B8E@petsc.dev> Message-ID: Thank you Mark and Barry. @Mark Adams I follow you for the most part. Shouldn?t R be an MPI Vector? Here it goes: Q[Vec[i]] for all "i" in my local (sequential) "Vec". Compute number of local nonzeros in Q, call it n. //Create a MPI Vector R, with local size n VecCreate(PETSC_COMM_WORLD, &R); VecSetType(R, VECMPI); VecSetSizes(R, n, PETSC_DECIDE); //Populate MPI Vector R with local (sequential) "Vec". VecGetOwnershipRange(R, &istart, &iend); local_size = iend - istart; //The local_size should be ?n? right? VecGetArray(R, &values); for (i = 0; i < local_size; i++) { values[i] = Vec[i]; } VecRestoreArray(R, &values); //Scatter R to all processors Vec V_SEQ; VecScatter ctx; VecScatterCreateToAll(R,&ctx,&V_SEQ); //Remove duplicates in V_SEQ How can I use PetscSortRemoveDupsInt to remove duplicates in V_SEQ? Physics behind: I am reading a parallel mesh, and want to mark all the boundary nodes. I use a local (sequential) Vec to store the boundary nodes for each parallel partition. Hence, local Vecs can end up with duplicate node index among them, which I would like to get rid of when I combine all of them together. Best, Karthik. 
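Following Barry's recipe above, a C sketch with illustrative names (the Fortran calls are analogous); R is the MPI Vec holding the node indices as scalars and Vseq plays the role of V_SEQ:

  Vec                Vseq;
  VecScatter         ctx;
  const PetscScalar *a;
  PetscInt          *nodes, n, i;

  PetscCall(VecScatterCreateToAll(R, &ctx, &Vseq));
  PetscCall(VecScatterBegin(ctx, R, Vseq, INSERT_VALUES, SCATTER_FORWARD));
  PetscCall(VecScatterEnd(ctx, R, Vseq, INSERT_VALUES, SCATTER_FORWARD));
  PetscCall(VecGetLocalSize(Vseq, &n));
  PetscCall(VecGetArrayRead(Vseq, &a));
  PetscCall(PetscMalloc1(n, &nodes));
  for (i = 0; i < n; ++i) nodes[i] = (PetscInt)PetscRealPart(a[i]); /* indices were stored as scalars */
  PetscCall(VecRestoreArrayRead(Vseq, &a));
  PetscCall(PetscSortRemoveDupsInt(&n, nodes));                     /* n now counts the unique entries */
  /* nodes[0..n-1] is the sorted union of boundary node indices, identical on every rank */
  PetscCall(PetscFree(nodes));
  PetscCall(VecScatterDestroy(&ctx));
  PetscCall(VecDestroy(&Vseq));

PetscSortRemoveDupsInt() works on a plain integer array, so the duplicates are removed after copying the gathered values out of the Vec rather than inside V_SEQ itself.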
From: Mark Adams Date: Friday, 9 December 2022 at 21:08 To: Chockalingam, Karthikeyan (STFC,DL,HC) Cc: Barry Smith , petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Union of sequential vecs If your space is pretty compact, eg, (0,12), you could create an MPI vector Q of size 13, say, and each processor can add 1.0 to Q[Vec[i]], for all "i" in my local "Vec". Then each processor can count the number of local nonzeros in Q, call it n, create a new vector, R, with local size n, then set R[i] = global index of the nonzero for each nonzero in Q, i=0:n. Do some sort of vec-scatter-to-all with R to get what you want. Does that work? Mark On Fri, Dec 9, 2022 at 3:25 PM Karthikeyan Chockalingam - STFC UKRI via petsc-users > wrote: That is where I am stuck, I don?t know who to combine them to get Vec = {2,5,7,8,10,11,12}. I just want them in an MPI vector. I finally plan to call VecScatterCreateToAll so that all processor gets a copy. Thank you. Kind regards, Karthik. From: Barry Smith > Date: Friday, 9 December 2022 at 20:04 To: Chockalingam, Karthikeyan (STFC,DL,HC) > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Union of sequential vecs How are you combining them to get Vec = {2,5,7,8,10,11,12}? Do you want the values to remain on the same MPI rank as before, just in an MPI vector? On Dec 9, 2022, at 2:28 PM, Karthikeyan Chockalingam - STFC UKRI via petsc-users > wrote: Hi, I want to take the union of a set of sequential vectors, each living in a different processor. Say, Vec_Seq1 = {2,5,7} Vec_Seq2 = {5,8,10,11} Vec_Seq3 = {5,2,12}. Finally, get the union of all them Vec = {2,5,7,8,10,11,12}. I initially wanted to create a parallel vector and insert the (sequential vector) values but I do not know, to which index to insert the values to. But I do know the total size of Vec (which in this case is 7). Any help is much appreciated. Kind regards, Karthik. This email and any attachments are intended solely for the use of the named recipients. If you are not the intended recipient you must not use, disclose, copy or distribute this email or any of its attachments and should notify the sender immediately and delete this email from your system. UK Research and Innovation (UKRI) has taken every reasonable precaution to minimise risk of this email or any attachments containing viruses or malware but the recipient should carry out its own virus and malware checks before opening the attachments. UKRI does not accept any liability for any losses or damages which the recipient may sustain due to presence of any viruses. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Dec 9 18:38:00 2022 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 9 Dec 2022 19:38:00 -0500 Subject: [petsc-users] Union of sequential vecs In-Reply-To: References: <4C906088-C6AB-47DE-BD77-23F4F77E9B8E@petsc.dev> Message-ID: On Fri, Dec 9, 2022 at 6:50 PM Karthikeyan Chockalingam - STFC UKRI < karthikeyan.chockalingam at stfc.ac.uk> wrote: > Thank you Mark and Barry. > > > > @Mark Adams I follow you for the most part. Shouldn?t R > be an MPI Vector? > yes, "R, with local size n" implies an MPI vector. > > > Here it goes: > > > > Q[Vec[i]] for all "i" in my local (sequential) "Vec". > OK, you are calling VecSetValue(Q,... ADD_VALUES ) VecAsseemblyBegin/End needs to be called and that will do communication. > Compute number of local nonzeros in Q, call it n. > You would use VecGetArray like below to do this count. 
You could cache the indices in Q that are non-zero, or redo the loop below. > > //Create a MPI Vector R, with local size n > > > > VecCreate(PETSC_COMM_WORLD, &R); > > VecSetType(R, VECMPI); > > VecSetSizes(R, *n*, PETSC_DECIDE); > > > > //Populate MPI Vector R with local (sequential) "Vec". > > > > VecGetOwnershipRange(R, &istart, &iend); > > local_size = iend - istart; //The local_size should be ?*n*? right? > Well yes, but you are going to need to get the "istart" from Q, not R, to get the global index in a loop. > VecGetArray(R, &values); > No, you want to get the values from Q and use VecSetValue(R,... VecGetArray(Q, &values_Q); VecGetOwnershipRange(R, &istart_R, NULL); idx = istart_R > *for* (i = 0; i < local_size_Q; i++) { > No, redo the loop above that you used to count. You skipped this so I can put it here. (simpler to just run this loop twice, first count and then set) If (values_Q[i] != 0) [ VecSetValue(R, idx++, (PetscScalar) istart_Q+i, INSERT_VALUES ) } > values[i] = Vec[i]; > > } > > VecRestoreArray(R, &values); > VecRestoreArray(Q, &values_Q); You don't really need a VecAsseemblyBegin/End (R) here but you can add it to be clear. I'm not sure this is correct so you need to debug this and the scatter can be figured out later. > > //Scatter R to all processors > > > > Vec V_SEQ; > > VecScatter ctx; > > > > VecScatterCreateToAll(R,&ctx,&V_SEQ); > > > I'm not sure how to best do this. Look in the docs or ask in a separate thread. This thread is busy figuring the first part out. > //Remove duplicates in V_SEQ > > How can I use PetscSortRemoveDupsInt to remove duplicates in V_SEQ? > > > > > > Physics behind: > > I am reading a parallel mesh, and want to mark all the boundary nodes. I > use a local (sequential) Vec to store the boundary nodes for each parallel > partition. Hence, local Vecs can end up with duplicate node index among > them, which I would like to get rid of when I combine all of them together. > Humm, OK, I'm not sure I get this exactly but yes this is intended to get each process a global list of (boundary) vertices. Not super scalable, but if it gets you started then that's great. Good luck, Mark > > > Best, > > Karthik. > > > > > > > > > > *From: *Mark Adams > *Date: *Friday, 9 December 2022 at 21:08 > *To: *Chockalingam, Karthikeyan (STFC,DL,HC) < > karthikeyan.chockalingam at stfc.ac.uk> > *Cc: *Barry Smith , petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *Subject: *Re: [petsc-users] Union of sequential vecs > > If your space is pretty compact, eg, (0,12), you could create an MPI > vector Q of size 13, say, and each processor can add 1.0 to Q[Vec[i]], for > all "i" in my local "Vec". > > Then each processor can count the number of local nonzeros in Q, call it > n, create a new vector, R, with local size n, then set R[i] = global index > of the nonzero for each nonzero in Q, i=0:n. > > Do some sort of vec-scatter-to-all with R to get what you want. > > > > Does that work? > > > > Mark > > > > > > On Fri, Dec 9, 2022 at 3:25 PM Karthikeyan Chockalingam - STFC UKRI via > petsc-users wrote: > > That is where I am stuck, *I don?t know* who to combine them to get Vec = > {2,5,7,8,10,11,12}. > > I just want them in an MPI vector. > > > > I finally plan to call VecScatterCreateToAll so that all processor gets a > copy. > > > > Thank you. > > > > Kind regards, > > Karthik. 
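For completeness, a very rough sketch of the indicator-vector (Q/R) construction Mark describes above, assuming the index space 0..N-1 is reasonably compact; every variable name here is a placeholder, not something from the original mails:

Vec Q, R;
const PetscScalar *q;
PetscInt istartQ, iendQ, n = 0, pos = 0, istartR;

PetscCall(VecCreateMPI(PETSC_COMM_WORLD, PETSC_DECIDE, N, &Q));
for (PetscInt i = 0; i < nLocal; i++) PetscCall(VecSetValue(Q, myIdx[i], 1.0, ADD_VALUES));
PetscCall(VecAssemblyBegin(Q));
PetscCall(VecAssemblyEnd(Q));

PetscCall(VecGetOwnershipRange(Q, &istartQ, &iendQ));
PetscCall(VecGetArrayRead(Q, &q));
for (PetscInt i = 0; i < iendQ - istartQ; i++) if (q[i] != 0.0) n++;   /* pass 1: count local nonzeros */

PetscCall(VecCreateMPI(PETSC_COMM_WORLD, n, PETSC_DETERMINE, &R));
PetscCall(VecGetOwnershipRange(R, &istartR, NULL));
for (PetscInt i = 0; i < iendQ - istartQ; i++)                         /* pass 2: record the global indices */
  if (q[i] != 0.0) PetscCall(VecSetValue(R, istartR + pos++, (PetscScalar)(istartQ + i), INSERT_VALUES));
PetscCall(VecRestoreArrayRead(Q, &q));
PetscCall(VecAssemblyBegin(R));
PetscCall(VecAssemblyEnd(R));

Each distinct index lands in R exactly once because the ownership ranges of Q partition the index space; VecScatterCreateToAll(R, ...) then gives every rank a copy. The IS-based route discussed elsewhere in the thread is shorter, though.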
> > > > *From: *Barry Smith > *Date: *Friday, 9 December 2022 at 20:04 > *To: *Chockalingam, Karthikeyan (STFC,DL,HC) < > karthikeyan.chockalingam at stfc.ac.uk> > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] Union of sequential vecs > > > > How are you combining them to get Vec = {2,5,7,8,10,11,12}? > > > > Do you want the values to remain on the same MPI rank as before, just in > an MPI vector? > > > > > > > > On Dec 9, 2022, at 2:28 PM, Karthikeyan Chockalingam - STFC UKRI via > petsc-users wrote: > > > > Hi, > > > > I want to take the union of a set of sequential vectors, each living in a > different processor. > > > > Say, > > Vec_Seq1 = {2,5,7} > > Vec_Seq2 = {5,8,10,11} > > Vec_Seq3 = {5,2,12}. > > > > Finally, get the union of all them Vec = {2,5,7,8,10,11,12}. > > > > I initially wanted to create a parallel vector and insert the (sequential > vector) values but I do not know, to which index to insert the values to. > But I do know the total size of Vec (which in this case is 7). > > > > Any help is much appreciated. > > > > Kind regards, > > Karthik. > > > > > > > > This email and any attachments are intended solely for the use of the > named recipients. If you are not the intended recipient you must not use, > disclose, copy or distribute this email or any of its attachments and > should notify the sender immediately and delete this email from your > system. UK Research and Innovation (UKRI) has taken every reasonable > precaution to minimise risk of this email or any attachments containing > viruses or malware but the recipient should carry out its own virus and > malware checks before opening the attachments. UKRI does not accept any > liability for any losses or damages which the recipient may sustain due to > presence of any viruses. > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Dec 9 20:20:00 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 9 Dec 2022 21:20:00 -0500 Subject: [petsc-users] Union of sequential vecs In-Reply-To: References: <4C906088-C6AB-47DE-BD77-23F4F77E9B8E@petsc.dev> Message-ID: On Fri, Dec 9, 2022 at 6:50 PM Karthikeyan Chockalingam - STFC UKRI via petsc-users wrote: > Thank you Mark and Barry. > > > > @Mark Adams I follow you for the most part. Shouldn?t R > be an MPI Vector? > > > > Here it goes: > > > > Q[Vec[i]] for all "i" in my local (sequential) "Vec". > > Compute number of local nonzeros in Q, call it n. > > > > //Create a MPI Vector R, with local size n > > > > VecCreate(PETSC_COMM_WORLD, &R); > > VecSetType(R, VECMPI); > > VecSetSizes(R, *n*, PETSC_DECIDE); > > > > //Populate MPI Vector R with local (sequential) "Vec". > > > > VecGetOwnershipRange(R, &istart, &iend); > > local_size = iend - istart; //The local_size should be ?*n*? right? > > VecGetArray(R, &values); > > *for* (i = 0; i < local_size; i++) { > > values[i] = Vec[i]; > > } > > VecRestoreArray(R, &values); > > > > //Scatter R to all processors > > > > Vec V_SEQ; > > VecScatter ctx; > > > > VecScatterCreateToAll(R,&ctx,&V_SEQ); > > > > //Remove duplicates in V_SEQ > > How can I use PetscSortRemoveDupsInt to remove duplicates in V_SEQ? > > > > > > Physics behind: > > I am reading a parallel mesh, and want to mark all the boundary nodes. I > use a local (sequential) Vec to store the boundary nodes for each parallel > partition. Hence, local Vecs can end up with duplicate node index among > them, which I would like to get rid of when I combine all of them together. 
> 1) Blaise is right you should use an IS, not a Vec, to hold node indices. His solution is only a few lines, so I would use it. 2) I would not recommend doing things this way in the first place. PETSc can manage parallel meshes scalably, marking boundaries using DMLabel objects. Thanks, Matt > Best, > > Karthik. > > > > > > > > > > *From: *Mark Adams > *Date: *Friday, 9 December 2022 at 21:08 > *To: *Chockalingam, Karthikeyan (STFC,DL,HC) < > karthikeyan.chockalingam at stfc.ac.uk> > *Cc: *Barry Smith , petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *Subject: *Re: [petsc-users] Union of sequential vecs > > If your space is pretty compact, eg, (0,12), you could create an MPI > vector Q of size 13, say, and each processor can add 1.0 to Q[Vec[i]], for > all "i" in my local "Vec". > > Then each processor can count the number of local nonzeros in Q, call it > n, create a new vector, R, with local size n, then set R[i] = global index > of the nonzero for each nonzero in Q, i=0:n. > > Do some sort of vec-scatter-to-all with R to get what you want. > > > > Does that work? > > > > Mark > > > > > > On Fri, Dec 9, 2022 at 3:25 PM Karthikeyan Chockalingam - STFC UKRI via > petsc-users wrote: > > That is where I am stuck, *I don?t know* who to combine them to get Vec = > {2,5,7,8,10,11,12}. > > I just want them in an MPI vector. > > > > I finally plan to call VecScatterCreateToAll so that all processor gets a > copy. > > > > Thank you. > > > > Kind regards, > > Karthik. > > > > *From: *Barry Smith > *Date: *Friday, 9 December 2022 at 20:04 > *To: *Chockalingam, Karthikeyan (STFC,DL,HC) < > karthikeyan.chockalingam at stfc.ac.uk> > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] Union of sequential vecs > > > > How are you combining them to get Vec = {2,5,7,8,10,11,12}? > > > > Do you want the values to remain on the same MPI rank as before, just in > an MPI vector? > > > > > > > > On Dec 9, 2022, at 2:28 PM, Karthikeyan Chockalingam - STFC UKRI via > petsc-users wrote: > > > > Hi, > > > > I want to take the union of a set of sequential vectors, each living in a > different processor. > > > > Say, > > Vec_Seq1 = {2,5,7} > > Vec_Seq2 = {5,8,10,11} > > Vec_Seq3 = {5,2,12}. > > > > Finally, get the union of all them Vec = {2,5,7,8,10,11,12}. > > > > I initially wanted to create a parallel vector and insert the (sequential > vector) values but I do not know, to which index to insert the values to. > But I do know the total size of Vec (which in this case is 7). > > > > Any help is much appreciated. > > > > Kind regards, > > Karthik. > > > > > > > > This email and any attachments are intended solely for the use of the > named recipients. If you are not the intended recipient you must not use, > disclose, copy or distribute this email or any of its attachments and > should notify the sender immediately and delete this email from your > system. UK Research and Innovation (UKRI) has taken every reasonable > precaution to minimise risk of this email or any attachments containing > viruses or malware but the recipient should carry out its own virus and > malware checks before opening the attachments. UKRI does not accept any > liability for any losses or damages which the recipient may sustain due to > presence of any viruses. > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksi2443 at gmail.com Mon Dec 12 05:16:06 2022 From: ksi2443 at gmail.com (=?UTF-8?B?6rmA7ISx7J21?=) Date: Mon, 12 Dec 2022 20:16:06 +0900 Subject: [petsc-users] parallelize matrix assembly process Message-ID: Hello, I need some keyword or some examples for parallelizing matrix assemble process. My current state is as below. - Finite element analysis code for Structural mechanics. - problem size : 3D solid hexa element (number of elements : 125,000), number of degree of freedom : 397,953 - Matrix type : seqaij, matrix set preallocation by using MatSeqAIJSetPreallocation - Matrix assemble time by using 1 core : 120 sec for (int i=0; i<125000; i++) { ~~ element matrix calculation} matassemblybegin matassemblyend - Matrix assemble time by using 8 core : 70,234sec int start, end; VecGetOwnershipRange( element_vec, &start, &end); for (int i=start; i From mfadams at lbl.gov Mon Dec 12 07:24:18 2022 From: mfadams at lbl.gov (Mark Adams) Date: Mon, 12 Dec 2022 08:24:18 -0500 Subject: [petsc-users] parallelize matrix assembly process In-Reply-To: References: Message-ID: Hi Hyung, First, verify that you are preallocating correctly. Run with '-info' and grep on "alloc" in the large output that you get. You will see lines like "number of mallocs in assembly: 0". You want 0. Do this with one processor and the 8. I don't understand your loop. You are iterating over vertices. You want to iterate over elements. Mark On Mon, Dec 12, 2022 at 6:16 AM ??? wrote: > Hello, > > > I need some keyword or some examples for parallelizing matrix assemble > process. > > My current state is as below. > - Finite element analysis code for Structural mechanics. > - problem size : 3D solid hexa element (number of elements : 125,000), > number of degree of freedom : 397,953 > - Matrix type : seqaij, matrix set preallocation by using > MatSeqAIJSetPreallocation > - Matrix assemble time by using 1 core : 120 sec > for (int i=0; i<125000; i++) { > ~~ element matrix calculation} > matassemblybegin > matassemblyend > - Matrix assemble time by using 8 core : 70,234sec > int start, end; > VecGetOwnershipRange( element_vec, &start, &end); > for (int i=start; i ~~ element matrix calculation > matassemblybegin > matassemblyend > > > As you see the state, the parallel case spent a lot of time than > sequential case.. > How can I speed up in this case? > Can I get some keyword or examples for parallelizing assembly of matrix in > finite element analysis ? > > Thanks, > Hyung Kim > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksi2443 at gmail.com Mon Dec 12 08:44:20 2022 From: ksi2443 at gmail.com (=?UTF-8?B?6rmA7ISx7J21?=) Date: Mon, 12 Dec 2022 23:44:20 +0900 Subject: [petsc-users] parallelize matrix assembly process In-Reply-To: References: Message-ID: Hello Mark, Following your comments, I did run with '-info' and the outputs are as below [image: image.png] Global matrix seem to have preallocated well enough And, ass I said earlier in the former email, If I run this code with mpi , It will be 70,000secs.. In this case, What is the problem? And my loops already iterates over elements. element_vec is just 1~125,000 array. for getting proper element index in each process. That example code is just simple schematic of my code. Thanks, Hyung Kim 2022? 12? 12? (?) ?? 10:24, Mark Adams ?? ??: > Hi Hyung, > > First, verify that you are preallocating correctly. 
> Run with '-info' and grep on "alloc" in the large output that you get. > You will see lines like "number of mallocs in assembly: 0". You want 0. > Do this with one processor and the 8. > > I don't understand your loop. You are iterating over vertices. You want to > iterate over elements. > > Mark > > > > On Mon, Dec 12, 2022 at 6:16 AM ??? wrote: > >> Hello, >> >> >> I need some keyword or some examples for parallelizing matrix assemble >> process. >> >> My current state is as below. >> - Finite element analysis code for Structural mechanics. >> - problem size : 3D solid hexa element (number of elements : 125,000), >> number of degree of freedom : 397,953 >> - Matrix type : seqaij, matrix set preallocation by using >> MatSeqAIJSetPreallocation >> - Matrix assemble time by using 1 core : 120 sec >> for (int i=0; i<125000; i++) { >> ~~ element matrix calculation} >> matassemblybegin >> matassemblyend >> - Matrix assemble time by using 8 core : 70,234sec >> int start, end; >> VecGetOwnershipRange( element_vec, &start, &end); >> for (int i=start; i> ~~ element matrix calculation >> matassemblybegin >> matassemblyend >> >> >> As you see the state, the parallel case spent a lot of time than >> sequential case.. >> How can I speed up in this case? >> Can I get some keyword or examples for parallelizing assembly of matrix >> in finite element analysis ? >> >> Thanks, >> Hyung Kim >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 44388 bytes Desc: not available URL: From karthikeyan.chockalingam at stfc.ac.uk Mon Dec 12 09:00:43 2022 From: karthikeyan.chockalingam at stfc.ac.uk (Karthikeyan Chockalingam - STFC UKRI) Date: Mon, 12 Dec 2022 15:00:43 +0000 Subject: [petsc-users] Union of sequential vecs In-Reply-To: References: <4C906088-C6AB-47DE-BD77-23F4F77E9B8E@petsc.dev> Message-ID: Thank you Matt and Blaise. I will try out IS (though I have not it before) (i) Can IS be of different size, on different processors, and still call ISALLGather? (ii) Can IS be passed as row indices to MatZeroRowsColumns? I will look into DMlabels and start a different thread if needed. Best, Karthik. From: Matthew Knepley Date: Saturday, 10 December 2022 at 02:20 To: Chockalingam, Karthikeyan (STFC,DL,HC) Cc: Mark Adams , petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Union of sequential vecs On Fri, Dec 9, 2022 at 6:50 PM Karthikeyan Chockalingam - STFC UKRI via petsc-users > wrote: Thank you Mark and Barry. @Mark Adams I follow you for the most part. Shouldn?t R be an MPI Vector? Here it goes: Q[Vec[i]] for all "i" in my local (sequential) "Vec". Compute number of local nonzeros in Q, call it n. //Create a MPI Vector R, with local size n VecCreate(PETSC_COMM_WORLD, &R); VecSetType(R, VECMPI); VecSetSizes(R, n, PETSC_DECIDE); //Populate MPI Vector R with local (sequential) "Vec". VecGetOwnershipRange(R, &istart, &iend); local_size = iend - istart; //The local_size should be ?n? right? VecGetArray(R, &values); for (i = 0; i < local_size; i++) { values[i] = Vec[i]; } VecRestoreArray(R, &values); //Scatter R to all processors Vec V_SEQ; VecScatter ctx; VecScatterCreateToAll(R,&ctx,&V_SEQ); //Remove duplicates in V_SEQ How can I use PetscSortRemoveDupsInt to remove duplicates in V_SEQ? Physics behind: I am reading a parallel mesh, and want to mark all the boundary nodes. I use a local (sequential) Vec to store the boundary nodes for each parallel partition. 
Hence, local Vecs can end up with duplicate node index among them, which I would like to get rid of when I combine all of them together. 1) Blaise is right you should use an IS, not a Vec, to hold node indices. His solution is only a few lines, so I would use it. 2) I would not recommend doing things this way in the first place. PETSc can manage parallel meshes scalably, marking boundaries using DMLabel objects. Thanks, Matt Best, Karthik. From: Mark Adams > Date: Friday, 9 December 2022 at 21:08 To: Chockalingam, Karthikeyan (STFC,DL,HC) > Cc: Barry Smith >, petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Union of sequential vecs If your space is pretty compact, eg, (0,12), you could create an MPI vector Q of size 13, say, and each processor can add 1.0 to Q[Vec[i]], for all "i" in my local "Vec". Then each processor can count the number of local nonzeros in Q, call it n, create a new vector, R, with local size n, then set R[i] = global index of the nonzero for each nonzero in Q, i=0:n. Do some sort of vec-scatter-to-all with R to get what you want. Does that work? Mark On Fri, Dec 9, 2022 at 3:25 PM Karthikeyan Chockalingam - STFC UKRI via petsc-users > wrote: That is where I am stuck, I don?t know who to combine them to get Vec = {2,5,7,8,10,11,12}. I just want them in an MPI vector. I finally plan to call VecScatterCreateToAll so that all processor gets a copy. Thank you. Kind regards, Karthik. From: Barry Smith > Date: Friday, 9 December 2022 at 20:04 To: Chockalingam, Karthikeyan (STFC,DL,HC) > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Union of sequential vecs How are you combining them to get Vec = {2,5,7,8,10,11,12}? Do you want the values to remain on the same MPI rank as before, just in an MPI vector? On Dec 9, 2022, at 2:28 PM, Karthikeyan Chockalingam - STFC UKRI via petsc-users > wrote: Hi, I want to take the union of a set of sequential vectors, each living in a different processor. Say, Vec_Seq1 = {2,5,7} Vec_Seq2 = {5,8,10,11} Vec_Seq3 = {5,2,12}. Finally, get the union of all them Vec = {2,5,7,8,10,11,12}. I initially wanted to create a parallel vector and insert the (sequential vector) values but I do not know, to which index to insert the values to. But I do know the total size of Vec (which in this case is 7). Any help is much appreciated. Kind regards, Karthik. This email and any attachments are intended solely for the use of the named recipients. If you are not the intended recipient you must not use, disclose, copy or distribute this email or any of its attachments and should notify the sender immediately and delete this email from your system. UK Research and Innovation (UKRI) has taken every reasonable precaution to minimise risk of this email or any attachments containing viruses or malware but the recipient should carry out its own virus and malware checks before opening the attachments. UKRI does not accept any liability for any losses or damages which the recipient may sustain due to presence of any viruses. -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
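As a pointer for the DMLabel route mentioned above, a rough sketch of how boundary points are typically marked on a DMPlex (dm, label and bdIS are illustrative names; this is not code from the thread):

DMLabel label;
IS      bdIS;
PetscCall(DMCreateLabel(dm, "boundary"));
PetscCall(DMGetLabel(dm, "boundary", &label));
PetscCall(DMPlexMarkBoundaryFaces(dm, 1, label));  /* mark exterior faces with value 1 */
PetscCall(DMPlexLabelComplete(dm, label));         /* also mark the vertices/edges in their closure */
PetscCall(DMLabelGetStratumIS(label, 1, &bdIS));   /* the locally marked points as an IS */

Because the label lives on the already-distributed mesh, each rank knows its own boundary points without any gather step.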
URL: From bsmith at petsc.dev Mon Dec 12 09:50:38 2022 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 12 Dec 2022 10:50:38 -0500 Subject: [petsc-users] parallelize matrix assembly process In-Reply-To: References: Message-ID: <5485EB43-B786-4764-949E-F2F21687DB32@petsc.dev> The problem is possibly due to most elements being computed on "wrong" MPI rank and thus requiring almost all the matrix entries to be "stashed" when computed and then sent off to the owning MPI rank. Please send ALL the output of a parallel run with -info so we can see how much communication is done in the matrix assembly. Barry > On Dec 12, 2022, at 6:16 AM, ??? wrote: > > Hello, > > > I need some keyword or some examples for parallelizing matrix assemble process. > > My current state is as below. > - Finite element analysis code for Structural mechanics. > - problem size : 3D solid hexa element (number of elements : 125,000), number of degree of freedom : 397,953 > - Matrix type : seqaij, matrix set preallocation by using MatSeqAIJSetPreallocation > - Matrix assemble time by using 1 core : 120 sec > for (int i=0; i<125000; i++) { > ~~ element matrix calculation} > matassemblybegin > matassemblyend > - Matrix assemble time by using 8 core : 70,234sec > int start, end; > VecGetOwnershipRange( element_vec, &start, &end); > for (int i=start; i ~~ element matrix calculation > matassemblybegin > matassemblyend > > > As you see the state, the parallel case spent a lot of time than sequential case.. > How can I speed up in this case? > Can I get some keyword or examples for parallelizing assembly of matrix in finite element analysis ? > > Thanks, > Hyung Kim > From junchao.zhang at gmail.com Mon Dec 12 10:42:58 2022 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Mon, 12 Dec 2022 10:42:58 -0600 Subject: [petsc-users] parallelize matrix assembly process In-Reply-To: References: Message-ID: Since you run with multiple ranks, you should use matrix type mpiaij and MatMPIAIJSetPreallocation. If preallocation is difficult to estimate, you can use MatPreallocator, see an example at https://gitlab.com/petsc/petsc/-/blob/main/src/mat/tests/ex230.c --Junchao Zhang On Mon, Dec 12, 2022 at 5:16 AM ??? wrote: > Hello, > > > I need some keyword or some examples for parallelizing matrix assemble > process. > > My current state is as below. > - Finite element analysis code for Structural mechanics. > - problem size : 3D solid hexa element (number of elements : 125,000), > number of degree of freedom : 397,953 > - Matrix type : seqaij, matrix set preallocation by using > MatSeqAIJSetPreallocation > - Matrix assemble time by using 1 core : 120 sec > for (int i=0; i<125000; i++) { > ~~ element matrix calculation} > matassemblybegin > matassemblyend > - Matrix assemble time by using 8 core : 70,234sec > int start, end; > VecGetOwnershipRange( element_vec, &start, &end); > for (int i=start; i ~~ element matrix calculation > matassemblybegin > matassemblyend > > > As you see the state, the parallel case spent a lot of time than > sequential case.. > How can I speed up in this case? > Can I get some keyword or examples for parallelizing assembly of matrix in > finite element analysis ? > > Thanks, > Hyung Kim > > -------------- next part -------------- An HTML attachment was scrubbed... 
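A sketch of the preallocation being suggested here (not code from the thread; nrows_local, d_nnz and o_nnz are placeholders the application must fill with its own per-row counts):

Mat A;
PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
PetscCall(MatSetSizes(A, nrows_local, nrows_local, PETSC_DETERMINE, PETSC_DETERMINE));
PetscCall(MatSetType(A, MATAIJ));                    /* seqaij on one rank, mpiaij in parallel */
/* d_nnz[i]/o_nnz[i]: nonzeros of local row i inside/outside the diagonal block */
PetscCall(MatSeqAIJSetPreallocation(A, 0, d_nnz));
PetscCall(MatMPIAIJSetPreallocation(A, 0, d_nnz, 0, o_nnz));
/* whichever preallocation call does not match the actual type is ignored */

If the exact counts are hard to compute up front, the MatPreallocator object mentioned above (see src/mat/tests/ex230.c) lets you run the insertion loop once against a throwaway MATPREALLOCATOR matrix and then copy the resulting pattern over with MatPreallocatorPreallocate() before the real assembly.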
URL: From pjool at dtu.dk Mon Dec 12 12:31:38 2022 From: pjool at dtu.dk (=?iso-8859-1?Q?Peder_J=F8rgensgaard_Olesen?=) Date: Mon, 12 Dec 2022 18:31:38 +0000 Subject: [petsc-users] Insert one sparse matrix as a block in another Message-ID: <58c80ce3b8414d57b421ac9bb8e1697b@dtu.dk> Hello I have a set of sparse matrices (A1, A2, ...) , and need to generate a larger matrix B with these as submatrices. I do not know the precise sparse layouts of the A's (only that each row has one or two non-zero values), and extracting all values to copy into B seems incredibly wasteful. How can I make use of the sparsity to solve this efficiently? Thanks, Peder [http://www.dtu.dk/-/media/DTU_Generelt/Andet/mail-signature-logo.png] Peder J?rgensgaard Olesen PhD student Department of Civil and Mechanical Engineering pjool at mek.dtu.dk Koppels All? Building 403, room 105 2800 Kgs. Lyngby www.dtu.dk/english -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Mon Dec 12 16:12:09 2022 From: mfadams at lbl.gov (Mark Adams) Date: Mon, 12 Dec 2022 17:12:09 -0500 Subject: [petsc-users] Insert one sparse matrix as a block in another In-Reply-To: <58c80ce3b8414d57b421ac9bb8e1697b@dtu.dk> References: <58c80ce3b8414d57b421ac9bb8e1697b@dtu.dk> Message-ID: Do you know what kind of solver works well for this problem? You probably want to figure that out first and not worry about efficiency. MATCOMPOSITE does what you want but not all solvers will work with it. Where does this problem come from? We have a lot of experience and might know something. Mark On Mon, Dec 12, 2022 at 1:33 PM Peder J?rgensgaard Olesen via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello > > > I have a set of sparse matrices (A1, A2, ...) , and need to generate a > larger matrix B with these as submatrices. I do not know the precise sparse > layouts of the A's (only that each row has one or two non-zero values), > and extracting *all* values to copy into B seems incredibly wasteful. How > can I make use of the sparsity to solve this efficiently? > > > Thanks, > > Peder > > > > Peder J?rgensgaard Olesen > PhD student > Department of Civil and Mechanical Engineering > > pjool at mek.dtu.dk > Koppels All? > Building 403, room 105 > 2800 Kgs. Lyngby > www.dtu.dk/english > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Mon Dec 12 16:23:57 2022 From: jed at jedbrown.org (Jed Brown) Date: Mon, 12 Dec 2022 15:23:57 -0700 Subject: [petsc-users] Insert one sparse matrix as a block in another In-Reply-To: References: <58c80ce3b8414d57b421ac9bb8e1697b@dtu.dk> Message-ID: <87pmcontvm.fsf@jedbrown.org> The description matches MATNEST (MATCOMPOSITE is for a sum or product of matrices) or parallel decompositions. Also consider the assembly style of src/snes/tutorials/ex28.c, which can create either a monolithic or block (MATNEST) matrix without extra storage or conversion costs. Mark Adams writes: > Do you know what kind of solver works well for this problem? > > You probably want to figure that out first and not worry about efficiency. > > MATCOMPOSITE does what you want but not all solvers will work with it. > > Where does this problem come from? We have a lot of experience and might > know something. > > Mark > > On Mon, Dec 12, 2022 at 1:33 PM Peder J?rgensgaard Olesen via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Hello >> >> >> I have a set of sparse matrices (A1, A2, ...) 
, and need to generate a >> larger matrix B with these as submatrices. I do not know the precise sparse >> layouts of the A's (only that each row has one or two non-zero values), >> and extracting *all* values to copy into B seems incredibly wasteful. How >> can I make use of the sparsity to solve this efficiently? >> >> >> Thanks, >> >> Peder >> >> >> >> Peder J?rgensgaard Olesen >> PhD student >> Department of Civil and Mechanical Engineering >> >> pjool at mek.dtu.dk >> Koppels All? >> Building 403, room 105 >> 2800 Kgs. Lyngby >> www.dtu.dk/english >> >> From knepley at gmail.com Mon Dec 12 16:58:02 2022 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 12 Dec 2022 17:58:02 -0500 Subject: [petsc-users] Insert one sparse matrix as a block in another In-Reply-To: <87pmcontvm.fsf@jedbrown.org> References: <58c80ce3b8414d57b421ac9bb8e1697b@dtu.dk> <87pmcontvm.fsf@jedbrown.org> Message-ID: On Mon, Dec 12, 2022 at 5:24 PM Jed Brown wrote: > The description matches MATNEST (MATCOMPOSITE is for a sum or product of > matrices) or parallel decompositions. Also consider the assembly style of > src/snes/tutorials/ex28.c, which can create either a monolithic or block > (MATNEST) matrix without extra storage or conversion costs. > I will just say a few words about ex28. The idea is that if you are already calling MatSetValues() to assemble your submatrices, then you can use MatSetValuesLocal() to remap those locations into locations in the large matrix, using a LocalToGlobalMap. This allows you to choose either a standard AIJ matrix (which supports factorizations for example), or a MatNest object that supports fast extraction of the blocks. Thanks, Matt > Mark Adams writes: > > > Do you know what kind of solver works well for this problem? > > > > You probably want to figure that out first and not worry about > efficiency. > > > > MATCOMPOSITE does what you want but not all solvers will work with it. > > > > Where does this problem come from? We have a lot of experience and might > > know something. > > > > Mark > > > > On Mon, Dec 12, 2022 at 1:33 PM Peder J?rgensgaard Olesen via > petsc-users < > > petsc-users at mcs.anl.gov> wrote: > > > >> Hello > >> > >> > >> I have a set of sparse matrices (A1, A2, ...) , and need to generate a > >> larger matrix B with these as submatrices. I do not know the precise > sparse > >> layouts of the A's (only that each row has one or two non-zero values), > >> and extracting *all* values to copy into B seems incredibly wasteful. > How > >> can I make use of the sparsity to solve this efficiently? > >> > >> > >> Thanks, > >> > >> Peder > >> > >> > >> > >> Peder J?rgensgaard Olesen > >> PhD student > >> Department of Civil and Mechanical Engineering > >> > >> pjool at mek.dtu.dk > >> Koppels All? > >> Building 403, room 105 > >> 2800 Kgs. Lyngby > >> www.dtu.dk/english > >> > >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
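A small sketch of the local-to-global remapping described here (illustrative names throughout; the authoritative version is src/snes/tutorials/ex28.c):

/* globalRowOfA1[k] = row of the big matrix B that row k of submatrix A1 maps to (placeholder array) */
ISLocalToGlobalMapping map;
PetscCall(ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD, 1, nA1, globalRowOfA1, PETSC_COPY_VALUES, &map));
PetscCall(MatSetLocalToGlobalMapping(B, map, map));  /* same map used for rows and columns here */
/* the existing assembly loop for A1 can then write straight into B: */
PetscCall(MatSetValuesLocal(B, nr, rowsOfA1, nc, colsOfA1, vals, ADD_VALUES));

With this arrangement B can be an ordinary AIJ matrix, or a MATNEST built with MatCreateNest() from the array of blocks when fast access to the individual A's matters more, without touching the assembly loop.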
URL: From knepley at gmail.com Mon Dec 12 17:00:51 2022 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 12 Dec 2022 18:00:51 -0500 Subject: [petsc-users] Union of sequential vecs In-Reply-To: References: <4C906088-C6AB-47DE-BD77-23F4F77E9B8E@petsc.dev> Message-ID: On Mon, Dec 12, 2022 at 10:00 AM Karthikeyan Chockalingam - STFC UKRI < karthikeyan.chockalingam at stfc.ac.uk> wrote: > Thank you Matt and Blaise. > > > > I will try out IS (though I have not it before) > > (i) Can IS be of different size, on different > processors, and still call ISALLGather? > Yes. > (ii) Can IS be passed as row indices to MatZeroRowsColumns? > https://petsc.org/main/docs/manualpages/Mat/MatZeroRowsColumnsIS/ Thanks, Matt > I will look into DMlabels and start a different thread if needed. > > > > Best, > > Karthik. > > > > > > > > *From: *Matthew Knepley > *Date: *Saturday, 10 December 2022 at 02:20 > *To: *Chockalingam, Karthikeyan (STFC,DL,HC) < > karthikeyan.chockalingam at stfc.ac.uk> > *Cc: *Mark Adams , petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *Subject: *Re: [petsc-users] Union of sequential vecs > > On Fri, Dec 9, 2022 at 6:50 PM Karthikeyan Chockalingam - STFC UKRI via > petsc-users wrote: > > Thank you Mark and Barry. > > > > @Mark Adams I follow you for the most part. Shouldn?t R > be an MPI Vector? > > > > Here it goes: > > > > Q[Vec[i]] for all "i" in my local (sequential) "Vec". > > Compute number of local nonzeros in Q, call it n. > > > > //Create a MPI Vector R, with local size n > > > > VecCreate(PETSC_COMM_WORLD, &R); > > VecSetType(R, VECMPI); > > VecSetSizes(R, *n*, PETSC_DECIDE); > > > > //Populate MPI Vector R with local (sequential) "Vec". > > > > VecGetOwnershipRange(R, &istart, &iend); > > local_size = iend - istart; //The local_size should be ?*n*? right? > > VecGetArray(R, &values); > > *for* (i = 0; i < local_size; i++) { > > values[i] = Vec[i]; > > } > > VecRestoreArray(R, &values); > > > > //Scatter R to all processors > > > > Vec V_SEQ; > > VecScatter ctx; > > > > VecScatterCreateToAll(R,&ctx,&V_SEQ); > > > > //Remove duplicates in V_SEQ > > How can I use PetscSortRemoveDupsInt to remove duplicates in V_SEQ? > > > > > > Physics behind: > > I am reading a parallel mesh, and want to mark all the boundary nodes. I > use a local (sequential) Vec to store the boundary nodes for each parallel > partition. Hence, local Vecs can end up with duplicate node index among > them, which I would like to get rid of when I combine all of them together. > > > > 1) Blaise is right you should use an IS, not a Vec, to hold node indices. > His solution is only a few lines, so I would use it. > > > > 2) I would not recommend doing things this way in the first place. PETSc > can manage parallel meshes scalably, marking boundaries using DMLabel > objects. > > > > Thanks, > > > > Matt > > > > Best, > > Karthik. > > > > > > > > > > *From: *Mark Adams > *Date: *Friday, 9 December 2022 at 21:08 > *To: *Chockalingam, Karthikeyan (STFC,DL,HC) < > karthikeyan.chockalingam at stfc.ac.uk> > *Cc: *Barry Smith , petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *Subject: *Re: [petsc-users] Union of sequential vecs > > If your space is pretty compact, eg, (0,12), you could create an MPI > vector Q of size 13, say, and each processor can add 1.0 to Q[Vec[i]], for > all "i" in my local "Vec". 
> > Then each processor can count the number of local nonzeros in Q, call it > n, create a new vector, R, with local size n, then set R[i] = global index > of the nonzero for each nonzero in Q, i=0:n. > > Do some sort of vec-scatter-to-all with R to get what you want. > > > > Does that work? > > > > Mark > > > > > > On Fri, Dec 9, 2022 at 3:25 PM Karthikeyan Chockalingam - STFC UKRI via > petsc-users wrote: > > That is where I am stuck, *I don?t know* who to combine them to get Vec = > {2,5,7,8,10,11,12}. > > I just want them in an MPI vector. > > > > I finally plan to call VecScatterCreateToAll so that all processor gets a > copy. > > > > Thank you. > > > > Kind regards, > > Karthik. > > > > *From: *Barry Smith > *Date: *Friday, 9 December 2022 at 20:04 > *To: *Chockalingam, Karthikeyan (STFC,DL,HC) < > karthikeyan.chockalingam at stfc.ac.uk> > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] Union of sequential vecs > > > > How are you combining them to get Vec = {2,5,7,8,10,11,12}? > > > > Do you want the values to remain on the same MPI rank as before, just in > an MPI vector? > > > > > > > > On Dec 9, 2022, at 2:28 PM, Karthikeyan Chockalingam - STFC UKRI via > petsc-users wrote: > > > > Hi, > > > > I want to take the union of a set of sequential vectors, each living in a > different processor. > > > > Say, > > Vec_Seq1 = {2,5,7} > > Vec_Seq2 = {5,8,10,11} > > Vec_Seq3 = {5,2,12}. > > > > Finally, get the union of all them Vec = {2,5,7,8,10,11,12}. > > > > I initially wanted to create a parallel vector and insert the (sequential > vector) values but I do not know, to which index to insert the values to. > But I do know the total size of Vec (which in this case is 7). > > > > Any help is much appreciated. > > > > Kind regards, > > Karthik. > > > > > > > > This email and any attachments are intended solely for the use of the > named recipients. If you are not the intended recipient you must not use, > disclose, copy or distribute this email or any of its attachments and > should notify the sender immediately and delete this email from your > system. UK Research and Innovation (UKRI) has taken every reasonable > precaution to minimise risk of this email or any attachments containing > viruses or malware but the recipient should carry out its own virus and > malware checks before opening the attachments. UKRI does not accept any > liability for any losses or damages which the recipient may sustain due to > presence of any viruses. > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yann.jobic at univ-amu.fr Mon Dec 12 17:05:29 2022 From: yann.jobic at univ-amu.fr (Yann Jobic) Date: Tue, 13 Dec 2022 00:05:29 +0100 Subject: [petsc-users] realCoords for DOFs Message-ID: Hi all, I'm trying to get the coords of the dofs of a DMPlex for a PetscFE discretization, for orders greater than 1. 
I'm struggling to run dm/impls/plex/tutorials/ex8.c I've got the following error (with option -view_coord) : [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Object is in wrong state [0]PETSC ERROR: DMGetCoordinatesLocalSetUp() has not been called [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.18.2, unknown [0]PETSC ERROR: /home/jobic/projet/fe-utils/petsc/fetools/cmake-build-debug/ex_closure_petsc on a named luke by jobic Mon Dec 12 23:34:37 2022 [0]PETSC ERROR: Configure options --prefix=/local/lib/petsc/3.18/p3/gcc/nompi_hdf5 --with-mpi=0 --with-debugging=1 --with-blacs=1 --download-zlib,--download-p4est --download-hdf5=1 --download-triangle=1 --with-single-library=0 --with-large-file-io=1 --with-shared-libraries=0 -CFLAGS=" -g -O0" -CXXFLAGS=" -g -O0" -FFLAGS=" -g -O0" PETSC_ARCH=nompi_gcc_hdf5 [0]PETSC ERROR: #1 DMGetCoordinatesLocalNoncollective() at /home/devel/src_linux/petsc-3.18.0/src/dm/interface/dmcoordinates.c:621 [0]PETSC ERROR: #2 DMPlexGetCellCoordinates() at /home/devel/src_linux/petsc-3.18.0/src/dm/impls/plex/plexgeometry.c:1291 [0]PETSC ERROR: #3 main() at /home/jobic/projet/fe-utils/petsc/fetools/src/ex_closure_petsc.c:86 [0]PETSC ERROR: PETSc Option Table entries: [0]PETSC ERROR: -dm_plex_box_faces 2,2 [0]PETSC ERROR: -dm_plex_dim 2 [0]PETSC ERROR: -dm_plex_simplex 0 [0]PETSC ERROR: -petscspace_degree 1 [0]PETSC ERROR: -view_coord [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- Maybe i've done something wrong ? Moreover, i don't quite understand the function DMPlexGetLocalOffsets, and how to use it with DMGetCoordinatesLocal. It seems that DMGetCoordinatesLocal do not have the coords of the dofs, but only the nodes defining the geometry. I've made some small modifications of ex8.c, but i still have an error : [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Invalid argument [0]PETSC ERROR: Wrong type of object: Parameter # 1 [0]PETSC ERROR: WARNING! There are option(s) set that were not used! Could be the program crashed before they were used or a spelling mistake, etc! [0]PETSC ERROR: Option left: name:-sol value: vtk:sol.vtu [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.18.2, unknown [0]PETSC ERROR: /home/jobic/projet/fe-utils/petsc/fetools/cmake-build-debug/view_coords on a named luke by jobic Mon Dec 12 23:51:05 2022 [0]PETSC ERROR: Configure options --prefix=/local/lib/petsc/3.18/p3/gcc/nompi_hdf5 --with-mpi=0 --with-debugging=1 --with-blacs=1 --download-zlib,--download-p4est --download-hdf5=1 --download-triangle=1 --with-single-library=0 --with-large-file-io=1 --with-shared-libraries=0 -CFLAGS=" -g -O0" -CXXFLAGS=" -g -O0" -FFLAGS=" -g -O0" PETSC_ARCH=nompi_gcc_hdf5 [0]PETSC ERROR: #1 PetscFEGetHeightSubspace() at /home/devel/src_linux/petsc-3.18.0/src/dm/dt/fe/interface/fe.c:1692 [0]PETSC ERROR: #2 DMPlexGetLocalOffsets() at /home/devel/src_linux/petsc-3.18.0/src/dm/impls/plex/plexceed.c:98 [0]PETSC ERROR: #3 ViewOffsets() at /home/jobic/projet/fe-utils/petsc/fetools/src/view_coords.c:28 [0]PETSC ERROR: #4 main() at /home/jobic/projet/fe-utils/petsc/fetools/src/view_coords.c:99 [0]PETSC ERROR: PETSc Option Table entries: [0]PETSC ERROR: -heat_petscspace_degree 2 [0]PETSC ERROR: -sol vtk:sol.vtu [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- Is dm/impls/plex/tutorials/ex8.c a good example for viewing the coords of the dofs of a DMPlex ? Thanks, Yann -------------- next part -------------- static char help[] = "Test the function \n\n"; /* run with -heat_petscspace_degree 2 -sol vtk:sol.vtu */ /* En fait petsc, pour le heat_petscspace_degree 0 garde le maillage ori, traingulaire 3 sommets */ /* Pour les ordres superieurs, il raffine les elements, donc a 2 fois plus d'elements par directions */ /* et garde de l'ordre 1 en geometrie */ /* Il faut tester avec HDF5 voir si ce format ne permet pas des elements de plus haut degres */ #include "petscdm.h" #include #include #include #include "petscksp.h" /* simple gaussian centered at (0.5,0.5) */ static void gaussian(PetscInt dim, PetscInt Nf, PetscInt NfAux, const PetscInt uOff[], const PetscInt uOff_x[], const PetscScalar u[], const PetscScalar u_t[], const PetscScalar u_x[], const PetscInt aOff[], const PetscInt aOff_x[], const PetscScalar a[], const PetscScalar a_t[], const PetscScalar a_x[], PetscReal t, const PetscReal x[], PetscInt numConstants, const PetscScalar constants[], PetscScalar uexact[]) { uexact[0] = PetscExpReal(-(PetscPowReal(x[0]-0.5,2)+PetscPowReal(x[1]-0.5,2))/0.05); } static PetscErrorCode ViewOffsets(DM dm, Vec X) { PetscInt num_elem, elem_size, num_comp, num_dof; PetscInt *elem_restr_offsets; const PetscScalar *x = NULL; const char *name; PetscFunctionBegin; PetscCall(PetscObjectGetName((PetscObject)dm, &name)); PetscCall(DMPlexGetLocalOffsets(dm, NULL, 0, 0, 0, &num_elem, &elem_size, &num_comp, &num_dof, &elem_restr_offsets)); PetscCall(PetscPrintf(PETSC_COMM_SELF, "DM %s offsets: num_elem %" PetscInt_FMT ", size %" PetscInt_FMT ", comp %" PetscInt_FMT ", dof %" PetscInt_FMT "\n", name, num_elem, elem_size, num_comp, num_dof)); if (X) PetscCall(VecGetArrayRead(X, &x)); for (PetscInt c = 0; c < num_elem; c++) { PetscCall(PetscIntView(elem_size, &elem_restr_offsets[c * elem_size], PETSC_VIEWER_STDOUT_SELF)); if (x) { for (PetscInt i = 0; i < elem_size; i++) PetscCall(PetscScalarView(num_comp, &x[elem_restr_offsets[c * elem_size + i]], PETSC_VIEWER_STDERR_SELF)); } } if (X) PetscCall(VecRestoreArrayRead(X, &x)); PetscCall(PetscFree(elem_restr_offsets)); PetscFunctionReturn(0); } int main(int argc, char **argv) { DM dm; Vec u; PetscFE fe; PetscQuadrature quad; PetscInt 
dim=2,NumComp=1,size; PetscInt cells[3]={2,2,4}; PetscBool isSimplice=PETSC_FALSE; void (**exactFields)(PetscInt, PetscInt, PetscInt, const PetscInt[], const PetscInt[], const PetscScalar[], const PetscScalar[], const PetscScalar[], const PetscInt[], const PetscInt[], const PetscScalar[], const PetscScalar[], const PetscScalar[], PetscReal, const PetscReal[], PetscInt, const PetscScalar[], PetscScalar[]); PetscCall(PetscInitialize(&argc,&argv,(char *)0,help)); PetscCall(DMPlexCreateBoxMesh(PETSC_COMM_WORLD, dim, isSimplice, cells, NULL, NULL, NULL, PETSC_TRUE, &dm)); PetscCall(PetscFECreateDefault(PetscObjectComm((PetscObject) dm), dim,NumComp,isSimplice,"heat_",PETSC_DECIDE,&fe)); PetscCall(PetscFEViewFromOptions(fe,NULL,"-fe_field_view")); PetscCall(DMSetField(dm, 0, NULL, (PetscObject) fe)); PetscCall(PetscFEGetQuadrature(fe,&quad)); PetscCall(PetscQuadratureView(quad,PETSC_VIEWER_STDOUT_WORLD)); PetscCall(PetscFEDestroy(&fe)); PetscCall(DMCreateDS(dm)); PetscMalloc(1,&exactFields); exactFields[0]=gaussian; PetscCall(DMCreateGlobalVector(dm,&u)); PetscCall(PetscObjectSetName((PetscObject) u, "temp")); PetscCall(VecGetSize(u,&size)); PetscPrintf(PETSC_COMM_WORLD, "Size of the global vector : %d\n",size); { DM cdm; PetscSection section; PetscInt c,cStart,cEnd; Vec X; PetscInt cdim; PetscCall(DMGetLocalSection(dm, §ion)); PetscCall(DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd)); PetscCall(DMGetCoordinateDim(dm, &cdim)); PetscCall(DMGetCoordinateDM(dm, &cdm)); PetscCall(PetscObjectSetName((PetscObject)cdm, "coords")); for (c = cStart; c < cEnd; ++c) { const PetscScalar *array; PetscScalar *x = NULL; PetscInt ndof; PetscBool isDG; PetscCall(DMPlexGetCellCoordinates(dm, c, &isDG, &ndof, &array, &x)); PetscCheck(ndof % cdim == 0, PETSC_COMM_SELF, PETSC_ERR_ARG_INCOMP, "ndof not divisible by cdim"); PetscCall(PetscPrintf(PETSC_COMM_SELF, "Element #%" PetscInt_FMT " coordinates\n", c - cStart)); for (PetscInt i = 0; i < ndof; i += cdim) PetscCall(PetscScalarView(cdim, &x[i], PETSC_VIEWER_STDOUT_SELF)); PetscCall(DMPlexRestoreCellCoordinates(dm, c, &isDG, &ndof, &array, &x)); } PetscCall(ViewOffsets(dm, NULL)); PetscCall(DMGetCoordinatesLocal(dm, &X)); VecGetSize(X,&size); PetscCall(PetscPrintf(PETSC_COMM_WORLD,"Taille du vecteur des coordonnes : %d\n",size)); PetscCall(ViewOffsets(dm, X)); } PetscCall(DMProjectField(dm,0.0,u,exactFields,INSERT_ALL_VALUES, u)); VecViewFromOptions(u,NULL,"-sol"); PetscCall(DMDestroy(&dm)); PetscCall(VecDestroy(&u)); PetscFree(exactFields); PetscCall(PetscFinalize()); return 0; } From mfadams at lbl.gov Mon Dec 12 19:04:36 2022 From: mfadams at lbl.gov (Mark Adams) Date: Mon, 12 Dec 2022 20:04:36 -0500 Subject: [petsc-users] realCoords for DOFs In-Reply-To: References: Message-ID: PETSc does not store the coordinates for high order elements (ie, the "midside nodes"). I get them by projecting a f(x) = x, function. Not sure if that is what you want but I can give you a code snippet if there are no better ideas. Mark On Mon, Dec 12, 2022 at 6:06 PM Yann Jobic wrote: > Hi all, > > I'm trying to get the coords of the dofs of a DMPlex for a PetscFE > discretization, for orders greater than 1. 
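For the archive, a rough sketch of the "project f(x) = x" idea described above (this is not Mark's snippet; it assumes cdm is a DM whose single field is a dim-component PetscFE of the same degree as the solution space, so every dof of the projected vector holds one coordinate component):

static PetscErrorCode coordFunc(PetscInt dim, PetscReal time, const PetscReal x[], PetscInt Nc, PetscScalar *u, void *ctx)
{
  for (PetscInt d = 0; d < Nc; d++) u[d] = x[d];  /* f(x) = x */
  return 0;
}

/* ... */
PetscErrorCode (*funcs[1])(PetscInt, PetscReal, const PetscReal[], PetscInt, PetscScalar *, void *) = {coordFunc};
Vec dofCoords;
PetscCall(DMCreateGlobalVector(cdm, &dofCoords));
PetscCall(DMProjectFunction(cdm, 0.0, funcs, NULL, INSERT_ALL_VALUES, dofCoords));

As for the first error in the quoted message, the message text itself points at the missing call: invoking DMGetCoordinatesLocalSetUp(dm) collectively, before the DMPlexGetCellCoordinates() loop, may be what is needed there.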
> > I'm struggling to run dm/impls/plex/tutorials/ex8.c > I've got the following error (with option -view_coord) : > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Object is in wrong state > [0]PETSC ERROR: DMGetCoordinatesLocalSetUp() has not been called > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.18.2, unknown > [0]PETSC ERROR: > /home/jobic/projet/fe-utils/petsc/fetools/cmake-build-debug/ex_closure_petsc > > on a named luke by jobic Mon Dec 12 23:34:37 2022 > [0]PETSC ERROR: Configure options > --prefix=/local/lib/petsc/3.18/p3/gcc/nompi_hdf5 --with-mpi=0 > --with-debugging=1 --with-blacs=1 --download-zlib,--download-p4est > --download-hdf5=1 --download-triangle=1 --with-single-library=0 > --with-large-file-io=1 --with-shared-libraries=0 -CFLAGS=" -g -O0" > -CXXFLAGS=" -g -O0" -FFLAGS=" -g -O0" PETSC_ARCH=nompi_gcc_hdf5 > [0]PETSC ERROR: #1 DMGetCoordinatesLocalNoncollective() at > /home/devel/src_linux/petsc-3.18.0/src/dm/interface/dmcoordinates.c:621 > [0]PETSC ERROR: #2 DMPlexGetCellCoordinates() at > /home/devel/src_linux/petsc-3.18.0/src/dm/impls/plex/plexgeometry.c:1291 > [0]PETSC ERROR: #3 main() at > /home/jobic/projet/fe-utils/petsc/fetools/src/ex_closure_petsc.c:86 > [0]PETSC ERROR: PETSc Option Table entries: > [0]PETSC ERROR: -dm_plex_box_faces 2,2 > [0]PETSC ERROR: -dm_plex_dim 2 > [0]PETSC ERROR: -dm_plex_simplex 0 > [0]PETSC ERROR: -petscspace_degree 1 > [0]PETSC ERROR: -view_coord > [0]PETSC ERROR: ----------------End of Error Message -------send entire > error message to petsc-maint at mcs.anl.gov---------- > > Maybe i've done something wrong ? > > > Moreover, i don't quite understand the function DMPlexGetLocalOffsets, > and how to use it with DMGetCoordinatesLocal. It seems that > DMGetCoordinatesLocal do not have the coords of the dofs, but only the > nodes defining the geometry. > > I've made some small modifications of ex8.c, but i still have an error : > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Invalid argument > [0]PETSC ERROR: Wrong type of object: Parameter # 1 > [0]PETSC ERROR: WARNING! There are option(s) set that were not used! > Could be the program crashed before they were used or a spelling > mistake, etc! > [0]PETSC ERROR: Option left: name:-sol value: vtk:sol.vtu > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.18.2, unknown > [0]PETSC ERROR: > /home/jobic/projet/fe-utils/petsc/fetools/cmake-build-debug/view_coords > on a named luke by jobic Mon Dec 12 23:51:05 2022 > [0]PETSC ERROR: Configure options > --prefix=/local/lib/petsc/3.18/p3/gcc/nompi_hdf5 --with-mpi=0 > --with-debugging=1 --with-blacs=1 --download-zlib,--download-p4est > --download-hdf5=1 --download-triangle=1 --with-single-library=0 > --with-large-file-io=1 --with-shared-libraries=0 -CFLAGS=" -g -O0" > -CXXFLAGS=" -g -O0" -FFLAGS=" -g -O0" PETSC_ARCH=nompi_gcc_hdf5 > [0]PETSC ERROR: #1 PetscFEGetHeightSubspace() at > /home/devel/src_linux/petsc-3.18.0/src/dm/dt/fe/interface/fe.c:1692 > [0]PETSC ERROR: #2 DMPlexGetLocalOffsets() at > /home/devel/src_linux/petsc-3.18.0/src/dm/impls/plex/plexceed.c:98 > [0]PETSC ERROR: #3 ViewOffsets() at > /home/jobic/projet/fe-utils/petsc/fetools/src/view_coords.c:28 > [0]PETSC ERROR: #4 main() at > /home/jobic/projet/fe-utils/petsc/fetools/src/view_coords.c:99 > [0]PETSC ERROR: PETSc Option Table entries: > [0]PETSC ERROR: -heat_petscspace_degree 2 > [0]PETSC ERROR: -sol vtk:sol.vtu > [0]PETSC ERROR: ----------------End of Error Message -------send entire > error message to petsc-maint at mcs.anl.gov---------- > > > Is dm/impls/plex/tutorials/ex8.c a good example for viewing the coords > of the dofs of a DMPlex ? > > > Thanks, > > Yann > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yyang85 at alumni.stanford.edu Mon Dec 12 20:56:18 2022 From: yyang85 at alumni.stanford.edu (Yuyun Yang) Date: Tue, 13 Dec 2022 02:56:18 +0000 Subject: [petsc-users] Seg fault in gdb but program runs Message-ID: Hello team, I?m debugging my code using gdb. The program runs just fine if I don?t debug it, but when I use gdb, it seg faults at a place where it never experienced any seg fault when I debugged it 1-2 years ago. I wonder if this might be caused by the PETSc version change? Or something wrong with gdb itself? I?ve included the code block that is problematic for you to take a look at what might be wrong ? seg fault happens when this function is called. 
For context, Spmat is a class of sparse matrices in the code: // calculate the exact nonzero structure which results from the kronecker outer product of left and right // d_nnz = diagonal nonzero structure, o_nnz = off-diagonal nonzero structure void kronConvert_symbolic(const Spmat &left, const Spmat &right, Mat &mat, PetscInt* d_nnz, PetscInt* o_nnz) { size_t rightRowSize = right.size(1); size_t rightColSize = right.size(2); PetscInt Istart,Iend; // rows owned by current processor PetscInt Jstart,Jend; // cols owned by current processor // allocate space for mat MatGetOwnershipRange(mat,&Istart,&Iend); MatGetOwnershipRangeColumn(mat,&Jstart,&Jend); PetscInt m = Iend - Istart; for (int ii=0; iisecond).begin(); JjL!=(IiL->second).end(); JjL++) { rowL = IiL->first; colL = JjL->first; valL = JjL->second; if (valL==0) { continue; } // loop over all values in right for (IiR=right._mat.begin(); IiR!=right._mat.end(); IiR++) { for (JjR=(IiR->second).begin(); JjR!=(IiR->second).end(); JjR++) { rowR = IiR->first; colR = JjR->first; valR = JjR->second; // the new values and coordinates for the product matrix val = valL*valR; row = rowL*rightRowSize + rowR; col = colL*rightColSize + colR; PetscInt ii = row - Istart; // array index for d_nnz and o_nnz if (val!=0 && row >= Istart && row < Iend && col >= Jstart && col < Jend) { d_nnz[ii]++; \ } if ( (val!=0 && row >= Istart && row < Iend) && (col < Jstart || col >= Jend) ) { o_nnz[i\ i]++; } } } } } } Thank you, Yuyun -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksi2443 at gmail.com Mon Dec 12 21:20:57 2022 From: ksi2443 at gmail.com (=?UTF-8?B?6rmA7ISx7J21?=) Date: Tue, 13 Dec 2022 12:20:57 +0900 Subject: [petsc-users] parallelize matrix assembly process In-Reply-To: <5485EB43-B786-4764-949E-F2F21687DB32@petsc.dev> References: <5485EB43-B786-4764-949E-F2F21687DB32@petsc.dev> Message-ID: Following your comments, I checked by using '-info'. As you suspected, most elements being computed on wrong MPI rank. Also, there are a lot of stashed entries. Should I divide the domain from the problem define stage? Or is a proper preallocation sufficient? 
[0] PetscCommDuplicate(): Duplicating a communicator 139687279637472 94370404729840 max tags = 2147483647 [1] PetscCommDuplicate(): Duplicating a communicator 139620736898016 94891084133376 max tags = 2147483647 [0] MatSetUp(): Warning not preallocating matrix storage [1] PetscCommDuplicate(): Duplicating a communicator 139620736897504 94891083133744 max tags = 2147483647 [0] PetscCommDuplicate(): Duplicating a communicator 139687279636960 94370403730224 max tags = 2147483647 [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736898016 94891084133376 [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279637472 94370404729840 [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736898016 94891084133376 [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279637472 94370404729840 [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736898016 94891084133376 [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279637472 94370404729840 [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279637472 94370404729840 [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736898016 94891084133376 [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736898016 94891084133376 [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279637472 94370404729840 TIME0 : 0.000000 TIME0 : 0.000000 [0] VecAssemblyBegin_MPI_BTS(): Stash has 661 entries, uses 8 mallocs. [0] VecAssemblyBegin_MPI_BTS(): Block-Stash has 0 entries, uses 0 mallocs. [0] VecAssemblyBegin_MPI_BTS(): Stash has 661 entries, uses 5 mallocs. [0] VecAssemblyBegin_MPI_BTS(): Block-Stash has 0 entries, uses 0 mallocs. [0] MatAssemblyBegin_MPIAIJ(): Stash has 460416 entries, uses 5 mallocs. [1] MatAssemblyBegin_MPIAIJ(): Stash has 461184 entries, uses 5 mallocs. [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 13892 X 13892; storage space: 180684 unneeded,987406 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 73242 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 81 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 13892) < 0.6. Do not use CompressedRow routines. [0] MatSeqAIJCheckInode(): Found 4631 nodes of 13892. Limit used: 5. Using Inode routines [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 13891 X 13891; storage space: 180715 unneeded,987325 used [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 73239 [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 81 [1] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 13891) < 0.6. Do not use CompressedRow routines. [1] MatSeqAIJCheckInode(): Found 4631 nodes of 13891. Limit used: 5. 
Using Inode routines [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 13892 X 1390; storage space: 72491 unneeded,34049 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 2472 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 40 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 12501)/(num_localrows 13892) > 0.6. Use CompressedRow routines. Assemble Time : 174.079366sec [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 13891 X 1391; storage space: 72441 unneeded,34049 used [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 2469 [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 41 [1] MatCheckCompressedRow(): Found the ratio (num_zerorows 12501)/(num_localrows 13891) > 0.6. Use CompressedRow routines. Assemble Time : 174.141234sec [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 [0] VecAssemblyBegin_MPI_BTS(): Stash has 13891 entries, uses 8 mallocs. [0] VecAssemblyBegin_MPI_BTS(): Block-Stash has 0 entries, uses 0 mallocs. [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 13891 X 13891; storage space: 0 unneeded,987325 used [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 81 [1] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 13891) < 0.6. Do not use CompressedRow routines. [0] PCSetUp(): Setting up PC for first time [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged Solving Time : 5.085394sec [0] KSPConvergedDefault(): Linear solver has converged. 
Residual norm 1.258030470407e-17 is less than relative tolerance 1.000000000000e-05 times initial right hand side norm 2.579617304779e-03 at iteration 1 Solving Time : 5.089733sec [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 [0] VecAssemblyBegin_MPI_BTS(): Stash has 661 entries, uses 5 mallocs. [0] VecAssemblyBegin_MPI_BTS(): Block-Stash has 0 entries, uses 0 mallocs. [0] MatAssemblyBegin_MPIAIJ(): Stash has 460416 entries, uses 0 mallocs. [1] MatAssemblyBegin_MPIAIJ(): Stash has 461184 entries, uses 0 mallocs. Assemble Time : 5.242508sec [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 Assemble Time : 5.240863sec [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 [0] VecAssemblyBegin_MPI_BTS(): Stash has 13891 entries, uses 0 mallocs. [0] VecAssemblyBegin_MPI_BTS(): Block-Stash has 0 entries, uses 0 mallocs. TIME : 1.000000, TIME_STEP : 1.000000, ITER : 2, RESIDUAL : 2.761615e-03 TIME : 1.000000, TIME_STEP : 1.000000, ITER : 2, RESIDUAL : 2.761615e-03 [0] PCSetUp(): Setting up PC with same nonzero pattern [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] KSPConvergedDefault(): Linear solver has converged. Residual norm 1.539725065974e-19 is less than relative tolerance 1.000000000000e-05 times initial right hand side norm 8.015104666105e-06 at iteration 1 Solving Time : 4.662785sec [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 Solving Time : 4.664515sec [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 [0] VecAssemblyBegin_MPI_BTS(): Stash has 661 entries, uses 5 mallocs. [0] VecAssemblyBegin_MPI_BTS(): Block-Stash has 0 entries, uses 0 mallocs. [1] MatAssemblyBegin_MPIAIJ(): Stash has 461184 entries, uses 0 mallocs. [0] MatAssemblyBegin_MPIAIJ(): Stash has 460416 entries, uses 0 mallocs. 
Assemble Time : 5.238257sec [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 Assemble Time : 5.236535sec [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 TIME : 1.000000, TIME_STEP : 1.000000, ITER : 3, RESIDUAL : 3.705062e-08 TIME0 : 1.000000 [0] VecAssemblyBegin_MPI_BTS(): Stash has 13891 entries, uses 0 mallocs. [0] VecAssemblyBegin_MPI_BTS(): Block-Stash has 0 entries, uses 0 mallocs. TIME : 1.000000, TIME_STEP : 1.000000, ITER : 3, RESIDUAL : 3.705062e-08 TIME0 : 1.000000 [1] PetscFinalize(): PetscFinalize() called [0] VecAssemblyBegin_MPI_BTS(): Stash has 661 entries, uses 5 mallocs. [0] VecAssemblyBegin_MPI_BTS(): Block-Stash has 0 entries, uses 0 mallocs. [0] PetscFinalize(): PetscFinalize() called 2022? 12? 13? (?) ?? 12:50, Barry Smith ?? ??: > > The problem is possibly due to most elements being computed on "wrong" > MPI rank and thus requiring almost all the matrix entries to be "stashed" > when computed and then sent off to the owning MPI rank. Please send ALL > the output of a parallel run with -info so we can see how much > communication is done in the matrix assembly. > > Barry > > > > On Dec 12, 2022, at 6:16 AM, ??? wrote: > > > > Hello, > > > > > > I need some keyword or some examples for parallelizing matrix assemble > process. > > > > My current state is as below. > > - Finite element analysis code for Structural mechanics. > > - problem size : 3D solid hexa element (number of elements : 125,000), > number of degree of freedom : 397,953 > > - Matrix type : seqaij, matrix set preallocation by using > MatSeqAIJSetPreallocation > > - Matrix assemble time by using 1 core : 120 sec > > for (int i=0; i<125000; i++) { > > ~~ element matrix calculation} > > matassemblybegin > > matassemblyend > > - Matrix assemble time by using 8 core : 70,234sec > > int start, end; > > VecGetOwnershipRange( element_vec, &start, &end); > > for (int i=start; i > ~~ element matrix calculation > > matassemblybegin > > matassemblyend > > > > > > As you see the state, the parallel case spent a lot of time than > sequential case.. > > How can I speed up in this case? > > Can I get some keyword or examples for parallelizing assembly of matrix > in finite element analysis ? > > > > Thanks, > > Hyung Kim > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Dec 12 22:41:07 2022 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 12 Dec 2022 23:41:07 -0500 Subject: [petsc-users] Seg fault in gdb but program runs In-Reply-To: References: Message-ID: On Mon, Dec 12, 2022 at 9:56 PM Yuyun Yang wrote: > Hello team, > > > > I?m debugging my code using gdb. The program runs just fine if I don?t > debug it, but when I use gdb, it seg faults at a place where it never > experienced any seg fault when I debugged it 1-2 years ago. I wonder if > this might be caused by the PETSc version change? > The only PETSc calls are the MatGetOwnershipRange() calls, which have not changed, so I think this is unlikely. > Or something wrong with gdb itself? I?ve included the code block that is > problematic for you to take a look at what might be wrong ? seg fault > happens when this function is called. For context, Spmat is a class of > sparse matrices in the code: > What is the debugger output? 
Thanks, Matt > // calculate the exact nonzero structure which results from the kronecker > outer product of left and right > > > // d_nnz = diagonal nonzero structure, o_nnz = off-diagonal nonzero > structure > > void kronConvert_symbolic(const Spmat &left, const Spmat &right, Mat &mat, > PetscInt* d_nnz, PetscInt* o_nnz) > > > { > > > size_t rightRowSize = right.size(1); > > > size_t rightColSize = right.size(2); > > > > > > PetscInt Istart,Iend; // rows owned by current processor > > > PetscInt Jstart,Jend; // cols owned by current processor > > > > > > // allocate space for mat > > > MatGetOwnershipRange(mat,&Istart,&Iend); > > > MatGetOwnershipRangeColumn(mat,&Jstart,&Jend); > > > PetscInt m = Iend - Istart; > > > > > > for (int ii=0; ii > > for (int ii=0; ii > > > > > // iterate over only nnz entries > > > Spmat::const_row_iter IiL,IiR; > > > Spmat::const_col_iter JjL,JjR; > > > double valL=0, valR=0, val=0; > > > PetscInt row,col; > > > size_t rowL,colL,rowR,colR; > > > > > // loop over all values in left > > > for (IiL=left._mat.begin(); IiL!=left._mat.end(); IiL++) { > > > for (JjL=(IiL->second).begin(); JjL!=(IiL->second).end(); JjL++) { > > > rowL = IiL->first; > > > colL = JjL->first; > > > valL = JjL->second; > > > if (valL==0) { continue; } > > > > > > // loop over all values in right > > > for (IiR=right._mat.begin(); IiR!=right._mat.end(); IiR++) { > > > for (JjR=(IiR->second).begin(); JjR!=(IiR->second).end(); JjR++) > { > > rowR = IiR->first; > > > colR = JjR->first; > > > valR = JjR->second; > > > > > > // the new values and coordinates for the product matrix > > > val = valL*valR; > > > row = rowL*rightRowSize + rowR; > > > col = colL*rightColSize + colR; > > > > > > PetscInt ii = row - Istart; // array index for d_nnz and o_nnz > > > if (val!=0 && row >= Istart && row < Iend && col >= Jstart && > col < Jend) { d_nnz[ii]++; \ > > } > > > if ( (val!=0 && row >= Istart && row < Iend) && (col < Jstart > || col >= Jend) ) { o_nnz[i\ > > i]++; } > > > } > > > } > > > } > > > } > > > } > > > > > > > > Thank you, > > Yuyun > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Dec 12 22:43:04 2022 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 12 Dec 2022 23:43:04 -0500 Subject: [petsc-users] realCoords for DOFs In-Reply-To: References: Message-ID: On Mon, Dec 12, 2022 at 6:06 PM Yann Jobic wrote: > Hi all, > > I'm trying to get the coords of the dofs of a DMPlex for a PetscFE > discretization, for orders greater than 1. > > I'm struggling to run dm/impls/plex/tutorials/ex8.c > I've got the following error (with option -view_coord) : > You just need to call DMGetCoordinatesLocalSetUp() before the loop. I try to indicate this in the error message(). I did not call it in the example because it is only necessary for output. We do this so that output is not synchronizing. Thanks, Matt > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Object is in wrong state > [0]PETSC ERROR: DMGetCoordinatesLocalSetUp() has not been called > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.18.2, unknown > [0]PETSC ERROR: > /home/jobic/projet/fe-utils/petsc/fetools/cmake-build-debug/ex_closure_petsc > > on a named luke by jobic Mon Dec 12 23:34:37 2022 > [0]PETSC ERROR: Configure options > --prefix=/local/lib/petsc/3.18/p3/gcc/nompi_hdf5 --with-mpi=0 > --with-debugging=1 --with-blacs=1 --download-zlib,--download-p4est > --download-hdf5=1 --download-triangle=1 --with-single-library=0 > --with-large-file-io=1 --with-shared-libraries=0 -CFLAGS=" -g -O0" > -CXXFLAGS=" -g -O0" -FFLAGS=" -g -O0" PETSC_ARCH=nompi_gcc_hdf5 > [0]PETSC ERROR: #1 DMGetCoordinatesLocalNoncollective() at > /home/devel/src_linux/petsc-3.18.0/src/dm/interface/dmcoordinates.c:621 > [0]PETSC ERROR: #2 DMPlexGetCellCoordinates() at > /home/devel/src_linux/petsc-3.18.0/src/dm/impls/plex/plexgeometry.c:1291 > [0]PETSC ERROR: #3 main() at > /home/jobic/projet/fe-utils/petsc/fetools/src/ex_closure_petsc.c:86 > [0]PETSC ERROR: PETSc Option Table entries: > [0]PETSC ERROR: -dm_plex_box_faces 2,2 > [0]PETSC ERROR: -dm_plex_dim 2 > [0]PETSC ERROR: -dm_plex_simplex 0 > [0]PETSC ERROR: -petscspace_degree 1 > [0]PETSC ERROR: -view_coord > [0]PETSC ERROR: ----------------End of Error Message -------send entire > error message to petsc-maint at mcs.anl.gov---------- > > Maybe i've done something wrong ? > > > Moreover, i don't quite understand the function DMPlexGetLocalOffsets, > and how to use it with DMGetCoordinatesLocal. It seems that > DMGetCoordinatesLocal do not have the coords of the dofs, but only the > nodes defining the geometry. > > I've made some small modifications of ex8.c, but i still have an error : > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Invalid argument > [0]PETSC ERROR: Wrong type of object: Parameter # 1 > [0]PETSC ERROR: WARNING! There are option(s) set that were not used! > Could be the program crashed before they were used or a spelling > mistake, etc! > [0]PETSC ERROR: Option left: name:-sol value: vtk:sol.vtu > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.18.2, unknown > [0]PETSC ERROR: > /home/jobic/projet/fe-utils/petsc/fetools/cmake-build-debug/view_coords > on a named luke by jobic Mon Dec 12 23:51:05 2022 > [0]PETSC ERROR: Configure options > --prefix=/local/lib/petsc/3.18/p3/gcc/nompi_hdf5 --with-mpi=0 > --with-debugging=1 --with-blacs=1 --download-zlib,--download-p4est > --download-hdf5=1 --download-triangle=1 --with-single-library=0 > --with-large-file-io=1 --with-shared-libraries=0 -CFLAGS=" -g -O0" > -CXXFLAGS=" -g -O0" -FFLAGS=" -g -O0" PETSC_ARCH=nompi_gcc_hdf5 > [0]PETSC ERROR: #1 PetscFEGetHeightSubspace() at > /home/devel/src_linux/petsc-3.18.0/src/dm/dt/fe/interface/fe.c:1692 > [0]PETSC ERROR: #2 DMPlexGetLocalOffsets() at > /home/devel/src_linux/petsc-3.18.0/src/dm/impls/plex/plexceed.c:98 > [0]PETSC ERROR: #3 ViewOffsets() at > /home/jobic/projet/fe-utils/petsc/fetools/src/view_coords.c:28 > [0]PETSC ERROR: #4 main() at > /home/jobic/projet/fe-utils/petsc/fetools/src/view_coords.c:99 > [0]PETSC ERROR: PETSc Option Table entries: > [0]PETSC ERROR: -heat_petscspace_degree 2 > [0]PETSC ERROR: -sol vtk:sol.vtu > [0]PETSC ERROR: ----------------End of Error Message -------send entire > error message to petsc-maint at mcs.anl.gov---------- > > > Is dm/impls/plex/tutorials/ex8.c a good example for viewing the coords > of the dofs of a DMPlex ? 
> > > Thanks, > > Yann > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksi2443 at gmail.com Mon Dec 12 22:53:52 2022 From: ksi2443 at gmail.com (=?UTF-8?B?6rmA7ISx7J21?=) Date: Tue, 13 Dec 2022 13:53:52 +0900 Subject: [petsc-users] parallelize matrix assembly process In-Reply-To: References: Message-ID: With the following example https://gitlab.com/petsc/petsc/-/blob/main/src/mat/tests/ex230.c There are some questions about MatPreallocator. 1. In parallel run, all the MPI ranks should do the same preallocator procedure? 2. In ex230.c, the difference between ex1 of ex230.c and ex2 of ex230.c is the block. Developers want to show using block is more efficient method than just using matsetvalues? Thanks, Hyung Kim 2022? 12? 13? (?) ?? 1:43, Junchao Zhang ?? ??: > Since you run with multiple ranks, you should use matrix type mpiaij and > MatMPIAIJSetPreallocation. If preallocation is difficult to estimate, you > can use MatPreallocator, see an example at > https://gitlab.com/petsc/petsc/-/blob/main/src/mat/tests/ex230.c > > --Junchao Zhang > > > On Mon, Dec 12, 2022 at 5:16 AM ??? wrote: > >> Hello, >> >> >> I need some keyword or some examples for parallelizing matrix assemble >> process. >> >> My current state is as below. >> - Finite element analysis code for Structural mechanics. >> - problem size : 3D solid hexa element (number of elements : 125,000), >> number of degree of freedom : 397,953 >> - Matrix type : seqaij, matrix set preallocation by using >> MatSeqAIJSetPreallocation >> - Matrix assemble time by using 1 core : 120 sec >> for (int i=0; i<125000; i++) { >> ~~ element matrix calculation} >> matassemblybegin >> matassemblyend >> - Matrix assemble time by using 8 core : 70,234sec >> int start, end; >> VecGetOwnershipRange( element_vec, &start, &end); >> for (int i=start; i> ~~ element matrix calculation >> matassemblybegin >> matassemblyend >> >> >> As you see the state, the parallel case spent a lot of time than >> sequential case.. >> How can I speed up in this case? >> Can I get some keyword or examples for parallelizing assembly of matrix >> in finite element analysis ? >> >> Thanks, >> Hyung Kim >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Dec 12 23:43:14 2022 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 13 Dec 2022 00:43:14 -0500 Subject: [petsc-users] parallelize matrix assembly process In-Reply-To: References: Message-ID: On Mon, Dec 12, 2022 at 11:54 PM ??? wrote: > > With the following example > https://gitlab.com/petsc/petsc/-/blob/main/src/mat/tests/ex230.c > > There are some questions about MatPreallocator. > > 1. In parallel run, all the MPI ranks should do the same preallocator > procedure? > In parallel, each process adds its own entries, just as you would in the matrix assembly. > 2. In ex230.c, the difference between ex1 of ex230.c and ex2 of ex230.c is > the block. > Developers want to show using block is more efficient method than just > using matsetvalues? > It can be. Thanks, Matt > Thanks, > Hyung Kim > > 2022? 12? 13? (?) ?? 1:43, Junchao Zhang ?? ??: > >> Since you run with multiple ranks, you should use matrix type mpiaij and >> MatMPIAIJSetPreallocation. 
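(Illustrative aside, not part of the quoted reply: a minimal sketch of the MatMPIAIJSetPreallocation call mentioned above, assuming d_nnz/o_nnz have already been counted with one entry per locally owned row; the helper name below is made up.)

#include <petscmat.h>

/* nlocal: number of locally owned rows; d_nnz[i]/o_nnz[i]: nonzeros of local row i
   that fall in the diagonal / off-diagonal block of the parallel matrix */
static PetscErrorCode CreatePreallocatedAIJ(MPI_Comm comm, PetscInt nlocal, const PetscInt d_nnz[], const PetscInt o_nnz[], Mat *A)
{
  PetscFunctionBeginUser;
  PetscCall(MatCreate(comm, A));
  PetscCall(MatSetSizes(*A, nlocal, nlocal, PETSC_DETERMINE, PETSC_DETERMINE));
  PetscCall(MatSetType(*A, MATMPIAIJ));
  PetscCall(MatMPIAIJSetPreallocation(*A, 0, d_nnz, 0, o_nnz));
  /* optional: error out instead of mallocing if the counts are exceeded */
  PetscCall(MatSetOption(*A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_TRUE));
  PetscFunctionReturn(0);
}

With per-row counts like these in place, the MatSetValues() loop should run without the tens of thousands of mallocs reported by -info earlier in this thread.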
If preallocation is difficult to estimate, you >> can use MatPreallocator, see an example at >> https://gitlab.com/petsc/petsc/-/blob/main/src/mat/tests/ex230.c >> >> --Junchao Zhang >> >> >> On Mon, Dec 12, 2022 at 5:16 AM ??? wrote: >> >>> Hello, >>> >>> >>> I need some keyword or some examples for parallelizing matrix assemble >>> process. >>> >>> My current state is as below. >>> - Finite element analysis code for Structural mechanics. >>> - problem size : 3D solid hexa element (number of elements : 125,000), >>> number of degree of freedom : 397,953 >>> - Matrix type : seqaij, matrix set preallocation by using >>> MatSeqAIJSetPreallocation >>> - Matrix assemble time by using 1 core : 120 sec >>> for (int i=0; i<125000; i++) { >>> ~~ element matrix calculation} >>> matassemblybegin >>> matassemblyend >>> - Matrix assemble time by using 8 core : 70,234sec >>> int start, end; >>> VecGetOwnershipRange( element_vec, &start, &end); >>> for (int i=start; i>> ~~ element matrix calculation >>> matassemblybegin >>> matassemblyend >>> >>> >>> As you see the state, the parallel case spent a lot of time than >>> sequential case.. >>> How can I speed up in this case? >>> Can I get some keyword or examples for parallelizing assembly of matrix >>> in finite element analysis ? >>> >>> Thanks, >>> Hyung Kim >>> >>> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yyang85 at alumni.stanford.edu Tue Dec 13 00:14:24 2022 From: yyang85 at alumni.stanford.edu (Yuyun Yang) Date: Tue, 13 Dec 2022 14:14:24 +0800 Subject: [petsc-users] Seg fault in gdb but program runs In-Reply-To: References: Message-ID: Here is the error message: Program received signal SIGSEGV, Segmentation fault. 0x00005555555e73b7 in kronConvert (left=..., right=..., mat=@0x555555927e10: 0x555557791bb0, diag=5, offDiag=0) at /home/yuyun/scycle-2/source/spmat.cpp:265 265 kronConvert_symbolic(left,right,mat,d_nnz,o_nnz); On Tue, Dec 13, 2022 at 12:41 PM Matthew Knepley wrote: > On Mon, Dec 12, 2022 at 9:56 PM Yuyun Yang > wrote: > >> Hello team, >> >> >> >> I?m debugging my code using gdb. The program runs just fine if I don?t >> debug it, but when I use gdb, it seg faults at a place where it never >> experienced any seg fault when I debugged it 1-2 years ago. I wonder if >> this might be caused by the PETSc version change? >> > > The only PETSc calls are the MatGetOwnershipRange() calls, which have not > changed, so I think this is unlikely. > > >> Or something wrong with gdb itself? I?ve included the code block that is >> problematic for you to take a look at what might be wrong ? seg fault >> happens when this function is called. For context, Spmat is a class of >> sparse matrices in the code: >> > > What is the debugger output? 
> > Thanks, > > Matt > > >> // calculate the exact nonzero structure which results from the kronecker >> outer product of left and right >> >> >> // d_nnz = diagonal nonzero structure, o_nnz = off-diagonal nonzero >> structure >> >> void kronConvert_symbolic(const Spmat &left, const Spmat &right, Mat &mat, >> PetscInt* d_nnz, PetscInt* o_nnz) >> >> >> { >> >> >> size_t rightRowSize = right.size(1); >> >> >> size_t rightColSize = right.size(2); >> >> >> >> >> >> PetscInt Istart,Iend; // rows owned by current processor >> >> >> PetscInt Jstart,Jend; // cols owned by current processor >> >> >> >> >> >> // allocate space for mat >> >> >> MatGetOwnershipRange(mat,&Istart,&Iend); >> >> >> MatGetOwnershipRangeColumn(mat,&Jstart,&Jend); >> >> >> PetscInt m = Iend - Istart; >> >> >> >> >> >> for (int ii=0; ii> >> >> for (int ii=0; ii> >> >> >> >> >> // iterate over only nnz entries >> >> >> Spmat::const_row_iter IiL,IiR; >> >> >> Spmat::const_col_iter JjL,JjR; >> >> >> double valL=0, valR=0, val=0; >> >> >> PetscInt row,col; >> >> >> size_t rowL,colL,rowR,colR; >> >> >> >> >> // loop over all values in left >> >> >> for (IiL=left._mat.begin(); IiL!=left._mat.end(); IiL++) { >> >> >> for (JjL=(IiL->second).begin(); JjL!=(IiL->second).end(); JjL++) { >> >> >> rowL = IiL->first; >> >> >> colL = JjL->first; >> >> >> valL = JjL->second; >> >> >> if (valL==0) { continue; } >> >> >> >> >> >> // loop over all values in right >> >> >> for (IiR=right._mat.begin(); IiR!=right._mat.end(); IiR++) { >> >> >> for (JjR=(IiR->second).begin(); JjR!=(IiR->second).end(); JjR++) >> { >> >> rowR = IiR->first; >> >> >> colR = JjR->first; >> >> >> valR = JjR->second; >> >> >> >> >> >> // the new values and coordinates for the product matrix >> >> >> val = valL*valR; >> >> >> row = rowL*rightRowSize + rowR; >> >> >> col = colL*rightColSize + colR; >> >> >> >> >> >> PetscInt ii = row - Istart; // array index for d_nnz and o_nnz >> >> >> if (val!=0 && row >= Istart && row < Iend && col >= Jstart && >> col < Jend) { d_nnz[ii]++; \ >> >> } >> >> >> if ( (val!=0 && row >= Istart && row < Iend) && (col < Jstart >> || col >= Jend) ) { o_nnz[i\ >> >> i]++; } >> >> >> } >> >> >> } >> >> >> } >> >> >> } >> >> >> } >> >> >> >> >> >> >> >> Thank you, >> >> Yuyun >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Dec 13 00:49:09 2022 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 13 Dec 2022 01:49:09 -0500 Subject: [petsc-users] Seg fault in gdb but program runs In-Reply-To: References: Message-ID: On Tue, Dec 13, 2022 at 1:14 AM Yuyun Yang wrote: > Here is the error message: > > Program received signal SIGSEGV, Segmentation fault. > 0x00005555555e73b7 in kronConvert (left=..., right=..., > mat=@0x555555927e10: 0x555557791bb0, diag=5, offDiag=0) > at /home/yuyun/scycle-2/source/spmat.cpp:265 > 265 kronConvert_symbolic(left,right,mat,d_nnz,o_nnz); > d_nnz and o_nnz are pointers, and they are supposed to hold arrays of the number of nonzero in each row, You seem to be passing integers. Thanks, Matt > On Tue, Dec 13, 2022 at 12:41 PM Matthew Knepley > wrote: > >> On Mon, Dec 12, 2022 at 9:56 PM Yuyun Yang >> wrote: >> >>> Hello team, >>> >>> >>> >>> I?m debugging my code using gdb. 
The program runs just fine if I don?t >>> debug it, but when I use gdb, it seg faults at a place where it never >>> experienced any seg fault when I debugged it 1-2 years ago. I wonder if >>> this might be caused by the PETSc version change? >>> >> >> The only PETSc calls are the MatGetOwnershipRange() calls, which have not >> changed, so I think this is unlikely. >> >> >>> Or something wrong with gdb itself? I?ve included the code block that is >>> problematic for you to take a look at what might be wrong ? seg fault >>> happens when this function is called. For context, Spmat is a class of >>> sparse matrices in the code: >>> >> >> What is the debugger output? >> >> Thanks, >> >> Matt >> >> >>> // calculate the exact nonzero structure which results from the >>> kronecker outer product of left and right >>> >>> >>> // d_nnz = diagonal nonzero structure, o_nnz = off-diagonal nonzero >>> structure >>> >>> void kronConvert_symbolic(const Spmat &left, const Spmat &right, Mat & >>> mat, PetscInt* d_nnz, PetscInt* o_nnz) >>> >>> >>> { >>> >>> >>> size_t rightRowSize = right.size(1); >>> >>> >>> size_t rightColSize = right.size(2); >>> >>> >>> >>> >>> >>> PetscInt Istart,Iend; // rows owned by current processor >>> >>> >>> PetscInt Jstart,Jend; // cols owned by current processor >>> >>> >>> >>> >>> >>> // allocate space for mat >>> >>> >>> MatGetOwnershipRange(mat,&Istart,&Iend); >>> >>> >>> MatGetOwnershipRangeColumn(mat,&Jstart,&Jend); >>> >>> >>> PetscInt m = Iend - Istart; >>> >>> >>> >>> >>> >>> for (int ii=0; ii>> >>> >>> for (int ii=0; ii>> >>> >>> >>> >>> >>> // iterate over only nnz entries >>> >>> >>> Spmat::const_row_iter IiL,IiR; >>> >>> >>> Spmat::const_col_iter JjL,JjR; >>> >>> >>> double valL=0, valR=0, val=0; >>> >>> >>> PetscInt row,col; >>> >>> >>> size_t rowL,colL,rowR,colR; >>> >>> >>> >>> >>> // loop over all values in left >>> >>> >>> for (IiL=left._mat.begin(); IiL!=left._mat.end(); IiL++) { >>> >>> >>> for (JjL=(IiL->second).begin(); JjL!=(IiL->second).end(); JjL++) { >>> >>> >>> rowL = IiL->first; >>> >>> >>> colL = JjL->first; >>> >>> >>> valL = JjL->second; >>> >>> >>> if (valL==0) { continue; } >>> >>> >>> >>> >>> >>> // loop over all values in right >>> >>> >>> for (IiR=right._mat.begin(); IiR!=right._mat.end(); IiR++) { >>> >>> >>> for (JjR=(IiR->second).begin(); JjR!=(IiR->second).end(); >>> JjR++) { >>> >>> rowR = IiR->first; >>> >>> >>> colR = JjR->first; >>> >>> >>> valR = JjR->second; >>> >>> >>> >>> >>> >>> // the new values and coordinates for the product matrix >>> >>> >>> val = valL*valR; >>> >>> >>> row = rowL*rightRowSize + rowR; >>> >>> >>> col = colL*rightColSize + colR; >>> >>> >>> >>> >>> >>> PetscInt ii = row - Istart; // array index for d_nnz and >>> o_nnz >>> >>> if (val!=0 && row >= Istart && row < Iend && col >= Jstart && >>> col < Jend) { d_nnz[ii]++; \ >>> >>> } >>> >>> >>> if ( (val!=0 && row >= Istart && row < Iend) && (col < Jstart >>> || col >= Jend) ) { o_nnz[i\ >>> >>> i]++; } >>> >>> >>> } >>> >>> >>> } >>> >>> >>> } >>> >>> >>> } >>> >>> >>> } >>> >>> >>> >>> >>> >>> >>> >>> Thank you, >>> >>> Yuyun >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yyang85 at alumni.stanford.edu Tue Dec 13 00:52:46 2022 From: yyang85 at alumni.stanford.edu (Yuyun Yang) Date: Tue, 13 Dec 2022 06:52:46 +0000 Subject: [petsc-users] Seg fault in gdb but program runs In-Reply-To: References: Message-ID: Ok I?ll check that, thanks for taking a look! By the way, when I reduce the domain size this error doesn?t appear anymore, so I don?t know whether gdb just cannot handle the memory, and start to cut things off which is causing the seg fault. From: Matthew Knepley Date: Tuesday, December 13, 2022 at 2:49 PM To: Yuyun Yang Cc: petsc-users Subject: Re: [petsc-users] Seg fault in gdb but program runs On Tue, Dec 13, 2022 at 1:14 AM Yuyun Yang > wrote: Here is the error message: Program received signal SIGSEGV, Segmentation fault. 0x00005555555e73b7 in kronConvert (left=..., right=..., mat=@0x555555927e10: 0x555557791bb0, diag=5, offDiag=0) at /home/yuyun/scycle-2/source/spmat.cpp:265 265 kronConvert_symbolic(left,right,mat,d_nnz,o_nnz); d_nnz and o_nnz are pointers, and they are supposed to hold arrays of the number of nonzero in each row, You seem to be passing integers. Thanks, Matt On Tue, Dec 13, 2022 at 12:41 PM Matthew Knepley > wrote: On Mon, Dec 12, 2022 at 9:56 PM Yuyun Yang > wrote: Hello team, I?m debugging my code using gdb. The program runs just fine if I don?t debug it, but when I use gdb, it seg faults at a place where it never experienced any seg fault when I debugged it 1-2 years ago. I wonder if this might be caused by the PETSc version change? The only PETSc calls are the MatGetOwnershipRange() calls, which have not changed, so I think this is unlikely. Or something wrong with gdb itself? I?ve included the code block that is problematic for you to take a look at what might be wrong ? seg fault happens when this function is called. For context, Spmat is a class of sparse matrices in the code: What is the debugger output? 
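(Illustrative aside, not part of the quoted thread: following the remark above that d_nnz and o_nnz must be arrays with one count per locally owned row, the caller of kronConvert_symbolic() could allocate them roughly as below; mat, left and right stand for the caller's existing objects from the code quoted in this thread, and the preallocation calls at the end are an assumption about how the counts are meant to be used.)

  PetscInt  Istart, Iend, m;
  PetscInt *d_nnz = NULL, *o_nnz = NULL;

  PetscCall(MatGetOwnershipRange(mat, &Istart, &Iend));
  m = Iend - Istart;                             /* locally owned rows */
  PetscCall(PetscCalloc2(m, &d_nnz, m, &o_nnz)); /* zero-initialized counters */
  kronConvert_symbolic(left, right, mat, d_nnz, o_nnz);
  /* pass the counts on; the routine that does not match the matrix type is a no-op */
  PetscCall(MatSeqAIJSetPreallocation(mat, 0, d_nnz));
  PetscCall(MatMPIAIJSetPreallocation(mat, 0, d_nnz, 0, o_nnz));
  PetscCall(PetscFree2(d_nnz, o_nnz));

Passing a plain PetscInt where these arrays are expected would make the counting loops index past valid memory, which is consistent with the SIGSEGV reported above.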
Thanks, Matt // calculate the exact nonzero structure which results from the kronecker outer product of left and right // d_nnz = diagonal nonzero structure, o_nnz = off-diagonal nonzero structure void kronConvert_symbolic(const Spmat &left, const Spmat &right, Mat &mat, PetscInt* d_nnz, PetscInt* o_nnz) { size_t rightRowSize = right.size(1); size_t rightColSize = right.size(2); PetscInt Istart,Iend; // rows owned by current processor PetscInt Jstart,Jend; // cols owned by current processor // allocate space for mat MatGetOwnershipRange(mat,&Istart,&Iend); MatGetOwnershipRangeColumn(mat,&Jstart,&Jend); PetscInt m = Iend - Istart; for (int ii=0; iisecond).begin(); JjL!=(IiL->second).end(); JjL++) { rowL = IiL->first; colL = JjL->first; valL = JjL->second; if (valL==0) { continue; } // loop over all values in right for (IiR=right._mat.begin(); IiR!=right._mat.end(); IiR++) { for (JjR=(IiR->second).begin(); JjR!=(IiR->second).end(); JjR++) { rowR = IiR->first; colR = JjR->first; valR = JjR->second; // the new values and coordinates for the product matrix val = valL*valR; row = rowL*rightRowSize + rowR; col = colL*rightColSize + colR; PetscInt ii = row - Istart; // array index for d_nnz and o_nnz if (val!=0 && row >= Istart && row < Iend && col >= Jstart && col < Jend) { d_nnz[ii]++; \ } if ( (val!=0 && row >= Istart && row < Iend) && (col < Jstart || col >= Jend) ) { o_nnz[i\ i]++; } } } } } } Thank you, Yuyun -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yann.jobic at univ-amu.fr Tue Dec 13 02:16:13 2022 From: yann.jobic at univ-amu.fr (Yann Jobic) Date: Tue, 13 Dec 2022 09:16:13 +0100 Subject: [petsc-users] realCoords for DOFs In-Reply-To: References: Message-ID: <35fadbd2-80fa-3ffa-3b64-6250830af797@univ-amu.fr> Le 12/13/2022 ? 5:43 AM, Matthew Knepley a ?crit?: > On Mon, Dec 12, 2022 at 6:06 PM Yann Jobic > wrote: > > Hi all, > > I'm trying to get the coords of the dofs of a DMPlex for a PetscFE > discretization, for orders greater than 1. > > I'm struggling to run dm/impls/plex/tutorials/ex8.c > I've got the following error (with option -view_coord) : > > > You just need to call > > ??DMGetCoordinatesLocalSetUp() > > before the loop. I try to indicate this in the error message(). I did > not call it in the example > because it is only necessary for output. The error message is explicit. It feels strange to modify a Petsc tutorial in order to make it work, with an option proposed by it. Maybe the option -view_coord should be removed then. Thanks, Yann We do this so that output is > not synchronizing. > > ? Thanks, > > ? ? Matt > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Object is in wrong state > [0]PETSC ERROR: DMGetCoordinatesLocalSetUp() has not been called > [0]PETSC ERROR: See https://petsc.org/release/faq/ > for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.18.2, unknown > [0]PETSC ERROR: > /home/jobic/projet/fe-utils/petsc/fetools/cmake-build-debug/ex_closure_petsc > on a? 
named luke by jobic Mon Dec 12 23:34:37 2022 > [0]PETSC ERROR: Configure options > --prefix=/local/lib/petsc/3.18/p3/gcc/nompi_hdf5 --with-mpi=0 > --with-debugging=1 --with-blacs=1 --download-zlib,--download-p4est > --download-hdf5=1 --download-triangle=1 --with-single-library=0 > --with-large-file-io=1 --with-shared-libraries=0 -CFLAGS=" -g -O0" > -CXXFLAGS=" -g -O0" -FFLAGS=" -g -O0" PETSC_ARCH=nompi_gcc_hdf5 > [0]PETSC ERROR: #1 DMGetCoordinatesLocalNoncollective() at > /home/devel/src_linux/petsc-3.18.0/src/dm/interface/dmcoordinates.c:621 > [0]PETSC ERROR: #2 DMPlexGetCellCoordinates() at > /home/devel/src_linux/petsc-3.18.0/src/dm/impls/plex/plexgeometry.c:1291 > [0]PETSC ERROR: #3 main() at > /home/jobic/projet/fe-utils/petsc/fetools/src/ex_closure_petsc.c:86 > [0]PETSC ERROR: PETSc Option Table entries: > [0]PETSC ERROR: -dm_plex_box_faces 2,2 > [0]PETSC ERROR: -dm_plex_dim 2 > [0]PETSC ERROR: -dm_plex_simplex 0 > [0]PETSC ERROR: -petscspace_degree 1 > [0]PETSC ERROR: -view_coord > [0]PETSC ERROR: ----------------End of Error Message -------send entire > error message to petsc-maint at mcs.anl.gov---------- > > Maybe i've done something wrong ? > > > Moreover, i don't quite understand the function DMPlexGetLocalOffsets, > and how to use it with DMGetCoordinatesLocal. It seems that > DMGetCoordinatesLocal do not have the coords of the dofs, but only the > nodes defining the geometry. > > I've made some small modifications of ex8.c, but i still have an error : > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Invalid argument > [0]PETSC ERROR: Wrong type of object: Parameter # 1 > [0]PETSC ERROR: WARNING! There are option(s) set that were not used! > Could be the program crashed before they were used or a spelling > mistake, etc! > [0]PETSC ERROR: Option left: name:-sol value: vtk:sol.vtu > [0]PETSC ERROR: See https://petsc.org/release/faq/ > for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.18.2, unknown > [0]PETSC ERROR: > /home/jobic/projet/fe-utils/petsc/fetools/cmake-build-debug/view_coords > on a? named luke by jobic Mon Dec 12 23:51:05 2022 > [0]PETSC ERROR: Configure options > --prefix=/local/lib/petsc/3.18/p3/gcc/nompi_hdf5 --with-mpi=0 > --with-debugging=1 --with-blacs=1 --download-zlib,--download-p4est > --download-hdf5=1 --download-triangle=1 --with-single-library=0 > --with-large-file-io=1 --with-shared-libraries=0 -CFLAGS=" -g -O0" > -CXXFLAGS=" -g -O0" -FFLAGS=" -g -O0" PETSC_ARCH=nompi_gcc_hdf5 > [0]PETSC ERROR: #1 PetscFEGetHeightSubspace() at > /home/devel/src_linux/petsc-3.18.0/src/dm/dt/fe/interface/fe.c:1692 > [0]PETSC ERROR: #2 DMPlexGetLocalOffsets() at > /home/devel/src_linux/petsc-3.18.0/src/dm/impls/plex/plexceed.c:98 > [0]PETSC ERROR: #3 ViewOffsets() at > /home/jobic/projet/fe-utils/petsc/fetools/src/view_coords.c:28 > [0]PETSC ERROR: #4 main() at > /home/jobic/projet/fe-utils/petsc/fetools/src/view_coords.c:99 > [0]PETSC ERROR: PETSc Option Table entries: > [0]PETSC ERROR: -heat_petscspace_degree 2 > [0]PETSC ERROR: -sol vtk:sol.vtu > [0]PETSC ERROR: ----------------End of Error Message -------send entire > error message to petsc-maint at mcs.anl.gov---------- > > > Is dm/impls/plex/tutorials/ex8.c a good example for viewing the coords > of the dofs of a DMPlex ? 
> > > Thanks, > > Yann > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From yann.jobic at univ-amu.fr Tue Dec 13 02:38:47 2022 From: yann.jobic at univ-amu.fr (Yann Jobic) Date: Tue, 13 Dec 2022 09:38:47 +0100 Subject: [petsc-users] realCoords for DOFs In-Reply-To: References: Message-ID: <9a891888-dbc6-43a3-36d3-b47c2864a0af@univ-amu.fr> Le 12/13/2022 ? 2:04 AM, Mark Adams a ?crit?: > PETSc does not store the coordinates for high order elements (ie, the > "midside nodes"). > I get them by projecting a f(x) = x, function. > Not sure if that is what you want but I can give you a code snippet if > there are no better ideas. It could really help me ! If i have the node coordinates in the reference element, then it's easy to project them to the real space. But i couldn't find a way to have them, so if you could give me some guidance, it could be really helpful. Thanks, Yann > > Mark > > > On Mon, Dec 12, 2022 at 6:06 PM Yann Jobic > wrote: > > Hi all, > > I'm trying to get the coords of the dofs of a DMPlex for a PetscFE > discretization, for orders greater than 1. > > I'm struggling to run dm/impls/plex/tutorials/ex8.c > I've got the following error (with option -view_coord) : > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Object is in wrong state > [0]PETSC ERROR: DMGetCoordinatesLocalSetUp() has not been called > [0]PETSC ERROR: See https://petsc.org/release/faq/ > for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.18.2, unknown > [0]PETSC ERROR: > /home/jobic/projet/fe-utils/petsc/fetools/cmake-build-debug/ex_closure_petsc > on a? named luke by jobic Mon Dec 12 23:34:37 2022 > [0]PETSC ERROR: Configure options > --prefix=/local/lib/petsc/3.18/p3/gcc/nompi_hdf5 --with-mpi=0 > --with-debugging=1 --with-blacs=1 --download-zlib,--download-p4est > --download-hdf5=1 --download-triangle=1 --with-single-library=0 > --with-large-file-io=1 --with-shared-libraries=0 -CFLAGS=" -g -O0" > -CXXFLAGS=" -g -O0" -FFLAGS=" -g -O0" PETSC_ARCH=nompi_gcc_hdf5 > [0]PETSC ERROR: #1 DMGetCoordinatesLocalNoncollective() at > /home/devel/src_linux/petsc-3.18.0/src/dm/interface/dmcoordinates.c:621 > [0]PETSC ERROR: #2 DMPlexGetCellCoordinates() at > /home/devel/src_linux/petsc-3.18.0/src/dm/impls/plex/plexgeometry.c:1291 > [0]PETSC ERROR: #3 main() at > /home/jobic/projet/fe-utils/petsc/fetools/src/ex_closure_petsc.c:86 > [0]PETSC ERROR: PETSc Option Table entries: > [0]PETSC ERROR: -dm_plex_box_faces 2,2 > [0]PETSC ERROR: -dm_plex_dim 2 > [0]PETSC ERROR: -dm_plex_simplex 0 > [0]PETSC ERROR: -petscspace_degree 1 > [0]PETSC ERROR: -view_coord > [0]PETSC ERROR: ----------------End of Error Message -------send entire > error message to petsc-maint at mcs.anl.gov---------- > > Maybe i've done something wrong ? > > > Moreover, i don't quite understand the function DMPlexGetLocalOffsets, > and how to use it with DMGetCoordinatesLocal. It seems that > DMGetCoordinatesLocal do not have the coords of the dofs, but only the > nodes defining the geometry. 
> > I've made some small modifications of ex8.c, but i still have an error : > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Invalid argument > [0]PETSC ERROR: Wrong type of object: Parameter # 1 > [0]PETSC ERROR: WARNING! There are option(s) set that were not used! > Could be the program crashed before they were used or a spelling > mistake, etc! > [0]PETSC ERROR: Option left: name:-sol value: vtk:sol.vtu > [0]PETSC ERROR: See https://petsc.org/release/faq/ > for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.18.2, unknown > [0]PETSC ERROR: > /home/jobic/projet/fe-utils/petsc/fetools/cmake-build-debug/view_coords > on a? named luke by jobic Mon Dec 12 23:51:05 2022 > [0]PETSC ERROR: Configure options > --prefix=/local/lib/petsc/3.18/p3/gcc/nompi_hdf5 --with-mpi=0 > --with-debugging=1 --with-blacs=1 --download-zlib,--download-p4est > --download-hdf5=1 --download-triangle=1 --with-single-library=0 > --with-large-file-io=1 --with-shared-libraries=0 -CFLAGS=" -g -O0" > -CXXFLAGS=" -g -O0" -FFLAGS=" -g -O0" PETSC_ARCH=nompi_gcc_hdf5 > [0]PETSC ERROR: #1 PetscFEGetHeightSubspace() at > /home/devel/src_linux/petsc-3.18.0/src/dm/dt/fe/interface/fe.c:1692 > [0]PETSC ERROR: #2 DMPlexGetLocalOffsets() at > /home/devel/src_linux/petsc-3.18.0/src/dm/impls/plex/plexceed.c:98 > [0]PETSC ERROR: #3 ViewOffsets() at > /home/jobic/projet/fe-utils/petsc/fetools/src/view_coords.c:28 > [0]PETSC ERROR: #4 main() at > /home/jobic/projet/fe-utils/petsc/fetools/src/view_coords.c:99 > [0]PETSC ERROR: PETSc Option Table entries: > [0]PETSC ERROR: -heat_petscspace_degree 2 > [0]PETSC ERROR: -sol vtk:sol.vtu > [0]PETSC ERROR: ----------------End of Error Message -------send entire > error message to petsc-maint at mcs.anl.gov---------- > > > Is dm/impls/plex/tutorials/ex8.c a good example for viewing the coords > of the dofs of a DMPlex ? > > > Thanks, > > Yann > From praveen at gmx.net Tue Dec 13 05:11:06 2022 From: praveen at gmx.net (Praveen C) Date: Tue, 13 Dec 2022 16:41:06 +0530 Subject: [petsc-users] dmplex normal vector incorrect for periodic gmsh grids Message-ID: <32720DD5-FA50-4766-A36A-7566913795B3@gmx.net> Hello In the attached test, I read a small grid made in gmsh with periodic bc. This is a 2d mesh. The cell numbers are shown in the figure. All faces have length = 2.5 But using PetscFVFaceGeom I am getting length of 7.5 for some faces. E.g., face: 59, centroid = 3.750000, 2.500000, normal = 0.000000, -7.500000 ===> Face length incorrect = 7.500000, should be 2.5 support[0] = 11, cent = 8.750000, 3.750000, area = 6.250000 support[1] = 15, cent = 8.750000, 1.250000, area = 6.250000 There are also errors in the orientation of normal. If we disable periodicity in geo file, this error goes away. Thanks praveen ??? -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: dmplex.c Type: application/octet-stream Size: 4821 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: mesh.png Type: image/png Size: 10967 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: ug_periodic.geo Type: application/octet-stream Size: 525 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From pjool at dtu.dk Tue Dec 13 05:40:03 2022 From: pjool at dtu.dk (=?utf-8?B?UGVkZXIgSsO4cmdlbnNnYWFyZCBPbGVzZW4=?=) Date: Tue, 13 Dec 2022 11:40:03 +0000 Subject: [petsc-users] Insert one sparse matrix as a block in another In-Reply-To: References: <58c80ce3b8414d57b421ac9bb8e1697b@dtu.dk> <87pmcontvm.fsf@jedbrown.org>, Message-ID: <02574a3cee884e579906ca322cb69d88@dtu.dk> Yes, MATNEST seems do to just what I wanted, and extremely fast too. Thanks! For context, I am doing a snapshot POD (like SVD, but less memory intensive) in which I'm building a dense matrix S_pq = u_p^T B u_q for the EVP Sa=?a, where {u_p} and {u_q} are vectors in a particular dataset. The kernel B is the sparse matrix I was asking about. Thanks again, Peder ________________________________ Fra: Matthew Knepley Sendt: 12. december 2022 23:58:02 Til: Jed Brown Cc: Mark Adams; Peder J?rgensgaard Olesen; petsc-users at mcs.anl.gov Emne: Re: [petsc-users] Insert one sparse matrix as a block in another On Mon, Dec 12, 2022 at 5:24 PM Jed Brown > wrote: The description matches MATNEST (MATCOMPOSITE is for a sum or product of matrices) or parallel decompositions. Also consider the assembly style of src/snes/tutorials/ex28.c, which can create either a monolithic or block (MATNEST) matrix without extra storage or conversion costs. I will just say a few words about ex28. The idea is that if you are already calling MatSetValues() to assemble your submatrices, then you can use MatSetValuesLocal() to remap those locations into locations in the large matrix, using a LocalToGlobalMap. This allows you to choose either a standard AIJ matrix (which supports factorizations for example), or a MatNest object that supports fast extraction of the blocks. Thanks, Matt Mark Adams > writes: > Do you know what kind of solver works well for this problem? > > You probably want to figure that out first and not worry about efficiency. > > MATCOMPOSITE does what you want but not all solvers will work with it. > > Where does this problem come from? We have a lot of experience and might > know something. > > Mark > > On Mon, Dec 12, 2022 at 1:33 PM Peder J?rgensgaard Olesen via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Hello >> >> >> I have a set of sparse matrices (A1, A2, ...) , and need to generate a >> larger matrix B with these as submatrices. I do not know the precise sparse >> layouts of the A's (only that each row has one or two non-zero values), >> and extracting *all* values to copy into B seems incredibly wasteful. How >> can I make use of the sparsity to solve this efficiently? >> >> >> Thanks, >> >> Peder >> >> >> >> Peder J?rgensgaard Olesen >> PhD student >> Department of Civil and Mechanical Engineering >> >> pjool at mek.dtu.dk >> Koppels All? >> Building 403, room 105 >> 2800 Kgs. Lyngby >> www.dtu.dk/english >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mfadams at lbl.gov Tue Dec 13 06:52:17 2022 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 13 Dec 2022 07:52:17 -0500 Subject: [petsc-users] realCoords for DOFs In-Reply-To: <9a891888-dbc6-43a3-36d3-b47c2864a0af@univ-amu.fr> References: <9a891888-dbc6-43a3-36d3-b47c2864a0af@univ-amu.fr> Message-ID: You should be able to use PetscFECreateDefault instead of PetscFECreateLagrange here and set the "order" on the command line, which is recommended, but start with this and low order to debug. Mark static PetscErrorCode crd_func(PetscInt dim, PetscReal time, const PetscReal x[], PetscInt Nf_dummy, PetscScalar *u, void *actx) { int i; PetscFunctionBeginUser; for (i = 0; i < dim; ++i) u[i] = x[i]; PetscFunctionReturn(0); } PetscErrorCode (*initu[1])(PetscInt, PetscReal, const PetscReal [], PetscInt, PetscScalar [], void *); PetscFE fe; /* create coordinate DM */ ierr = DMClone(dm, &crddm);CHKERRV(ierr); ierr = PetscFECreateLagrange(PETSC_COMM_SELF, dim, dim, PETSC_FALSE, order, PETSC_DECIDE, &fe);CHKERRV(ierr); ierr = PetscFESetFromOptions(fe);CHKERRV(ierr); ierr = DMSetField(crddm, 0, NULL, (PetscObject)fe);CHKERRV(ierr); ierr = DMCreateDS(crddm);CHKERRV(ierr); ierr = PetscFEDestroy(&fe);CHKERRV(ierr); /* project coordinates to vertices */ ierr = DMCreateGlobalVector(crddm, &crd_vec);CHKERRV(ierr); initu[0] = crd_func; ierr = DMProjectFunction(crddm, 0.0, initu, NULL, INSERT_ALL_VALUES, crd_vec);CHKERRV(ierr); ierr = VecViewFromOptions(crd_vec, NULL, "-coord_view");CHKERRV(ierr); On Tue, Dec 13, 2022 at 3:38 AM Yann Jobic wrote: > > > Le 12/13/2022 ? 2:04 AM, Mark Adams a ?crit : > > PETSc does not store the coordinates for high order elements (ie, the > > "midside nodes"). > > I get them by projecting a f(x) = x, function. > > Not sure if that is what you want but I can give you a code snippet if > > there are no better ideas. > > It could really help me ! > If i have the node coordinates in the reference element, then it's easy > to project them to the real space. But i couldn't find a way to have > them, so if you could give me some guidance, it could be really helpful. > Thanks, > Yann > > > > > Mark > > > > > > On Mon, Dec 12, 2022 at 6:06 PM Yann Jobic > > wrote: > > > > Hi all, > > > > I'm trying to get the coords of the dofs of a DMPlex for a PetscFE > > discretization, for orders greater than 1. > > > > I'm struggling to run dm/impls/plex/tutorials/ex8.c > > I've got the following error (with option -view_coord) : > > > > [0]PETSC ERROR: --------------------- Error Message > > -------------------------------------------------------------- > > [0]PETSC ERROR: Object is in wrong state > > [0]PETSC ERROR: DMGetCoordinatesLocalSetUp() has not been called > > [0]PETSC ERROR: See https://petsc.org/release/faq/ > > for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.18.2, unknown > > [0]PETSC ERROR: > > > /home/jobic/projet/fe-utils/petsc/fetools/cmake-build-debug/ex_closure_petsc > > on a named luke by jobic Mon Dec 12 23:34:37 2022 > > [0]PETSC ERROR: Configure options > > --prefix=/local/lib/petsc/3.18/p3/gcc/nompi_hdf5 --with-mpi=0 > > --with-debugging=1 --with-blacs=1 --download-zlib,--download-p4est > > --download-hdf5=1 --download-triangle=1 --with-single-library=0 > > --with-large-file-io=1 --with-shared-libraries=0 -CFLAGS=" -g -O0" > > -CXXFLAGS=" -g -O0" -FFLAGS=" -g -O0" PETSC_ARCH=nompi_gcc_hdf5 > > [0]PETSC ERROR: #1 DMGetCoordinatesLocalNoncollective() at > > > /home/devel/src_linux/petsc-3.18.0/src/dm/interface/dmcoordinates.c:621 > > [0]PETSC ERROR: #2 DMPlexGetCellCoordinates() at > > > /home/devel/src_linux/petsc-3.18.0/src/dm/impls/plex/plexgeometry.c:1291 > > [0]PETSC ERROR: #3 main() at > > /home/jobic/projet/fe-utils/petsc/fetools/src/ex_closure_petsc.c:86 > > [0]PETSC ERROR: PETSc Option Table entries: > > [0]PETSC ERROR: -dm_plex_box_faces 2,2 > > [0]PETSC ERROR: -dm_plex_dim 2 > > [0]PETSC ERROR: -dm_plex_simplex 0 > > [0]PETSC ERROR: -petscspace_degree 1 > > [0]PETSC ERROR: -view_coord > > [0]PETSC ERROR: ----------------End of Error Message -------send > entire > > error message to petsc-maint at mcs.anl.gov---------- > > > > Maybe i've done something wrong ? > > > > > > Moreover, i don't quite understand the function > DMPlexGetLocalOffsets, > > and how to use it with DMGetCoordinatesLocal. It seems that > > DMGetCoordinatesLocal do not have the coords of the dofs, but only > the > > nodes defining the geometry. > > > > I've made some small modifications of ex8.c, but i still have an > error : > > [0]PETSC ERROR: --------------------- Error Message > > -------------------------------------------------------------- > > [0]PETSC ERROR: Invalid argument > > [0]PETSC ERROR: Wrong type of object: Parameter # 1 > > [0]PETSC ERROR: WARNING! There are option(s) set that were not used! > > Could be the program crashed before they were used or a spelling > > mistake, etc! > > [0]PETSC ERROR: Option left: name:-sol value: vtk:sol.vtu > > [0]PETSC ERROR: See https://petsc.org/release/faq/ > > for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.18.2, unknown > > [0]PETSC ERROR: > > > /home/jobic/projet/fe-utils/petsc/fetools/cmake-build-debug/view_coords > > on a named luke by jobic Mon Dec 12 23:51:05 2022 > > [0]PETSC ERROR: Configure options > > --prefix=/local/lib/petsc/3.18/p3/gcc/nompi_hdf5 --with-mpi=0 > > --with-debugging=1 --with-blacs=1 --download-zlib,--download-p4est > > --download-hdf5=1 --download-triangle=1 --with-single-library=0 > > --with-large-file-io=1 --with-shared-libraries=0 -CFLAGS=" -g -O0" > > -CXXFLAGS=" -g -O0" -FFLAGS=" -g -O0" PETSC_ARCH=nompi_gcc_hdf5 > > [0]PETSC ERROR: #1 PetscFEGetHeightSubspace() at > > /home/devel/src_linux/petsc-3.18.0/src/dm/dt/fe/interface/fe.c:1692 > > [0]PETSC ERROR: #2 DMPlexGetLocalOffsets() at > > /home/devel/src_linux/petsc-3.18.0/src/dm/impls/plex/plexceed.c:98 > > [0]PETSC ERROR: #3 ViewOffsets() at > > /home/jobic/projet/fe-utils/petsc/fetools/src/view_coords.c:28 > > [0]PETSC ERROR: #4 main() at > > /home/jobic/projet/fe-utils/petsc/fetools/src/view_coords.c:99 > > [0]PETSC ERROR: PETSc Option Table entries: > > [0]PETSC ERROR: -heat_petscspace_degree 2 > > [0]PETSC ERROR: -sol vtk:sol.vtu > > [0]PETSC ERROR: ----------------End of Error Message -------send > entire > > error message to petsc-maint at mcs.anl.gov---------- > > > > > > Is dm/impls/plex/tutorials/ex8.c a good example for viewing the > coords > > of the dofs of a DMPlex ? > > > > > > Thanks, > > > > Yann > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yann.jobic at univ-amu.fr Tue Dec 13 07:03:28 2022 From: yann.jobic at univ-amu.fr (Yann Jobic) Date: Tue, 13 Dec 2022 14:03:28 +0100 Subject: [petsc-users] realCoords for DOFs In-Reply-To: References: <9a891888-dbc6-43a3-36d3-b47c2864a0af@univ-amu.fr> Message-ID: <4799a0b9-b52b-9de1-2416-b3b4116a9dad@univ-amu.fr> That a really smart trick ! Thanks for sharing it. I previously looked closeley to the DMProjectFunction without success yet as it has the solution, but yours is just too good. Yann Le 12/13/2022 ? 1:52 PM, Mark Adams a ?crit?: > You should be able to use PetscFECreateDefault instead of > PetscFECreateLagrange here and set the "order" on?the command > line,?which is recommended,?but start with this and low order to?debug. > > Mark > > static PetscErrorCode crd_func(PetscInt dim, PetscReal time, const > PetscReal x[], PetscInt Nf_dummy, PetscScalar *u, void *actx) > { > ? int i; > ? PetscFunctionBeginUser; > ? for (i = 0; i < dim; ++i) u[i] = x[i]; > ? PetscFunctionReturn(0); > } > > ? PetscErrorCode (*initu[1])(PetscInt, PetscReal, const PetscReal [], > PetscInt, PetscScalar [], void *); > ? PetscFE ? ? ? fe; > > ? /* create coordinate DM */ > ? ierr = DMClone(dm, &crddm);CHKERRV(ierr); > ? ierr = PetscFECreateLagrange(PETSC_COMM_SELF, dim, dim, PETSC_FALSE, > order, PETSC_DECIDE, &fe);CHKERRV(ierr); > ? ierr = PetscFESetFromOptions(fe);CHKERRV(ierr); > ? ierr = DMSetField(crddm, 0, NULL, (PetscObject)fe);CHKERRV(ierr); > ? ierr = DMCreateDS(crddm);CHKERRV(ierr); > ? ierr = PetscFEDestroy(&fe);CHKERRV(ierr); > ? /* project coordinates to vertices */ > ? ierr = DMCreateGlobalVector(crddm, &crd_vec);CHKERRV(ierr); > ? initu[0] = crd_func; > ? ierr = DMProjectFunction(crddm, 0.0, initu, NULL, INSERT_ALL_VALUES, > crd_vec);CHKERRV(ierr); > ? ierr = VecViewFromOptions(crd_vec, NULL, "-coord_view");CHKERRV(ierr); > > On Tue, Dec 13, 2022 at 3:38 AM Yann Jobic > wrote: > > > > Le 12/13/2022 ? 
2:04 AM, Mark Adams a ?crit?: > > PETSc does not store the coordinates for high order elements (ie, > the > > "midside nodes"). > > I get them by projecting a f(x) = x, function. > > Not sure if that is what you want but I can give you a code > snippet if > > there are no better ideas. > > It could really help me ! > If i have the node coordinates in the reference element, then it's easy > to project them to the real space. But i couldn't find a way to have > them, so if you could give me some guidance, it could be really helpful. > Thanks, > Yann > > > > > Mark > > > > > > On Mon, Dec 12, 2022 at 6:06 PM Yann Jobic > > > >> > wrote: > > > >? ? ?Hi all, > > > >? ? ?I'm trying to get the coords of the dofs of a DMPlex for a > PetscFE > >? ? ?discretization, for orders greater than 1. > > > >? ? ?I'm struggling to run dm/impls/plex/tutorials/ex8.c > >? ? ?I've got the following error (with option -view_coord) : > > > >? ? ?[0]PETSC ERROR: --------------------- Error Message > >? ? ?-------------------------------------------------------------- > >? ? ?[0]PETSC ERROR: Object is in wrong state > >? ? ?[0]PETSC ERROR: DMGetCoordinatesLocalSetUp() has not been called > >? ? ?[0]PETSC ERROR: See https://petsc.org/release/faq/ > > >? ? ? > for trouble shooting. > >? ? ?[0]PETSC ERROR: Petsc Release Version 3.18.2, unknown > >? ? ?[0]PETSC ERROR: > > > ?/home/jobic/projet/fe-utils/petsc/fetools/cmake-build-debug/ex_closure_petsc > >? ? ?on a? named luke by jobic Mon Dec 12 23:34:37 2022 > >? ? ?[0]PETSC ERROR: Configure options > >? ? ?--prefix=/local/lib/petsc/3.18/p3/gcc/nompi_hdf5 --with-mpi=0 > >? ? ?--with-debugging=1 --with-blacs=1 > --download-zlib,--download-p4est > >? ? ?--download-hdf5=1 --download-triangle=1 --with-single-library=0 > >? ? ?--with-large-file-io=1 --with-shared-libraries=0 -CFLAGS=" -g > -O0" > >? ? ?-CXXFLAGS=" -g -O0" -FFLAGS=" -g -O0" PETSC_ARCH=nompi_gcc_hdf5 > >? ? ?[0]PETSC ERROR: #1 DMGetCoordinatesLocalNoncollective() at > > > ?/home/devel/src_linux/petsc-3.18.0/src/dm/interface/dmcoordinates.c:621 > >? ? ?[0]PETSC ERROR: #2 DMPlexGetCellCoordinates() at > > > ?/home/devel/src_linux/petsc-3.18.0/src/dm/impls/plex/plexgeometry.c:1291 > >? ? ?[0]PETSC ERROR: #3 main() at > > > ?/home/jobic/projet/fe-utils/petsc/fetools/src/ex_closure_petsc.c:86 > >? ? ?[0]PETSC ERROR: PETSc Option Table entries: > >? ? ?[0]PETSC ERROR: -dm_plex_box_faces 2,2 > >? ? ?[0]PETSC ERROR: -dm_plex_dim 2 > >? ? ?[0]PETSC ERROR: -dm_plex_simplex 0 > >? ? ?[0]PETSC ERROR: -petscspace_degree 1 > >? ? ?[0]PETSC ERROR: -view_coord > >? ? ?[0]PETSC ERROR: ----------------End of Error Message > -------send entire > >? ? ?error message to petsc-maint at mcs.anl.gov---------- > > > >? ? ?Maybe i've done something wrong ? > > > > > >? ? ?Moreover, i don't quite understand the function > DMPlexGetLocalOffsets, > >? ? ?and how to use it with DMGetCoordinatesLocal. It seems that > >? ? ?DMGetCoordinatesLocal do not have the coords of the dofs, but > only the > >? ? ?nodes defining the geometry. > > > >? ? ?I've made some small modifications of ex8.c, but i still have > an error : > >? ? ?[0]PETSC ERROR: --------------------- Error Message > >? ? ?-------------------------------------------------------------- > >? ? ?[0]PETSC ERROR: Invalid argument > >? ? ?[0]PETSC ERROR: Wrong type of object: Parameter # 1 > >? ? ?[0]PETSC ERROR: WARNING! There are option(s) set that were > not used! > >? ? ?Could be the program crashed before they were used or a spelling > >? ? ?mistake, etc! > >? ? 
?[0]PETSC ERROR: Option left: name:-sol value: vtk:sol.vtu > >? ? ?[0]PETSC ERROR: See https://petsc.org/release/faq/ > > >? ? ? > for trouble shooting. > >? ? ?[0]PETSC ERROR: Petsc Release Version 3.18.2, unknown > >? ? ?[0]PETSC ERROR: > > > ?/home/jobic/projet/fe-utils/petsc/fetools/cmake-build-debug/view_coords > >? ? ?on a? named luke by jobic Mon Dec 12 23:51:05 2022 > >? ? ?[0]PETSC ERROR: Configure options > >? ? ?--prefix=/local/lib/petsc/3.18/p3/gcc/nompi_hdf5 --with-mpi=0 > >? ? ?--with-debugging=1 --with-blacs=1 > --download-zlib,--download-p4est > >? ? ?--download-hdf5=1 --download-triangle=1 --with-single-library=0 > >? ? ?--with-large-file-io=1 --with-shared-libraries=0 -CFLAGS=" -g > -O0" > >? ? ?-CXXFLAGS=" -g -O0" -FFLAGS=" -g -O0" PETSC_ARCH=nompi_gcc_hdf5 > >? ? ?[0]PETSC ERROR: #1 PetscFEGetHeightSubspace() at > > > ?/home/devel/src_linux/petsc-3.18.0/src/dm/dt/fe/interface/fe.c:1692 > >? ? ?[0]PETSC ERROR: #2 DMPlexGetLocalOffsets() at > > > ?/home/devel/src_linux/petsc-3.18.0/src/dm/impls/plex/plexceed.c:98 > >? ? ?[0]PETSC ERROR: #3 ViewOffsets() at > >? ? ?/home/jobic/projet/fe-utils/petsc/fetools/src/view_coords.c:28 > >? ? ?[0]PETSC ERROR: #4 main() at > >? ? ?/home/jobic/projet/fe-utils/petsc/fetools/src/view_coords.c:99 > >? ? ?[0]PETSC ERROR: PETSc Option Table entries: > >? ? ?[0]PETSC ERROR: -heat_petscspace_degree 2 > >? ? ?[0]PETSC ERROR: -sol vtk:sol.vtu > >? ? ?[0]PETSC ERROR: ----------------End of Error Message > -------send entire > >? ? ?error message to petsc-maint at mcs.anl.gov---------- > > > > > >? ? ?Is dm/impls/plex/tutorials/ex8.c a good example for viewing > the coords > >? ? ?of the dofs of a DMPlex ? > > > > > >? ? ?Thanks, > > > >? ? ?Yann > > > From balay at mcs.anl.gov Tue Dec 13 07:38:13 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 13 Dec 2022 07:38:13 -0600 (CST) Subject: [petsc-users] Seg fault in gdb but program runs In-Reply-To: References: Message-ID: suggest running your code with valgrind Satish On Mon, 12 Dec 2022, Matthew Knepley wrote: > On Mon, Dec 12, 2022 at 9:56 PM Yuyun Yang > wrote: > > > Hello team, > > > > > > > > I?m debugging my code using gdb. The program runs just fine if I don?t > > debug it, but when I use gdb, it seg faults at a place where it never > > experienced any seg fault when I debugged it 1-2 years ago. I wonder if > > this might be caused by the PETSc version change? > > > > The only PETSc calls are the MatGetOwnershipRange() calls, which have not > changed, so I think this is unlikely. > > > > Or something wrong with gdb itself? I?ve included the code block that is > > problematic for you to take a look at what might be wrong ? seg fault > > happens when this function is called. For context, Spmat is a class of > > sparse matrices in the code: > > > > What is the debugger output? 
> > Thanks, > > Matt > > > > // calculate the exact nonzero structure which results from the kronecker > > outer product of left and right > > > > > > // d_nnz = diagonal nonzero structure, o_nnz = off-diagonal nonzero > > structure > > > > void kronConvert_symbolic(const Spmat &left, const Spmat &right, Mat &mat, > > PetscInt* d_nnz, PetscInt* o_nnz) > > > > > > { > > > > > > size_t rightRowSize = right.size(1); > > > > > > size_t rightColSize = right.size(2); > > > > > > > > > > > > PetscInt Istart,Iend; // rows owned by current processor > > > > > > PetscInt Jstart,Jend; // cols owned by current processor > > > > > > > > > > > > // allocate space for mat > > > > > > MatGetOwnershipRange(mat,&Istart,&Iend); > > > > > > MatGetOwnershipRangeColumn(mat,&Jstart,&Jend); > > > > > > PetscInt m = Iend - Istart; > > > > > > > > > > > > for (int ii=0; ii > > > > > for (int ii=0; ii > > > > > > > > > > > // iterate over only nnz entries > > > > > > Spmat::const_row_iter IiL,IiR; > > > > > > Spmat::const_col_iter JjL,JjR; > > > > > > double valL=0, valR=0, val=0; > > > > > > PetscInt row,col; > > > > > > size_t rowL,colL,rowR,colR; > > > > > > > > > > // loop over all values in left > > > > > > for (IiL=left._mat.begin(); IiL!=left._mat.end(); IiL++) { > > > > > > for (JjL=(IiL->second).begin(); JjL!=(IiL->second).end(); JjL++) { > > > > > > rowL = IiL->first; > > > > > > colL = JjL->first; > > > > > > valL = JjL->second; > > > > > > if (valL==0) { continue; } > > > > > > > > > > > > // loop over all values in right > > > > > > for (IiR=right._mat.begin(); IiR!=right._mat.end(); IiR++) { > > > > > > for (JjR=(IiR->second).begin(); JjR!=(IiR->second).end(); JjR++) > > { > > > > rowR = IiR->first; > > > > > > colR = JjR->first; > > > > > > valR = JjR->second; > > > > > > > > > > > > // the new values and coordinates for the product matrix > > > > > > val = valL*valR; > > > > > > row = rowL*rightRowSize + rowR; > > > > > > col = colL*rightColSize + colR; > > > > > > > > > > > > PetscInt ii = row - Istart; // array index for d_nnz and o_nnz > > > > > > if (val!=0 && row >= Istart && row < Iend && col >= Jstart && > > col < Jend) { d_nnz[ii]++; \ > > > > } > > > > > > if ( (val!=0 && row >= Istart && row < Iend) && (col < Jstart > > || col >= Jend) ) { o_nnz[i\ > > > > i]++; } > > > > > > } > > > > > > } > > > > > > } > > > > > > } > > > > > > } > > > > > > > > > > > > > > > > Thank you, > > > > Yuyun > > > > > From guglielmo2 at llnl.gov Tue Dec 13 00:14:34 2022 From: guglielmo2 at llnl.gov (Guglielmo, Tyler Hardy) Date: Tue, 13 Dec 2022 06:14:34 +0000 Subject: [petsc-users] Saving solution with monitor function Message-ID: Hi all, I am a new PETSc user (and new to MPI in general), and was wondering if someone could help me out with what I am sure is a basic question (if this is not the appropriate email list or there is a better place please let me know!). Basically, I am writing a code that requires a solution to an ODE that will be used later on during runtime. I have written the basic ODE solver using TSRK, however I haven?t thought of a good way to store the actual solution at all time steps throughout the time evolution. I would like to avoid writing each time step to a file through the monitor function, and instead just plug each time step into an array. How is this usually done? I suppose the user defined struct that gets passed into the monitor function could contain a pointer to an array in main? 
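
A pointer to an array of Vec inside the user context does work, also in parallel, because each saved Vec is itself a distributed PETSc vector. A minimal sketch of that approach, assuming the number of steps to keep is known up front; AppCtx, SolutionMonitor and maxsteps are illustrative names, not taken from the original code:

  typedef struct {
    Vec     *sol;      /* saved solutions, one (parallel) Vec per recorded step */
    PetscInt nsaved;
  } AppCtx;

  static PetscErrorCode SolutionMonitor(TS ts, PetscInt step, PetscReal t, Vec u, void *ctx)
  {
    AppCtx *user = (AppCtx *)ctx;

    PetscFunctionBeginUser;
    PetscCall(VecDuplicate(u, &user->sol[user->nsaved])); /* deep copy of the current solution */
    PetscCall(VecCopy(u, user->sol[user->nsaved]));
    user->nsaved++;
    PetscFunctionReturn(0);
  }

  /* in main(), before TSSolve(): */
  PetscInt maxsteps;
  AppCtx   user;

  user.nsaved = 0;
  PetscCall(TSGetMaxSteps(ts, &maxsteps));          /* upper bound on how many copies are kept */
  PetscCall(PetscMalloc1(maxsteps + 1, &user.sol)); /* +1 for the initial condition at step 0 */
  PetscCall(TSMonitorSet(ts, SolutionMonitor, &user, NULL));
  /* after use: VecDestroy() each user.sol[i], then PetscFree(user.sol) */
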
This is how I would do this if the program wasn?t of the MPI variety, but I am not sure how to properly declare a pointer to an array declared as Vec and built through the usual PETSc process. Any tips are greatly appreciated! Thanks for your time, Tyler +++++++++++++++++++++++++++++ Tyler Guglielmo Postdoctoral Researcher Lawrence Livermore National Lab Office: 925-423-6186 Cell: 210-480-8000 +++++++++++++++++++++++++++++ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Dec 13 08:35:35 2022 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 13 Dec 2022 09:35:35 -0500 Subject: [petsc-users] Seg fault in gdb but program runs In-Reply-To: References: Message-ID: On Tue, Dec 13, 2022 at 1:52 AM Yuyun Yang wrote: > Ok I?ll check that, thanks for taking a look! By the way, when I reduce > the domain size this error doesn?t appear anymore, so I don?t know whether > gdb just cannot handle the memory, and start to cut things off which is > causing the seg fault. > It is not the size, it is your arguments. See here mat=@0x555555927e10: 0x555557791bb0, diag=5, offDiag=0) Thanks, Matt > > > *From: *Matthew Knepley > *Date: *Tuesday, December 13, 2022 at 2:49 PM > *To: *Yuyun Yang > *Cc: *petsc-users > *Subject: *Re: [petsc-users] Seg fault in gdb but program runs > > On Tue, Dec 13, 2022 at 1:14 AM Yuyun Yang > wrote: > > Here is the error message: > > > > Program received signal SIGSEGV, Segmentation fault. > 0x00005555555e73b7 in kronConvert (left=..., right=..., > mat=@0x555555927e10: 0x555557791bb0, diag=5, offDiag=0) > at /home/yuyun/scycle-2/source/spmat.cpp:265 > 265 kronConvert_symbolic(left,right,mat,d_nnz,o_nnz); > > > > d_nnz and o_nnz are pointers, and they are supposed to hold arrays of the > number of nonzero in each row, > > You seem to be passing integers. > > > > Thanks, > > > > Matt > > > > On Tue, Dec 13, 2022 at 12:41 PM Matthew Knepley > wrote: > > On Mon, Dec 12, 2022 at 9:56 PM Yuyun Yang > wrote: > > Hello team, > > > > I?m debugging my code using gdb. The program runs just fine if I don?t > debug it, but when I use gdb, it seg faults at a place where it never > experienced any seg fault when I debugged it 1-2 years ago. I wonder if > this might be caused by the PETSc version change? > > > > The only PETSc calls are the MatGetOwnershipRange() calls, which have not > changed, so I think this is unlikely. > > > > Or something wrong with gdb itself? I?ve included the code block that is > problematic for you to take a look at what might be wrong ? seg fault > happens when this function is called. For context, Spmat is a class of > sparse matrices in the code: > > > > What is the debugger output? 
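
That is, the last two arguments of kronConvert_symbolic() must be arrays with one zero-initialized counter per locally owned row, allocated before the call. A minimal sketch of a correct call site, not the original calling code; handing the counts to the matrix with MatXAIJSetPreallocation() afterwards is just one option:

  PetscInt  Istart, Iend, m;
  PetscInt *d_nnz, *o_nnz;

  PetscCall(MatGetOwnershipRange(mat, &Istart, &Iend));
  m = Iend - Istart;
  PetscCall(PetscCalloc2(m, &d_nnz, m, &o_nnz));     /* one counter per local row, zeroed */
  kronConvert_symbolic(left, right, mat, d_nnz, o_nnz);
  PetscCall(MatXAIJSetPreallocation(mat, 1, d_nnz, o_nnz, NULL, NULL));
  PetscCall(PetscFree2(d_nnz, o_nnz));
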
> > > > Thanks, > > > > Matt > > > > // calculate the exact nonzero structure which results from the kronecker > outer product of left and right > > > // d_nnz = diagonal nonzero structure, o_nnz = off-diagonal nonzero > structure > > void kronConvert_symbolic(const Spmat &left, const Spmat &right, Mat &mat, > PetscInt* d_nnz, PetscInt* o_nnz) > > > { > > > size_t rightRowSize = right.size(1); > > > size_t rightColSize = right.size(2); > > > > > > PetscInt Istart,Iend; // rows owned by current processor > > > PetscInt Jstart,Jend; // cols owned by current processor > > > > > > // allocate space for mat > > > MatGetOwnershipRange(mat,&Istart,&Iend); > > > MatGetOwnershipRangeColumn(mat,&Jstart,&Jend); > > > PetscInt m = Iend - Istart; > > > > > > for (int ii=0; ii > > for (int ii=0; ii > > > > > // iterate over only nnz entries > > > Spmat::const_row_iter IiL,IiR; > > > Spmat::const_col_iter JjL,JjR; > > > double valL=0, valR=0, val=0; > > > PetscInt row,col; > > > size_t rowL,colL,rowR,colR; > > > > > // loop over all values in left > > > for (IiL=left._mat.begin(); IiL!=left._mat.end(); IiL++) { > > > for (JjL=(IiL->second).begin(); JjL!=(IiL->second).end(); JjL++) { > > > rowL = IiL->first; > > > colL = JjL->first; > > > valL = JjL->second; > > > if (valL==0) { continue; } > > > > > > // loop over all values in right > > > for (IiR=right._mat.begin(); IiR!=right._mat.end(); IiR++) { > > > for (JjR=(IiR->second).begin(); JjR!=(IiR->second).end(); JjR++) > { > > rowR = IiR->first; > > > colR = JjR->first; > > > valR = JjR->second; > > > > > > // the new values and coordinates for the product matrix > > > val = valL*valR; > > > row = rowL*rightRowSize + rowR; > > > col = colL*rightColSize + colR; > > > > > > PetscInt ii = row - Istart; // array index for d_nnz and o_nnz > > > if (val!=0 && row >= Istart && row < Iend && col >= Jstart && > col < Jend) { d_nnz[ii]++; \ > > } > > > if ( (val!=0 && row >= Istart && row < Iend) && (col < Jstart > || col >= Jend) ) { o_nnz[i\ > > i]++; } > > > } > > > } > > > } > > > } > > > } > > > > > > > > Thank you, > > Yuyun > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Dec 13 08:41:07 2022 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 13 Dec 2022 09:41:07 -0500 Subject: [petsc-users] Saving solution with monitor function In-Reply-To: References: Message-ID: On Tue, Dec 13, 2022 at 8:40 AM Guglielmo, Tyler Hardy via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi all, > > > > I am a new PETSc user (and new to MPI in general), and was wondering if > someone could help me out with what I am sure is a basic question (if this > is not the appropriate email list or there is a better place please let me > know!). 
> > > > Basically, I am writing a code that requires a solution to an ODE that > will be used later on during runtime. I have written the basic ODE solver > using TSRK, however I haven?t thought of a good way to store the actual > solution at all time steps throughout the time evolution. I would like to > avoid writing each time step to a file through the monitor function, and > instead just plug each time step into an array. > > > > How is this usually done? I suppose the user defined struct that gets > passed into the monitor function could contain a pointer to an array in > main? This is how I would do this if the program wasn?t of the MPI > variety, but I am not sure how to properly declare a pointer to an array > declared as Vec and built through the usual PETSc process. Any tips are > greatly appreciated > I think this is what TSTrajectory is for. I believe you want https://petsc.org/main/docs/manualpages/TS/TSTRAJECTORYMEMORY/ Thanks, Matt > Thanks for your time, > > Tyler > > > > +++++++++++++++++++++++++++++ > > Tyler Guglielmo > > Postdoctoral Researcher > > Lawrence Livermore National Lab > > Office: 925-423-6186 > > Cell: 210-480-8000 > > +++++++++++++++++++++++++++++ > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Dec 13 08:55:46 2022 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 13 Dec 2022 09:55:46 -0500 Subject: [petsc-users] parallelize matrix assembly process In-Reply-To: References: <5485EB43-B786-4764-949E-F2F21687DB32@petsc.dev> Message-ID: <86FF810C-994C-4262-AA58-41F5A8688443@petsc.dev> "MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 73239" The preallocation is VERY wrong. This is why the computation is so slow; this number should be zero. > On Dec 12, 2022, at 10:20 PM, ??? wrote: > > Following your comments, > I checked by using '-info'. > > As you suspected, most elements being computed on wrong MPI rank. > Also, there are a lot of stashed entries. > > > > Should I divide the domain from the problem define stage? > Or is a proper preallocation sufficient? 
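
On the question above: a proper per-row preallocation is usually sufficient, and the target is exactly what Barry states, zero mallocs during MatSetValues() as reported by -info. A minimal sketch for an MPIAIJ matrix; how m and the counts are obtained from the element connectivity is illustrative, not from the original finite element code:

  PetscInt  m;                /* number of locally owned rows (dofs on this rank) */
  PetscInt *d_nnz, *o_nnz;
  Mat       A;

  /* ... set m from the local dof count ... */
  PetscCall(PetscCalloc2(m, &d_nnz, m, &o_nnz));
  /* one symbolic pass over the local elements: for each owned row, count how many coupled
     columns are owned by this rank (d_nnz[row]) and how many are not (o_nnz[row]) */

  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetSizes(A, m, m, PETSC_DETERMINE, PETSC_DETERMINE));
  PetscCall(MatSetType(A, MATMPIAIJ));
  PetscCall(MatMPIAIJSetPreallocation(A, 0, d_nnz, 0, o_nnz));
  PetscCall(MatSeqAIJSetPreallocation(A, 0, d_nnz)); /* ignored in parallel, used on one rank */
  PetscCall(PetscFree2(d_nnz, o_nnz));
  /* then the usual MatSetValues() loop followed by MatAssemblyBegin/End */

With exact counts in place, the -info output below should report "Number of mallocs during MatSetValues() is 0".
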
> > > > [0] PetscCommDuplicate(): Duplicating a communicator 139687279637472 94370404729840 max tags = 2147483647 > > [1] PetscCommDuplicate(): Duplicating a communicator 139620736898016 94891084133376 max tags = 2147483647 > > [0] MatSetUp(): Warning not preallocating matrix storage > > [1] PetscCommDuplicate(): Duplicating a communicator 139620736897504 94891083133744 max tags = 2147483647 > > [0] PetscCommDuplicate(): Duplicating a communicator 139687279636960 94370403730224 max tags = 2147483647 > > [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 > > [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 > > [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736898016 94891084133376 > > [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279637472 94370404729840 > > [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736898016 94891084133376 > > [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279637472 94370404729840 > > [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736898016 94891084133376 > > [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279637472 94370404729840 > > [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279637472 94370404729840 > > [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736898016 94891084133376 > > [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736898016 94891084133376 > > [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279637472 94370404729840 > > TIME0 : 0.000000 > > TIME0 : 0.000000 > > [0] VecAssemblyBegin_MPI_BTS(): Stash has 661 entries, uses 8 mallocs. > > [0] VecAssemblyBegin_MPI_BTS(): Block-Stash has 0 entries, uses 0 mallocs. > > [0] VecAssemblyBegin_MPI_BTS(): Stash has 661 entries, uses 5 mallocs. > > [0] VecAssemblyBegin_MPI_BTS(): Block-Stash has 0 entries, uses 0 mallocs. > > [0] MatAssemblyBegin_MPIAIJ(): Stash has 460416 entries, uses 5 mallocs. > > [1] MatAssemblyBegin_MPIAIJ(): Stash has 461184 entries, uses 5 mallocs. > > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 13892 X 13892; storage space: 180684 unneeded,987406 used > > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 73242 > > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 81 > > [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 13892) < 0.6. Do not use CompressedRow routines. > > [0] MatSeqAIJCheckInode(): Found 4631 nodes of 13892. Limit used: 5. Using Inode routines > > [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 13891 X 13891; storage space: 180715 unneeded,987325 used > > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 73239 > > [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 81 > > [1] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 13891) < 0.6. Do not use CompressedRow routines. > > [1] MatSeqAIJCheckInode(): Found 4631 nodes of 13891. Limit used: 5. 
Using Inode routines > > [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 > > [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 > > [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 > > [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 > > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 13892 X 1390; storage space: 72491 unneeded,34049 used > > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 2472 > > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 40 > > [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 12501)/(num_localrows 13892) > 0.6. Use CompressedRow routines. > > Assemble Time : 174.079366sec > > [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 > > [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 > > [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 13891 X 1391; storage space: 72441 unneeded,34049 used > > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 2469 > > [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 41 > > [1] MatCheckCompressedRow(): Found the ratio (num_zerorows 12501)/(num_localrows 13891) > 0.6. Use CompressedRow routines. > > Assemble Time : 174.141234sec > > [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 > > [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 > > [0] VecAssemblyBegin_MPI_BTS(): Stash has 13891 entries, uses 8 mallocs. > > [0] VecAssemblyBegin_MPI_BTS(): Block-Stash has 0 entries, uses 0 mallocs. > > [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 13891 X 13891; storage space: 0 unneeded,987325 used > > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > > [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 81 > > [1] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 13891) < 0.6. Do not use CompressedRow routines. > > [0] PCSetUp(): Setting up PC for first time > > [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 > > [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged > > [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 > > [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 > > [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 > > [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 > > [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 > > [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 > > [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 > > [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged > > [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 > > [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 > > [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged > > Solving Time : 5.085394sec > > [0] KSPConvergedDefault(): Linear solver has converged. 
Residual norm 1.258030470407e-17 is less than relative tolerance 1.000000000000e-05 times initial right hand side norm 2.579617304779e-03 at iteration 1 > > Solving Time : 5.089733sec > > [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 > > [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 > > [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 > > [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 > > [0] VecAssemblyBegin_MPI_BTS(): Stash has 661 entries, uses 5 mallocs. > > [0] VecAssemblyBegin_MPI_BTS(): Block-Stash has 0 entries, uses 0 mallocs. > > [0] MatAssemblyBegin_MPIAIJ(): Stash has 460416 entries, uses 0 mallocs. > > [1] MatAssemblyBegin_MPIAIJ(): Stash has 461184 entries, uses 0 mallocs. > > Assemble Time : 5.242508sec > > [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 > > [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 > > Assemble Time : 5.240863sec > > [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 > > [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 > > [0] VecAssemblyBegin_MPI_BTS(): Stash has 13891 entries, uses 0 mallocs. > > [0] VecAssemblyBegin_MPI_BTS(): Block-Stash has 0 entries, uses 0 mallocs. > > > TIME : 1.000000, TIME_STEP : 1.000000, ITER : 2, RESIDUAL : 2.761615e-03 > > > TIME : 1.000000, TIME_STEP : 1.000000, ITER : 2, RESIDUAL : 2.761615e-03 > > [0] PCSetUp(): Setting up PC with same nonzero pattern > > [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 > > [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 > > [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 > > [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 > > [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 > > [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 > > [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged > > [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged > > [0] KSPConvergedDefault(): Linear solver has converged. Residual norm 1.539725065974e-19 is less than relative tolerance 1.000000000000e-05 times initial right hand side norm 8.015104666105e-06 at iteration 1 > > Solving Time : 4.662785sec > > [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 > > [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 > > Solving Time : 4.664515sec > > [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 > > [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 > > [0] VecAssemblyBegin_MPI_BTS(): Stash has 661 entries, uses 5 mallocs. > > [0] VecAssemblyBegin_MPI_BTS(): Block-Stash has 0 entries, uses 0 mallocs. > > [1] MatAssemblyBegin_MPIAIJ(): Stash has 461184 entries, uses 0 mallocs. > > [0] MatAssemblyBegin_MPIAIJ(): Stash has 460416 entries, uses 0 mallocs. 
> > Assemble Time : 5.238257sec > > [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 > > [1] PetscCommDuplicate(): Using internal PETSc communicator 139620736897504 94891083133744 > > Assemble Time : 5.236535sec > > [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 > > [0] PetscCommDuplicate(): Using internal PETSc communicator 139687279636960 94370403730224 > > > TIME : 1.000000, TIME_STEP : 1.000000, ITER : 3, RESIDUAL : 3.705062e-08 > > TIME0 : 1.000000 > > [0] VecAssemblyBegin_MPI_BTS(): Stash has 13891 entries, uses 0 mallocs. > > [0] VecAssemblyBegin_MPI_BTS(): Block-Stash has 0 entries, uses 0 mallocs. > > > TIME : 1.000000, TIME_STEP : 1.000000, ITER : 3, RESIDUAL : 3.705062e-08 > > TIME0 : 1.000000 > > [1] PetscFinalize(): PetscFinalize() called > > [0] VecAssemblyBegin_MPI_BTS(): Stash has 661 entries, uses 5 mallocs. > > [0] VecAssemblyBegin_MPI_BTS(): Block-Stash has 0 entries, uses 0 mallocs. > > [0] PetscFinalize(): PetscFinalize() called > > > 2022? 12? 13? (?) ?? 12:50, Barry Smith >?? ??: >> >> The problem is possibly due to most elements being computed on "wrong" MPI rank and thus requiring almost all the matrix entries to be "stashed" when computed and then sent off to the owning MPI rank. Please send ALL the output of a parallel run with -info so we can see how much communication is done in the matrix assembly. >> >> Barry >> >> >> > On Dec 12, 2022, at 6:16 AM, ??? > wrote: >> > >> > Hello, >> > >> > >> > I need some keyword or some examples for parallelizing matrix assemble process. >> > >> > My current state is as below. >> > - Finite element analysis code for Structural mechanics. >> > - problem size : 3D solid hexa element (number of elements : 125,000), number of degree of freedom : 397,953 >> > - Matrix type : seqaij, matrix set preallocation by using MatSeqAIJSetPreallocation >> > - Matrix assemble time by using 1 core : 120 sec >> > for (int i=0; i<125000; i++) { >> > ~~ element matrix calculation} >> > matassemblybegin >> > matassemblyend >> > - Matrix assemble time by using 8 core : 70,234sec >> > int start, end; >> > VecGetOwnershipRange( element_vec, &start, &end); >> > for (int i=start; i> > ~~ element matrix calculation >> > matassemblybegin >> > matassemblyend >> > >> > >> > As you see the state, the parallel case spent a lot of time than sequential case.. >> > How can I speed up in this case? >> > Can I get some keyword or examples for parallelizing assembly of matrix in finite element analysis ? >> > >> > Thanks, >> > Hyung Kim >> > >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Dec 13 09:57:03 2022 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 13 Dec 2022 10:57:03 -0500 Subject: [petsc-users] dmplex normal vector incorrect for periodic gmsh grids In-Reply-To: <32720DD5-FA50-4766-A36A-7566913795B3@gmx.net> References: <32720DD5-FA50-4766-A36A-7566913795B3@gmx.net> Message-ID: On Tue, Dec 13, 2022 at 6:11 AM Praveen C wrote: > Hello > > In the attached test, I read a small grid made in gmsh with periodic bc. > > This is a 2d mesh. > > The cell numbers are shown in the figure. > > All faces have length = 2.5 > > But using PetscFVFaceGeom I am getting length of 7.5 for some faces. 
E.g., > > face: 59, centroid = 3.750000, 2.500000, normal = 0.000000, -7.500000 > > ===> Face length incorrect = 7.500000, should be 2.5 > > support[0] = 11, cent = 8.750000, 3.750000, area = 6.250000 > > support[1] = 15, cent = 8.750000, 1.250000, area = 6.250000 > > > There are also errors in the orientation of normal. > > If we disable periodicity in geo file, this error goes away. > Yes, by default we only localize coordinates for cells. I can put in code to localize faces. Thanks, Matt > Thanks > praveen > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongzhang at anl.gov Tue Dec 13 11:51:43 2022 From: hongzhang at anl.gov (Zhang, Hong) Date: Tue, 13 Dec 2022 17:51:43 +0000 Subject: [petsc-users] Saving solution with monitor function In-Reply-To: References: Message-ID: Tyler, The quickest solution is to use TSTrajectory as Matt mentioned. You can add the following command line options to save the solution into a binary file under a folder at each time step. -ts_save_trajectory -ts_trajectory_type visualization The folder name and the file name can be customized with -ts_trajectory_dirname and -ts_trajectory_file_template. If you want to load these files into Matlab, you can use some scripts in share/petsc/matlab/ such as PetscReadBinaryTrajectory.m and PetscBinaryRead.m. The python versions of these scripts are available in lib/petsc/bin/. Hong(Mr.) On Dec 13, 2022, at 12:14 AM, Guglielmo, Tyler Hardy via petsc-users > wrote: Hi all, I am a new PETSc user (and new to MPI in general), and was wondering if someone could help me out with what I am sure is a basic question (if this is not the appropriate email list or there is a better place please let me know!). Basically, I am writing a code that requires a solution to an ODE that will be used later on during runtime. I have written the basic ODE solver using TSRK, however I haven?t thought of a good way to store the actual solution at all time steps throughout the time evolution. I would like to avoid writing each time step to a file through the monitor function, and instead just plug each time step into an array. How is this usually done? I suppose the user defined struct that gets passed into the monitor function could contain a pointer to an array in main? This is how I would do this if the program wasn?t of the MPI variety, but I am not sure how to properly declare a pointer to an array declared as Vec and built through the usual PETSc process. Any tips are greatly appreciated! Thanks for your time, Tyler +++++++++++++++++++++++++++++ Tyler Guglielmo Postdoctoral Researcher Lawrence Livermore National Lab Office: 925-423-6186 Cell: 210-480-8000 +++++++++++++++++++++++++++++ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Dec 13 11:55:53 2022 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 13 Dec 2022 12:55:53 -0500 Subject: [petsc-users] Saving solution with monitor function In-Reply-To: References: Message-ID: <741C810C-C49B-453C-82F6-B3AC41C658D7@petsc.dev> It is also possible to read the solutions back from the trajectory object from your running code. It is not just for saving to files. 
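
A rough sketch of that in-memory workflow, combining the pieces suggested in this thread (TSTRAJECTORYMEMORY from Matt's link, reading back from the trajectory per Barry's note); ts and u are the existing TS and solution Vec, and the retrieval call TSTrajectoryGetVecs() with NULL for its time-derivative argument is an assumption about the API, not something stated in the thread, so check the man pages before relying on it:

  TSTrajectory tj;
  Vec          U;
  PetscReal    t;
  PetscInt     step = 10;   /* illustrative: the step to fetch back later */

  PetscCall(TSSetSaveTrajectory(ts));
  PetscCall(TSGetTrajectory(ts, &tj));
  PetscCall(TSTrajectorySetType(tj, ts, TSTRAJECTORYMEMORY)); /* keep steps in memory, no files */
  PetscCall(TSSolve(ts, u));

  /* later in the run: retrieve the saved solution at a given step */
  PetscCall(VecDuplicate(u, &U));
  PetscCall(TSTrajectoryGetVecs(tj, ts, step, &t, U, NULL));  /* NULL: do not reconstruct u_t */

The same behavior should be selectable from the command line with -ts_save_trajectory -ts_trajectory_type memory when TSSetFromOptions() is called.
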
> On Dec 13, 2022, at 12:51 PM, Zhang, Hong via petsc-users wrote: > > Tyler, > > The quickest solution is to use TSTrajectory as Matt mentioned. You can add the following command line options to save the solution into a binary file under a folder at each time step. > > -ts_save_trajectory -ts_trajectory_type visualization > > The folder name and the file name can be customized with -ts_trajectory_dirname and -ts_trajectory_file_template. > > If you want to load these files into Matlab, you can use some scripts in share/petsc/matlab/ such as PetscReadBinaryTrajectory.m and PetscBinaryRead.m. > > The python versions of these scripts are available in lib/petsc/bin/. > > Hong(Mr.) > >> On Dec 13, 2022, at 12:14 AM, Guglielmo, Tyler Hardy via petsc-users > wrote: >> >> Hi all, >> >> I am a new PETSc user (and new to MPI in general), and was wondering if someone could help me out with what I am sure is a basic question (if this is not the appropriate email list or there is a better place please let me know!). >> >> Basically, I am writing a code that requires a solution to an ODE that will be used later on during runtime. I have written the basic ODE solver using TSRK, however I haven?t thought of a good way to store the actual solution at all time steps throughout the time evolution. I would like to avoid writing each time step to a file through the monitor function, and instead just plug each time step into an array. >> >> How is this usually done? I suppose the user defined struct that gets passed into the monitor function could contain a pointer to an array in main? This is how I would do this if the program wasn?t of the MPI variety, but I am not sure how to properly declare a pointer to an array declared as Vec and built through the usual PETSc process. Any tips are greatly appreciated! >> >> Thanks for your time, >> Tyler >> >> +++++++++++++++++++++++++++++ >> Tyler Guglielmo >> Postdoctoral Researcher >> Lawrence Livermore National Lab >> Office: 925-423-6186 >> Cell: 210-480-8000 >> +++++++++++++++++++++++++++++ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guglielmo2 at llnl.gov Tue Dec 13 12:06:05 2022 From: guglielmo2 at llnl.gov (Guglielmo, Tyler Hardy) Date: Tue, 13 Dec 2022 18:06:05 +0000 Subject: [petsc-users] Saving solution with monitor function In-Reply-To: <741C810C-C49B-453C-82F6-B3AC41C658D7@petsc.dev> References: <741C810C-C49B-453C-82F6-B3AC41C658D7@petsc.dev> Message-ID: Thanks guys, Yes, this looks like exactly what I need. Appreciate everyone?s help! Best, Tyler From: Barry Smith Date: Tuesday, December 13, 2022 at 9:56 AM To: hongzhang at ANL.GOV Cc: Guglielmo, Tyler Hardy , PETSc users list Subject: Re: [petsc-users] Saving solution with monitor function It is also possible to read the solutions back from the trajectory object from your running code. It is not just for saving to files. On Dec 13, 2022, at 12:51 PM, Zhang, Hong via petsc-users wrote: Tyler, The quickest solution is to use TSTrajectory as Matt mentioned. You can add the following command line options to save the solution into a binary file under a folder at each time step. -ts_save_trajectory -ts_trajectory_type visualization The folder name and the file name can be customized with -ts_trajectory_dirname and -ts_trajectory_file_template. If you want to load these files into Matlab, you can use some scripts in share/petsc/matlab/ such as PetscReadBinaryTrajectory.m and PetscBinaryRead.m. The python versions of these scripts are available in lib/petsc/bin/. 
Hong(Mr.) On Dec 13, 2022, at 12:14 AM, Guglielmo, Tyler Hardy via petsc-users > wrote: Hi all, I am a new PETSc user (and new to MPI in general), and was wondering if someone could help me out with what I am sure is a basic question (if this is not the appropriate email list or there is a better place please let me know!). Basically, I am writing a code that requires a solution to an ODE that will be used later on during runtime. I have written the basic ODE solver using TSRK, however I haven?t thought of a good way to store the actual solution at all time steps throughout the time evolution. I would like to avoid writing each time step to a file through the monitor function, and instead just plug each time step into an array. How is this usually done? I suppose the user defined struct that gets passed into the monitor function could contain a pointer to an array in main? This is how I would do this if the program wasn?t of the MPI variety, but I am not sure how to properly declare a pointer to an array declared as Vec and built through the usual PETSc process. Any tips are greatly appreciated! Thanks for your time, Tyler +++++++++++++++++++++++++++++ Tyler Guglielmo Postdoctoral Researcher Lawrence Livermore National Lab Office: 925-423-6186 Cell: 210-480-8000 +++++++++++++++++++++++++++++ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Dec 13 13:21:49 2022 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 13 Dec 2022 14:21:49 -0500 Subject: [petsc-users] dmplex normal vector incorrect for periodic gmsh grids In-Reply-To: References: <32720DD5-FA50-4766-A36A-7566913795B3@gmx.net> Message-ID: On Tue, Dec 13, 2022 at 10:57 AM Matthew Knepley wrote: > On Tue, Dec 13, 2022 at 6:11 AM Praveen C wrote: > >> Hello >> >> In the attached test, I read a small grid made in gmsh with periodic bc. >> >> This is a 2d mesh. >> >> The cell numbers are shown in the figure. >> >> All faces have length = 2.5 >> >> But using PetscFVFaceGeom I am getting length of 7.5 for some faces. >> E.g., >> >> face: 59, centroid = 3.750000, 2.500000, normal = 0.000000, -7.500000 >> >> ===> Face length incorrect = 7.500000, should be 2.5 >> >> support[0] = 11, cent = 8.750000, 3.750000, area = 6.250000 >> >> support[1] = 15, cent = 8.750000, 1.250000, area = 6.250000 >> >> >> There are also errors in the orientation of normal. >> >> If we disable periodicity in geo file, this error goes away. >> > > Yes, by default we only localize coordinates for cells. I can put in code > to localize faces. > Okay, I now have a MR for this: https://gitlab.com/petsc/petsc/-/merge_requests/5917 I am attaching your code, slightly modified. You can run ./dmplex -malloc_debug 0 -dm_plex_box_upper 10,10 -dm_plex_box_faces 4,4 -dm_plex_simplex 0 -dm_view ::ascii_info_detail -draw_pause 3 -dm_plex_box_bd periodic,periodic -dm_localize_height 0 which shows incorrect edges and ./dmplex -malloc_debug 0 -dm_plex_box_upper 10,10 -dm_plex_box_faces 4,4 -dm_plex_simplex 0 -dm_view ::ascii_info_detail -draw_pause 3 -dm_plex_box_bd periodic,periodic -dm_localize_height 1 which is correct. If you want to control things yourself, instead of using the option you can call DMPlexSetMaxProjectionHeight() on the coordinate DM yourself. Thanks, Matt > Thanks, > > Matt > > >> Thanks >> praveen >> > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
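
For reference, a minimal sketch of the manual route Matt describes, setting the projection height on the coordinate DM so that face points are localized as well; whether DMLocalizeCoordinates() must be called (or called again) afterwards is an assumption, not something stated above:

  DM cdm;

  PetscCall(DMGetCoordinateDM(dm, &cdm));
  PetscCall(DMPlexSetMaxProjectionHeight(cdm, 1)); /* height 0 = cells, height 1 = faces */
  PetscCall(DMLocalizeCoordinates(dm));            /* build cell- and face-localized coordinates */
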
> -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: dmplex.c Type: application/octet-stream Size: 5118 bytes Desc: not available URL: From bourdin at mcmaster.ca Tue Dec 13 16:15:24 2022 From: bourdin at mcmaster.ca (Blaise Bourdin) Date: Tue, 13 Dec 2022 22:15:24 +0000 Subject: [petsc-users] GAMG and linearized elasticity Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: gamg_agg.txt URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: ksp_view_3.3.txt URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: ksp_view_main.txt URL: From jed at jedbrown.org Tue Dec 13 21:37:32 2022 From: jed at jedbrown.org (Jed Brown) Date: Tue, 13 Dec 2022 20:37:32 -0700 Subject: [petsc-users] GAMG and linearized elasticity In-Reply-To: References: Message-ID: <87r0x2mz9f.fsf@jedbrown.org> Do you have slip/symmetry boundary conditions, where some components are constrained? In that case, there is no uniform block size and I think you'll need DMPlexCreateRigidBody() and MatSetNearNullSpace(). The PCSetCoordinates() code won't work for non-constant block size. -pc_type gamg should work okay out of the box for elasticity. For hypre, I've had good luck with this options suite, which also runs on GPU. -pc_type hypre -pc_hypre_boomeramg_coarsen_type pmis -pc_hypre_boomeramg_interp_type ext+i -pc_hypre_boomeramg_no_CF -pc_hypre_boomeramg_P_max 6 -pc_hypre_boomeramg_relax_type_down Chebyshev -pc_hypre_boomeramg_relax_type_up Chebyshev -pc_hypre_boomeramg_strong_threshold 0.5 Blaise Bourdin writes: > Hi, > > I am getting close to finish porting a code from petsc 3.3 / sieve to main / dmplex, but am > now encountering difficulties > I am reasonably sure that the Jacobian and residual are correct. The codes handle boundary > conditions differently (MatZeroRowCols vs dmplex constraints) so it is not trivial to compare > them. Running with snes_type ksponly pc_type Jacobi or hyper gives me the same results in > roughly the same number of iterations. > > In my old code, gamg would work out of the box. When using petsc-main, -pc_type gamg - > pc_gamg_type agg works for _some_ problems using P1-Lagrange elements, but never for > P2-Lagrange. The typical error message is in gamg_agg.txt > > When using -pc_type classical, a problem where the KSP would converge in 47 iteration in > 3.3 now takes 1400. ksp_view_3.3.txt and ksp_view_main.txt show the output of -ksp_view > for both versions. I don?t notice anything obvious. > > Strangely, removing the call to PCSetCoordinates does not have any impact on the > convergence. > > I am sure that I am missing something, or not passing the right options. What?s a good > starting point for 3D elasticity? > Regards, > Blaise > > ? 
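
A minimal sketch of the near-null-space setup Jed mentions, for the DMPlex case; J stands for the assembled elasticity Jacobian, and passing field 0 to DMPlexCreateRigidBody() (the displacement field) is an assumption about this particular discretization, not taken from the thread:

  MatNullSpace ns;

  PetscCall(DMPlexCreateRigidBody(dm, 0, &ns));  /* translations + rotations for field 0 */
  PetscCall(MatSetNearNullSpace(J, ns));
  PetscCall(MatNullSpaceDestroy(&ns));

GAMG picks the rigid-body modes up from the operator's near null space, which is the usual replacement for PCSetCoordinates() when some displacement components are constrained; for a non-Plex code the same modes can be built from a coordinate Vec with MatNullSpaceCreateRigidBody().
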
> Canada Research Chair in Mathematical and Computational Aspects of Solid Mechanics > (Tier 1) > Professor, Department of Mathematics & Statistics > Hamilton Hall room 409A, McMaster University > 1280 Main Street West, Hamilton, Ontario L8S 4K1, Canada > https://www.math.mcmaster.ca/bourdin | +1 (905) 525 9140 ext. 27243 > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Petsc has generated inconsistent data > [0]PETSC ERROR: Computed maximum singular value as zero > [0]PETSC ERROR: WARNING! There are option(s) set that were not used! Could be the program crashed before they were used or a spelling mistake, etc! > [0]PETSC ERROR: Option left: name:-displacement_ksp_converged_reason value: ascii source: file > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [0]PETSC ERROR: Petsc Development GIT revision: v3.18.2-341-g16200351da0 GIT Date: 2022-12-12 23:42:20 +0000 > [0]PETSC ERROR: /home/bourdinb/Development/mef90/mef90-dmplex/bbserv-gcc11.2.1-mvapich2-2.3.7-O/bin/ThermoElasticity on a bbserv-gcc11.2.1-mvapich2-2.3.7-O named bb01 by bourdinb Tue Dec 13 17:02:19 2022 > [0]PETSC ERROR: Configure options --CFLAGS=-Wunused --FFLAGS="-ffree-line-length-none -fallow-argument-mismatch -Wunused" --COPTFLAGS="-O2 -march=znver2" --CXXOPTFLAGS="-O2 -march=znver2" --FOPTFLAGS="-O2 -march=znver2" --download-chaco=1 --download-exodusii=1 --download-fblaslapack=1 --download-hdf5=1 --download-hypre=1 --download-metis=1 --download-ml=1 --download-mumps=1 --download-netcdf=1 --download-p4est=1 --download-parmetis=1 --download-pnetcdf=1 --download-scalapack=1 --download-sowing=1 --download-sowing-cc=/opt/rh/devtoolset-9/root/usr/bin/gcc --download-sowing-cxx=/opt/rh/devtoolset-9/root/usr/bin/g++ --download-sowing-cpp=/opt/rh/devtoolset-9/root/usr/bin/cpp --download-sowing-cxxcpp=/opt/rh/devtoolset-9/root/usr/bin/cpp --download-superlu=1 --download-triangle=1 --download-yaml=1 --download-zlib=1 --with-debugging=0 --with-mpi-dir=/opt/HPC/mvapich2/2.3.7-gcc11.2.1 --with-pic --with-shared-libraries=1 --with-mpiexec=srun --with-x11=0 > [0]PETSC ERROR: #1 PCGAMGOptProlongator_AGG() at /1/HPC/petsc/main/src/ksp/pc/impls/gamg/agg.c:779 > [0]PETSC ERROR: #2 PCSetUp_GAMG() at /1/HPC/petsc/main/src/ksp/pc/impls/gamg/gamg.c:639 > [0]PETSC ERROR: #3 PCSetUp() at /1/HPC/petsc/main/src/ksp/pc/interface/precon.c:994 > [0]PETSC ERROR: #4 KSPSetUp() at /1/HPC/petsc/main/src/ksp/ksp/interface/itfunc.c:405 > [0]PETSC ERROR: #5 KSPSolve_Private() at /1/HPC/petsc/main/src/ksp/ksp/interface/itfunc.c:824 > [0]PETSC ERROR: #6 KSPSolve() at /1/HPC/petsc/main/src/ksp/ksp/interface/itfunc.c:1070 > [0]PETSC ERROR: #7 SNESSolve_KSPONLY() at /1/HPC/petsc/main/src/snes/impls/ksponly/ksponly.c:48 > [0]PETSC ERROR: #8 SNESSolve() at /1/HPC/petsc/main/src/snes/interface/snes.c:4693 > [0]PETSC ERROR: #9 /home/bourdinb/Development/mef90/mef90-dmplex/ThermoElasticity/ThermoElasticity.F90:228 > Linear solve converged due to CONVERGED_RTOL iterations 46 > KSP Object:(Disp_) 32 MPI processes > type: cg > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-08, divergence=1e+10 > left preconditioning > using nonzero initial guess > using PRECONDITIONED norm type for convergence test > PC Object:(Disp_) 32 MPI processes > type: gamg > MG: type is MULTIPLICATIVE, levels=4 cycles=v > Cycles per PCApply=1 > Using Galerkin computed coarse grid matrices > Coarse grid solver -- level ------------------------------- > KSP Object: 
(Disp_mg_coarse_) 32 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_) 32 MPI processes > type: bjacobi > block Jacobi: number of blocks = 32 > Local solve info for each block is in the following KSP and PC objects: > [0] number of local blocks = 1, first local block number = 0 > [0] local block number 0 > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 1.06061 > Factored matrix follows: > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > 
factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Matrix Object: 1 MPI processes > type: seqaij > rows=54, cols=54, bs=6 > package used to perform factorization: petsc > total: nonzeros=1260, allocated nonzeros=1260 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 16 nodes, limit used is 5 > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=54, cols=54, bs=6 > total: nonzeros=1188, allocated nonzeros=1188 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 17 nodes, limit used is 5 > - - - - - - - - - - - - - - - - - - > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > 
package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, 
allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > 
type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero 
pivot 2.22045e-14 > matrix ordering: nd > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > factor fill ratio 
given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated 
nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place 
factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > [1] number of local blocks = 1, first local block number = 1 > [1] local block number 0 > - - - - - - - - - - - - - - - - - - > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > [2] number of local blocks = 1, first local block number = 2 > [2] local block number 0 > - - - - - - - - - - - - - - - - - - > [3] number of local blocks = 1, first local block number = 3 > [3] local block number 0 > - - - - - - - - - - - - - - - - - - > [4] number of local blocks = 1, first local block number = 4 > [4] local block number 0 > - - - - - - - - - - - - - - - - - - > 
[5] number of local blocks = 1, first local block number = 5 > [5] local block number 0 > - - - - - - - - - - - - - - - - - - > [6] number of local blocks = 1, first local block number = 6 > [6] local block number 0 > - - - - - - - - - - - - - - - - - - > [7] number of local blocks = 1, first local block number = 7 > [7] local block number 0 > - - - - - - - - - - - - - - - - - - > [8] number of local blocks = 1, first local block number = 8 > [8] local block number 0 > - - - - - - - - - - - - - - - - - - > [9] number of local blocks = 1, first local block number = 9 > [9] local block number 0 > - - - - - - - - - - - - - - - - - - > [10] number of local blocks = 1, first local block number = 10 > [10] local block number 0 > - - - - - - - - - - - - - - - - - - > [11] number of local blocks = 1, first local block number = 11 > [11] local block number 0 > - - - - - - - - - - - - - - - - - - > [12] number of local blocks = 1, first local block number = 12 > [12] local block number 0 > - - - - - - - - - - - - - - - - - - > [13] number of local blocks = 1, first local block number = 13 > [13] local block number 0 > - - - - - - - - - - - - - - - - - - > [14] number of local blocks = 1, first local block number = 14 > [14] local block number 0 > - - - - - - - - - - - - - - - - - - > [15] number of local blocks = 1, first local block number = 15 > [15] local block number 0 > - - - - - - - - - - - - - - - - - - > [16] number of local blocks = 1, first local block number = 16 > [16] local block number 0 > - - - - - - - - - - - - - - - - - - > [17] number of local blocks = 1, first local block number = 17 > [17] local block number 0 > - - - - - - - - - - - - - - - - - - > [18] number of local blocks = 1, first local block number = 18 > [18] local block number 0 > - - - - - - - - - - - - - - - - - - > [19] number of local blocks = 1, first local block number = 19 > [19] local block number 0 > - - - - - - - - - - - - - - - - - - > [20] number of local blocks = 1, first local block number = 20 > [20] local block number 0 > - - - - - - - - - - - - - - - - - - > [21] number of local blocks = 1, first local block number = 21 > [21] local block number 0 > - - - - - - - - - - - - - - - - - - > [22] number of local blocks = 1, first local block number = 22 > [22] local block number 0 > - - - - - - - - - - - - - - - - - - > [23] number of local blocks = 1, first local block number = 23 > [23] local block number 0 > - - - - - - - - - - - - - - - - - - > [24] number of local blocks = 1, first local block number = 24 > [24] local block number 0 > - - - - - - - - - - - - - - - - - - > [25] number of local blocks = 1, first local block number = 25 > [25] local block number 0 > - - - - - - - - - - - - - - - - - - > [26] number of local blocks = 1, first local block number = 26 > [26] local block number 0 > - - - - - - - - - - - - - - - - - - > [27] number of local blocks = 1, first local block number = 27 > [27] local block number 0 > - - - - - - - - - - - - - - - - - - > [28] number of local blocks = 1, first local block number = 28 > [28] local block number 0 > - - - - - - - - - - - - - - - - - - > [29] number of local blocks = 1, first local block number = 29 > [29] local block number 0 > - - - - - - - - - - - - - - - - - - > [30] number of local blocks = 1, first local block number = 30 > [30] local block number 0 > - - - - - - - - - - - - - - - - - - > [31] number of local blocks = 1, first local block number = 31 > [31] local block number 0 > - - - - - - - - - - - - - - - - - - > linear system matrix = precond 
matrix: > Matrix Object: 32 MPI processes > type: mpiaij > rows=54, cols=54, bs=6 > total: nonzeros=1188, allocated nonzeros=1188 > total number of mallocs used during MatSetValues calls =0 > using I-node (on process 0) routines: found 17 nodes, limit used is 5 > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object: (Disp_mg_levels_1_) 32 MPI processes > type: chebyshev > Chebyshev: eigenvalue estimates: min = 0.101023, max = 2.13327 > maximum iterations=2 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using nonzero initial guess > using NONE norm type for convergence test > PC Object: (Disp_mg_levels_1_) 32 MPI processes > type: jacobi > linear system matrix = precond matrix: > Matrix Object: 32 MPI processes > type: mpiaij > rows=1086, cols=1086, bs=6 > total: nonzeros=67356, allocated nonzeros=67356 > total number of mallocs used during MatSetValues calls =0 > using I-node (on process 0) routines: found 362 nodes, limit used is 5 > Up solver (post-smoother) same as down solver (pre-smoother) > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object: (Disp_mg_levels_2_) 32 MPI processes > type: chebyshev > Chebyshev: eigenvalue estimates: min = 0.0996526, max = 2.29388 > maximum iterations=2 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using nonzero initial guess > using NONE norm type for convergence test > PC Object: (Disp_mg_levels_2_) 32 MPI processes > type: jacobi > linear system matrix = precond matrix: > Matrix Object: 32 MPI processes > type: mpiaij > rows=23808, cols=23808, bs=6 > total: nonzeros=1976256, allocated nonzeros=1976256 > total number of mallocs used during MatSetValues calls =0 > using I-node (on process 0) routines: found 7936 nodes, limit used is 5 > Up solver (post-smoother) same as down solver (pre-smoother) > Down solver (pre-smoother) on level 3 ------------------------------- > KSP Object: (Disp_mg_levels_3_) 32 MPI processes > type: chebyshev > Chebyshev: eigenvalue estimates: min = 0.165968, max = 2.13065 > maximum iterations=2 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using nonzero initial guess > using NONE norm type for convergence test > PC Object: (Disp_mg_levels_3_) 32 MPI processes > type: jacobi > linear system matrix = precond matrix: > Matrix Object: (Disp_) 32 MPI processes > type: mpiaij > rows=291087, cols=291087 > total: nonzeros=12323691, allocated nonzeros=12336696 > total number of mallocs used during MatSetValues calls =0 > using I-node (on process 0) routines: found 3419 nodes, limit used is 5 > Up solver (post-smoother) same as down solver (pre-smoother) > linear system matrix = precond matrix: > Matrix Object: (Disp_) 32 MPI processes > type: mpiaij > rows=291087, cols=291087 > total: nonzeros=12323691, allocated nonzeros=12336696 > total number of mallocs used during MatSetValues calls =0 > using I-node (on process 0) routines: found 3419 nodes, limit used is 5 > SNESConvergedReason returned 5 > KSP Object: (Displacement_) 32 MPI processes > type: cg > maximum iterations=10000, nonzero initial guess > tolerances: relative=1e-05, absolute=1e-08, divergence=1e+10 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: (Displacement_) 32 MPI processes > type: gamg > type is MULTIPLICATIVE, levels=4 cycles=v > Cycles per PCApply=1 > Using externally compute Galerkin coarse grid matrices > GAMG specific options > Threshold 
for dropping small values in graph on each level = -1. -1. -1. -1. > Threshold scaling factor for each level not specified = 1. > Complexity: grid = 1.02128 operator = 1.05534 > Coarse grid solver -- level 0 ------------------------------- > KSP Object: (Displacement_mg_coarse_) 32 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (Displacement_mg_coarse_) 32 MPI processes > type: bjacobi > number of blocks = 32 > Local solver information for first block is in the following KSP and PC objects on rank 0: > Use -Displacement_mg_coarse_ksp_view ::ascii_info_detail to display information for all blocks > KSP Object: (Displacement_mg_coarse_sub_) 1 MPI process > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (Displacement_mg_coarse_sub_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > using diagonal shift on blocks to prevent zero pivot [INBLOCKS] > matrix ordering: nd > factor fill ratio given 5., needed 1.08081 > Factored matrix follows: > Mat Object: (Displacement_mg_coarse_sub_) 1 MPI process > type: seqaij > rows=20, cols=20 > package used to perform factorization: petsc > total: nonzeros=214, allocated nonzeros=214 > using I-node routines: found 8 nodes, limit used is 5 > linear system matrix = precond matrix: > Mat Object: (Displacement_mg_coarse_sub_) 1 MPI process > type: seqaij > rows=20, cols=20 > total: nonzeros=198, allocated nonzeros=198 > total number of mallocs used during MatSetValues calls=0 > using I-node routines: found 13 nodes, limit used is 5 > linear system matrix = precond matrix: > Mat Object: 32 MPI processes > type: mpiaij > rows=20, cols=20 > total: nonzeros=198, allocated nonzeros=198 > total number of mallocs used during MatSetValues calls=0 > using I-node (on process 0) routines: found 13 nodes, limit used is 5 > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object: (Displacement_mg_levels_1_) 32 MPI processes > type: chebyshev > eigenvalue targets used: min 0.81922, max 9.01143 > eigenvalues estimated via gmres: min 0.186278, max 8.1922 > eigenvalues estimated using gmres with transform: [0. 0.1; 0. 1.1] > KSP Object: (Displacement_mg_levels_1_esteig_) 32 MPI processes > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10, initial guess is zero > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > estimating eigenvalues using noisy right hand side > maximum iterations=2, nonzero initial guess > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> left preconditioning > using NONE norm type for convergence test > PC Object: (Displacement_mg_levels_1_) 32 MPI processes > type: jacobi > type DIAGONAL > linear system matrix = precond matrix: > Mat Object: 32 MPI processes > type: mpiaij > rows=799, cols=799 > total: nonzeros=83159, allocated nonzeros=83159 > total number of mallocs used during MatSetValues calls=0 > using I-node (on process 0) routines: found 23 nodes, limit used is 5 > Up solver (post-smoother) same as down solver (pre-smoother) > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object: (Displacement_mg_levels_2_) 32 MPI processes > type: chebyshev > eigenvalue targets used: min 1.16291, max 12.792 > eigenvalues estimated via gmres: min 0.27961, max 11.6291 > eigenvalues estimated using gmres with transform: [0. 0.1; 0. 1.1] > KSP Object: (Displacement_mg_levels_2_esteig_) 32 MPI processes > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10, initial guess is zero > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > estimating eigenvalues using noisy right hand side > maximum iterations=2, nonzero initial guess > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (Displacement_mg_levels_2_) 32 MPI processes > type: jacobi > type DIAGONAL > linear system matrix = precond matrix: > Mat Object: 32 MPI processes > type: mpiaij > rows=45721, cols=45721 > total: nonzeros=9969661, allocated nonzeros=9969661 > total number of mallocs used during MatSetValues calls=0 > using nonscalable MatPtAP() implementation > not using I-node (on process 0) routines > Up solver (post-smoother) same as down solver (pre-smoother) > Down solver (pre-smoother) on level 3 ------------------------------- > KSP Object: (Displacement_mg_levels_3_) 32 MPI processes > type: chebyshev > eigenvalue targets used: min 0.281318, max 3.0945 > eigenvalues estimated via gmres: min 0.0522027, max 2.81318 > eigenvalues estimated using gmres with transform: [0. 0.1; 0. 1.1] > KSP Object: (Displacement_mg_levels_3_esteig_) 32 MPI processes > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10, initial guess is zero > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > estimating eigenvalues using noisy right hand side > maximum iterations=2, nonzero initial guess > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> left preconditioning > using NONE norm type for convergence test > PC Object: (Displacement_mg_levels_3_) 32 MPI processes > type: jacobi > type DIAGONAL > linear system matrix = precond matrix: > Mat Object: (Displacement_) 32 MPI processes > type: mpiaij > rows=2186610, cols=2186610, bs=3 > total: nonzeros=181659996, allocated nonzeros=181659996 > total number of mallocs used during MatSetValues calls=0 > has attached near null space > using I-node (on process 0) routines: found 21368 nodes, limit used is 5 > Up solver (post-smoother) same as down solver (pre-smoother) > linear system matrix = precond matrix: > Mat Object: (Displacement_) 32 MPI processes > type: mpiaij > rows=2186610, cols=2186610, bs=3 > total: nonzeros=181659996, allocated nonzeros=181659996 > total number of mallocs used during MatSetValues calls=0 > has attached near null space > using I-node (on process 0) routines: found 21368 nodes, limit used is 5 > cell set 1 elastic energy: 9.32425E-02 work: 1.86485E-01 total: -9.32425E-02 From praveen at gmx.net Wed Dec 14 01:37:53 2022 From: praveen at gmx.net (Praveen C) Date: Wed, 14 Dec 2022 13:07:53 +0530 Subject: [petsc-users] dmplex normal vector incorrect for periodic gmsh grids In-Reply-To: References: <32720DD5-FA50-4766-A36A-7566913795B3@gmx.net> Message-ID: Thank you, this MR works if I generate the mesh within the code. But using a mesh made in gmsh, I see the same issue. Thanks praveen > On 14-Dec-2022, at 12:51 AM, Matthew Knepley wrote: > > On Tue, Dec 13, 2022 at 10:57 AM Matthew Knepley > wrote: >> On Tue, Dec 13, 2022 at 6:11 AM Praveen C > wrote: >>> Hello >>> >>> In the attached test, I read a small grid made in gmsh with periodic bc. >>> >>> This is a 2d mesh. >>> >>> The cell numbers are shown in the figure. >>> >>> All faces have length = 2.5 >>> >>> But using PetscFVFaceGeom I am getting length of 7.5 for some faces. E.g., >>> >>> face: 59, centroid = 3.750000, 2.500000, normal = 0.000000, -7.500000 >>> ===> Face length incorrect = 7.500000, should be 2.5 >>> support[0] = 11, cent = 8.750000, 3.750000, area = 6.250000 >>> support[1] = 15, cent = 8.750000, 1.250000, area = 6.250000 >>> >>> There are also errors in the orientation of normal. >>> >>> If we disable periodicity in geo file, this error goes away. >> >> Yes, by default we only localize coordinates for cells. I can put in code to localize faces. > > Okay, I now have a MR for this: https://gitlab.com/petsc/petsc/-/merge_requests/5917 > > I am attaching your code, slightly modified. You can run > > ./dmplex -malloc_debug 0 -dm_plex_box_upper 10,10 -dm_plex_box_faces 4,4 -dm_plex_simplex 0 -dm_view ::ascii_info_detail -draw_pause 3 -dm_plex_box_bd periodic,periodic -dm_localize_height 0 > > which shows incorrect edges and > > ./dmplex -malloc_debug 0 -dm_plex_box_upper 10,10 -dm_plex_box_faces 4,4 -dm_plex_simplex 0 -dm_view ::ascii_info_detail -draw_pause 3 -dm_plex_box_bd periodic,periodic -dm_localize_height 1 > > which is correct. If you want to control things yourself, instead of using the option you can call DMPlexSetMaxProjectionHeight() on the coordinate DM yourself. > > Thanks, > > Matt > >> Thanks, >> >> Matt >> >>> Thanks >>> praveen >> -- >> -------------- next part -------------- An HTML attachment was scrubbed... 
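For readers following this thread, a minimal sketch of the API route Matthew describes above (calling DMPlexSetMaxProjectionHeight() on the coordinate DM instead of passing -dm_localize_height 1) might look like the following. The variable name dm and the placement of the call before DMLocalizeCoordinates() are assumptions for illustration and are not taken from the attached code:

  DM cdm;

  /* dm is the DMPlex read from the gmsh file (hypothetical name) */
  PetscCall(DMGetCoordinateDM(dm, &cdm));
  /* localize coordinates down to faces (height 1), not only cells (height 0) */
  PetscCall(DMPlexSetMaxProjectionHeight(cdm, 1));
  /* assumed to be done before (re)localizing the periodic coordinates */
  PetscCall(DMLocalizeCoordinates(dm));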
URL: From berend.vanwachem at ovgu.de Wed Dec 14 02:57:49 2022 From: berend.vanwachem at ovgu.de (Berend van Wachem) Date: Wed, 14 Dec 2022 09:57:49 +0100 Subject: [petsc-users] reading and writing periodic DMPlex to file Message-ID: <83e2b092-5440-e009-ef84-dfde3ff6804d@ovgu.de> Dear PETSc team and users, I have asked a few times about this before, but we haven't really gotten this to work yet. In our code, we use the DMPlex framework and are also interested in periodic geometries. As our simulations typically require many time-steps, we would like to be able to save the DM to file and to read it again to resume the simulation (a restart). Although this works for a non-periodic DM, we haven't been able to get this to work for a periodic one. To illustrate this, I have made a working example, consisting of 2 files, createandwrite.c and readandcreate.c. I have attached these 2 working examples. We are using Petsc-3.18.2. In the first file (createandwrite.c) a DMPlex is created and written to a file. Periodicity is activated on lines 52-55 of the code. In the second file (readandcreate.c) a DMPlex is read from the file. When a periodic DM is read, this does not work. Also, trying to 'enforce' periodicity, lines 55 - 66, does not work if the number of processes is larger than 1 - the code "hangs" without producing an error. Could you indicate what I am missing? I have really tried many different options, without finding a solution. Many thanks and kind regards, Berend. -------------- next part -------------- A non-text attachment was scrubbed... Name: createandwrite.c Type: text/x-csrc Size: 7721 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: readandcreate.c Type: text/x-csrc Size: 5379 bytes Desc: not available URL: From berend.vanwachem at ovgu.de Wed Dec 14 03:27:45 2022 From: berend.vanwachem at ovgu.de (Berend van Wachem) Date: Wed, 14 Dec 2022 10:27:45 +0100 Subject: [petsc-users] reading and writing periodic DMPlex to file Message-ID: <951f341e-9fa3-a661-2cc9-9745345b1a4c@ovgu.de> Dear PETSc team and users, I have asked a few times about this before, but we haven't really gotten this to work yet. In our code, we use the DMPlex framework and are also interested in periodic geometries. As our simulations typically require many time-steps, we would like to be able to save the DM to file and to read it again to resume the simulation (a restart). Although this works for a non-periodic DM, we haven't been able to get this to work for a periodic one. To illustrate this, I have made a working example, consisting of 2 files, createandwrite.c and readandcreate.c. I have attached these 2 working examples. We are using Petsc-3.18.2. In the first file (createandwrite.c) a DMPlex is created and written to a file. Periodicity is activated on lines 52-55 of the code. In the second file (readandcreate.c) a DMPlex is read from the file. When a periodic DM is read, this does not work. Also, trying to 'enforce' periodicity, lines 55 - 66, does not work if the number of processes is larger than 1 - the code "hangs" without producing an error. Could you indicate what I am missing? I have really tried many different options, without finding a solution. Many thanks and kind regards, Berend. -------------- next part -------------- A non-text attachment was scrubbed... Name: createandwrite.c Type: text/x-csrc Size: 7721 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
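For context on the checkpoint/restart workflow described in the message above, the basic save/load cycle that works for a non-periodic DMPlex is sketched below. This is only an outline under assumptions (file name, viewer format, and the surrounding setup are illustrative), and, as Berend reports, the periodic case still fails with this approach:

  PetscViewer viewer;
  DM          dm_loaded;

  /* write the existing DMPlex (dm) to an HDF5 checkpoint file (name is illustrative) */
  PetscCall(PetscViewerHDF5Open(PETSC_COMM_WORLD, "restart.h5", FILE_MODE_WRITE, &viewer));
  PetscCall(PetscViewerPushFormat(viewer, PETSC_VIEWER_HDF5_PETSC));
  PetscCall(DMView(dm, viewer));
  PetscCall(PetscViewerPopFormat(viewer));
  PetscCall(PetscViewerDestroy(&viewer));

  /* later, in the restarted run: create an empty DMPlex and load it back */
  PetscCall(PetscViewerHDF5Open(PETSC_COMM_WORLD, "restart.h5", FILE_MODE_READ, &viewer));
  PetscCall(DMCreate(PETSC_COMM_WORLD, &dm_loaded));
  PetscCall(DMSetType(dm_loaded, DMPLEX));
  PetscCall(DMLoad(dm_loaded, viewer));
  PetscCall(PetscViewerDestroy(&viewer));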
Name: readandcreate.c Type: text/x-csrc Size: 5379 bytes Desc: not available URL: From knepley at gmail.com Wed Dec 14 07:10:03 2022 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 14 Dec 2022 08:10:03 -0500 Subject: [petsc-users] dmplex normal vector incorrect for periodic gmsh grids In-Reply-To: References: <32720DD5-FA50-4766-A36A-7566913795B3@gmx.net> Message-ID: On Wed, Dec 14, 2022 at 2:38 AM Praveen C wrote: > Thank you, this MR works if I generate the mesh within the code. > > But using a mesh made in gmsh, I see the same issue. > Because the operations happen in a different order. If you read in your mesh with -dm_plex_filename mymesh.gmsh -dm_localize_height 1 then it should work. Thanks, Matt > Thanks > praveen > > On 14-Dec-2022, at 12:51 AM, Matthew Knepley wrote: > > On Tue, Dec 13, 2022 at 10:57 AM Matthew Knepley > wrote: > >> On Tue, Dec 13, 2022 at 6:11 AM Praveen C wrote: >> >>> Hello >>> >>> In the attached test, I read a small grid made in gmsh with periodic bc. >>> >>> This is a 2d mesh. >>> >>> The cell numbers are shown in the figure. >>> >>> All faces have length = 2.5 >>> >>> But using PetscFVFaceGeom I am getting length of 7.5 for some faces. >>> E.g., >>> >>> face: 59, centroid = 3.750000, 2.500000, normal = 0.000000, -7.500000 >>> ===> Face length incorrect = 7.500000, should be 2.5 >>> support[0] = 11, cent = 8.750000, 3.750000, area = 6.250000 >>> support[1] = 15, cent = 8.750000, 1.250000, area = 6.250000 >>> >>> There are also errors in the orientation of normal. >>> >>> If we disable periodicity in geo file, this error goes away. >>> >> >> Yes, by default we only localize coordinates for cells. I can put in code >> to localize faces. >> > > Okay, I now have a MR for this: > https://gitlab.com/petsc/petsc/-/merge_requests/5917 > > I am attaching your code, slightly modified. You can run > > ./dmplex -malloc_debug 0 -dm_plex_box_upper 10,10 -dm_plex_box_faces 4,4 > -dm_plex_simplex 0 -dm_view ::ascii_info_detail -draw_pause 3 > -dm_plex_box_bd periodic,periodic -dm_localize_height 0 > > which shows incorrect edges and > > ./dmplex -malloc_debug 0 -dm_plex_box_upper 10,10 -dm_plex_box_faces 4,4 > -dm_plex_simplex 0 -dm_view ::ascii_info_detail -draw_pause 3 > -dm_plex_box_bd periodic,periodic -dm_localize_height 1 > > which is correct. If you want to control things yourself, instead of using > the option you can call DMPlexSetMaxProjectionHeight() on the coordinate DM > yourself. > > Thanks, > > Matt > > >> Thanks, >> >> Matt >> >> >>> Thanks >>> praveen >>> >> -- >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bourdin at mcmaster.ca Wed Dec 14 08:18:40 2022 From: bourdin at mcmaster.ca (Blaise Bourdin) Date: Wed, 14 Dec 2022 14:18:40 +0000 Subject: [petsc-users] GAMG and linearized elasticity In-Reply-To: <87r0x2mz9f.fsf@jedbrown.org> References: <87r0x2mz9f.fsf@jedbrown.org> Message-ID: <66A2912A-A271-4EDD-8ACD-E1D1902C8112@mcmaster.ca> An HTML attachment was scrubbed... 
URL: From praveen at gmx.net Wed Dec 14 08:20:09 2022 From: praveen at gmx.net (Praveen C) Date: Wed, 14 Dec 2022 19:50:09 +0530 Subject: [petsc-users] dmplex normal vector incorrect for periodic gmsh grids In-Reply-To: References: <32720DD5-FA50-4766-A36A-7566913795B3@gmx.net> Message-ID: Hello I tried this with the attached code but I still get wrong normals. ./dmplex -dm_plex_filename ug_periodic.msh -dm_localize_height 1 Code and grid file are attached Thank you praveen ?? > On 14-Dec-2022, at 6:40 PM, Matthew Knepley wrote: > > On Wed, Dec 14, 2022 at 2:38 AM Praveen C > wrote: >> Thank you, this MR works if I generate the mesh within the code. >> >> But using a mesh made in gmsh, I see the same issue. > > Because the operations happen in a different order. If you read in your mesh with > > -dm_plex_filename mymesh.gmsh -dm_localize_height 1 > > then it should work. > > Thanks, > > Matt > >> Thanks >> praveen >> >>> On 14-Dec-2022, at 12:51 AM, Matthew Knepley > wrote: >>> >>> On Tue, Dec 13, 2022 at 10:57 AM Matthew Knepley > wrote: >>>> On Tue, Dec 13, 2022 at 6:11 AM Praveen C > wrote: >>>>> Hello >>>>> >>>>> In the attached test, I read a small grid made in gmsh with periodic bc. >>>>> >>>>> This is a 2d mesh. >>>>> >>>>> The cell numbers are shown in the figure. >>>>> >>>>> All faces have length = 2.5 >>>>> >>>>> But using PetscFVFaceGeom I am getting length of 7.5 for some faces. E.g., >>>>> >>>>> face: 59, centroid = 3.750000, 2.500000, normal = 0.000000, -7.500000 >>>>> ===> Face length incorrect = 7.500000, should be 2.5 >>>>> support[0] = 11, cent = 8.750000, 3.750000, area = 6.250000 >>>>> support[1] = 15, cent = 8.750000, 1.250000, area = 6.250000 >>>>> >>>>> There are also errors in the orientation of normal. >>>>> >>>>> If we disable periodicity in geo file, this error goes away. >>>> >>>> Yes, by default we only localize coordinates for cells. I can put in code to localize faces. >>> >>> Okay, I now have a MR for this: https://gitlab.com/petsc/petsc/-/merge_requests/5917 >>> >>> I am attaching your code, slightly modified. You can run >>> >>> ./dmplex -malloc_debug 0 -dm_plex_box_upper 10,10 -dm_plex_box_faces 4,4 -dm_plex_simplex 0 -dm_view ::ascii_info_detail -draw_pause 3 -dm_plex_box_bd periodic,periodic -dm_localize_height 0 >>> >>> which shows incorrect edges and >>> >>> ./dmplex -malloc_debug 0 -dm_plex_box_upper 10,10 -dm_plex_box_faces 4,4 -dm_plex_simplex 0 -dm_view ::ascii_info_detail -draw_pause 3 -dm_plex_box_bd periodic,periodic -dm_localize_height 1 >>> >>> which is correct. If you want to control things yourself, instead of using the option you can call DMPlexSetMaxProjectionHeight() on the coordinate DM yourself. >>> >>> Thanks, >>> >>> Matt >>> >>>> Thanks, >>>> >>>> Matt >>>> >>>>> Thanks >>>>> praveen >>>> -- >>>> >> > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: dmplex.c Type: application/octet-stream Size: 5155 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: ug_periodic.msh Type: application/octet-stream Size: 1205 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Wed Dec 14 08:20:33 2022 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 14 Dec 2022 09:20:33 -0500 Subject: [petsc-users] GAMG and linearized elasticity In-Reply-To: <87r0x2mz9f.fsf@jedbrown.org> References: <87r0x2mz9f.fsf@jedbrown.org> Message-ID: The eigen estimator is failing in GAMG. * The coarsening method in GAMG changed in recent releases, a little bit, with "aggressive" or "square" coarsening (two MISs instead of MIS on A'A), but something else is going on here. * Your fine grid looks good, N%3 == 0 and NNZ%9 == 0, the coarse grids seem to have lost the block size. N is not a factor of 3, or 6 with the null space. rows=45721, cols=45721. This is bad. * the block size is in there on the fine grid: rows=2186610, cols=2186610, bs=3 * Try running with -info and grep on GAMG and send me that output. Something is very wrong here. Thanks, Mark On Tue, Dec 13, 2022 at 10:38 PM Jed Brown wrote: > Do you have slip/symmetry boundary conditions, where some components are > constrained? In that case, there is no uniform block size and I think > you'll need DMPlexCreateRigidBody() and MatSetNearNullSpace(). > > The PCSetCoordinates() code won't work for non-constant block size. > > -pc_type gamg should work okay out of the box for elasticity. For hypre, > I've had good luck with this options suite, which also runs on GPU. > > -pc_type hypre -pc_hypre_boomeramg_coarsen_type pmis > -pc_hypre_boomeramg_interp_type ext+i -pc_hypre_boomeramg_no_CF > -pc_hypre_boomeramg_P_max 6 -pc_hypre_boomeramg_relax_type_down Chebyshev > -pc_hypre_boomeramg_relax_type_up Chebyshev > -pc_hypre_boomeramg_strong_threshold 0.5 > > Blaise Bourdin writes: > > > Hi, > > > > I am getting close to finish porting a code from petsc 3.3 / sieve to > main / dmplex, but am > > now encountering difficulties > > I am reasonably sure that the Jacobian and residual are correct. The > codes handle boundary > > conditions differently (MatZeroRowCols vs dmplex constraints) so it is > not trivial to compare > > them. Running with snes_type ksponly pc_type Jacobi or hyper gives me > the same results in > > roughly the same number of iterations. > > > > In my old code, gamg would work out of the box. When using petsc-main, > -pc_type gamg - > > pc_gamg_type agg works for _some_ problems using P1-Lagrange elements, > but never for > > P2-Lagrange. The typical error message is in gamg_agg.txt > > > > When using -pc_type classical, a problem where the KSP would converge in > 47 iteration in > > 3.3 now takes 1400. ksp_view_3.3.txt and ksp_view_main.txt show the > output of -ksp_view > > for both versions. I don?t notice anything obvious. > > > > Strangely, removing the call to PCSetCoordinates does not have any > impact on the > > convergence. > > > > I am sure that I am missing something, or not passing the right options. > What?s a good > > starting point for 3D elasticity? > > Regards, > > Blaise > > > > ? > > Canada Research Chair in Mathematical and Computational Aspects of Solid > Mechanics > > (Tier 1) > > Professor, Department of Mathematics & Statistics > > Hamilton Hall room 409A, McMaster University > > 1280 Main Street West, Hamilton, Ontario L8S 4K1, Canada > > https://www.math.mcmaster.ca/bourdin | +1 (905) 525 9140 ext. 
27243 > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Petsc has generated inconsistent data > > [0]PETSC ERROR: Computed maximum singular value as zero > > [0]PETSC ERROR: WARNING! There are option(s) set that were not used! > Could be the program crashed before they were used or a spelling mistake, > etc! > > [0]PETSC ERROR: Option left: name:-displacement_ksp_converged_reason > value: ascii source: file > > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > > [0]PETSC ERROR: Petsc Development GIT revision: > v3.18.2-341-g16200351da0 GIT Date: 2022-12-12 23:42:20 +0000 > > [0]PETSC ERROR: > /home/bourdinb/Development/mef90/mef90-dmplex/bbserv-gcc11.2.1-mvapich2-2.3.7-O/bin/ThermoElasticity > on a bbserv-gcc11.2.1-mvapich2-2.3.7-O named bb01 by bourdinb Tue Dec 13 > 17:02:19 2022 > > [0]PETSC ERROR: Configure options --CFLAGS=-Wunused > --FFLAGS="-ffree-line-length-none -fallow-argument-mismatch -Wunused" > --COPTFLAGS="-O2 -march=znver2" --CXXOPTFLAGS="-O2 -march=znver2" > --FOPTFLAGS="-O2 -march=znver2" --download-chaco=1 --download-exodusii=1 > --download-fblaslapack=1 --download-hdf5=1 --download-hypre=1 > --download-metis=1 --download-ml=1 --download-mumps=1 --download-netcdf=1 > --download-p4est=1 --download-parmetis=1 --download-pnetcdf=1 > --download-scalapack=1 --download-sowing=1 > --download-sowing-cc=/opt/rh/devtoolset-9/root/usr/bin/gcc > --download-sowing-cxx=/opt/rh/devtoolset-9/root/usr/bin/g++ > --download-sowing-cpp=/opt/rh/devtoolset-9/root/usr/bin/cpp > --download-sowing-cxxcpp=/opt/rh/devtoolset-9/root/usr/bin/cpp > --download-superlu=1 --download-triangle=1 --download-yaml=1 > --download-zlib=1 --with-debugging=0 > --with-mpi-dir=/opt/HPC/mvapich2/2.3.7-gcc11.2.1 --with-pic > --with-shared-libraries=1 --with-mpiexec=srun --with-x11=0 > > [0]PETSC ERROR: #1 PCGAMGOptProlongator_AGG() at > /1/HPC/petsc/main/src/ksp/pc/impls/gamg/agg.c:779 > > [0]PETSC ERROR: #2 PCSetUp_GAMG() at > /1/HPC/petsc/main/src/ksp/pc/impls/gamg/gamg.c:639 > > [0]PETSC ERROR: #3 PCSetUp() at > /1/HPC/petsc/main/src/ksp/pc/interface/precon.c:994 > > [0]PETSC ERROR: #4 KSPSetUp() at > /1/HPC/petsc/main/src/ksp/ksp/interface/itfunc.c:405 > > [0]PETSC ERROR: #5 KSPSolve_Private() at > /1/HPC/petsc/main/src/ksp/ksp/interface/itfunc.c:824 > > [0]PETSC ERROR: #6 KSPSolve() at > /1/HPC/petsc/main/src/ksp/ksp/interface/itfunc.c:1070 > > [0]PETSC ERROR: #7 SNESSolve_KSPONLY() at > /1/HPC/petsc/main/src/snes/impls/ksponly/ksponly.c:48 > > [0]PETSC ERROR: #8 SNESSolve() at > /1/HPC/petsc/main/src/snes/interface/snes.c:4693 > > [0]PETSC ERROR: #9 > /home/bourdinb/Development/mef90/mef90-dmplex/ThermoElasticity/ThermoElasticity.F90:228 > > Linear solve converged due to CONVERGED_RTOL iterations 46 > > KSP Object:(Disp_) 32 MPI processes > > type: cg > > maximum iterations=10000 > > tolerances: relative=1e-05, absolute=1e-08, divergence=1e+10 > > left preconditioning > > using nonzero initial guess > > using PRECONDITIONED norm type for convergence test > > PC Object:(Disp_) 32 MPI processes > > type: gamg > > MG: type is MULTIPLICATIVE, levels=4 cycles=v > > Cycles per PCApply=1 > > Using Galerkin computed coarse grid matrices > > Coarse grid solver -- level ------------------------------- > > KSP Object: (Disp_mg_coarse_) 32 MPI processes > > type: gmres > > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > GMRES: happy breakdown 
tolerance 1e-30 > > maximum iterations=1, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_) 32 MPI processes > > type: bjacobi > > block Jacobi: number of blocks = 32 > > Local solve info for each block is in the following KSP and PC > objects: > > [0] number of local blocks = 1, first local block number = 0 > > [0] local block number 0 > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 1.06061 > > Factored matrix follows: > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > 
factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = precond matrix: > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=54, cols=54, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1260, allocated nonzeros=1260 > > total number of mallocs used during MatSetValues calls > =0 > > using I-node routines: found 16 nodes, limit used is > 5 > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=54, cols=54, bs=6 > > total: nonzeros=1188, allocated nonzeros=1188 > > total number of mallocs used during MatSetValues calls =0 > > using I-node routines: found 17 nodes, limit used is 5 > > - - - - - - - - - - - - - - - - - - > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > 
tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE 
norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not 
using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated 
nonzeros=1 > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of 
mallocs used during MatSetValues calls =0 > > not using I-node routines > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number 
of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total 
number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > [1] number of local blocks = 1, first local block number = 1 > > [1] local block number 0 > > - - - - - - - - - - - - - - - - - - > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = 
precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5, needed 0 > > Factored matrix follows: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > package used to perform factorization: petsc > > total: nonzeros=1, allocated nonzeros=1 > > total number of mallocs used during MatSetValues calls > =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Matrix Object: 1 MPI processes > > type: seqaij > > rows=0, cols=0, bs=6 > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > [2] number of local blocks = 1, first local block number = 2 > > [2] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [3] number of local blocks = 1, first local block number = 3 > > [3] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [4] number of local blocks = 1, first local block number = 4 > > [4] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [5] number of local blocks = 1, first local block number = 5 > > [5] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [6] number of local blocks = 1, first local block number = 6 > > [6] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [7] number of local blocks = 1, first local block number = 7 > > [7] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [8] number of local blocks = 1, first local block number = 8 > > [8] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [9] number of local blocks = 1, first local block number = 9 > > [9] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [10] number of local blocks = 1, first local block number = 10 > > [10] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [11] number of local blocks = 1, first local block number = 11 > > [11] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [12] number of local blocks = 1, first local block number = 12 > > [12] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [13] number of local blocks = 1, first local block number = 13 > > [13] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [14] number of local blocks = 1, first local block number = 14 > > [14] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [15] number of local blocks = 1, first local block number = 15 > > [15] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [16] number of local blocks = 1, first local block number = 16 > > [16] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [17] number of local blocks = 1, first local block number = 17 > > [17] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [18] number of local blocks = 1, first local block number = 18 > > [18] local block number 0 > > - - - - - - - - - - - - - - - - - - > > 
[19] number of local blocks = 1, first local block number = 19 > > [19] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [20] number of local blocks = 1, first local block number = 20 > > [20] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [21] number of local blocks = 1, first local block number = 21 > > [21] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [22] number of local blocks = 1, first local block number = 22 > > [22] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [23] number of local blocks = 1, first local block number = 23 > > [23] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [24] number of local blocks = 1, first local block number = 24 > > [24] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [25] number of local blocks = 1, first local block number = 25 > > [25] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [26] number of local blocks = 1, first local block number = 26 > > [26] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [27] number of local blocks = 1, first local block number = 27 > > [27] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [28] number of local blocks = 1, first local block number = 28 > > [28] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [29] number of local blocks = 1, first local block number = 29 > > [29] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [30] number of local blocks = 1, first local block number = 30 > > [30] local block number 0 > > - - - - - - - - - - - - - - - - - - > > [31] number of local blocks = 1, first local block number = 31 > > [31] local block number 0 > > - - - - - - - - - - - - - - - - - - > > linear system matrix = precond matrix: > > Matrix Object: 32 MPI processes > > type: mpiaij > > rows=54, cols=54, bs=6 > > total: nonzeros=1188, allocated nonzeros=1188 > > total number of mallocs used during MatSetValues calls =0 > > using I-node (on process 0) routines: found 17 nodes, limit > used is 5 > > Down solver (pre-smoother) on level 1 ------------------------------- > > KSP Object: (Disp_mg_levels_1_) 32 MPI processes > > type: chebyshev > > Chebyshev: eigenvalue estimates: min = 0.101023, max = 2.13327 > > maximum iterations=2 > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using nonzero initial guess > > using NONE norm type for convergence test > > PC Object: (Disp_mg_levels_1_) 32 MPI processes > > type: jacobi > > linear system matrix = precond matrix: > > Matrix Object: 32 MPI processes > > type: mpiaij > > rows=1086, cols=1086, bs=6 > > total: nonzeros=67356, allocated nonzeros=67356 > > total number of mallocs used during MatSetValues calls =0 > > using I-node (on process 0) routines: found 362 nodes, limit > used is 5 > > Up solver (post-smoother) same as down solver (pre-smoother) > > Down solver (pre-smoother) on level 2 ------------------------------- > > KSP Object: (Disp_mg_levels_2_) 32 MPI processes > > type: chebyshev > > Chebyshev: eigenvalue estimates: min = 0.0996526, max = 2.29388 > > maximum iterations=2 > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using nonzero initial guess > > using NONE norm type for convergence test > > PC Object: (Disp_mg_levels_2_) 32 MPI processes > > type: jacobi > > linear system matrix = precond matrix: > > Matrix Object: 32 MPI processes > > type: mpiaij > > rows=23808, cols=23808, bs=6 > 
> total: nonzeros=1976256, allocated nonzeros=1976256 > > total number of mallocs used during MatSetValues calls =0 > > using I-node (on process 0) routines: found 7936 nodes, limit > used is 5 > > Up solver (post-smoother) same as down solver (pre-smoother) > > Down solver (pre-smoother) on level 3 ------------------------------- > > KSP Object: (Disp_mg_levels_3_) 32 MPI processes > > type: chebyshev > > Chebyshev: eigenvalue estimates: min = 0.165968, max = 2.13065 > > maximum iterations=2 > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using nonzero initial guess > > using NONE norm type for convergence test > > PC Object: (Disp_mg_levels_3_) 32 MPI processes > > type: jacobi > > linear system matrix = precond matrix: > > Matrix Object: (Disp_) 32 MPI processes > > type: mpiaij > > rows=291087, cols=291087 > > total: nonzeros=12323691, allocated nonzeros=12336696 > > total number of mallocs used during MatSetValues calls =0 > > using I-node (on process 0) routines: found 3419 nodes, limit > used is 5 > > Up solver (post-smoother) same as down solver (pre-smoother) > > linear system matrix = precond matrix: > > Matrix Object: (Disp_) 32 MPI processes > > type: mpiaij > > rows=291087, cols=291087 > > total: nonzeros=12323691, allocated nonzeros=12336696 > > total number of mallocs used during MatSetValues calls =0 > > using I-node (on process 0) routines: found 3419 nodes, limit used > is 5 > > SNESConvergedReason returned 5 > > KSP Object: (Displacement_) 32 MPI processes > > type: cg > > maximum iterations=10000, nonzero initial guess > > tolerances: relative=1e-05, absolute=1e-08, divergence=1e+10 > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > PC Object: (Displacement_) 32 MPI processes > > type: gamg > > type is MULTIPLICATIVE, levels=4 cycles=v > > Cycles per PCApply=1 > > Using externally compute Galerkin coarse grid matrices > > GAMG specific options > > Threshold for dropping small values in graph on each level = > -1. -1. -1. -1. > > Threshold scaling factor for each level not specified = 1. > > Complexity: grid = 1.02128 operator = 1.05534 > > Coarse grid solver -- level 0 ------------------------------- > > KSP Object: (Displacement_mg_coarse_) 32 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Displacement_mg_coarse_) 32 MPI processes > > type: bjacobi > > number of blocks = 32 > > Local solver information for first block is in the following KSP > and PC objects on rank 0: > > Use -Displacement_mg_coarse_ksp_view ::ascii_info_detail to > display information for all blocks > > KSP Object: (Displacement_mg_coarse_sub_) 1 MPI process > > type: preonly > > maximum iterations=1, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Displacement_mg_coarse_sub_) 1 MPI process > > type: lu > > out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > using diagonal shift on blocks to prevent zero pivot [INBLOCKS] > > matrix ordering: nd > > factor fill ratio given 5., needed 1.08081 > > Factored matrix follows: > > Mat Object: (Displacement_mg_coarse_sub_) 1 MPI process > > type: seqaij > > rows=20, cols=20 > > package used to perform factorization: petsc > > total: nonzeros=214, allocated nonzeros=214 > > using I-node routines: found 8 nodes, limit used is 5 > > linear system matrix = precond matrix: > > Mat Object: (Displacement_mg_coarse_sub_) 1 MPI process > > type: seqaij > > rows=20, cols=20 > > total: nonzeros=198, allocated nonzeros=198 > > total number of mallocs used during MatSetValues calls=0 > > using I-node routines: found 13 nodes, limit used is 5 > > linear system matrix = precond matrix: > > Mat Object: 32 MPI processes > > type: mpiaij > > rows=20, cols=20 > > total: nonzeros=198, allocated nonzeros=198 > > total number of mallocs used during MatSetValues calls=0 > > using I-node (on process 0) routines: found 13 nodes, limit > used is 5 > > Down solver (pre-smoother) on level 1 ------------------------------- > > KSP Object: (Displacement_mg_levels_1_) 32 MPI processes > > type: chebyshev > > eigenvalue targets used: min 0.81922, max 9.01143 > > eigenvalues estimated via gmres: min 0.186278, max 8.1922 > > eigenvalues estimated using gmres with transform: [0. 0.1; 0. > 1.1] > > KSP Object: (Displacement_mg_levels_1_esteig_) 32 MPI processes > > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > > maximum iterations=10, initial guess is zero > > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > estimating eigenvalues using noisy right hand side > > maximum iterations=2, nonzero initial guess > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Displacement_mg_levels_1_) 32 MPI processes > > type: jacobi > > type DIAGONAL > > linear system matrix = precond matrix: > > Mat Object: 32 MPI processes > > type: mpiaij > > rows=799, cols=799 > > total: nonzeros=83159, allocated nonzeros=83159 > > total number of mallocs used during MatSetValues calls=0 > > using I-node (on process 0) routines: found 23 nodes, limit > used is 5 > > Up solver (post-smoother) same as down solver (pre-smoother) > > Down solver (pre-smoother) on level 2 ------------------------------- > > KSP Object: (Displacement_mg_levels_2_) 32 MPI processes > > type: chebyshev > > eigenvalue targets used: min 1.16291, max 12.792 > > eigenvalues estimated via gmres: min 0.27961, max 11.6291 > > eigenvalues estimated using gmres with transform: [0. 0.1; 0. > 1.1] > > KSP Object: (Displacement_mg_levels_2_esteig_) 32 MPI processes > > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > > maximum iterations=10, initial guess is zero > > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using PRECONDITIONED norm type for convergence test > > estimating eigenvalues using noisy right hand side > > maximum iterations=2, nonzero initial guess > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Displacement_mg_levels_2_) 32 MPI processes > > type: jacobi > > type DIAGONAL > > linear system matrix = precond matrix: > > Mat Object: 32 MPI processes > > type: mpiaij > > rows=45721, cols=45721 > > total: nonzeros=9969661, allocated nonzeros=9969661 > > total number of mallocs used during MatSetValues calls=0 > > using nonscalable MatPtAP() implementation > > not using I-node (on process 0) routines > > Up solver (post-smoother) same as down solver (pre-smoother) > > Down solver (pre-smoother) on level 3 ------------------------------- > > KSP Object: (Displacement_mg_levels_3_) 32 MPI processes > > type: chebyshev > > eigenvalue targets used: min 0.281318, max 3.0945 > > eigenvalues estimated via gmres: min 0.0522027, max 2.81318 > > eigenvalues estimated using gmres with transform: [0. 0.1; 0. > 1.1] > > KSP Object: (Displacement_mg_levels_3_esteig_) 32 MPI processes > > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > > maximum iterations=10, initial guess is zero > > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > estimating eigenvalues using noisy right hand side > > maximum iterations=2, nonzero initial guess > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (Displacement_mg_levels_3_) 32 MPI processes > > type: jacobi > > type DIAGONAL > > linear system matrix = precond matrix: > > Mat Object: (Displacement_) 32 MPI processes > > type: mpiaij > > rows=2186610, cols=2186610, bs=3 > > total: nonzeros=181659996, allocated nonzeros=181659996 > > total number of mallocs used during MatSetValues calls=0 > > has attached near null space > > using I-node (on process 0) routines: found 21368 nodes, limit > used is 5 > > Up solver (post-smoother) same as down solver (pre-smoother) > > linear system matrix = precond matrix: > > Mat Object: (Displacement_) 32 MPI processes > > type: mpiaij > > rows=2186610, cols=2186610, bs=3 > > total: nonzeros=181659996, allocated nonzeros=181659996 > > total number of mallocs used during MatSetValues calls=0 > > has attached near null space > > using I-node (on process 0) routines: found 21368 nodes, limit > used is 5 > > cell set 1 elastic energy: 9.32425E-02 work: 1.86485E-01 total: > -9.32425E-02 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Wed Dec 14 10:07:23 2022 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 14 Dec 2022 11:07:23 -0500 Subject: [petsc-users] GAMG and linearized elasticity In-Reply-To: <66A2912A-A271-4EDD-8ACD-E1D1902C8112@mcmaster.ca> References: <87r0x2mz9f.fsf@jedbrown.org> <66A2912A-A271-4EDD-8ACD-E1D1902C8112@mcmaster.ca> Message-ID: On Wed, Dec 14, 2022 at 9:38 AM Blaise Bourdin wrote: > Hi Jed, > > Thanks for pointing us in the right direction. > We were using MatNullSpaceCreateRigidBody which does not know anything > about the discretization, hence our issues with quadratic elements. 
> DMPlexCreateRigidBody does not work out of the box for us since we do not
> use PetscFE at the moment, but we can easily build the near null space by
> hand.
>

Oh, MatNullSpaceCreateRigidBody should work because it takes the coordinates.
You just need to get the coordinates for all the points/vertices. Or you can
build it by hand. I don't know if DMPlexCreateRigidBody does the right thing.
This would take a little code and I'm not sure if (Matt) did this (kinda doubt
it). It should error out if not, but you don't use it anyway. Did you call
MatNullSpaceCreateRigidBody with a vector of coordinates that only has the
corner points? (In that case it should have thrown an error.)

> FWIW, removing the wrong null space brought the GAMG iteration count to
> something more reasonable
>

Good. I'm not sure what happened, but MatNullSpaceCreateRigidBody should work
unless you have a non-standard element, and you can always test it by calling
MatMult on the RBMs and verifying that it's a null space, away from the BCs
(see the sketch after the quoted messages below).

> Thanks a million,
> Blaise
>
>
> On Dec 13, 2022, at 10:37 PM, Jed Brown wrote:
>
> Do you have slip/symmetry boundary conditions, where some components are
> constrained? In that case, there is no uniform block size and I think
> you'll need DMPlexCreateRigidBody() and MatSetNearNullSpace().
>
> The PCSetCoordinates() code won't work for non-constant block size.
>
> -pc_type gamg should work okay out of the box for elasticity. For hypre,
> I've had good luck with this options suite, which also runs on GPU.
>
> -pc_type hypre -pc_hypre_boomeramg_coarsen_type pmis
> -pc_hypre_boomeramg_interp_type ext+i -pc_hypre_boomeramg_no_CF
> -pc_hypre_boomeramg_P_max 6 -pc_hypre_boomeramg_relax_type_down Chebyshev
> -pc_hypre_boomeramg_relax_type_up Chebyshev
> -pc_hypre_boomeramg_strong_threshold 0.5
>
> Blaise Bourdin writes:
>
> Hi,
>
> I am getting close to finishing porting a code from petsc 3.3 / sieve to
> main / dmplex, but am now encountering difficulties.
> I am reasonably sure that the Jacobian and residual are correct. The codes
> handle boundary conditions differently (MatZeroRowCols vs dmplex
> constraints) so it is not trivial to compare them. Running with snes_type
> ksponly pc_type jacobi or hypre gives me the same results in roughly the
> same number of iterations.
>
> In my old code, gamg would work out of the box. When using petsc-main,
> -pc_type gamg -pc_gamg_type agg works for _some_ problems using P1-Lagrange
> elements, but never for P2-Lagrange. The typical error message is in
> gamg_agg.txt.
>
> When using -pc_type classical, a problem where the KSP would converge in
> 47 iterations in 3.3 now takes 1400. ksp_view_3.3.txt and ksp_view_main.txt
> show the output of -ksp_view for both versions. I don't notice anything
> obvious.
>
> Strangely, removing the call to PCSetCoordinates does not have any impact
> on the convergence.
>
> I am sure that I am missing something, or not passing the right options.
> What's a good starting point for 3D elasticity?
> Regards,
> Blaise
>
> --
> Canada Research Chair in Mathematical and Computational Aspects of Solid
> Mechanics (Tier 1)
> Professor, Department of Mathematics & Statistics
> Hamilton Hall room 409A, McMaster University
> 1280 Main Street West, Hamilton, Ontario L8S 4K1, Canada
> https://www.math.mcmaster.ca/bourdin | +1 (905) 525 9140 ext. 27243
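A minimal sketch of the route described above (not code from this thread): build the rigid body modes from a coordinate vector with MatNullSpaceCreateRigidBody, check them against the assembled operator, and attach them with MatSetNearNullSpace so GAMG can use them when building the prolongator. The helper name and the assumption that the coordinate vector already carries one (x,y,z) triple per displacement node (vertices and, for P2, edge midpoints) are illustrative, not taken from the mef90 sources.

#include <petscdm.h>
#include <petscmat.h>

/* Sketch only: "dm" is assumed to hold coordinates matching the displacement
   layout and "A" is the assembled elasticity operator (bs=3 in 3D). */
static PetscErrorCode AttachRigidBodyModes(DM dm, Mat A)
{
  Vec          coords;
  MatNullSpace nearnull;
  PetscBool    isNull;

  PetscFunctionBeginUser;
  /* One (x,y,z) triple per displacement node is required; for quadratic
     elements that includes the edge midpoints, which is exactly the pitfall
     discussed above. We assume "coords" already has that layout. */
  PetscCall(DMGetCoordinates(dm, &coords));
  PetscCall(MatNullSpaceCreateRigidBody(coords, &nearnull));

  /* Mark's sanity check: MatNullSpaceTest applies A (via MatMult) to each
     rigid body mode and reports whether the result is numerically zero.
     With Dirichlet rows/columns already folded into A this can fail near
     the boundary, so treat it as a diagnostic rather than a hard error. */
  PetscCall(MatNullSpaceTest(nearnull, A, &isNull));
  PetscCall(PetscPrintf(PETSC_COMM_WORLD, "RBMs lie in the near null space: %s\n",
                        isNull ? "yes" : "no"));

  /* GAMG reads the near null space off the operator when it builds the
     prolongator; this replaces PCSetCoordinates, which assumes a constant
     block size. */
  PetscCall(MatSetNearNullSpace(A, nearnull));
  PetscCall(MatNullSpaceDestroy(&nearnull));
  PetscFunctionReturn(0);
}

The equivalent manual route is to fill six vectors (three translations, three rotations) in the displacement layout and pass them to MatNullSpaceCreate before calling MatSetNearNullSpace.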
> [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Petsc has generated inconsistent data > [0]PETSC ERROR: Computed maximum singular value as zero > [0]PETSC ERROR: WARNING! There are option(s) set that were not used! Could > be the program crashed before they were used or a spelling mistake, etc! > [0]PETSC ERROR: Option left: name:-displacement_ksp_converged_reason > value: ascii source: file > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [0]PETSC ERROR: Petsc Development GIT revision: v3.18.2-341-g16200351da0 > GIT Date: 2022-12-12 23:42:20 +0000 > [0]PETSC ERROR: > /home/bourdinb/Development/mef90/mef90-dmplex/bbserv-gcc11.2.1-mvapich2-2.3.7-O/bin/ThermoElasticity > on a bbserv-gcc11.2.1-mvapich2-2.3.7-O named bb01 by bourdinb Tue Dec 13 > 17:02:19 2022 > [0]PETSC ERROR: Configure options --CFLAGS=-Wunused > --FFLAGS="-ffree-line-length-none -fallow-argument-mismatch -Wunused" > --COPTFLAGS="-O2 -march=znver2" --CXXOPTFLAGS="-O2 -march=znver2" > --FOPTFLAGS="-O2 -march=znver2" --download-chaco=1 --download-exodusii=1 > --download-fblaslapack=1 --download-hdf5=1 --download-hypre=1 > --download-metis=1 --download-ml=1 --download-mumps=1 --download-netcdf=1 > --download-p4est=1 --download-parmetis=1 --download-pnetcdf=1 > --download-scalapack=1 --download-sowing=1 > --download-sowing-cc=/opt/rh/devtoolset-9/root/usr/bin/gcc > --download-sowing-cxx=/opt/rh/devtoolset-9/root/usr/bin/g++ > --download-sowing-cpp=/opt/rh/devtoolset-9/root/usr/bin/cpp > --download-sowing-cxxcpp=/opt/rh/devtoolset-9/root/usr/bin/cpp > --download-superlu=1 --download-triangle=1 --download-yaml=1 > --download-zlib=1 --with-debugging=0 > --with-mpi-dir=/opt/HPC/mvapich2/2.3.7-gcc11.2.1 --with-pic > --with-shared-libraries=1 --with-mpiexec=srun --with-x11=0 > [0]PETSC ERROR: #1 PCGAMGOptProlongator_AGG() at > /1/HPC/petsc/main/src/ksp/pc/impls/gamg/agg.c:779 > [0]PETSC ERROR: #2 PCSetUp_GAMG() at > /1/HPC/petsc/main/src/ksp/pc/impls/gamg/gamg.c:639 > [0]PETSC ERROR: #3 PCSetUp() at > /1/HPC/petsc/main/src/ksp/pc/interface/precon.c:994 > [0]PETSC ERROR: #4 KSPSetUp() at > /1/HPC/petsc/main/src/ksp/ksp/interface/itfunc.c:405 > [0]PETSC ERROR: #5 KSPSolve_Private() at > /1/HPC/petsc/main/src/ksp/ksp/interface/itfunc.c:824 > [0]PETSC ERROR: #6 KSPSolve() at > /1/HPC/petsc/main/src/ksp/ksp/interface/itfunc.c:1070 > [0]PETSC ERROR: #7 SNESSolve_KSPONLY() at > /1/HPC/petsc/main/src/snes/impls/ksponly/ksponly.c:48 > [0]PETSC ERROR: #8 SNESSolve() at > /1/HPC/petsc/main/src/snes/interface/snes.c:4693 > [0]PETSC ERROR: #9 > /home/bourdinb/Development/mef90/mef90-dmplex/ThermoElasticity/ThermoElasticity.F90:228 > Linear solve converged due to CONVERGED_RTOL iterations 46 > KSP Object:(Disp_) 32 MPI processes > type: cg > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-08, divergence=1e+10 > left preconditioning > using nonzero initial guess > using PRECONDITIONED norm type for convergence test > PC Object:(Disp_) 32 MPI processes > type: gamg > MG: type is MULTIPLICATIVE, levels=4 cycles=v > Cycles per PCApply=1 > Using Galerkin computed coarse grid matrices > Coarse grid solver -- level ------------------------------- > KSP Object: (Disp_mg_coarse_) 32 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=1, initial guess is zero >
tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_) 32 MPI processes > type: bjacobi > block Jacobi: number of blocks = 32 > Local solve info for each block is in the following KSP and PC > objects: > [0] number of local blocks = 1, first local block number = 0 > [0] local block number 0 > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 1.06061 > Factored matrix follows: > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > KSP Object: 
(Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Matrix Object: 1 MPI processes > type: seqaij > rows=54, cols=54, bs=6 > package used to perform factorization: petsc > total: nonzeros=1260, allocated nonzeros=1260 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 16 nodes, limit used is 5 > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=54, cols=54, bs=6 > total: nonzeros=1188, allocated nonzeros=1188 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 17 nodes, limit used is 5 > - - - - - - - - - - - - - - - - - - > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI 
processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, 
allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > total number of mallocs used during MatSetValues calls =0 > not using I-node 
routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij 
> rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during 
MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated 
nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform 
factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > [1] number of local blocks = 1, first local block number = 1 > [1] local block number 0 > - - - - - - - - - - - - - - - - - - > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (Disp_mg_coarse_sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 0 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > package used to perform factorization: petsc > total: nonzeros=1, allocated nonzeros=1 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=0, cols=0, bs=6 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > [2] number of local blocks = 1, first local block number = 2 > [2] local block number 0 > - - - - - - - - - - - - - - - - - - > [3] number of local blocks = 1, first local block number = 3 > [3] local block number 0 > - - - - - - - - - - - - - - - - - - > [4] number of local blocks = 1, first local block number = 4 > [4] local block number 0 > - - - - - - - - - - - - - - - - - - > [5] number of local blocks = 1, first local block number = 5 > [5] local block number 0 > - - - - - - - - - - - - - - - - - - > [6] number of local blocks = 1, first local block number = 6 > [6] local block number 0 > - - - - - - - - - - - 
- - - - - - - > [7] number of local blocks = 1, first local block number = 7 > [7] local block number 0 > - - - - - - - - - - - - - - - - - - > [8] number of local blocks = 1, first local block number = 8 > [8] local block number 0 > - - - - - - - - - - - - - - - - - - > [9] number of local blocks = 1, first local block number = 9 > [9] local block number 0 > - - - - - - - - - - - - - - - - - - > [10] number of local blocks = 1, first local block number = 10 > [10] local block number 0 > - - - - - - - - - - - - - - - - - - > [11] number of local blocks = 1, first local block number = 11 > [11] local block number 0 > - - - - - - - - - - - - - - - - - - > [12] number of local blocks = 1, first local block number = 12 > [12] local block number 0 > - - - - - - - - - - - - - - - - - - > [13] number of local blocks = 1, first local block number = 13 > [13] local block number 0 > - - - - - - - - - - - - - - - - - - > [14] number of local blocks = 1, first local block number = 14 > [14] local block number 0 > - - - - - - - - - - - - - - - - - - > [15] number of local blocks = 1, first local block number = 15 > [15] local block number 0 > - - - - - - - - - - - - - - - - - - > [16] number of local blocks = 1, first local block number = 16 > [16] local block number 0 > - - - - - - - - - - - - - - - - - - > [17] number of local blocks = 1, first local block number = 17 > [17] local block number 0 > - - - - - - - - - - - - - - - - - - > [18] number of local blocks = 1, first local block number = 18 > [18] local block number 0 > - - - - - - - - - - - - - - - - - - > [19] number of local blocks = 1, first local block number = 19 > [19] local block number 0 > - - - - - - - - - - - - - - - - - - > [20] number of local blocks = 1, first local block number = 20 > [20] local block number 0 > - - - - - - - - - - - - - - - - - - > [21] number of local blocks = 1, first local block number = 21 > [21] local block number 0 > - - - - - - - - - - - - - - - - - - > [22] number of local blocks = 1, first local block number = 22 > [22] local block number 0 > - - - - - - - - - - - - - - - - - - > [23] number of local blocks = 1, first local block number = 23 > [23] local block number 0 > - - - - - - - - - - - - - - - - - - > [24] number of local blocks = 1, first local block number = 24 > [24] local block number 0 > - - - - - - - - - - - - - - - - - - > [25] number of local blocks = 1, first local block number = 25 > [25] local block number 0 > - - - - - - - - - - - - - - - - - - > [26] number of local blocks = 1, first local block number = 26 > [26] local block number 0 > - - - - - - - - - - - - - - - - - - > [27] number of local blocks = 1, first local block number = 27 > [27] local block number 0 > - - - - - - - - - - - - - - - - - - > [28] number of local blocks = 1, first local block number = 28 > [28] local block number 0 > - - - - - - - - - - - - - - - - - - > [29] number of local blocks = 1, first local block number = 29 > [29] local block number 0 > - - - - - - - - - - - - - - - - - - > [30] number of local blocks = 1, first local block number = 30 > [30] local block number 0 > - - - - - - - - - - - - - - - - - - > [31] number of local blocks = 1, first local block number = 31 > [31] local block number 0 > - - - - - - - - - - - - - - - - - - > linear system matrix = precond matrix: > Matrix Object: 32 MPI processes > type: mpiaij > rows=54, cols=54, bs=6 > total: nonzeros=1188, allocated nonzeros=1188 > total number of mallocs used during MatSetValues calls =0 > using I-node (on process 0) routines: found 17 
nodes, limit used > is 5 > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object: (Disp_mg_levels_1_) 32 MPI processes > type: chebyshev > Chebyshev: eigenvalue estimates: min = 0.101023, max = 2.13327 > maximum iterations=2 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using nonzero initial guess > using NONE norm type for convergence test > PC Object: (Disp_mg_levels_1_) 32 MPI processes > type: jacobi > linear system matrix = precond matrix: > Matrix Object: 32 MPI processes > type: mpiaij > rows=1086, cols=1086, bs=6 > total: nonzeros=67356, allocated nonzeros=67356 > total number of mallocs used during MatSetValues calls =0 > using I-node (on process 0) routines: found 362 nodes, limit used > is 5 > Up solver (post-smoother) same as down solver (pre-smoother) > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object: (Disp_mg_levels_2_) 32 MPI processes > type: chebyshev > Chebyshev: eigenvalue estimates: min = 0.0996526, max = 2.29388 > maximum iterations=2 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using nonzero initial guess > using NONE norm type for convergence test > PC Object: (Disp_mg_levels_2_) 32 MPI processes > type: jacobi > linear system matrix = precond matrix: > Matrix Object: 32 MPI processes > type: mpiaij > rows=23808, cols=23808, bs=6 > total: nonzeros=1976256, allocated nonzeros=1976256 > total number of mallocs used during MatSetValues calls =0 > using I-node (on process 0) routines: found 7936 nodes, limit > used is 5 > Up solver (post-smoother) same as down solver (pre-smoother) > Down solver (pre-smoother) on level 3 ------------------------------- > KSP Object: (Disp_mg_levels_3_) 32 MPI processes > type: chebyshev > Chebyshev: eigenvalue estimates: min = 0.165968, max = 2.13065 > maximum iterations=2 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using nonzero initial guess > using NONE norm type for convergence test > PC Object: (Disp_mg_levels_3_) 32 MPI processes > type: jacobi > linear system matrix = precond matrix: > Matrix Object: (Disp_) 32 MPI processes > type: mpiaij > rows=291087, cols=291087 > total: nonzeros=12323691, allocated nonzeros=12336696 > total number of mallocs used during MatSetValues calls =0 > using I-node (on process 0) routines: found 3419 nodes, limit > used is 5 > Up solver (post-smoother) same as down solver (pre-smoother) > linear system matrix = precond matrix: > Matrix Object: (Disp_) 32 MPI processes > type: mpiaij > rows=291087, cols=291087 > total: nonzeros=12323691, allocated nonzeros=12336696 > total number of mallocs used during MatSetValues calls =0 > using I-node (on process 0) routines: found 3419 nodes, limit used is > 5 > SNESConvergedReason returned 5 > KSP Object: (Displacement_) 32 MPI processes > type: cg > maximum iterations=10000, nonzero initial guess > tolerances: relative=1e-05, absolute=1e-08, divergence=1e+10 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: (Displacement_) 32 MPI processes > type: gamg > type is MULTIPLICATIVE, levels=4 cycles=v > Cycles per PCApply=1 > Using externally compute Galerkin coarse grid matrices > GAMG specific options > Threshold for dropping small values in graph on each level = -1. > -1. -1. -1. > Threshold scaling factor for each level not specified = 1. 
> Complexity: grid = 1.02128 operator = 1.05534 > Coarse grid solver -- level 0 ------------------------------- > KSP Object: (Displacement_mg_coarse_) 32 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (Displacement_mg_coarse_) 32 MPI processes > type: bjacobi > number of blocks = 32 > Local solver information for first block is in the following KSP > and PC objects on rank 0: > Use -Displacement_mg_coarse_ksp_view ::ascii_info_detail to display > information for all blocks > KSP Object: (Displacement_mg_coarse_sub_) 1 MPI process > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (Displacement_mg_coarse_sub_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > using diagonal shift on blocks to prevent zero pivot [INBLOCKS] > matrix ordering: nd > factor fill ratio given 5., needed 1.08081 > Factored matrix follows: > Mat Object: (Displacement_mg_coarse_sub_) 1 MPI process > type: seqaij > rows=20, cols=20 > package used to perform factorization: petsc > total: nonzeros=214, allocated nonzeros=214 > using I-node routines: found 8 nodes, limit used is 5 > linear system matrix = precond matrix: > Mat Object: (Displacement_mg_coarse_sub_) 1 MPI process > type: seqaij > rows=20, cols=20 > total: nonzeros=198, allocated nonzeros=198 > total number of mallocs used during MatSetValues calls=0 > using I-node routines: found 13 nodes, limit used is 5 > linear system matrix = precond matrix: > Mat Object: 32 MPI processes > type: mpiaij > rows=20, cols=20 > total: nonzeros=198, allocated nonzeros=198 > total number of mallocs used during MatSetValues calls=0 > using I-node (on process 0) routines: found 13 nodes, limit used > is 5 > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object: (Displacement_mg_levels_1_) 32 MPI processes > type: chebyshev > eigenvalue targets used: min 0.81922, max 9.01143 > eigenvalues estimated via gmres: min 0.186278, max 8.1922 > eigenvalues estimated using gmres with transform: [0. 0.1; 0. 1.1] > KSP Object: (Displacement_mg_levels_1_esteig_) 32 MPI processes > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10, initial guess is zero > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > estimating eigenvalues using noisy right hand side > maximum iterations=2, nonzero initial guess > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> left preconditioning > using NONE norm type for convergence test > PC Object: (Displacement_mg_levels_1_) 32 MPI processes > type: jacobi > type DIAGONAL > linear system matrix = precond matrix: > Mat Object: 32 MPI processes > type: mpiaij > rows=799, cols=799 > total: nonzeros=83159, allocated nonzeros=83159 > total number of mallocs used during MatSetValues calls=0 > using I-node (on process 0) routines: found 23 nodes, limit used > is 5 > Up solver (post-smoother) same as down solver (pre-smoother) > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object: (Displacement_mg_levels_2_) 32 MPI processes > type: chebyshev > eigenvalue targets used: min 1.16291, max 12.792 > eigenvalues estimated via gmres: min 0.27961, max 11.6291 > eigenvalues estimated using gmres with transform: [0. 0.1; 0. 1.1] > KSP Object: (Displacement_mg_levels_2_esteig_) 32 MPI processes > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10, initial guess is zero > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > estimating eigenvalues using noisy right hand side > maximum iterations=2, nonzero initial guess > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (Displacement_mg_levels_2_) 32 MPI processes > type: jacobi > type DIAGONAL > linear system matrix = precond matrix: > Mat Object: 32 MPI processes > type: mpiaij > rows=45721, cols=45721 > total: nonzeros=9969661, allocated nonzeros=9969661 > total number of mallocs used during MatSetValues calls=0 > using nonscalable MatPtAP() implementation > not using I-node (on process 0) routines > Up solver (post-smoother) same as down solver (pre-smoother) > Down solver (pre-smoother) on level 3 ------------------------------- > KSP Object: (Displacement_mg_levels_3_) 32 MPI processes > type: chebyshev > eigenvalue targets used: min 0.281318, max 3.0945 > eigenvalues estimated via gmres: min 0.0522027, max 2.81318 > eigenvalues estimated using gmres with transform: [0. 0.1; 0. 1.1] > KSP Object: (Displacement_mg_levels_3_esteig_) 32 MPI processes > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10, initial guess is zero > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > estimating eigenvalues using noisy right hand side > maximum iterations=2, nonzero initial guess > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> left preconditioning > using NONE norm type for convergence test > PC Object: (Displacement_mg_levels_3_) 32 MPI processes > type: jacobi > type DIAGONAL > linear system matrix = precond matrix: > Mat Object: (Displacement_) 32 MPI processes > type: mpiaij > rows=2186610, cols=2186610, bs=3 > total: nonzeros=181659996, allocated nonzeros=181659996 > total number of mallocs used during MatSetValues calls=0 > has attached near null space > using I-node (on process 0) routines: found 21368 nodes, limit > used is 5 > Up solver (post-smoother) same as down solver (pre-smoother) > linear system matrix = precond matrix: > Mat Object: (Displacement_) 32 MPI processes > type: mpiaij > rows=2186610, cols=2186610, bs=3 > total: nonzeros=181659996, allocated nonzeros=181659996 > total number of mallocs used during MatSetValues calls=0 > has attached near null space > using I-node (on process 0) routines: found 21368 nodes, limit used is 5
> cell set 1 elastic energy: 9.32425E-02 work: 1.86485E-01 total: -9.32425E-02
>
> —
> Canada Research Chair in Mathematical and Computational Aspects of Solid Mechanics (Tier 1)
> Professor, Department of Mathematics & Statistics
> Hamilton Hall room 409A, McMaster University
> 1280 Main Street West, Hamilton, Ontario L8S 4K1, Canada
> https://www.math.mcmaster.ca/bourdin | +1 (905) 525 9140 ext. 27243
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From guglielmo2 at llnl.gov Wed Dec 14 12:07:51 2022
From: guglielmo2 at llnl.gov (Guglielmo, Tyler Hardy)
Date: Wed, 14 Dec 2022 18:07:51 +0000
Subject: [petsc-users] Saving solution with monitor function
In-Reply-To: 
References: 
Message-ID: 

Thanks Matt,

I'm a bit confused on where the trajectory is being stored in the TSTrajectory object. Basically I have run

TSSetSaveTrajectory(ts);
...
TSSolve(ts, x);
TSTrajectory tj;
TSGetTrajectory(ts, &tj);
TSTrajectorySetType(tj, ts, TSTRAJECTORYMEMORY);

How is the object supposed to be accessed to find the entire trajectory? I couldn't find a clear example of where this is laid out in the documentation. The TSTrajectory object looks like some complicated struct, but parsing which pointer is pointing to the solution has eluded me.

Thanks for your time!

Best,
Tyler

From: Matthew Knepley
Date: Tuesday, December 13, 2022 at 6:41 AM
To: Guglielmo, Tyler Hardy
Cc: petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] Saving solution with monitor function

On Tue, Dec 13, 2022 at 8:40 AM Guglielmo, Tyler Hardy via petsc-users wrote:

Hi all,

I am a new PETSc user (and new to MPI in general), and was wondering if someone could help me out with what I am sure is a basic question (if this is not the appropriate email list or there is a better place please let me know!).

Basically, I am writing a code that requires a solution to an ODE that will be used later on during runtime. I have written the basic ODE solver using TSRK, however I haven't thought of a good way to store the actual solution at all time steps throughout the time evolution. I would like to avoid writing each time step to a file through the monitor function, and instead just plug each time step into an array.

How is this usually done? I suppose the user defined struct that gets passed into the monitor function could contain a pointer to an array in main? This is how I would do this if the program wasn't of the MPI variety, but I am not sure how to properly declare a pointer to an array declared as Vec and built through the usual PETSc process.

Any tips are greatly appreciated.
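(A minimal sketch of the monitor-plus-array approach described in the question above, assuming the number of steps can be bounded in advance. The names StoreCtx, StoreMonitor and the capacity of 1000 are made up for illustration and are not PETSc API; the TSTrajectory route suggested in the reply below is the built-in alternative.)

#include <petscts.h>

/* Hypothetical user context: an array of Vecs owned by main(). */
typedef struct {
  Vec     *U;        /* saved solutions */
  PetscInt nsaved;   /* number of steps saved so far */
  PetscInt maxsteps; /* capacity of U */
} StoreCtx;

/* Monitor called by TS after every accepted step: duplicate and copy the
   current solution into the user-owned array (works the same in parallel,
   since each rank only stores its local part of the Vec). */
static PetscErrorCode StoreMonitor(TS ts, PetscInt step, PetscReal t, Vec u, void *ctx)
{
  StoreCtx *s = (StoreCtx *)ctx;

  PetscFunctionBeginUser;
  if (s->nsaved < s->maxsteps) {
    PetscCall(VecDuplicate(u, &s->U[s->nsaved])); /* same parallel layout as u */
    PetscCall(VecCopy(u, s->U[s->nsaved]));
    s->nsaved++;
  }
  PetscFunctionReturn(0);
}

/* In main(), before TSSolve():
     StoreCtx store = {NULL, 0, 1000};
     PetscCall(PetscMalloc1(store.maxsteps, &store.U));
     PetscCall(TSMonitorSet(ts, StoreMonitor, &store, NULL));
   After TSSolve(), store.U[0] .. store.U[store.nsaved-1] hold the solution at
   each step; VecDestroy() each entry and PetscFree(store.U) when done. */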
I think this is what TSTrajectory is for. I believe you want

https://petsc.org/main/docs/manualpages/TS/TSTRAJECTORYMEMORY/

Thanks,
Matt

Thanks for your time,
Tyler

+++++++++++++++++++++++++++++
Tyler Guglielmo
Postdoctoral Researcher
Lawrence Livermore National Lab
Office: 925-423-6186
Cell: 210-480-8000
+++++++++++++++++++++++++++++

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From bourdin at mcmaster.ca Wed Dec 14 12:11:00 2022
From: bourdin at mcmaster.ca (Blaise Bourdin)
Date: Wed, 14 Dec 2022 18:11:00 +0000
Subject: [petsc-users] GAMG and linearized elasticity
In-Reply-To: 
References: <87r0x2mz9f.fsf@jedbrown.org> <66A2912A-A271-4EDD-8ACD-E1D1902C8112@mcmaster.ca>
Message-ID: <98DBAE1D-2E17-4FA1-ACA5-5654C1BD1274@mcmaster.ca>

An HTML attachment was scrubbed...
URL: 

From mfadams at lbl.gov Wed Dec 14 13:12:09 2022
From: mfadams at lbl.gov (Mark Adams)
Date: Wed, 14 Dec 2022 14:12:09 -0500
Subject: [petsc-users] GAMG and linearized elasticity
In-Reply-To: <98DBAE1D-2E17-4FA1-ACA5-5654C1BD1274@mcmaster.ca>
References: <87r0x2mz9f.fsf@jedbrown.org> <66A2912A-A271-4EDD-8ACD-E1D1902C8112@mcmaster.ca> <98DBAE1D-2E17-4FA1-ACA5-5654C1BD1274@mcmaster.ca>
Message-ID: 

On Wed, Dec 14, 2022 at 1:11 PM Blaise Bourdin wrote: > Hi Mark, > > On Dec 14, 2022, at 11:07 AM, Mark Adams wrote: > > On Wed, Dec 14, 2022 at 9:38 AM Blaise Bourdin wrote: > >> Hi Jed, >> >> Thanks for pointing us in the right direction. >> We were using MatNullSpaceCreateRigidBody which does not know anything >> about the discretization, hence our issues with quadratic elements. >> DMPlexCreateRigidBody does not work out of the box for us since we do not >> use PetscFE at the moment, but we can easily build the near null space by >> hand. >> > > Oh, MatNullSpaceCreateRigidBody should work because it takes the > coordinates. You just need to get the coordinates for all the > points/vertices. > Or you can build it by hand. > I don't know if DMPlexCreateRigidBody does the right thing. This would > take a little code and I'm not sure if (Matt) did this (kinda doubt it). It > should error out if not, but you don't use it anyway. > > Did you call MatNullSpaceCreateRigidBody with a vector of coordinates that > only has the corner points? (In that case it should have thrown an error) > > > Yes. I need to figure out why it did not throw an error. > I see that MatNullSpaceCreateRigidBody is not a Mat method! This is really just a utility method that creates the RBMs for each "node" and does not relate to a matrix to make sure they match. Now you must call MatSetNearNullSpace(A, matnull); I see the problem now. MatSetNearNullSpace simply attaches matnull to A, and then GAMG grabs that and copies the data in. GAMG does not check the sizes. So the null space that GAMG used was garbage and it read past the end of the provided null space vectors. I'll add a check. Thanks, Mark > > >> >> FWIW, removing the wrong null space brought GAMG iteration number to >> something more reasonable >> > > Good. I'm not sure what happened, but MatNullSpaceCreateRigidBody should > work unless you have a non-standard element and you can always test it by > calling MatMult on the RBMs and verifying that it's a null space, away from the > BCs. > > Will do.
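(For reference, a minimal sketch of the MatNullSpaceCreateRigidBody / MatSetNearNullSpace sequence discussed in this thread, plus the kind of check suggested above. The Vec coords and Mat A are assumed to already exist; coords must hold the coordinates of every node of the displacement space, interlaced and with the same parallel layout as the solution vector, which is exactly what goes wrong if only corner vertices are supplied for a P2 space.)

MatNullSpace nearnull;
PetscBool    isNull;

PetscCall(MatNullSpaceCreateRigidBody(coords, &nearnull)); /* builds the rigid-body modes (6 in 3D) from nodal coordinates */
PetscCall(MatSetNearNullSpace(A, nearnull));               /* GAMG grabs and copies these in PCSetUp */
PetscCall(MatNullSpaceTest(nearnull, A, &isNull));         /* applies A to the RBMs as a rough sanity check */
if (!isNull) PetscCall(PetscPrintf(PETSC_COMM_WORLD, "RBMs are not annihilated by A (expected with Dirichlet BCs)\n"));
PetscCall(MatNullSpaceDestroy(&nearnull));                 /* A keeps its own reference */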
> All in all, the easiest for me is to rebuild the null space. This way, I > am absolutely certain that it will work, regardless of my FE space. > > Blaise > > > >> >> Thanks a million, >> Blaise >> >> >> On Dec 13, 2022, at 10:37 PM, Jed Brown wrote: >> >> Do you have slip/symmetry boundary conditions, where some components are >> constrained? In that case, there is no uniform block size and I think >> you'll need DMPlexCreateRigidBody() and MatSetNearNullSpace(). >> >> The PCSetCoordinates() code won't work for non-constant block size. >> >> -pc_type gamg should work okay out of the box for elasticity. For hypre, >> I've had good luck with this options suite, which also runs on GPU. >> >> -pc_type hypre -pc_hypre_boomeramg_coarsen_type pmis >> -pc_hypre_boomeramg_interp_type ext+i -pc_hypre_boomeramg_no_CF >> -pc_hypre_boomeramg_P_max 6 -pc_hypre_boomeramg_relax_type_down Chebyshev >> -pc_hypre_boomeramg_relax_type_up Chebyshev >> -pc_hypre_boomeramg_strong_threshold 0.5 >> >> Blaise Bourdin writes: >> >> Hi, >> >> I am getting close to finishing porting a code from petsc 3.3 / sieve to >> main / dmplex, but am >> now encountering difficulties. >> I am reasonably sure that the Jacobian and residual are correct. The >> codes handle boundary >> conditions differently (MatZeroRowsColumns vs dmplex constraints) so it is >> not trivial to compare >> them. Running with snes_type ksponly pc_type jacobi or hypre gives me the >> same results in >> roughly the same number of iterations. >> >> In my old code, gamg would work out of the box. When using petsc-main, >> -pc_type gamg -pc_gamg_type agg works for _some_ problems using P1-Lagrange elements, >> but never for >> P2-Lagrange. The typical error message is in gamg_agg.txt >> >> When using -pc_type classical, a problem where the KSP would converge in >> 47 iterations in >> 3.3 now takes 1400. ksp_view_3.3.txt and ksp_view_main.txt show the >> output of -ksp_view >> for both versions. I don't notice anything obvious. >> >> Strangely, removing the call to PCSetCoordinates does not have any impact >> on the >> convergence. >> >> I am sure that I am missing something, or not passing the right options. >> What's a good >> starting point for 3D elasticity? >> Regards, >> Blaise >> >> — >> Canada Research Chair in Mathematical and Computational Aspects of Solid >> Mechanics >> (Tier 1) >> Professor, Department of Mathematics & Statistics >> Hamilton Hall room 409A, McMaster University >> 1280 Main Street West, Hamilton, Ontario L8S 4K1, Canada >> https://www.math.mcmaster.ca/bourdin | +1 (905) 525 9140 ext. 27243 >> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [0]PETSC ERROR: Petsc has generated inconsistent data >> [0]PETSC ERROR: Computed maximum singular value as zero >> [0]PETSC ERROR: WARNING! There are option(s) set that were not used! >> Could be the program crashed before they were used or a spelling mistake, >> etc! >> [0]PETSC ERROR: Option left: name:-displacement_ksp_converged_reason >> value: ascii source: file >> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
>> [0]PETSC ERROR: Petsc Development GIT revision: v3.18.2-341-g16200351da0 >> GIT Date: 2022-12-12 23:42:20 +0000 >> [0]PETSC ERROR: >> /home/bourdinb/Development/mef90/mef90-dmplex/bbserv-gcc11.2.1-mvapich2-2.3.7-O/bin/ThermoElasticity >> on a bbserv-gcc11.2.1-mvapich2-2.3.7-O named bb01 by bourdinb Tue Dec 13 >> 17:02:19 2022 >> [0]PETSC ERROR: Configure options --CFLAGS=-Wunused >> --FFLAGS="-ffree-line-length-none -fallow-argument-mismatch -Wunused" >> --COPTFLAGS="-O2 -march=znver2" --CXXOPTFLAGS="-O2 -march=znver2" >> --FOPTFLAGS="-O2 -march=znver2" --download-chaco=1 --download-exodusii=1 >> --download-fblaslapack=1 --download-hdf5=1 --download-hypre=1 >> --download-metis=1 --download-ml=1 --download-mumps=1 --download-netcdf=1 >> --download-p4est=1 --download-parmetis=1 --download-pnetcdf=1 >> --download-scalapack=1 --download-sowing=1 >> --download-sowing-cc=/opt/rh/devtoolset-9/root/usr/bin/gcc >> --download-sowing-cxx=/opt/rh/devtoolset-9/root/usr/bin/g++ >> --download-sowing-cpp=/opt/rh/devtoolset-9/root/usr/bin/cpp >> --download-sowing-cxxcpp=/opt/rh/devtoolset-9/root/usr/bin/cpp >> --download-superlu=1 --download-triangle=1 --download-yaml=1 >> --download-zlib=1 --with-debugging=0 >> --with-mpi-dir=/opt/HPC/mvapich2/2.3.7-gcc11.2.1 --with-pic >> --with-shared-libraries=1 --with-mpiexec=srun --with-x11=0 >> [0]PETSC ERROR: #1 PCGAMGOptProlongator_AGG() at >> /1/HPC/petsc/main/src/ksp/pc/impls/gamg/agg.c:779 >> [0]PETSC ERROR: #2 PCSetUp_GAMG() at >> /1/HPC/petsc/main/src/ksp/pc/impls/gamg/gamg.c:639 >> [0]PETSC ERROR: #3 PCSetUp() at >> /1/HPC/petsc/main/src/ksp/pc/interface/precon.c:994 >> [0]PETSC ERROR: #4 KSPSetUp() at >> /1/HPC/petsc/main/src/ksp/ksp/interface/itfunc.c:405 >> [0]PETSC ERROR: #5 KSPSolve_Private() at >> /1/HPC/petsc/main/src/ksp/ksp/interface/itfunc.c:824 >> [0]PETSC ERROR: #6 KSPSolve() at >> /1/HPC/petsc/main/src/ksp/ksp/interface/itfunc.c:1070 >> [0]PETSC ERROR: #7 SNESSolve_KSPONLY() at >> /1/HPC/petsc/main/src/snes/impls/ksponly/ksponly.c:48 >> [0]PETSC ERROR: #8 SNESSolve() at >> /1/HPC/petsc/main/src/snes/interface/snes.c:4693 >> [0]PETSC ERROR: #9 >> /home/bourdinb/Development/mef90/mef90-dmplex/ThermoElasticity/ThermoElasticity.F90:228 >> Linear solve converged due to CONVERGED_RTOL iterations 46 >> KSP Object:(Disp_) 32 MPI processes >> type: cg >> maximum iterations=10000 >> tolerances: relative=1e-05, absolute=1e-08, divergence=1e+10 >> left preconditioning >> using nonzero initial guess >> using PRECONDITIONED norm type for convergence test >> PC Object:(Disp_) 32 MPI processes >> type: gamg >> MG: type is MULTIPLICATIVE, levels=4 cycles=v >> Cycles per PCApply=1 >> Using Galerkin computed coarse grid matrices >> Coarse grid solver -- level ------------------------------- >> KSP Object: (Disp_mg_coarse_) 32 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=1, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (Disp_mg_coarse_) 32 MPI processes >> type: bjacobi >> block Jacobi: number of blocks = 32 >> Local solve info for each block is in the following KSP and PC >> objects: >> [0] number of local blocks = 1, first local block number = 0 >> [0] local block number 0 >> KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial 
guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (Disp_mg_coarse_sub_) 1 MPI processes >> type: lu >> KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> LU: out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: nd >> factor fill ratio given 5, needed 1.06061 >> Factored matrix follows: >> KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using NONE norm type for convergence test >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (Disp_mg_coarse_sub_) 1 MPI processes >> type: lu >> LU: out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: nd >> factor fill ratio given 5, needed 0 >> Factored matrix follows: >> Matrix Object: 1 MPI processes >> KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (Disp_mg_coarse_sub_) 1 MPI processes >> type: lu >> LU: out-of-place factorization >> KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (Disp_mg_coarse_sub_) 1 MPI processes >> type: lu >> LU: out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: nd >> factor fill ratio given 5, needed 0 >> Factored matrix follows: >> KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (Disp_mg_coarse_sub_) 1 MPI processes >> type: lu >> LU: out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: nd >> factor fill ratio given 5, needed 0 >> Factored matrix follows: >> KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (Disp_mg_coarse_sub_) 1 MPI processes >> type: lu >> LU: out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: nd >> factor fill ratio given 5, needed 0 >> Factored matrix follows: >> Matrix Object: 1 MPI processes >> type: seqaij >> rows=0, cols=0, bs=6 >> package used to perform factorization: petsc >> total: nonzeros=1, allocated nonzeros=1 >> KSP Object: (Disp_mg_coarse_sub_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (Disp_mg_coarse_sub_) 1 MPI processes >> type: lu >> LU: out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: nd >> factor fill ratio 
given 5, needed 0 >> Factored matrix follows: >> Matrix Object: 1 MPI processes >> type: seqaij >> rows=0, cols=0, bs=6 >> package used to perform factorization: petsc >> total: nonzeros=1, allocated nonzeros=1 >> total number of mallocs used during MatSetValues calls =0 >> not using I-node routines >> linear system matrix = precond matrix: >> Matrix Object: 1 MPI processes >> type: seqaij >> rows=0, cols=0, bs=6 >> total: nonzeros=0, allocated nonzeros=0 >> total number of mallocs used during MatSetValues calls =0 >> not using I-node routines
>> [... the same (Disp_mg_coarse_sub_) sub-block view -- KSP preonly (maximum iterations=10000, initial guess is zero, tolerances: relative=1e-05, absolute=1e-50, divergence=10000, left preconditioning, NONE norm type) with PC lu (out-of-place factorization, tolerance for zero pivot 2.22045e-14, matrix ordering: nd, factor fill ratio given 5, needed 0) -- repeats verbatim for the remaining single-process coarse-grid blocks, one per rank; all of those blocks are empty (rows=0, cols=0, bs=6, factored nonzeros=1, operator nonzeros=0, not using I-node routines). The one non-empty block, which holds the whole 54-row coarse problem, reports: ...]
>> Matrix Object: 1 MPI processes >> type: seqaij >> rows=54, cols=54, bs=6 >> package used to perform factorization: petsc >> total: nonzeros=1260, allocated nonzeros=1260 >> total number of mallocs used during MatSetValues calls =0 >> using I-node routines: found 16 nodes, limit used is 5 >> linear system matrix = precond matrix: >> Matrix Object: 1 MPI processes >> type: seqaij >> rows=54, cols=54, bs=6 >> total: nonzeros=1188, allocated nonzeros=1188 >> total number of mallocs used during MatSetValues calls =0 >> using I-node routines: found 17 nodes, limit used is 5
>> [... the per-rank block listing then repeats in the same pattern for ranks 1 through 29: [k] number of local blocks = 1, first local block number = k >> [k] local block number 0, each followed by a dashed separator ...]
>> [30] number of local blocks = 1, first local block number = 30 >> [30] local block number 0 >> - - - -
- - - - - - - - - - - - - - >> [31] number of local blocks = 1, first local block number = 31 >> [31] local block number 0 >> - - - - - - - - - - - - - - - - - - >> linear system matrix = precond matrix: >> Matrix Object: 32 MPI processes >> type: mpiaij >> rows=54, cols=54, bs=6 >> total: nonzeros=1188, allocated nonzeros=1188 >> total number of mallocs used during MatSetValues calls =0 >> using I-node (on process 0) routines: found 17 nodes, limit used >> is 5 >> Down solver (pre-smoother) on level 1 ------------------------------- >> KSP Object: (Disp_mg_levels_1_) 32 MPI processes >> type: chebyshev >> Chebyshev: eigenvalue estimates: min = 0.101023, max = 2.13327 >> maximum iterations=2 >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using nonzero initial guess >> using NONE norm type for convergence test >> PC Object: (Disp_mg_levels_1_) 32 MPI processes >> type: jacobi >> linear system matrix = precond matrix: >> Matrix Object: 32 MPI processes >> type: mpiaij >> rows=1086, cols=1086, bs=6 >> total: nonzeros=67356, allocated nonzeros=67356 >> total number of mallocs used during MatSetValues calls =0 >> using I-node (on process 0) routines: found 362 nodes, limit >> used is 5 >> Up solver (post-smoother) same as down solver (pre-smoother) >> Down solver (pre-smoother) on level 2 ------------------------------- >> KSP Object: (Disp_mg_levels_2_) 32 MPI processes >> type: chebyshev >> Chebyshev: eigenvalue estimates: min = 0.0996526, max = 2.29388 >> maximum iterations=2 >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using nonzero initial guess >> using NONE norm type for convergence test >> PC Object: (Disp_mg_levels_2_) 32 MPI processes >> type: jacobi >> linear system matrix = precond matrix: >> Matrix Object: 32 MPI processes >> type: mpiaij >> rows=23808, cols=23808, bs=6 >> total: nonzeros=1976256, allocated nonzeros=1976256 >> total number of mallocs used during MatSetValues calls =0 >> using I-node (on process 0) routines: found 7936 nodes, limit >> used is 5 >> Up solver (post-smoother) same as down solver (pre-smoother) >> Down solver (pre-smoother) on level 3 ------------------------------- >> KSP Object: (Disp_mg_levels_3_) 32 MPI processes >> type: chebyshev >> Chebyshev: eigenvalue estimates: min = 0.165968, max = 2.13065 >> maximum iterations=2 >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using nonzero initial guess >> using NONE norm type for convergence test >> PC Object: (Disp_mg_levels_3_) 32 MPI processes >> type: jacobi >> linear system matrix = precond matrix: >> Matrix Object: (Disp_) 32 MPI processes >> type: mpiaij >> rows=291087, cols=291087 >> total: nonzeros=12323691, allocated nonzeros=12336696 >> total number of mallocs used during MatSetValues calls =0 >> using I-node (on process 0) routines: found 3419 nodes, limit >> used is 5 >> Up solver (post-smoother) same as down solver (pre-smoother) >> linear system matrix = precond matrix: >> Matrix Object: (Disp_) 32 MPI processes >> type: mpiaij >> rows=291087, cols=291087 >> total: nonzeros=12323691, allocated nonzeros=12336696 >> total number of mallocs used during MatSetValues calls =0 >> using I-node (on process 0) routines: found 3419 nodes, limit used >> is 5 >> SNESConvergedReason returned 5 >> KSP Object: (Displacement_) 32 MPI processes >> type: cg >> maximum iterations=10000, nonzero initial guess >> tolerances: relative=1e-05, absolute=1e-08, divergence=1e+10 >> left 
preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: (Displacement_) 32 MPI processes >> type: gamg >> type is MULTIPLICATIVE, levels=4 cycles=v >> Cycles per PCApply=1 >> Using externally compute Galerkin coarse grid matrices >> GAMG specific options >> Threshold for dropping small values in graph on each level = -1. >> -1. -1. -1. >> Threshold scaling factor for each level not specified = 1. >> Complexity: grid = 1.02128 operator = 1.05534 >> Coarse grid solver -- level 0 ------------------------------- >> KSP Object: (Displacement_mg_coarse_) 32 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (Displacement_mg_coarse_) 32 MPI processes >> type: bjacobi >> number of blocks = 32 >> Local solver information for first block is in the following KSP >> and PC objects on rank 0: >> Use -Displacement_mg_coarse_ksp_view ::ascii_info_detail to >> display information for all blocks >> KSP Object: (Displacement_mg_coarse_sub_) 1 MPI process >> type: preonly >> maximum iterations=1, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (Displacement_mg_coarse_sub_) 1 MPI process >> type: lu >> out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> using diagonal shift on blocks to prevent zero pivot [INBLOCKS] >> matrix ordering: nd >> factor fill ratio given 5., needed 1.08081 >> Factored matrix follows: >> Mat Object: (Displacement_mg_coarse_sub_) 1 MPI process >> type: seqaij >> rows=20, cols=20 >> package used to perform factorization: petsc >> total: nonzeros=214, allocated nonzeros=214 >> using I-node routines: found 8 nodes, limit used is 5 >> linear system matrix = precond matrix: >> Mat Object: (Displacement_mg_coarse_sub_) 1 MPI process >> type: seqaij >> rows=20, cols=20 >> total: nonzeros=198, allocated nonzeros=198 >> total number of mallocs used during MatSetValues calls=0 >> using I-node routines: found 13 nodes, limit used is 5 >> linear system matrix = precond matrix: >> Mat Object: 32 MPI processes >> type: mpiaij >> rows=20, cols=20 >> total: nonzeros=198, allocated nonzeros=198 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 13 nodes, limit used >> is 5 >> Down solver (pre-smoother) on level 1 ------------------------------- >> KSP Object: (Displacement_mg_levels_1_) 32 MPI processes >> type: chebyshev >> eigenvalue targets used: min 0.81922, max 9.01143 >> eigenvalues estimated via gmres: min 0.186278, max 8.1922 >> eigenvalues estimated using gmres with transform: [0. 0.1; 0. 1.1] >> KSP Object: (Displacement_mg_levels_1_esteig_) 32 MPI processes >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=10, initial guess is zero >> tolerances: relative=1e-12, absolute=1e-50, divergence=10000. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> estimating eigenvalues using noisy right hand side >> maximum iterations=2, nonzero initial guess >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
>> left preconditioning >> using NONE norm type for convergence test >> PC Object: (Displacement_mg_levels_1_) 32 MPI processes >> type: jacobi >> type DIAGONAL >> linear system matrix = precond matrix: >> Mat Object: 32 MPI processes >> type: mpiaij >> rows=799, cols=799 >> total: nonzeros=83159, allocated nonzeros=83159 >> total number of mallocs used during MatSetValues calls=0 >> using I-node (on process 0) routines: found 23 nodes, limit used >> is 5 >> Up solver (post-smoother) same as down solver (pre-smoother) >> Down solver (pre-smoother) on level 2 ------------------------------- >> KSP Object: (Displacement_mg_levels_2_) 32 MPI processes >> type: chebyshev >> eigenvalue targets used: min 1.16291, max 12.792 >> eigenvalues estimated via gmres: min 0.27961, max 11.6291 >> eigenvalues estimated using gmres with transform: [0. 0.1; 0. 1.1] >> KSP Object: (Displacement_mg_levels_2_esteig_) 32 MPI processes >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=10, initial guess is zero >> tolerances: relative=1e-12, absolute=1e-50, divergence=10000. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> estimating eigenvalues using noisy right hand side >> maximum iterations=2, nonzero initial guess >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (Displacement_mg_levels_2_) 32 MPI processes >> type: jacobi >> type DIAGONAL >> linear system matrix = precond matrix: >> Mat Object: 32 MPI processes >> type: mpiaij >> rows=45721, cols=45721 >> total: nonzeros=9969661, allocated nonzeros=9969661 >> total number of mallocs used during MatSetValues calls=0 >> using nonscalable MatPtAP() implementation >> not using I-node (on process 0) routines >> Up solver (post-smoother) same as down solver (pre-smoother) >> Down solver (pre-smoother) on level 3 ------------------------------- >> KSP Object: (Displacement_mg_levels_3_) 32 MPI processes >> type: chebyshev >> eigenvalue targets used: min 0.281318, max 3.0945 >> eigenvalues estimated via gmres: min 0.0522027, max 2.81318 >> eigenvalues estimated using gmres with transform: [0. 0.1; 0. 1.1] >> KSP Object: (Displacement_mg_levels_3_esteig_) 32 MPI processes >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=10, initial guess is zero >> tolerances: relative=1e-12, absolute=1e-50, divergence=10000. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> estimating eigenvalues using noisy right hand side >> maximum iterations=2, nonzero initial guess >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
>> left preconditioning >> using NONE norm type for convergence test >> PC Object: (Displacement_mg_levels_3_) 32 MPI processes >> type: jacobi >> type DIAGONAL >> linear system matrix = precond matrix: >> Mat Object: (Displacement_) 32 MPI processes >> type: mpiaij >> rows=2186610, cols=2186610, bs=3 >> total: nonzeros=181659996, allocated nonzeros=181659996 >> total number of mallocs used during MatSetValues calls=0 >> has attached near null space >> using I-node (on process 0) routines: found 21368 nodes, limit >> used is 5 >> Up solver (post-smoother) same as down solver (pre-smoother) >> linear system matrix = precond matrix: >> Mat Object: (Displacement_) 32 MPI processes >> type: mpiaij >> rows=2186610, cols=2186610, bs=3 >> total: nonzeros=181659996, allocated nonzeros=181659996 >> total number of mallocs used during MatSetValues calls=0 >> has attached near null space >> using I-node (on process 0) routines: found 21368 nodes, limit used >> is 5 >> cell set 1 elastic energy: 9.32425E-02 work: 1.86485E-01 total: >> -9.32425E-02 >> >> >> ? >> Canada Research Chair in Mathematical and Computational Aspects of Solid >> Mechanics (Tier 1) >> Professor, Department of Mathematics & Statistics >> Hamilton Hall room 409A, McMaster University >> 1280 Main Street West, Hamilton, Ontario L8S 4K1, Canada >> https://www.math.mcmaster.ca/bourdin | +1 (905) 525 9140 ext. 27243 >> >> > ? > Canada Research Chair in Mathematical and Computational Aspects of Solid > Mechanics (Tier 1) > Professor, Department of Mathematics & Statistics > Hamilton Hall room 409A, McMaster University > 1280 Main Street West, Hamilton, Ontario L8S 4K1, Canada > https://www.math.mcmaster.ca/bourdin | +1 (905) 525 9140 ext. 27243 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Dec 14 14:50:50 2022 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 14 Dec 2022 15:50:50 -0500 Subject: [petsc-users] Saving solution with monitor function In-Reply-To: References: Message-ID: See, for example https://petsc.org/release/docs/manualpages/TS/TSTrajectoryGetVecs/ and https://petsc.org/release/docs/manualpages/TS/TSTrajectoryGetUpdatedHistoryVecs/ One does not directly access the data inside the trajectory; one calls functions in the API to obtained desired information. If you need specific information that it does not currently provide we can attempt to provide additional functionality. Barry > On Dec 14, 2022, at 1:07 PM, Guglielmo, Tyler Hardy via petsc-users wrote: > > Thanks Matt, > > I?m a bit confused on where the trajectory is being stored in the TSTrajectory object. > > Basically I have run > > TSSetSaveTrajectory(ts); > ? > TSSolve(ts, x); > TSTrajectory tj; > TSGetTrajectory(ts, &tj); > TSTrajectorySetType(tj, ts, TSTRAJECTORYMEMORY); > > How is the object supposed to be accessed to find the entire trajectory? I couldn?t find a clear example of where this is laid out in the documentation. > > The TSTrajectory object looks like some complicated struct, but parsing which pointer is pointing to the solution has alluded me. > > Thanks for your time! 
> > Best, > Tyler > > > From: Matthew Knepley > > Date: Tuesday, December 13, 2022 at 6:41 AM > To: Guglielmo, Tyler Hardy > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Saving solution with monitor function > > On Tue, Dec 13, 2022 at 8:40 AM Guglielmo, Tyler Hardy via petsc-users > wrote: > Hi all, > > I am a new PETSc user (and new to MPI in general), and was wondering if someone could help me out with what I am sure is a basic question (if this is not the appropriate email list or there is a better place please let me know!). > > Basically, I am writing a code that requires a solution to an ODE that will be used later on during runtime. I have written the basic ODE solver using TSRK, however I haven?t thought of a good way to store the actual solution at all time steps throughout the time evolution. I would like to avoid writing each time step to a file through the monitor function, and instead just plug each time step into an array. > > How is this usually done? I suppose the user defined struct that gets passed into the monitor function could contain a pointer to an array in main? This is how I would do this if the program wasn?t of the MPI variety, but I am not sure how to properly declare a pointer to an array declared as Vec and built through the usual PETSc process. Any tips are greatly appreciated > > I think this is what TSTrajectory is for. I believe you want https://petsc.org/main/docs/manualpages/TS/TSTRAJECTORYMEMORY/ > > Thanks, > > Matt > > Thanks for your time, > Tyler > > +++++++++++++++++++++++++++++ > Tyler Guglielmo > Postdoctoral Researcher > Lawrence Livermore National Lab > Office: 925-423-6186 > Cell: 210-480-8000 > +++++++++++++++++++++++++++++ > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongzhang at anl.gov Wed Dec 14 15:09:33 2022 From: hongzhang at anl.gov (Zhang, Hong) Date: Wed, 14 Dec 2022 21:09:33 +0000 Subject: [petsc-users] Saving solution with monitor function In-Reply-To: References: Message-ID: <1E697913-B06B-4260-8526-6C24C1848C9F@anl.gov> It seems that you do not intend to do an offline analysis on the solutions. If the goal is just to get the solutions at each step on the fly, you can probably also try TSSetTimeSpan() and TSGetTimeSpanSolutions(). They can give you an array of solution vectors at the time points you specify beforehand. https://petsc.org/main/docs/manualpages/TS/TSSetTimeSpan/ Hong (Mr.) On Dec 14, 2022, at 12:07 PM, Guglielmo, Tyler Hardy via petsc-users > wrote: Thanks Matt, I?m a bit confused on where the trajectory is being stored in the TSTrajectory object. Basically I have run TSSetSaveTrajectory(ts); ? TSSolve(ts, x); TSTrajectory tj; TSGetTrajectory(ts, &tj); TSTrajectorySetType(tj, ts, TSTRAJECTORYMEMORY); How is the object supposed to be accessed to find the entire trajectory? I couldn?t find a clear example of where this is laid out in the documentation. The TSTrajectory object looks like some complicated struct, but parsing which pointer is pointing to the solution has alluded me. Thanks for your time! 
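A minimal sketch of the time-span route Hong describes above, assuming a PETSc version recent enough to provide TSSetTimeSpan() and TSGetTimeSpanSolutions(); the helper name and the five sample times are made up for illustration.

#include <petscts.h>

/* Rough sketch only: request solutions at a fixed set of time points up front. */
PetscErrorCode SolveWithTimeSpan(TS ts, Vec u)
{
  PetscReal times[5] = {0.0, 0.25, 0.5, 0.75, 1.0}; /* illustrative sample times */
  PetscInt  nsol;
  Vec      *sols;

  PetscFunctionBeginUser;
  PetscCall(TSSetTimeSpan(ts, 5, times)); /* before TSSolve() */
  PetscCall(TSSolve(ts, u));
  PetscCall(TSGetTimeSpanSolutions(ts, &nsol, &sols));
  /* sols[0..nsol-1] hold the states at the requested times; they appear to be owned
     by the TS, so the caller should not destroy them */
  PetscFunctionReturn(0);
}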
Best, Tyler From: Matthew Knepley > Date: Tuesday, December 13, 2022 at 6:41 AM To: Guglielmo, Tyler Hardy > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Saving solution with monitor function On Tue, Dec 13, 2022 at 8:40 AM Guglielmo, Tyler Hardy via petsc-users > wrote: Hi all, I am a new PETSc user (and new to MPI in general), and was wondering if someone could help me out with what I am sure is a basic question (if this is not the appropriate email list or there is a better place please let me know!). Basically, I am writing a code that requires a solution to an ODE that will be used later on during runtime. I have written the basic ODE solver using TSRK, however I haven?t thought of a good way to store the actual solution at all time steps throughout the time evolution. I would like to avoid writing each time step to a file through the monitor function, and instead just plug each time step into an array. How is this usually done? I suppose the user defined struct that gets passed into the monitor function could contain a pointer to an array in main? This is how I would do this if the program wasn?t of the MPI variety, but I am not sure how to properly declare a pointer to an array declared as Vec and built through the usual PETSc process. Any tips are greatly appreciated I think this is what TSTrajectory is for. I believe you want https://petsc.org/main/docs/manualpages/TS/TSTRAJECTORYMEMORY/ Thanks, Matt Thanks for your time, Tyler +++++++++++++++++++++++++++++ Tyler Guglielmo Postdoctoral Researcher Lawrence Livermore National Lab Office: 925-423-6186 Cell: 210-480-8000 +++++++++++++++++++++++++++++ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From guglielmo2 at llnl.gov Wed Dec 14 16:42:56 2022 From: guglielmo2 at llnl.gov (Guglielmo, Tyler Hardy) Date: Wed, 14 Dec 2022 22:42:56 +0000 Subject: [petsc-users] Saving solution with monitor function In-Reply-To: References: Message-ID: Ah beautiful, thanks for showing me these. Apologies for bugging you all with these questions. Having to learn a ton of subroutines? Btw is there a matrix addition function for M = a*M_1 + b*M_2, or one similar to MatAXPY that does not overwrite the Y matrix, i.e. M = a*X + Y ? All the best, Tyler From: Barry Smith Date: Wednesday, December 14, 2022 at 12:51 PM To: Guglielmo, Tyler Hardy Cc: Matthew Knepley , petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Saving solution with monitor function See, for example https://petsc.org/release/docs/manualpages/TS/TSTrajectoryGetVecs/ and https://petsc.org/release/docs/manualpages/TS/TSTrajectoryGetUpdatedHistoryVecs/ One does not directly access the data inside the trajectory; one calls functions in the API to obtained desired information. If you need specific information that it does not currently provide we can attempt to provide additional functionality. Barry On Dec 14, 2022, at 1:07 PM, Guglielmo, Tyler Hardy via petsc-users wrote: Thanks Matt, I?m a bit confused on where the trajectory is being stored in the TSTrajectory object. Basically I have run TSSetSaveTrajectory(ts); ? TSSolve(ts, x); TSTrajectory tj; TSGetTrajectory(ts, &tj); TSTrajectorySetType(tj, ts, TSTRAJECTORYMEMORY); How is the object supposed to be accessed to find the entire trajectory? 
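For completeness, the monitor-plus-context approach Tyler originally asked about also works and needs no trajectory machinery; here is a rough sketch in which the History struct, the fixed capacity and the function names are invented for illustration, while TSMonitorSet(), VecDuplicate() and VecCopy() are standard PETSc calls.

#include <petscts.h>

#define MAX_SAVED_STEPS 10000 /* illustrative fixed capacity */

typedef struct {
  PetscInt  n;                       /* number of steps saved so far           */
  PetscReal times[MAX_SAVED_STEPS];  /* time of each saved step                */
  Vec       states[MAX_SAVED_STEPS]; /* deep copies of the (parallel) solution */
} History;

/* TS monitor: called after every accepted step; copies the current state into the context */
static PetscErrorCode SaveStep(TS ts, PetscInt step, PetscReal t, Vec u, void *ctx)
{
  History *h = (History *)ctx;

  PetscFunctionBeginUser;
  if (h->n < MAX_SAVED_STEPS) {
    PetscCall(VecDuplicate(u, &h->states[h->n]));
    PetscCall(VecCopy(u, h->states[h->n]));
    h->times[h->n] = t;
    h->n++;
  }
  PetscFunctionReturn(0);
}

/* Registration, before TSSolve():
     History hist = {0};
     PetscCall(TSMonitorSet(ts, SaveStep, &hist, NULL));      */

Because each saved Vec is itself a distributed object, nothing special is needed on the MPI side; every rank simply keeps its own share of each copied state.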
I couldn?t find a clear example of where this is laid out in the documentation. The TSTrajectory object looks like some complicated struct, but parsing which pointer is pointing to the solution has alluded me. Thanks for your time! Best, Tyler From: Matthew Knepley > Date: Tuesday, December 13, 2022 at 6:41 AM To: Guglielmo, Tyler Hardy > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Saving solution with monitor function On Tue, Dec 13, 2022 at 8:40 AM Guglielmo, Tyler Hardy via petsc-users > wrote: Hi all, I am a new PETSc user (and new to MPI in general), and was wondering if someone could help me out with what I am sure is a basic question (if this is not the appropriate email list or there is a better place please let me know!). Basically, I am writing a code that requires a solution to an ODE that will be used later on during runtime. I have written the basic ODE solver using TSRK, however I haven?t thought of a good way to store the actual solution at all time steps throughout the time evolution. I would like to avoid writing each time step to a file through the monitor function, and instead just plug each time step into an array. How is this usually done? I suppose the user defined struct that gets passed into the monitor function could contain a pointer to an array in main? This is how I would do this if the program wasn?t of the MPI variety, but I am not sure how to properly declare a pointer to an array declared as Vec and built through the usual PETSc process. Any tips are greatly appreciated I think this is what TSTrajectory is for. I believe you want https://petsc.org/main/docs/manualpages/TS/TSTRAJECTORYMEMORY/ Thanks, Matt Thanks for your time, Tyler +++++++++++++++++++++++++++++ Tyler Guglielmo Postdoctoral Researcher Lawrence Livermore National Lab Office: 925-423-6186 Cell: 210-480-8000 +++++++++++++++++++++++++++++ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksi2443 at gmail.com Wed Dec 14 20:10:50 2022 From: ksi2443 at gmail.com (=?UTF-8?B?6rmA7ISx7J21?=) Date: Thu, 15 Dec 2022 11:10:50 +0900 Subject: [petsc-users] extract preconditioner matrix Message-ID: Hello, I tried to find the way to adapt my own preconditioner. In other words, I want to apply and solve a new preconditioner rather than using the existing one in Petsc. So, my questions are as below 1. Is this possible to adapt my own preconditioner?? 2. Also is it possible to extract preconditioner matrix created in Petsc? 3. Is this possible to separate preconditioning & solving procedure to check the result of each process in Petsc?? Thanks, Hyung Kim -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Dec 14 20:33:50 2022 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 14 Dec 2022 21:33:50 -0500 Subject: [petsc-users] extract preconditioner matrix In-Reply-To: References: Message-ID: > On Dec 14, 2022, at 9:10 PM, ??? wrote: > > Hello, > > > > I tried to find the way to adapt my own preconditioner. > In other words, I want to apply and solve a new preconditioner rather than using the existing one in Petsc. > > So, my questions are as below > 1. Is this possible to adapt my own preconditioner?? 
There are a variety of ways to provide your own preconditioner; you can use https://petsc.org/release/docs/manualpages/PC/PCSHELL/ and take the preconditioner completely in your own hands. But often one builds a preconditioner by combining multiple simpler preconditioners: for example PCFIELDSPLIT discuss in https://petsc.org/release/docs/manual/ksp/, even block Jacobi https://petsc.org/release/docs/manualpages/PC/PCBJACOBI/#pcbjacobi is built up with smaller preconditioners. What particular type of preconditioner are you planning to build? Other users may have ideas on how to do it. > > 2. Also is it possible to extract preconditioner matrix created in Petsc? I'm not sure what you mean by this. Preconditioners are very rarely represented directly by a matrix (that would be too inefficient). Rather one provides functions that apply the action of the preconditioner. As noted above one provides such functions in PETSc using https://petsc.org/release/docs/manualpages/PC/PCSHELL/. > > 3. Is this possible to separate preconditioning & solving procedure to check the result of each process in Petsc?? The KSP and the PC work together to provide an over all good solver. One can focus on the preconditioner's quality by using it with several different Krylov methods. For example the KSPRICHARDSON (-ksp_typre richardson) does essentially nothing so the convergence (or lack of convergence) is determined by the preconditioner only. For non positive definite matrices https://petsc.org/release/docs/manualpages/KSP/KSPBCGS/#kspbcgs is generally weaker than https://petsc.org/release/docs/manualpages/KSP/KSPGMRES/#kspgmres so how your preconditioner works with these two different KSP methods can help you evaluate the preconditioner. Feel free to ask more detailed questions as you use the PETSc solvers, Barry > > > Thanks, > Hyung Kim -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksi2443 at gmail.com Wed Dec 14 20:46:18 2022 From: ksi2443 at gmail.com (=?UTF-8?B?6rmA7ISx7J21?=) Date: Thu, 15 Dec 2022 11:46:18 +0900 Subject: [petsc-users] extract preconditioner matrix In-Reply-To: References: Message-ID: I'm work on FEM (especially for solving contact mechanics). In contact mechanics, original global jacobian matrix has nondiagonal dominance in some rows. I want to remove this effect(nondiagonal dominance) by using some techniques. Because I saw the many references that nondiagonal dominance is harmful for iterative solver. So my goal is, combine algebraic multigrid with some tuning jacobian matrix. To this goal, I am investigating what is possible and what is impossible. Thanks, Hyung Kim 2022? 12? 15? (?) ?? 11:34, Barry Smith ?? ??: > > > On Dec 14, 2022, at 9:10 PM, ??? wrote: > > Hello, > > > > I tried to find the way to adapt my own preconditioner. > In other words, I want to apply and solve a new preconditioner rather than > using the existing one in Petsc. > > So, my questions are as below > 1. Is this possible to adapt my own preconditioner?? > > > There are a variety of ways to provide your own preconditioner; you can > use https://petsc.org/release/docs/manualpages/PC/PCSHELL/ and take the > preconditioner completely in your own hands. But often one builds a > preconditioner by combining multiple simpler preconditioners: for example > PCFIELDSPLIT discuss in https://petsc.org/release/docs/manual/ksp/, even > block Jacobi > https://petsc.org/release/docs/manualpages/PC/PCBJACOBI/#pcbjacobi is > built up with smaller preconditioners. 
> > What particular type of preconditioner are you planning to build? Other > users may have ideas on how to do it. > > > 2. Also is it possible to extract preconditioner matrix created in Petsc? > > > I'm not sure what you mean by this. Preconditioners are very rarely > represented directly by a matrix (that would be too inefficient). Rather > one provides functions that apply the action of the preconditioner. As > noted above one provides such functions in PETSc using > https://petsc.org/release/docs/manualpages/PC/PCSHELL/. > > > > 3. Is this possible to separate preconditioning & solving procedure to > check the result of each process in Petsc?? > > > The KSP and the PC work together to provide an over all good solver. > One can focus on the preconditioner's quality by using it with several > different Krylov methods. For example the KSPRICHARDSON (-ksp_typre > richardson) does essentially nothing so the convergence (or lack of > convergence) is determined by the preconditioner only. For non positive > definite matrices > https://petsc.org/release/docs/manualpages/KSP/KSPBCGS/#kspbcgs is > generally weaker than > https://petsc.org/release/docs/manualpages/KSP/KSPGMRES/#kspgmres so how > your preconditioner works with these two different KSP methods can help you > evaluate the preconditioner. > > Feel free to ask more detailed questions as you use the PETSc solvers, > > Barry > > > > Thanks, > Hyung Kim > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Dec 14 21:20:30 2022 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 14 Dec 2022 22:20:30 -0500 Subject: [petsc-users] Saving solution with monitor function In-Reply-To: References: Message-ID: On Wed, Dec 14, 2022 at 5:43 PM Guglielmo, Tyler Hardy wrote: > Ah beautiful, thanks for showing me these. Apologies for bugging you all > with these questions. Having to learn a ton of subroutines? > > > > Btw is there a matrix addition function for M = a*M_1 + b*M_2, or one > similar to MatAXPY that does not overwrite the Y matrix, i.e. M = a*X + Y ? > We do not have that. You might MatCopy(Y, M) and then MatAXPY() Thanks, Matt > > > All the best, > > Tyler > > > > *From: *Barry Smith > *Date: *Wednesday, December 14, 2022 at 12:51 PM > *To: *Guglielmo, Tyler Hardy > *Cc: *Matthew Knepley , petsc-users at mcs.anl.gov < > petsc-users at mcs.anl.gov> > *Subject: *Re: [petsc-users] Saving solution with monitor function > > > > See, for example > https://petsc.org/release/docs/manualpages/TS/TSTrajectoryGetVecs/ > > and > https://petsc.org/release/docs/manualpages/TS/TSTrajectoryGetUpdatedHistoryVecs/ > > > > > > One does not directly access the data inside the trajectory; one calls > functions in the API to obtained desired information. If you need specific > information that it does not currently provide we can attempt to provide > additional functionality. > > > > > > Barry > > > > > > On Dec 14, 2022, at 1:07 PM, Guglielmo, Tyler Hardy via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > > Thanks Matt, > > > > I?m a bit confused on where the trajectory is being stored in the > TSTrajectory object. > > > > Basically I have run > > > > TSSetSaveTrajectory(ts); > > ? > > TSSolve(ts, x); > > TSTrajectory tj; > > TSGetTrajectory(ts, &tj); > > TSTrajectorySetType(tj, ts, TSTRAJECTORYMEMORY); > > > > How is the object supposed to be accessed to find the entire trajectory? > I couldn?t find a clear example of where this is laid out in the > documentation. 
> > > > The TSTrajectory object looks like some complicated struct, but parsing > which pointer is pointing to the solution has alluded me. > > > > Thanks for your time! > > > > Best, > > Tyler > > > > > > *From: *Matthew Knepley > *Date: *Tuesday, December 13, 2022 at 6:41 AM > *To: *Guglielmo, Tyler Hardy > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] Saving solution with monitor function > > On Tue, Dec 13, 2022 at 8:40 AM Guglielmo, Tyler Hardy via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hi all, > > > > I am a new PETSc user (and new to MPI in general), and was wondering if > someone could help me out with what I am sure is a basic question (if this > is not the appropriate email list or there is a better place please let me > know!). > > > > Basically, I am writing a code that requires a solution to an ODE that > will be used later on during runtime. I have written the basic ODE solver > using TSRK, however I haven?t thought of a good way to store the actual > solution at all time steps throughout the time evolution. I would like to > avoid writing each time step to a file through the monitor function, and > instead just plug each time step into an array. > > > > How is this usually done? I suppose the user defined struct that gets > passed into the monitor function could contain a pointer to an array in > main? This is how I would do this if the program wasn?t of the MPI > variety, but I am not sure how to properly declare a pointer to an array > declared as Vec and built through the usual PETSc process. Any tips are > greatly appreciated > > > > I think this is what TSTrajectory is for. I believe you want > https://petsc.org/main/docs/manualpages/TS/TSTRAJECTORYMEMORY/ > > > > > Thanks, > > > > Matt > > > > Thanks for your time, > > Tyler > > > > +++++++++++++++++++++++++++++ > > Tyler Guglielmo > > Postdoctoral Researcher > > Lawrence Livermore National Lab > > Office: 925-423-6186 > > Cell: 210-480-8000 > > +++++++++++++++++++++++++++++ > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From syvshc at foxmail.com Thu Dec 15 04:13:40 2022 From: syvshc at foxmail.com (=?ISO-8859-1?B?U3l2c2hj?=) Date: Thu, 15 Dec 2022 18:13:40 +0800 Subject: [petsc-users] out of date warning when petscmpiexec. Message-ID: I'm a beginner with petsc, and I'm reading PETSc for Partial Differential Equations.  There is the newest release version (3.18.2) of PETSc's gitlab repo on my device, and openmpi in my system (/usr/sbin/).  Here is the repo and what I tried to compiling. p4pdes/e.c at master ? bueler/p4pdes (github.com) After "make e", I got an excutable file "e". "./e" or "mpiexec -n 4 ./e" can perfectly run.  However when I use "petscmpiexec -n 4 ./e", I got some warnings: Warning: ************** The PETSc libraries are out of date Warning: ************** The executable ./e is out of date What should I do to fix the warning? Also this is the first time that I send to a mail-list, if there are some mistakes I made, please tell me.  
Kind regards, Syvshc -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Dec 15 08:19:21 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 15 Dec 2022 08:19:21 -0600 (CST) Subject: [petsc-users] out of date warning when petscmpiexec. In-Reply-To: References: Message-ID: What do you get - if you invoke the following command in petsc source dir? make -q -f gmakefile libs [this is the test petscmpiexec is using to check if "libraries are out of date"] Satish On Thu, 15 Dec 2022, Syvshc wrote: > I'm a beginner with petsc, and I'm reading PETSc for Partial Differential Equations.  > > There is the newest release version (3.18.2) of PETSc's gitlab repo on my device, and openmpi in my system (/usr/sbin/).  > > > Here is the repo and what I tried to compiling. p4pdes/e.c at master ? bueler/p4pdes (github.com) > > > After "make e", I got an excutable file "e". "./e" or "mpiexec -n 4 ./e" can perfectly run.  > > > However when I use "petscmpiexec -n 4 ./e", I got some warnings: > > > Warning: ************** The PETSc libraries are out of date > Warning: ************** The executable ./e is out of date > > > What should I do to fix the warning? > > > Also this is the first time that I send to a mail-list, if there are some mistakes I made, please tell me.  > > > Kind regards, > > > Syvshc From knepley at gmail.com Thu Dec 15 10:23:04 2022 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 15 Dec 2022 11:23:04 -0500 Subject: [petsc-users] dmplex normal vector incorrect for periodic gmsh grids In-Reply-To: References: <32720DD5-FA50-4766-A36A-7566913795B3@gmx.net> Message-ID: On Wed, Dec 14, 2022 at 9:20 AM Praveen C wrote: > Hello > > I tried this with the attached code but I still get wrong normals. > > ./dmplex -dm_plex_filename ug_periodic.msh -dm_localize_height 1 > > > Code and grid file are attached > I have fixed the code. The above options should now work for you. Thanks, Matt > Thank you > praveen > > > > On 14-Dec-2022, at 6:40 PM, Matthew Knepley wrote: > > On Wed, Dec 14, 2022 at 2:38 AM Praveen C wrote: > >> Thank you, this MR works if I generate the mesh within the code. >> >> But using a mesh made in gmsh, I see the same issue. >> > > Because the operations happen in a different order. If you read in your > mesh with > > -dm_plex_filename mymesh.gmsh -dm_localize_height 1 > > then it should work. > > Thanks, > > Matt > > >> Thanks >> praveen >> >> On 14-Dec-2022, at 12:51 AM, Matthew Knepley wrote: >> >> On Tue, Dec 13, 2022 at 10:57 AM Matthew Knepley >> wrote: >> >>> On Tue, Dec 13, 2022 at 6:11 AM Praveen C wrote: >>> >>>> Hello >>>> >>>> In the attached test, I read a small grid made in gmsh with periodic bc. >>>> >>>> This is a 2d mesh. >>>> >>>> The cell numbers are shown in the figure. >>>> >>>> All faces have length = 2.5 >>>> >>>> But using PetscFVFaceGeom I am getting length of 7.5 for some faces. >>>> E.g., >>>> >>>> face: 59, centroid = 3.750000, 2.500000, normal = 0.000000, -7.500000 >>>> ===> Face length incorrect = 7.500000, should be 2.5 >>>> support[0] = 11, cent = 8.750000, 3.750000, area = 6.250000 >>>> support[1] = 15, cent = 8.750000, 1.250000, area = 6.250000 >>>> >>>> There are also errors in the orientation of normal. >>>> >>>> If we disable periodicity in geo file, this error goes away. >>>> >>> >>> Yes, by default we only localize coordinates for cells. I can put in >>> code to localize faces. 
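In code, the face localization described here amounts to raising the maximum projection height on the coordinate DM before localizing; a minimal hedged sketch, for the case where the localization call is made by your own code rather than during the Gmsh read:

DM cdm;
PetscCall(DMGetCoordinateDM(dm, &cdm));
PetscCall(DMPlexSetMaxProjectionHeight(cdm, 1)); /* localize faces (height 1) as well as cells */
PetscCall(DMLocalizeCoordinates(dm));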
>>> >> >> Okay, I now have a MR for this: >> https://gitlab.com/petsc/petsc/-/merge_requests/5917 >> >> I am attaching your code, slightly modified. You can run >> >> ./dmplex -malloc_debug 0 -dm_plex_box_upper 10,10 -dm_plex_box_faces >> 4,4 -dm_plex_simplex 0 -dm_view ::ascii_info_detail -draw_pause 3 >> -dm_plex_box_bd periodic,periodic -dm_localize_height 0 >> >> which shows incorrect edges and >> >> ./dmplex -malloc_debug 0 -dm_plex_box_upper 10,10 -dm_plex_box_faces 4,4 >> -dm_plex_simplex 0 -dm_view ::ascii_info_detail -draw_pause 3 >> -dm_plex_box_bd periodic,periodic -dm_localize_height 1 >> >> which is correct. If you want to control things yourself, instead of >> using the option you can call DMPlexSetMaxProjectionHeight() on the >> coordinate DM yourself. >> >> Thanks, >> >> Matt >> >> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks >>>> praveen >>>> >>> -- >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From syvshc at foxmail.com Thu Dec 15 11:29:47 2022 From: syvshc at foxmail.com (=?ISO-8859-1?B?U3l2c2hj?=) Date: Fri, 16 Dec 2022 01:29:47 +0800 Subject: [petsc-users] out of date warning when petscmpiexec. In-Reply-To: References: Message-ID: I run this command in the root dir of my petsc, and didn't get any response. ------------------ Original ------------------ From: "petsc-users" From balay at mcs.anl.gov Thu Dec 15 11:40:33 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 15 Dec 2022 11:40:33 -0600 (CST) Subject: [petsc-users] out of date warning when petscmpiexec. In-Reply-To: References: Message-ID: <02eec826-0cb0-3c86-5f17-b01187dc12ce@mcs.anl.gov> That's strange. Do you still get this warning from petscmpiexec? Can you run these commands - and copy/paste the *complete* session from your terminal [for these commands and their output on terminal] ? >>>>>> cd $PETSC_DIR make -q -f gmakefile libs echo $? cd bueler/p4pdes make -q e echo $? petscmpiexec -n 1 ./e make clean make e make -q e echo $? petscmpiexec -n 1 ./e <<<< Alternatively you can just use mpiexec [ i.e not use petscmpiexec - its just a convenience wrapper over using correct mpiexec/valgrind]. Satish On Fri, 16 Dec 2022, Syvshc wrote: > I run this command in the root dir of my petsc, and didn't get any response. > > > > > ------------------ Original ------------------ > From: "petsc-users" Date: Thu, Dec 15, 2022 10:19 PM > To: "Syvshc" Cc: "petsc-users" Subject: Re: [petsc-users] out of date warning when petscmpiexec. > > > > What do you get - if you invoke the following command in petsc source dir? > > make -q -f gmakefile libs > > [this is the test petscmpiexec is using to check if "libraries are out of date"] > > Satish > > On Thu, 15 Dec 2022, Syvshc wrote: > > > I'm a beginner with petsc, and I'm reading&nbsp;PETSc for Partial Differential Equations.&nbsp; > > > > There is the newest release version (3.18.2) of PETSc's gitlab repo on my device, and openmpi in my system (/usr/sbin/).&nbsp; > > > > > > Here is the repo and what I tried to compiling.&nbsp;p4pdes/e.c at master ? 
bueler/p4pdes (github.com) > > > > > > After "make e", I got an excutable file "e". "./e" or "mpiexec -n 4 ./e" can perfectly run.&nbsp; > > > > > > However when I use "petscmpiexec -n 4 ./e", I got some warnings: > > > > > > Warning: ************** The PETSc libraries are out of date > > Warning: ************** The executable ./e is out of date > > > > > > What should I do to fix the warning? > > > > > > Also this is the first time that I send to a mail-list, if there are some mistakes I made, please tell me.&nbsp; > > > > > > Kind regards, > > > > > > Syvshc From syvshc at foxmail.com Thu Dec 15 11:52:18 2022 From: syvshc at foxmail.com (=?gb18030?B?U3l2c2hj?=) Date: Fri, 16 Dec 2022 01:52:18 +0800 Subject: [petsc-users] out of date warning when petscmpiexec. In-Reply-To: <02eec826-0cb0-3c86-5f17-b01187dc12ce@mcs.anl.gov> References: <02eec826-0cb0-3c86-5f17-b01187dc12ce@mcs.anl.gov> Message-ID: I still get this warning, I found that "make -q" command won't get any respond. Here is the whole output: ? cd $PETSC_DIR make -q -f gmakefile libs echo $? cd ~/git/p4pdes/c/ch1 make -q e echo $? petscmpiexec -n 1 ./e make clean make e make -q e echo $? petscmpiexec -n 1 ./e 1 1 Warning: ************** The PETSc libraries are out of date Warning: ************** The executable ./e is out of date e is about 1.000000000000000 rank 0 did 0 flops mpicc -o e.o -c -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-stringop-overflow -fstack-protector -fvisibility=hidden -g3 -O0 -pedantic -std=c99 -I/home/syvshclily/git/petsc/petsc-latest/include -I/home/syvshclily/git/petsc/petsc-latest/arch-linux-c-debug/include    `pwd`/e.c mpicc -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-stringop-overflow -fstack-protector -fvisibility=hidden -g3 -O0 -pedantic -std=c99 -o e e.o  -Wl,-rpath,/home/syvshclily/git/petsc/petsc-latest/arch-linux-c-debug/lib -L/home/syvshclily/git/petsc/petsc-latest/arch-linux-c-debug/lib -Wl,-rpath,/home/syvshclily/git/petsc/petsc-latest/arch-linux-c-debug/lib -L/home/syvshclily/git/petsc/petsc-latest/arch-linux-c-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-pc-linux-gnu/12.2.0 -L/usr/lib/gcc/x86_64-pc-linux-gnu/12.2.0 -lpetsc -lsuperlu_dist -llapack -lblas -lm -lX11 -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lstdc++ -ldl /usr/sbin/rm -f e.o 1 Warning: ************** The PETSc libraries are out of date Warning: ************** The executable ./e is out of date e is about 1.000000000000000 rank 0 did 0 flops Thanks for your reply.  Syvshc ------------------ Original ------------------ From: "petsc-users" From knepley at gmail.com Thu Dec 15 11:56:47 2022 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 15 Dec 2022 12:56:47 -0500 Subject: [petsc-users] reading and writing periodic DMPlex to file In-Reply-To: <83e2b092-5440-e009-ef84-dfde3ff6804d@ovgu.de> References: <83e2b092-5440-e009-ef84-dfde3ff6804d@ovgu.de> Message-ID: On Wed, Dec 14, 2022 at 3:58 AM Berend van Wachem wrote: > > Dear PETSc team and users, > > I have asked a few times about this before, but we haven't really gotten > this to work yet. > > In our code, we use the DMPlex framework and are also interested in > periodic geometries. > > As our simulations typically require many time-steps, we would like to > be able to save the DM to file and to read it again to resume the > simulation (a restart). 
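The basic save/load pattern being discussed looks roughly like the following minimal sketch (the file name "dm.h5" and the variable names are placeholders, an HDF5-enabled PETSc build is assumed, and the attached createandwrite.c / readandcreate.c are the complete examples):

/* write the mesh */
PetscViewer viewer;
PetscCall(PetscViewerHDF5Open(PETSC_COMM_WORLD, "dm.h5", FILE_MODE_WRITE, &viewer));
PetscCall(DMView(dm, viewer));
PetscCall(PetscViewerDestroy(&viewer));

/* read it back on restart */
DM dm2;
PetscCall(DMCreate(PETSC_COMM_WORLD, &dm2));
PetscCall(DMSetType(dm2, DMPLEX));
PetscCall(PetscViewerHDF5Open(PETSC_COMM_WORLD, "dm.h5", FILE_MODE_READ, &viewer));
PetscCall(DMLoad(dm2, viewer));
PetscCall(PetscViewerDestroy(&viewer));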
> > Although this works for a non-periodic DM, we haven't been able to get > this to work for a periodic one. To illustrate this, I have made a > working example, consisting of 2 files, createandwrite.c and > readandcreate.c. I have attached these 2 working examples. We are using > Petsc-3.18.2. > > In the first file (createandwrite.c) a DMPlex is created and written to > a file. Periodicity is activated on lines 52-55 of the code. > > In the second file (readandcreate.c) a DMPlex is read from the file. > When a periodic DM is read, this does not work. Also, trying to > 'enforce' periodicity, lines 55 - 66, does not work if the number of > processes is larger than 1 - the code "hangs" without producing an error. > > Could you indicate what I am missing? I have really tried many different > options, without finding a solution. > Hi Berend, There are several problems. I will eventually fix all of them, but I think we can get this working quickly. 1) Periodicity information is not saved. I will fix this, but forcing it should work. 2) You were getting a hang because the blocksize on the local coordinates was not set correctly after loading since the vector had zero length. This does not happen in any test because HDF5 loads a global vector, but most other things create local coordinates. I have a fix for this, which I will get in an MR, Also, I moved DMLocalizeCoordinates() after distribution, since this is where it belongs. knepley/fix-plex-periodic-faces *$:/PETSc3/petsc/petsc-dev$ git diff diff --git a/src/dm/interface/dmcoordinates.c b/src/dm/interface/dmcoordinates.c index a922348f95b..6437e9f7259 100644 --- a/src/dm/interface/dmcoordinates.c +++ b/src/dm/interface/dmcoordinates.c @@ -551,10 +551,14 @@ PetscErrorCode DMGetCoordinatesLocalSetUp(DM dm) PetscFunctionBegin; PetscValidHeaderSpecific(dm, DM_CLASSID, 1); if (!dm->coordinates[0].xl && dm->coordinates[0].x) { - DM cdm = NULL; + DM cdm = NULL; + PetscInt bs; PetscCall(DMGetCoordinateDM(dm, &cdm)); PetscCall(DMCreateLocalVector(cdm, &dm->coordinates[0].xl)); + // If the size of the vector is 0, it will not get the right block size + PetscCall(VecGetBlockSize(dm->coordinates[0].x, &bs)); + PetscCall(VecSetBlockSize(dm->coordinates[0].xl, bs)); PetscCall(PetscObjectSetName((PetscObject)dm->coordinates[0].xl, "coordinates")); PetscCall(DMGlobalToLocalBegin(cdm, dm->coordinates[0].x, INSERT_VALUES, dm->coordinates[0].xl)); PetscCall(DMGlobalToLocalEnd(cdm, dm->coordinates[0].x, INSERT_VALUES, dm->coordinates[0].xl)); 3) If I comment out forcing the periodicity, your example does not run for me. I will try to figure it out [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Nonconforming object sizes [0]PETSC ERROR: SF roots 4400 < pEnd 6000 [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: WARNING! There are option(s) set that were not used! Could be the program crashed before they were used or a spelling mistake, etc! [1]PETSC ERROR: Nonconforming object sizes [0]PETSC ERROR: Option left: name:-start_in_debugger_no (no value) source: command line [1]PETSC ERROR: SF roots 4400 < pEnd 6000 [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Development GIT revision: v3.18.1-494-g16200351da0 GIT Date: 2022-12-12 23:42:20 +0000 [1]PETSC ERROR: WARNING! There are option(s) set that were not used! 
Could be the program crashed before they were used or a spelling mistake, etc! [1]PETSC ERROR: Option left: name:-start_in_debugger_no (no value) source: command line [0]PETSC ERROR: ./readandcreate on a arch-master-debug named MacBook-Pro.cable.rcn.com by knepley Thu Dec 15 12:50:26 2022 [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Configure options --PETSC_ARCH=arch-master-debug --download-bamg --download-bison --download-chaco --download-ctetgen --download-egads --download-eigen --download-exodusii --download-fftw --download-hpddm --download-ks --download-libceed --download-libpng --download-metis --download-ml --download-mumps --download-muparser --download-netcdf --download-opencascade --download-p4est --download-parmetis --download-pnetcdf --download-pragmatic --download-ptscotch --download-scalapack --download-slepc --download-suitesparse --download-superlu_dist --download-tetgen --download-triangle --with-cmake-exec=/PETSc3/petsc/apple/bin/cmake --with-ctest-exec=/PETSc3/petsc/apple/bin/ctest --with-hdf5-dir=/PETSc3/petsc/apple --with-mpi-dir=/PETSc3/petsc/apple --with-petsc4py=1 --with-shared-libraries --with-slepc --with-zlib [1]PETSC ERROR: Petsc Development GIT revision: v3.18.1-494-g16200351da0 GIT Date: 2022-12-12 23:42:20 +0000 [0]PETSC ERROR: #1 PetscSectionCreateGlobalSection() at /PETSc3/petsc/petsc-dev/src/vec/is/section/interface/section.c:1322 [1]PETSC ERROR: ./readandcreate on a arch-master-debug named MacBook-Pro.cable.rcn.com by knepley Thu Dec 15 12:50:26 2022 [0]PETSC ERROR: #2 DMGetGlobalSection() at /PETSc3/petsc/petsc-dev/src/dm/interface/dm.c:4527 [1]PETSC ERROR: Configure options --PETSC_ARCH=arch-master-debug --download-bamg --download-bison --download-chaco --download-ctetgen --download-egads --download-eigen --download-exodusii --download-fftw --download-hpddm --download-ks --download-libceed --download-libpng --download-metis --download-ml --download-mumps --download-muparser --download-netcdf --download-opencascade --download-p4est --download-parmetis --download-pnetcdf --download-pragmatic --download-ptscotch --download-scalapack --download-slepc --download-suitesparse --download-superlu_dist --download-tetgen --download-triangle --with-cmake-exec=/PETSc3/petsc/apple/bin/cmake --with-ctest-exec=/PETSc3/petsc/apple/bin/ctest --with-hdf5-dir=/PETSc3/petsc/apple --with-mpi-dir=/PETSc3/petsc/apple --with-petsc4py=1 --with-shared-libraries --with-slepc --with-zlib [0]PETSC ERROR: #3 DMPlexSectionLoad_HDF5_Internal() at /PETSc3/petsc/petsc-dev/src/dm/impls/plex/plexhdf5.c:2750 [1]PETSC ERROR: #1 PetscSectionCreateGlobalSection() at /PETSc3/petsc/petsc-dev/src/vec/is/section/interface/section.c:1322 [0]PETSC ERROR: #4 DMPlexSectionLoad() at /PETSc3/petsc/petsc-dev/src/dm/impls/plex/plex.c:2364 [1]PETSC ERROR: #2 DMGetGlobalSection() at /PETSc3/petsc/petsc-dev/src/dm/interface/dm.c:4527 [0]PETSC ERROR: #5 main() at /Users/knepley/Downloads/tmp/Berend/readandcreate.c:85 [1]PETSC ERROR: #3 DMPlexSectionLoad_HDF5_Internal() at /PETSc3/petsc/petsc-dev/src/dm/impls/plex/plexhdf5.c:2750 [0]PETSC ERROR: PETSc Option Table entries: [0]PETSC ERROR: -malloc_debug (source: environment) [1]PETSC ERROR: #4 DMPlexSectionLoad() at /PETSc3/petsc/petsc-dev/src/dm/impls/plex/plex.c:2364 [1]PETSC ERROR: #5 main() at /Users/knepley/Downloads/tmp/Berend/readandcreate.c:85 [0]PETSC ERROR: -start_in_debugger_no (source: command line) [1]PETSC ERROR: PETSc Option Table entries: [0]PETSC ERROR: ----------------End of Error Message -------send 
entire error message to petsc-maint at mcs.anl.gov---------- [1]PETSC ERROR: -malloc_debug (source: environment) application called MPI_Abort(MPI_COMM_SELF, 60) - process 0 [1]PETSC ERROR: -start_in_debugger_no (source: command line) [1]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- application called MPI_Abort(MPI_COMM_SELF, 60) - process 0 4) We now have parallel HDF5 loading, so you should not have to manually distribute. I will change your example to use it and send it back when I am done. Thanks! Matt Many thanks and kind regards, > Berend. -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Dec 15 12:11:19 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 15 Dec 2022 12:11:19 -0600 (CST) Subject: [petsc-users] out of date warning when petscmpiexec. In-Reply-To: References: <02eec826-0cb0-3c86-5f17-b01187dc12ce@mcs.anl.gov> Message-ID: <9f93c77f-e263-c8ca-6c9e-1ed00e65c0c5@mcs.anl.gov> > make -q -f gmakefile libs > echo $? > 1 So "make" does think the library is out-of-date. If up-to-date - you should see a '0' - not '1'. One more try: what do you get for: cd $PETSC_DIR make -f gmakefile libs # i.e without -q Satish On Fri, 16 Dec 2022, Syvshc wrote: > I still get this warning, I found that "make -q" command won't get any respond. > > > Here is the whole output: > > > ?7?9 cd $PETSC_DIR > make -q -f gmakefile libs > echo $? > cd ~/git/p4pdes/c/ch1 > make -q e > echo $? > petscmpiexec -n 1 ./e > make clean > make e > make -q e > echo $? > petscmpiexec -n 1 ./e > 1 > 1 > Warning: ************** The PETSc libraries are out of date > Warning: ************** The executable ./e is out of date > e is about 1.000000000000000 > rank 0 did 0 flops > mpicc -o e.o -c -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-stringop-overflow -fstack-protector -fvisibility=hidden -g3 -O0 -pedantic -std=c99 -I/home/syvshclily/git/petsc/petsc-latest/include -I/home/syvshclily/git/petsc/petsc-latest/arch-linux-c-debug/include    `pwd`/e.c > mpicc -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-stringop-overflow -fstack-protector -fvisibility=hidden -g3 -O0 -pedantic -std=c99 -o e e.o  -Wl,-rpath,/home/syvshclily/git/petsc/petsc-latest/arch-linux-c-debug/lib -L/home/syvshclily/git/petsc/petsc-latest/arch-linux-c-debug/lib -Wl,-rpath,/home/syvshclily/git/petsc/petsc-latest/arch-linux-c-debug/lib -L/home/syvshclily/git/petsc/petsc-latest/arch-linux-c-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-pc-linux-gnu/12.2.0 -L/usr/lib/gcc/x86_64-pc-linux-gnu/12.2.0 -lpetsc -lsuperlu_dist -llapack -lblas -lm -lX11 -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lstdc++ -ldl > /usr/sbin/rm -f e.o > 1 > Warning: ************** The PETSc libraries are out of date > Warning: ************** The executable ./e is out of date > e is about 1.000000000000000 > rank 0 did 0 flops > > > > Thanks for your reply.  > > > Syvshc > > > > > ------------------ Original ------------------ > From: "petsc-users" Date: Fri, Dec 16, 2022 01:40 AM > To: "Syvshc" Cc: "petsc-users" Subject: Re: [petsc-users] out of date warning when petscmpiexec. > > > > That's strange. 
Do you still get this warning from petscmpiexec? > > Can you run these commands - and copy/paste the *complete* session from your terminal [for these commands and their output on terminal] ? > > >>>>>> > cd $PETSC_DIR > make -q -f gmakefile libs > echo $? > cd bueler/p4pdes > make -q e > echo $? > petscmpiexec -n 1 ./e > make clean > make e > make -q e > echo $? > petscmpiexec -n 1 ./e > <<<< > > Alternatively you can just use mpiexec [ i.e not use petscmpiexec - its just a convenience wrapper over using correct mpiexec/valgrind]. > > Satish > > On Fri, 16 Dec 2022, Syvshc wrote: > > > I run this command in the root dir of my petsc, and didn't get any response. > > > > > > > > > > ------------------&nbsp;Original&nbsp;------------------ > > From:                                                                                                                        "petsc-users"                                         &nbs p;                                           > Date:&nbsp;Thu, Dec 15, 2022 10:19 PM > > To:&nbsp;"Syvshc" > Cc:&nbsp;"petsc-users" > Subject:&nbsp;Re: [petsc-users] out of date warning when petscmpiexec. > > > > > > > > What do you get - if you invoke the following command in petsc source dir? > > > > make -q -f gmakefile libs > > > > [this is the test petscmpiexec is using to check if "libraries are out of date"] > > > > Satish > > > > On Thu, 15 Dec 2022, Syvshc wrote: > > > > &gt; I'm a beginner with petsc, and I'm reading&amp;nbsp;PETSc for Partial Differential Equations.&amp;nbsp; > > &gt; > > &gt; There is the newest release version (3.18.2) of PETSc's gitlab repo on my device, and openmpi in my system (/usr/sbin/).&amp;nbsp; > > &gt; > > &gt; > > &gt; Here is the repo and what I tried to compiling.&amp;nbsp;p4pdes/e.c at master ?? bueler/p4pdes (github.com) > > &gt; > > &gt; > > &gt; After "make e", I got an excutable file "e". "./e" or "mpiexec -n 4 ./e" can perfectly run.&amp;nbsp; > > &gt; > > &gt; > > &gt; However when I use "petscmpiexec -n 4 ./e", I got some warnings: > > &gt; > > &gt; > > &gt; Warning: ************** The PETSc libraries are out of date > > &gt; Warning: ************** The executable ./e is out of date > > &gt; > > &gt; > > &gt; What should I do to fix the warning? > > &gt; > > &gt; > > &gt; Also this is the first time that I send to a mail-list, if there are some mistakes I made, please tell me.&amp;nbsp; > > &gt; > > &gt; > > &gt; Kind regards, > > &gt; > > &gt; > > &gt; Syvshc From syvshc at foxmail.com Thu Dec 15 12:21:23 2022 From: syvshc at foxmail.com (=?gb18030?B?U3l2c2hj?=) Date: Fri, 16 Dec 2022 02:21:23 +0800 Subject: [petsc-users] =?gb18030?b?u9i4tKO6ICBvdXQgb2YgZGF0ZSB3YXJuaW5n?= =?gb18030?q?_when_petscmpiexec=2E?= In-Reply-To: <9f93c77f-e263-c8ca-6c9e-1ed00e65c0c5@mcs.anl.gov> References: <02eec826-0cb0-3c86-5f17-b01187dc12ce@mcs.anl.gov> <9f93c77f-e263-c8ca-6c9e-1ed00e65c0c5@mcs.anl.gov> Message-ID: I got lots of build information,  and after "make -f gmakefile libs", the out of date warning of petscmpiexec was gone.  ? petscmpiexec -n 1 ./e Warning: ************** The executable ./e is out of date e is about 1.000000000000000 rank 0 did 0 flops BTW, if I want to know something of the "executable out of date" question,  should I send a new mail or I can just ask under these mails? Thanks for your help sincerely. Syvshc ------------------ ???? 
------------------ ???: "petsc-users" From balay at mcs.anl.gov Thu Dec 15 12:30:19 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 15 Dec 2022 12:30:19 -0600 (CST) Subject: [petsc-users] ?????? out of date warning when petscmpiexec. In-Reply-To: References: <02eec826-0cb0-3c86-5f17-b01187dc12ce@mcs.anl.gov> <9f93c77f-e263-c8ca-6c9e-1ed00e65c0c5@mcs.anl.gov> Message-ID: <03ef4e17-ca80-d5d3-3032-6fa80eed82b2@mcs.anl.gov> Ok - so for whatever reason your petsc build were indeed out of date - and "make -f gmakefile libs" updated the library. Wrt out-of-date executable message - I think its due to the old formatted makefile. You can do the following: - edit p4pdes/c/ch1/makefile - remove the line: ' ${RM} e.o' - run 'make e' Now retry: petscmpiexec -n 1 ./e Satish On Fri, 16 Dec 2022, Syvshc wrote: > I got lots of build information,  > and after "make -f gmakefile libs", the out of date warning of petscmpiexec was gone.  > > > ?7?9 petscmpiexec -n 1 ./e > Warning: ************** The executable ./e is out of date > e is about 1.000000000000000 > rank 0 did 0 flops > > > > BTW, if I want to know something of the "executable out of date" question,  > should I send a new mail or I can just ask under these mails? > > > Thanks for your help sincerely. > > > Syvshc > > > > > ------------------ ???????? ------------------ > ??????: "petsc-users" ????????: 2022??12??16??(??????) ????2:11 > ??????: "Syvshc" ????: "petsc-users" ????: Re: [petsc-users] out of date warning when petscmpiexec. > > > > > make -q -f gmakefile libs > > echo $? > > 1 > > So "make" does think the library is out-of-date. If up-to-date - you should see a '0' - not '1'. One more try: > > what do you get for: > > cd $PETSC_DIR > make -f gmakefile libs  # i.e without -q > > Satish > > On Fri, 16 Dec 2022, Syvshc wrote: > > > I still get this warning, I found that "make -q" command won't get any respond. > > > > > > Here is the whole output: > > > > > > ?7?9 cd $PETSC_DIR > > make -q -f gmakefile libs > > echo $? > > cd ~/git/p4pdes/c/ch1 > > make -q e > > echo $? > > petscmpiexec -n 1 ./e > > make clean > > make e > > make -q e > > echo $? 
> > petscmpiexec -n 1 ./e > > 1 > > 1 > > Warning: ************** The PETSc libraries are out of date > > Warning: ************** The executable ./e is out of date > > e is about 1.000000000000000 > > rank 0 did 0 flops > > mpicc -o e.o -c -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-stringop-overflow -fstack-protector -fvisibility=hidden -g3 -O0 -pedantic -std=c99 -I/home/syvshclily/git/petsc/petsc-latest/include -I/home/syvshclily/git/petsc/petsc-latest/arch-linux-c-debug/include&nbsp; &nbsp; `pwd`/e.c > > mpicc -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-stringop-overflow -fstack-protector -fvisibility=hidden -g3 -O0 -pedantic -std=c99 -o e e.o&nbsp; -Wl,-rpath,/home/syvshclily/git/petsc/petsc-latest/arch-linux-c-debug/lib -L/home/syvshclily/git/petsc/petsc-latest/arch-linux-c-debug/lib -Wl,-rpath,/home/syvshclily/git/petsc/petsc-latest/arch-linux-c-debug/lib -L/home/syvshclily/git/petsc/petsc-latest/arch-linux-c-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-pc-linux-gnu/12.2.0 -L/usr/lib/gcc/x86_64-pc-linux-gnu/12.2.0 -lpetsc -lsuperlu_dist -llapack -lblas -lm -lX11 -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lstdc++ -ldl > > /usr/sbin/rm -f e.o > > 1 > > Warning: ************** The PETSc libraries are out of date > > Warning: ************** The executable ./e is out of date > > e is about 1.000000000000000 > > rank 0 did 0 flops > > > > > > > > Thanks for your reply.&nbsp; > > > > > > Syvshc > > > > > > > > > > ------------------&nbsp;Original&nbsp;------------------ > > From:                                                                                                                        "petsc-users"                                         &nbs p;                                           > Date:&nbsp;Fri, Dec 16, 2022 01:40 AM > > To:&nbsp;"Syvshc" > Cc:&nbsp;"petsc-users" > Subject:&nbsp;Re: [petsc-users] out of date warning when petscmpiexec. > > > > > > > > That's strange. Do you still get this warning from petscmpiexec? > > > > Can you run these commands - and copy/paste the *complete* session from your terminal [for these commands and their output on terminal] ? > > > > &gt;&gt;&gt;&gt;&gt;&gt; > > cd $PETSC_DIR > > make -q -f gmakefile libs > > echo $? > > cd bueler/p4pdes > > make -q e > > echo $? > > petscmpiexec -n 1 ./e > > make clean > > make e > > make -q e > > echo $? > > petscmpiexec -n 1 ./e > > <<<< > > > > Alternatively you can just use mpiexec [ i.e not use petscmpiexec - its just a convenience wrapper over using correct mpiexec/valgrind]. > > > > Satish > > > > On Fri, 16 Dec 2022, Syvshc wrote: > > > > &gt; I run this command in the root dir of my petsc, and didn't get any response. 

From guglielmo2 at llnl.gov  Thu Dec 15 14:56:36 2022
From: guglielmo2 at llnl.gov (Guglielmo, Tyler Hardy)
Date: Thu, 15 Dec 2022 20:56:36 +0000
Subject: [petsc-users] Using slepc MFN
Message-ID: 

Not sure if there are many slepc users here, but I have a question regarding MFN in slepc.
Essentially I am running a loop over time steps, and at each step I would like to compute a matrix exponential, however the matrix that I am computing the exponential of changes in a non-trivial way at each time step. It seems like the proper way to implement this is to build and destroy an MFN object at each time step. However, this doesn?t seem terribly efficient as most Krylov methods I have looked at (Expokit) require a large amount of workspace memory allocation at runtime. Is there a safe way to change the operator appearing in the MFN object without having to destroy and recreate? Best, Tyler +++++++++++++++++++++++++++++ Tyler Guglielmo Postdoctoral Researcher Lawrence Livermore National Lab Office: 925-423-6186 Cell: 210-480-8000 +++++++++++++++++++++++++++++ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Thu Dec 15 15:52:26 2022 From: jroman at dsic.upv.es (Jose E. Roman) Date: Thu, 15 Dec 2022 22:52:26 +0100 Subject: [petsc-users] Using slepc MFN In-Reply-To: References: Message-ID: <3D5336EA-8EA5-4FA8-A67E-7F3908972775@dsic.upv.es> You don't need to destroy the solver object, it should be sufficient to just call MFNSetOperator() with the new matrix. Jose > El 15 dic 2022, a las 21:56, Guglielmo, Tyler Hardy via petsc-users escribi?: > > Not sure if there are many slepc users here, but I have a question regarding MFN in slepc. > > Essentially I am running a loop over time steps, and at each step I would like to compute a matrix exponential, however the matrix that I am computing the exponential of changes in a non-trivial way at each time step. > > It seems like the proper way to implement this is to build and destroy an MFN object at each time step. However, this doesn?t seem terribly efficient as most Krylov methods I have looked at (Expokit) require a large amount of workspace memory allocation at runtime. > > Is there a safe way to change the operator appearing in the MFN object without having to destroy and recreate? > > Best, > Tyler > > +++++++++++++++++++++++++++++ > Tyler Guglielmo > Postdoctoral Researcher > Lawrence Livermore National Lab > Office: 925-423-6186 > Cell: 210-480-8000 > +++++++++++++++++++++++++++++ From syvshc at foxmail.com Thu Dec 15 20:08:11 2022 From: syvshc at foxmail.com (=?gb18030?B?U3l2c2hj?=) Date: Fri, 16 Dec 2022 10:08:11 +0800 Subject: [petsc-users] =?gb18030?b?u9i4tKO6ICA/Pz8/Pz8gIG91dCBvZiBkYXRl?= =?gb18030?q?_warning_when_petscmpiexec=2E?= In-Reply-To: <03ef4e17-ca80-d5d3-3032-6fa80eed82b2@mcs.anl.gov> References: <02eec826-0cb0-3c86-5f17-b01187dc12ce@mcs.anl.gov> <9f93c77f-e263-c8ca-6c9e-1ed00e65c0c5@mcs.anl.gov> <03ef4e17-ca80-d5d3-3032-6fa80eed82b2@mcs.anl.gov> Message-ID: That's the solution, Thank you very much. Syvshc ------------------ ???? ------------------ ???: "petsc-users" From mi.mike1021 at gmail.com Thu Dec 15 21:09:57 2022 From: mi.mike1021 at gmail.com (Mike Michell) Date: Thu, 15 Dec 2022 21:09:57 -0600 Subject: [petsc-users] DMLocalToLocal() and DMCoarsen() for DMPlex with Fortran Message-ID: Hello PETSc developer team, I am a user of DMPlex in PETSc with Fortran. I have two questions: - Is DMLocalToLocal() now available for DMPlex with Fortran? I made similar inquiry before: https://www.mail-archive.com/petsc-users at mcs.anl.gov/msg44500.html - Is there any example that can see how DMCoarsen() works? I can see either src/dm/impls/stag/tutorials/ex4.c or src/ksp/ksp/tutorials/ex65.c from the example folder. 
However, it is a bit tough to get an idea of how DMCoarsen() works. What can be the "coarsening" criteria? Is it uniformly coarsening over the domain? or Can it be variable-gradient based? Having more examples would be very helpful. Thanks, Mike -------------- next part -------------- An HTML attachment was scrubbed... URL: From praveen at gmx.net Thu Dec 15 23:22:16 2022 From: praveen at gmx.net (Praveen C) Date: Fri, 16 Dec 2022 10:52:16 +0530 Subject: [petsc-users] dmplex normal vector incorrect for periodic gmsh grids In-Reply-To: References: <32720DD5-FA50-4766-A36A-7566913795B3@gmx.net> Message-ID: <4497F6D2-9B0F-450D-8767-A9BBB9CA1DF5@gmx.net> Thank you very much. I do see correct normals now. Is there a way to set the option >> -dm_localize_height 1 within the code ? best praveen > On 15-Dec-2022, at 9:53 PM, Matthew Knepley wrote: > > On Wed, Dec 14, 2022 at 9:20 AM Praveen C > wrote: >> Hello >> >> I tried this with the attached code but I still get wrong normals. >> >> ./dmplex -dm_plex_filename ug_periodic.msh -dm_localize_height 1 >> >> Code and grid file are attached > > I have fixed the code. The above options should now work for you. > > Thanks, > > Matt > >> Thank you >> praveen >> >> >> >>> On 14-Dec-2022, at 6:40 PM, Matthew Knepley > wrote: >>> >>> On Wed, Dec 14, 2022 at 2:38 AM Praveen C > wrote: >>>> Thank you, this MR works if I generate the mesh within the code. >>>> >>>> But using a mesh made in gmsh, I see the same issue. >>> >>> Because the operations happen in a different order. If you read in your mesh with >>> >>> -dm_plex_filename mymesh.gmsh -dm_localize_height 1 >>> >>> then it should work. >>> >>> Thanks, >>> >>> Matt >>> >>>> Thanks >>>> praveen >>>> >>>>> On 14-Dec-2022, at 12:51 AM, Matthew Knepley > wrote: >>>>> >>>>> On Tue, Dec 13, 2022 at 10:57 AM Matthew Knepley > wrote: >>>>>> On Tue, Dec 13, 2022 at 6:11 AM Praveen C > wrote: >>>>>>> Hello >>>>>>> >>>>>>> In the attached test, I read a small grid made in gmsh with periodic bc. >>>>>>> >>>>>>> This is a 2d mesh. >>>>>>> >>>>>>> The cell numbers are shown in the figure. >>>>>>> >>>>>>> All faces have length = 2.5 >>>>>>> >>>>>>> But using PetscFVFaceGeom I am getting length of 7.5 for some faces. E.g., >>>>>>> >>>>>>> face: 59, centroid = 3.750000, 2.500000, normal = 0.000000, -7.500000 >>>>>>> ===> Face length incorrect = 7.500000, should be 2.5 >>>>>>> support[0] = 11, cent = 8.750000, 3.750000, area = 6.250000 >>>>>>> support[1] = 15, cent = 8.750000, 1.250000, area = 6.250000 >>>>>>> >>>>>>> There are also errors in the orientation of normal. >>>>>>> >>>>>>> If we disable periodicity in geo file, this error goes away. >>>>>> >>>>>> Yes, by default we only localize coordinates for cells. I can put in code to localize faces. >>>>> >>>>> Okay, I now have a MR for this: https://gitlab.com/petsc/petsc/-/merge_requests/5917 >>>>> >>>>> I am attaching your code, slightly modified. You can run >>>>> >>>>> ./dmplex -malloc_debug 0 -dm_plex_box_upper 10,10 -dm_plex_box_faces 4,4 -dm_plex_simplex 0 -dm_view ::ascii_info_detail -draw_pause 3 -dm_plex_box_bd periodic,periodic -dm_localize_height 0 >>>>> >>>>> which shows incorrect edges and >>>>> >>>>> ./dmplex -malloc_debug 0 -dm_plex_box_upper 10,10 -dm_plex_box_faces 4,4 -dm_plex_simplex 0 -dm_view ::ascii_info_detail -draw_pause 3 -dm_plex_box_bd periodic,periodic -dm_localize_height 1 >>>>> >>>>> which is correct. 
If you want to control things yourself, instead of using the option you can call DMPlexSetMaxProjectionHeight() on the coordinate DM yourself. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>>> Thanks >>>>>>> praveen >>>>>> -- >>>>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From rohany at alumni.cmu.edu Fri Dec 16 02:03:58 2022 From: rohany at alumni.cmu.edu (Rohan Yadav) Date: Fri, 16 Dec 2022 00:03:58 -0800 Subject: [petsc-users] Help with input construction hang on 2-GPU CG Solve Message-ID: Hi, I'm developing a microbenchmark that runs a CG solve using PETSc on a mesh using a 5-point stencil matrix. My code (linked here: https://github.com/rohany/petsc-pde-benchmark/blob/main/main.cpp, only 120 lines) works on 1 GPU and has great performance. When I move to 2 GPUs, the program appears to get stuck in the input generation. I've literred the code with print statements and have found out the following clues: * The first rank progresses through this loop: https://github.com/rohany/petsc-pde-benchmark/blob/main/main.cpp#L44, but then does not exit (it seems to get stuck right before rowStart == rowEnd) * The second rank makes very few iterations through the loop for its allotted rows. Therefore, neither rank makes it to the call to MatAssemblyBegin. I'm running the code using the following command line on the Summit supercomputer: ``` jsrun -n 2 -g 1 -c 1 -b rs -r 2 /gpfs/alpine/scratch/rohany/csc335/petsc-pde-benchmark/main -ksp_max_it 200 -ksp_type cg -pc_type none -ksp_atol 1e-10 -ksp_rtol 1e-10 -vec_type cuda -mat_type aijcusparse -use_gpu_aware_mpi 0 -nx 8485 -ny 8485 ``` Any suggestions will be appreciated! I feel that I have applied many of the common petsc optimizations of preallocating my matrix row counts, so I'm not sure what's going on with this input generation. Thanks, Rohan Yadav -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Dec 16 10:02:10 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 16 Dec 2022 11:02:10 -0500 Subject: [petsc-users] dmplex normal vector incorrect for periodic gmsh grids In-Reply-To: <4497F6D2-9B0F-450D-8767-A9BBB9CA1DF5@gmx.net> References: <32720DD5-FA50-4766-A36A-7566913795B3@gmx.net> <4497F6D2-9B0F-450D-8767-A9BBB9CA1DF5@gmx.net> Message-ID: On Fri, Dec 16, 2022 at 12:22 AM Praveen C wrote: > Thank you very much. I do see correct normals now. > > Is there a way to set the option > > -dm_localize_height 1 >> > > within the code ? > The problem is that the localization happens within the Gmsh construction, so it is difficult to insert an API. We could use a callback, but that rapidly becomes unwieldy. If you want to do it programmatically, I would use PetscOptionsSetValue(). Thanks, Matt > best > praveen > > On 15-Dec-2022, at 9:53 PM, Matthew Knepley wrote: > > On Wed, Dec 14, 2022 at 9:20 AM Praveen C wrote: > >> Hello >> >> I tried this with the attached code but I still get wrong normals. 
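Programmatically, the PetscOptionsSetValue() route mentioned above would look roughly like this (a sketch only; the option has to be set before the mesh is created or read so that the Gmsh construction sees it):

PetscCall(PetscOptionsSetValue(NULL, "-dm_localize_height", "1"));
/* ... then create/read the DM as before, e.g. via -dm_plex_filename and DMSetFromOptions() ... */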
>> >> ./dmplex -dm_plex_filename ug_periodic.msh -dm_localize_height 1 >> >> Code and grid file are attached >> > > I have fixed the code. The above options should now work for you. > > Thanks, > > Matt > > >> Thank you >> praveen >> >> >> >> On 14-Dec-2022, at 6:40 PM, Matthew Knepley wrote: >> >> On Wed, Dec 14, 2022 at 2:38 AM Praveen C wrote: >> >>> Thank you, this MR works if I generate the mesh within the code. >>> >>> But using a mesh made in gmsh, I see the same issue. >>> >> >> Because the operations happen in a different order. If you read in your >> mesh with >> >> -dm_plex_filename mymesh.gmsh -dm_localize_height 1 >> >> then it should work. >> >> Thanks, >> >> Matt >> >> >>> Thanks >>> praveen >>> >>> On 14-Dec-2022, at 12:51 AM, Matthew Knepley wrote: >>> >>> On Tue, Dec 13, 2022 at 10:57 AM Matthew Knepley >>> wrote: >>> >>>> On Tue, Dec 13, 2022 at 6:11 AM Praveen C wrote: >>>> >>>>> Hello >>>>> >>>>> In the attached test, I read a small grid made in gmsh with periodic >>>>> bc. >>>>> >>>>> This is a 2d mesh. >>>>> >>>>> The cell numbers are shown in the figure. >>>>> >>>>> All faces have length = 2.5 >>>>> >>>>> But using PetscFVFaceGeom I am getting length of 7.5 for some faces. >>>>> E.g., >>>>> >>>>> face: 59, centroid = 3.750000, 2.500000, normal = 0.000000, -7.500000 >>>>> ===> Face length incorrect = 7.500000, should be 2.5 >>>>> support[0] = 11, cent = 8.750000, 3.750000, area = 6.250000 >>>>> support[1] = 15, cent = 8.750000, 1.250000, area = 6.250000 >>>>> >>>>> There are also errors in the orientation of normal. >>>>> >>>>> If we disable periodicity in geo file, this error goes away. >>>>> >>>> >>>> Yes, by default we only localize coordinates for cells. I can put in >>>> code to localize faces. >>>> >>> >>> Okay, I now have a MR for this: >>> https://gitlab.com/petsc/petsc/-/merge_requests/5917 >>> >>> I am attaching your code, slightly modified. You can run >>> >>> ./dmplex -malloc_debug 0 -dm_plex_box_upper 10,10 -dm_plex_box_faces >>> 4,4 -dm_plex_simplex 0 -dm_view ::ascii_info_detail -draw_pause 3 >>> -dm_plex_box_bd periodic,periodic -dm_localize_height 0 >>> >>> which shows incorrect edges and >>> >>> ./dmplex -malloc_debug 0 -dm_plex_box_upper 10,10 -dm_plex_box_faces 4,4 >>> -dm_plex_simplex 0 -dm_view ::ascii_info_detail -draw_pause 3 >>> -dm_plex_box_bd periodic,periodic -dm_localize_height 1 >>> >>> which is correct. If you want to control things yourself, instead of >>> using the option you can call DMPlexSetMaxProjectionHeight() on the >>> coordinate DM yourself. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thanks >>>>> praveen >>>>> >>>> -- >>>> >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Fri Dec 16 10:18:16 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 16 Dec 2022 11:18:16 -0500 Subject: [petsc-users] DMLocalToLocal() and DMCoarsen() for DMPlex with Fortran In-Reply-To: References: Message-ID: On Thu, Dec 15, 2022 at 10:10 PM Mike Michell wrote: > Hello PETSc developer team, > > I am a user of DMPlex in PETSc with Fortran. I have two questions: > > - Is DMLocalToLocal() now available for DMPlex with Fortran? I made > similar inquiry before: > https://www.mail-archive.com/petsc-users at mcs.anl.gov/msg44500.html > There is a DMDA example ( https://gitlab.com/petsc/petsc/-/blob/main/src/dm/tutorials/ex13f90.F90) so the Fortran binding works. However, there is no Plex implementation. I can add it to the TODO list. > - Is there any example that can see how DMCoarsen() works? I can see > either src/dm/impls/stag/tutorials/ex4.c or src/ksp/ksp/tutorials/ex65.c > from the example folder. However, it is a bit tough to get an idea of how > DMCoarsen() works. What can be the "coarsening" criteria? Is it uniformly > coarsening over the domain? or Can it be variable-gradient based? Having > more examples would be very helpful. > DMCoarsen really only applies to more structured grids. There is a definition for unstructured grids, and we have written a paper about it (https://arxiv.org/abs/1104.0261), but that code could not be maintained and was difficult to generalize. Right now, if you want unstructured coarsening, I would recommend trying DMPlexAdaptMetric() which works with MMG, or DMPlexAdaptLabel() which works with p4est. Thanks, Matt > Thanks, > Mike > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Sat Dec 17 22:57:01 2022 From: jed at jedbrown.org (Jed Brown) Date: Sat, 17 Dec 2022 21:57:01 -0700 Subject: [petsc-users] Help with input construction hang on 2-GPU CG Solve In-Reply-To: References: Message-ID: <87o7s1e2ci.fsf@jedbrown.org> I ran your code successfully with and without GPU-aware MPI. I see a bit of time in MatSetValue -- you can make it a bit faster using one MatSetValues call per row, but it's typical that assembling a matrix like this (sequentially on the host) will be more expensive than some unpreconditioned CG iterations (that don't come close to solving the problem -- use multigrid if you want to actually solve this problem). Rohan Yadav writes: > Hi, > > I'm developing a microbenchmark that runs a CG solve using PETSc on a mesh > using a 5-point stencil matrix. My code (linked here: > https://github.com/rohany/petsc-pde-benchmark/blob/main/main.cpp, only 120 > lines) works on 1 GPU and has great performance. When I move to 2 GPUs, the > program appears to get stuck in the input generation. I've literred the > code with print statements and have found out the following clues: > > * The first rank progresses through this loop: > https://github.com/rohany/petsc-pde-benchmark/blob/main/main.cpp#L44, but > then does not exit (it seems to get stuck right before rowStart == rowEnd) > * The second rank makes very few iterations through the loop for its > allotted rows. > > Therefore, neither rank makes it to the call to MatAssemblyBegin. 
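As an illustration of the one-MatSetValues-call-per-row suggestion above, a hedged sketch for a 5-point stencil (nx, ny and the stencil values here are illustrative, not taken from the benchmark code):

PetscInt rowStart, rowEnd;
PetscCall(MatGetOwnershipRange(A, &rowStart, &rowEnd));
for (PetscInt r = rowStart; r < rowEnd; ++r) {
  PetscInt    i = r / nx, j = r % nx, ncols = 0;
  PetscInt    cols[5];
  PetscScalar vals[5];
  if (i > 0)      { cols[ncols] = r - nx; vals[ncols++] = -1.0; }
  if (j > 0)      { cols[ncols] = r - 1;  vals[ncols++] = -1.0; }
  cols[ncols] = r; vals[ncols++] = 4.0;
  if (j < nx - 1) { cols[ncols] = r + 1;  vals[ncols++] = -1.0; }
  if (i < ny - 1) { cols[ncols] = r + nx; vals[ncols++] = -1.0; }
  PetscCall(MatSetValues(A, 1, &r, ncols, cols, vals, INSERT_VALUES)); /* one call per row */
}
PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));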
> > I'm running the code using the following command line on the Summit > supercomputer: > ``` > jsrun -n 2 -g 1 -c 1 -b rs -r 2 > /gpfs/alpine/scratch/rohany/csc335/petsc-pde-benchmark/main -ksp_max_it 200 > -ksp_type cg -pc_type none -ksp_atol 1e-10 -ksp_rtol 1e-10 -vec_type cuda > -mat_type aijcusparse -use_gpu_aware_mpi 0 -nx 8485 -ny 8485 > ``` > > Any suggestions will be appreciated! I feel that I have applied many of the > common petsc optimizations of preallocating my matrix row counts, so I'm > not sure what's going on with this input generation. > > Thanks, > > Rohan Yadav From rohany at alumni.cmu.edu Sat Dec 17 23:10:04 2022 From: rohany at alumni.cmu.edu (Rohan Yadav) Date: Sat, 17 Dec 2022 22:10:04 -0700 Subject: [petsc-users] Help with input construction hang on 2-GPU CG Solve In-Reply-To: <87o7s1e2ci.fsf@jedbrown.org> References: <87o7s1e2ci.fsf@jedbrown.org> Message-ID: Thanks Jed. I had tried just over-preallocating the matrix (using 10 nnz per row) and that solved the problem. I'm not sure what was wrong with my initial preallocation, but it's probably likely that things weren't hanging but just moving very slowly. Rohan On Sat, Dec 17, 2022 at 9:57 PM Jed Brown wrote: > I ran your code successfully with and without GPU-aware MPI. I see a bit > of time in MatSetValue -- you can make it a bit faster using one > MatSetValues call per row, but it's typical that assembling a matrix like > this (sequentially on the host) will be more expensive than some > unpreconditioned CG iterations (that don't come close to solving the > problem -- use multigrid if you want to actually solve this problem). > > Rohan Yadav writes: > > > Hi, > > > > I'm developing a microbenchmark that runs a CG solve using PETSc on a > mesh > > using a 5-point stencil matrix. My code (linked here: > > https://github.com/rohany/petsc-pde-benchmark/blob/main/main.cpp, only > 120 > > lines) works on 1 GPU and has great performance. When I move to 2 GPUs, > the > > program appears to get stuck in the input generation. I've literred the > > code with print statements and have found out the following clues: > > > > * The first rank progresses through this loop: > > https://github.com/rohany/petsc-pde-benchmark/blob/main/main.cpp#L44, > but > > then does not exit (it seems to get stuck right before rowStart == > rowEnd) > > * The second rank makes very few iterations through the loop for its > > allotted rows. > > > > Therefore, neither rank makes it to the call to MatAssemblyBegin. > > > > I'm running the code using the following command line on the Summit > > supercomputer: > > ``` > > jsrun -n 2 -g 1 -c 1 -b rs -r 2 > > /gpfs/alpine/scratch/rohany/csc335/petsc-pde-benchmark/main -ksp_max_it > 200 > > -ksp_type cg -pc_type none -ksp_atol 1e-10 -ksp_rtol 1e-10 -vec_type cuda > > -mat_type aijcusparse -use_gpu_aware_mpi 0 -nx 8485 -ny 8485 > > ``` > > > > Any suggestions will be appreciated! I feel that I have applied many of > the > > common petsc optimizations of preallocating my matrix row counts, so I'm > > not sure what's going on with this input generation. > > > > Thanks, > > > > Rohan Yadav > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jed at jedbrown.org Sun Dec 18 08:20:57 2022 From: jed at jedbrown.org (Jed Brown) Date: Sun, 18 Dec 2022 07:20:57 -0700 Subject: [petsc-users] dmplex normal vector incorrect for periodic gmsh grids In-Reply-To: References: <32720DD5-FA50-4766-A36A-7566913795B3@gmx.net> <4497F6D2-9B0F-450D-8767-A9BBB9CA1DF5@gmx.net> Message-ID: <87len4eqt2.fsf@jedbrown.org> Matthew Knepley writes: > On Fri, Dec 16, 2022 at 12:22 AM Praveen C wrote: > >> Thank you very much. I do see correct normals now. >> >> Is there a way to set the option >> >> -dm_localize_height 1 >>> >> >> within the code ? >> > > The problem is that the localization happens within the Gmsh construction, > so it is difficult to insert an API. > We could use a callback, but that rapidly becomes unwieldy. If you want to > do it programmatically, I would use > PetscOptionsSetValue(). Can it not be created explicitly later? This kind of thing isn't really a run-time choice, but rather a statement about the way the calling code has been written. From ksi2443 at gmail.com Sun Dec 18 08:24:58 2022 From: ksi2443 at gmail.com (=?UTF-8?B?6rmA7ISx7J21?=) Date: Sun, 18 Dec 2022 23:24:58 +0900 Subject: [petsc-users] Matrix version of VecScatterCreateToAll Message-ID: Hello, Is there a matrix version function of VecScatterCreateToAll? I'm using preallocator for preallocation of global matrix. Prior to using the preallocator, information related to allocation is stored in the specific matrix, and this information must be viewed in all mpi ranks. For using preallocator, I need scattering the specific matrix info to all mpi ranks. Thanks, Hyung Kim -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Sun Dec 18 08:36:45 2022 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Sun, 18 Dec 2022 08:36:45 -0600 Subject: [petsc-users] Matrix version of VecScatterCreateToAll In-Reply-To: References: Message-ID: I am not clear on what you said "this information must be viewed in all mpi ranks." Even with preallocator, each rank only needs to insert entries it knows (i.e., don't need to get all entries) Maybe you could provide an example code for us to better understand what you mean? --Junchao Zhang On Sun, Dec 18, 2022 at 8:25 AM ??? wrote: > Hello, > > > Is there a matrix version function of VecScatterCreateToAll? > > I'm using preallocator for preallocation of global matrix. > Prior to using the preallocator, information related to allocation is > stored in the specific matrix, and this information must be viewed in all > mpi ranks. > For using preallocator, I need scattering the specific matrix info to all > mpi ranks. > > Thanks, > Hyung Kim > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksi2443 at gmail.com Sun Dec 18 08:44:21 2022 From: ksi2443 at gmail.com (=?UTF-8?B?6rmA7ISx7J21?=) Date: Sun, 18 Dec 2022 23:44:21 +0900 Subject: [petsc-users] Matrix version of VecScatterCreateToAll In-Reply-To: References: Message-ID: For example, before using preallocaotor, I made A matrix. (A matrix has information about indices of global matrix) And the code structures are as below. for (int i=0; i?? ??: > I am not clear on what you said "this information must be viewed in all > mpi ranks." Even with preallocator, each rank only needs to insert entries > it knows (i.e., don't need to get all entries) > > Maybe you could provide an example code for us to better understand what > you mean? > > --Junchao Zhang > > > On Sun, Dec 18, 2022 at 8:25 AM ??? 
wrote: > >> Hello, >> >> >> Is there a matrix version function of VecScatterCreateToAll? >> >> I'm using preallocator for preallocation of global matrix. >> Prior to using the preallocator, information related to allocation is >> stored in the specific matrix, and this information must be viewed in all >> mpi ranks. >> For using preallocator, I need scattering the specific matrix info to all >> mpi ranks. >> >> Thanks, >> Hyung Kim >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Dec 18 12:11:08 2022 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 18 Dec 2022 13:11:08 -0500 Subject: [petsc-users] dmplex normal vector incorrect for periodic gmsh grids In-Reply-To: <87len4eqt2.fsf@jedbrown.org> References: <32720DD5-FA50-4766-A36A-7566913795B3@gmx.net> <4497F6D2-9B0F-450D-8767-A9BBB9CA1DF5@gmx.net> <87len4eqt2.fsf@jedbrown.org> Message-ID: On Sun, Dec 18, 2022 at 9:21 AM Jed Brown wrote: > Matthew Knepley writes: > > > On Fri, Dec 16, 2022 at 12:22 AM Praveen C wrote: > > > >> Thank you very much. I do see correct normals now. > >> > >> Is there a way to set the option > >> > >> -dm_localize_height 1 > >>> > >> > >> within the code ? > >> > > > > The problem is that the localization happens within the Gmsh > construction, > > so it is difficult to insert an API. > > We could use a callback, but that rapidly becomes unwieldy. If you want > to > > do it programmatically, I would use > > PetscOptionsSetValue(). > > Can it not be created explicitly later? > > This kind of thing isn't really a run-time choice, but rather a statement > about the way the calling code has been written. > You could tear down the localization, then set the height, and recreate the localization. Thanks, Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Dec 18 12:22:37 2022 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 18 Dec 2022 13:22:37 -0500 Subject: [petsc-users] Matrix version of VecScatterCreateToAll In-Reply-To: References: Message-ID: On Sun, Dec 18, 2022 at 9:44 AM ??? wrote: > For example, before using preallocaotor, I made A matrix. (A matrix has > information about indices of global matrix) > And the code structures are as below. > for (int i=0; i MatGetRow(A,~~) // for getting indices info of global matrix. > MatSetvalues(preallocator, value_from_A); > } > Current state, I got an error at 'MatGetRow' when I run this code in > parallel. > That's why I want "all mpi ranks has information of A matrix". > Just from your explanation, this does not sound like a scalable algorithm for this, and FEM should be scalable. I would take a look at some parallel FEM codes before settling on this structure. However, if you really wish to do this, you can create the original matrix and then call https://petsc.org/main/docs/manualpages/Mat/MatCreateSubMatrices/ with every process giving all indices as the argument. Thanks, Matt > Thanks, > Hyung Kim > > > 2022? 12? 18? (?) ?? 11:36, Junchao Zhang ?? ??: > >> I am not clear on what you said "this information must be viewed in all >> mpi ranks." 
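A minimal sketch of the MatCreateSubMatrices() approach suggested above, in which every rank requests all rows and columns and so ends up with its own sequential copy of the whole matrix (N is the global size; as noted, this keeps the full matrix on every rank and therefore does not scale):

    IS  allrows, allcols;
    Mat *Aseq;   /* array of length 1; Aseq[0] is the full sequential copy */

    PetscCall(ISCreateStride(PETSC_COMM_SELF, N, 0, 1, &allrows));
    PetscCall(ISCreateStride(PETSC_COMM_SELF, N, 0, 1, &allcols));
    PetscCall(MatCreateSubMatrices(A, 1, &allrows, &allcols, MAT_INITIAL_MATRIX, &Aseq));
    /* MatGetRow() can now be called on Aseq[0] for any global row */
    PetscCall(MatDestroySubMatrices(1, &Aseq));
    PetscCall(ISDestroy(&allrows));
    PetscCall(ISDestroy(&allcols));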
Even with preallocator, each rank only needs to insert entries >> it knows (i.e., don't need to get all entries) >> >> Maybe you could provide an example code for us to better understand what >> you mean? >> >> --Junchao Zhang >> >> >> On Sun, Dec 18, 2022 at 8:25 AM ??? wrote: >> >>> Hello, >>> >>> >>> Is there a matrix version function of VecScatterCreateToAll? >>> >>> I'm using preallocator for preallocation of global matrix. >>> Prior to using the preallocator, information related to allocation is >>> stored in the specific matrix, and this information must be viewed in all >>> mpi ranks. >>> For using preallocator, I need scattering the specific matrix info to >>> all mpi ranks. >>> >>> Thanks, >>> Hyung Kim >>> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.tardieu at edf.fr Mon Dec 19 05:11:43 2022 From: nicolas.tardieu at edf.fr (TARDIEU Nicolas) Date: Mon, 19 Dec 2022 11:11:43 +0000 Subject: [petsc-users] SNES and TS for nonlinear quasi-static problems Message-ID: Dear PETSc users, I plan to solve nonlinear quasi-static problems with PETSc. More precisely, these are solid mechanics problems with elasto-plasticity. So they do not involve "physical time", rather "pseudo time", which is mandatory to describe the stepping of the loading application. In general, the loading vector F(x, t) is expressed as the following product F(x, t)=F0(x)*g(t), where g is a scalar function of the pseudo-time. I see how to use a SNES in order to solve a certain step of the loading history but I wonder if a TS can be used to deal with the loading history through the definition of this g(t) function ? Furthermore, since too large load steps can lead to non-convergence, a stepping strategy is almost always required to restart a load step that failed. Does TS offer such a feature ? Thank you for your answers. Regards, Nicolas -- Nicolas Tardieu Ing PhD Computational Mechanics EDF - R&D Dpt ERMES PARIS-SACLAY, FRANCE Ce message et toutes les pi?ces jointes (ci-apr?s le 'Message') sont ?tablis ? l'intention exclusive des destinataires et les informations qui y figurent sont strictement confidentielles. Toute utilisation de ce Message non conforme ? sa destination, toute diffusion ou toute publication totale ou partielle, est interdite sauf autorisation expresse. Si vous n'?tes pas le destinataire de ce Message, il vous est interdit de le copier, de le faire suivre, de le divulguer ou d'en utiliser tout ou partie. Si vous avez re?u ce Message par erreur, merci de le supprimer de votre syst?me, ainsi que toutes ses copies, et de n'en garder aucune trace sur quelque support que ce soit. Nous vous remercions ?galement d'en avertir imm?diatement l'exp?diteur par retour du message. Il est impossible de garantir que les communications par messagerie ?lectronique arrivent en temps utile, sont s?curis?es ou d?nu?es de toute erreur ou virus. ____________________________________________________ This message and any attachments (the 'Message') are intended solely for the addressees. The information contained in this Message is confidential. Any use of information contained in this Message not in accord with its purpose, any dissemination or disclosure, either whole or partial, is prohibited except formal approval. 
If you are not the addressee, you may not copy, forward, disclose or use any part of it. If you have received this message in error, please delete it and all copies from your system and notify the sender immediately by return message. E-mail communication cannot be guaranteed to be timely secure, error or virus-free. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Dec 19 06:46:31 2022 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 19 Dec 2022 07:46:31 -0500 Subject: [petsc-users] SNES and TS for nonlinear quasi-static problems In-Reply-To: References: Message-ID: On Mon, Dec 19, 2022 at 6:12 AM TARDIEU Nicolas via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear PETSc users, > > I plan to solve nonlinear quasi-static problems with PETSc. More > precisely, these are solid mechanics problems with elasto-plasticity. > So they do not involve "physical time", rather "pseudo time", which is > mandatory to describe the stepping of the loading application. > In general, the loading vector F(x, t) is expressed as the following > product F(x, t)=F0(x)*g(t), where g is a scalar function of the > pseudo-time. > > I see how to use a SNES in order to solve a certain step of the loading > history but I wonder if a TS can be used to deal with the loading history > through the definition of this g(t) function ? > I believe so. We would want you to formulate it as a differential equation, F(x, \dot x, t) = G(x) which is your case is easy F(x, t) = 0 so you would just put your function completely into the IFunction. Since there is no \dot x term, this is a DAE, and you need to use one of the solvers for that. > Furthermore, since too large load steps can lead to non-convergence, a > stepping strategy is almost always required to restart a load step that > failed. Does TS offer such a feature ? > I think you can use PETSc adaptation. There is a PID option, which might be able to capture your behavior. Thanks, Matt > Thank you for your answers. > Regards, > Nicolas > -- > *Nicolas Tardieu* > *Ing PhD Computational Mechanics* > EDF - R&D Dpt ERMES > PARIS-SACLAY, FRANCE > > > Ce message et toutes les pi?ces jointes (ci-apr?s le 'Message') sont > ?tablis ? l'intention exclusive des destinataires et les informations qui y > figurent sont strictement confidentielles. Toute utilisation de ce Message > non conforme ? sa destination, toute diffusion ou toute publication totale > ou partielle, est interdite sauf autorisation expresse. > > Si vous n'?tes pas le destinataire de ce Message, il vous est interdit de > le copier, de le faire suivre, de le divulguer ou d'en utiliser tout ou > partie. Si vous avez re?u ce Message par erreur, merci de le supprimer de > votre syst?me, ainsi que toutes ses copies, et de n'en garder aucune trace > sur quelque support que ce soit. Nous vous remercions ?galement d'en > avertir imm?diatement l'exp?diteur par retour du message. > > Il est impossible de garantir que les communications par messagerie > ?lectronique arrivent en temps utile, sont s?curis?es ou d?nu?es de toute > erreur ou virus. > ____________________________________________________ > > This message and any attachments (the 'Message') are intended solely for > the addressees. The information contained in this Message is confidential. > Any use of information contained in this Message not in accord with its > purpose, any dissemination or disclosure, either whole or partial, is > prohibited except formal approval. 
> > If you are not the addressee, you may not copy, forward, disclose or use > any part of it. If you have received this message in error, please delete > it and all copies from your system and notify the sender immediately by > return message. > > E-mail communication cannot be guaranteed to be timely secure, error or > virus-free. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Mon Dec 19 07:40:49 2022 From: jed at jedbrown.org (Jed Brown) Date: Mon, 19 Dec 2022 06:40:49 -0700 Subject: [petsc-users] SNES and TS for nonlinear quasi-static problems In-Reply-To: References: Message-ID: <87edsvecke.fsf@jedbrown.org> Indeed, this is exactly how we do quasistatic analysis for solid mechanics in Ratel (https://gitlab.com/micromorph/ratel) -- make sure to choose an L-stable integrator (backward Euler being the most natural choice). Implicit dynamics can be done by choosing a suitable integrator, like TSALPHA2, with almost no code change to the residual (only adding the mass term in DMTSSetI2Function()). Matthew Knepley writes: > On Mon, Dec 19, 2022 at 6:12 AM TARDIEU Nicolas via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Dear PETSc users, >> >> I plan to solve nonlinear quasi-static problems with PETSc. More >> precisely, these are solid mechanics problems with elasto-plasticity. >> So they do not involve "physical time", rather "pseudo time", which is >> mandatory to describe the stepping of the loading application. >> In general, the loading vector F(x, t) is expressed as the following >> product F(x, t)=F0(x)*g(t), where g is a scalar function of the >> pseudo-time. >> >> I see how to use a SNES in order to solve a certain step of the loading >> history but I wonder if a TS can be used to deal with the loading history >> through the definition of this g(t) function ? >> > > I believe so. We would want you to formulate it as a differential equation, > > F(x, \dot x, t) = G(x) > > which is your case is easy > > F(x, t) = 0 > > so you would just put your function completely into the IFunction. Since > there is no \dot x term, this is a DAE, > and you need to use one of the solvers for that. > > >> Furthermore, since too large load steps can lead to non-convergence, a >> stepping strategy is almost always required to restart a load step that >> failed. Does TS offer such a feature ? >> > > I think you can use PETSc adaptation. There is a PID option, which might be > able to capture your behavior. > > Thanks, > > Matt > > >> Thank you for your answers. >> Regards, >> Nicolas >> -- >> *Nicolas Tardieu* >> *Ing PhD Computational Mechanics* >> EDF - R&D Dpt ERMES >> PARIS-SACLAY, FRANCE >> >> >> Ce message et toutes les pi?ces jointes (ci-apr?s le 'Message') sont >> ?tablis ? l'intention exclusive des destinataires et les informations qui y >> figurent sont strictement confidentielles. Toute utilisation de ce Message >> non conforme ? sa destination, toute diffusion ou toute publication totale >> ou partielle, est interdite sauf autorisation expresse. >> >> Si vous n'?tes pas le destinataire de ce Message, il vous est interdit de >> le copier, de le faire suivre, de le divulguer ou d'en utiliser tout ou >> partie. 
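Putting the two suggestions above together (the residual goes entirely into the IFunction, an L-stable integrator such as backward Euler is used, and failed load steps are retried with a shorter step), a minimal sketch follows. FormIFunction, FormInternalForce, g() and the AppCtx fields are placeholders for the application's own residual, internal-force routine, load multiplier and data, not PETSc or Ratel API:

    /* Quasi-static residual F(x,t) = Fint(x) - g(t)*F0 = 0; no xdot term, so a DAE */
    PetscErrorCode FormIFunction(TS ts, PetscReal t, Vec U, Vec Udot, Vec F, void *ctx)
    {
      AppCtx *user = (AppCtx *)ctx;   /* Udot is unused: no rate dependence */
      PetscFunctionBeginUser;
      PetscCall(FormInternalForce(user, U, F));   /* F <- Fint(U), user-provided */
      PetscCall(VecAXPY(F, -g(t), user->F0));     /* F <- F - g(t)*F0 */
      PetscFunctionReturn(0);
    }

    /* in the driver, with U and the user context set up elsewhere */
    TS      ts;
    TSAdapt adapt;
    PetscCall(TSCreate(PETSC_COMM_WORLD, &ts));
    PetscCall(TSSetProblemType(ts, TS_NONLINEAR));
    PetscCall(TSSetIFunction(ts, NULL, FormIFunction, &user));
    PetscCall(TSSetType(ts, TSBEULER));                 /* L-stable */
    PetscCall(TSSetTime(ts, 0.0));
    PetscCall(TSSetMaxTime(ts, 1.0));                   /* pseudo-time spanning the load history */
    PetscCall(TSSetTimeStep(ts, 0.1));                  /* initial load increment */
    PetscCall(TSSetExactFinalTime(ts, TS_EXACTFINALTIME_MATCHSTEP));
    PetscCall(TSSetMaxSNESFailures(ts, -1));            /* a failed SNES rejects the step ... */
    PetscCall(TSGetAdapt(ts, &adapt));
    PetscCall(TSAdaptSetType(adapt, TSADAPTBASIC));     /* ... and the adaptor retries it shorter */
    PetscCall(TSSetFromOptions(ts));
    PetscCall(TSSolve(ts, U));
    PetscCall(TSDestroy(&ts));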
Si vous avez re?u ce Message par erreur, merci de le supprimer de >> votre syst?me, ainsi que toutes ses copies, et de n'en garder aucune trace >> sur quelque support que ce soit. Nous vous remercions ?galement d'en >> avertir imm?diatement l'exp?diteur par retour du message. >> >> Il est impossible de garantir que les communications par messagerie >> ?lectronique arrivent en temps utile, sont s?curis?es ou d?nu?es de toute >> erreur ou virus. >> ____________________________________________________ >> >> This message and any attachments (the 'Message') are intended solely for >> the addressees. The information contained in this Message is confidential. >> Any use of information contained in this Message not in accord with its >> purpose, any dissemination or disclosure, either whole or partial, is >> prohibited except formal approval. >> >> If you are not the addressee, you may not copy, forward, disclose or use >> any part of it. If you have received this message in error, please delete >> it and all copies from your system and notify the sender immediately by >> return message. >> >> E-mail communication cannot be guaranteed to be timely secure, error or >> virus-free. >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From nicolas.tardieu at edf.fr Mon Dec 19 09:45:21 2022 From: nicolas.tardieu at edf.fr (TARDIEU Nicolas) Date: Mon, 19 Dec 2022 15:45:21 +0000 Subject: [petsc-users] SNES and TS for nonlinear quasi-static problems In-Reply-To: <87edsvecke.fsf@jedbrown.org> References: <87edsvecke.fsf@jedbrown.org> Message-ID: Dear Matt and Jed, Thank you for your answers. I'll have a look at Ratel to see how to set-up a quasi-static nonlinear problem with TS and SNES. Best regards, Nicolas -- Nicolas Tardieu Ing PhD Computational Mechanics EDF - R&D Dpt ERMES PARIS-SACLAY, FRANCE ________________________________ De : jed at jedbrown.org Envoy? : lundi 19 d?cembre 2022 14:40 ? : Matthew Knepley ; TARDIEU Nicolas Cc : petsc-users at mcs.anl.gov Objet : Re: [petsc-users] SNES and TS for nonlinear quasi-static problems Indeed, this is exactly how we do quasistatic analysis for solid mechanics in Ratel (https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.com%2Fmicromorph%2Fratel&data=05%7C01%7Cnicolas.tardieu%40edf.fr%7C08e0ffed6942443fe5e908dae1c6c819%7Ce242425b70fc44dc9ddfc21e304e6c80%7C1%7C0%7C638070541137140223%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=9E63JeIqTU1efxC8ij5OaL6XWPcErHbaVAAZwpvPY3o%3D&reserved=0) -- make sure to choose an L-stable integrator (backward Euler being the most natural choice). Implicit dynamics can be done by choosing a suitable integrator, like TSALPHA2, with almost no code change to the residual (only adding the mass term in DMTSSetI2Function()). Matthew Knepley writes: > On Mon, Dec 19, 2022 at 6:12 AM TARDIEU Nicolas via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Dear PETSc users, >> >> I plan to solve nonlinear quasi-static problems with PETSc. More >> precisely, these are solid mechanics problems with elasto-plasticity. >> So they do not involve "physical time", rather "pseudo time", which is >> mandatory to describe the stepping of the loading application. 
>> In general, the loading vector F(x, t) is expressed as the following >> product F(x, t)=F0(x)*g(t), where g is a scalar function of the >> pseudo-time. >> >> I see how to use a SNES in order to solve a certain step of the loading >> history but I wonder if a TS can be used to deal with the loading history >> through the definition of this g(t) function ? >> > > I believe so. We would want you to formulate it as a differential equation, > > F(x, \dot x, t) = G(x) > > which is your case is easy > > F(x, t) = 0 > > so you would just put your function completely into the IFunction. Since > there is no \dot x term, this is a DAE, > and you need to use one of the solvers for that. > > >> Furthermore, since too large load steps can lead to non-convergence, a >> stepping strategy is almost always required to restart a load step that >> failed. Does TS offer such a feature ? >> > > I think you can use PETSc adaptation. There is a PID option, which might be > able to capture your behavior. > > Thanks, > > Matt > > >> Thank you for your answers. >> Regards, >> Nicolas >> -- >> *Nicolas Tardieu* >> *Ing PhD Computational Mechanics* >> EDF - R&D Dpt ERMES >> PARIS-SACLAY, FRANCE >> >> >> Ce message et toutes les pi?ces jointes (ci-apr?s le 'Message') sont >> ?tablis ? l'intention exclusive des destinataires et les informations qui y >> figurent sont strictement confidentielles. Toute utilisation de ce Message >> non conforme ? sa destination, toute diffusion ou toute publication totale >> ou partielle, est interdite sauf autorisation expresse. >> >> Si vous n'?tes pas le destinataire de ce Message, il vous est interdit de >> le copier, de le faire suivre, de le divulguer ou d'en utiliser tout ou >> partie. Si vous avez re?u ce Message par erreur, merci de le supprimer de >> votre syst?me, ainsi que toutes ses copies, et de n'en garder aucune trace >> sur quelque support que ce soit. Nous vous remercions ?galement d'en >> avertir imm?diatement l'exp?diteur par retour du message. >> >> Il est impossible de garantir que les communications par messagerie >> ?lectronique arrivent en temps utile, sont s?curis?es ou d?nu?es de toute >> erreur ou virus. >> ____________________________________________________ >> >> This message and any attachments (the 'Message') are intended solely for >> the addressees. The information contained in this Message is confidential. >> Any use of information contained in this Message not in accord with its >> purpose, any dissemination or disclosure, either whole or partial, is >> prohibited except formal approval. >> >> If you are not the addressee, you may not copy, forward, disclose or use >> any part of it. If you have received this message in error, please delete >> it and all copies from your system and notify the sender immediately by >> return message. >> >> E-mail communication cannot be guaranteed to be timely secure, error or >> virus-free. >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.cse.buffalo.edu%2F~knepley%2F&data=05%7C01%7Cnicolas.tardieu%40edf.fr%7C08e0ffed6942443fe5e908dae1c6c819%7Ce242425b70fc44dc9ddfc21e304e6c80%7C1%7C0%7C638070541137140223%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=a7f7QQi5pGTnaSeqHJmCe5v7S17c4yq%2FSaXGRr0P6w0%3D&reserved=0 Ce message et toutes les pi?ces jointes (ci-apr?s le 'Message') sont ?tablis ? l'intention exclusive des destinataires et les informations qui y figurent sont strictement confidentielles. Toute utilisation de ce Message non conforme ? sa destination, toute diffusion ou toute publication totale ou partielle, est interdite sauf autorisation expresse. Si vous n'?tes pas le destinataire de ce Message, il vous est interdit de le copier, de le faire suivre, de le divulguer ou d'en utiliser tout ou partie. Si vous avez re?u ce Message par erreur, merci de le supprimer de votre syst?me, ainsi que toutes ses copies, et de n'en garder aucune trace sur quelque support que ce soit. Nous vous remercions ?galement d'en avertir imm?diatement l'exp?diteur par retour du message. Il est impossible de garantir que les communications par messagerie ?lectronique arrivent en temps utile, sont s?curis?es ou d?nu?es de toute erreur ou virus. ____________________________________________________ This message and any attachments (the 'Message') are intended solely for the addressees. The information contained in this Message is confidential. Any use of information contained in this Message not in accord with its purpose, any dissemination or disclosure, either whole or partial, is prohibited except formal approval. If you are not the addressee, you may not copy, forward, disclose or use any part of it. If you have received this message in error, please delete it and all copies from your system and notify the sender immediately by return message. E-mail communication cannot be guaranteed to be timely secure, error or virus-free. -------------- next part -------------- An HTML attachment was scrubbed... URL: From grezende.oliv at gmail.com Mon Dec 19 14:08:58 2022 From: grezende.oliv at gmail.com (Guilherme Lima) Date: Mon, 19 Dec 2022 17:08:58 -0300 Subject: [petsc-users] Installation issues Message-ID: Hello, I'm having an issue with my PETSc installation. It started with an unexpected output on src/ksp/ksp/tutorial ex50, then tried doing the configuring early steps again and here I am. It can't execute more than 1 MPI process, also at the example that I mentioned before, using "mpiexec -n 4 ... " it appeared to have executed the same interaction 4 times, instead of 1 time with 4 processors. Hope I've been clear about what is the problem. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: make.log Type: text/x-log Size: 12533 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: text/x-log Size: 1193759 bytes Desc: not available URL: From balay at mcs.anl.gov Mon Dec 19 14:31:30 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 19 Dec 2022 14:31:30 -0600 (CST) Subject: [petsc-users] Installation issues In-Reply-To: References: Message-ID: <12ded54c-1e74-bd19-a789-ed948e0a5ca3@mcs.anl.gov> This occurs if one uses an incompatible mpiexec [to the mpi that PETSc was installed with. 
I see you've installed using -download-mpich So the compatible mpiexec here is MPIEXEC = /home/slave12/projects/petsc/arch-linux-c-debug/bin/mpiexec So try: /home/slave12/projects/petsc/arch-linux-c-debug/bin/mpiexec -n 4 ./ex50 Satish On Mon, 19 Dec 2022, Guilherme Lima wrote: > Hello, > I'm having an issue with my PETSc installation. It started with an > unexpected output on src/ksp/ksp/tutorial ex50, then tried doing the > configuring early steps again and here I am. > It can't execute more than 1 MPI process, also at the example that I > mentioned before, using "mpiexec -n 4 ... " it appeared to have executed > the same interaction 4 times, instead of 1 time with 4 processors. > Hope I've been clear about what is the problem. > From grezende.oliv at gmail.com Wed Dec 21 11:11:56 2022 From: grezende.oliv at gmail.com (Guilherme Lima) Date: Wed, 21 Dec 2022 14:11:56 -0300 Subject: [petsc-users] Error while running an example Message-ID: Hello So, this error occurred while running example 50 on petsc/src/ksp/ksp/tutorials The machine also have a MPI installed, so in order to run the ex50 I did: /home/slave12/projects/petsc/arch-linux-c-debug/bin/mpiexec -n 4 ./ex50 -da_grid_x 120 -da_grid_y 120 -pc_type lu -pc_factor_mat_solver_type superlu_dist -ksp_monitor -ksp_view Expecting that the problem was solved, but instead it showed an Error message Configure.log, make.log and error attached. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: text/x-log Size: 42115 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: make.log Type: text/x-log Size: 12534 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Error.jpeg Type: image/jpeg Size: 246953 bytes Desc: not available URL: From knepley at gmail.com Wed Dec 21 11:19:28 2022 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 21 Dec 2022 12:19:28 -0500 Subject: [petsc-users] Error while running an example In-Reply-To: References: Message-ID: On Wed, Dec 21, 2022 at 12:12 PM Guilherme Lima wrote: > Hello > > So, this error occurred while running example 50 on > petsc/src/ksp/ksp/tutorials > The machine also have a MPI installed, so in order to run the ex50 I did: > > /home/slave12/projects/petsc/arch-linux-c-debug/bin/mpiexec -n 4 ./ex50 > -da_grid_x 120 -da_grid_y 120 -pc_type lu -pc_factor_mat_solver_type > superlu_dist -ksp_monitor -ksp_view > > Expecting that the problem was solved, but instead it showed an Error > message > Configure.log, make.log and error attached. > There appears to be some problem getting the name of your home directory on this machine. We could figure it out in the debugger. You can get the example to run by adding -skip_petscrc Thanks, Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Wed Dec 21 11:19:57 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 21 Dec 2022 11:19:57 -0600 (CST) Subject: [petsc-users] Error while running an example In-Reply-To: References: Message-ID: <1350ff31-bfb4-727b-56c9-0efb6173e9f2@mcs.anl.gov> Likely mis-configured network. 
What do you get for: host `hostname` Some additional info: https://petsc.org/release/faq/#what-does-it-mean-when-make-check-errors-on-petscoptionsinsertfile For mis-configured network - the following fix might work: echo 127.0.0.1 `hostname` | sudo tee -a /etc/hosts Satish On Wed, 21 Dec 2022, Guilherme Lima wrote: > Hello > > So, this error occurred while running example 50 on > petsc/src/ksp/ksp/tutorials > The machine also have a MPI installed, so in order to run the ex50 I did: > > /home/slave12/projects/petsc/arch-linux-c-debug/bin/mpiexec -n 4 ./ex50 > -da_grid_x 120 -da_grid_y 120 -pc_type lu -pc_factor_mat_solver_type > superlu_dist -ksp_monitor -ksp_view > > Expecting that the problem was solved, but instead it showed an Error > message > Configure.log, make.log and error attached. > From narnoldm at umich.edu Wed Dec 21 23:39:52 2022 From: narnoldm at umich.edu (Nicholas Arnold-Medabalimi) Date: Thu, 22 Dec 2022 00:39:52 -0500 Subject: [petsc-users] Getting a vector from a DM to output VTK Message-ID: Hi Petsc Users I've been having trouble consistently getting a vector generated from a DM to output to VTK correctly. I've used ex1.c (which works properly)to try and figure it out, but I'm still having some issues. I must be missing something small that isn't correctly associating the section with the DM. DMPlexGetChart(dm, &p0, &p1); PetscSection section_full; PetscSectionCreate(PETSC_COMM_WORLD, §ion_full); PetscSectionSetNumFields(section_full, 1); PetscSectionSetChart(section_full, p0, p1); PetscSectionSetFieldName(section_full, 0, "state"); for (int i = c0; i < c1; i++) { PetscSectionSetDof(section_full, i, 1); PetscSectionSetFieldDof(section_full, i, 0, 1); } PetscSectionSetUp(section_full); DMSetNumFields(dm, 1); DMSetLocalSection(dm, section_full); DMCreateGlobalVector(dm, &state_full); int o0, o1; VecGetOwnershipRange(state_full, &o0, &o1); PetscScalar *state_full_array; VecGetArray(state_full, &state_full_array); for (int i = 0; i < (c1 - c0); i++) { int offset; PetscSectionGetOffset(section_full, i, &offset); state_full_array[offset] = 101325 + i; } VecRestoreArray(state_full, &state_full_array); PetscViewerCreate(PETSC_COMM_WORLD, &viewer); PetscViewerSetType(viewer, PETSCVIEWERVTK); PetscViewerFileSetMode(viewer, FILE_MODE_WRITE); PetscViewerFileSetName(viewer, "mesh.vtu"); VecView(state_full, viewer); If I run this mesh.vtu isn't generated at all. If I instead do a DMView passing the DM it will just output the mesh correctly. Any assistance would be greatly appreciated. Sincerely Nicholas -- Nicholas Arnold-Medabalimi Ph.D. Candidate Computational Aeroscience Lab University of Michigan -------------- next part -------------- An HTML attachment was scrubbed... URL: From matteo.semplice at uninsubria.it Thu Dec 22 04:44:25 2022 From: matteo.semplice at uninsubria.it (Matteo Semplice) Date: Thu, 22 Dec 2022 11:44:25 +0100 Subject: [petsc-users] locate DMSwarm particles with respect to a background DMDA mesh In-Reply-To: <35ebfa58-eed5-39fa-8b3d-918ff9d7e633@uninsubria.it> References: <8a853d0e-b856-5dc2-5439-d25911d672e4@uninsubria.it> <35ebfa58-eed5-39fa-8b3d-918ff9d7e633@uninsubria.it> Message-ID: Dear everybody, ??? I have bug a bit into the code and I am able to add more information. Il 02/12/22 12:48, Matteo Semplice ha scritto: > Hi. > I am sorry to take this up again, but further tests show that it's not > right yet. 
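Regarding the earlier question about writing a DMPlex cell field to VTK: one detail that is easy to miss is that the VTK viewer generally buffers the VecView and only writes the .vtu file when the viewer is flushed or destroyed, and the Vec's object name is what ends up as the field name in the file. A sketch of the output part, reusing state_full and viewer from that message (whether this alone explains the missing mesh.vtu is an assumption):

    PetscCall(PetscObjectSetName((PetscObject)state_full, "state"));
    PetscCall(PetscViewerVTKOpen(PETSC_COMM_WORLD, "mesh.vtu", FILE_MODE_WRITE, &viewer));
    PetscCall(VecView(state_full, viewer));
    PetscCall(PetscViewerDestroy(&viewer));   /* the file is actually written here */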
> > Il 04/11/22 12:48, Matthew Knepley ha scritto: >> On Fri, Nov 4, 2022 at 7:46 AM Matteo Semplice >> wrote: >> >> On 04/11/2022 02:43, Matthew Knepley wrote: >>> On Thu, Nov 3, 2022 at 8:36 PM Matthew Knepley >>> wrote: >>> >>> On Thu, Oct 27, 2022 at 11:57 AM Semplice Matteo >>> wrote: >>> >>> Dear Petsc developers, >>> I am trying to use a DMSwarm to locate a cloud of points >>> with respect to a background mesh. In the real >>> application the points will be loaded from disk, but I >>> have created a small demo in which >>> >>> * each processor creates Npart particles, all within >>> the domain covered by the mesh, but not all in the >>> local portion of the mesh >>> * migrate the particles >>> >>> After migration most particles are not any more in the >>> DMSwarm (how many and which ones seems to depend on the >>> number of cpus, but it never happens that all particle >>> survive the migration process). >>> >>> Thanks for sending this. I found the problem. Someone has >>> some overly fancy code inside DMDA to figure out the local >>> bounding box from the coordinates. >>> It is broken for DM_BOUNDARY_GHOSTED, but we never tested >>> with this. I will fix it. >>> >>> >>> Okay, I think this fix is correct >>> >>> https://gitlab.com/petsc/petsc/-/merge_requests/5802 >>> >>> >>> I incorporated your test as src/dm/impls/da/tests/ex1.c. Can you >>> take a look and see if this fixes your issue? >> >> Yes, we have tested 2d and 3d, with various combinations of >> DM_BOUNDARY_* along different directions and it works like a charm. >> >> On a side note, neither DMSwarmViewXDMF nor DMSwarmMigrate seem >> to be implemented for 1d: I get >> >> [0]PETSC ERROR: No support for this operation for this object >> type[0]PETSC ERROR: Support not provided for 1D >> >> However, currently I have no need for this feature. >> >> Finally, if the test is meant to stay in the source, you may >> remove the call to DMSwarmRegisterPetscDatatypeField as in the >> attached patch. >> >> Thanks a lot!! >> >> Thanks! Glad it works. >> >> ? ?Matt >> > There are still problems when not using 1,2 or 4 cpus. Any other > number of cpus that I've tested does not work corectly. > I have now modified private_DMDALocatePointsIS_2D_Regular to print out some debugging information. I see that this is called twice during migration, once before and once after DMSwarmMigrate_DMNeighborScatter. If I understand correctly, the second call to private_DMDALocatePointsIS_2D_Regular should be able to locate all particles owned by the rank but it fails for some of them because they have been sent to the wrong rank (despite being well away from process boundaries). For example, running the example src/dm/impls/da/tests/ex1.c with Nx=21 (20x20 Q1 elements on [-1,1]X[-1,1]) with 3 processors, - the particles (-0.191,-0.462) and (0.191,-0.462) are sent cpu2 instead of cpu0 - those at (-0.287,-0.693)and (0.287,-0.693) are sent to cpu1 instead of cpu0 - those at (0.191,0.462) and (-0.191,0.462) are sent to cpu0 instead of cpu2 (This is 2d and thus not affected by the 3d issue mentioned yesterday on petsc-dev. Tests were made based on the release branch pulled out this morning, i.e. on commit bebdc8d016f). I attach the output separated by process. If you have any hints, they would be appreciated. Thanks ??? Matteo -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- ... calling DMSwarmMigrate ... === private_DMDALocatePointsIS_2D_Regular 0] local domain (-1.000,-1.000) ... 
(1.000,-0.400) 0] local domain 20X6 cells with dx=(0.100,0.100) 0] locating particle 0 at (0.231,0.096) 0] locating particle 1 at (0.096,0.231) 0] locating particle 2 at (-0.096,0.231) 0] locating particle 3 at (-0.231,0.096) 0] locating particle 4 at (-0.231,-0.096) 0] locating particle 5 at (-0.096,-0.231) 0] locating particle 6 at (0.096,-0.231) 0] locating particle 7 at (0.231,-0.096) === DMSwarmMigrate_DMNeighborScatter === private_DMDALocatePointsIS_2D_Regular 0] local domain (-1.000,-1.000) ... (1.000,-0.400) 0] local domain 20X6 cells with dx=(0.100,0.100) 0] locating particle 0 at (0.191,0.462) 0] locating particle 1 at (-0.191,0.462) 0] locating particle 2 at (-0.191,-0.462) 0] ---->ok, particle 2 is in cell (8,5)-->108 0] locating particle 3 at (0.191,-0.462) 0] ---->ok, particle 3 is in cell (11,5)-->111 -------------- next part -------------- === private_DMDALocatePointsIS_2D_Regular 1] local domain (-1.000,-0.400) ... (1.000,0.300) 1] local domain 20X7 cells with dx=(0.100,0.100) 1] locating particle 0 at (0.462,0.191) 1] ---->ok, particle 0 is in cell (14,11)-->114 1] locating particle 1 at (0.191,0.462) 1] locating particle 2 at (-0.191,0.462) 1] locating particle 3 at (-0.462,0.191) 1] ---->ok, particle 3 is in cell (5,11)-->105 1] locating particle 4 at (-0.462,-0.191) 1] ---->ok, particle 4 is in cell (5,8)-->45 1] locating particle 5 at (-0.191,-0.462) 1] locating particle 6 at (0.191,-0.462) 1] locating particle 7 at (0.462,-0.191) 1] ---->ok, particle 7 is in cell (14,8)-->54 === DMSwarmMigrate_DMNeighborScatter === private_DMDALocatePointsIS_2D_Regular 1] local domain (-1.000,-0.400) ... (1.000,0.300) 1] local domain 20X7 cells with dx=(0.100,0.100) 1] locating particle 0 at (0.231,0.096) 1] ---->ok, particle 0 is in cell (12,10)-->92 1] locating particle 1 at (0.096,0.231) 1] ---->ok, particle 1 is in cell (10,12)-->130 1] locating particle 2 at (-0.096,0.231) 1] ---->ok, particle 2 is in cell (9,12)-->129 1] locating particle 3 at (-0.231,0.096) 1] ---->ok, particle 3 is in cell (7,10)-->87 1] locating particle 4 at (-0.231,-0.096) 1] ---->ok, particle 4 is in cell (7,9)-->67 1] locating particle 5 at (-0.096,-0.231) 1] ---->ok, particle 5 is in cell (9,7)-->29 1] locating particle 6 at (0.096,-0.231) 1] ---->ok, particle 6 is in cell (10,7)-->30 1] locating particle 7 at (0.231,-0.096) 1] ---->ok, particle 7 is in cell (12,9)-->72 1] locating particle 8 at (0.693,0.287) 1] ---->ok, particle 8 is in cell (16,12)-->136 1] locating particle 9 at (-0.693,0.287) 1] ---->ok, particle 9 is in cell (3,12)-->123 1] locating particle 10 at (-0.693,-0.287) 1] ---->ok, particle 10 is in cell (3,7)-->23 1] locating particle 11 at (-0.287,-0.693) 1] locating particle 12 at (0.287,-0.693) 1] locating particle 13 at (0.693,-0.287) 1] ---->ok, particle 13 is in cell (16,7)-->36 -------------- next part -------------- === private_DMDALocatePointsIS_2D_Regular 2] local domain (-1.000,0.300) ... (1.000,1.000) 2] local domain 20X7 cells with dx=(0.100,0.100) 2] locating particle 0 at (0.693,0.287) 2] locating particle 1 at (0.287,0.693) 2] ---->ok, particle 1 is in cell (12,16)-->72 2] locating particle 2 at (-0.287,0.693) 2] ---->ok, particle 2 is in cell (7,16)-->67 2] locating particle 3 at (-0.693,0.287) 2] locating particle 4 at (-0.693,-0.287) 2] locating particle 5 at (-0.287,-0.693) 2] locating particle 6 at (0.287,-0.693) 2] locating particle 7 at (0.693,-0.287) === DMSwarmMigrate_DMNeighborScatter === private_DMDALocatePointsIS_2D_Regular 2] local domain (-1.000,0.300) ... 
(1.000,1.000) 2] local domain 20X7 cells with dx=(0.100,0.100) 2] locating particle 0 at (0.191,0.462) 2] ---->ok, particle 0 is in cell (11,14)-->31 2] locating particle 1 at (-0.191,0.462) 2] ---->ok, particle 1 is in cell (8,14)-->28 2] locating particle 2 at (-0.191,-0.462) 2] locating particle 3 at (0.191,-0.462) From matteo.semplice at uninsubria.it Thu Dec 22 05:28:27 2022 From: matteo.semplice at uninsubria.it (Matteo Semplice) Date: Thu, 22 Dec 2022 12:28:27 +0100 Subject: [petsc-users] locate DMSwarm particles with respect to a background DMDA mesh In-Reply-To: References: <8a853d0e-b856-5dc2-5439-d25911d672e4@uninsubria.it> <35ebfa58-eed5-39fa-8b3d-918ff9d7e633@uninsubria.it> Message-ID: <52622e96-4dfe-9105-a5a2-8d60d2fc90a8@uninsubria.it> Dear all ??? please ignore my previous email and read this one: I have better localized the problem. Maybe DMSwarmMigrate is designed to migrate particles only to first neighbouring ranks? Il 22/12/22 11:44, Matteo Semplice ha scritto: > > Dear everybody, > > ??? I have bug a bit into the code and I am able to add more information. > > Il 02/12/22 12:48, Matteo Semplice ha scritto: >> Hi. >> I am sorry to take this up again, but further tests show that it's >> not right yet. >> >> Il 04/11/22 12:48, Matthew Knepley ha scritto: >>> On Fri, Nov 4, 2022 at 7:46 AM Matteo Semplice >>> wrote: >>> >>> On 04/11/2022 02:43, Matthew Knepley wrote: >>>> On Thu, Nov 3, 2022 at 8:36 PM Matthew Knepley >>>> wrote: >>>> >>>> On Thu, Oct 27, 2022 at 11:57 AM Semplice Matteo >>>> wrote: >>>> >>>> Dear Petsc developers, >>>> I am trying to use a DMSwarm to locate a cloud of >>>> points with respect to a background mesh. In the real >>>> application the points will be loaded from disk, but I >>>> have created a small demo in which >>>> >>>> * each processor creates Npart particles, all within >>>> the domain covered by the mesh, but not all in the >>>> local portion of the mesh >>>> * migrate the particles >>>> >>>> After migration most particles are not any more in the >>>> DMSwarm (how many and which ones seems to depend on the >>>> number of cpus, but it never happens that all particle >>>> survive the migration process). >>>> >>>> Thanks for sending this. I found the problem. Someone has >>>> some overly fancy code inside DMDA to figure out the local >>>> bounding box from the coordinates. >>>> It is broken for DM_BOUNDARY_GHOSTED, but we never tested >>>> with this. I will fix it. >>>> >>>> >>>> Okay, I think this fix is correct >>>> >>>> https://gitlab.com/petsc/petsc/-/merge_requests/5802 >>>> >>>> >>>> I incorporated your test as src/dm/impls/da/tests/ex1.c. Can >>>> you take a look and see if this fixes your issue? >>> >>> Yes, we have tested 2d and 3d, with various combinations of >>> DM_BOUNDARY_* along different directions and it works like a charm. >>> >>> On a side note, neither DMSwarmViewXDMF nor DMSwarmMigrate seem >>> to be implemented for 1d: I get >>> >>> [0]PETSC ERROR: No support for this operation for this object >>> type[0]PETSC ERROR: Support not provided for 1D >>> >>> However, currently I have no need for this feature. >>> >>> Finally, if the test is meant to stay in the source, you may >>> remove the call to DMSwarmRegisterPetscDatatypeField as in the >>> attached patch. >>> >>> Thanks a lot!! >>> >>> Thanks! Glad it works. >>> >>> ? ?Matt >>> >> There are still problems when not using 1,2 or 4 cpus. Any other >> number of cpus that I've tested does not work corectly. 
>> > I have now modified private_DMDALocatePointsIS_2D_Regular to print out > some debugging information. I see that this is called twice during > migration, once before and once after > DMSwarmMigrate_DMNeighborScatter. If I understand correctly, the > second call to private_DMDALocatePointsIS_2D_Regular should be able to > locate all particles owned by the rank but it fails for some of them > because they have been sent to the wrong rank (despite being well away > from process boundaries). > > For example, running the example src/dm/impls/da/tests/ex1.c with > Nx=21 (20x20 Q1 elements on [-1,1]X[-1,1]) with 3 processors, > > - the particles (-0.191,-0.462) and (0.191,-0.462) are sent cpu2 > instead of cpu0 > > - those at (-0.287,-0.693)and (0.287,-0.693) are sent to cpu1 instead > of cpu0 > > - those at (0.191,0.462) and (-0.191,0.462) are sent to cpu0 instead > of cpu2 > > (This is 2d and thus not affected by the 3d issue mentioned yesterday > on petsc-dev. Tests were made based on the release branch pulled out > this morning, i.e. on commit bebdc8d016f). > I see: particles are sent "all around" and not only to the destination rank. Still however, running the example src/dm/impls/da/tests/ex1.c with Nx=21 (20x20 Q1 elements on [-1,1]X[-1,1]) with 3 processors, there are 2 particles initially owned by rank2 (at y=-0.6929 and x=+/-0.2870) that are sent only to rank1 and never make it to rank0 and are thus lost in the end since rank1, correctly, discards them. Thanks ??? Matteo -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Dec 22 07:02:08 2022 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 22 Dec 2022 08:02:08 -0500 Subject: [petsc-users] locate DMSwarm particles with respect to a background DMDA mesh In-Reply-To: <52622e96-4dfe-9105-a5a2-8d60d2fc90a8@uninsubria.it> References: <8a853d0e-b856-5dc2-5439-d25911d672e4@uninsubria.it> <35ebfa58-eed5-39fa-8b3d-918ff9d7e633@uninsubria.it> <52622e96-4dfe-9105-a5a2-8d60d2fc90a8@uninsubria.it> Message-ID: On Thu, Dec 22, 2022 at 6:28 AM Matteo Semplice < matteo.semplice at uninsubria.it> wrote: > Dear all > > please ignore my previous email and read this one: I have better > localized the problem. Maybe DMSwarmMigrate is designed to migrate > particles only to first neighbouring ranks? > Yes, I believe that was the design. Dave, is this correct? Thanks, Matt > Il 22/12/22 11:44, Matteo Semplice ha scritto: > > Dear everybody, > > I have bug a bit into the code and I am able to add more information. > Il 02/12/22 12:48, Matteo Semplice ha scritto: > > Hi. > I am sorry to take this up again, but further tests show that it's not > right yet. > > Il 04/11/22 12:48, Matthew Knepley ha scritto: > > On Fri, Nov 4, 2022 at 7:46 AM Matteo Semplice < > matteo.semplice at uninsubria.it> wrote: > >> On 04/11/2022 02:43, Matthew Knepley wrote: >> >> On Thu, Nov 3, 2022 at 8:36 PM Matthew Knepley wrote: >> >>> On Thu, Oct 27, 2022 at 11:57 AM Semplice Matteo < >>> matteo.semplice at uninsubria.it> wrote: >>> >>>> Dear Petsc developers, >>>> I am trying to use a DMSwarm to locate a cloud of points with >>>> respect to a background mesh. 
In the real application the points will be >>>> loaded from disk, but I have created a small demo in which >>>> >>>> - each processor creates Npart particles, all within the domain >>>> covered by the mesh, but not all in the local portion of the mesh >>>> - migrate the particles >>>> >>>> After migration most particles are not any more in the DMSwarm (how >>>> many and which ones seems to depend on the number of cpus, but it never >>>> happens that all particle survive the migration process). >>>> >>>> Thanks for sending this. I found the problem. Someone has some overly >>> fancy code inside DMDA to figure out the local bounding box from the >>> coordinates. >>> It is broken for DM_BOUNDARY_GHOSTED, but we never tested with this. I >>> will fix it. >>> >> >> Okay, I think this fix is correct >> >> https://gitlab.com/petsc/petsc/-/merge_requests/5802 >> >> >> I incorporated your test as src/dm/impls/da/tests/ex1.c. Can you take a >> look and see if this fixes your issue? >> >> Yes, we have tested 2d and 3d, with various combinations of DM_BOUNDARY_* >> along different directions and it works like a charm. >> >> On a side note, neither DMSwarmViewXDMF nor DMSwarmMigrate seem to be >> implemented for 1d: I get >> >> [0]PETSC ERROR: No support for this operation for this object type >> [0]PETSC >> ERROR: Support not provided for 1D >> >> However, currently I have no need for this feature. >> >> Finally, if the test is meant to stay in the source, you may remove the >> call to DMSwarmRegisterPetscDatatypeField as in the attached patch. >> >> Thanks a lot!! >> > Thanks! Glad it works. > > Matt > > There are still problems when not using 1,2 or 4 cpus. Any other number of > cpus that I've tested does not work corectly. > > I have now modified private_DMDALocatePointsIS_2D_Regular to print out > some debugging information. I see that this is called twice during > migration, once before and once after DMSwarmMigrate_DMNeighborScatter. If > I understand correctly, the second call to > private_DMDALocatePointsIS_2D_Regular should be able to locate all > particles owned by the rank but it fails for some of them because they have > been sent to the wrong rank (despite being well away from process > boundaries). > > For example, running the example src/dm/impls/da/tests/ex1.c with Nx=21 > (20x20 Q1 elements on [-1,1]X[-1,1]) with 3 processors, > > - the particles (-0.191,-0.462) and (0.191,-0.462) are sent cpu2 instead > of cpu0 > > - those at (-0.287,-0.693)and (0.287,-0.693) are sent to cpu1 instead of > cpu0 > > - those at (0.191,0.462) and (-0.191,0.462) are sent to cpu0 instead of > cpu2 > > (This is 2d and thus not affected by the 3d issue mentioned yesterday on > petsc-dev. Tests were made based on the release branch pulled out this > morning, i.e. on commit bebdc8d016f). > > I see: particles are sent "all around" and not only to the destination > rank. > > Still however, running the example src/dm/impls/da/tests/ex1.c with Nx=21 > (20x20 Q1 elements on [-1,1]X[-1,1]) with 3 processors, there are 2 > particles initially owned by rank2 (at y=-0.6929 and x=+/-0.2870) that are > sent only to rank1 and never make it to rank0 and are thus lost in the end > since rank1, correctly, discards them. > > Thanks > > Matteo > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From matteo.semplice at uninsubria.it Thu Dec 22 07:59:56 2022 From: matteo.semplice at uninsubria.it (Matteo Semplice) Date: Thu, 22 Dec 2022 14:59:56 +0100 Subject: [petsc-users] locate DMSwarm particles with respect to a background DMDA mesh In-Reply-To: References: <8a853d0e-b856-5dc2-5439-d25911d672e4@uninsubria.it> <35ebfa58-eed5-39fa-8b3d-918ff9d7e633@uninsubria.it> <52622e96-4dfe-9105-a5a2-8d60d2fc90a8@uninsubria.it> Message-ID: <638d57ef-8e0c-d1d7-38c6-dbb7eb3b384d@uninsubria.it> Il 22/12/22 14:02, Matthew Knepley ha scritto: > On Thu, Dec 22, 2022 at 6:28 AM Matteo Semplice > wrote: > > Dear all > > ??? please ignore my previous email and read this one: I have > better localized the problem. Maybe DMSwarmMigrate is designed to > migrate particles only to first neighbouring ranks? > > Yes, I believe that was the design. > > Dave, is this correct? This is totally understandable (but would be worth pointing it out in the documentation of DMSwarmMigrate). It would then be useful to have the DMSwarmSetMigrateType routine implemented so that I can put a PIC Swarm into MIGRATE_BASIC mode. The routine is mentioned in the documentation, but it seems not implemented. Could you please add it when you have a minute? Thanks ??? Matteo -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Thu Dec 22 11:40:01 2022 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 22 Dec 2022 09:40:01 -0800 Subject: [petsc-users] locate DMSwarm particles with respect to a background DMDA mesh In-Reply-To: References: <8a853d0e-b856-5dc2-5439-d25911d672e4@uninsubria.it> <35ebfa58-eed5-39fa-8b3d-918ff9d7e633@uninsubria.it> <52622e96-4dfe-9105-a5a2-8d60d2fc90a8@uninsubria.it> Message-ID: Hey Matt, On Thu 22. Dec 2022 at 05:02, Matthew Knepley wrote: > On Thu, Dec 22, 2022 at 6:28 AM Matteo Semplice < > matteo.semplice at uninsubria.it> wrote: > >> Dear all >> >> please ignore my previous email and read this one: I have better >> localized the problem. Maybe DMSwarmMigrate is designed to migrate >> particles only to first neighbouring ranks? >> > Yes, I believe that was the design. > > Dave, is this correct? > Correct. DMSwarmMigrate_DMNeighborScatter() only scatter points to the neighbour ranks - where neighbours are defined by the DM provided to represent the mesh. DMSwarmMigrate_DMNeighborScatter() Is selected by default if you attach a DM. The scatter method should be over ridden with DMSwarmSetMigrateType() however it appears this method no longer exists. If one can determine the exact rank where points should should be sent and it is not going to be the neighbour rank (given by the DM), I would suggest not attaching the DM at all. However if this is not possible and one wanted to scatter to say the neighbours neighbours, we will have to add a new interface and refactor things a little bit. Cheers Dave > Thanks, > > Matt > > >> Il 22/12/22 11:44, Matteo Semplice ha scritto: >> >> Dear everybody, >> >> I have bug a bit into the code and I am able to add more information. >> Il 02/12/22 12:48, Matteo Semplice ha scritto: >> >> Hi. >> I am sorry to take this up again, but further tests show that it's not >> right yet. 
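For reference, the neighbour-scatter path described above is the one exercised by the usual PIC set-up against a background DMDA, roughly as follows; da is the background DMDA, npart is a placeholder particle count, and the coordinate filling is only indicated:

    DM         sw;
    PetscReal *coor;

    PetscCall(DMCreate(PETSC_COMM_WORLD, &sw));
    PetscCall(DMSetType(sw, DMSWARM));
    PetscCall(DMSetDimension(sw, 2));
    PetscCall(DMSwarmSetType(sw, DMSWARM_PIC));
    PetscCall(DMSwarmSetCellDM(sw, da));        /* "neighbour" is defined by this DM */
    PetscCall(DMSwarmFinalizeFieldRegister(sw));
    PetscCall(DMSwarmSetLocalSizes(sw, npart, 4));
    PetscCall(DMSwarmGetField(sw, DMSwarmPICField_coor, NULL, NULL, (void **)&coor));
    /* ... fill coor[2*p], coor[2*p+1] for each local particle p ... */
    PetscCall(DMSwarmRestoreField(sw, DMSwarmPICField_coor, NULL, NULL, (void **)&coor));
    PetscCall(DMSwarmMigrate(sw, PETSC_TRUE));  /* neighbour scatter by default with a cell DM attached */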
>> >> Il 04/11/22 12:48, Matthew Knepley ha scritto: >> >> On Fri, Nov 4, 2022 at 7:46 AM Matteo Semplice < >> matteo.semplice at uninsubria.it> wrote: >> >>> On 04/11/2022 02:43, Matthew Knepley wrote: >>> >>> On Thu, Nov 3, 2022 at 8:36 PM Matthew Knepley >>> wrote: >>> >>>> On Thu, Oct 27, 2022 at 11:57 AM Semplice Matteo < >>>> matteo.semplice at uninsubria.it> wrote: >>>> >>>>> Dear Petsc developers, >>>>> I am trying to use a DMSwarm to locate a cloud of points with >>>>> respect to a background mesh. In the real application the points will be >>>>> loaded from disk, but I have created a small demo in which >>>>> >>>>> - each processor creates Npart particles, all within the domain >>>>> covered by the mesh, but not all in the local portion of the mesh >>>>> - migrate the particles >>>>> >>>>> After migration most particles are not any more in the DMSwarm (how >>>>> many and which ones seems to depend on the number of cpus, but it never >>>>> happens that all particle survive the migration process). >>>>> >>>>> Thanks for sending this. I found the problem. Someone has some overly >>>> fancy code inside DMDA to figure out the local bounding box from the >>>> coordinates. >>>> It is broken for DM_BOUNDARY_GHOSTED, but we never tested with this. I >>>> will fix it. >>>> >>> >>> Okay, I think this fix is correct >>> >>> https://gitlab.com/petsc/petsc/-/merge_requests/5802 >>> >>> >>> I incorporated your test as src/dm/impls/da/tests/ex1.c. Can you take a >>> look and see if this fixes your issue? >>> >>> Yes, we have tested 2d and 3d, with various combinations of >>> DM_BOUNDARY_* along different directions and it works like a charm. >>> >>> On a side note, neither DMSwarmViewXDMF nor DMSwarmMigrate seem to be >>> implemented for 1d: I get >>> >>> [0]PETSC ERROR: No support for this operation for this object type >>> [0]PETSC >>> ERROR: Support not provided for 1D >>> >>> However, currently I have no need for this feature. >>> >>> Finally, if the test is meant to stay in the source, you may remove the >>> call to DMSwarmRegisterPetscDatatypeField as in the attached patch. >>> >>> Thanks a lot!! >>> >> Thanks! Glad it works. >> >> Matt >> >> There are still problems when not using 1,2 or 4 cpus. Any other number >> of cpus that I've tested does not work corectly. >> >> I have now modified private_DMDALocatePointsIS_2D_Regular to print out >> some debugging information. I see that this is called twice during >> migration, once before and once after DMSwarmMigrate_DMNeighborScatter. If >> I understand correctly, the second call to >> private_DMDALocatePointsIS_2D_Regular should be able to locate all >> particles owned by the rank but it fails for some of them because they have >> been sent to the wrong rank (despite being well away from process >> boundaries). >> >> For example, running the example src/dm/impls/da/tests/ex1.c with Nx=21 >> (20x20 Q1 elements on [-1,1]X[-1,1]) with 3 processors, >> >> - the particles (-0.191,-0.462) and (0.191,-0.462) are sent cpu2 instead >> of cpu0 >> >> - those at (-0.287,-0.693)and (0.287,-0.693) are sent to cpu1 instead of >> cpu0 >> >> - those at (0.191,0.462) and (-0.191,0.462) are sent to cpu0 instead of >> cpu2 >> >> (This is 2d and thus not affected by the 3d issue mentioned yesterday on >> petsc-dev. Tests were made based on the release branch pulled out this >> morning, i.e. on commit bebdc8d016f). >> >> I see: particles are sent "all around" and not only to the destination >> rank. 
>> >> Still however, running the example src/dm/impls/da/tests/ex1.c with Nx=21 >> (20x20 Q1 elements on [-1,1]X[-1,1]) with 3 processors, there are 2 >> particles initially owned by rank2 (at y=-0.6929 and x=+/-0.2870) that are >> sent only to rank1 and never make it to rank0 and are thus lost in the end >> since rank1, correctly, discards them. >> >> Thanks >> >> Matteo >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matteo.semplice at uninsubria.it Thu Dec 22 12:27:45 2022 From: matteo.semplice at uninsubria.it (Matteo Semplice) Date: Thu, 22 Dec 2022 19:27:45 +0100 Subject: [petsc-users] locate DMSwarm particles with respect to a background DMDA mesh In-Reply-To: References: <8a853d0e-b856-5dc2-5439-d25911d672e4@uninsubria.it> <35ebfa58-eed5-39fa-8b3d-918ff9d7e633@uninsubria.it> <52622e96-4dfe-9105-a5a2-8d60d2fc90a8@uninsubria.it> Message-ID: <869ea7bc-52fd-c876-4278-37a8e829af8a@uninsubria.it> Dear Dave and Matt, ??? I am really dealing with two different use cases in a code that will compute a levelset function passing through a large set of points. If I had DMSwarmSetMigrateType() and if it were safe to switch the migration mode back and forth in the same swarm, this would cover all my use cases here. Is it safe to add it back to petsc? Details below if you are curious. 1) During preprocessing I am loading a point cloud from disk (in whatever order it comes) and need to send the particles to the right ranks. Since the background DM is a DMDA I can easily figure out the destination rank. This would be covered by your suggestion not to attach the DM, except that later I need to locate these points with respect to the background cells in order to initialize data on the Vecs associated to the DMDA. 2) Then I need to implement a semilagrangian time evolution scheme. For this I'd like to send particles around at the "foot of characteristic", collect data there and then send them back to the originating point. The first migration would be based on particle coordinates (DMSwarmMigrate_DMNeighborScatter and the restriction to only neighbouring ranks is perfect), while for the second move it would be easier to just send them back to the originating rank, which I can easily store in an Int field in the swarm. Thus at each timestep I'd need to swap migrate types in this swarm (DMScatter for moving them to the feet and BASIC to send them back). Thanks ??? Matteo Il 22/12/22 18:40, Dave May ha scritto: > Hey Matt, > > On Thu 22. Dec 2022 at 05:02, Matthew Knepley wrote: > > On Thu, Dec 22, 2022 at 6:28 AM Matteo Semplice > wrote: > > Dear all > > ??? please ignore my previous email and read this one: I have > better localized the problem. Maybe DMSwarmMigrate is designed > to migrate particles only to first neighbouring ranks? > > Yes, I believe that was the design. > > Dave, is this correct? > > > Correct. DMSwarmMigrate_DMNeighborScatter() only scatter points to the > neighbour ranks - where neighbours are defined by the DM provided to > represent the mesh. > > DMSwarmMigrate_DMNeighborScatter() Is selected by default if you > attach a DM. > > The scatter method should be over ridden with > > DMSwarmSetMigrateType() > > however it appears this method no longer exists. 
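For the bookkeeping in point 2) above (remembering which rank each particle started from so it can be sent back after data collection), a minimal sketch with a user-registered integer field is given below; the field name "origin_rank" is made up, and whether the return trip can reuse a rank-directed (BASIC-style) migrate is exactly the DMSwarmSetMigrateType question being discussed:

    PetscMPIInt rank;
    PetscInt    npoints, p, *origin;

    PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));
    /* during swarm set-up, before DMSwarmFinalizeFieldRegister() */
    PetscCall(DMSwarmRegisterPetscDatatypeField(sw, "origin_rank", 1, PETSC_INT));

    /* stamp the owning rank before the first migration */
    PetscCall(DMSwarmGetLocalSize(sw, &npoints));
    PetscCall(DMSwarmGetField(sw, "origin_rank", NULL, NULL, (void **)&origin));
    for (p = 0; p < npoints; p++) origin[p] = rank;
    PetscCall(DMSwarmRestoreField(sw, "origin_rank", NULL, NULL, (void **)&origin));

    /* move the particles to the feet of the characteristics and migrate; all
       registered fields, origin_rank included, travel with the particles */
    PetscCall(DMSwarmMigrate(sw, PETSC_TRUE));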
> > If one can determine the exact rank where points should should be sent > and it is not going to be the neighbour rank (given by the DM), I > would suggest not attaching the DM at all. > > However if this is not possible and one wanted to scatter to say the > neighbours neighbours, we will have to add a new interface and > refactor things a little bit. > > Cheers > Dave > > > > ? Thanks, > > ? ? Matt > > Il 22/12/22 11:44, Matteo Semplice ha scritto: >> >> Dear everybody, >> >> ??? I have bug a bit into the code and I am able to add more >> information. >> >> Il 02/12/22 12:48, Matteo Semplice ha scritto: >>> Hi. >>> I am sorry to take this up again, but further tests show >>> that it's not right yet. >>> >>> Il 04/11/22 12:48, Matthew Knepley ha scritto: >>>> On Fri, Nov 4, 2022 at 7:46 AM Matteo Semplice >>>> wrote: >>>> >>>> On 04/11/2022 02:43, Matthew Knepley wrote: >>>>> On Thu, Nov 3, 2022 at 8:36 PM Matthew Knepley >>>>> wrote: >>>>> >>>>> On Thu, Oct 27, 2022 at 11:57 AM Semplice Matteo >>>>> wrote: >>>>> >>>>> Dear Petsc developers, >>>>> I am trying to use a DMSwarm to locate a cloud >>>>> of points with respect to a background mesh. >>>>> In the real application the points will be >>>>> loaded from disk, but I have created a small >>>>> demo in which >>>>> >>>>> * each processor creates Npart particles, >>>>> all within the domain covered by the mesh, >>>>> but not all in the local portion of the mesh >>>>> * migrate the particles >>>>> >>>>> After migration most particles are not any >>>>> more in the DMSwarm (how many and which ones >>>>> seems to depend on the number of cpus, but it >>>>> never happens that all particle survive the >>>>> migration process). >>>>> >>>>> Thanks for sending this. I found the problem. >>>>> Someone has some overly fancy code inside DMDA to >>>>> figure out the local bounding box from the >>>>> coordinates. >>>>> It is broken for DM_BOUNDARY_GHOSTED, but we never >>>>> tested with this. I will fix it. >>>>> >>>>> >>>>> Okay, I think this fix is correct >>>>> >>>>> https://gitlab.com/petsc/petsc/-/merge_requests/5802 >>>>> >>>>> >>>>> I incorporated your test as >>>>> src/dm/impls/da/tests/ex1.c. Can you take a look and >>>>> see if this fixes your issue? >>>> >>>> Yes, we have tested 2d and 3d, with various >>>> combinations of DM_BOUNDARY_* along different >>>> directions and it works like a charm. >>>> >>>> On a side note, neither DMSwarmViewXDMF nor >>>> DMSwarmMigrate seem to be implemented for 1d: I get >>>> >>>> [0]PETSC ERROR: No support for this operation for this >>>> object type[0]PETSC ERROR: Support not provided for 1D >>>> >>>> However, currently I have no need for this feature. >>>> >>>> Finally, if the test is meant to stay in the source, >>>> you may remove the call to >>>> DMSwarmRegisterPetscDatatypeField as in the attached patch. >>>> >>>> Thanks a lot!! >>>> >>>> Thanks! Glad it works. >>>> >>>> ? ?Matt >>>> >>> There are still problems when not using 1,2 or 4 cpus. Any >>> other number of cpus that I've tested does not work corectly. >>> >> I have now modified private_DMDALocatePointsIS_2D_Regular to >> print out some debugging information. I see that this is >> called twice during migration, once before and once after >> DMSwarmMigrate_DMNeighborScatter. 
If I understand correctly, >> the second call to private_DMDALocatePointsIS_2D_Regular >> should be able to locate all particles owned by the rank but >> it fails for some of them because they have been sent to the >> wrong rank (despite being well away from process boundaries). >> >> For example, running the example src/dm/impls/da/tests/ex1.c >> with Nx=21 (20x20 Q1 elements on [-1,1]X[-1,1]) with 3 >> processors, >> >> - the particles (-0.191,-0.462) and (0.191,-0.462) are sent >> cpu2 instead of cpu0 >> >> - those at (-0.287,-0.693)and (0.287,-0.693) are sent to cpu1 >> instead of cpu0 >> >> - those at (0.191,0.462) and (-0.191,0.462) are sent to cpu0 >> instead of cpu2 >> >> (This is 2d and thus not affected by the 3d issue mentioned >> yesterday on petsc-dev. Tests were made based on the release >> branch pulled out this morning, i.e. on commit bebdc8d016f). >> > I see: particles are sent "all around" and not only to the > destination rank. > > Still however, running the example src/dm/impls/da/tests/ex1.c > with Nx=21 (20x20 Q1 elements on [-1,1]X[-1,1]) with 3 > processors, there are 2 particles initially owned by rank2 (at > y=-0.6929 and x=+/-0.2870) that are sent only to rank1 and > never make it to rank0 and are thus lost in the end since > rank1, correctly, discards them. > > Thanks > > ??? Matteo > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to > which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- --- Professore Associato in Analisi Numerica Dipartimento di Scienza e Alta Tecnologia Universit? degli Studi dell'Insubria Via Valleggio, 11 - Como -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Thu Dec 22 13:06:06 2022 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 22 Dec 2022 11:06:06 -0800 Subject: [petsc-users] locate DMSwarm particles with respect to a background DMDA mesh In-Reply-To: <869ea7bc-52fd-c876-4278-37a8e829af8a@uninsubria.it> References: <8a853d0e-b856-5dc2-5439-d25911d672e4@uninsubria.it> <35ebfa58-eed5-39fa-8b3d-918ff9d7e633@uninsubria.it> <52622e96-4dfe-9105-a5a2-8d60d2fc90a8@uninsubria.it> <869ea7bc-52fd-c876-4278-37a8e829af8a@uninsubria.it> Message-ID: On Thu 22. Dec 2022 at 10:27, Matteo Semplice wrote: > Dear Dave and Matt, > > I am really dealing with two different use cases in a code that will > compute a levelset function passing through a large set of points. If I had > DMSwarmSetMigrateType() and if it were safe to switch the migration mode > back and forth in the same swarm, this would cover all my use cases here. > Is it safe to add it back to petsc? Details below if you are curious. > > 1) During preprocessing I am loading a point cloud from disk (in whatever > order it comes) and need to send the particles to the right ranks. Since > the background DM is a DMDA I can easily figure out the destination rank. > This would be covered by your suggestion not to attach the DM, except that > later I need to locate these points with respect to the background cells in > order to initialize data on the Vecs associated to the DMDA. > > 2) Then I need to implement a semilagrangian time evolution scheme. For > this I'd like to send particles around at the "foot of characteristic", > collect data there and then send them back to the originating point. 
The > first migration would be based on particle coordinates > (DMSwarmMigrate_DMNeighborScatter and the restriction to only neighbouring > ranks is perfect), while for the second move it would be easier to just > send them back to the originating rank, which I can easily store in an Int > field in the swarm. Thus at each timestep I'd need to swap migrate types in > this swarm (DMScatter for moving them to the feet and BASIC to send them > back). > When you use BASIC, you would have to explicitly call the point location routine from your code as BASIC does not interact with the DM. Based on what I see in the code, switching migrate modes between basic and dmneighbourscatter should be safe. If you are fine calling the point location from your side then what you propose should work. Cheers Dave > Thanks > > Matteo > Il 22/12/22 18:40, Dave May ha scritto: > > Hey Matt, > > On Thu 22. Dec 2022 at 05:02, Matthew Knepley wrote: > >> On Thu, Dec 22, 2022 at 6:28 AM Matteo Semplice < >> matteo.semplice at uninsubria.it> wrote: >> >>> Dear all >>> >>> please ignore my previous email and read this one: I have better >>> localized the problem. Maybe DMSwarmMigrate is designed to migrate >>> particles only to first neighbouring ranks? >>> >> Yes, I believe that was the design. >> >> Dave, is this correct? >> > > Correct. DMSwarmMigrate_DMNeighborScatter() only scatter points to the > neighbour ranks - where neighbours are defined by the DM provided to > represent the mesh. > > DMSwarmMigrate_DMNeighborScatter() Is selected by default if you attach a > DM. > > The scatter method should be over ridden with > > DMSwarmSetMigrateType() > > however it appears this method no longer exists. > > If one can determine the exact rank where points should should be sent and > it is not going to be the neighbour rank (given by the DM), I would suggest > not attaching the DM at all. > > However if this is not possible and one wanted to scatter to say the > neighbours neighbours, we will have to add a new interface and refactor > things a little bit. > > Cheers > Dave > > > >> Thanks, >> >> Matt >> >> >>> Il 22/12/22 11:44, Matteo Semplice ha scritto: >>> >>> Dear everybody, >>> >>> I have bug a bit into the code and I am able to add more information. >>> Il 02/12/22 12:48, Matteo Semplice ha scritto: >>> >>> Hi. >>> I am sorry to take this up again, but further tests show that it's not >>> right yet. >>> >>> Il 04/11/22 12:48, Matthew Knepley ha scritto: >>> >>> On Fri, Nov 4, 2022 at 7:46 AM Matteo Semplice < >>> matteo.semplice at uninsubria.it> wrote: >>> >>>> On 04/11/2022 02:43, Matthew Knepley wrote: >>>> >>>> On Thu, Nov 3, 2022 at 8:36 PM Matthew Knepley >>>> wrote: >>>> >>>>> On Thu, Oct 27, 2022 at 11:57 AM Semplice Matteo < >>>>> matteo.semplice at uninsubria.it> wrote: >>>>> >>>>>> Dear Petsc developers, >>>>>> I am trying to use a DMSwarm to locate a cloud of points with >>>>>> respect to a background mesh. In the real application the points will be >>>>>> loaded from disk, but I have created a small demo in which >>>>>> >>>>>> - each processor creates Npart particles, all within the domain >>>>>> covered by the mesh, but not all in the local portion of the mesh >>>>>> - migrate the particles >>>>>> >>>>>> After migration most particles are not any more in the DMSwarm (how >>>>>> many and which ones seems to depend on the number of cpus, but it never >>>>>> happens that all particle survive the migration process). >>>>>> >>>>>> Thanks for sending this. I found the problem. 
Someone has some overly >>>>> fancy code inside DMDA to figure out the local bounding box from the >>>>> coordinates. >>>>> It is broken for DM_BOUNDARY_GHOSTED, but we never tested with this. I >>>>> will fix it. >>>>> >>>> >>>> Okay, I think this fix is correct >>>> >>>> https://gitlab.com/petsc/petsc/-/merge_requests/5802 >>>> >>>> >>>> I incorporated your test as src/dm/impls/da/tests/ex1.c. Can you take a >>>> look and see if this fixes your issue? >>>> >>>> Yes, we have tested 2d and 3d, with various combinations of >>>> DM_BOUNDARY_* along different directions and it works like a charm. >>>> >>>> On a side note, neither DMSwarmViewXDMF nor DMSwarmMigrate seem to be >>>> implemented for 1d: I get >>>> >>>> [0]PETSC ERROR: No support for this operation for this object type >>>> [0]PETSC >>>> ERROR: Support not provided for 1D >>>> >>>> However, currently I have no need for this feature. >>>> >>>> Finally, if the test is meant to stay in the source, you may remove the >>>> call to DMSwarmRegisterPetscDatatypeField as in the attached patch. >>>> >>>> Thanks a lot!! >>>> >>> Thanks! Glad it works. >>> >>> Matt >>> >>> There are still problems when not using 1,2 or 4 cpus. Any other number >>> of cpus that I've tested does not work corectly. >>> >>> I have now modified private_DMDALocatePointsIS_2D_Regular to print out >>> some debugging information. I see that this is called twice during >>> migration, once before and once after DMSwarmMigrate_DMNeighborScatter. If >>> I understand correctly, the second call to >>> private_DMDALocatePointsIS_2D_Regular should be able to locate all >>> particles owned by the rank but it fails for some of them because they have >>> been sent to the wrong rank (despite being well away from process >>> boundaries). >>> >>> For example, running the example src/dm/impls/da/tests/ex1.c with Nx=21 >>> (20x20 Q1 elements on [-1,1]X[-1,1]) with 3 processors, >>> >>> - the particles (-0.191,-0.462) and (0.191,-0.462) are sent cpu2 instead >>> of cpu0 >>> >>> - those at (-0.287,-0.693)and (0.287,-0.693) are sent to cpu1 instead of >>> cpu0 >>> >>> - those at (0.191,0.462) and (-0.191,0.462) are sent to cpu0 instead of >>> cpu2 >>> >>> (This is 2d and thus not affected by the 3d issue mentioned yesterday on >>> petsc-dev. Tests were made based on the release branch pulled out this >>> morning, i.e. on commit bebdc8d016f). >>> >>> I see: particles are sent "all around" and not only to the destination >>> rank. >>> >>> Still however, running the example src/dm/impls/da/tests/ex1.c with >>> Nx=21 (20x20 Q1 elements on [-1,1]X[-1,1]) with 3 processors, there are 2 >>> particles initially owned by rank2 (at y=-0.6929 and x=+/-0.2870) that are >>> sent only to rank1 and never make it to rank0 and are thus lost in the end >>> since rank1, correctly, discards them. >>> >>> Thanks >>> >>> Matteo >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- > --- > Professore Associato in Analisi Numerica > Dipartimento di Scienza e Alta Tecnologia > Universit? degli Studi dell'InsubriaVia Valleggio, 11 - Como > > -------------- next part -------------- An HTML attachment was scrubbed... 
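To make the interface being discussed concrete, the sketch below shows how the two migration modes could be swapped inside one time step of the semilagrangian scheme described above. It assumes DMSwarmSetMigrateType() is available again (it is restored later in the thread through merge request 5941) and that the swarm sw already has the cell DM attached; this is an illustration of the intended call sequence, not code taken from PETSc.

    /* forward step: let the attached cell DM route particles to the neighbouring ranks */
    PetscCall(DMSwarmSetMigrateType(sw, DMSWARM_MIGRATE_DMCELLNSCATTER));
    PetscCall(DMSwarmMigrate(sw, PETSC_TRUE));
    /* ... collect data at the foot of the characteristic ... */

    /* backward step: send each particle to whatever rank is stored in its DMSwarm_rank field */
    PetscCall(DMSwarmSetMigrateType(sw, DMSWARM_MIGRATE_BASIC));
    PetscCall(DMSwarmMigrate(sw, PETSC_TRUE));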
URL: From matteo.semplice at uninsubria.it Thu Dec 22 14:08:27 2022 From: matteo.semplice at uninsubria.it (Matteo Semplice) Date: Thu, 22 Dec 2022 21:08:27 +0100 Subject: [petsc-users] locate DMSwarm particles with respect to a background DMDA mesh In-Reply-To: References: <8a853d0e-b856-5dc2-5439-d25911d672e4@uninsubria.it> <35ebfa58-eed5-39fa-8b3d-918ff9d7e633@uninsubria.it> <52622e96-4dfe-9105-a5a2-8d60d2fc90a8@uninsubria.it> <869ea7bc-52fd-c876-4278-37a8e829af8a@uninsubria.it> Message-ID: <08423cfc-4301-6b76-a791-b5c642198ecf@uninsubria.it> Il 22/12/22 20:06, Dave May ha scritto: > > > On Thu 22. Dec 2022 at 10:27, Matteo Semplice > wrote: > > Dear Dave and Matt, > > ??? I am really dealing with two different use cases in a code > that will compute a levelset function passing through a large set > of points. If I had DMSwarmSetMigrateType() and if it were safe to > switch the migration mode back and forth in the same swarm, this > would cover all my use cases here. Is it safe to add it back to > petsc? Details below if you are curious. > > 1) During preprocessing I am loading a point cloud from disk (in > whatever order it comes) and need to send the particles to the > right ranks. Since the background DM is a DMDA I can easily figure > out the destination rank. This would be covered by your suggestion > not to attach the DM, except that later I need to locate these > points with respect to the background cells in order to initialize > data on the Vecs associated to the DMDA. > > 2) Then I need to implement a semilagrangian time evolution > scheme. For this I'd like to send particles around at the "foot of > characteristic", collect data there and then send them back to the > originating point. The first migration would be based on particle > coordinates (DMSwarmMigrate_DMNeighborScatter and the restriction > to only neighbouring ranks is perfect), while for the second move > it would be easier to just send them back to the originating rank, > which I can easily store in an Int field in the swarm. Thus at > each timestep I'd need to swap migrate types in this swarm > (DMScatter for moving them to the feet and BASIC to send them back). > > > When you use BASIC, you would have to explicitly call the point > location routine from your code as BASIC does not interact with the DM. > > Based on what I see in the code, switching ?migrate modes between > basic and dmneighbourscatter should be safe. > > If you are fine calling the point location from your side then what > you propose should work. If I understood the code correctly, BASIC will just migrate particles sending them to what is stored in DMSwarmField_rank, right? That'd be easy since I can create a SWARM with all the data I need and an extra int field (say "original_rank") and copy those values into DMSwarmField_rank before calling migrate for the "going back" step. After this backward migration I do not need to locate particles again (e.g. I do not need DMSwarmSortGetAccess after the BASIC migration, but only after the DMNeighborScatter one). Thus having back DMSwarmSetMigrateType() should be enough for me. Thanks ??? Matteo > > Cheers > Dave > > > > Thanks > > ??? Matteo > > Il 22/12/22 18:40, Dave May ha scritto: >> Hey Matt, >> >> On Thu 22. Dec 2022 at 05:02, Matthew Knepley >> wrote: >> >> On Thu, Dec 22, 2022 at 6:28 AM Matteo Semplice >> wrote: >> >> Dear all >> >> ??? please ignore my previous email and read this one: I >> have better localized the problem. 
Maybe DMSwarmMigrate >> is designed to migrate particles only to first >> neighbouring ranks? >> >> Yes, I believe that was the design. >> >> Dave, is this correct? >> >> >> Correct. DMSwarmMigrate_DMNeighborScatter() only scatter points >> to the neighbour ranks - where neighbours are defined by the DM >> provided to represent the mesh. >> >> DMSwarmMigrate_DMNeighborScatter() Is selected by default if you >> attach a DM. >> >> The scatter method should be over ridden with >> >> DMSwarmSetMigrateType() >> >> however it appears this method no longer exists. >> >> If one can determine the exact rank where points should should be >> sent and it is not going to be the neighbour rank (given by the >> DM), I would suggest not attaching the DM at all. >> >> However if this is not possible and one wanted to scatter to say >> the neighbours neighbours, we will have to add a new interface >> and refactor things a little bit. >> >> Cheers >> Dave >> >> >> >> ? Thanks, >> >> ? ? Matt >> >> Il 22/12/22 11:44, Matteo Semplice ha scritto: >>> >>> Dear everybody, >>> >>> ??? I have bug a bit into the code and I am able to add >>> more information. >>> >>> Il 02/12/22 12:48, Matteo Semplice ha scritto: >>>> Hi. >>>> I am sorry to take this up again, but further tests >>>> show that it's not right yet. >>>> >>>> Il 04/11/22 12:48, Matthew Knepley ha scritto: >>>>> On Fri, Nov 4, 2022 at 7:46 AM Matteo Semplice >>>>> wrote: >>>>> >>>>> On 04/11/2022 02:43, Matthew Knepley wrote: >>>>>> On Thu, Nov 3, 2022 at 8:36 PM Matthew Knepley >>>>>> wrote: >>>>>> >>>>>> On Thu, Oct 27, 2022 at 11:57 AM Semplice >>>>>> Matteo wrote: >>>>>> >>>>>> Dear Petsc developers, >>>>>> I am trying to use a DMSwarm to locate a >>>>>> cloud of points with respect to a >>>>>> background mesh. In the real application >>>>>> the points will be loaded from disk, but >>>>>> I have created a small demo in which >>>>>> >>>>>> * each processor creates Npart >>>>>> particles, all within the domain >>>>>> covered by the mesh, but not all in >>>>>> the local portion of the mesh >>>>>> * migrate the particles >>>>>> >>>>>> After migration most particles are not >>>>>> any more in the DMSwarm (how many and >>>>>> which ones seems to depend on the number >>>>>> of cpus, but it never happens that all >>>>>> particle survive the migration process). >>>>>> >>>>>> Thanks for sending this. I found the problem. >>>>>> Someone has some overly fancy code inside >>>>>> DMDA to figure out the local bounding box >>>>>> from the coordinates. >>>>>> It is broken for DM_BOUNDARY_GHOSTED, but we >>>>>> never tested with this. I will fix it. >>>>>> >>>>>> >>>>>> Okay, I think this fix is correct >>>>>> >>>>>> https://gitlab.com/petsc/petsc/-/merge_requests/5802 >>>>>> >>>>>> >>>>>> I incorporated your test as >>>>>> src/dm/impls/da/tests/ex1.c. Can you take a look >>>>>> and see if this fixes your issue? >>>>> >>>>> Yes, we have tested 2d and 3d, with various >>>>> combinations of DM_BOUNDARY_* along different >>>>> directions and it works like a charm. >>>>> >>>>> On a side note, neither DMSwarmViewXDMF nor >>>>> DMSwarmMigrate seem to be implemented for 1d: I get >>>>> >>>>> [0]PETSC ERROR: No support for this operation for >>>>> this object type[0]PETSC ERROR: Support not >>>>> provided for 1D >>>>> >>>>> However, currently I have no need for this feature. >>>>> >>>>> Finally, if the test is meant to stay in the >>>>> source, you may remove the call to >>>>> DMSwarmRegisterPetscDatatypeField as in the >>>>> attached patch. >>>>> >>>>> Thanks a lot!! 
>>>>> >>>>> Thanks! Glad it works. >>>>> >>>>> ? ?Matt >>>>> >>>> There are still problems when not using 1,2 or 4 cpus. >>>> Any other number of cpus that I've tested does not work >>>> corectly. >>>> >>> I have now modified >>> private_DMDALocatePointsIS_2D_Regular to print out some >>> debugging information. I see that this is called twice >>> during migration, once before and once after >>> DMSwarmMigrate_DMNeighborScatter. If I understand >>> correctly, the second call to >>> private_DMDALocatePointsIS_2D_Regular should be able to >>> locate all particles owned by the rank but it fails for >>> some of them because they have been sent to the wrong >>> rank (despite being well away from process boundaries). >>> >>> For example, running the example >>> src/dm/impls/da/tests/ex1.c with Nx=21 (20x20 Q1 >>> elements on [-1,1]X[-1,1]) with 3 processors, >>> >>> - the particles (-0.191,-0.462) and (0.191,-0.462) are >>> sent cpu2 instead of cpu0 >>> >>> - those at (-0.287,-0.693)and (0.287,-0.693) are sent to >>> cpu1 instead of cpu0 >>> >>> - those at (0.191,0.462) and (-0.191,0.462) are sent to >>> cpu0 instead of cpu2 >>> >>> (This is 2d and thus not affected by the 3d issue >>> mentioned yesterday on petsc-dev. Tests were made based >>> on the release branch pulled out this morning, i.e. on >>> commit bebdc8d016f). >>> >> I see: particles are sent "all around" and not only to >> the destination rank. >> >> Still however, running the example >> src/dm/impls/da/tests/ex1.c with Nx=21 (20x20 Q1 elements >> on [-1,1]X[-1,1]) with 3 processors, there are 2 >> particles initially owned by rank2 (at y=-0.6929 and >> x=+/-0.2870) that are sent only to rank1 and never make >> it to rank0 and are thus lost in the end since rank1, >> correctly, discards them. >> >> Thanks >> >> ??? Matteo >> >> >> >> -- >> What most experimenters take for granted before they begin >> their experiments is infinitely more interesting than any >> results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- > --- > Professore Associato in Analisi Numerica > Dipartimento di Scienza e Alta Tecnologia > Universit? degli Studi dell'Insubria > Via Valleggio, 11 - Como > -- --- Professore Associato in Analisi Numerica Dipartimento di Scienza e Alta Tecnologia Universit? degli Studi dell'Insubria Via Valleggio, 11 - Como -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Thu Dec 22 14:20:35 2022 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 22 Dec 2022 12:20:35 -0800 Subject: [petsc-users] locate DMSwarm particles with respect to a background DMDA mesh In-Reply-To: <08423cfc-4301-6b76-a791-b5c642198ecf@uninsubria.it> References: <8a853d0e-b856-5dc2-5439-d25911d672e4@uninsubria.it> <35ebfa58-eed5-39fa-8b3d-918ff9d7e633@uninsubria.it> <52622e96-4dfe-9105-a5a2-8d60d2fc90a8@uninsubria.it> <869ea7bc-52fd-c876-4278-37a8e829af8a@uninsubria.it> <08423cfc-4301-6b76-a791-b5c642198ecf@uninsubria.it> Message-ID: On Thu, 22 Dec 2022 at 12:08, Matteo Semplice wrote: > > Il 22/12/22 20:06, Dave May ha scritto: > > > > On Thu 22. Dec 2022 at 10:27, Matteo Semplice < > matteo.semplice at uninsubria.it> wrote: > >> Dear Dave and Matt, >> >> I am really dealing with two different use cases in a code that will >> compute a levelset function passing through a large set of points. 
If I had >> DMSwarmSetMigrateType() and if it were safe to switch the migration mode >> back and forth in the same swarm, this would cover all my use cases here. >> Is it safe to add it back to petsc? Details below if you are curious. >> >> 1) During preprocessing I am loading a point cloud from disk (in whatever >> order it comes) and need to send the particles to the right ranks. Since >> the background DM is a DMDA I can easily figure out the destination rank. >> This would be covered by your suggestion not to attach the DM, except that >> later I need to locate these points with respect to the background cells in >> order to initialize data on the Vecs associated to the DMDA. >> >> 2) Then I need to implement a semilagrangian time evolution scheme. For >> this I'd like to send particles around at the "foot of characteristic", >> collect data there and then send them back to the originating point. The >> first migration would be based on particle coordinates >> (DMSwarmMigrate_DMNeighborScatter and the restriction to only neighbouring >> ranks is perfect), while for the second move it would be easier to just >> send them back to the originating rank, which I can easily store in an Int >> field in the swarm. Thus at each timestep I'd need to swap migrate types in >> this swarm (DMScatter for moving them to the feet and BASIC to send them >> back). >> > > When you use BASIC, you would have to explicitly call the point location > routine from your code as BASIC does not interact with the DM. > > Based on what I see in the code, switching migrate modes between basic > and dmneighbourscatter should be safe. > > If you are fine calling the point location from your side then what you > propose should work. > > If I understood the code correctly, BASIC will just migrate particles > sending them to what is stored in DMSwarmField_rank, right? > Correct. > That'd be easy since I can create a SWARM with all the data I need and an > extra int field (say "original_rank") and copy those values into > DMSwarmField_rank before calling migrate for the "going back" step. After > this backward migration I do not need to locate particles again (e.g. I do > not need DMSwarmSortGetAccess after the BASIC migration, but only after the > DMNeighborScatter one). > Okay > Thus having back DMSwarmSetMigrateType() should be enough for me. > Okay. Thanks for clarifying. Cheers, Dave > Thanks > > Matteo > > > Cheers > Dave > > > >> Thanks >> >> Matteo >> Il 22/12/22 18:40, Dave May ha scritto: >> >> Hey Matt, >> >> On Thu 22. Dec 2022 at 05:02, Matthew Knepley wrote: >> >>> On Thu, Dec 22, 2022 at 6:28 AM Matteo Semplice < >>> matteo.semplice at uninsubria.it> wrote: >>> >>>> Dear all >>>> >>>> please ignore my previous email and read this one: I have better >>>> localized the problem. Maybe DMSwarmMigrate is designed to migrate >>>> particles only to first neighbouring ranks? >>>> >>> Yes, I believe that was the design. >>> >>> Dave, is this correct? >>> >> >> Correct. DMSwarmMigrate_DMNeighborScatter() only scatter points to the >> neighbour ranks - where neighbours are defined by the DM provided to >> represent the mesh. >> >> DMSwarmMigrate_DMNeighborScatter() Is selected by default if you attach >> a DM. >> >> The scatter method should be over ridden with >> >> DMSwarmSetMigrateType() >> >> however it appears this method no longer exists. 
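A minimal sketch of the "going back" step agreed on above, assuming the built-in rank field is stored as PetscInt and that an integer field named "original_rank" (an illustrative name, not an existing PETSc symbol) was registered when the swarm was created and filled before the forward migration:

    PetscInt *rankval, *origrank, npoints, p;

    PetscCall(DMSwarmGetLocalSize(sw, &npoints));
    PetscCall(DMSwarmGetField(sw, DMSwarmField_rank, NULL, NULL, (void **)&rankval));
    PetscCall(DMSwarmGetField(sw, "original_rank", NULL, NULL, (void **)&origrank));
    for (p = 0; p < npoints; ++p) rankval[p] = origrank[p];  /* send every particle back to where it came from */
    PetscCall(DMSwarmRestoreField(sw, "original_rank", NULL, NULL, (void **)&origrank));
    PetscCall(DMSwarmRestoreField(sw, DMSwarmField_rank, NULL, NULL, (void **)&rankval));
    /* with the BASIC migrate type selected, DMSwarmMigrate() uses exactly the ranks set above */
    PetscCall(DMSwarmMigrate(sw, PETSC_TRUE));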
>> >> If one can determine the exact rank where points should should be sent >> and it is not going to be the neighbour rank (given by the DM), I would >> suggest not attaching the DM at all. >> >> However if this is not possible and one wanted to scatter to say the >> neighbours neighbours, we will have to add a new interface and refactor >> things a little bit. >> >> Cheers >> Dave >> >> >> >>> Thanks, >>> >>> Matt >>> >>> >>>> Il 22/12/22 11:44, Matteo Semplice ha scritto: >>>> >>>> Dear everybody, >>>> >>>> I have bug a bit into the code and I am able to add more >>>> information. >>>> Il 02/12/22 12:48, Matteo Semplice ha scritto: >>>> >>>> Hi. >>>> I am sorry to take this up again, but further tests show that it's not >>>> right yet. >>>> >>>> Il 04/11/22 12:48, Matthew Knepley ha scritto: >>>> >>>> On Fri, Nov 4, 2022 at 7:46 AM Matteo Semplice < >>>> matteo.semplice at uninsubria.it> wrote: >>>> >>>>> On 04/11/2022 02:43, Matthew Knepley wrote: >>>>> >>>>> On Thu, Nov 3, 2022 at 8:36 PM Matthew Knepley >>>>> wrote: >>>>> >>>>>> On Thu, Oct 27, 2022 at 11:57 AM Semplice Matteo < >>>>>> matteo.semplice at uninsubria.it> wrote: >>>>>> >>>>>>> Dear Petsc developers, >>>>>>> I am trying to use a DMSwarm to locate a cloud of points with >>>>>>> respect to a background mesh. In the real application the points will be >>>>>>> loaded from disk, but I have created a small demo in which >>>>>>> >>>>>>> - each processor creates Npart particles, all within the domain >>>>>>> covered by the mesh, but not all in the local portion of the mesh >>>>>>> - migrate the particles >>>>>>> >>>>>>> After migration most particles are not any more in the DMSwarm (how >>>>>>> many and which ones seems to depend on the number of cpus, but it never >>>>>>> happens that all particle survive the migration process). >>>>>>> >>>>>>> Thanks for sending this. I found the problem. Someone has some >>>>>> overly fancy code inside DMDA to figure out the local bounding box from the >>>>>> coordinates. >>>>>> It is broken for DM_BOUNDARY_GHOSTED, but we never tested with this. >>>>>> I will fix it. >>>>>> >>>>> >>>>> Okay, I think this fix is correct >>>>> >>>>> https://gitlab.com/petsc/petsc/-/merge_requests/5802 >>>>> >>>>> >>>>> I incorporated your test as src/dm/impls/da/tests/ex1.c. Can you take >>>>> a look and see if this fixes your issue? >>>>> >>>>> Yes, we have tested 2d and 3d, with various combinations of >>>>> DM_BOUNDARY_* along different directions and it works like a charm. >>>>> >>>>> On a side note, neither DMSwarmViewXDMF nor DMSwarmMigrate seem to be >>>>> implemented for 1d: I get >>>>> >>>>> [0]PETSC ERROR: No support for this operation for this object type >>>>> [0]PETSC >>>>> ERROR: Support not provided for 1D >>>>> >>>>> However, currently I have no need for this feature. >>>>> >>>>> Finally, if the test is meant to stay in the source, you may remove >>>>> the call to DMSwarmRegisterPetscDatatypeField as in the attached >>>>> patch. >>>>> >>>>> Thanks a lot!! >>>>> >>>> Thanks! Glad it works. >>>> >>>> Matt >>>> >>>> There are still problems when not using 1,2 or 4 cpus. Any other number >>>> of cpus that I've tested does not work corectly. >>>> >>>> I have now modified private_DMDALocatePointsIS_2D_Regular to print out >>>> some debugging information. I see that this is called twice during >>>> migration, once before and once after DMSwarmMigrate_DMNeighborScatter. 
If >>>> I understand correctly, the second call to >>>> private_DMDALocatePointsIS_2D_Regular should be able to locate all >>>> particles owned by the rank but it fails for some of them because they have >>>> been sent to the wrong rank (despite being well away from process >>>> boundaries). >>>> >>>> For example, running the example src/dm/impls/da/tests/ex1.c with Nx=21 >>>> (20x20 Q1 elements on [-1,1]X[-1,1]) with 3 processors, >>>> >>>> - the particles (-0.191,-0.462) and (0.191,-0.462) are sent cpu2 >>>> instead of cpu0 >>>> >>>> - those at (-0.287,-0.693)and (0.287,-0.693) are sent to cpu1 instead >>>> of cpu0 >>>> >>>> - those at (0.191,0.462) and (-0.191,0.462) are sent to cpu0 instead of >>>> cpu2 >>>> >>>> (This is 2d and thus not affected by the 3d issue mentioned yesterday >>>> on petsc-dev. Tests were made based on the release branch pulled out this >>>> morning, i.e. on commit bebdc8d016f). >>>> >>>> I see: particles are sent "all around" and not only to the destination >>>> rank. >>>> >>>> Still however, running the example src/dm/impls/da/tests/ex1.c with >>>> Nx=21 (20x20 Q1 elements on [-1,1]X[-1,1]) with 3 processors, there are 2 >>>> particles initially owned by rank2 (at y=-0.6929 and x=+/-0.2870) that are >>>> sent only to rank1 and never make it to rank0 and are thus lost in the end >>>> since rank1, correctly, discards them. >>>> >>>> Thanks >>>> >>>> Matteo >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> -- >> --- >> Professore Associato in Analisi Numerica >> Dipartimento di Scienza e Alta Tecnologia >> Universit? degli Studi dell'InsubriaVia Valleggio, 11 - Como >> >> -- > --- > Professore Associato in Analisi Numerica > Dipartimento di Scienza e Alta Tecnologia > Universit? degli Studi dell'Insubria > Via Valleggio, 11 - Como > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Dec 23 09:57:18 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 23 Dec 2022 10:57:18 -0500 Subject: [petsc-users] Getting a vector from a DM to output VTK In-Reply-To: References: Message-ID: On Thu, Dec 22, 2022 at 12:41 AM Nicholas Arnold-Medabalimi < narnoldm at umich.edu> wrote: > Hi Petsc Users > > I've been having trouble consistently getting a vector generated from a DM > to output to VTK correctly. I've used ex1.c (which works properly)to try > and figure it out, but I'm still having some issues. I must be missing > something small that isn't correctly associating the section with the DM. 
> > DMPlexGetChart(dm, &p0, &p1); > PetscSection section_full; > PetscSectionCreate(PETSC_COMM_WORLD, §ion_full); > PetscSectionSetNumFields(section_full, 1); > PetscSectionSetChart(section_full, p0, p1); > PetscSectionSetFieldName(section_full, 0, "state"); > > for (int i = c0; i < c1; i++) > { > PetscSectionSetDof(section_full, i, 1); > PetscSectionSetFieldDof(section_full, i, 0, 1); > } > PetscSectionSetUp(section_full); > DMSetNumFields(dm, 1); > DMSetLocalSection(dm, section_full); > DMCreateGlobalVector(dm, &state_full); > > int o0, o1; > VecGetOwnershipRange(state_full, &o0, &o1); > PetscScalar *state_full_array; > VecGetArray(state_full, &state_full_array); > > for (int i = 0; i < (c1 - c0); i++) > { > int offset; > PetscSectionGetOffset(section_full, i, &offset); > state_full_array[offset] = 101325 + i; > } > > VecRestoreArray(state_full, &state_full_array); > > > PetscViewerCreate(PETSC_COMM_WORLD, &viewer); > PetscViewerSetType(viewer, PETSCVIEWERVTK); > PetscViewerFileSetMode(viewer, FILE_MODE_WRITE); > PetscViewerFileSetName(viewer, "mesh.vtu"); > VecView(state_full, viewer); > > If I run this mesh.vtu isn't generated at all. If I instead do a DMView > passing the DM it will just output the mesh correctly. > > Any assistance would be greatly appreciated. > DMCreateGlobalVector() dispatches to DMCreateGlobalVector_Plex(), which resets the view method to VecView_Plex(), which should dispatch to VecView_Plex_Local_VTK(). You can verify this in the debugger, or send us code we can run to verify it. Thanks, Matt > Sincerely > Nicholas > > > -- > Nicholas Arnold-Medabalimi > > Ph.D. Candidate > Computational Aeroscience Lab > University of Michigan > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Dec 23 10:14:46 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 23 Dec 2022 11:14:46 -0500 Subject: [petsc-users] locate DMSwarm particles with respect to a background DMDA mesh In-Reply-To: <08423cfc-4301-6b76-a791-b5c642198ecf@uninsubria.it> References: <8a853d0e-b856-5dc2-5439-d25911d672e4@uninsubria.it> <35ebfa58-eed5-39fa-8b3d-918ff9d7e633@uninsubria.it> <52622e96-4dfe-9105-a5a2-8d60d2fc90a8@uninsubria.it> <869ea7bc-52fd-c876-4278-37a8e829af8a@uninsubria.it> <08423cfc-4301-6b76-a791-b5c642198ecf@uninsubria.it> Message-ID: On Thu, Dec 22, 2022 at 3:08 PM Matteo Semplice < matteo.semplice at uninsubria.it> wrote: > > Il 22/12/22 20:06, Dave May ha scritto: > > > > On Thu 22. Dec 2022 at 10:27, Matteo Semplice < > matteo.semplice at uninsubria.it> wrote: > >> Dear Dave and Matt, >> >> I am really dealing with two different use cases in a code that will >> compute a levelset function passing through a large set of points. If I had >> DMSwarmSetMigrateType() and if it were safe to switch the migration mode >> back and forth in the same swarm, this would cover all my use cases here. >> Is it safe to add it back to petsc? Details below if you are curious. >> >> 1) During preprocessing I am loading a point cloud from disk (in whatever >> order it comes) and need to send the particles to the right ranks. Since >> the background DM is a DMDA I can easily figure out the destination rank. 
>> This would be covered by your suggestion not to attach the DM, except that >> later I need to locate these points with respect to the background cells in >> order to initialize data on the Vecs associated to the DMDA. >> >> 2) Then I need to implement a semilagrangian time evolution scheme. For >> this I'd like to send particles around at the "foot of characteristic", >> collect data there and then send them back to the originating point. The >> first migration would be based on particle coordinates >> (DMSwarmMigrate_DMNeighborScatter and the restriction to only neighbouring >> ranks is perfect), while for the second move it would be easier to just >> send them back to the originating rank, which I can easily store in an Int >> field in the swarm. Thus at each timestep I'd need to swap migrate types in >> this swarm (DMScatter for moving them to the feet and BASIC to send them >> back). >> > > When you use BASIC, you would have to explicitly call the point location > routine from your code as BASIC does not interact with the DM. > > Based on what I see in the code, switching migrate modes between basic > and dmneighbourscatter should be safe. > > If you are fine calling the point location from your side then what you > propose should work. > > If I understood the code correctly, BASIC will just migrate particles > sending them to what is stored in DMSwarmField_rank, right? That'd be easy > since I can create a SWARM with all the data I need and an extra int field > (say "original_rank") and copy those values into DMSwarmField_rank before > calling migrate for the "going back" step. After this backward migration I > do not need to locate particles again (e.g. I do not need > DMSwarmSortGetAccess after the BASIC migration, but only after the > DMNeighborScatter one). > > Thus having back DMSwarmSetMigrateType() should be enough for me. > > Hi Matteo, I have done this in https://gitlab.com/petsc/petsc/-/merge_requests/5941 I also hope to get the fix for your DMDA issue in there. Thanks, Matt > Thanks > > Matteo > > > Cheers > Dave > > > >> Thanks >> >> Matteo >> Il 22/12/22 18:40, Dave May ha scritto: >> >> Hey Matt, >> >> On Thu 22. Dec 2022 at 05:02, Matthew Knepley wrote: >> >>> On Thu, Dec 22, 2022 at 6:28 AM Matteo Semplice < >>> matteo.semplice at uninsubria.it> wrote: >>> >>>> Dear all >>>> >>>> please ignore my previous email and read this one: I have better >>>> localized the problem. Maybe DMSwarmMigrate is designed to migrate >>>> particles only to first neighbouring ranks? >>>> >>> Yes, I believe that was the design. >>> >>> Dave, is this correct? >>> >> >> Correct. DMSwarmMigrate_DMNeighborScatter() only scatter points to the >> neighbour ranks - where neighbours are defined by the DM provided to >> represent the mesh. >> >> DMSwarmMigrate_DMNeighborScatter() Is selected by default if you attach >> a DM. >> >> The scatter method should be over ridden with >> >> DMSwarmSetMigrateType() >> >> however it appears this method no longer exists. >> >> If one can determine the exact rank where points should should be sent >> and it is not going to be the neighbour rank (given by the DM), I would >> suggest not attaching the DM at all. >> >> However if this is not possible and one wanted to scatter to say the >> neighbours neighbours, we will have to add a new interface and refactor >> things a little bit. 
>> >> Cheers >> Dave >> >> >> >>> Thanks, >>> >>> Matt >>> >>> >>>> Il 22/12/22 11:44, Matteo Semplice ha scritto: >>>> >>>> Dear everybody, >>>> >>>> I have bug a bit into the code and I am able to add more >>>> information. >>>> Il 02/12/22 12:48, Matteo Semplice ha scritto: >>>> >>>> Hi. >>>> I am sorry to take this up again, but further tests show that it's not >>>> right yet. >>>> >>>> Il 04/11/22 12:48, Matthew Knepley ha scritto: >>>> >>>> On Fri, Nov 4, 2022 at 7:46 AM Matteo Semplice < >>>> matteo.semplice at uninsubria.it> wrote: >>>> >>>>> On 04/11/2022 02:43, Matthew Knepley wrote: >>>>> >>>>> On Thu, Nov 3, 2022 at 8:36 PM Matthew Knepley >>>>> wrote: >>>>> >>>>>> On Thu, Oct 27, 2022 at 11:57 AM Semplice Matteo < >>>>>> matteo.semplice at uninsubria.it> wrote: >>>>>> >>>>>>> Dear Petsc developers, >>>>>>> I am trying to use a DMSwarm to locate a cloud of points with >>>>>>> respect to a background mesh. In the real application the points will be >>>>>>> loaded from disk, but I have created a small demo in which >>>>>>> >>>>>>> - each processor creates Npart particles, all within the domain >>>>>>> covered by the mesh, but not all in the local portion of the mesh >>>>>>> - migrate the particles >>>>>>> >>>>>>> After migration most particles are not any more in the DMSwarm (how >>>>>>> many and which ones seems to depend on the number of cpus, but it never >>>>>>> happens that all particle survive the migration process). >>>>>>> >>>>>>> Thanks for sending this. I found the problem. Someone has some >>>>>> overly fancy code inside DMDA to figure out the local bounding box from the >>>>>> coordinates. >>>>>> It is broken for DM_BOUNDARY_GHOSTED, but we never tested with this. >>>>>> I will fix it. >>>>>> >>>>> >>>>> Okay, I think this fix is correct >>>>> >>>>> https://gitlab.com/petsc/petsc/-/merge_requests/5802 >>>>> >>>>> >>>>> I incorporated your test as src/dm/impls/da/tests/ex1.c. Can you take >>>>> a look and see if this fixes your issue? >>>>> >>>>> Yes, we have tested 2d and 3d, with various combinations of >>>>> DM_BOUNDARY_* along different directions and it works like a charm. >>>>> >>>>> On a side note, neither DMSwarmViewXDMF nor DMSwarmMigrate seem to be >>>>> implemented for 1d: I get >>>>> >>>>> [0]PETSC ERROR: No support for this operation for this object type >>>>> [0]PETSC >>>>> ERROR: Support not provided for 1D >>>>> >>>>> However, currently I have no need for this feature. >>>>> >>>>> Finally, if the test is meant to stay in the source, you may remove >>>>> the call to DMSwarmRegisterPetscDatatypeField as in the attached >>>>> patch. >>>>> >>>>> Thanks a lot!! >>>>> >>>> Thanks! Glad it works. >>>> >>>> Matt >>>> >>>> There are still problems when not using 1,2 or 4 cpus. Any other number >>>> of cpus that I've tested does not work corectly. >>>> >>>> I have now modified private_DMDALocatePointsIS_2D_Regular to print out >>>> some debugging information. I see that this is called twice during >>>> migration, once before and once after DMSwarmMigrate_DMNeighborScatter. If >>>> I understand correctly, the second call to >>>> private_DMDALocatePointsIS_2D_Regular should be able to locate all >>>> particles owned by the rank but it fails for some of them because they have >>>> been sent to the wrong rank (despite being well away from process >>>> boundaries). 
>>>> >>>> For example, running the example src/dm/impls/da/tests/ex1.c with Nx=21 >>>> (20x20 Q1 elements on [-1,1]X[-1,1]) with 3 processors, >>>> >>>> - the particles (-0.191,-0.462) and (0.191,-0.462) are sent cpu2 >>>> instead of cpu0 >>>> >>>> - those at (-0.287,-0.693)and (0.287,-0.693) are sent to cpu1 instead >>>> of cpu0 >>>> >>>> - those at (0.191,0.462) and (-0.191,0.462) are sent to cpu0 instead of >>>> cpu2 >>>> >>>> (This is 2d and thus not affected by the 3d issue mentioned yesterday >>>> on petsc-dev. Tests were made based on the release branch pulled out this >>>> morning, i.e. on commit bebdc8d016f). >>>> >>>> I see: particles are sent "all around" and not only to the destination >>>> rank. >>>> >>>> Still however, running the example src/dm/impls/da/tests/ex1.c with >>>> Nx=21 (20x20 Q1 elements on [-1,1]X[-1,1]) with 3 processors, there are 2 >>>> particles initially owned by rank2 (at y=-0.6929 and x=+/-0.2870) that are >>>> sent only to rank1 and never make it to rank0 and are thus lost in the end >>>> since rank1, correctly, discards them. >>>> >>>> Thanks >>>> >>>> Matteo >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> -- >> --- >> Professore Associato in Analisi Numerica >> Dipartimento di Scienza e Alta Tecnologia >> Universit? degli Studi dell'InsubriaVia Valleggio, 11 - Como >> >> -- > --- > Professore Associato in Analisi Numerica > Dipartimento di Scienza e Alta Tecnologia > Universit? degli Studi dell'Insubria > Via Valleggio, 11 - Como > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksi2443 at gmail.com Sat Dec 24 04:19:41 2022 From: ksi2443 at gmail.com (=?UTF-8?B?6rmA7ISx7J21?=) Date: Sat, 24 Dec 2022 19:19:41 +0900 Subject: [petsc-users] Question about eigenvalue and eigenvectors Message-ID: Hello, I tried to calculate the eigenvalues and eigenvectors in 3 by 3 matrix (real and nonsymmetric). I already checked the kspcomputeeigenvalues and kspcomputeritz. However, the target matrix is just 3 by 3 matrix. So I need another way to calculate the values and vectors. Can anyone recommend other methods that are efficient for such small size problems?? Thanks, Hyung Kim -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Sat Dec 24 06:13:49 2022 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Sat, 24 Dec 2022 13:13:49 +0100 Subject: [petsc-users] Question about eigenvalue and eigenvectors In-Reply-To: References: Message-ID: For 3x3 matrices you can use explicit formulas On Sat, Dec 24, 2022, 11:20 ??? wrote: > Hello, > > > I tried to calculate the eigenvalues and eigenvectors in 3 by 3 matrix > (real and nonsymmetric). > I already checked the kspcomputeeigenvalues and kspcomputeritz. > > However, the target matrix is just 3 by 3 matrix. > So I need another way to calculate the values and vectors. > Can anyone recommend other methods that are efficient for such small size > problems?? > > Thanks, > Hyung Kim > -------------- next part -------------- An HTML attachment was scrubbed... 
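For a single real nonsymmetric 3-by-3 matrix either route works; since PETSc already links LAPACK, handing the matrix to dgeev (as also suggested in the next message) is the least error-prone. A sketch for a real (non-complex) PETSc build follows; the matrix must be filled in column-major order and is overwritten by the call, and the workspace size used here is only a safe illustrative choice.

    #include <petscblaslapack.h>

    PetscScalar  A[9];                  /* 3x3 matrix in column-major order, A[i + 3*j] = a_ij; overwritten by dgeev */
    PetscScalar  wr[3], wi[3];          /* real and imaginary parts of the eigenvalues */
    PetscScalar  vr[9];                 /* right eigenvectors, one per column of vr    */
    PetscScalar  work[30], sdummy;
    PetscBLASInt n = 3, lwork = 30, idummy = 1, lierr;

    /* ... fill A ... */
    LAPACKgeev_("N", "V", &n, A, &n, wr, wi, &sdummy, &idummy, vr, &n, work, &lwork, &lierr);
    if (lierr) SETERRQ(PETSC_COMM_SELF, PETSC_ERR_LIB, "LAPACK dgeev failed, info = %d", (int)lierr);
    /* eigenvalue k is wr[k] + i*wi[k]; a complex-conjugate pair stores its eigenvectors as
       real and imaginary parts in two consecutive columns of vr (standard dgeev convention) */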
URL: From knepley at gmail.com Sat Dec 24 07:45:50 2022 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 24 Dec 2022 08:45:50 -0500 Subject: [petsc-users] Question about eigenvalue and eigenvectors In-Reply-To: References: Message-ID: Or just call the LAPACK routine directly. Matt On Sat, Dec 24, 2022 at 7:14 AM Stefano Zampini wrote: > For 3x3 matrices you can use explicit formulas > > On Sat, Dec 24, 2022, 11:20 ??? wrote: > >> Hello, >> >> >> I tried to calculate the eigenvalues and eigenvectors in 3 by 3 matrix >> (real and nonsymmetric). >> I already checked the kspcomputeeigenvalues and kspcomputeritz. >> >> However, the target matrix is just 3 by 3 matrix. >> So I need another way to calculate the values and vectors. >> Can anyone recommend other methods that are efficient for such small size >> problems?? >> >> Thanks, >> Hyung Kim >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From narnoldm at umich.edu Mon Dec 26 02:20:37 2022 From: narnoldm at umich.edu (Nicholas Arnold-Medabalimi) Date: Mon, 26 Dec 2022 03:20:37 -0500 Subject: [petsc-users] Getting a vector from a DM to output VTK In-Reply-To: References: Message-ID: Hi Matt I was able to get this all squared away. It turns out I was initializing the viewer incorrectly?my mistake. However, there is a follow-up question. A while back, we discussed distributing a vector field from an initial DM to a new distributed DM. The way you said to do this was // Distribute the submesh with overlap of 1 DMPlexDistribute(sub_da, overlap, &distributionSF, &sub_da_dist); //Create a vector and section for the distribution VecCreate(PETSC_COMM_WORLD, &state_dist); VecSetDM(state_dist, sub_da_dist); PetscSectionCreate(PETSC_COMM_WORLD, &distSection); DMSetLocalSection(sub_da_dist, distSection); DMPlexDistributeField(sub_da_dist, distributionSF, filteredSection, state_filtered, distSection, state_dist); I've forgone Fortran to debug this all in C and then integrate the function calls into the Fortran code. There are two questions here. 1) How do I associate a vector associated with a DM using VecSetDM to output properly as a VTK? When I call VecView at present, if I call VecView on state_dist, it will not output anything. 2) The visualization is nice, but when I look at the Vec of the distributed field using stdout, something isn't distributing correctly, as the vector still has some uninitialized values. This is apparent if I output the original vector and the distributed vector. Examining the inside of DMPlexDistributeField I suspect I'm making a mistake with the sections I'm passing. filtered section in this case is the global section but if I try the local section I get an error so I'm not sure. *Original Vector(state_filtered)* Vec Object: Vec_0x84000004_1 2 MPI processes type: mpi Process [0] 101325. 300. 101326. 301. 101341. 316. Process [1] 101325. 300. 101326. 301. 101345. 320. 101497. 472. 101516. 491. *Re-Distributed Vector (state_dist) * Vec Object: 2 MPI processes type: mpi Process [0] 101325. 300. 101326. 301. 101341. 316. 7.90505e-323 1.97626e-323 4.30765e-312 6.91179e-310 Process [1] 101497. 472. 101516. 491. 1.99665e-314 8.14714e-321 Any insight on distributing this field and correcting the error would be appreciated. 
Sincerely and Happy Holiday Nicholas On Fri, Dec 23, 2022 at 10:57 AM Matthew Knepley wrote: > On Thu, Dec 22, 2022 at 12:41 AM Nicholas Arnold-Medabalimi < > narnoldm at umich.edu> wrote: > >> Hi Petsc Users >> >> I've been having trouble consistently getting a vector generated from a >> DM to output to VTK correctly. I've used ex1.c (which works properly)to try >> and figure it out, but I'm still having some issues. I must be missing >> something small that isn't correctly associating the section with the DM. >> >> DMPlexGetChart(dm, &p0, &p1); >> PetscSection section_full; >> PetscSectionCreate(PETSC_COMM_WORLD, §ion_full); >> PetscSectionSetNumFields(section_full, 1); >> PetscSectionSetChart(section_full, p0, p1); >> PetscSectionSetFieldName(section_full, 0, "state"); >> >> for (int i = c0; i < c1; i++) >> { >> PetscSectionSetDof(section_full, i, 1); >> PetscSectionSetFieldDof(section_full, i, 0, 1); >> } >> PetscSectionSetUp(section_full); >> DMSetNumFields(dm, 1); >> DMSetLocalSection(dm, section_full); >> DMCreateGlobalVector(dm, &state_full); >> >> int o0, o1; >> VecGetOwnershipRange(state_full, &o0, &o1); >> PetscScalar *state_full_array; >> VecGetArray(state_full, &state_full_array); >> >> for (int i = 0; i < (c1 - c0); i++) >> { >> int offset; >> PetscSectionGetOffset(section_full, i, &offset); >> state_full_array[offset] = 101325 + i; >> } >> >> VecRestoreArray(state_full, &state_full_array); >> >> >> PetscViewerCreate(PETSC_COMM_WORLD, &viewer); >> PetscViewerSetType(viewer, PETSCVIEWERVTK); >> PetscViewerFileSetMode(viewer, FILE_MODE_WRITE); >> PetscViewerFileSetName(viewer, "mesh.vtu"); >> VecView(state_full, viewer); >> >> If I run this mesh.vtu isn't generated at all. If I instead do a DMView >> passing the DM it will just output the mesh correctly. >> >> Any assistance would be greatly appreciated. >> > > DMCreateGlobalVector() dispatches to DMCreateGlobalVector_Plex(), which > resets the view method to VecView_Plex(), which should dispatch to > VecView_Plex_Local_VTK(). You can verify this in the debugger, or send us > code we can run to verify it. > > Thanks, > > Matt > > >> Sincerely >> Nicholas >> >> >> -- >> Nicholas Arnold-Medabalimi >> >> Ph.D. Candidate >> Computational Aeroscience Lab >> University of Michigan >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- Nicholas Arnold-Medabalimi Ph.D. Candidate Computational Aeroscience Lab University of Michigan -------------- next part -------------- An HTML attachment was scrubbed... URL: From edoardo.centofanti01 at universitadipavia.it Mon Dec 26 03:41:37 2022 From: edoardo.centofanti01 at universitadipavia.it (Edoardo Centofanti) Date: Mon, 26 Dec 2022 10:41:37 +0100 Subject: [petsc-users] gamg out of memory with gpu Message-ID: Hi PETSc Users, I am experimenting some issues with the GAMG precondtioner when used with GPU. In particular, it seems to go out of memory very easily (around 5000 dofs are enough to make it throw the "[0]PETSC ERROR: cuda error 2 (cudaErrorMemoryAllocation) : out of memory" error). I have these issues both with single and multiple GPUs (on the same or on different nodes). The exact same problems work like a charm with HYPRE BoomerAMG on GPUs. 
With both preconditioners I exploit the device acceleration by giving the usual command line options "-dm_vec_type cuda" and "-dm_mat_type aijcusparse" (I am working with structured meshes). My PETSc version is 3.17. Is this a known issue of the GAMG preconditioner? Thank you in advance, Edoardo -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Dec 26 08:37:19 2022 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 26 Dec 2022 09:37:19 -0500 Subject: [petsc-users] Getting a vector from a DM to output VTK In-Reply-To: References: Message-ID: On Mon, Dec 26, 2022 at 3:21 AM Nicholas Arnold-Medabalimi < narnoldm at umich.edu> wrote: > Hi Matt > > I was able to get this all squared away. It turns out I was initializing > the viewer incorrectly?my mistake. However, there is a follow-up question. > A while back, we discussed distributing a vector field from an initial DM > to a new distributed DM. The way you said to do this was > > // Distribute the submesh with overlap of 1 > DMPlexDistribute(sub_da, overlap, &distributionSF, &sub_da_dist); > //Create a vector and section for the distribution > VecCreate(PETSC_COMM_WORLD, &state_dist); > VecSetDM(state_dist, sub_da_dist); > PetscSectionCreate(PETSC_COMM_WORLD, &distSection); > DMSetLocalSection(sub_da_dist, distSection); > DMPlexDistributeField(sub_da_dist, distributionSF, filteredSection, > state_filtered, distSection, state_dist); > I've forgone Fortran to debug this all in C and then integrate the > function calls into the Fortran code. > > There are two questions here. > > 1) How do I associate a vector associated with a DM using VecSetDM to > output properly as a VTK? When I call VecView at present, if I call VecView > on state_dist, it will not output anything. > This is a problem. The different pieces of interface were added at different times. We should really move that manipulation of the function table into VecSetDM(). Here is the code: https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/plexcreate.c#L4135 You can make the overload call yourself for now, until we decide on the best fix. > 2) The visualization is nice, but when I look at the Vec of the > distributed field using stdout, something isn't distributing correctly, as > the vector still has some uninitialized values. This is apparent if I > output the original vector and the distributed vector. Examining the inside > of DMPlexDistributeField I suspect I'm making a mistake with the sections > I'm passing. filtered section in this case is the global section but if I > try the local section I get an error so I'm not sure. > These should definitely be local sections. Global sections are always built after the fact, and building the global section needs the SF that indicates what points are shared, not the distribution SF that moves points. I need to go back and put in checks that all the arguments are the right type. Thanks for bringing that up. Lets track down the error for local sections. Matt > *Original Vector(state_filtered)* > Vec Object: Vec_0x84000004_1 2 MPI processes > type: mpi > Process [0] > 101325. > 300. > 101326. > 301. > 101341. > 316. > Process [1] > 101325. > 300. > 101326. > 301. > 101345. > 320. > 101497. > 472. > 101516. > 491. > *Re-Distributed Vector (state_dist) * > Vec Object: 2 MPI processes > type: mpi > Process [0] > 101325. > 300. > 101326. > 301. > 101341. > 316. > 7.90505e-323 > 1.97626e-323 > 4.30765e-312 > 6.91179e-310 > Process [1] > 101497. > 472. > 101516. > 491. 
> 1.99665e-314 > 8.14714e-321 > > > Any insight on distributing this field and correcting the error would be > appreciated. > > Sincerely and Happy Holiday > Nicholas > > > > On Fri, Dec 23, 2022 at 10:57 AM Matthew Knepley > wrote: > >> On Thu, Dec 22, 2022 at 12:41 AM Nicholas Arnold-Medabalimi < >> narnoldm at umich.edu> wrote: >> >>> Hi Petsc Users >>> >>> I've been having trouble consistently getting a vector generated from a >>> DM to output to VTK correctly. I've used ex1.c (which works properly)to try >>> and figure it out, but I'm still having some issues. I must be missing >>> something small that isn't correctly associating the section with the DM. >>> >>> DMPlexGetChart(dm, &p0, &p1); >>> PetscSection section_full; >>> PetscSectionCreate(PETSC_COMM_WORLD, §ion_full); >>> PetscSectionSetNumFields(section_full, 1); >>> PetscSectionSetChart(section_full, p0, p1); >>> PetscSectionSetFieldName(section_full, 0, "state"); >>> >>> for (int i = c0; i < c1; i++) >>> { >>> PetscSectionSetDof(section_full, i, 1); >>> PetscSectionSetFieldDof(section_full, i, 0, 1); >>> } >>> PetscSectionSetUp(section_full); >>> DMSetNumFields(dm, 1); >>> DMSetLocalSection(dm, section_full); >>> DMCreateGlobalVector(dm, &state_full); >>> >>> int o0, o1; >>> VecGetOwnershipRange(state_full, &o0, &o1); >>> PetscScalar *state_full_array; >>> VecGetArray(state_full, &state_full_array); >>> >>> for (int i = 0; i < (c1 - c0); i++) >>> { >>> int offset; >>> PetscSectionGetOffset(section_full, i, &offset); >>> state_full_array[offset] = 101325 + i; >>> } >>> >>> VecRestoreArray(state_full, &state_full_array); >>> >>> >>> PetscViewerCreate(PETSC_COMM_WORLD, &viewer); >>> PetscViewerSetType(viewer, PETSCVIEWERVTK); >>> PetscViewerFileSetMode(viewer, FILE_MODE_WRITE); >>> PetscViewerFileSetName(viewer, "mesh.vtu"); >>> VecView(state_full, viewer); >>> >>> If I run this mesh.vtu isn't generated at all. If I instead do a DMView >>> passing the DM it will just output the mesh correctly. >>> >>> Any assistance would be greatly appreciated. >>> >> >> DMCreateGlobalVector() dispatches to DMCreateGlobalVector_Plex(), which >> resets the view method to VecView_Plex(), which should dispatch to >> VecView_Plex_Local_VTK(). You can verify this in the debugger, or send us >> code we can run to verify it. >> >> Thanks, >> >> Matt >> >> >>> Sincerely >>> Nicholas >>> >>> >>> -- >>> Nicholas Arnold-Medabalimi >>> >>> Ph.D. Candidate >>> Computational Aeroscience Lab >>> University of Michigan >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > > > -- > Nicholas Arnold-Medabalimi > > Ph.D. Candidate > Computational Aeroscience Lab > University of Michigan > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
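For reference, a consolidated sketch of the redistribution sequence settled on in the exchange above, with the local section of the source DM passed to DMPlexDistributeField as suggested. Names, the overlap value, and the use of PetscCall() for error checking are illustrative assumptions rather than the exact code from the thread; 'state' is assumed to be laid out by the local section of 'dm'.

  PetscSF      distSF;
  DM           dmDist;
  PetscSection locSec, distSec;
  Vec          stateDist;

  PetscCall(DMPlexDistribute(dm, 1, &distSF, &dmDist));        /* overlap of 1 (assumed) */
  PetscCall(DMGetLocalSection(dm, &locSec));                    /* layout of 'state': the LOCAL section */
  PetscCall(VecCreate(PETSC_COMM_WORLD, &stateDist));           /* sized and typed by DMPlexDistributeField */
  PetscCall(VecSetDM(stateDist, dmDist));
  PetscCall(PetscSectionCreate(PETSC_COMM_WORLD, &distSec));    /* filled in by DMPlexDistributeField */
  PetscCall(DMSetLocalSection(dmDist, distSec));
  PetscCall(DMPlexDistributeField(dmDist, distSF, locSec, state, distSec, stateDist));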
URL: From knepley at gmail.com Mon Dec 26 08:39:10 2022 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 26 Dec 2022 09:39:10 -0500 Subject: [petsc-users] gamg out of memory with gpu In-Reply-To: References: Message-ID: On Mon, Dec 26, 2022 at 4:41 AM Edoardo Centofanti < edoardo.centofanti01 at universitadipavia.it> wrote: > Hi PETSc Users, > > I am experimenting some issues with the GAMG precondtioner when used with > GPU. > In particular, it seems to go out of memory very easily (around 5000 > dofs are enough to make it throw the "[0]PETSC ERROR: cuda error 2 > (cudaErrorMemoryAllocation) : out of memory" error). > I have these issues both with single and multiple GPUs (on the same or on > different nodes). The exact same problems work like a charm with HYPRE > BoomerAMG on GPUs. > With both preconditioners I exploit the device acceleration by giving the > usual command line options "-dm_vec_type cuda" and "-dm_mat_type > aijcusparse" (I am working with structured meshes). My PETSc version is > 3.17. > > Is this a known issue of the GAMG preconditioner? > No. Can you get it to do this with a PETSc example? Say SNES ex5? Thanks, Matt > Thank you in advance, > Edoardo > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From edoardo.centofanti01 at universitadipavia.it Mon Dec 26 09:29:05 2022 From: edoardo.centofanti01 at universitadipavia.it (Edoardo Centofanti) Date: Mon, 26 Dec 2022 16:29:05 +0100 Subject: [petsc-users] gamg out of memory with gpu In-Reply-To: References: Message-ID: Thank you for your answer. Can you provide me the full path of the example you have in mind? The one I found does not seem to exploit the algebraic multigrid, but just the geometric one. Thanks, Edoardo Il giorno lun 26 dic 2022 alle ore 15:39 Matthew Knepley ha scritto: > On Mon, Dec 26, 2022 at 4:41 AM Edoardo Centofanti < > edoardo.centofanti01 at universitadipavia.it> wrote: > >> Hi PETSc Users, >> >> I am experimenting some issues with the GAMG precondtioner when used with >> GPU. >> In particular, it seems to go out of memory very easily (around 5000 >> dofs are enough to make it throw the "[0]PETSC ERROR: cuda error 2 >> (cudaErrorMemoryAllocation) : out of memory" error). >> I have these issues both with single and multiple GPUs (on the same or on >> different nodes). The exact same problems work like a charm with HYPRE >> BoomerAMG on GPUs. >> With both preconditioners I exploit the device acceleration by giving the >> usual command line options "-dm_vec_type cuda" and "-dm_mat_type >> aijcusparse" (I am working with structured meshes). My PETSc version is >> 3.17. >> >> Is this a known issue of the GAMG preconditioner? >> > > No. Can you get it to do this with a PETSc example? Say SNES ex5? > > Thanks, > > Matt > > >> Thank you in advance, >> Edoardo >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... 
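The follow-up below gives the location: the example is src/snes/tutorials/ex5.c in the PETSc source tree. A typical invocation for testing GAMG on a GPU build would be along the lines of

  cd $PETSC_DIR/src/snes/tutorials
  make ex5
  ./ex5 -da_grid_x 64 -da_grid_y 64 -mms 3 -pc_type gamg -dm_vec_type cuda -dm_mat_type aijcusparse

Note that the preconditioner option is -pc_type gamg (algebraic multigrid); the reply below shows "-pc_type gang", presumably a typo.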
URL: From narnoldm at umich.edu Mon Dec 26 09:39:31 2022 From: narnoldm at umich.edu (Nicholas Arnold-Medabalimi) Date: Mon, 26 Dec 2022 10:39:31 -0500 Subject: [petsc-users] Getting a vector from a DM to output VTK In-Reply-To: References: Message-ID: Hi Matt 1) I'm not sure I follow how to call this. If I insert the VecSetOperation call I'm not exactly sure what the VecView_Plex is or where it is defined? 2) Otherwise I've solved this problem with the insight you provided into the local section. Things look good on the ASCII output but if we can resolve 1 then I think the loop is fully closed and I can just worry about the fortran translation. Thanks again for all your help. Sincerely Nicholas On Mon, Dec 26, 2022 at 9:37 AM Matthew Knepley wrote: > On Mon, Dec 26, 2022 at 3:21 AM Nicholas Arnold-Medabalimi < > narnoldm at umich.edu> wrote: > >> Hi Matt >> >> I was able to get this all squared away. It turns out I was initializing >> the viewer incorrectly?my mistake. However, there is a follow-up question. >> A while back, we discussed distributing a vector field from an initial DM >> to a new distributed DM. The way you said to do this was >> >> // Distribute the submesh with overlap of 1 >> DMPlexDistribute(sub_da, overlap, &distributionSF, &sub_da_dist); >> //Create a vector and section for the distribution >> VecCreate(PETSC_COMM_WORLD, &state_dist); >> VecSetDM(state_dist, sub_da_dist); >> PetscSectionCreate(PETSC_COMM_WORLD, &distSection); >> DMSetLocalSection(sub_da_dist, distSection); >> DMPlexDistributeField(sub_da_dist, distributionSF, filteredSection, >> state_filtered, distSection, state_dist); >> I've forgone Fortran to debug this all in C and then integrate the >> function calls into the Fortran code. >> >> There are two questions here. >> >> 1) How do I associate a vector associated with a DM using VecSetDM to >> output properly as a VTK? When I call VecView at present, if I call VecView >> on state_dist, it will not output anything. >> > > This is a problem. The different pieces of interface were added at > different times. We should really move that manipulation of the function > table into VecSetDM(). Here is the code: > > > https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/plexcreate.c#L4135 > > You can make the overload call yourself for now, until we decide on the > best fix. > > >> 2) The visualization is nice, but when I look at the Vec of the >> distributed field using stdout, something isn't distributing correctly, as >> the vector still has some uninitialized values. This is apparent if I >> output the original vector and the distributed vector. Examining the inside >> of DMPlexDistributeField I suspect I'm making a mistake with the sections >> I'm passing. filtered section in this case is the global section but if I >> try the local section I get an error so I'm not sure. >> > > These should definitely be local sections. Global sections are always > built after the fact, and building the global section needs the SF that > indicates what points are shared, not the distribution SF that moves > points. I need to go back and put in checks that all the arguments are the > right type. Thanks for bringing that up. Lets track down the error for > local sections. > > Matt > > >> *Original Vector(state_filtered)* >> Vec Object: Vec_0x84000004_1 2 MPI processes >> type: mpi >> Process [0] >> 101325. >> 300. >> 101326. >> 301. >> 101341. >> 316. >> Process [1] >> 101325. >> 300. >> 101326. >> 301. >> 101345. >> 320. >> 101497. >> 472. >> 101516. >> 491. 
>> *Re-Distributed Vector (state_dist) * >> Vec Object: 2 MPI processes >> type: mpi >> Process [0] >> 101325. >> 300. >> 101326. >> 301. >> 101341. >> 316. >> 7.90505e-323 >> 1.97626e-323 >> 4.30765e-312 >> 6.91179e-310 >> Process [1] >> 101497. >> 472. >> 101516. >> 491. >> 1.99665e-314 >> 8.14714e-321 >> >> >> Any insight on distributing this field and correcting the error would be >> appreciated. >> >> Sincerely and Happy Holiday >> Nicholas >> >> >> >> On Fri, Dec 23, 2022 at 10:57 AM Matthew Knepley >> wrote: >> >>> On Thu, Dec 22, 2022 at 12:41 AM Nicholas Arnold-Medabalimi < >>> narnoldm at umich.edu> wrote: >>> >>>> Hi Petsc Users >>>> >>>> I've been having trouble consistently getting a vector generated from a >>>> DM to output to VTK correctly. I've used ex1.c (which works properly)to try >>>> and figure it out, but I'm still having some issues. I must be missing >>>> something small that isn't correctly associating the section with the DM. >>>> >>>> DMPlexGetChart(dm, &p0, &p1); >>>> PetscSection section_full; >>>> PetscSectionCreate(PETSC_COMM_WORLD, §ion_full); >>>> PetscSectionSetNumFields(section_full, 1); >>>> PetscSectionSetChart(section_full, p0, p1); >>>> PetscSectionSetFieldName(section_full, 0, "state"); >>>> >>>> for (int i = c0; i < c1; i++) >>>> { >>>> PetscSectionSetDof(section_full, i, 1); >>>> PetscSectionSetFieldDof(section_full, i, 0, 1); >>>> } >>>> PetscSectionSetUp(section_full); >>>> DMSetNumFields(dm, 1); >>>> DMSetLocalSection(dm, section_full); >>>> DMCreateGlobalVector(dm, &state_full); >>>> >>>> int o0, o1; >>>> VecGetOwnershipRange(state_full, &o0, &o1); >>>> PetscScalar *state_full_array; >>>> VecGetArray(state_full, &state_full_array); >>>> >>>> for (int i = 0; i < (c1 - c0); i++) >>>> { >>>> int offset; >>>> PetscSectionGetOffset(section_full, i, &offset); >>>> state_full_array[offset] = 101325 + i; >>>> } >>>> >>>> VecRestoreArray(state_full, &state_full_array); >>>> >>>> >>>> PetscViewerCreate(PETSC_COMM_WORLD, &viewer); >>>> PetscViewerSetType(viewer, PETSCVIEWERVTK); >>>> PetscViewerFileSetMode(viewer, FILE_MODE_WRITE); >>>> PetscViewerFileSetName(viewer, "mesh.vtu"); >>>> VecView(state_full, viewer); >>>> >>>> If I run this mesh.vtu isn't generated at all. If I instead do a DMView >>>> passing the DM it will just output the mesh correctly. >>>> >>>> Any assistance would be greatly appreciated. >>>> >>> >>> DMCreateGlobalVector() dispatches to DMCreateGlobalVector_Plex(), which >>> resets the view method to VecView_Plex(), which should dispatch to >>> VecView_Plex_Local_VTK(). You can verify this in the debugger, or send us >>> code we can run to verify it. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Sincerely >>>> Nicholas >>>> >>>> >>>> -- >>>> Nicholas Arnold-Medabalimi >>>> >>>> Ph.D. Candidate >>>> Computational Aeroscience Lab >>>> University of Michigan >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> >> >> -- >> Nicholas Arnold-Medabalimi >> >> Ph.D. Candidate >> Computational Aeroscience Lab >> University of Michigan >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- Nicholas Arnold-Medabalimi Ph.D. 
Candidate Computational Aeroscience Lab University of Michigan -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Dec 26 09:44:55 2022 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 26 Dec 2022 10:44:55 -0500 Subject: [petsc-users] Getting a vector from a DM to output VTK In-Reply-To: References: Message-ID: On Mon, Dec 26, 2022 at 10:40 AM Nicholas Arnold-Medabalimi < narnoldm at umich.edu> wrote: > Hi Matt > > 1) I'm not sure I follow how to call this. If I insert the VecSetOperation > call I'm not exactly sure what the VecView_Plex is or where it is defined? > Shoot, this cannot be done in Fortran. I will rewrite this step to get it working for you. It should have been done anyway. I cannot do it today since I have to grade finals, but I should have it this week. Thanks, Matt > 2) Otherwise I've solved this problem with the insight you provided into > the local section. Things look good on the ASCII output but if we can > resolve 1 then I think the loop is fully closed and I can just worry about > the fortran translation. > > Thanks again for all your help. > > Sincerely > Nicholas > > > > On Mon, Dec 26, 2022 at 9:37 AM Matthew Knepley wrote: > >> On Mon, Dec 26, 2022 at 3:21 AM Nicholas Arnold-Medabalimi < >> narnoldm at umich.edu> wrote: >> >>> Hi Matt >>> >>> I was able to get this all squared away. It turns out I was initializing >>> the viewer incorrectly?my mistake. However, there is a follow-up question. >>> A while back, we discussed distributing a vector field from an initial DM >>> to a new distributed DM. The way you said to do this was >>> >>> // Distribute the submesh with overlap of 1 >>> DMPlexDistribute(sub_da, overlap, &distributionSF, &sub_da_dist); >>> //Create a vector and section for the distribution >>> VecCreate(PETSC_COMM_WORLD, &state_dist); >>> VecSetDM(state_dist, sub_da_dist); >>> PetscSectionCreate(PETSC_COMM_WORLD, &distSection); >>> DMSetLocalSection(sub_da_dist, distSection); >>> DMPlexDistributeField(sub_da_dist, distributionSF, filteredSection, >>> state_filtered, distSection, state_dist); >>> I've forgone Fortran to debug this all in C and then integrate the >>> function calls into the Fortran code. >>> >>> There are two questions here. >>> >>> 1) How do I associate a vector associated with a DM using VecSetDM to >>> output properly as a VTK? When I call VecView at present, if I call VecView >>> on state_dist, it will not output anything. >>> >> >> This is a problem. The different pieces of interface were added at >> different times. We should really move that manipulation of the function >> table into VecSetDM(). Here is the code: >> >> >> https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/plexcreate.c#L4135 >> >> You can make the overload call yourself for now, until we decide on the >> best fix. >> >> >>> 2) The visualization is nice, but when I look at the Vec of the >>> distributed field using stdout, something isn't distributing correctly, as >>> the vector still has some uninitialized values. This is apparent if I >>> output the original vector and the distributed vector. Examining the inside >>> of DMPlexDistributeField I suspect I'm making a mistake with the sections >>> I'm passing. filtered section in this case is the global section but if I >>> try the local section I get an error so I'm not sure. >>> >> >> These should definitely be local sections. 
Global sections are always >> built after the fact, and building the global section needs the SF that >> indicates what points are shared, not the distribution SF that moves >> points. I need to go back and put in checks that all the arguments are the >> right type. Thanks for bringing that up. Lets track down the error for >> local sections. >> >> Matt >> >> >>> *Original Vector(state_filtered)* >>> Vec Object: Vec_0x84000004_1 2 MPI processes >>> type: mpi >>> Process [0] >>> 101325. >>> 300. >>> 101326. >>> 301. >>> 101341. >>> 316. >>> Process [1] >>> 101325. >>> 300. >>> 101326. >>> 301. >>> 101345. >>> 320. >>> 101497. >>> 472. >>> 101516. >>> 491. >>> *Re-Distributed Vector (state_dist) * >>> Vec Object: 2 MPI processes >>> type: mpi >>> Process [0] >>> 101325. >>> 300. >>> 101326. >>> 301. >>> 101341. >>> 316. >>> 7.90505e-323 >>> 1.97626e-323 >>> 4.30765e-312 >>> 6.91179e-310 >>> Process [1] >>> 101497. >>> 472. >>> 101516. >>> 491. >>> 1.99665e-314 >>> 8.14714e-321 >>> >>> >>> Any insight on distributing this field and correcting the error would be >>> appreciated. >>> >>> Sincerely and Happy Holiday >>> Nicholas >>> >>> >>> >>> On Fri, Dec 23, 2022 at 10:57 AM Matthew Knepley >>> wrote: >>> >>>> On Thu, Dec 22, 2022 at 12:41 AM Nicholas Arnold-Medabalimi < >>>> narnoldm at umich.edu> wrote: >>>> >>>>> Hi Petsc Users >>>>> >>>>> I've been having trouble consistently getting a vector generated from >>>>> a DM to output to VTK correctly. I've used ex1.c (which works properly)to >>>>> try and figure it out, but I'm still having some issues. I must be missing >>>>> something small that isn't correctly associating the section with the DM. >>>>> >>>>> DMPlexGetChart(dm, &p0, &p1); >>>>> PetscSection section_full; >>>>> PetscSectionCreate(PETSC_COMM_WORLD, §ion_full); >>>>> PetscSectionSetNumFields(section_full, 1); >>>>> PetscSectionSetChart(section_full, p0, p1); >>>>> PetscSectionSetFieldName(section_full, 0, "state"); >>>>> >>>>> for (int i = c0; i < c1; i++) >>>>> { >>>>> PetscSectionSetDof(section_full, i, 1); >>>>> PetscSectionSetFieldDof(section_full, i, 0, 1); >>>>> } >>>>> PetscSectionSetUp(section_full); >>>>> DMSetNumFields(dm, 1); >>>>> DMSetLocalSection(dm, section_full); >>>>> DMCreateGlobalVector(dm, &state_full); >>>>> >>>>> int o0, o1; >>>>> VecGetOwnershipRange(state_full, &o0, &o1); >>>>> PetscScalar *state_full_array; >>>>> VecGetArray(state_full, &state_full_array); >>>>> >>>>> for (int i = 0; i < (c1 - c0); i++) >>>>> { >>>>> int offset; >>>>> PetscSectionGetOffset(section_full, i, &offset); >>>>> state_full_array[offset] = 101325 + i; >>>>> } >>>>> >>>>> VecRestoreArray(state_full, &state_full_array); >>>>> >>>>> >>>>> PetscViewerCreate(PETSC_COMM_WORLD, &viewer); >>>>> PetscViewerSetType(viewer, PETSCVIEWERVTK); >>>>> PetscViewerFileSetMode(viewer, FILE_MODE_WRITE); >>>>> PetscViewerFileSetName(viewer, "mesh.vtu"); >>>>> VecView(state_full, viewer); >>>>> >>>>> If I run this mesh.vtu isn't generated at all. If I instead do a >>>>> DMView passing the DM it will just output the mesh correctly. >>>>> >>>>> Any assistance would be greatly appreciated. >>>>> >>>> >>>> DMCreateGlobalVector() dispatches to DMCreateGlobalVector_Plex(), which >>>> resets the view method to VecView_Plex(), which should dispatch to >>>> VecView_Plex_Local_VTK(). You can verify this in the debugger, or send us >>>> code we can run to verify it. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Sincerely >>>>> Nicholas >>>>> >>>>> >>>>> -- >>>>> Nicholas Arnold-Medabalimi >>>>> >>>>> Ph.D. 
Candidate >>>>> Computational Aeroscience Lab >>>>> University of Michigan >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >>> >>> -- >>> Nicholas Arnold-Medabalimi >>> >>> Ph.D. Candidate >>> Computational Aeroscience Lab >>> University of Michigan >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > > > -- > Nicholas Arnold-Medabalimi > > Ph.D. Candidate > Computational Aeroscience Lab > University of Michigan > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From narnoldm at umich.edu Mon Dec 26 09:48:43 2022 From: narnoldm at umich.edu (Nicholas Arnold-Medabalimi) Date: Mon, 26 Dec 2022 10:48:43 -0500 Subject: [petsc-users] Getting a vector from a DM to output VTK In-Reply-To: References: Message-ID: Oh it's not a worry. I'm debugging this first in C++, and once it's working I don't actually need to view what's happening in Fortran when I move over. In my C debugging code. After I create the distribution vector and distribute the field based on your input, I'm adding VecSetOperation(state_dist, VECOP_VIEW, (void(*)(void))VecView_Plex); But I can't compile it because VecView_Plex is undefined. Thanks Sincerely Nicholas On Mon, Dec 26, 2022 at 10:45 AM Matthew Knepley wrote: > On Mon, Dec 26, 2022 at 10:40 AM Nicholas Arnold-Medabalimi < > narnoldm at umich.edu> wrote: > >> Hi Matt >> >> 1) I'm not sure I follow how to call this. If I insert the >> VecSetOperation call I'm not exactly sure what the VecView_Plex is or where >> it is defined? >> > > Shoot, this cannot be done in Fortran. I will rewrite this step to get it > working for you. It should have been done anyway. I cannot do it > today since I have to grade finals, but I should have it this week. > > Thanks, > > Matt > > >> 2) Otherwise I've solved this problem with the insight you provided into >> the local section. Things look good on the ASCII output but if we can >> resolve 1 then I think the loop is fully closed and I can just worry about >> the fortran translation. >> >> Thanks again for all your help. >> >> Sincerely >> Nicholas >> >> >> >> On Mon, Dec 26, 2022 at 9:37 AM Matthew Knepley >> wrote: >> >>> On Mon, Dec 26, 2022 at 3:21 AM Nicholas Arnold-Medabalimi < >>> narnoldm at umich.edu> wrote: >>> >>>> Hi Matt >>>> >>>> I was able to get this all squared away. It turns out I was >>>> initializing the viewer incorrectly?my mistake. However, there is a >>>> follow-up question. A while back, we discussed distributing a vector field >>>> from an initial DM to a new distributed DM. 
The way you said to do this was >>>> >>>> // Distribute the submesh with overlap of 1 >>>> DMPlexDistribute(sub_da, overlap, &distributionSF, &sub_da_dist); >>>> //Create a vector and section for the distribution >>>> VecCreate(PETSC_COMM_WORLD, &state_dist); >>>> VecSetDM(state_dist, sub_da_dist); >>>> PetscSectionCreate(PETSC_COMM_WORLD, &distSection); >>>> DMSetLocalSection(sub_da_dist, distSection); >>>> DMPlexDistributeField(sub_da_dist, distributionSF, filteredSection, >>>> state_filtered, distSection, state_dist); >>>> I've forgone Fortran to debug this all in C and then integrate the >>>> function calls into the Fortran code. >>>> >>>> There are two questions here. >>>> >>>> 1) How do I associate a vector associated with a DM using VecSetDM to >>>> output properly as a VTK? When I call VecView at present, if I call VecView >>>> on state_dist, it will not output anything. >>>> >>> >>> This is a problem. The different pieces of interface were added at >>> different times. We should really move that manipulation of the function >>> table into VecSetDM(). Here is the code: >>> >>> >>> https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/plexcreate.c#L4135 >>> >>> You can make the overload call yourself for now, until we decide on the >>> best fix. >>> >>> >>>> 2) The visualization is nice, but when I look at the Vec of the >>>> distributed field using stdout, something isn't distributing correctly, as >>>> the vector still has some uninitialized values. This is apparent if I >>>> output the original vector and the distributed vector. Examining the inside >>>> of DMPlexDistributeField I suspect I'm making a mistake with the sections >>>> I'm passing. filtered section in this case is the global section but if I >>>> try the local section I get an error so I'm not sure. >>>> >>> >>> These should definitely be local sections. Global sections are always >>> built after the fact, and building the global section needs the SF that >>> indicates what points are shared, not the distribution SF that moves >>> points. I need to go back and put in checks that all the arguments are the >>> right type. Thanks for bringing that up. Lets track down the error for >>> local sections. >>> >>> Matt >>> >>> >>>> *Original Vector(state_filtered)* >>>> Vec Object: Vec_0x84000004_1 2 MPI processes >>>> type: mpi >>>> Process [0] >>>> 101325. >>>> 300. >>>> 101326. >>>> 301. >>>> 101341. >>>> 316. >>>> Process [1] >>>> 101325. >>>> 300. >>>> 101326. >>>> 301. >>>> 101345. >>>> 320. >>>> 101497. >>>> 472. >>>> 101516. >>>> 491. >>>> *Re-Distributed Vector (state_dist) * >>>> Vec Object: 2 MPI processes >>>> type: mpi >>>> Process [0] >>>> 101325. >>>> 300. >>>> 101326. >>>> 301. >>>> 101341. >>>> 316. >>>> 7.90505e-323 >>>> 1.97626e-323 >>>> 4.30765e-312 >>>> 6.91179e-310 >>>> Process [1] >>>> 101497. >>>> 472. >>>> 101516. >>>> 491. >>>> 1.99665e-314 >>>> 8.14714e-321 >>>> >>>> >>>> Any insight on distributing this field and correcting the error would >>>> be appreciated. >>>> >>>> Sincerely and Happy Holiday >>>> Nicholas >>>> >>>> >>>> >>>> On Fri, Dec 23, 2022 at 10:57 AM Matthew Knepley >>>> wrote: >>>> >>>>> On Thu, Dec 22, 2022 at 12:41 AM Nicholas Arnold-Medabalimi < >>>>> narnoldm at umich.edu> wrote: >>>>> >>>>>> Hi Petsc Users >>>>>> >>>>>> I've been having trouble consistently getting a vector generated from >>>>>> a DM to output to VTK correctly. I've used ex1.c (which works properly)to >>>>>> try and figure it out, but I'm still having some issues. 
I must be missing >>>>>> something small that isn't correctly associating the section with the DM. >>>>>> >>>>>> DMPlexGetChart(dm, &p0, &p1); >>>>>> PetscSection section_full; >>>>>> PetscSectionCreate(PETSC_COMM_WORLD, §ion_full); >>>>>> PetscSectionSetNumFields(section_full, 1); >>>>>> PetscSectionSetChart(section_full, p0, p1); >>>>>> PetscSectionSetFieldName(section_full, 0, "state"); >>>>>> >>>>>> for (int i = c0; i < c1; i++) >>>>>> { >>>>>> PetscSectionSetDof(section_full, i, 1); >>>>>> PetscSectionSetFieldDof(section_full, i, 0, 1); >>>>>> } >>>>>> PetscSectionSetUp(section_full); >>>>>> DMSetNumFields(dm, 1); >>>>>> DMSetLocalSection(dm, section_full); >>>>>> DMCreateGlobalVector(dm, &state_full); >>>>>> >>>>>> int o0, o1; >>>>>> VecGetOwnershipRange(state_full, &o0, &o1); >>>>>> PetscScalar *state_full_array; >>>>>> VecGetArray(state_full, &state_full_array); >>>>>> >>>>>> for (int i = 0; i < (c1 - c0); i++) >>>>>> { >>>>>> int offset; >>>>>> PetscSectionGetOffset(section_full, i, &offset); >>>>>> state_full_array[offset] = 101325 + i; >>>>>> } >>>>>> >>>>>> VecRestoreArray(state_full, &state_full_array); >>>>>> >>>>>> >>>>>> PetscViewerCreate(PETSC_COMM_WORLD, &viewer); >>>>>> PetscViewerSetType(viewer, PETSCVIEWERVTK); >>>>>> PetscViewerFileSetMode(viewer, FILE_MODE_WRITE); >>>>>> PetscViewerFileSetName(viewer, "mesh.vtu"); >>>>>> VecView(state_full, viewer); >>>>>> >>>>>> If I run this mesh.vtu isn't generated at all. If I instead do a >>>>>> DMView passing the DM it will just output the mesh correctly. >>>>>> >>>>>> Any assistance would be greatly appreciated. >>>>>> >>>>> >>>>> DMCreateGlobalVector() dispatches to DMCreateGlobalVector_Plex(), >>>>> which resets the view method to VecView_Plex(), which should dispatch to >>>>> VecView_Plex_Local_VTK(). You can verify this in the debugger, or send us >>>>> code we can run to verify it. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Sincerely >>>>>> Nicholas >>>>>> >>>>>> >>>>>> -- >>>>>> Nicholas Arnold-Medabalimi >>>>>> >>>>>> Ph.D. Candidate >>>>>> Computational Aeroscience Lab >>>>>> University of Michigan >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>>> >>>> -- >>>> Nicholas Arnold-Medabalimi >>>> >>>> Ph.D. Candidate >>>> Computational Aeroscience Lab >>>> University of Michigan >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> >> >> -- >> Nicholas Arnold-Medabalimi >> >> Ph.D. Candidate >> Computational Aeroscience Lab >> University of Michigan >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- Nicholas Arnold-Medabalimi Ph.D. Candidate Computational Aeroscience Lab University of Michigan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Mon Dec 26 09:50:29 2022 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 26 Dec 2022 10:50:29 -0500 Subject: [petsc-users] gamg out of memory with gpu In-Reply-To: References: Message-ID: On Mon, Dec 26, 2022 at 10:29 AM Edoardo Centofanti < edoardo.centofanti01 at universitadipavia.it> wrote: > Thank you for your answer. Can you provide me the full path of the example > you have in mind? The one I found does not seem to exploit the algebraic > multigrid, but just the geometric one. > cd $PETSC_DIR/src/snes/tutorials/ex5 ./ex5 -da_grid_x 64 -da_grid_y 64 -mms 3 -pc_type gang and for GPUs I think you need the options to move things over -dm_vec_type cuda -dm_mat_type aijcusparse Thanks, Matt > Thanks, > Edoardo > > Il giorno lun 26 dic 2022 alle ore 15:39 Matthew Knepley < > knepley at gmail.com> ha scritto: > >> On Mon, Dec 26, 2022 at 4:41 AM Edoardo Centofanti < >> edoardo.centofanti01 at universitadipavia.it> wrote: >> >>> Hi PETSc Users, >>> >>> I am experimenting some issues with the GAMG precondtioner when used >>> with GPU. >>> In particular, it seems to go out of memory very easily (around 5000 >>> dofs are enough to make it throw the "[0]PETSC ERROR: cuda error 2 >>> (cudaErrorMemoryAllocation) : out of memory" error). >>> I have these issues both with single and multiple GPUs (on the same or >>> on different nodes). The exact same problems work like a charm with HYPRE >>> BoomerAMG on GPUs. >>> With both preconditioners I exploit the device acceleration by giving >>> the usual command line options "-dm_vec_type cuda" and "-dm_mat_type >>> aijcusparse" (I am working with structured meshes). My PETSc version is >>> 3.17. >>> >>> Is this a known issue of the GAMG preconditioner? >>> >> >> No. Can you get it to do this with a PETSc example? Say SNES ex5? >> >> Thanks, >> >> Matt >> >> >>> Thank you in advance, >>> Edoardo >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Dec 26 09:51:51 2022 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 26 Dec 2022 10:51:51 -0500 Subject: [petsc-users] Getting a vector from a DM to output VTK In-Reply-To: References: Message-ID: On Mon, Dec 26, 2022 at 10:49 AM Nicholas Arnold-Medabalimi < narnoldm at umich.edu> wrote: > Oh it's not a worry. I'm debugging this first in C++, and once it's > working I don't actually need to view what's happening in Fortran when I > move over. In my C debugging code. After I create the distribution vector > and distribute the field based on your input, I'm adding > > VecSetOperation(state_dist, VECOP_VIEW, (void(*)(void))VecView_Plex); > > But I can't compile it because VecView_Plex is undefined. Thanks > You need #include Thanks Matt > Sincerely > Nicholas > > > > On Mon, Dec 26, 2022 at 10:45 AM Matthew Knepley > wrote: > >> On Mon, Dec 26, 2022 at 10:40 AM Nicholas Arnold-Medabalimi < >> narnoldm at umich.edu> wrote: >> >>> Hi Matt >>> >>> 1) I'm not sure I follow how to call this. 
If I insert the >>> VecSetOperation call I'm not exactly sure what the VecView_Plex is or where >>> it is defined? >>> >> >> Shoot, this cannot be done in Fortran. I will rewrite this step to get it >> working for you. It should have been done anyway. I cannot do it >> today since I have to grade finals, but I should have it this week. >> >> Thanks, >> >> Matt >> >> >>> 2) Otherwise I've solved this problem with the insight you provided into >>> the local section. Things look good on the ASCII output but if we can >>> resolve 1 then I think the loop is fully closed and I can just worry about >>> the fortran translation. >>> >>> Thanks again for all your help. >>> >>> Sincerely >>> Nicholas >>> >>> >>> >>> On Mon, Dec 26, 2022 at 9:37 AM Matthew Knepley >>> wrote: >>> >>>> On Mon, Dec 26, 2022 at 3:21 AM Nicholas Arnold-Medabalimi < >>>> narnoldm at umich.edu> wrote: >>>> >>>>> Hi Matt >>>>> >>>>> I was able to get this all squared away. It turns out I was >>>>> initializing the viewer incorrectly?my mistake. However, there is a >>>>> follow-up question. A while back, we discussed distributing a vector field >>>>> from an initial DM to a new distributed DM. The way you said to do this was >>>>> >>>>> // Distribute the submesh with overlap of 1 >>>>> DMPlexDistribute(sub_da, overlap, &distributionSF, &sub_da_dist); >>>>> //Create a vector and section for the distribution >>>>> VecCreate(PETSC_COMM_WORLD, &state_dist); >>>>> VecSetDM(state_dist, sub_da_dist); >>>>> PetscSectionCreate(PETSC_COMM_WORLD, &distSection); >>>>> DMSetLocalSection(sub_da_dist, distSection); >>>>> DMPlexDistributeField(sub_da_dist, distributionSF, filteredSection, >>>>> state_filtered, distSection, state_dist); >>>>> I've forgone Fortran to debug this all in C and then integrate the >>>>> function calls into the Fortran code. >>>>> >>>>> There are two questions here. >>>>> >>>>> 1) How do I associate a vector associated with a DM using VecSetDM to >>>>> output properly as a VTK? When I call VecView at present, if I call VecView >>>>> on state_dist, it will not output anything. >>>>> >>>> >>>> This is a problem. The different pieces of interface were added at >>>> different times. We should really move that manipulation of the function >>>> table into VecSetDM(). Here is the code: >>>> >>>> >>>> https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/plexcreate.c#L4135 >>>> >>>> You can make the overload call yourself for now, until we decide on the >>>> best fix. >>>> >>>> >>>>> 2) The visualization is nice, but when I look at the Vec of the >>>>> distributed field using stdout, something isn't distributing correctly, as >>>>> the vector still has some uninitialized values. This is apparent if I >>>>> output the original vector and the distributed vector. Examining the inside >>>>> of DMPlexDistributeField I suspect I'm making a mistake with the sections >>>>> I'm passing. filtered section in this case is the global section but if I >>>>> try the local section I get an error so I'm not sure. >>>>> >>>> >>>> These should definitely be local sections. Global sections are always >>>> built after the fact, and building the global section needs the SF that >>>> indicates what points are shared, not the distribution SF that moves >>>> points. I need to go back and put in checks that all the arguments are the >>>> right type. Thanks for bringing that up. Lets track down the error for >>>> local sections. 
>>>> >>>> Matt >>>> >>>> >>>>> *Original Vector(state_filtered)* >>>>> Vec Object: Vec_0x84000004_1 2 MPI processes >>>>> type: mpi >>>>> Process [0] >>>>> 101325. >>>>> 300. >>>>> 101326. >>>>> 301. >>>>> 101341. >>>>> 316. >>>>> Process [1] >>>>> 101325. >>>>> 300. >>>>> 101326. >>>>> 301. >>>>> 101345. >>>>> 320. >>>>> 101497. >>>>> 472. >>>>> 101516. >>>>> 491. >>>>> *Re-Distributed Vector (state_dist) * >>>>> Vec Object: 2 MPI processes >>>>> type: mpi >>>>> Process [0] >>>>> 101325. >>>>> 300. >>>>> 101326. >>>>> 301. >>>>> 101341. >>>>> 316. >>>>> 7.90505e-323 >>>>> 1.97626e-323 >>>>> 4.30765e-312 >>>>> 6.91179e-310 >>>>> Process [1] >>>>> 101497. >>>>> 472. >>>>> 101516. >>>>> 491. >>>>> 1.99665e-314 >>>>> 8.14714e-321 >>>>> >>>>> >>>>> Any insight on distributing this field and correcting the error would >>>>> be appreciated. >>>>> >>>>> Sincerely and Happy Holiday >>>>> Nicholas >>>>> >>>>> >>>>> >>>>> On Fri, Dec 23, 2022 at 10:57 AM Matthew Knepley >>>>> wrote: >>>>> >>>>>> On Thu, Dec 22, 2022 at 12:41 AM Nicholas Arnold-Medabalimi < >>>>>> narnoldm at umich.edu> wrote: >>>>>> >>>>>>> Hi Petsc Users >>>>>>> >>>>>>> I've been having trouble consistently getting a vector generated >>>>>>> from a DM to output to VTK correctly. I've used ex1.c (which works >>>>>>> properly)to try and figure it out, but I'm still having some issues. I must >>>>>>> be missing something small that isn't correctly associating the section >>>>>>> with the DM. >>>>>>> >>>>>>> DMPlexGetChart(dm, &p0, &p1); >>>>>>> PetscSection section_full; >>>>>>> PetscSectionCreate(PETSC_COMM_WORLD, §ion_full); >>>>>>> PetscSectionSetNumFields(section_full, 1); >>>>>>> PetscSectionSetChart(section_full, p0, p1); >>>>>>> PetscSectionSetFieldName(section_full, 0, "state"); >>>>>>> >>>>>>> for (int i = c0; i < c1; i++) >>>>>>> { >>>>>>> PetscSectionSetDof(section_full, i, 1); >>>>>>> PetscSectionSetFieldDof(section_full, i, 0, 1); >>>>>>> } >>>>>>> PetscSectionSetUp(section_full); >>>>>>> DMSetNumFields(dm, 1); >>>>>>> DMSetLocalSection(dm, section_full); >>>>>>> DMCreateGlobalVector(dm, &state_full); >>>>>>> >>>>>>> int o0, o1; >>>>>>> VecGetOwnershipRange(state_full, &o0, &o1); >>>>>>> PetscScalar *state_full_array; >>>>>>> VecGetArray(state_full, &state_full_array); >>>>>>> >>>>>>> for (int i = 0; i < (c1 - c0); i++) >>>>>>> { >>>>>>> int offset; >>>>>>> PetscSectionGetOffset(section_full, i, &offset); >>>>>>> state_full_array[offset] = 101325 + i; >>>>>>> } >>>>>>> >>>>>>> VecRestoreArray(state_full, &state_full_array); >>>>>>> >>>>>>> >>>>>>> PetscViewerCreate(PETSC_COMM_WORLD, &viewer); >>>>>>> PetscViewerSetType(viewer, PETSCVIEWERVTK); >>>>>>> PetscViewerFileSetMode(viewer, FILE_MODE_WRITE); >>>>>>> PetscViewerFileSetName(viewer, "mesh.vtu"); >>>>>>> VecView(state_full, viewer); >>>>>>> >>>>>>> If I run this mesh.vtu isn't generated at all. If I instead do a >>>>>>> DMView passing the DM it will just output the mesh correctly. >>>>>>> >>>>>>> Any assistance would be greatly appreciated. >>>>>>> >>>>>> >>>>>> DMCreateGlobalVector() dispatches to DMCreateGlobalVector_Plex(), >>>>>> which resets the view method to VecView_Plex(), which should dispatch to >>>>>> VecView_Plex_Local_VTK(). You can verify this in the debugger, or send us >>>>>> code we can run to verify it. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Sincerely >>>>>>> Nicholas >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Nicholas Arnold-Medabalimi >>>>>>> >>>>>>> Ph.D. 
Candidate >>>>>>> Computational Aeroscience Lab >>>>>>> University of Michigan >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> Nicholas Arnold-Medabalimi >>>>> >>>>> Ph.D. Candidate >>>>> Computational Aeroscience Lab >>>>> University of Michigan >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >>> >>> -- >>> Nicholas Arnold-Medabalimi >>> >>> Ph.D. Candidate >>> Computational Aeroscience Lab >>> University of Michigan >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > > > -- > Nicholas Arnold-Medabalimi > > Ph.D. Candidate > Computational Aeroscience Lab > University of Michigan > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From narnoldm at umich.edu Mon Dec 26 13:37:26 2022 From: narnoldm at umich.edu (Nicholas Arnold-Medabalimi) Date: Mon, 26 Dec 2022 14:37:26 -0500 Subject: [petsc-users] Getting a vector from a DM to output VTK In-Reply-To: References: Message-ID: Hi Thanks so much for all your help. I've gotten all the core tech working in C++ and am working on the Fortran integration. Before I do that, I've been doing some memory checks using Valgrind to ensure everything is acceptable since I've been seeing random memory corruption errors for specific mesh sizes. I'm getting an invalid write memory error associated with a DMCreateGlobalVector call. I presume I have something small in my section assignment causing this, but I would appreciate any insight. This is more or less ripped directly from plex tutorial example 7. 
PetscErrorCode addSectionToDM(DM &dm, Vec &state) { int p0, p1, c0, c1; DMPlexGetHeightStratum(dm, 0, &c0, &c1); DMPlexGetChart(dm, &p0, &p1); PetscSection section_full; PetscSectionCreate(PETSC_COMM_WORLD, §ion_full); PetscSectionSetNumFields(section_full, 2); PetscSectionSetChart(section_full, p0, p1); PetscSectionSetFieldName(section_full, 0, "Pressure"); PetscSectionSetFieldName(section_full, 1, "Temperature"); for (int i = c0; i < c1; i++) { PetscSectionSetDof(section_full, i, 2); PetscSectionSetFieldDof(section_full, i, 0, 1); PetscSectionSetFieldDof(section_full, i, 1, 1); } PetscSectionSetUp(section_full); DMSetNumFields(dm, 2); DMSetLocalSection(dm, section_full); PetscSectionDestroy(§ion_full); DMCreateGlobalVector(dm, &state); return 0; } results in ==12603== Invalid write of size 8 ==12603== at 0x10CD8B: main (redistribute_filter.cpp:254) ==12603== Address 0xb9fe800 is 4,816 bytes inside a block of size 4,820 alloc'd ==12603== at 0x483E0F0: memalign (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) ==12603== by 0x483E212: posix_memalign (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) ==12603== by 0x4C4DAB0: PetscMallocAlign (mal.c:54) ==12603== by 0x4C5262F: PetscTrMallocDefault (mtr.c:186) ==12603== by 0x4C501F7: PetscMallocA (mal.c:420) ==12603== by 0x527E8A9: VecCreate_MPI_Private (pbvec.c:485) ==12603== by 0x527F04F: VecCreate_MPI (pbvec.c:523) ==12603== by 0x53E7097: VecSetType (vecreg.c:89) ==12603== by 0x527FBC8: VecCreate_Standard (pbvec.c:547) ==12603== by 0x53E7097: VecSetType (vecreg.c:89) ==12603== by 0x6CD77C0: DMCreateGlobalVector_Section_Private (dmi.c:58) ==12603== by 0x61D52DB: DMCreateGlobalVector_Plex (plexcreate.c:4130) Sincerely Nicholas On Mon, Dec 26, 2022 at 10:52 AM Matthew Knepley wrote: > On Mon, Dec 26, 2022 at 10:49 AM Nicholas Arnold-Medabalimi < > narnoldm at umich.edu> wrote: > >> Oh it's not a worry. I'm debugging this first in C++, and once it's >> working I don't actually need to view what's happening in Fortran when I >> move over. In my C debugging code. After I create the distribution vector >> and distribute the field based on your input, I'm adding >> >> VecSetOperation(state_dist, VECOP_VIEW, (void(*)(void))VecView_Plex); >> >> But I can't compile it because VecView_Plex is undefined. Thanks >> > > You need > > #include > > Thanks > > Matt > > >> Sincerely >> Nicholas >> >> >> >> On Mon, Dec 26, 2022 at 10:45 AM Matthew Knepley >> wrote: >> >>> On Mon, Dec 26, 2022 at 10:40 AM Nicholas Arnold-Medabalimi < >>> narnoldm at umich.edu> wrote: >>> >>>> Hi Matt >>>> >>>> 1) I'm not sure I follow how to call this. If I insert the >>>> VecSetOperation call I'm not exactly sure what the VecView_Plex is or where >>>> it is defined? >>>> >>> >>> Shoot, this cannot be done in Fortran. I will rewrite this step to get >>> it working for you. It should have been done anyway. I cannot do it >>> today since I have to grade finals, but I should have it this week. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> 2) Otherwise I've solved this problem with the insight you provided >>>> into the local section. Things look good on the ASCII output but if we can >>>> resolve 1 then I think the loop is fully closed and I can just worry about >>>> the fortran translation. >>>> >>>> Thanks again for all your help. 
>>>> >>>> Sincerely >>>> Nicholas >>>> >>>> >>>> >>>> On Mon, Dec 26, 2022 at 9:37 AM Matthew Knepley >>>> wrote: >>>> >>>>> On Mon, Dec 26, 2022 at 3:21 AM Nicholas Arnold-Medabalimi < >>>>> narnoldm at umich.edu> wrote: >>>>> >>>>>> Hi Matt >>>>>> >>>>>> I was able to get this all squared away. It turns out I was >>>>>> initializing the viewer incorrectly?my mistake. However, there is a >>>>>> follow-up question. A while back, we discussed distributing a vector field >>>>>> from an initial DM to a new distributed DM. The way you said to do this was >>>>>> >>>>>> // Distribute the submesh with overlap of 1 >>>>>> DMPlexDistribute(sub_da, overlap, &distributionSF, &sub_da_dist); >>>>>> //Create a vector and section for the distribution >>>>>> VecCreate(PETSC_COMM_WORLD, &state_dist); >>>>>> VecSetDM(state_dist, sub_da_dist); >>>>>> PetscSectionCreate(PETSC_COMM_WORLD, &distSection); >>>>>> DMSetLocalSection(sub_da_dist, distSection); >>>>>> DMPlexDistributeField(sub_da_dist, distributionSF, >>>>>> filteredSection, state_filtered, distSection, state_dist); >>>>>> I've forgone Fortran to debug this all in C and then integrate the >>>>>> function calls into the Fortran code. >>>>>> >>>>>> There are two questions here. >>>>>> >>>>>> 1) How do I associate a vector associated with a DM using VecSetDM to >>>>>> output properly as a VTK? When I call VecView at present, if I call VecView >>>>>> on state_dist, it will not output anything. >>>>>> >>>>> >>>>> This is a problem. The different pieces of interface were added at >>>>> different times. We should really move that manipulation of the function >>>>> table into VecSetDM(). Here is the code: >>>>> >>>>> >>>>> https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/plexcreate.c#L4135 >>>>> >>>>> You can make the overload call yourself for now, until we decide on >>>>> the best fix. >>>>> >>>>> >>>>>> 2) The visualization is nice, but when I look at the Vec of the >>>>>> distributed field using stdout, something isn't distributing correctly, as >>>>>> the vector still has some uninitialized values. This is apparent if I >>>>>> output the original vector and the distributed vector. Examining the inside >>>>>> of DMPlexDistributeField I suspect I'm making a mistake with the sections >>>>>> I'm passing. filtered section in this case is the global section but if I >>>>>> try the local section I get an error so I'm not sure. >>>>>> >>>>> >>>>> These should definitely be local sections. Global sections are always >>>>> built after the fact, and building the global section needs the SF that >>>>> indicates what points are shared, not the distribution SF that moves >>>>> points. I need to go back and put in checks that all the arguments are the >>>>> right type. Thanks for bringing that up. Lets track down the error for >>>>> local sections. >>>>> >>>>> Matt >>>>> >>>>> >>>>>> *Original Vector(state_filtered)* >>>>>> Vec Object: Vec_0x84000004_1 2 MPI processes >>>>>> type: mpi >>>>>> Process [0] >>>>>> 101325. >>>>>> 300. >>>>>> 101326. >>>>>> 301. >>>>>> 101341. >>>>>> 316. >>>>>> Process [1] >>>>>> 101325. >>>>>> 300. >>>>>> 101326. >>>>>> 301. >>>>>> 101345. >>>>>> 320. >>>>>> 101497. >>>>>> 472. >>>>>> 101516. >>>>>> 491. >>>>>> *Re-Distributed Vector (state_dist) * >>>>>> Vec Object: 2 MPI processes >>>>>> type: mpi >>>>>> Process [0] >>>>>> 101325. >>>>>> 300. >>>>>> 101326. >>>>>> 301. >>>>>> 101341. >>>>>> 316. >>>>>> 7.90505e-323 >>>>>> 1.97626e-323 >>>>>> 4.30765e-312 >>>>>> 6.91179e-310 >>>>>> Process [1] >>>>>> 101497. >>>>>> 472. 
>>>>>> 101516. >>>>>> 491. >>>>>> 1.99665e-314 >>>>>> 8.14714e-321 >>>>>> >>>>>> >>>>>> Any insight on distributing this field and correcting the error would >>>>>> be appreciated. >>>>>> >>>>>> Sincerely and Happy Holiday >>>>>> Nicholas >>>>>> >>>>>> >>>>>> >>>>>> On Fri, Dec 23, 2022 at 10:57 AM Matthew Knepley >>>>>> wrote: >>>>>> >>>>>>> On Thu, Dec 22, 2022 at 12:41 AM Nicholas Arnold-Medabalimi < >>>>>>> narnoldm at umich.edu> wrote: >>>>>>> >>>>>>>> Hi Petsc Users >>>>>>>> >>>>>>>> I've been having trouble consistently getting a vector generated >>>>>>>> from a DM to output to VTK correctly. I've used ex1.c (which works >>>>>>>> properly)to try and figure it out, but I'm still having some issues. I must >>>>>>>> be missing something small that isn't correctly associating the section >>>>>>>> with the DM. >>>>>>>> >>>>>>>> DMPlexGetChart(dm, &p0, &p1); >>>>>>>> PetscSection section_full; >>>>>>>> PetscSectionCreate(PETSC_COMM_WORLD, §ion_full); >>>>>>>> PetscSectionSetNumFields(section_full, 1); >>>>>>>> PetscSectionSetChart(section_full, p0, p1); >>>>>>>> PetscSectionSetFieldName(section_full, 0, "state"); >>>>>>>> >>>>>>>> for (int i = c0; i < c1; i++) >>>>>>>> { >>>>>>>> PetscSectionSetDof(section_full, i, 1); >>>>>>>> PetscSectionSetFieldDof(section_full, i, 0, 1); >>>>>>>> } >>>>>>>> PetscSectionSetUp(section_full); >>>>>>>> DMSetNumFields(dm, 1); >>>>>>>> DMSetLocalSection(dm, section_full); >>>>>>>> DMCreateGlobalVector(dm, &state_full); >>>>>>>> >>>>>>>> int o0, o1; >>>>>>>> VecGetOwnershipRange(state_full, &o0, &o1); >>>>>>>> PetscScalar *state_full_array; >>>>>>>> VecGetArray(state_full, &state_full_array); >>>>>>>> >>>>>>>> for (int i = 0; i < (c1 - c0); i++) >>>>>>>> { >>>>>>>> int offset; >>>>>>>> PetscSectionGetOffset(section_full, i, &offset); >>>>>>>> state_full_array[offset] = 101325 + i; >>>>>>>> } >>>>>>>> >>>>>>>> VecRestoreArray(state_full, &state_full_array); >>>>>>>> >>>>>>>> >>>>>>>> PetscViewerCreate(PETSC_COMM_WORLD, &viewer); >>>>>>>> PetscViewerSetType(viewer, PETSCVIEWERVTK); >>>>>>>> PetscViewerFileSetMode(viewer, FILE_MODE_WRITE); >>>>>>>> PetscViewerFileSetName(viewer, "mesh.vtu"); >>>>>>>> VecView(state_full, viewer); >>>>>>>> >>>>>>>> If I run this mesh.vtu isn't generated at all. If I instead do a >>>>>>>> DMView passing the DM it will just output the mesh correctly. >>>>>>>> >>>>>>>> Any assistance would be greatly appreciated. >>>>>>>> >>>>>>> >>>>>>> DMCreateGlobalVector() dispatches to DMCreateGlobalVector_Plex(), >>>>>>> which resets the view method to VecView_Plex(), which should dispatch to >>>>>>> VecView_Plex_Local_VTK(). You can verify this in the debugger, or send us >>>>>>> code we can run to verify it. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> Sincerely >>>>>>>> Nicholas >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Nicholas Arnold-Medabalimi >>>>>>>> >>>>>>>> Ph.D. Candidate >>>>>>>> Computational Aeroscience Lab >>>>>>>> University of Michigan >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Nicholas Arnold-Medabalimi >>>>>> >>>>>> Ph.D. 
Candidate >>>>>> Computational Aeroscience Lab >>>>>> University of Michigan >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>>> >>>> -- >>>> Nicholas Arnold-Medabalimi >>>> >>>> Ph.D. Candidate >>>> Computational Aeroscience Lab >>>> University of Michigan >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> >> >> -- >> Nicholas Arnold-Medabalimi >> >> Ph.D. Candidate >> Computational Aeroscience Lab >> University of Michigan >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- Nicholas Arnold-Medabalimi Ph.D. Candidate Computational Aeroscience Lab University of Michigan -------------- next part -------------- An HTML attachment was scrubbed... URL: From berend.vanwachem at ovgu.de Tue Dec 27 04:21:16 2022 From: berend.vanwachem at ovgu.de (Berend van Wachem) Date: Tue, 27 Dec 2022 11:21:16 +0100 Subject: [petsc-users] Write/read a DMForest to/from a file Message-ID: <23156021-e431-1780-4415-3fdfa64d34bf@ovgu.de> Dear Petsc-team/users, I am trying to save a DMForest which has been used for a calculation (so refined/coarsened in places) to a file and later read it from the file to continue a calculation with. My question is: how do I do this? I've tried a few things, such as using DMView and DMLoad, and saving the BaseDM, but that doesn't seem to save the refining/coarsening which has taken place in the initial calculation. I haven't been able to find any example or documentation for this. Any pointers or examples would be very much appreciated! Best regards, Berend. From carl-johan.thore at liu.se Wed Dec 28 04:55:54 2022 From: carl-johan.thore at liu.se (Carl-Johan Thore) Date: Wed, 28 Dec 2022 10:55:54 +0000 Subject: [petsc-users] Extracting reduced stiffness matrix In-Reply-To: <23156021-e431-1780-4415-3fdfa64d34bf@ovgu.de> References: <23156021-e431-1780-4415-3fdfa64d34bf@ovgu.de> Message-ID: Hi, I'm trying to set up and solve the "reduced system" obtained by omitting all rows and columns associated with zero-prescribed unknowns from the global stiffness matrix K. In Matlab-notation: u = K(freedofs,freedofs)\F(freedofs) where freedofs are the free unknowns (the purpose is to compare this method with using MatZeroRowsCols to solve the "full system" for problems with many prescribed unknowns (does anyone have benchmark data?)). I've tried various ways to accomplish the same thing in PETSc. The main difficulty seems to be the construction of the IS freedofs (see code sketch below) from the vector NN for use in MatCreateSubMatrix and VecGetSubVector. I've managed to do this but not in such a way that it works for arbitrary partitionings of the mesh (DMDA). I think that something along the lines of the sketch below might be the "correct" way of doing it, but I suspect there is some issue with ghost values which causes freedofs to be slightly incorrect (if imported into Matlab I can see that there are a few duplicate values for example). 
Any suggestions? Kind regards, Carl-Johan Code sketch: IS freedofs; DMCreateGlobalVector(da, &NN); // Some code to populate NN // NN[i] = 0.0 if DOF i is prescribed // NN[i] = 1.0 if DOF i is free // ... // Get local version of NN DMCreateLocalVector(da, &NNloc); DMGlobalToLocalBegin(da, NN, INSERT_VALUES, NNloc); DMGlobalToLocalEnd(da, NN, INSERT_VALUES, NNloc); VecGetArray(NNloc, &nnlocarr); // Get global indices from local ISLocalToGlobalMapping ltogm; const PetscInt *g_idx; PetscInt locsize; DMGetLocalToGlobalMapping(da, <ogm); VecGetLocalToGlobalMapping(NN, <ogm); ISLocalToGlobalMappingGetIndices(ltogm, &g_idx); ISLocalToGlobalMappingGetSize(ltogm, &locsize); PetscScalar nfreedofs; VecSum(NN, &nfreedofs); PetscInt idx[(PetscInt) nfreedofs]; PetscInt n=0; for (PetscInt i=0; i Hi Petsc Users I've been working with vectors generated from a DM and getting some odd memory errors. Using Valgrind, I have been able to trace the issue to DMCreateGlobalVector. I've reduced the code to a relatively simple routine (very similar to example 7) and attached it. I suspect the issue comes down to something improperly set in the section. The code, when integrated, will run correctly 10-30% of the time and otherwise give a memory corruption error. Any insight on the issue or possible error on my part would be appreciated. Using Valgrind, I get the following error. ==27064== Invalid write of size 8 ==27064== at 0x10C91E: main (section_vector_build.cpp:184) ==27064== Address 0xc4aa248 is 4 bytes after a block of size 3,204 alloc'd ==27064== at 0x483E0F0: memalign (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) ==27064== by 0x483E212: posix_memalign (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) ==27064== by 0x4C4DAB0: PetscMallocAlign (mal.c:54) ==27064== by 0x4C5262F: PetscTrMallocDefault (mtr.c:186) ==27064== by 0x4C501F7: PetscMallocA (mal.c:420) ==27064== by 0x527E8A9: VecCreate_MPI_Private (pbvec.c:485) ==27064== by 0x527F04F: VecCreate_MPI (pbvec.c:523) ==27064== by 0x53E7097: VecSetType (vecreg.c:89) ==27064== by 0x527FBC8: VecCreate_Standard (pbvec.c:547) ==27064== by 0x53E7097: VecSetType (vecreg.c:89) ==27064== by 0x6CD77C0: DMCreateGlobalVector_Section_Private (dmi.c:58) ==27064== by 0x61D52DB: DMCreateGlobalVector_Plex (plexcreate.c:4130) -- Nicholas Arnold-Medabalimi Ph.D. Candidate Computational Aeroscience Lab University of Michigan -------------- next part -------------- An HTML attachment was scrubbed... 
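For reference, the fix that emerges later in this thread is to take offsets from the global section rather than the local one and to subtract the vector's ownership range before indexing the array from VecGetArray. A rough sketch of that pattern, reusing the names from the attached example below (dm, state_full, and the cell range c0..c1 from DMPlexGetHeightStratum), is:

PetscSection gsection;
PetscScalar *arr;
PetscInt     rstart, rend, offset;

DMGetGlobalSection(dm, &gsection);
VecGetOwnershipRange(state_full, &rstart, &rend);
VecGetArray(state_full, &arr);
for (int i = c0; i < c1; i++)
{
    PetscSectionGetOffset(gsection, i, &offset);
    if (offset < 0) continue;              /* cell owned by another rank */
    arr[offset - rstart]     = 100000 + i; /* Pressure dof    */
    arr[offset - rstart + 1] = 300 + i;    /* Temperature dof */
}
VecRestoreArray(state_full, &arr);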
URL: -------------- next part -------------- #include #include #include #include #include #include #include #include "hdf5.h" #define PI 3.14159265 static char help[] = "Show vector build seg fault error\n\n"; int rank, size; PetscErrorCode ViewDM(DM V, std::string filename) { PetscViewer viewer; PetscInt ierr; ierr = PetscViewerCreate(PETSC_COMM_WORLD, &viewer); ierr = PetscViewerSetType(viewer, PETSCVIEWERVTK); ierr = PetscViewerFileSetMode(viewer, FILE_MODE_WRITE); ierr = PetscViewerFileSetName(viewer, filename.c_str()); DMView(V, viewer); ierr = PetscViewerDestroy(&viewer); return 0; } PetscErrorCode ViewVec(Vec V, std::string filename) { PetscViewer viewer; PetscInt ierr; ierr = PetscViewerCreate(PETSC_COMM_WORLD, &viewer); ierr = PetscViewerSetType(viewer, PETSCVIEWERVTK); ierr = PetscViewerFileSetMode(viewer, FILE_MODE_WRITE); ierr = PetscViewerFileSetName(viewer, filename.c_str()); VecView(V, viewer); ierr = PetscViewerDestroy(&viewer); return 0; } PetscErrorCode addSectionToDM(DM &dm, Vec &state) { int ierr; int p0, p1, c0, c1; DMPlexGetHeightStratum(dm, 0, &c0, &c1); DMPlexGetChart(dm, &p0, &p1); PetscSection section_full; PetscSectionCreate(PETSC_COMM_WORLD, §ion_full); PetscSectionSetNumFields(section_full, 2); PetscSectionSetChart(section_full, p0, p1); PetscSectionSetFieldComponents(section_full, 0, 1); PetscSectionSetFieldComponents(section_full, 1, 1); PetscSectionSetFieldName(section_full, 0, "Pressure"); PetscSectionSetFieldName(section_full, 1, "Temperature"); for (int i = c0; i < c1; i++) { PetscSectionSetDof(section_full, i, 2); PetscSectionSetFieldDof(section_full, i, 0, 1); PetscSectionSetFieldDof(section_full, i, 1, 1); } PetscSectionSetUp(section_full); DMSetNumFields(dm, 2); DMSetLocalSection(dm, section_full); //PetscSectionView(section_full, PETSC_VIEWER_STDOUT_WORLD); PetscSectionDestroy(§ion_full); ierr=DMCreateGlobalVector(dm, &state); CHKERRQ(ierr); return 0; } int main(int argc, char **argv) { PetscErrorCode ierr; // Creates Petsc world communicator just like MPI, No need for additional MPI call. 
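// Walkthrough of main(): build a 2D box mesh with DMPlexCreateBoxMesh,
// distribute it with the requested cell overlap, attach the two-field
// (Pressure, Temperature) cell-wise section via addSectionToDM, fill the
// resulting global vector using offsets from the *local* section (which is
// most likely the write the Valgrind report above points at), and finally
// write the vector to mesh_vec.vtu.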
ierr = PetscInitialize(&argc, &argv, (char *)0, help); MPI_Comm_size(PETSC_COMM_WORLD, &size); MPI_Comm_rank(PETSC_COMM_WORLD, &rank); if (!rank) { std::cout << "Running on " << size << " processors" << std::endl; std::cout << help << std::endl; } DM dm, dmDist; PetscInt i, dim = 2, overlap = 1; PetscInt faces[dim]; /// Create 2D mesh // -meshdim will set the square dimension of the mesh // overlap sets the number of cells around each partition faces[0] = 20; PetscOptionsGetInt(NULL, NULL, "-meshdim", &(faces[0]), NULL); PetscOptionsGetInt(NULL, NULL, "-overlap", &overlap, NULL); faces[1] = faces[0]; PetscSF distributionSF; PetscBool simplex = PETSC_FALSE, dmInterped = PETSC_TRUE; PetscReal lower[2] = {0.0, 0.0}; PetscReal upper[2] = {2.0, 2.0}; ierr = DMPlexCreateBoxMesh(PETSC_COMM_WORLD, dim, simplex, faces, lower, upper, /* periodicity */ NULL, dmInterped, &dm); CHKERRQ(ierr); PetscPrintf(PETSC_COMM_WORLD, "Mesh created\n"); double t1 = 0, t2 = 0; t1 = MPI_Wtime(); ierr = DMPlexDistribute(dm, overlap, &distributionSF, &dmDist); t2 = MPI_Wtime(); CHKERRQ(ierr); if (dmDist) { ierr = DMDestroy(&dm); CHKERRQ(ierr); dm = dmDist; } PetscPrintf(PETSC_COMM_WORLD, "Mesh distributed\n"); PetscInt c0, c1, f0, f1, e0, e1, v0, v1, p0, p1; if (dim == 2) { DMPlexGetHeightStratum(dm, 0, &c0, &c1); DMPlexGetHeightStratum(dm, 1, &e0, &e1); DMPlexGetHeightStratum(dm, 2, &v0, &v1); } PetscSynchronizedPrintf(PETSC_COMM_WORLD, "Rank %d\n", rank); PetscSynchronizedPrintf(PETSC_COMM_WORLD, "Cells: p0:%d ,p1:%d \n", c0, c1); PetscSynchronizedPrintf(PETSC_COMM_WORLD, "Edges: p0:%d ,p1:%d \n", e0, e1); PetscSynchronizedPrintf(PETSC_COMM_WORLD, "Vertices: p0:%d ,p1:%d \n", v0, v1); PetscSynchronizedFlush(PETSC_COMM_WORLD, NULL); Vec state_full; addSectionToDM(dm, state_full); int o0, o1; VecGetOwnershipRange(state_full, &o0, &o1); PetscScalar *state_full_array; VecGetArray(state_full, &state_full_array); PetscSection section_full; DMGetLocalSection(dm, §ion_full); for (int i = c0; i < c1; i++) { int offset; PetscSectionGetOffset(section_full, i, &offset); //PetscSynchronizedPrintf(PETSC_COMM_WORLD, "Rank %d, Cell %d, Offset %d\n", rank, i, offset); state_full_array[offset] = 100000 + i + rank * 100; state_full_array[offset + 1] = 300 + i + rank * 100; } //PetscSynchronizedFlush (PETSC_COMM_WORLD, NULL); VecRestoreArray(state_full, &state_full_array); ViewVec(state_full, "mesh_vec.vtu"); ierr = PetscFinalize(); return ierr; } From bsmith at petsc.dev Wed Dec 28 14:50:41 2022 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 28 Dec 2022 15:50:41 -0500 Subject: [petsc-users] Global Vector Generated from Section memory error In-Reply-To: References: Message-ID: This is a mysterious stack. It is inside memalign() that Valgrind has found the code is accessing memory outside of the block size allocated, but memalign() is presumably the routine that is in the middle of the process of doing the allocation! This could indicate some (undetected) memory corruption has occurred earlier in the run thus memalign() has corrupted data structures. I presume this is the first warning message? You can try running without Valgrind but with the PETSc option -malloc_debug and see if that detects corruption more clearly. Barry > On Dec 28, 2022, at 2:10 PM, Nicholas Arnold-Medabalimi wrote: > > Hi Petsc Users > > I've been working with vectors generated from a DM and getting some odd memory errors. Using Valgrind, I have been able to trace the issue to DMCreateGlobalVector. 
I've reduced the code to a relatively simple routine (very similar to example 7) and attached it. I suspect the issue comes down to something improperly set in the section. The code, when integrated, will run correctly 10-30% of the time and otherwise give a memory corruption error. > > Any insight on the issue or possible error on my part would be appreciated. > > > Using Valgrind, I get the following error. > > ==27064== Invalid write of size 8 > ==27064== at 0x10C91E: main (section_vector_build.cpp:184) > ==27064== Address 0xc4aa248 is 4 bytes after a block of size 3,204 alloc'd > ==27064== at 0x483E0F0: memalign (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) > ==27064== by 0x483E212: posix_memalign (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) > ==27064== by 0x4C4DAB0: PetscMallocAlign (mal.c:54) > ==27064== by 0x4C5262F: PetscTrMallocDefault (mtr.c:186) > ==27064== by 0x4C501F7: PetscMallocA (mal.c:420) > ==27064== by 0x527E8A9: VecCreate_MPI_Private (pbvec.c:485) > ==27064== by 0x527F04F: VecCreate_MPI (pbvec.c:523) > ==27064== by 0x53E7097: VecSetType (vecreg.c:89) > ==27064== by 0x527FBC8: VecCreate_Standard (pbvec.c:547) > ==27064== by 0x53E7097: VecSetType (vecreg.c:89) > ==27064== by 0x6CD77C0: DMCreateGlobalVector_Section_Private (dmi.c:58) > ==27064== by 0x61D52DB: DMCreateGlobalVector_Plex (plexcreate.c:4130) > > > > -- > Nicholas Arnold-Medabalimi > > Ph.D. Candidate > Computational Aeroscience Lab > University of Michigan > -------------- next part -------------- An HTML attachment was scrubbed... URL: From narnoldm at umich.edu Wed Dec 28 15:50:45 2022 From: narnoldm at umich.edu (Nicholas Arnold-Medabalimi) Date: Wed, 28 Dec 2022 16:50:45 -0500 Subject: [petsc-users] Global Vector Generated from Section memory error In-Reply-To: References: Message-ID: Hi Thanks for the help. This was quite confusing, and there is quite an exciting matrix of test results, but I think this is resolved. Basically, I was playing fast and loose with using a local section to get offsets for a global vector. When I instead used a global section and subtracted the ownership range from the offset, everything seemed to be working okay. What remains a mystery is why was the stack trace pointing to the GlobalVectorCreate instead of the assignment statement to the Scalar array grabbed using VecGetArray? Thanks Nicholas On Wed, Dec 28, 2022 at 3:51 PM Barry Smith wrote: > > This is a mysterious stack. It is inside memalign() that Valgrind has > found the code is accessing memory outside of the block size allocated, but > memalign() is presumably the routine that is in the middle of the process > of doing the allocation! This could indicate some (undetected) memory > corruption has occurred earlier in the run thus memalign() has corrupted > data structures. I presume this is the first warning message? > > You can try running without Valgrind but with the PETSc option > -malloc_debug and see if that detects corruption more clearly. > > Barry > > > On Dec 28, 2022, at 2:10 PM, Nicholas Arnold-Medabalimi < > narnoldm at umich.edu> wrote: > > Hi Petsc Users > > I've been working with vectors generated from a DM and getting some odd > memory errors. Using Valgrind, I have been able to trace the issue to > DMCreateGlobalVector. I've reduced the code to a relatively simple routine > (very similar to example 7) and attached it. I suspect the issue comes down > to something improperly set in the section. 
The code, when integrated, will > run correctly 10-30% of the time and otherwise give a memory corruption > error. > > Any insight on the issue or possible error on my part would be > appreciated. > > > Using Valgrind, I get the following error. > > ==27064== Invalid write of size 8 > ==27064== at 0x10C91E: main (section_vector_build.cpp:184) > ==27064== Address 0xc4aa248 is 4 bytes after a block of size 3,204 alloc'd > ==27064== at 0x483E0F0: memalign (in > /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) > ==27064== by 0x483E212: posix_memalign (in > /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) > ==27064== by 0x4C4DAB0: PetscMallocAlign (mal.c:54) > ==27064== by 0x4C5262F: PetscTrMallocDefault (mtr.c:186) > ==27064== by 0x4C501F7: PetscMallocA (mal.c:420) > ==27064== by 0x527E8A9: VecCreate_MPI_Private (pbvec.c:485) > ==27064== by 0x527F04F: VecCreate_MPI (pbvec.c:523) > ==27064== by 0x53E7097: VecSetType (vecreg.c:89) > ==27064== by 0x527FBC8: VecCreate_Standard (pbvec.c:547) > ==27064== by 0x53E7097: VecSetType (vecreg.c:89) > ==27064== by 0x6CD77C0: DMCreateGlobalVector_Section_Private (dmi.c:58) > ==27064== by 0x61D52DB: DMCreateGlobalVector_Plex (plexcreate.c:4130) > > > > -- > Nicholas Arnold-Medabalimi > > Ph.D. Candidate > Computational Aeroscience Lab > University of Michigan > > > > -- Nicholas Arnold-Medabalimi Ph.D. Candidate Computational Aeroscience Lab University of Michigan -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu Dec 29 14:54:37 2022 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 29 Dec 2022 15:54:37 -0500 Subject: [petsc-users] Write/read a DMForest to/from a file In-Reply-To: <23156021-e431-1780-4415-3fdfa64d34bf@ovgu.de> References: <23156021-e431-1780-4415-3fdfa64d34bf@ovgu.de> Message-ID: Have you tried using the DMForest as you would use a DMPLex? That should work. Forst has extra stuff but it just makes a Plex and the end of the day. On Tue, Dec 27, 2022 at 5:21 AM Berend van Wachem wrote: > Dear Petsc-team/users, > > I am trying to save a DMForest which has been used for a calculation (so > refined/coarsened in places) to a file and later read it from the file > to continue a calculation with. > My question is: how do I do this? I've tried a few things, such as using > DMView and DMLoad, and saving the BaseDM, but that doesn't seem to save > the refining/coarsening which has taken place in the initial calculation. > I haven't been able to find any example or documentation for this. Any > pointers or examples would be very much appreciated! > > Best regards, Berend. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksi2443 at gmail.com Fri Dec 30 03:36:19 2022 From: ksi2443 at gmail.com (=?UTF-8?B?6rmA7ISx7J21?=) Date: Fri, 30 Dec 2022 18:36:19 +0900 Subject: [petsc-users] Question-Memory of matsetvalue Message-ID: Hello, I have a question about memory of matsetvalue. When I assembly the local matrix to global matrix, I?m just using matsetvalue. Because the connectivity is so complex I can?t use matsetvalues. I asked this question because I was curious about how ram memory is allocated differently for the two simtuations below. First situation. The local matrix size is 24 by 24. And the number of local matrix is 125,000. When assembly procedure by using matsetvalue, memory allocation does not increase. So I just put Matassemblybegin and matassemblyend after all matsetvalue. Second situation. 
The local matrix size is 60 by 60. And the number of local matrix is 27,000. When assembly procedure by using matsetvalue, memory allocation does increase. So I just put Matassemblybegin and matassemblyend after each local matrix assembly. This did not increase the memory further.. Why this situation is happen? And is there any way to prevent the memory allocation from increasing? Thanks, Hyung Kim -------------- next part -------------- An HTML attachment was scrubbed... URL: From berend.vanwachem at ovgu.de Fri Dec 30 04:59:06 2022 From: berend.vanwachem at ovgu.de (Berend van Wachem) Date: Fri, 30 Dec 2022 11:59:06 +0100 Subject: [petsc-users] Write/read a DMForest to/from a file In-Reply-To: References: <23156021-e431-1780-4415-3fdfa64d34bf@ovgu.de> Message-ID: Dear Mark, Yes, I have tried that. That will only work if I convert the DMForest to a DMPlex first, and then write the DMPlex to file. But then, I cannot read in a DMForest when I want to continue the calculations later on. Below is the code I use to write a DMPlex to file. When I call this routine with a DMForest, I get the error: [0]PETSC ERROR: Invalid argument [0]PETSC ERROR: Wrong type of object: Parameter # 1 [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.18.2, Nov 28, 2022 [0]PETSC ERROR: ./bin/write_periodic on a linux-gcc-dev named multiflow.multiflow.org by berend Fri Dec 30 11:54:51 2022 [0]PETSC ERROR: Configure options --with-clean --download-metis=yes --download-parmetis=yes --download-hdf5 --download-p4est --download-triangle --download-tetgen --with-zlib-lib=/usr/lib64/libz.a --with-zlib-include=/usr/include --with-mpi=yes --with-mpi-dir=/usr --with-mpiexec=/usr/bin/mpiexec --download-slepc=yes --download-fftw=yes [0]PETSC ERROR: #1 PetscSectionGetChart() at /usr/local/petsc-3.18.2/src/vec/is/section/interface/section.c:592 [0]PETSC ERROR: #2 DMPlexGetChart() at /usr/local/petsc-3.18.2/src/dm/impls/plex/plex.c:2702 [0]PETSC ERROR: #3 DMPlexGetDepthStratum() at /usr/local/petsc-3.18.2/src/dm/impls/plex/plex.c:4882 [0]PETSC ERROR: #4 DMPlexCreatePointNumbering() at /usr/local/petsc-3.18.2/src/dm/impls/plex/plex.c:8303 [0]PETSC ERROR: #5 DMPlexTopologyView() at /usr/local/petsc-3.18.2/src/dm/impls/plex/plex.c:1854 [0]PETSC ERROR: #6 WriteDMAndVectortoHDF5File() at /home/berend/git/Code/petsc-dmplex-restart-test/src/createandwrite.c:176 [0]PETSC ERROR: #7 main() at /home/berend/git/Code/petsc-dmplex-restart-test/src/createandwrite.c:555 The code I use to write is: PetscErrorCode WriteDMAndVectortoHDF5File(DM dm, Vec *DataVector, const char *HDF5file) { PetscViewer H5Viewer; PetscErrorCode ierr; DM dmCopy; PetscSection sectionCopy; PetscInt i; PetscScalar *xVecArray; PetscInt numPoints, numPointsCopy; Vec vectorCopy; PetscScalar *array; PetscFunctionBegin; ierr = PetscViewerHDF5Open(PETSC_COMM_WORLD, HDF5file, FILE_MODE_WRITE, &H5Viewer); CHKERRQ(ierr); ierr = PetscObjectSetName((PetscObject) dm, "DM"); CHKERRQ(ierr); ierr = PetscViewerPushFormat(H5Viewer, PETSC_VIEWER_HDF5_PETSC); CHKERRQ(ierr); ierr = DMPlexTopologyView(dm, H5Viewer); CHKERRQ(ierr); ierr = DMPlexLabelsView(dm, H5Viewer); CHKERRQ(ierr); ierr = DMPlexCoordinatesView(dm, H5Viewer); CHKERRQ(ierr); ierr = PetscViewerPopFormat(H5Viewer); CHKERRQ(ierr); ierr = DMClone(dm, &dmCopy); CHKERRQ(ierr); ierr = PetscObjectSetName((PetscObject) dmCopy, "DM"); CHKERRQ(ierr); ierr = DMGetLocalSection(dm, §ionCopy); CHKERRQ(ierr); ierr = DMSetLocalSection(dmCopy, sectionCopy); CHKERRQ(ierr); /* Write 
the section to the file */ ierr = DMPlexSectionView(dm, H5Viewer, dmCopy); CHKERRQ(ierr); ierr = DMGetGlobalVector(dmCopy, &vectorCopy); CHKERRQ(ierr); /*** We have to copy the vector into the new vector ... ***/ ierr = VecGetArray(vectorCopy, &array); CHKERRQ(ierr); ierr = VecGetLocalSize(*DataVector, &numPoints); CHKERRQ(ierr); ierr = VecGetLocalSize(vectorCopy, &numPointsCopy); CHKERRQ(ierr); assert(numPoints == numPointsCopy); ierr = VecGetArray(*DataVector, &xVecArray); CHKERRQ(ierr); for (i = 0; i < numPoints; i++) /* Loop over all internal cells */ { array[i] = xVecArray[i]; } ierr = VecRestoreArray(vectorCopy, &array); CHKERRQ(ierr); ierr = VecRestoreArray(*DataVector, &xVecArray); CHKERRQ(ierr); ierr = PetscObjectSetName((PetscObject) vectorCopy, "DataVector"); CHKERRQ(ierr); /* Write the vector to the file */ ierr = DMPlexGlobalVectorView(dm, H5Viewer, dmCopy, vectorCopy); CHKERRQ(ierr); /* Close the file */ ierr = PetscViewerDestroy(&H5Viewer); CHKERRQ(ierr); ierr = DMDestroy(&dmCopy); CHKERRQ(ierr); /*** End of writing ****/ PetscFunctionReturn(0); } On 29/12/2022 21:54, Mark Adams wrote: > Have you tried using the DMForest as you would use a DMPLex? > That should work. Forst has extra stuff but it just makes a Plex and the > end of the day. > > On Tue, Dec 27, 2022 at 5:21 AM Berend van Wachem > > wrote: > > Dear Petsc-team/users, > > I am trying to save a DMForest which has been used for a calculation > (so > refined/coarsened in places) to a file and later read it from the file > to continue a calculation with. > My question is: how do I do this? I've tried a few things, such as > using > DMView and DMLoad, and saving the BaseDM, but that doesn't seem to save > the refining/coarsening which has taken place in the initial > calculation. > I haven't been able to find any example or documentation for this. Any > pointers or examples would be very much appreciated! > > Best regards, Berend. > From knepley at gmail.com Fri Dec 30 08:27:26 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 30 Dec 2022 09:27:26 -0500 Subject: [petsc-users] Question-Memory of matsetvalue In-Reply-To: References: Message-ID: On Fri, Dec 30, 2022 at 4:36 AM ??? wrote: > Hello, > > > > I have a question about memory of matsetvalue. > > When I assembly the local matrix to global matrix, I?m just using > matsetvalue. > Because the connectivity is so complex I can?t use matsetvalues. > > I asked this question because I was curious about how ram memory is > allocated differently for the two simtuations below. > > > > First situation. > > The local matrix size is 24 by 24. And the number of local matrix is > 125,000. > > When assembly procedure by using matsetvalue, memory allocation does not > increase. > So I just put Matassemblybegin and matassemblyend after all matsetvalue. > > > > Second situation. > > The local matrix size is 60 by 60. And the number of local matrix is > 27,000. > > When assembly procedure by using matsetvalue, memory allocation does > increase. > > So I just put Matassemblybegin and matassemblyend after each local matrix > assembly. > This did not increase the memory further.. > > > > Why this situation is happen? > > And is there any way to prevent the memory allocation from increasing? > Matrix assembly has two stages. First you insert entries into the local portion of your parallel matrix using MatSetValue(s). If all values you try to insert are local, this is the end. However, if you try to insert values that are local to another process, we store those values. 
When you call MatAssemblyBegin/End(), we communicate them to the correct process and insert. For a scalable code, you should insert most values on the correct process. If not, significant memory can be consumed storing these values. Anywhere in the assembly process you can call MatAssemblyBegin(A, MAT_ASSEMBLY_FLUSH); MatAssemblyEnd(A, MAT_ASSEMBLY_FLUSH); This will communicate the cache of values, but not end the assembly. Thanks, Matt > Thanks, > > Hyung Kim > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Dec 30 08:29:13 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 30 Dec 2022 09:29:13 -0500 Subject: [petsc-users] Write/read a DMForest to/from a file In-Reply-To: References: <23156021-e431-1780-4415-3fdfa64d34bf@ovgu.de> Message-ID: On Fri, Dec 30, 2022 at 5:59 AM Berend van Wachem wrote: > Dear Mark, > > Yes, I have tried that. That will only work if I convert the DMForest to > a DMPlex first, and then write the DMPlex to file. But then, I cannot > read in a DMForest when I want to continue the calculations later on. > > Below is the code I use to write a DMPlex to file. When I call this > routine with a DMForest, I get the error: > This is true. I am going to have to look at this with Toby. I will get back to you. THanks, Matt > [0]PETSC ERROR: Invalid argument > [0]PETSC ERROR: Wrong type of object: Parameter # 1 > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.18.2, Nov 28, 2022 > [0]PETSC ERROR: ./bin/write_periodic on a linux-gcc-dev named > multiflow.multiflow.org by berend Fri Dec 30 11:54:51 2022 > [0]PETSC ERROR: Configure options --with-clean --download-metis=yes > --download-parmetis=yes --download-hdf5 --download-p4est > --download-triangle --download-tetgen --with-zlib-lib=/usr/lib64/libz.a > --with-zlib-include=/usr/include --with-mpi=yes --with-mpi-dir=/usr > --with-mpiexec=/usr/bin/mpiexec --download-slepc=yes --download-fftw=yes > [0]PETSC ERROR: #1 PetscSectionGetChart() at > /usr/local/petsc-3.18.2/src/vec/is/section/interface/section.c:592 > [0]PETSC ERROR: #2 DMPlexGetChart() at > /usr/local/petsc-3.18.2/src/dm/impls/plex/plex.c:2702 > [0]PETSC ERROR: #3 DMPlexGetDepthStratum() at > /usr/local/petsc-3.18.2/src/dm/impls/plex/plex.c:4882 > [0]PETSC ERROR: #4 DMPlexCreatePointNumbering() at > /usr/local/petsc-3.18.2/src/dm/impls/plex/plex.c:8303 > [0]PETSC ERROR: #5 DMPlexTopologyView() at > /usr/local/petsc-3.18.2/src/dm/impls/plex/plex.c:1854 > [0]PETSC ERROR: #6 WriteDMAndVectortoHDF5File() at > /home/berend/git/Code/petsc-dmplex-restart-test/src/createandwrite.c:176 > [0]PETSC ERROR: #7 main() at > /home/berend/git/Code/petsc-dmplex-restart-test/src/createandwrite.c:555 > > The code I use to write is: > > PetscErrorCode WriteDMAndVectortoHDF5File(DM dm, Vec *DataVector, const > char *HDF5file) > { > PetscViewer H5Viewer; > PetscErrorCode ierr; > DM dmCopy; > PetscSection sectionCopy; > PetscInt i; > PetscScalar *xVecArray; > PetscInt numPoints, numPointsCopy; > Vec vectorCopy; > PetscScalar *array; > > PetscFunctionBegin; > ierr = PetscViewerHDF5Open(PETSC_COMM_WORLD, HDF5file, > FILE_MODE_WRITE, &H5Viewer); > CHKERRQ(ierr); > ierr = PetscObjectSetName((PetscObject) dm, "DM"); > CHKERRQ(ierr); > ierr = 
PetscViewerPushFormat(H5Viewer, PETSC_VIEWER_HDF5_PETSC); > CHKERRQ(ierr); > ierr = DMPlexTopologyView(dm, H5Viewer); > CHKERRQ(ierr); > ierr = DMPlexLabelsView(dm, H5Viewer); > CHKERRQ(ierr); > ierr = DMPlexCoordinatesView(dm, H5Viewer); > CHKERRQ(ierr); > ierr = PetscViewerPopFormat(H5Viewer); > CHKERRQ(ierr); > > ierr = DMClone(dm, &dmCopy); > CHKERRQ(ierr); > ierr = PetscObjectSetName((PetscObject) dmCopy, "DM"); > CHKERRQ(ierr); > ierr = DMGetLocalSection(dm, §ionCopy); > CHKERRQ(ierr); > ierr = DMSetLocalSection(dmCopy, sectionCopy); > CHKERRQ(ierr); > > /* Write the section to the file */ > ierr = DMPlexSectionView(dm, H5Viewer, dmCopy); > CHKERRQ(ierr); > > ierr = DMGetGlobalVector(dmCopy, &vectorCopy); > CHKERRQ(ierr); > > /*** We have to copy the vector into the new vector ... ***/ > ierr = VecGetArray(vectorCopy, &array); > CHKERRQ(ierr); > ierr = VecGetLocalSize(*DataVector, &numPoints); > CHKERRQ(ierr); > ierr = VecGetLocalSize(vectorCopy, &numPointsCopy); > CHKERRQ(ierr); > assert(numPoints == numPointsCopy); > ierr = VecGetArray(*DataVector, &xVecArray); > CHKERRQ(ierr); > > for (i = 0; i < numPoints; i++) /* Loop over all internal cells */ > { > array[i] = xVecArray[i]; > } > > ierr = VecRestoreArray(vectorCopy, &array); > CHKERRQ(ierr); > ierr = VecRestoreArray(*DataVector, &xVecArray); > CHKERRQ(ierr); > > ierr = PetscObjectSetName((PetscObject) vectorCopy, "DataVector"); > CHKERRQ(ierr); > > /* Write the vector to the file */ > ierr = DMPlexGlobalVectorView(dm, H5Viewer, dmCopy, vectorCopy); > CHKERRQ(ierr); > > /* Close the file */ > ierr = PetscViewerDestroy(&H5Viewer); > CHKERRQ(ierr); > > ierr = DMDestroy(&dmCopy); > CHKERRQ(ierr); > /*** End of writing ****/ > PetscFunctionReturn(0); > } > > > > > > On 29/12/2022 21:54, Mark Adams wrote: > > Have you tried using the DMForest as you would use a DMPLex? > > That should work. Forst has extra stuff but it just makes a Plex and the > > end of the day. > > > > On Tue, Dec 27, 2022 at 5:21 AM Berend van Wachem > > > wrote: > > > > Dear Petsc-team/users, > > > > I am trying to save a DMForest which has been used for a calculation > > (so > > refined/coarsened in places) to a file and later read it from the > file > > to continue a calculation with. > > My question is: how do I do this? I've tried a few things, such as > > using > > DMView and DMLoad, and saving the BaseDM, but that doesn't seem to > save > > the refining/coarsening which has taken place in the initial > > calculation. > > I haven't been able to find any example or documentation for this. > Any > > pointers or examples would be very much appreciated! > > > > Best regards, Berend. > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Fri Dec 30 08:30:28 2022 From: jed at jedbrown.org (Jed Brown) Date: Fri, 30 Dec 2022 07:30:28 -0700 Subject: [petsc-users] Question-Memory of matsetvalue In-Reply-To: References: Message-ID: <87sfgx7ym3.fsf@jedbrown.org> This is what I'd expect to observe if you didn't preallocate correctly for the second matrix, which has more nonzeros per row. https://petsc.org/release/docs/manual/mat/#sec-matsparse ??? writes: > Hello, > > > > I have a question about memory of matsetvalue. > > When I assembly the local matrix to global matrix, I?m just using > matsetvalue. 
> Because the connectivity is so complex I can?t use matsetvalues. > > I asked this question because I was curious about how ram memory is > allocated differently for the two simtuations below. > > > > First situation. > > The local matrix size is 24 by 24. And the number of local matrix is > 125,000. > > When assembly procedure by using matsetvalue, memory allocation does not > increase. > So I just put Matassemblybegin and matassemblyend after all matsetvalue. > > > > Second situation. > > The local matrix size is 60 by 60. And the number of local matrix is 27,000. > > When assembly procedure by using matsetvalue, memory allocation does > increase. > > So I just put Matassemblybegin and matassemblyend after each local matrix > assembly. > This did not increase the memory further.. > > > > Why this situation is happen? > > And is there any way to prevent the memory allocation from increasing? > > > > > > > > Thanks, > > Hyung Kim From mfadams at lbl.gov Fri Dec 30 16:46:34 2022 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 30 Dec 2022 17:46:34 -0500 Subject: [petsc-users] Write/read a DMForest to/from a file In-Reply-To: References: <23156021-e431-1780-4415-3fdfa64d34bf@ovgu.de> Message-ID: Oh, right, you need to convert to a Plex and then write. I've never read a Forest but you will read in a Plex and I would think just convert to a Forest. Mark On Fri, Dec 30, 2022 at 9:29 AM Matthew Knepley wrote: > On Fri, Dec 30, 2022 at 5:59 AM Berend van Wachem < > berend.vanwachem at ovgu.de> wrote: > >> Dear Mark, >> >> Yes, I have tried that. That will only work if I convert the DMForest to >> a DMPlex first, and then write the DMPlex to file. But then, I cannot >> read in a DMForest when I want to continue the calculations later on. >> >> Below is the code I use to write a DMPlex to file. When I call this >> routine with a DMForest, I get the error: >> > > This is true. I am going to have to look at this with Toby. I will get > back to you. > > THanks, > > Matt > > >> [0]PETSC ERROR: Invalid argument >> [0]PETSC ERROR: Wrong type of object: Parameter # 1 >> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
>> [0]PETSC ERROR: Petsc Release Version 3.18.2, Nov 28, 2022 >> [0]PETSC ERROR: ./bin/write_periodic on a linux-gcc-dev named >> multiflow.multiflow.org by berend Fri Dec 30 11:54:51 2022 >> [0]PETSC ERROR: Configure options --with-clean --download-metis=yes >> --download-parmetis=yes --download-hdf5 --download-p4est >> --download-triangle --download-tetgen --with-zlib-lib=/usr/lib64/libz.a >> --with-zlib-include=/usr/include --with-mpi=yes --with-mpi-dir=/usr >> --with-mpiexec=/usr/bin/mpiexec --download-slepc=yes --download-fftw=yes >> [0]PETSC ERROR: #1 PetscSectionGetChart() at >> /usr/local/petsc-3.18.2/src/vec/is/section/interface/section.c:592 >> [0]PETSC ERROR: #2 DMPlexGetChart() at >> /usr/local/petsc-3.18.2/src/dm/impls/plex/plex.c:2702 >> [0]PETSC ERROR: #3 DMPlexGetDepthStratum() at >> /usr/local/petsc-3.18.2/src/dm/impls/plex/plex.c:4882 >> [0]PETSC ERROR: #4 DMPlexCreatePointNumbering() at >> /usr/local/petsc-3.18.2/src/dm/impls/plex/plex.c:8303 >> [0]PETSC ERROR: #5 DMPlexTopologyView() at >> /usr/local/petsc-3.18.2/src/dm/impls/plex/plex.c:1854 >> [0]PETSC ERROR: #6 WriteDMAndVectortoHDF5File() at >> /home/berend/git/Code/petsc-dmplex-restart-test/src/createandwrite.c:176 >> [0]PETSC ERROR: #7 main() at >> /home/berend/git/Code/petsc-dmplex-restart-test/src/createandwrite.c:555 >> >> The code I use to write is: >> >> PetscErrorCode WriteDMAndVectortoHDF5File(DM dm, Vec *DataVector, const >> char *HDF5file) >> { >> PetscViewer H5Viewer; >> PetscErrorCode ierr; >> DM dmCopy; >> PetscSection sectionCopy; >> PetscInt i; >> PetscScalar *xVecArray; >> PetscInt numPoints, numPointsCopy; >> Vec vectorCopy; >> PetscScalar *array; >> >> PetscFunctionBegin; >> ierr = PetscViewerHDF5Open(PETSC_COMM_WORLD, HDF5file, >> FILE_MODE_WRITE, &H5Viewer); >> CHKERRQ(ierr); >> ierr = PetscObjectSetName((PetscObject) dm, "DM"); >> CHKERRQ(ierr); >> ierr = PetscViewerPushFormat(H5Viewer, PETSC_VIEWER_HDF5_PETSC); >> CHKERRQ(ierr); >> ierr = DMPlexTopologyView(dm, H5Viewer); >> CHKERRQ(ierr); >> ierr = DMPlexLabelsView(dm, H5Viewer); >> CHKERRQ(ierr); >> ierr = DMPlexCoordinatesView(dm, H5Viewer); >> CHKERRQ(ierr); >> ierr = PetscViewerPopFormat(H5Viewer); >> CHKERRQ(ierr); >> >> ierr = DMClone(dm, &dmCopy); >> CHKERRQ(ierr); >> ierr = PetscObjectSetName((PetscObject) dmCopy, "DM"); >> CHKERRQ(ierr); >> ierr = DMGetLocalSection(dm, §ionCopy); >> CHKERRQ(ierr); >> ierr = DMSetLocalSection(dmCopy, sectionCopy); >> CHKERRQ(ierr); >> >> /* Write the section to the file */ >> ierr = DMPlexSectionView(dm, H5Viewer, dmCopy); >> CHKERRQ(ierr); >> >> ierr = DMGetGlobalVector(dmCopy, &vectorCopy); >> CHKERRQ(ierr); >> >> /*** We have to copy the vector into the new vector ... 
***/ >> ierr = VecGetArray(vectorCopy, &array); >> CHKERRQ(ierr); >> ierr = VecGetLocalSize(*DataVector, &numPoints); >> CHKERRQ(ierr); >> ierr = VecGetLocalSize(vectorCopy, &numPointsCopy); >> CHKERRQ(ierr); >> assert(numPoints == numPointsCopy); >> ierr = VecGetArray(*DataVector, &xVecArray); >> CHKERRQ(ierr); >> >> for (i = 0; i < numPoints; i++) /* Loop over all internal cells */ >> { >> array[i] = xVecArray[i]; >> } >> >> ierr = VecRestoreArray(vectorCopy, &array); >> CHKERRQ(ierr); >> ierr = VecRestoreArray(*DataVector, &xVecArray); >> CHKERRQ(ierr); >> >> ierr = PetscObjectSetName((PetscObject) vectorCopy, "DataVector"); >> CHKERRQ(ierr); >> >> /* Write the vector to the file */ >> ierr = DMPlexGlobalVectorView(dm, H5Viewer, dmCopy, vectorCopy); >> CHKERRQ(ierr); >> >> /* Close the file */ >> ierr = PetscViewerDestroy(&H5Viewer); >> CHKERRQ(ierr); >> >> ierr = DMDestroy(&dmCopy); >> CHKERRQ(ierr); >> /*** End of writing ****/ >> PetscFunctionReturn(0); >> } >> >> >> >> >> >> On 29/12/2022 21:54, Mark Adams wrote: >> > Have you tried using the DMForest as you would use a DMPLex? >> > That should work. Forst has extra stuff but it just makes a Plex and >> the >> > end of the day. >> > >> > On Tue, Dec 27, 2022 at 5:21 AM Berend van Wachem >> > > wrote: >> > >> > Dear Petsc-team/users, >> > >> > I am trying to save a DMForest which has been used for a calculation >> > (so >> > refined/coarsened in places) to a file and later read it from the >> file >> > to continue a calculation with. >> > My question is: how do I do this? I've tried a few things, such as >> > using >> > DMView and DMLoad, and saving the BaseDM, but that doesn't seem to >> save >> > the refining/coarsening which has taken place in the initial >> > calculation. >> > I haven't been able to find any example or documentation for this. >> Any >> > pointers or examples would be very much appreciated! >> > >> > Best regards, Berend. >> > >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From narnoldm at umich.edu Fri Dec 30 18:51:52 2022 From: narnoldm at umich.edu (Nicholas Arnold-Medabalimi) Date: Fri, 30 Dec 2022 19:51:52 -0500 Subject: [petsc-users] DMLabel Views Message-ID: Hi Petsc Users I'm in the process of tagging cells in a DMPlex for identification before a mesh filter. I'm trying to debug some issues related to my tagging metrics that is only appearing for more complex meshes. This makes using the ASCII DMLabelView a little tedious to parse and I was wondering if there is a convenient way to visualize DMLabels on the mesh. At least, from what I can tell, I can't send the DMLabel into a VTK file. I'm considering making a 1 DOF vector and just copying the Label values into it and visualizing that, but I wanted to ask if there is a more convenient way. Sincerely Nicholas -- Nicholas Arnold-Medabalimi Ph.D. Candidate Computational Aeroscience Lab University of Michigan -------------- next part -------------- An HTML attachment was scrubbed... 
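A minimal sketch of the 1-DOF-per-cell workaround mentioned above, copying the label values into a cell vector that the VTK viewer can write, might look like the following; the label name "tag", the DMPlex being called dm, and a one-dof-per-cell section already attached to it (as in the earlier threads) are assumptions:

DMLabel      label;
Vec          labelVec;
PetscSection gsec;
PetscScalar *arr;
PetscInt     rstart, c0, c1, off, val;

DMGetLabel(dm, "tag", &label);
DMCreateGlobalVector(dm, &labelVec);
PetscObjectSetName((PetscObject)labelVec, "tag");
DMGetGlobalSection(dm, &gsec);
VecGetOwnershipRange(labelVec, &rstart, NULL);
DMPlexGetHeightStratum(dm, 0, &c0, &c1);
VecGetArray(labelVec, &arr);
for (PetscInt c = c0; c < c1; ++c)
{
    PetscSectionGetOffset(gsec, c, &off);
    if (off < 0) continue;           /* cell owned by another rank */
    DMLabelGetValue(label, c, &val); /* -1 if the cell carries no label value */
    arr[off - rstart] = (PetscScalar)val;
}
VecRestoreArray(labelVec, &arr);
/* VecView(labelVec, vtkViewer) then writes it alongside the mesh */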
URL: From knepley at gmail.com Fri Dec 30 19:29:18 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 30 Dec 2022 20:29:18 -0500 Subject: [petsc-users] DMLabel Views In-Reply-To: References: Message-ID: On Fri, Dec 30, 2022 at 7:52 PM Nicholas Arnold-Medabalimi < narnoldm at umich.edu> wrote: > Hi Petsc Users > > I'm in the process of tagging cells in a DMPlex for identification before > a mesh filter. I'm trying to debug some issues related to my tagging > metrics that is only appearing for more complex meshes. This makes using > the ASCII DMLabelView a little tedious to parse and I was wondering if > there is a convenient way to visualize DMLabels on the mesh. At least, from > what I can tell, I can't send the DMLabel into a VTK file. > > I'm considering making a 1 DOF vector and just copying the Label values > into it and visualizing that, but I wanted to ask if there is a more > convenient way. > There is not. Maybe we should automate that. Thanks Matt > Sincerely > Nicholas > > > > > > -- > Nicholas Arnold-Medabalimi > > Ph.D. Candidate > Computational Aeroscience Lab > University of Michigan > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From davidsal at buffalo.edu Fri Dec 30 18:10:20 2022 From: davidsal at buffalo.edu (David Salac) Date: Fri, 30 Dec 2022 19:10:20 -0500 Subject: [petsc-users] Write/read a DMForest to/from a file In-Reply-To: References: <23156021-e431-1780-4415-3fdfa64d34bf@ovgu.de> Message-ID: <5928aace-3d47-a864-7b8a-b0f3631dbd0a@buffalo.edu> Matt and I had a student who was working with Forest that had this issue. The problem is that once you convert the refined Forest to Plex and save it to a file you lose all of the hierarchical information when you load it again. When you load it the mesh basically becomes the base mesh, with no information about the coarser levels. It wasn't a huge deal for us so we basically saved the original base PLEX and refined down to what we needed during a restart. Since we were working with a mesh that wasn't time-evolving this wasn't an issue, but obviously this isn't generalizable. Dave Salac On 12/30/22 17:46, Mark Adams wrote: > Oh, right, you need to convert to a Plex and then write. > > I've never read a Forest but you will read in a Plex and I would think > just convert to a Forest. > > Mark > > On Fri, Dec 30, 2022 at 9:29 AM Matthew Knepley wrote: > > On Fri, Dec 30, 2022 at 5:59 AM Berend van Wachem > wrote: > > Dear Mark, > > Yes, I have tried that. That will only work if I convert the > DMForest to > a DMPlex first, and then write the DMPlex to file. But then, I > cannot > read in a DMForest when I want to continue the calculations > later on. > > Below is the code I use to write a DMPlex to file. When I call > this > routine with a DMForest, I get the error: > > > This is true. I am going to have to look at this with Toby. I will > get back to you. > > ? THanks, > > ? ? ?Matt > > [0]PETSC ERROR: Invalid argument > [0]PETSC ERROR: Wrong type of object: Parameter # 1 > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble > shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.18.2, Nov 28, 2022 > [0]PETSC ERROR: ./bin/write_periodic on a linux-gcc-dev named > multiflow.multiflow.org by > berend Fri Dec 30 11:54:51 2022 > [0]PETSC ERROR: Configure options --with-clean > --download-metis=yes > --download-parmetis=yes --download-hdf5 --download-p4est > --download-triangle --download-tetgen > --with-zlib-lib=/usr/lib64/libz.a > --with-zlib-include=/usr/include --with-mpi=yes > --with-mpi-dir=/usr > --with-mpiexec=/usr/bin/mpiexec --download-slepc=yes > --download-fftw=yes > [0]PETSC ERROR: #1 PetscSectionGetChart() at > /usr/local/petsc-3.18.2/src/vec/is/section/interface/section.c:592 > [0]PETSC ERROR: #2 DMPlexGetChart() at > /usr/local/petsc-3.18.2/src/dm/impls/plex/plex.c:2702 > [0]PETSC ERROR: #3 DMPlexGetDepthStratum() at > /usr/local/petsc-3.18.2/src/dm/impls/plex/plex.c:4882 > [0]PETSC ERROR: #4 DMPlexCreatePointNumbering() at > /usr/local/petsc-3.18.2/src/dm/impls/plex/plex.c:8303 > [0]PETSC ERROR: #5 DMPlexTopologyView() at > /usr/local/petsc-3.18.2/src/dm/impls/plex/plex.c:1854 > [0]PETSC ERROR: #6 WriteDMAndVectortoHDF5File() at > /home/berend/git/Code/petsc-dmplex-restart-test/src/createandwrite.c:176 > [0]PETSC ERROR: #7 main() at > /home/berend/git/Code/petsc-dmplex-restart-test/src/createandwrite.c:555 > > The code I use to write is: > > PetscErrorCode WriteDMAndVectortoHDF5File(DM dm, Vec > *DataVector, const > char *HDF5file) > { > ? ?PetscViewer H5Viewer; > ? ?PetscErrorCode ierr; > ? ?DM dmCopy; > ? ?PetscSection sectionCopy; > ? ?PetscInt i; > ? ?PetscScalar *xVecArray; > ? ?PetscInt numPoints, numPointsCopy; > ? ?Vec vectorCopy; > ? ?PetscScalar *array; > > ? ?PetscFunctionBegin; > ? ?ierr = PetscViewerHDF5Open(PETSC_COMM_WORLD, HDF5file, > FILE_MODE_WRITE, &H5Viewer); > ? ?CHKERRQ(ierr); > ? ?ierr = PetscObjectSetName((PetscObject) dm, "DM"); > ? ?CHKERRQ(ierr); > ? ?ierr = PetscViewerPushFormat(H5Viewer, > PETSC_VIEWER_HDF5_PETSC); > ? ?CHKERRQ(ierr); > ? ?ierr = DMPlexTopologyView(dm, H5Viewer); > ? ?CHKERRQ(ierr); > ? ?ierr = DMPlexLabelsView(dm, H5Viewer); > ? ?CHKERRQ(ierr); > ? ?ierr = DMPlexCoordinatesView(dm, H5Viewer); > ? ?CHKERRQ(ierr); > ? ?ierr = PetscViewerPopFormat(H5Viewer); > ? ?CHKERRQ(ierr); > > ? ?ierr = DMClone(dm, &dmCopy); > ? ?CHKERRQ(ierr); > ? ?ierr = PetscObjectSetName((PetscObject) dmCopy, "DM"); > ? ?CHKERRQ(ierr); > ? ?ierr = DMGetLocalSection(dm, §ionCopy); > ? ?CHKERRQ(ierr); > ? ?ierr = DMSetLocalSection(dmCopy, sectionCopy); > ? ?CHKERRQ(ierr); > > ? ?/* Write the section to the file */ > ? ?ierr = DMPlexSectionView(dm, H5Viewer, dmCopy); > ? ?CHKERRQ(ierr); > > ? ?ierr = DMGetGlobalVector(dmCopy, &vectorCopy); > ? ?CHKERRQ(ierr); > > ? ?/*** We have to copy the vector into the new vector ... ***/ > ? ?ierr = VecGetArray(vectorCopy, &array); > ? ?CHKERRQ(ierr); > ? ?ierr = VecGetLocalSize(*DataVector, &numPoints); > ? ?CHKERRQ(ierr); > ? ?ierr = VecGetLocalSize(vectorCopy, &numPointsCopy); > ? ?CHKERRQ(ierr); > ? ?assert(numPoints == numPointsCopy); > ? ?ierr = VecGetArray(*DataVector, &xVecArray); > ? ?CHKERRQ(ierr); > > ? ?for (i = 0; i < numPoints; i++) /* Loop over all internal > cells */ > ? ?{ > ? ? ?array[i] = xVecArray[i]; > ? ?} > > ? ?ierr = VecRestoreArray(vectorCopy, &array); > ? ?CHKERRQ(ierr); > ? ?ierr = VecRestoreArray(*DataVector, &xVecArray); > ? ?CHKERRQ(ierr); > > ? ?ierr = PetscObjectSetName((PetscObject) vectorCopy, > "DataVector"); > ? ?CHKERRQ(ierr); > > ? ?/* Write the vector to the file */ > ? 
?ierr = DMPlexGlobalVectorView(dm, H5Viewer, dmCopy, > vectorCopy); > ? ?CHKERRQ(ierr); > > ? ?/* Close the file */ > ? ?ierr = PetscViewerDestroy(&H5Viewer); > ? ?CHKERRQ(ierr); > > ? ?ierr = DMDestroy(&dmCopy); > ? ?CHKERRQ(ierr); > ? ?/*** End of writing ****/ > ? ?PetscFunctionReturn(0); > } > > > > > > On 29/12/2022 21:54, Mark Adams wrote: > > Have you tried using the DMForest as you would use a DMPLex? > > That should work. Forst has extra stuff but it just makes a > Plex and the > > end of the day. > > > > On Tue, Dec 27, 2022 at 5:21 AM Berend van Wachem > > > > wrote: > > > >? ? ?Dear Petsc-team/users, > > > >? ? ?I am trying to save a DMForest which has been used for a > calculation > >? ? ?(so > >? ? ?refined/coarsened in places) to a file and later read it > from the file > >? ? ?to continue a calculation with. > >? ? ?My question is: how do I do this? I've tried a few > things, such as > >? ? ?using > >? ? ?DMView and DMLoad, and saving the BaseDM, but that > doesn't seem to save > >? ? ?the refining/coarsening which has taken place in the initial > >? ? ?calculation. > >? ? ?I haven't been able to find any example or documentation > for this. Any > >? ? ?pointers or examples would be very much appreciated! > > > >? ? ?Best regards, Berend. > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to > which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- David Salac Associate Professor Director of Graduate Studies Mechanical and Aerospace Engineering University at Buffalo www.buffalo.edu/~davidsal (716)645-1460 -------------- next part -------------- An HTML attachment was scrubbed... URL:
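As a closing sketch of the restart workaround described above (save the base Plex, then rebuild and re-refine the forest when restarting), reconstructing a p4est forest from an already-loaded base DMPlex might look roughly like this; the name base and the uniform refinement level are assumptions, and adaptive refinement history is not recovered this way:

DM forest;

DMCreate(PETSC_COMM_WORLD, &forest);
DMSetType(forest, DMP4EST);              /* DMP8EST for a 3D base mesh */
DMForestSetBaseDM(forest, base);         /* base = the DMPlex read back from the file */
DMForestSetInitialRefinement(forest, 2); /* illustrative uniform refinement level */
DMSetFromOptions(forest);
DMSetUp(forest);
/* convert back to a Plex whenever a Plex view is needed:
   DMConvert(forest, DMPLEX, &plex);      */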