From myriam.peyrounette at idris.fr Fri Mar 1 09:52:55 2019 From: myriam.peyrounette at idris.fr (Myriam Peyrounette) Date: Fri, 1 Mar 2019 16:52:55 +0100 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 Message-ID: Hi, I used to run my code with PETSc 3.6. Since I upgraded the PETSc version to 3.10, this code has shown poor memory scaling. To report this issue, I took the PETSc script ex42.c and slightly modified it so that the KSP and PC configurations are the same as in my code. In particular, I use a "personalised" multi-grid method. The modifications are indicated by the keyword "TopBridge" in the attached scripts. To plot the memory (weak) scaling, I ran four calculations for each script with increasing problem sizes and numbers of compute cores: 1. 100,000 elts on 4 cores 2. 1 million elts on 40 cores 3. 10 million elts on 400 cores 4. 100 million elts on 4,000 cores The resulting graph is also attached. The scaling using PETSc 3.10 clearly deteriorates for the large cases, while the one using PETSc 3.6 is robust. After a few tests, I found that the scaling is mostly sensitive to the use of the AMG method for the coarse grid (line 1780 in main_ex42_petsc36.cc). In particular, the performance strongly deteriorates when commenting out lines 1777 to 1790 (in main_ex42_petsc36.cc). Do you have any idea of what changed between version 3.6 and version 3.10 that could cause such a degradation? Let me know if you need further information. Best, Myriam Peyrounette -- Myriam Peyrounette CNRS/IDRIS - HLST -- -------------- next part -------------- A non-text attachment was scrubbed... Name: main_ex42_petsc36.cc Type: text/x-c++src Size: 89014 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: main_ex42_petsc310.cc Type: text/x-c++src Size: 89260 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: scaling_ex42.png Type: image/png Size: 24427 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2975 bytes Desc: S/MIME cryptographic signature URL: From sajidsyed2021 at u.northwestern.edu Fri Mar 1 13:00:53 2019 From: sajidsyed2021 at u.northwestern.edu (Sajid Ali) Date: Fri, 1 Mar 2019 13:00:53 -0600 Subject: [petsc-users] Direct PETSc to use MCDRAM on KNL and other optimizations for KNL In-Reply-To: References: <27578CF7-FF30-4461-9B30-3BE5B41585C8@anl.gov> Message-ID: Hi Hong, So, the speedup was coming from increased DRAM bandwidth and not the usage of MCDRAM. There is moderate MPI imbalance, large amount of Back-End stalls and good vectorization. I'm attaching my submit script, PETSc log file and Intel APS summary (all as non-HTML text). I can give more detailed analysis via Intel Vtune if needed. Thank You, Sajid Ali Applied Physics Northwestern University -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: submit_script Type: application/octet-stream Size: 951 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: intel_aps_report Type: application/octet-stream Size: 4731 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed...
Name: knl_petsc Type: application/octet-stream Size: 28126 bytes Desc: not available URL: From knepley at gmail.com Fri Mar 1 19:27:57 2019 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 1 Mar 2019 20:27:57 -0500 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: References: Message-ID: On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi, > > I used to run my code with PETSc 3.6. Since I upgraded the PETSc version > to 3.10, this code has a bad memory scaling. > > To report this issue, I took the PETSc script ex42.c and slightly > modified it so that the KSP and PC configurations are the same as in my > code. In particular, I use a "personnalised" multi-grid method. The > modifications are indicated by the keyword "TopBridge" in the attached > scripts. > > To plot the memory (weak) scaling, I ran four calculations for each > script with increasing problem sizes and computations cores: > > 1. 100,000 elts on 4 cores > 2. 1 million elts on 40 cores > 3. 10 millions elts on 400 cores > 4. 100 millions elts on 4,000 cores > > The resulting graph is also attached. The scaling using PETSc 3.10 > clearly deteriorates for large cases, while the one using PETSc 3.6 is > robust. > > After a few tests, I found that the scaling is mostly sensitive to the > use of the AMG method for the coarse grid (line 1780 in > main_ex42_petsc36.cc). In particular, the performance strongly > deteriorates when commenting lines 1777 to 1790 (in main_ex42_petsc36.cc). > > Do you have any idea of what changed between version 3.6 and version > 3.10 that may imply such degradation? > I believe the default values for PCGAMG changed between versions. It sounds like the coarsening rate is not great enough, so that these grids are too large. This can be set using: https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html There is some explanation of this effect on that page. Let us know if setting this does not correct the situation. Thanks, Matt > Let me know if you need further information. > > Best, > > Myriam Peyrounette > > > -- > Myriam Peyrounette > CNRS/IDRIS - HLST > -- > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongzhang at anl.gov Fri Mar 1 19:33:22 2019 From: hongzhang at anl.gov (Zhang, Hong) Date: Sat, 2 Mar 2019 01:33:22 +0000 Subject: [petsc-users] Direct PETSc to use MCDRAM on KNL and other optimizations for KNL In-Reply-To: References: <27578CF7-FF30-4461-9B30-3BE5B41585C8@anl.gov> Message-ID: <43EE5EFA-C1D5-43E0-AB02-06E2EC334D38@anl.gov> On Mar 1, 2019, at 11:00 AM, Sajid Ali > wrote: Hi Hong, So, the speedup was coming from increased DRAM bandwidth and not the usage of MCDRAM. Certainly the speedup was coming from the usage of MCDRAM (which has much higher bandwidth than DRAM). What I meant is your code is still using MCDRAM, but MCDRAM acts like L3 cache in cache mode. Hong There is moderate MPI imbalance, large amount of Back-End stalls and good vectorization. I'm attaching my submit script, PETSc log file and Intel APS summary (all as non-HTML text). I can give more detailed analysis via Intel Vtune if needed. 
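(Nothing needs to change on the application side in cache mode. If the nodes were ever booted in flat mode instead, MCDRAM appears as a separate NUMA node and allocations have to be bound to it explicitly. A rough sketch, assuming MCDRAM shows up as node 1 on your machine and using placeholder names for the launcher and binary:

numactl --hardware                             # in flat mode the ~16 GB node with no CPUs is the MCDRAM
mpirun -n 64 numactl --membind=1 ./your_app    # bind every rank's allocations to that node

The node number and launcher vary between systems, so check numactl --hardware first.)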
Thank You, Sajid Ali Applied Physics Northwestern University -------------- next part -------------- An HTML attachment was scrubbed... URL: From yyang85 at stanford.edu Sun Mar 3 12:03:25 2019 From: yyang85 at stanford.edu (Yuyun Yang) Date: Sun, 3 Mar 2019 18:03:25 +0000 Subject: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed In-Reply-To: References: Message-ID: I tried compiling without the sanitizer and running on valgrind. Got a bunch of errors ?Uninitialised value was created by a stack allocation at 0x41B280: ComputeVel_qd::computeVel(double*, double, int&, int)?. HEAP SUMMARY: ==74== in use at exit: 96,637 bytes in 91 blocks ==74== total heap usage: 47,774 allocs, 47,522 frees, 308,253,653 bytes allocated LEAK SUMMARY: ==74== definitely lost: 0 bytes in 0 blocks ==74== indirectly lost: 0 bytes in 0 blocks ==74== possibly lost: 0 bytes in 0 blocks ==74== still reachable: 96,637 bytes in 91 blocks ==74== suppressed: 0 bytes in 0 blocks The error is located in the attached code (I?ve extracted only the relevant functions), but I couldn?t figure out what is wrong. Is this causing the memory corruption/double free error that happens when I execute the code? Thanks a lot for your help. Best regards, Yuyun From: Zhang, Junchao Sent: Friday, March 1, 2019 7:36 AM To: Yuyun Yang Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed On Fri, Mar 1, 2019 at 1:02 AM Yuyun Yang > wrote: Actually, I also saw a line at the beginning of valgrind saying "shadow memory range interleaves with an existing memory mapping. ASan cannot proceed properly. ABORTING." I guess the code didn't really run through valgrind since it aborted. Should I remove the address sanitizer flag when compiling? From the message, it seems ASan (not valgrind) aborted. You can try to compile without sanitizer and then run with valgrind. If no problem, then it is probably a sanitizer issue. Get Outlook for iOS ________________________________ From: Yuyun Yang Sent: Thursday, February 28, 2019 10:54:57 PM To: Zhang, Junchao Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed Hmm, still getting the same error from address sanitizer even though valgrind shows no errors and no leaks are possible. Should I ignore that error? My results did run alright. Best, Yuyun Get Outlook for iOS ________________________________ From: Zhang, Junchao > Sent: Wednesday, February 27, 2019 8:27:17 PM To: Yuyun Yang Cc: Matthew Knepley; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed Try the following to see if you can catch the bug easily: 1) Get error code for each petsc function and check it with CHKERRQ; 2) Link your code with a petsc library with debugging enabled (configured with --with-debugging=1); 3) Run your code with valgrind --Junchao Zhang On Wed, Feb 27, 2019 at 9:04 PM Yuyun Yang > wrote: Hi Junchao, This code actually involves a lot of classes and is pretty big. Might be an overkill for me to send everything to you. I'd like to know if I see this sort of error message, which points to this domain file, is it possible that the problem happens in another file (whose operations are linked to this one)? If so, I'll debug a little more and maybe send you more useful information later. 
Best regards, Yuyun Get Outlook for iOS ________________________________ From: Zhang, Junchao > Sent: Wednesday, February 27, 2019 6:24:13 PM To: Yuyun Yang Cc: Matthew Knepley; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed Could you provide a compilable and runnable test so I can try it? --Junchao Zhang On Wed, Feb 27, 2019 at 7:34 PM Yuyun Yang > wrote: Thanks, I fixed that, but I?m not actually calling the testScatters() function in my implementation (in the constructor, the only functions I called are setFields and setScatters). So the problem couldn?t have been that? Best, Yuyun From: Zhang, Junchao > Sent: Wednesday, February 27, 2019 10:50 AM To: Yuyun Yang > Cc: Matthew Knepley >; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed On Wed, Feb 27, 2019 at 10:41 AM Yuyun Yang via petsc-users > wrote: I called VecDestroy() in the destructor for this object ? is that not the right way to do it? In Domain::testScatters(), you have many VecDuplicate(,&out), You need to VecDestroy(&out) before doing new VecDuplicate(,&out); How do I implement CHECK ALL RETURN CODES? For each PETSc function, do ierr = ...; CHKERRQ(ierr); From: Matthew Knepley > Sent: Wednesday, February 27, 2019 7:24 AM To: Yuyun Yang > Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed You call VecDuplicate() a bunch, but VecDestroy() only once in the bottom function. This is wrong. Also, CHECK ALL RETURN CODES. This is the fastest way to find errors. Matt On Wed, Feb 27, 2019 at 2:06 AM Yuyun Yang via petsc-users > wrote: Hello team, I ran into the address sanitizer error that I hope you could help me with. I don?t really know what?s wrong with the way the code frees memory. The relevant code file is attached. The line number following domain.cpp specifically referenced to the vector _q, which seems a little odd, since some other vectors are constructed and freed the same way. 
==1719==ERROR: AddressSanitizer: attempting free on address which was not malloc()-ed: 0x61f0000076c0 in thread T0 #0 0x7fbf195282ca in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x982ca) #1 0x7fbf1706f895 in PetscFreeAlign /home/yyy910805/petsc/src/sys/memory/mal.c:87 #2 0x7fbf1731a898 in VecDestroy_Seq /home/yyy910805/petsc/src/vec/vec/impls/seq/bvec2.c:788 #3 0x7fbf1735f795 in VecDestroy /home/yyy910805/petsc/src/vec/vec/interface/vector.c:408 #4 0x40dd0a in Domain::~Domain() /home/yyy910805/scycle/source/domain.cpp:132 #5 0x40b479 in main /home/yyy910805/scycle/source/main.cpp:242 #6 0x7fbf14d2082f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) #7 0x4075d8 in _start (/home/yyy910805/scycle/source/main+0x4075d8) 0x61f0000076c0 is located 1600 bytes inside of 3220-byte region [0x61f000007080,0x61f000007d14) allocated by thread T0 here: #0 0x7fbf19528b32 in __interceptor_memalign (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98b32) #1 0x7fbf1706f7e0 in PetscMallocAlign /home/yyy910805/petsc/src/sys/memory/mal.c:41 #2 0x7fbf17073022 in PetscTrMallocDefault /home/yyy910805/petsc/src/sys/memory/mtr.c:183 #3 0x7fbf170710a1 in PetscMallocA /home/yyy910805/petsc/src/sys/memory/mal.c:397 #4 0x7fbf17326fb0 in VecCreate_Seq /home/yyy910805/petsc/src/vec/vec/impls/seq/bvec3.c:35 #5 0x7fbf1736f560 in VecSetType /home/yyy910805/petsc/src/vec/vec/interface/vecreg.c:51 #6 0x7fbf1731afae in VecDuplicate_Seq /home/yyy910805/petsc/src/vec/vec/impls/seq/bvec2.c:807 #7 0x7fbf1735eff7 in VecDuplicate /home/yyy910805/petsc/src/vec/vec/interface/vector.c:379 #8 0x4130de in Domain::setFields() /home/yyy910805/scycle/source/domain.cpp:431 #9 0x40c60a in Domain::Domain(char const*) /home/yyy910805/scycle/source/domain.cpp:57 #10 0x40b433 in main /home/yyy910805/scycle/source/main.cpp:242 #11 0x7fbf14d2082f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) SUMMARY: AddressSanitizer: bad-free ??:0 __interceptor_free ==1719==ABORTING Thanks very much! Yuyun -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: uninitialized_error.cpp URL: From knepley at gmail.com Sun Mar 3 13:27:55 2019 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 3 Mar 2019 14:27:55 -0500 Subject: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed In-Reply-To: References: Message-ID: On Sun, Mar 3, 2019 at 1:03 PM Yuyun Yang via petsc-users < petsc-users at mcs.anl.gov> wrote: > I tried compiling without the sanitizer and running on valgrind. Got a > bunch of errors ?Uninitialised value was created by a stack allocation at > 0x41B280: ComputeVel_qd::computeVel(double*, double, int&, int)?. > There is no memory management code here, so other parts of the code must be relevant. 
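That valgrind report says the uninitialised data originated from a local (stack) variable inside computeVel itself, i.e. something declared in that function is assigned on some paths only and then read, either later in the function or by a caller. A tiny self-contained illustration of the pattern, with made-up names that just mirror the reported signature (this is not your code, only the shape of the bug):

#include <cstdio>

// A local buffer is only partially filled, then read in full.
static double computeVel(double tol, int &its, int N) {
  double v[3];                                 // the "stack allocation" valgrind points at
  for (int i = 0; i < N && i < 3; i++) v[i] = i * tol;
  its = N;
  return v[0] + v[1] + v[2];                   // if N < 3 this reads uninitialised entries
}

int main() {
  int its = 0;
  double s = computeVel(0.1, its, 2);          // N = 2, so v[2] is never written
  std::printf("s = %g, its = %d\n", s, its);
  return 0;
}

Checking every local in computeVel (and anything it writes through its double* argument) for a path where it is read before it is written is usually the fastest way to run this down.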
Thanks, Matt > > > HEAP SUMMARY: > > ==74== in use at exit: 96,637 bytes in 91 blocks > > ==74== total heap usage: 47,774 allocs, 47,522 frees, 308,253,653 bytes > allocated > > LEAK SUMMARY: > > ==74== definitely lost: 0 bytes in 0 blocks > > ==74== indirectly lost: 0 bytes in 0 blocks > > ==74== possibly lost: 0 bytes in 0 blocks > > ==74== still reachable: 96,637 bytes in 91 blocks > > ==74== suppressed: 0 bytes in 0 blocks > > > > The error is located in the attached code (I?ve extracted only the > relevant functions), but I couldn?t figure out what is wrong. Is this > causing the memory corruption/double free error that happens when I execute > the code? > > > > Thanks a lot for your help. > > > > Best regards, > > Yuyun > > > > *From:* Zhang, Junchao > *Sent:* Friday, March 1, 2019 7:36 AM > *To:* Yuyun Yang > *Subject:* Re: [petsc-users] AddressSanitizer: attempting free on address > which was not malloc()-ed > > > > > > On Fri, Mar 1, 2019 at 1:02 AM Yuyun Yang wrote: > > Actually, I also saw a line at the beginning of valgrind saying "shadow > memory range interleaves with an existing memory mapping. ASan cannot > proceed properly. ABORTING." I guess the code didn't really run through > valgrind since it aborted. Should I remove the address sanitizer flag when > compiling? > > From the message, it seems ASan (not valgrind) aborted. You can try to > compile without sanitizer and then run with valgrind. If no problem, then > it is probably a sanitizer issue. > > > > > > Get Outlook for iOS > ------------------------------ > > *From:* Yuyun Yang > *Sent:* Thursday, February 28, 2019 10:54:57 PM > *To:* Zhang, Junchao > *Subject:* Re: [petsc-users] AddressSanitizer: attempting free on address > which was not malloc()-ed > > > > Hmm, still getting the same error from address sanitizer even though > valgrind shows no errors and no leaks are possible. > > > > Should I ignore that error? My results did run alright. > > > > Best, > > Yuyun > > > > Get Outlook for iOS > ------------------------------ > > *From:* Zhang, Junchao > *Sent:* Wednesday, February 27, 2019 8:27:17 PM > *To:* Yuyun Yang > *Cc:* Matthew Knepley; petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] AddressSanitizer: attempting free on address > which was not malloc()-ed > > > > Try the following to see if you can catch the bug easily: 1) Get error > code for each petsc function and check it with CHKERRQ; 2) Link your code > with a petsc library with debugging enabled (configured > with --with-debugging=1); 3) Run your code with valgrind > > > > --Junchao Zhang > > > > > > On Wed, Feb 27, 2019 at 9:04 PM Yuyun Yang wrote: > > Hi Junchao, > > > > This code actually involves a lot of classes and is pretty big. Might be > an overkill for me to send everything to you. I'd like to know if I see > this sort of error message, which points to this domain file, is it > possible that the problem happens in another file (whose operations are > linked to this one)? If so, I'll debug a little more and maybe send you > more useful information later. > > > > Best regards, > > Yuyun > > > > Get Outlook for iOS > ------------------------------ > > *From:* Zhang, Junchao > *Sent:* Wednesday, February 27, 2019 6:24:13 PM > *To:* Yuyun Yang > *Cc:* Matthew Knepley; petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] AddressSanitizer: attempting free on address > which was not malloc()-ed > > > > Could you provide a compilable and runnable test so I can try it? 
> > --Junchao Zhang > > > > > > On Wed, Feb 27, 2019 at 7:34 PM Yuyun Yang wrote: > > Thanks, I fixed that, but I?m not actually calling the testScatters() > function in my implementation (in the constructor, the only functions I > called are setFields and setScatters). So the problem couldn?t have been > that? > > > > Best, > > Yuyun > > > > *From:* Zhang, Junchao > *Sent:* Wednesday, February 27, 2019 10:50 AM > *To:* Yuyun Yang > *Cc:* Matthew Knepley ; petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] AddressSanitizer: attempting free on address > which was not malloc()-ed > > > > > > On Wed, Feb 27, 2019 at 10:41 AM Yuyun Yang via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > I called VecDestroy() in the destructor for this object ? is that not the > right way to do it? > > In Domain::testScatters(), you have many VecDuplicate(,&out), You need to > VecDestroy(&out) before doing new VecDuplicate(,&out); > > How do I implement CHECK ALL RETURN CODES? > > For each PETSc function, do ierr = ...; CHKERRQ(ierr); > > > > *From:* Matthew Knepley > *Sent:* Wednesday, February 27, 2019 7:24 AM > *To:* Yuyun Yang > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] AddressSanitizer: attempting free on address > which was not malloc()-ed > > > > You call VecDuplicate() a bunch, but VecDestroy() only once in the bottom > function. This is wrong. > > Also, CHECK ALL RETURN CODES. This is the fastest way to find errors. > > > > Matt > > > > On Wed, Feb 27, 2019 at 2:06 AM Yuyun Yang via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hello team, > > > > I ran into the address sanitizer error that I hope you could help me with. > I don?t really know what?s wrong with the way the code frees memory. The > relevant code file is attached. The line number following domain.cpp > specifically referenced to the vector _q, which seems a little odd, since > some other vectors are constructed and freed the same way. 
> > > > ==1719==ERROR: AddressSanitizer: attempting free on address which was not > malloc()-ed: 0x61f0000076c0 in thread T0 > > #0 0x7fbf195282ca in __interceptor_free > (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x982ca) > > #1 0x7fbf1706f895 in PetscFreeAlign > /home/yyy910805/petsc/src/sys/memory/mal.c:87 > > #2 0x7fbf1731a898 in VecDestroy_Seq > /home/yyy910805/petsc/src/vec/vec/impls/seq/bvec2.c:788 > > #3 0x7fbf1735f795 in VecDestroy > /home/yyy910805/petsc/src/vec/vec/interface/vector.c:408 > > #4 0x40dd0a in Domain::~Domain() > /home/yyy910805/scycle/source/domain.cpp:132 > > #5 0x40b479 in main /home/yyy910805/scycle/source/main.cpp:242 > > #6 0x7fbf14d2082f in __libc_start_main > (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) > > #7 0x4075d8 in _start (/home/yyy910805/scycle/source/main+0x4075d8) > > > > 0x61f0000076c0 is located 1600 bytes inside of 3220-byte region > [0x61f000007080,0x61f000007d14) > > allocated by thread T0 here: > > #0 0x7fbf19528b32 in __interceptor_memalign > (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98b32) > > #1 0x7fbf1706f7e0 in PetscMallocAlign > /home/yyy910805/petsc/src/sys/memory/mal.c:41 > > #2 0x7fbf17073022 in PetscTrMallocDefault > /home/yyy910805/petsc/src/sys/memory/mtr.c:183 > > #3 0x7fbf170710a1 in PetscMallocA > /home/yyy910805/petsc/src/sys/memory/mal.c:397 > > #4 0x7fbf17326fb0 in VecCreate_Seq > /home/yyy910805/petsc/src/vec/vec/impls/seq/bvec3.c:35 > > #5 0x7fbf1736f560 in VecSetType > /home/yyy910805/petsc/src/vec/vec/interface/vecreg.c:51 > > #6 0x7fbf1731afae in VecDuplicate_Seq > /home/yyy910805/petsc/src/vec/vec/impls/seq/bvec2.c:807 > > #7 0x7fbf1735eff7 in VecDuplicate > /home/yyy910805/petsc/src/vec/vec/interface/vector.c:379 > > #8 0x4130de in Domain::setFields() > /home/yyy910805/scycle/source/domain.cpp:431 > > #9 0x40c60a in Domain::Domain(char const*) > /home/yyy910805/scycle/source/domain.cpp:57 > > #10 0x40b433 in main /home/yyy910805/scycle/source/main.cpp:242 > > #11 0x7fbf14d2082f in __libc_start_main > (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) > > > > SUMMARY: AddressSanitizer: bad-free ??:0 __interceptor_free > > ==1719==ABORTING > > > > Thanks very much! > > Yuyun > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yyang85 at stanford.edu Sun Mar 3 14:05:35 2019 From: yyang85 at stanford.edu (Yuyun Yang) Date: Sun, 3 Mar 2019 20:05:35 +0000 Subject: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed In-Reply-To: References: Message-ID: Actually, I tried just creating a domain object (since the address sanitizer was complaining about that code to start with). Simply creating that object gave me a core dump, so I suppose the issue must be there. I got the following message when running the code with -objects_dump flag on the command line, but couldn?t find a problem with the code (I?ve attached it here with only the relevant functions). Thanks a lot for your help! 
The following objects were never freed ----------------------------------------- [0] Vec seq y [0] VecCreate() in /home/yyy910805/petsc/src/vec/vec/interface/veccreate.c [0] Vec seq Vec_0x84000000_0 [0] VecCreate() in /home/yyy910805/petsc/src/vec/vec/interface/veccreate.c [0] Vec seq Vec_0x84000000_1 [0] VecCreate() in /home/yyy910805/petsc/src/vec/vec/interface/veccreate.c [0] VecScatter seq VecScatter_0x84000000_2 [0] VecScatterCreate() in /home/yyy910805/petsc/src/vec/vscat/interface/vscreate.c [0] VecScatter seq VecScatter_0x84000000_3 [0] VecScatterCreate() in /home/yyy910805/petsc/src/vec/vscat/interface/vscreate.c [0] VecScatter seq VecScatter_0x84000000_4 [0] VecScatterCreate() in /home/yyy910805/petsc/src/vec/vscat/interface/vscreate.c [0] VecScatter seq VecScatter_0x84000000_5 [0] VecScatterCreate() in /home/yyy910805/petsc/src/vec/vscat/interface/vscreate.c Attempting to use an MPI routine after finalizing MPICH ------------------------------------------------------------------------------------------------------------------------------------------------- From: Matthew Knepley Sent: Sunday, March 3, 2019 11:28 AM To: Yuyun Yang Cc: Zhang, Junchao ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed On Sun, Mar 3, 2019 at 1:03 PM Yuyun Yang via petsc-users > wrote: I tried compiling without the sanitizer and running on valgrind. Got a bunch of errors ?Uninitialised value was created by a stack allocation at 0x41B280: ComputeVel_qd::computeVel(double*, double, int&, int)?. There is no memory management code here, so other parts of the code must be relevant. Thanks, Matt HEAP SUMMARY: ==74== in use at exit: 96,637 bytes in 91 blocks ==74== total heap usage: 47,774 allocs, 47,522 frees, 308,253,653 bytes allocated LEAK SUMMARY: ==74== definitely lost: 0 bytes in 0 blocks ==74== indirectly lost: 0 bytes in 0 blocks ==74== possibly lost: 0 bytes in 0 blocks ==74== still reachable: 96,637 bytes in 91 blocks ==74== suppressed: 0 bytes in 0 blocks The error is located in the attached code (I?ve extracted only the relevant functions), but I couldn?t figure out what is wrong. Is this causing the memory corruption/double free error that happens when I execute the code? Thanks a lot for your help. Best regards, Yuyun From: Zhang, Junchao > Sent: Friday, March 1, 2019 7:36 AM To: Yuyun Yang > Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed On Fri, Mar 1, 2019 at 1:02 AM Yuyun Yang > wrote: Actually, I also saw a line at the beginning of valgrind saying "shadow memory range interleaves with an existing memory mapping. ASan cannot proceed properly. ABORTING." I guess the code didn't really run through valgrind since it aborted. Should I remove the address sanitizer flag when compiling? From the message, it seems ASan (not valgrind) aborted. You can try to compile without sanitizer and then run with valgrind. If no problem, then it is probably a sanitizer issue. Get Outlook for iOS ________________________________ From: Yuyun Yang Sent: Thursday, February 28, 2019 10:54:57 PM To: Zhang, Junchao Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed Hmm, still getting the same error from address sanitizer even though valgrind shows no errors and no leaks are possible. Should I ignore that error? My results did run alright. 
Best, Yuyun Get Outlook for iOS ________________________________ From: Zhang, Junchao > Sent: Wednesday, February 27, 2019 8:27:17 PM To: Yuyun Yang Cc: Matthew Knepley; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed Try the following to see if you can catch the bug easily: 1) Get error code for each petsc function and check it with CHKERRQ; 2) Link your code with a petsc library with debugging enabled (configured with --with-debugging=1); 3) Run your code with valgrind --Junchao Zhang On Wed, Feb 27, 2019 at 9:04 PM Yuyun Yang > wrote: Hi Junchao, This code actually involves a lot of classes and is pretty big. Might be an overkill for me to send everything to you. I'd like to know if I see this sort of error message, which points to this domain file, is it possible that the problem happens in another file (whose operations are linked to this one)? If so, I'll debug a little more and maybe send you more useful information later. Best regards, Yuyun Get Outlook for iOS ________________________________ From: Zhang, Junchao > Sent: Wednesday, February 27, 2019 6:24:13 PM To: Yuyun Yang Cc: Matthew Knepley; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed Could you provide a compilable and runnable test so I can try it? --Junchao Zhang On Wed, Feb 27, 2019 at 7:34 PM Yuyun Yang > wrote: Thanks, I fixed that, but I?m not actually calling the testScatters() function in my implementation (in the constructor, the only functions I called are setFields and setScatters). So the problem couldn?t have been that? Best, Yuyun From: Zhang, Junchao > Sent: Wednesday, February 27, 2019 10:50 AM To: Yuyun Yang > Cc: Matthew Knepley >; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed On Wed, Feb 27, 2019 at 10:41 AM Yuyun Yang via petsc-users > wrote: I called VecDestroy() in the destructor for this object ? is that not the right way to do it? In Domain::testScatters(), you have many VecDuplicate(,&out), You need to VecDestroy(&out) before doing new VecDuplicate(,&out); How do I implement CHECK ALL RETURN CODES? For each PETSc function, do ierr = ...; CHKERRQ(ierr); From: Matthew Knepley > Sent: Wednesday, February 27, 2019 7:24 AM To: Yuyun Yang > Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed You call VecDuplicate() a bunch, but VecDestroy() only once in the bottom function. This is wrong. Also, CHECK ALL RETURN CODES. This is the fastest way to find errors. Matt On Wed, Feb 27, 2019 at 2:06 AM Yuyun Yang via petsc-users > wrote: Hello team, I ran into the address sanitizer error that I hope you could help me with. I don?t really know what?s wrong with the way the code frees memory. The relevant code file is attached. The line number following domain.cpp specifically referenced to the vector _q, which seems a little odd, since some other vectors are constructed and freed the same way. 
==1719==ERROR: AddressSanitizer: attempting free on address which was not malloc()-ed: 0x61f0000076c0 in thread T0 #0 0x7fbf195282ca in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x982ca) #1 0x7fbf1706f895 in PetscFreeAlign /home/yyy910805/petsc/src/sys/memory/mal.c:87 #2 0x7fbf1731a898 in VecDestroy_Seq /home/yyy910805/petsc/src/vec/vec/impls/seq/bvec2.c:788 #3 0x7fbf1735f795 in VecDestroy /home/yyy910805/petsc/src/vec/vec/interface/vector.c:408 #4 0x40dd0a in Domain::~Domain() /home/yyy910805/scycle/source/domain.cpp:132 #5 0x40b479 in main /home/yyy910805/scycle/source/main.cpp:242 #6 0x7fbf14d2082f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) #7 0x4075d8 in _start (/home/yyy910805/scycle/source/main+0x4075d8) 0x61f0000076c0 is located 1600 bytes inside of 3220-byte region [0x61f000007080,0x61f000007d14) allocated by thread T0 here: #0 0x7fbf19528b32 in __interceptor_memalign (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98b32) #1 0x7fbf1706f7e0 in PetscMallocAlign /home/yyy910805/petsc/src/sys/memory/mal.c:41 #2 0x7fbf17073022 in PetscTrMallocDefault /home/yyy910805/petsc/src/sys/memory/mtr.c:183 #3 0x7fbf170710a1 in PetscMallocA /home/yyy910805/petsc/src/sys/memory/mal.c:397 #4 0x7fbf17326fb0 in VecCreate_Seq /home/yyy910805/petsc/src/vec/vec/impls/seq/bvec3.c:35 #5 0x7fbf1736f560 in VecSetType /home/yyy910805/petsc/src/vec/vec/interface/vecreg.c:51 #6 0x7fbf1731afae in VecDuplicate_Seq /home/yyy910805/petsc/src/vec/vec/impls/seq/bvec2.c:807 #7 0x7fbf1735eff7 in VecDuplicate /home/yyy910805/petsc/src/vec/vec/interface/vector.c:379 #8 0x4130de in Domain::setFields() /home/yyy910805/scycle/source/domain.cpp:431 #9 0x40c60a in Domain::Domain(char const*) /home/yyy910805/scycle/source/domain.cpp:57 #10 0x40b433 in main /home/yyy910805/scycle/source/main.cpp:242 #11 0x7fbf14d2082f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) SUMMARY: AddressSanitizer: bad-free ??:0 __interceptor_free ==1719==ABORTING Thanks very much! Yuyun -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: domain.cpp URL: From knepley at gmail.com Sun Mar 3 14:45:50 2019 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 3 Mar 2019 15:45:50 -0500 Subject: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed In-Reply-To: References: Message-ID: On Sun, Mar 3, 2019 at 3:05 PM Yuyun Yang wrote: > Actually, I tried just creating a domain object (since the address > sanitizer was complaining about that code to start with). Simply creating > that object gave me a core dump, so I suppose the issue must be there. I > got the following message when running the code with -objects_dump flag on > the command line, but couldn?t find a problem with the code (I?ve attached > it here with only the relevant functions). > I think what we are going to need from you is a minimal example that has the error. I am guessing you have a logic bug in the C++, which we cannot debug by looking. 
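Something about this size is what we can actually look at, e.g. a driver along these lines (a sketch only, guessing at your interface from the thread; the header name and calls are placeholders for whatever Domain actually provides):

#include <petscsys.h>
#include "domain.hpp"   // assumption: whatever header declares your Domain class

int main(int argc, char **argv)
{
  PetscErrorCode ierr;
  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
  {
    Domain d;                              // scoped so ~Domain() (and its VecDestroy calls)
    ierr = d.setFields();   CHKERRQ(ierr); // runs before PetscFinalize()
    ierr = d.setScatters(); CHKERRQ(ierr);
  }
  ierr = PetscFinalize();
  return ierr;
}

If a small program like that still trips the sanitizer, send it along with the class source and we can reproduce it here.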
Thanks, Matt > > > Thanks a lot for your help! > > > > The following objects were never freed > > ----------------------------------------- > > [0] Vec seq y > > [0] VecCreate() in > /home/yyy910805/petsc/src/vec/vec/interface/veccreate.c > > [0] Vec seq Vec_0x84000000_0 > > [0] VecCreate() in > /home/yyy910805/petsc/src/vec/vec/interface/veccreate.c > > [0] Vec seq Vec_0x84000000_1 > > [0] VecCreate() in > /home/yyy910805/petsc/src/vec/vec/interface/veccreate.c > > [0] VecScatter seq VecScatter_0x84000000_2 > > [0] VecScatterCreate() in > /home/yyy910805/petsc/src/vec/vscat/interface/vscreate.c > > [0] VecScatter seq VecScatter_0x84000000_3 > > [0] VecScatterCreate() in > /home/yyy910805/petsc/src/vec/vscat/interface/vscreate.c > > [0] VecScatter seq VecScatter_0x84000000_4 > > [0] VecScatterCreate() in > /home/yyy910805/petsc/src/vec/vscat/interface/vscreate.c > > [0] VecScatter seq VecScatter_0x84000000_5 > > [0] VecScatterCreate() in > /home/yyy910805/petsc/src/vec/vscat/interface/vscreate.c > > Attempting to use an MPI routine after finalizing MPICH > > > > > ------------------------------------------------------------------------------------------------------------------------------------------------- > > *From:* Matthew Knepley > *Sent:* Sunday, March 3, 2019 11:28 AM > *To:* Yuyun Yang > *Cc:* Zhang, Junchao ; petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] AddressSanitizer: attempting free on address > which was not malloc()-ed > > > > On Sun, Mar 3, 2019 at 1:03 PM Yuyun Yang via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > I tried compiling without the sanitizer and running on valgrind. Got a > bunch of errors ?Uninitialised value was created by a stack allocation at > 0x41B280: ComputeVel_qd::computeVel(double*, double, int&, int)?. > > > > There is no memory management code here, so other parts of the code must > be relevant. > > > > Thanks, > > > > Matt > > > > > > HEAP SUMMARY: > > ==74== in use at exit: 96,637 bytes in 91 blocks > > ==74== total heap usage: 47,774 allocs, 47,522 frees, 308,253,653 bytes > allocated > > LEAK SUMMARY: > > ==74== definitely lost: 0 bytes in 0 blocks > > ==74== indirectly lost: 0 bytes in 0 blocks > > ==74== possibly lost: 0 bytes in 0 blocks > > ==74== still reachable: 96,637 bytes in 91 blocks > > ==74== suppressed: 0 bytes in 0 blocks > > > > The error is located in the attached code (I?ve extracted only the > relevant functions), but I couldn?t figure out what is wrong. Is this > causing the memory corruption/double free error that happens when I execute > the code? > > > > Thanks a lot for your help. > > > > Best regards, > > Yuyun > > > > *From:* Zhang, Junchao > *Sent:* Friday, March 1, 2019 7:36 AM > *To:* Yuyun Yang > *Subject:* Re: [petsc-users] AddressSanitizer: attempting free on address > which was not malloc()-ed > > > > > > On Fri, Mar 1, 2019 at 1:02 AM Yuyun Yang wrote: > > Actually, I also saw a line at the beginning of valgrind saying "shadow > memory range interleaves with an existing memory mapping. ASan cannot > proceed properly. ABORTING." I guess the code didn't really run through > valgrind since it aborted. Should I remove the address sanitizer flag when > compiling? > > From the message, it seems ASan (not valgrind) aborted. You can try to > compile without sanitizer and then run with valgrind. If no problem, then > it is probably a sanitizer issue. 
> > > > > > Get Outlook for iOS > ------------------------------ > > *From:* Yuyun Yang > *Sent:* Thursday, February 28, 2019 10:54:57 PM > *To:* Zhang, Junchao > *Subject:* Re: [petsc-users] AddressSanitizer: attempting free on address > which was not malloc()-ed > > > > Hmm, still getting the same error from address sanitizer even though > valgrind shows no errors and no leaks are possible. > > > > Should I ignore that error? My results did run alright. > > > > Best, > > Yuyun > > > > Get Outlook for iOS > ------------------------------ > > *From:* Zhang, Junchao > *Sent:* Wednesday, February 27, 2019 8:27:17 PM > *To:* Yuyun Yang > *Cc:* Matthew Knepley; petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] AddressSanitizer: attempting free on address > which was not malloc()-ed > > > > Try the following to see if you can catch the bug easily: 1) Get error > code for each petsc function and check it with CHKERRQ; 2) Link your code > with a petsc library with debugging enabled (configured > with --with-debugging=1); 3) Run your code with valgrind > > > > --Junchao Zhang > > > > > > On Wed, Feb 27, 2019 at 9:04 PM Yuyun Yang wrote: > > Hi Junchao, > > > > This code actually involves a lot of classes and is pretty big. Might be > an overkill for me to send everything to you. I'd like to know if I see > this sort of error message, which points to this domain file, is it > possible that the problem happens in another file (whose operations are > linked to this one)? If so, I'll debug a little more and maybe send you > more useful information later. > > > > Best regards, > > Yuyun > > > > Get Outlook for iOS > ------------------------------ > > *From:* Zhang, Junchao > *Sent:* Wednesday, February 27, 2019 6:24:13 PM > *To:* Yuyun Yang > *Cc:* Matthew Knepley; petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] AddressSanitizer: attempting free on address > which was not malloc()-ed > > > > Could you provide a compilable and runnable test so I can try it? > > --Junchao Zhang > > > > > > On Wed, Feb 27, 2019 at 7:34 PM Yuyun Yang wrote: > > Thanks, I fixed that, but I?m not actually calling the testScatters() > function in my implementation (in the constructor, the only functions I > called are setFields and setScatters). So the problem couldn?t have been > that? > > > > Best, > > Yuyun > > > > *From:* Zhang, Junchao > *Sent:* Wednesday, February 27, 2019 10:50 AM > *To:* Yuyun Yang > *Cc:* Matthew Knepley ; petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] AddressSanitizer: attempting free on address > which was not malloc()-ed > > > > > > On Wed, Feb 27, 2019 at 10:41 AM Yuyun Yang via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > I called VecDestroy() in the destructor for this object ? is that not the > right way to do it? > > In Domain::testScatters(), you have many VecDuplicate(,&out), You need to > VecDestroy(&out) before doing new VecDuplicate(,&out); > > How do I implement CHECK ALL RETURN CODES? > > For each PETSc function, do ierr = ...; CHKERRQ(ierr); > > > > *From:* Matthew Knepley > *Sent:* Wednesday, February 27, 2019 7:24 AM > *To:* Yuyun Yang > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] AddressSanitizer: attempting free on address > which was not malloc()-ed > > > > You call VecDuplicate() a bunch, but VecDestroy() only once in the bottom > function. This is wrong. > > Also, CHECK ALL RETURN CODES. This is the fastest way to find errors. 
> > > > Matt > > > > On Wed, Feb 27, 2019 at 2:06 AM Yuyun Yang via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hello team, > > > > I ran into the address sanitizer error that I hope you could help me with. > I don?t really know what?s wrong with the way the code frees memory. The > relevant code file is attached. The line number following domain.cpp > specifically referenced to the vector _q, which seems a little odd, since > some other vectors are constructed and freed the same way. > > > > ==1719==ERROR: AddressSanitizer: attempting free on address which was not > malloc()-ed: 0x61f0000076c0 in thread T0 > > #0 0x7fbf195282ca in __interceptor_free > (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x982ca) > > #1 0x7fbf1706f895 in PetscFreeAlign > /home/yyy910805/petsc/src/sys/memory/mal.c:87 > > #2 0x7fbf1731a898 in VecDestroy_Seq > /home/yyy910805/petsc/src/vec/vec/impls/seq/bvec2.c:788 > > #3 0x7fbf1735f795 in VecDestroy > /home/yyy910805/petsc/src/vec/vec/interface/vector.c:408 > > #4 0x40dd0a in Domain::~Domain() > /home/yyy910805/scycle/source/domain.cpp:132 > > #5 0x40b479 in main /home/yyy910805/scycle/source/main.cpp:242 > > #6 0x7fbf14d2082f in __libc_start_main > (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) > > #7 0x4075d8 in _start (/home/yyy910805/scycle/source/main+0x4075d8) > > > > 0x61f0000076c0 is located 1600 bytes inside of 3220-byte region > [0x61f000007080,0x61f000007d14) > > allocated by thread T0 here: > > #0 0x7fbf19528b32 in __interceptor_memalign > (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98b32) > > #1 0x7fbf1706f7e0 in PetscMallocAlign > /home/yyy910805/petsc/src/sys/memory/mal.c:41 > > #2 0x7fbf17073022 in PetscTrMallocDefault > /home/yyy910805/petsc/src/sys/memory/mtr.c:183 > > #3 0x7fbf170710a1 in PetscMallocA > /home/yyy910805/petsc/src/sys/memory/mal.c:397 > > #4 0x7fbf17326fb0 in VecCreate_Seq > /home/yyy910805/petsc/src/vec/vec/impls/seq/bvec3.c:35 > > #5 0x7fbf1736f560 in VecSetType > /home/yyy910805/petsc/src/vec/vec/interface/vecreg.c:51 > > #6 0x7fbf1731afae in VecDuplicate_Seq > /home/yyy910805/petsc/src/vec/vec/impls/seq/bvec2.c:807 > > #7 0x7fbf1735eff7 in VecDuplicate > /home/yyy910805/petsc/src/vec/vec/interface/vector.c:379 > > #8 0x4130de in Domain::setFields() > /home/yyy910805/scycle/source/domain.cpp:431 > > #9 0x40c60a in Domain::Domain(char const*) > /home/yyy910805/scycle/source/domain.cpp:57 > > #10 0x40b433 in main /home/yyy910805/scycle/source/main.cpp:242 > > #11 0x7fbf14d2082f in __libc_start_main > (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) > > > > SUMMARY: AddressSanitizer: bad-free ??:0 __interceptor_free > > ==1719==ABORTING > > > > Thanks very much! > > Yuyun > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From t.appel17 at imperial.ac.uk Sun Mar 3 15:04:59 2019 From: t.appel17 at imperial.ac.uk (Appel, Thibaut) Date: Sun, 3 Mar 2019 21:04:59 +0000 Subject: [petsc-users] Get global column indices thay belong to another process? Message-ID: <538C8392-1829-4D40-8C48-BC1E8D13CE89@imperial.ac.uk> Assuming you preallocate/assemble a MPIAIJ matrix ?by hand? using a ISLocalToGlobalMapping object obtained from DMDA: what?s the easiest way to get global column indices that do not belong to a process? Say, if you have a finite-difference stencil and the stencil exceeds the portion owned by the process. I think the way to go is to preallocate a MPIAIJ matrix using local information with MatPreallocateSetLocal and then fill values with MatSetLocal, those routines can?t work with rows/columns not owned by the process right? Thank you, Thibaut From yyang85 at stanford.edu Sun Mar 3 15:21:23 2019 From: yyang85 at stanford.edu (Yuyun Yang) Date: Sun, 3 Mar 2019 21:21:23 +0000 Subject: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed In-Reply-To: References: Message-ID: Yes, please see the attached files for a minimal example. Thanks a lot! Best regards, Yuyun From: Matthew Knepley Sent: Sunday, March 3, 2019 12:46 PM To: Yuyun Yang Cc: Zhang, Junchao ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed On Sun, Mar 3, 2019 at 3:05 PM Yuyun Yang > wrote: Actually, I tried just creating a domain object (since the address sanitizer was complaining about that code to start with). Simply creating that object gave me a core dump, so I suppose the issue must be there. I got the following message when running the code with -objects_dump flag on the command line, but couldn?t find a problem with the code (I?ve attached it here with only the relevant functions). I think what we are going to need from you is a minimal example that has the error. I am guessing you have a logic bug in the C++, which we cannot debug by looking. Thanks, Matt Thanks a lot for your help! The following objects were never freed ----------------------------------------- [0] Vec seq y [0] VecCreate() in /home/yyy910805/petsc/src/vec/vec/interface/veccreate.c [0] Vec seq Vec_0x84000000_0 [0] VecCreate() in /home/yyy910805/petsc/src/vec/vec/interface/veccreate.c [0] Vec seq Vec_0x84000000_1 [0] VecCreate() in /home/yyy910805/petsc/src/vec/vec/interface/veccreate.c [0] VecScatter seq VecScatter_0x84000000_2 [0] VecScatterCreate() in /home/yyy910805/petsc/src/vec/vscat/interface/vscreate.c [0] VecScatter seq VecScatter_0x84000000_3 [0] VecScatterCreate() in /home/yyy910805/petsc/src/vec/vscat/interface/vscreate.c [0] VecScatter seq VecScatter_0x84000000_4 [0] VecScatterCreate() in /home/yyy910805/petsc/src/vec/vscat/interface/vscreate.c [0] VecScatter seq VecScatter_0x84000000_5 [0] VecScatterCreate() in /home/yyy910805/petsc/src/vec/vscat/interface/vscreate.c Attempting to use an MPI routine after finalizing MPICH ------------------------------------------------------------------------------------------------------------------------------------------------- From: Matthew Knepley > Sent: Sunday, March 3, 2019 11:28 AM To: Yuyun Yang > Cc: Zhang, Junchao >; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed On Sun, Mar 3, 2019 at 1:03 PM Yuyun Yang via petsc-users > wrote: I tried compiling without the sanitizer and running on valgrind. 
Got a bunch of errors ?Uninitialised value was created by a stack allocation at 0x41B280: ComputeVel_qd::computeVel(double*, double, int&, int)?. There is no memory management code here, so other parts of the code must be relevant. Thanks, Matt HEAP SUMMARY: ==74== in use at exit: 96,637 bytes in 91 blocks ==74== total heap usage: 47,774 allocs, 47,522 frees, 308,253,653 bytes allocated LEAK SUMMARY: ==74== definitely lost: 0 bytes in 0 blocks ==74== indirectly lost: 0 bytes in 0 blocks ==74== possibly lost: 0 bytes in 0 blocks ==74== still reachable: 96,637 bytes in 91 blocks ==74== suppressed: 0 bytes in 0 blocks The error is located in the attached code (I?ve extracted only the relevant functions), but I couldn?t figure out what is wrong. Is this causing the memory corruption/double free error that happens when I execute the code? Thanks a lot for your help. Best regards, Yuyun From: Zhang, Junchao > Sent: Friday, March 1, 2019 7:36 AM To: Yuyun Yang > Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed On Fri, Mar 1, 2019 at 1:02 AM Yuyun Yang > wrote: Actually, I also saw a line at the beginning of valgrind saying "shadow memory range interleaves with an existing memory mapping. ASan cannot proceed properly. ABORTING." I guess the code didn't really run through valgrind since it aborted. Should I remove the address sanitizer flag when compiling? From the message, it seems ASan (not valgrind) aborted. You can try to compile without sanitizer and then run with valgrind. If no problem, then it is probably a sanitizer issue. Get Outlook for iOS ________________________________ From: Yuyun Yang Sent: Thursday, February 28, 2019 10:54:57 PM To: Zhang, Junchao Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed Hmm, still getting the same error from address sanitizer even though valgrind shows no errors and no leaks are possible. Should I ignore that error? My results did run alright. Best, Yuyun Get Outlook for iOS ________________________________ From: Zhang, Junchao > Sent: Wednesday, February 27, 2019 8:27:17 PM To: Yuyun Yang Cc: Matthew Knepley; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed Try the following to see if you can catch the bug easily: 1) Get error code for each petsc function and check it with CHKERRQ; 2) Link your code with a petsc library with debugging enabled (configured with --with-debugging=1); 3) Run your code with valgrind --Junchao Zhang On Wed, Feb 27, 2019 at 9:04 PM Yuyun Yang > wrote: Hi Junchao, This code actually involves a lot of classes and is pretty big. Might be an overkill for me to send everything to you. I'd like to know if I see this sort of error message, which points to this domain file, is it possible that the problem happens in another file (whose operations are linked to this one)? If so, I'll debug a little more and maybe send you more useful information later. Best regards, Yuyun Get Outlook for iOS ________________________________ From: Zhang, Junchao > Sent: Wednesday, February 27, 2019 6:24:13 PM To: Yuyun Yang Cc: Matthew Knepley; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed Could you provide a compilable and runnable test so I can try it? 
--Junchao Zhang On Wed, Feb 27, 2019 at 7:34 PM Yuyun Yang > wrote: Thanks, I fixed that, but I?m not actually calling the testScatters() function in my implementation (in the constructor, the only functions I called are setFields and setScatters). So the problem couldn?t have been that? Best, Yuyun From: Zhang, Junchao > Sent: Wednesday, February 27, 2019 10:50 AM To: Yuyun Yang > Cc: Matthew Knepley >; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed On Wed, Feb 27, 2019 at 10:41 AM Yuyun Yang via petsc-users > wrote: I called VecDestroy() in the destructor for this object ? is that not the right way to do it? In Domain::testScatters(), you have many VecDuplicate(,&out), You need to VecDestroy(&out) before doing new VecDuplicate(,&out); How do I implement CHECK ALL RETURN CODES? For each PETSc function, do ierr = ...; CHKERRQ(ierr); From: Matthew Knepley > Sent: Wednesday, February 27, 2019 7:24 AM To: Yuyun Yang > Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed You call VecDuplicate() a bunch, but VecDestroy() only once in the bottom function. This is wrong. Also, CHECK ALL RETURN CODES. This is the fastest way to find errors. Matt On Wed, Feb 27, 2019 at 2:06 AM Yuyun Yang via petsc-users > wrote: Hello team, I ran into the address sanitizer error that I hope you could help me with. I don?t really know what?s wrong with the way the code frees memory. The relevant code file is attached. The line number following domain.cpp specifically referenced to the vector _q, which seems a little odd, since some other vectors are constructed and freed the same way. ==1719==ERROR: AddressSanitizer: attempting free on address which was not malloc()-ed: 0x61f0000076c0 in thread T0 #0 0x7fbf195282ca in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x982ca) #1 0x7fbf1706f895 in PetscFreeAlign /home/yyy910805/petsc/src/sys/memory/mal.c:87 #2 0x7fbf1731a898 in VecDestroy_Seq /home/yyy910805/petsc/src/vec/vec/impls/seq/bvec2.c:788 #3 0x7fbf1735f795 in VecDestroy /home/yyy910805/petsc/src/vec/vec/interface/vector.c:408 #4 0x40dd0a in Domain::~Domain() /home/yyy910805/scycle/source/domain.cpp:132 #5 0x40b479 in main /home/yyy910805/scycle/source/main.cpp:242 #6 0x7fbf14d2082f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) #7 0x4075d8 in _start (/home/yyy910805/scycle/source/main+0x4075d8) 0x61f0000076c0 is located 1600 bytes inside of 3220-byte region [0x61f000007080,0x61f000007d14) allocated by thread T0 here: #0 0x7fbf19528b32 in __interceptor_memalign (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98b32) #1 0x7fbf1706f7e0 in PetscMallocAlign /home/yyy910805/petsc/src/sys/memory/mal.c:41 #2 0x7fbf17073022 in PetscTrMallocDefault /home/yyy910805/petsc/src/sys/memory/mtr.c:183 #3 0x7fbf170710a1 in PetscMallocA /home/yyy910805/petsc/src/sys/memory/mal.c:397 #4 0x7fbf17326fb0 in VecCreate_Seq /home/yyy910805/petsc/src/vec/vec/impls/seq/bvec3.c:35 #5 0x7fbf1736f560 in VecSetType /home/yyy910805/petsc/src/vec/vec/interface/vecreg.c:51 #6 0x7fbf1731afae in VecDuplicate_Seq /home/yyy910805/petsc/src/vec/vec/impls/seq/bvec2.c:807 #7 0x7fbf1735eff7 in VecDuplicate /home/yyy910805/petsc/src/vec/vec/interface/vector.c:379 #8 0x4130de in Domain::setFields() /home/yyy910805/scycle/source/domain.cpp:431 #9 0x40c60a in Domain::Domain(char const*) /home/yyy910805/scycle/source/domain.cpp:57 #10 0x40b433 in main 
/home/yyy910805/scycle/source/main.cpp:242 #11 0x7fbf14d2082f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) SUMMARY: AddressSanitizer: bad-free ??:0 __interceptor_free ==1719==ABORTING Thanks very much! Yuyun -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: domain.cpp URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: domain.hpp Type: application/octet-stream Size: 710 bytes Desc: domain.hpp URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: makefile Type: application/octet-stream Size: 624 bytes Desc: makefile URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: test_domain.cpp URL: From knepley at gmail.com Sun Mar 3 15:36:15 2019 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 3 Mar 2019 16:36:15 -0500 Subject: [petsc-users] Get global column indices thay belong to another process? In-Reply-To: <538C8392-1829-4D40-8C48-BC1E8D13CE89@imperial.ac.uk> References: <538C8392-1829-4D40-8C48-BC1E8D13CE89@imperial.ac.uk> Message-ID: On Sun, Mar 3, 2019 at 4:06 PM Appel, Thibaut via petsc-users < petsc-users at mcs.anl.gov> wrote: > Assuming you preallocate/assemble a MPIAIJ matrix ?by hand? using a > ISLocalToGlobalMapping object obtained from DMDA: what?s the easiest way to > get global column indices that do not belong to a process? > > Say, if you have a finite-difference stencil and the stencil exceeds the > portion owned by the process. > > I think the way to go is to preallocate a MPIAIJ matrix using local > information with MatPreallocateSetLocal and then fill values with > MatSetLocal, those routines can?t work with rows/columns not owned by the > process right? > I can't quite understand what you want. Are you saying: Suppose I create a LocalToGlobalMapping, but I still want to set values on another process that are not in my map? If you have no other structure, then you just need to know the global indices for the element. It sounds like then you say, if I am using a DMDA and I declare a stencil width of s, but want to set value for point greater than s away from the block I own, can I determine those indices? Yes, its just a pain. You can get everything you need from DMDAInfo and DMDALocalInfo, but we have no function to do it. Matt > Thank you, > > Thibaut -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Mar 3 16:01:21 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) 
Date: Sun, 3 Mar 2019 22:01:21 +0000 Subject: [petsc-users] Get global column indices thay belong to another process? In-Reply-To: <538C8392-1829-4D40-8C48-BC1E8D13CE89@imperial.ac.uk> References: <538C8392-1829-4D40-8C48-BC1E8D13CE89@imperial.ac.uk> Message-ID: <17E78FAD-FD57-4660-9A71-A3F49E17E070@anl.gov> > On Mar 3, 2019, at 3:04 PM, Appel, Thibaut via petsc-users wrote: > > Assuming you preallocate/assemble a MPIAIJ matrix ?by hand? using a ISLocalToGlobalMapping object obtained from DMDA: what?s the easiest way to get global column indices that do not belong to a process? > > Say, if you have a finite-difference stencil and the stencil exceeds the portion owned by the process. > > I think the way to go is to preallocate a MPIAIJ matrix using local information with MatPreallocateSetLocal and then fill values with MatSetLocal, those routines can?t work with rows/columns not owned by the process right? Yes, MatSetValuesLocal() cannot write for matrix locations that are NOT in the ghosted region of the process. > > Thank you, > > Thibaut From jed at jedbrown.org Sun Mar 3 16:19:05 2019 From: jed at jedbrown.org (Jed Brown) Date: Sun, 03 Mar 2019 15:19:05 -0700 Subject: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed In-Reply-To: References: Message-ID: <878sxvy52e.fsf@jedbrown.org> If you run this with MPICH, it prints Attempting to use an MPI routine after finalizing MPICH You need to ensure that the C++ class destructor is called before PetscFinalize. For example, like this: diff --git i/test_domain.cpp w/test_domain.cpp index 0cfe22f..23545f2 100644 --- i/test_domain.cpp +++ w/test_domain.cpp @@ -8,11 +8,12 @@ int main(int argc, char **argv) { PetscErrorCode ierr = 0; PetscInitialize(&argc, &argv, NULL, NULL); - Domain d; + { + Domain d; - ierr = d.setFields(); CHKERRQ(ierr); - ierr = d.setScatters(); CHKERRQ(ierr); - + ierr = d.setFields(); CHKERRQ(ierr); + ierr = d.setScatters(); CHKERRQ(ierr); + } PetscFinalize(); return ierr; } Yuyun Yang via petsc-users writes: > Yes, please see the attached files for a minimal example. Thanks a lot! > > Best regards, > Yuyun > > From: Matthew Knepley > Sent: Sunday, March 3, 2019 12:46 PM > To: Yuyun Yang > Cc: Zhang, Junchao ; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed > > On Sun, Mar 3, 2019 at 3:05 PM Yuyun Yang > wrote: > Actually, I tried just creating a domain object (since the address sanitizer was complaining about that code to start with). Simply creating that object gave me a core dump, so I suppose the issue must be there. I got the following message when running the code with -objects_dump flag on the command line, but couldn?t find a problem with the code (I?ve attached it here with only the relevant functions). > > I think what we are going to need from you is a minimal example that has the error. I am guessing you have a logic > bug in the C++, which we cannot debug by looking. > > Thanks, > > Matt > > Thanks a lot for your help! 
> > The following objects were never freed > ----------------------------------------- > [0] Vec seq y > [0] VecCreate() in /home/yyy910805/petsc/src/vec/vec/interface/veccreate.c > [0] Vec seq Vec_0x84000000_0 > [0] VecCreate() in /home/yyy910805/petsc/src/vec/vec/interface/veccreate.c > [0] Vec seq Vec_0x84000000_1 > [0] VecCreate() in /home/yyy910805/petsc/src/vec/vec/interface/veccreate.c > [0] VecScatter seq VecScatter_0x84000000_2 > [0] VecScatterCreate() in /home/yyy910805/petsc/src/vec/vscat/interface/vscreate.c > [0] VecScatter seq VecScatter_0x84000000_3 > [0] VecScatterCreate() in /home/yyy910805/petsc/src/vec/vscat/interface/vscreate.c > [0] VecScatter seq VecScatter_0x84000000_4 > [0] VecScatterCreate() in /home/yyy910805/petsc/src/vec/vscat/interface/vscreate.c > [0] VecScatter seq VecScatter_0x84000000_5 > [0] VecScatterCreate() in /home/yyy910805/petsc/src/vec/vscat/interface/vscreate.c > Attempting to use an MPI routine after finalizing MPICH > > ------------------------------------------------------------------------------------------------------------------------------------------------- > From: Matthew Knepley > > Sent: Sunday, March 3, 2019 11:28 AM > To: Yuyun Yang > > Cc: Zhang, Junchao >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed > > On Sun, Mar 3, 2019 at 1:03 PM Yuyun Yang via petsc-users > wrote: > I tried compiling without the sanitizer and running on valgrind. Got a bunch of errors ?Uninitialised value was created by a stack allocation at 0x41B280: ComputeVel_qd::computeVel(double*, double, int&, int)?. > > There is no memory management code here, so other parts of the code must be relevant. > > Thanks, > > Matt > > > HEAP SUMMARY: > ==74== in use at exit: 96,637 bytes in 91 blocks > ==74== total heap usage: 47,774 allocs, 47,522 frees, 308,253,653 bytes allocated > LEAK SUMMARY: > ==74== definitely lost: 0 bytes in 0 blocks > ==74== indirectly lost: 0 bytes in 0 blocks > ==74== possibly lost: 0 bytes in 0 blocks > ==74== still reachable: 96,637 bytes in 91 blocks > ==74== suppressed: 0 bytes in 0 blocks > > The error is located in the attached code (I?ve extracted only the relevant functions), but I couldn?t figure out what is wrong. Is this causing the memory corruption/double free error that happens when I execute the code? > > Thanks a lot for your help. > > Best regards, > Yuyun > > From: Zhang, Junchao > > Sent: Friday, March 1, 2019 7:36 AM > To: Yuyun Yang > > Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed > > > On Fri, Mar 1, 2019 at 1:02 AM Yuyun Yang > wrote: > Actually, I also saw a line at the beginning of valgrind saying "shadow memory range interleaves with an existing memory mapping. ASan cannot proceed properly. ABORTING." I guess the code didn't really run through valgrind since it aborted. Should I remove the address sanitizer flag when compiling? > From the message, it seems ASan (not valgrind) aborted. You can try to compile without sanitizer and then run with valgrind. If no problem, then it is probably a sanitizer issue. > > > Get Outlook for iOS > ________________________________ > From: Yuyun Yang > Sent: Thursday, February 28, 2019 10:54:57 PM > To: Zhang, Junchao > Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed > > Hmm, still getting the same error from address sanitizer even though valgrind shows no errors and no leaks are possible. 
> > Should I ignore that error? My results did run alright. > > Best, > Yuyun > > Get Outlook for iOS > ________________________________ > From: Zhang, Junchao > > Sent: Wednesday, February 27, 2019 8:27:17 PM > To: Yuyun Yang > Cc: Matthew Knepley; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed > > Try the following to see if you can catch the bug easily: 1) Get error code for each petsc function and check it with CHKERRQ; 2) Link your code with a petsc library with debugging enabled (configured with --with-debugging=1); 3) Run your code with valgrind > > --Junchao Zhang > > > On Wed, Feb 27, 2019 at 9:04 PM Yuyun Yang > wrote: > Hi Junchao, > > This code actually involves a lot of classes and is pretty big. Might be an overkill for me to send everything to you. I'd like to know if I see this sort of error message, which points to this domain file, is it possible that the problem happens in another file (whose operations are linked to this one)? If so, I'll debug a little more and maybe send you more useful information later. > > Best regards, > Yuyun > > Get Outlook for iOS > ________________________________ > From: Zhang, Junchao > > Sent: Wednesday, February 27, 2019 6:24:13 PM > To: Yuyun Yang > Cc: Matthew Knepley; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed > > Could you provide a compilable and runnable test so I can try it? > --Junchao Zhang > > > On Wed, Feb 27, 2019 at 7:34 PM Yuyun Yang > wrote: > Thanks, I fixed that, but I?m not actually calling the testScatters() function in my implementation (in the constructor, the only functions I called are setFields and setScatters). So the problem couldn?t have been that? > > Best, > Yuyun > > From: Zhang, Junchao > > Sent: Wednesday, February 27, 2019 10:50 AM > To: Yuyun Yang > > Cc: Matthew Knepley >; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed > > > On Wed, Feb 27, 2019 at 10:41 AM Yuyun Yang via petsc-users > wrote: > I called VecDestroy() in the destructor for this object ? is that not the right way to do it? > In Domain::testScatters(), you have many VecDuplicate(,&out), You need to VecDestroy(&out) before doing new VecDuplicate(,&out); > How do I implement CHECK ALL RETURN CODES? > For each PETSc function, do ierr = ...; CHKERRQ(ierr); > > From: Matthew Knepley > > Sent: Wednesday, February 27, 2019 7:24 AM > To: Yuyun Yang > > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed > > You call VecDuplicate() a bunch, but VecDestroy() only once in the bottom function. This is wrong. > Also, CHECK ALL RETURN CODES. This is the fastest way to find errors. > > Matt > > On Wed, Feb 27, 2019 at 2:06 AM Yuyun Yang via petsc-users > wrote: > Hello team, > > I ran into the address sanitizer error that I hope you could help me with. I don?t really know what?s wrong with the way the code frees memory. The relevant code file is attached. The line number following domain.cpp specifically referenced to the vector _q, which seems a little odd, since some other vectors are constructed and freed the same way. 
> > ==1719==ERROR: AddressSanitizer: attempting free on address which was not malloc()-ed: 0x61f0000076c0 in thread T0 > #0 0x7fbf195282ca in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x982ca) > #1 0x7fbf1706f895 in PetscFreeAlign /home/yyy910805/petsc/src/sys/memory/mal.c:87 > #2 0x7fbf1731a898 in VecDestroy_Seq /home/yyy910805/petsc/src/vec/vec/impls/seq/bvec2.c:788 > #3 0x7fbf1735f795 in VecDestroy /home/yyy910805/petsc/src/vec/vec/interface/vector.c:408 > #4 0x40dd0a in Domain::~Domain() /home/yyy910805/scycle/source/domain.cpp:132 > #5 0x40b479 in main /home/yyy910805/scycle/source/main.cpp:242 > #6 0x7fbf14d2082f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) > #7 0x4075d8 in _start (/home/yyy910805/scycle/source/main+0x4075d8) > > 0x61f0000076c0 is located 1600 bytes inside of 3220-byte region [0x61f000007080,0x61f000007d14) > allocated by thread T0 here: > #0 0x7fbf19528b32 in __interceptor_memalign (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98b32) > #1 0x7fbf1706f7e0 in PetscMallocAlign /home/yyy910805/petsc/src/sys/memory/mal.c:41 > #2 0x7fbf17073022 in PetscTrMallocDefault /home/yyy910805/petsc/src/sys/memory/mtr.c:183 > #3 0x7fbf170710a1 in PetscMallocA /home/yyy910805/petsc/src/sys/memory/mal.c:397 > #4 0x7fbf17326fb0 in VecCreate_Seq /home/yyy910805/petsc/src/vec/vec/impls/seq/bvec3.c:35 > #5 0x7fbf1736f560 in VecSetType /home/yyy910805/petsc/src/vec/vec/interface/vecreg.c:51 > #6 0x7fbf1731afae in VecDuplicate_Seq /home/yyy910805/petsc/src/vec/vec/impls/seq/bvec2.c:807 > #7 0x7fbf1735eff7 in VecDuplicate /home/yyy910805/petsc/src/vec/vec/interface/vector.c:379 > #8 0x4130de in Domain::setFields() /home/yyy910805/scycle/source/domain.cpp:431 > #9 0x40c60a in Domain::Domain(char const*) /home/yyy910805/scycle/source/domain.cpp:57 > #10 0x40b433 in main /home/yyy910805/scycle/source/main.cpp:242 > #11 0x7fbf14d2082f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) > > SUMMARY: AddressSanitizer: bad-free ??:0 __interceptor_free > ==1719==ABORTING > > Thanks very much! > Yuyun > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
> -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > #include "domain.hpp" > > using namespace std; > > Domain::Domain() > : _sbpType("mfc_coordTrans"),_Ny(201),_Nz(1),_Ly(30),_Lz(30), > _q(NULL),_r(NULL),_y(NULL),_z(NULL),_y0(NULL),_z0(NULL),_dq(-1),_dr(-1), > _bCoordTrans(5) > { > if (_Ny > 1) { > _dq = 1.0 / (_Ny - 1.0); > } > else { > _dq = 1; > } > > if (_Nz > 1) { > _dr = 1.0 / (_Nz - 1.0); > } > else { > _dr = 1; > } > } > > // destructor > Domain::~Domain() > { > // free memory > VecDestroy(&_q); > VecDestroy(&_r); > VecDestroy(&_y); > VecDestroy(&_z); > VecDestroy(&_y0); > VecDestroy(&_z0); > > // set map iterator, free memory from VecScatter > map::iterator it; > for (it = _scatters.begin(); it != _scatters.end(); it++) { > VecScatterDestroy(&(it->second)); > } > } > > // construct coordinate transform, setting vectors q, r, y, z > PetscErrorCode Domain::setFields() > { > PetscErrorCode ierr = 0; > > // generate vector _y with size _Ny*_Nz > ierr = VecCreate(PETSC_COMM_WORLD,&_y); CHKERRQ(ierr); > ierr = VecSetSizes(_y,PETSC_DECIDE,_Ny*_Nz); CHKERRQ(ierr); > ierr = VecSetFromOptions(_y); CHKERRQ(ierr); > ierr = PetscObjectSetName((PetscObject) _y, "y"); CHKERRQ(ierr); > > // duplicate _y into _z, _q, _r > ierr = VecDuplicate(_y,&_z); CHKERRQ(ierr); > ierr = PetscObjectSetName((PetscObject) _z, "z"); CHKERRQ(ierr); > ierr = VecDuplicate(_y,&_q); CHKERRQ(ierr); > ierr = PetscObjectSetName((PetscObject) _q, "q"); CHKERRQ(ierr); > ierr = VecDuplicate(_y,&_r); CHKERRQ(ierr); > ierr = PetscObjectSetName((PetscObject) _r, "r"); CHKERRQ(ierr); > > // construct coordinate transform > PetscInt Ii,Istart,Iend,Jj = 0; > PetscScalar *y,*z,*q,*r; > ierr = VecGetOwnershipRange(_q,&Istart,&Iend);CHKERRQ(ierr); > > // return pointers to local data arrays (the processor's portion of vector data) > ierr = VecGetArray(_y,&y); CHKERRQ(ierr); > ierr = VecGetArray(_z,&z); CHKERRQ(ierr); > ierr = VecGetArray(_q,&q); CHKERRQ(ierr); > ierr = VecGetArray(_r,&r); CHKERRQ(ierr); > > // set vector entries for q, r (coordinate transform) and y, z (no transform) > for (Ii=Istart; Ii q[Jj] = _dq*(Ii/_Nz); > r[Jj] = _dr*(Ii-_Nz*(Ii/_Nz)); > > // matrix-based, fully compatible, allows curvilinear coordinate transformation > if (_sbpType.compare("mfc_coordTrans") ) { > y[Jj] = (_dq*_Ly)*(Ii/_Nz); > z[Jj] = (_dr*_Lz)*(Ii-_Nz*(Ii/_Nz)); > } > else { > // hardcoded transformation (not available for z) > if (_bCoordTrans > 0) { > y[Jj] = _Ly * sinh(_bCoordTrans * q[Jj]) / sinh(_bCoordTrans); > } > // no transformation > y[Jj] = q[Jj]*_Ly; > z[Jj] = r[Jj]*_Lz; > } > Jj++; > } > > // restore arrays > ierr = VecRestoreArray(_y,&y); CHKERRQ(ierr); > ierr = VecRestoreArray(_z,&z); CHKERRQ(ierr); > ierr = VecRestoreArray(_q,&q); CHKERRQ(ierr); > ierr = VecRestoreArray(_r,&r); CHKERRQ(ierr); > > return ierr; > } > > > // scatters values from one vector to another > PetscErrorCode Domain::setScatters() { > PetscErrorCode ierr = 0; > > ierr = VecCreate(PETSC_COMM_WORLD,&_y0); CHKERRQ(ierr); > ierr = VecSetSizes(_y0,PETSC_DECIDE,_Nz); CHKERRQ(ierr); > ierr = VecSetFromOptions(_y0); CHKERRQ(ierr); > ierr = VecSet(_y0,0.0); CHKERRQ(ierr); > > ierr = VecCreate(PETSC_COMM_WORLD,&_z0); CHKERRQ(ierr); > ierr = VecSetSizes(_z0,PETSC_DECIDE,_Ny); CHKERRQ(ierr); > ierr = VecSetFromOptions(_z0); CHKERRQ(ierr); > ierr = VecSet(_z0,0.0); CHKERRQ(ierr); > > PetscInt *indices; > IS is; > ierr = PetscMalloc1(_Nz,&indices); CHKERRQ(ierr); > > // we want to scatter from index 0 to _Nz - 1, i.e. 
take the first _Nz components of the vector to scatter from > for (PetscInt Ii = 0; Ii<_Nz; Ii++) { > indices[Ii] = Ii; > } > > ierr = ISCreateGeneral(PETSC_COMM_WORLD, _Nz, indices, PETSC_COPY_VALUES, &is); CHKERRQ(ierr); > ierr = VecScatterCreate(_y, is, _y0, is, &_scatters["body2L"]); CHKERRQ(ierr); > > // free memory > ierr = PetscFree(indices); CHKERRQ(ierr); > ierr = ISDestroy(&is); CHKERRQ(ierr); > > //=============================================================================== > // set up scatter context to take values for y = Ly from body field and put them on a Vec of size Nz > PetscInt *fi; > IS isf; > ierr = PetscMalloc1(_Nz,&fi); CHKERRQ(ierr); > > for (PetscInt Ii = 0; Ii<_Nz; Ii++) { > fi[Ii] = Ii + (_Ny*_Nz-_Nz); > } > ierr = ISCreateGeneral(PETSC_COMM_WORLD, _Nz, fi, PETSC_COPY_VALUES, &isf); CHKERRQ(ierr); > > PetscInt *ti; > IS ist; > ierr = PetscMalloc1(_Nz,&ti); CHKERRQ(ierr); > for (PetscInt Ii = 0; Ii<_Nz; Ii++) { > ti[Ii] = Ii; > } > ierr = ISCreateGeneral(PETSC_COMM_WORLD, _Nz, ti, PETSC_COPY_VALUES, &ist); CHKERRQ(ierr); > ierr = VecScatterCreate(_y, isf, _y0, ist, &_scatters["body2R"]); CHKERRQ(ierr); > > // free memory > ierr = PetscFree(fi); CHKERRQ(ierr); > ierr = PetscFree(ti); CHKERRQ(ierr); > ierr = ISDestroy(&isf); CHKERRQ(ierr); > ierr = ISDestroy(&ist); CHKERRQ(ierr); > > > //============================================================================== > IS isf2; > ierr = ISCreateStride(PETSC_COMM_WORLD, _Ny, 0, _Nz, &isf2); CHKERRQ(ierr); > > PetscInt *ti2; > IS ist2; > ierr = PetscMalloc1(_Ny,&ti2); CHKERRQ(ierr); > > for (PetscInt Ii=0; Ii<_Ny; Ii++) { > ti2[Ii] = Ii; > } > ierr = ISCreateGeneral(PETSC_COMM_WORLD, _Ny, ti2, PETSC_COPY_VALUES, &ist2); CHKERRQ(ierr); > ierr = VecScatterCreate(_y, isf2, _z0, ist2, &_scatters["body2T"]); CHKERRQ(ierr); > > // free memory > ierr = PetscFree(ti2); CHKERRQ(ierr); > ierr = ISDestroy(&isf2); CHKERRQ(ierr); > ierr = ISDestroy(&ist2); CHKERRQ(ierr); > > > //============================================================================== > IS isf3; > ierr = ISCreateStride(PETSC_COMM_WORLD, _Ny, _Nz - 1, _Nz, &isf3); CHKERRQ(ierr); > > PetscInt *ti3; > IS ist3; > ierr = PetscMalloc1(_Ny,&ti3); CHKERRQ(ierr); > for (PetscInt Ii = 0; Ii<_Ny; Ii++) { > ti3[Ii] = Ii; > } > ierr = ISCreateGeneral(PETSC_COMM_WORLD, _Ny, ti3, PETSC_COPY_VALUES, &ist3); CHKERRQ(ierr); > ierr = VecScatterCreate(_y, isf3, _z0, ist3, &_scatters["body2B"]); CHKERRQ(ierr); > > // free memory > ierr = PetscFree(ti3); CHKERRQ(ierr); > ierr = ISDestroy(&isf3); CHKERRQ(ierr); > ierr = ISDestroy(&ist3); CHKERRQ(ierr); > > return ierr; > } > #include "domain.hpp" > > using namespace std; > > // creates a domain object > int main(int argc, char **argv) { > > PetscErrorCode ierr = 0; > PetscInitialize(&argc, &argv, NULL, NULL); > > Domain d; > > ierr = d.setFields(); CHKERRQ(ierr); > ierr = d.setScatters(); CHKERRQ(ierr); > > PetscFinalize(); > return ierr; > } From yyang85 at stanford.edu Sun Mar 3 16:28:33 2019 From: yyang85 at stanford.edu (Yuyun Yang) Date: Sun, 3 Mar 2019 22:28:33 +0000 Subject: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed In-Reply-To: <878sxvy52e.fsf@jedbrown.org> References: <878sxvy52e.fsf@jedbrown.org> Message-ID: Oh interesting, so I need to add those extra brackets around my class object and function calls. I thought the destructor is automatically at Finalize. Thanks! 
Yuyun -----Original Message----- From: Jed Brown Sent: Sunday, March 3, 2019 2:19 PM To: Yuyun Yang ; Matthew Knepley Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed If you run this with MPICH, it prints Attempting to use an MPI routine after finalizing MPICH You need to ensure that the C++ class destructor is called before PetscFinalize. For example, like this: diff --git i/test_domain.cpp w/test_domain.cpp index 0cfe22f..23545f2 100644 --- i/test_domain.cpp +++ w/test_domain.cpp @@ -8,11 +8,12 @@ int main(int argc, char **argv) { PetscErrorCode ierr = 0; PetscInitialize(&argc, &argv, NULL, NULL); - Domain d; + { + Domain d; - ierr = d.setFields(); CHKERRQ(ierr); - ierr = d.setScatters(); CHKERRQ(ierr); - + ierr = d.setFields(); CHKERRQ(ierr); + ierr = d.setScatters(); CHKERRQ(ierr); } PetscFinalize(); return ierr; } Yuyun Yang via petsc-users writes: > Yes, please see the attached files for a minimal example. Thanks a lot! > > Best regards, > Yuyun > > From: Matthew Knepley > Sent: Sunday, March 3, 2019 12:46 PM > To: Yuyun Yang > Cc: Zhang, Junchao ; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] AddressSanitizer: attempting free on > address which was not malloc()-ed > > On Sun, Mar 3, 2019 at 3:05 PM Yuyun Yang > wrote: > Actually, I tried just creating a domain object (since the address sanitizer was complaining about that code to start with). Simply creating that object gave me a core dump, so I suppose the issue must be there. I got the following message when running the code with -objects_dump flag on the command line, but couldn?t find a problem with the code (I?ve attached it here with only the relevant functions). > > I think what we are going to need from you is a minimal example that > has the error. I am guessing you have a logic bug in the C++, which we cannot debug by looking. > > Thanks, > > Matt > > Thanks a lot for your help! > > The following objects were never freed > ----------------------------------------- > [0] Vec seq y > [0] VecCreate() in > /home/yyy910805/petsc/src/vec/vec/interface/veccreate.c > [0] Vec seq Vec_0x84000000_0 > [0] VecCreate() in > /home/yyy910805/petsc/src/vec/vec/interface/veccreate.c > [0] Vec seq Vec_0x84000000_1 > [0] VecCreate() in > /home/yyy910805/petsc/src/vec/vec/interface/veccreate.c > [0] VecScatter seq VecScatter_0x84000000_2 > [0] VecScatterCreate() in > /home/yyy910805/petsc/src/vec/vscat/interface/vscreate.c > [0] VecScatter seq VecScatter_0x84000000_3 > [0] VecScatterCreate() in > /home/yyy910805/petsc/src/vec/vscat/interface/vscreate.c > [0] VecScatter seq VecScatter_0x84000000_4 > [0] VecScatterCreate() in > /home/yyy910805/petsc/src/vec/vscat/interface/vscreate.c > [0] VecScatter seq VecScatter_0x84000000_5 > [0] VecScatterCreate() in > /home/yyy910805/petsc/src/vec/vscat/interface/vscreate.c > Attempting to use an MPI routine after finalizing MPICH > > ---------------------------------------------------------------------- > ---------------------------------------------------------------------- > ----- > From: Matthew Knepley > > Sent: Sunday, March 3, 2019 11:28 AM > To: Yuyun Yang > > Cc: Zhang, Junchao >; > petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] AddressSanitizer: attempting free on > address which was not malloc()-ed > > On Sun, Mar 3, 2019 at 1:03 PM Yuyun Yang via petsc-users > wrote: > I tried compiling without the sanitizer and running on valgrind. 
Got a bunch of errors ?Uninitialised value was created by a stack allocation at 0x41B280: ComputeVel_qd::computeVel(double*, double, int&, int)?. > > There is no memory management code here, so other parts of the code must be relevant. > > Thanks, > > Matt > > > HEAP SUMMARY: > ==74== in use at exit: 96,637 bytes in 91 blocks > ==74== total heap usage: 47,774 allocs, 47,522 frees, 308,253,653 bytes allocated > LEAK SUMMARY: > ==74== definitely lost: 0 bytes in 0 blocks > ==74== indirectly lost: 0 bytes in 0 blocks > ==74== possibly lost: 0 bytes in 0 blocks > ==74== still reachable: 96,637 bytes in 91 blocks > ==74== suppressed: 0 bytes in 0 blocks > > The error is located in the attached code (I?ve extracted only the relevant functions), but I couldn?t figure out what is wrong. Is this causing the memory corruption/double free error that happens when I execute the code? > > Thanks a lot for your help. > > Best regards, > Yuyun > > From: Zhang, Junchao > > Sent: Friday, March 1, 2019 7:36 AM > To: Yuyun Yang > > Subject: Re: [petsc-users] AddressSanitizer: attempting free on > address which was not malloc()-ed > > > On Fri, Mar 1, 2019 at 1:02 AM Yuyun Yang > wrote: > Actually, I also saw a line at the beginning of valgrind saying "shadow memory range interleaves with an existing memory mapping. ASan cannot proceed properly. ABORTING." I guess the code didn't really run through valgrind since it aborted. Should I remove the address sanitizer flag when compiling? > From the message, it seems ASan (not valgrind) aborted. You can try to compile without sanitizer and then run with valgrind. If no problem, then it is probably a sanitizer issue. > > > Get Outlook for iOS > ________________________________ > From: Yuyun Yang > Sent: Thursday, February 28, 2019 10:54:57 PM > To: Zhang, Junchao > Subject: Re: [petsc-users] AddressSanitizer: attempting free on > address which was not malloc()-ed > > Hmm, still getting the same error from address sanitizer even though valgrind shows no errors and no leaks are possible. > > Should I ignore that error? My results did run alright. > > Best, > Yuyun > > Get Outlook for iOS > ________________________________ > From: Zhang, Junchao > > Sent: Wednesday, February 27, 2019 8:27:17 PM > To: Yuyun Yang > Cc: Matthew Knepley; > petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] AddressSanitizer: attempting free on > address which was not malloc()-ed > > Try the following to see if you can catch the bug easily: 1) Get error > code for each petsc function and check it with CHKERRQ; 2) Link your > code with a petsc library with debugging enabled (configured with > --with-debugging=1); 3) Run your code with valgrind > > --Junchao Zhang > > > On Wed, Feb 27, 2019 at 9:04 PM Yuyun Yang > wrote: > Hi Junchao, > > This code actually involves a lot of classes and is pretty big. Might be an overkill for me to send everything to you. I'd like to know if I see this sort of error message, which points to this domain file, is it possible that the problem happens in another file (whose operations are linked to this one)? If so, I'll debug a little more and maybe send you more useful information later. 
> > Best regards, > Yuyun > > Get Outlook for iOS > ________________________________ > From: Zhang, Junchao > > Sent: Wednesday, February 27, 2019 6:24:13 PM > To: Yuyun Yang > Cc: Matthew Knepley; > petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] AddressSanitizer: attempting free on > address which was not malloc()-ed > > Could you provide a compilable and runnable test so I can try it? > --Junchao Zhang > > > On Wed, Feb 27, 2019 at 7:34 PM Yuyun Yang > wrote: > Thanks, I fixed that, but I?m not actually calling the testScatters() function in my implementation (in the constructor, the only functions I called are setFields and setScatters). So the problem couldn?t have been that? > > Best, > Yuyun > > From: Zhang, Junchao > > Sent: Wednesday, February 27, 2019 10:50 AM > To: Yuyun Yang > > Cc: Matthew Knepley >; > petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] AddressSanitizer: attempting free on > address which was not malloc()-ed > > > On Wed, Feb 27, 2019 at 10:41 AM Yuyun Yang via petsc-users > wrote: > I called VecDestroy() in the destructor for this object ? is that not the right way to do it? > In Domain::testScatters(), you have many VecDuplicate(,&out), You need > to VecDestroy(&out) before doing new VecDuplicate(,&out); How do I implement CHECK ALL RETURN CODES? > For each PETSc function, do ierr = ...; CHKERRQ(ierr); > > From: Matthew Knepley > > Sent: Wednesday, February 27, 2019 7:24 AM > To: Yuyun Yang > > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] AddressSanitizer: attempting free on > address which was not malloc()-ed > > You call VecDuplicate() a bunch, but VecDestroy() only once in the bottom function. This is wrong. > Also, CHECK ALL RETURN CODES. This is the fastest way to find errors. > > Matt > > On Wed, Feb 27, 2019 at 2:06 AM Yuyun Yang via petsc-users > wrote: > Hello team, > > I ran into the address sanitizer error that I hope you could help me with. I don?t really know what?s wrong with the way the code frees memory. The relevant code file is attached. The line number following domain.cpp specifically referenced to the vector _q, which seems a little odd, since some other vectors are constructed and freed the same way. 
> > ==1719==ERROR: AddressSanitizer: attempting free on address which was not malloc()-ed: 0x61f0000076c0 in thread T0 > #0 0x7fbf195282ca in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x982ca) > #1 0x7fbf1706f895 in PetscFreeAlign /home/yyy910805/petsc/src/sys/memory/mal.c:87 > #2 0x7fbf1731a898 in VecDestroy_Seq /home/yyy910805/petsc/src/vec/vec/impls/seq/bvec2.c:788 > #3 0x7fbf1735f795 in VecDestroy /home/yyy910805/petsc/src/vec/vec/interface/vector.c:408 > #4 0x40dd0a in Domain::~Domain() /home/yyy910805/scycle/source/domain.cpp:132 > #5 0x40b479 in main /home/yyy910805/scycle/source/main.cpp:242 > #6 0x7fbf14d2082f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) > #7 0x4075d8 in _start > (/home/yyy910805/scycle/source/main+0x4075d8) > > 0x61f0000076c0 is located 1600 bytes inside of 3220-byte region > [0x61f000007080,0x61f000007d14) allocated by thread T0 here: > #0 0x7fbf19528b32 in __interceptor_memalign (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98b32) > #1 0x7fbf1706f7e0 in PetscMallocAlign /home/yyy910805/petsc/src/sys/memory/mal.c:41 > #2 0x7fbf17073022 in PetscTrMallocDefault /home/yyy910805/petsc/src/sys/memory/mtr.c:183 > #3 0x7fbf170710a1 in PetscMallocA /home/yyy910805/petsc/src/sys/memory/mal.c:397 > #4 0x7fbf17326fb0 in VecCreate_Seq /home/yyy910805/petsc/src/vec/vec/impls/seq/bvec3.c:35 > #5 0x7fbf1736f560 in VecSetType /home/yyy910805/petsc/src/vec/vec/interface/vecreg.c:51 > #6 0x7fbf1731afae in VecDuplicate_Seq /home/yyy910805/petsc/src/vec/vec/impls/seq/bvec2.c:807 > #7 0x7fbf1735eff7 in VecDuplicate /home/yyy910805/petsc/src/vec/vec/interface/vector.c:379 > #8 0x4130de in Domain::setFields() /home/yyy910805/scycle/source/domain.cpp:431 > #9 0x40c60a in Domain::Domain(char const*) /home/yyy910805/scycle/source/domain.cpp:57 > #10 0x40b433 in main /home/yyy910805/scycle/source/main.cpp:242 > #11 0x7fbf14d2082f in __libc_start_main > (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) > > SUMMARY: AddressSanitizer: bad-free ??:0 __interceptor_free > ==1719==ABORTING > > Thanks very much! > Yuyun > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ ley/> > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ ley/> > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
> -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ ley/> > #include "domain.hpp" > > using namespace std; > > Domain::Domain() > : _sbpType("mfc_coordTrans"),_Ny(201),_Nz(1),_Ly(30),_Lz(30), > _q(NULL),_r(NULL),_y(NULL),_z(NULL),_y0(NULL),_z0(NULL),_dq(-1),_dr(-1), > _bCoordTrans(5) > { > if (_Ny > 1) { > _dq = 1.0 / (_Ny - 1.0); > } > else { > _dq = 1; > } > > if (_Nz > 1) { > _dr = 1.0 / (_Nz - 1.0); > } > else { > _dr = 1; > } > } > > // destructor > Domain::~Domain() > { > // free memory > VecDestroy(&_q); > VecDestroy(&_r); > VecDestroy(&_y); > VecDestroy(&_z); > VecDestroy(&_y0); > VecDestroy(&_z0); > > // set map iterator, free memory from VecScatter > map::iterator it; > for (it = _scatters.begin(); it != _scatters.end(); it++) { > VecScatterDestroy(&(it->second)); > } > } > > // construct coordinate transform, setting vectors q, r, y, z > PetscErrorCode Domain::setFields() { > PetscErrorCode ierr = 0; > > // generate vector _y with size _Ny*_Nz > ierr = VecCreate(PETSC_COMM_WORLD,&_y); CHKERRQ(ierr); > ierr = VecSetSizes(_y,PETSC_DECIDE,_Ny*_Nz); CHKERRQ(ierr); > ierr = VecSetFromOptions(_y); CHKERRQ(ierr); > ierr = PetscObjectSetName((PetscObject) _y, "y"); CHKERRQ(ierr); > > // duplicate _y into _z, _q, _r > ierr = VecDuplicate(_y,&_z); CHKERRQ(ierr); > ierr = PetscObjectSetName((PetscObject) _z, "z"); CHKERRQ(ierr); > ierr = VecDuplicate(_y,&_q); CHKERRQ(ierr); > ierr = PetscObjectSetName((PetscObject) _q, "q"); CHKERRQ(ierr); > ierr = VecDuplicate(_y,&_r); CHKERRQ(ierr); > ierr = PetscObjectSetName((PetscObject) _r, "r"); CHKERRQ(ierr); > > // construct coordinate transform > PetscInt Ii,Istart,Iend,Jj = 0; > PetscScalar *y,*z,*q,*r; > ierr = VecGetOwnershipRange(_q,&Istart,&Iend);CHKERRQ(ierr); > > // return pointers to local data arrays (the processor's portion of vector data) > ierr = VecGetArray(_y,&y); CHKERRQ(ierr); > ierr = VecGetArray(_z,&z); CHKERRQ(ierr); > ierr = VecGetArray(_q,&q); CHKERRQ(ierr); > ierr = VecGetArray(_r,&r); CHKERRQ(ierr); > > // set vector entries for q, r (coordinate transform) and y, z (no transform) > for (Ii=Istart; Ii q[Jj] = _dq*(Ii/_Nz); > r[Jj] = _dr*(Ii-_Nz*(Ii/_Nz)); > > // matrix-based, fully compatible, allows curvilinear coordinate transformation > if (_sbpType.compare("mfc_coordTrans") ) { > y[Jj] = (_dq*_Ly)*(Ii/_Nz); > z[Jj] = (_dr*_Lz)*(Ii-_Nz*(Ii/_Nz)); > } > else { > // hardcoded transformation (not available for z) > if (_bCoordTrans > 0) { > y[Jj] = _Ly * sinh(_bCoordTrans * q[Jj]) / sinh(_bCoordTrans); > } > // no transformation > y[Jj] = q[Jj]*_Ly; > z[Jj] = r[Jj]*_Lz; > } > Jj++; > } > > // restore arrays > ierr = VecRestoreArray(_y,&y); CHKERRQ(ierr); > ierr = VecRestoreArray(_z,&z); CHKERRQ(ierr); > ierr = VecRestoreArray(_q,&q); CHKERRQ(ierr); > ierr = VecRestoreArray(_r,&r); CHKERRQ(ierr); > > return ierr; > } > > > // scatters values from one vector to another PetscErrorCode > Domain::setScatters() { > PetscErrorCode ierr = 0; > > ierr = VecCreate(PETSC_COMM_WORLD,&_y0); CHKERRQ(ierr); > ierr = VecSetSizes(_y0,PETSC_DECIDE,_Nz); CHKERRQ(ierr); > ierr = VecSetFromOptions(_y0); CHKERRQ(ierr); > ierr = VecSet(_y0,0.0); CHKERRQ(ierr); > > ierr = VecCreate(PETSC_COMM_WORLD,&_z0); CHKERRQ(ierr); > ierr = VecSetSizes(_z0,PETSC_DECIDE,_Ny); CHKERRQ(ierr); > ierr = VecSetFromOptions(_z0); CHKERRQ(ierr); > ierr = VecSet(_z0,0.0); CHKERRQ(ierr); > > PetscInt *indices; > IS is; > ierr = PetscMalloc1(_Nz,&indices); CHKERRQ(ierr); > > // we want to scatter from index 0 to _Nz - 1, i.e. 
take the first _Nz components of the vector to scatter from > for (PetscInt Ii = 0; Ii<_Nz; Ii++) { > indices[Ii] = Ii; > } > > ierr = ISCreateGeneral(PETSC_COMM_WORLD, _Nz, indices, PETSC_COPY_VALUES, &is); CHKERRQ(ierr); > ierr = VecScatterCreate(_y, is, _y0, is, &_scatters["body2L"]); > CHKERRQ(ierr); > > // free memory > ierr = PetscFree(indices); CHKERRQ(ierr); > ierr = ISDestroy(&is); CHKERRQ(ierr); > > //=============================================================================== > // set up scatter context to take values for y = Ly from body field and put them on a Vec of size Nz > PetscInt *fi; > IS isf; > ierr = PetscMalloc1(_Nz,&fi); CHKERRQ(ierr); > > for (PetscInt Ii = 0; Ii<_Nz; Ii++) { > fi[Ii] = Ii + (_Ny*_Nz-_Nz); > } > ierr = ISCreateGeneral(PETSC_COMM_WORLD, _Nz, fi, PETSC_COPY_VALUES, > &isf); CHKERRQ(ierr); > > PetscInt *ti; > IS ist; > ierr = PetscMalloc1(_Nz,&ti); CHKERRQ(ierr); > for (PetscInt Ii = 0; Ii<_Nz; Ii++) { > ti[Ii] = Ii; > } > ierr = ISCreateGeneral(PETSC_COMM_WORLD, _Nz, ti, PETSC_COPY_VALUES, &ist); CHKERRQ(ierr); > ierr = VecScatterCreate(_y, isf, _y0, ist, &_scatters["body2R"]); > CHKERRQ(ierr); > > // free memory > ierr = PetscFree(fi); CHKERRQ(ierr); > ierr = PetscFree(ti); CHKERRQ(ierr); > ierr = ISDestroy(&isf); CHKERRQ(ierr); > ierr = ISDestroy(&ist); CHKERRQ(ierr); > > > //============================================================================== > IS isf2; > ierr = ISCreateStride(PETSC_COMM_WORLD, _Ny, 0, _Nz, &isf2); > CHKERRQ(ierr); > > PetscInt *ti2; > IS ist2; > ierr = PetscMalloc1(_Ny,&ti2); CHKERRQ(ierr); > > for (PetscInt Ii=0; Ii<_Ny; Ii++) { > ti2[Ii] = Ii; > } > ierr = ISCreateGeneral(PETSC_COMM_WORLD, _Ny, ti2, PETSC_COPY_VALUES, &ist2); CHKERRQ(ierr); > ierr = VecScatterCreate(_y, isf2, _z0, ist2, &_scatters["body2T"]); > CHKERRQ(ierr); > > // free memory > ierr = PetscFree(ti2); CHKERRQ(ierr); > ierr = ISDestroy(&isf2); CHKERRQ(ierr); > ierr = ISDestroy(&ist2); CHKERRQ(ierr); > > > //============================================================================== > IS isf3; > ierr = ISCreateStride(PETSC_COMM_WORLD, _Ny, _Nz - 1, _Nz, &isf3); > CHKERRQ(ierr); > > PetscInt *ti3; > IS ist3; > ierr = PetscMalloc1(_Ny,&ti3); CHKERRQ(ierr); > for (PetscInt Ii = 0; Ii<_Ny; Ii++) { > ti3[Ii] = Ii; > } > ierr = ISCreateGeneral(PETSC_COMM_WORLD, _Ny, ti3, PETSC_COPY_VALUES, &ist3); CHKERRQ(ierr); > ierr = VecScatterCreate(_y, isf3, _z0, ist3, &_scatters["body2B"]); > CHKERRQ(ierr); > > // free memory > ierr = PetscFree(ti3); CHKERRQ(ierr); > ierr = ISDestroy(&isf3); CHKERRQ(ierr); > ierr = ISDestroy(&ist3); CHKERRQ(ierr); > > return ierr; > } > #include "domain.hpp" > > using namespace std; > > // creates a domain object > int main(int argc, char **argv) { > > PetscErrorCode ierr = 0; > PetscInitialize(&argc, &argv, NULL, NULL); > > Domain d; > > ierr = d.setFields(); CHKERRQ(ierr); > ierr = d.setScatters(); CHKERRQ(ierr); > > PetscFinalize(); > return ierr; > } From jed at jedbrown.org Sun Mar 3 16:33:04 2019 From: jed at jedbrown.org (Jed Brown) Date: Sun, 03 Mar 2019 15:33:04 -0700 Subject: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed In-Reply-To: References: <878sxvy52e.fsf@jedbrown.org> Message-ID: <875zszy4f3.fsf@jedbrown.org> The compiler doesn't know anything special about PetscFinalize. Destructors are called after all executable statements in their scope. So you need the extra scoping if the destructor should be called earlier. 
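For concreteness, the test driver after applying the diff shown earlier in this thread looks roughly like this (a minimal sketch; the extra braces force ~Domain(), and with it the VecDestroy()/VecScatterDestroy() calls, to run before PetscFinalize()):

#include "domain.hpp"

int main(int argc, char **argv) {
  PetscErrorCode ierr = 0;
  PetscInitialize(&argc, &argv, NULL, NULL);
  {                              // extra scope
    Domain d;
    ierr = d.setFields();   CHKERRQ(ierr);
    ierr = d.setScatters(); CHKERRQ(ierr);
  }                              // ~Domain() runs here, before PetscFinalize()
  PetscFinalize();
  return ierr;
}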
Yuyun Yang writes: > Oh interesting, so I need to add those extra brackets around my class object and function calls. I thought the destructor is automatically at Finalize. > > Thanks! > Yuyun > > -----Original Message----- > From: Jed Brown > Sent: Sunday, March 3, 2019 2:19 PM > To: Yuyun Yang ; Matthew Knepley > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] AddressSanitizer: attempting free on address which was not malloc()-ed > > If you run this with MPICH, it prints > > Attempting to use an MPI routine after finalizing MPICH > > You need to ensure that the C++ class destructor is called before PetscFinalize. For example, like this: > > diff --git i/test_domain.cpp w/test_domain.cpp index 0cfe22f..23545f2 100644 > --- i/test_domain.cpp > +++ w/test_domain.cpp > @@ -8,11 +8,12 @@ int main(int argc, char **argv) { > PetscErrorCode ierr = 0; > PetscInitialize(&argc, &argv, NULL, NULL); > > - Domain d; > + { > + Domain d; > > - ierr = d.setFields(); CHKERRQ(ierr); > - ierr = d.setScatters(); CHKERRQ(ierr); > - > + ierr = d.setFields(); CHKERRQ(ierr); > + ierr = d.setScatters(); CHKERRQ(ierr); } > PetscFinalize(); > return ierr; > } > > > Yuyun Yang via petsc-users writes: > >> Yes, please see the attached files for a minimal example. Thanks a lot! >> >> Best regards, >> Yuyun >> >> From: Matthew Knepley >> Sent: Sunday, March 3, 2019 12:46 PM >> To: Yuyun Yang >> Cc: Zhang, Junchao ; petsc-users at mcs.anl.gov >> Subject: Re: [petsc-users] AddressSanitizer: attempting free on >> address which was not malloc()-ed >> >> On Sun, Mar 3, 2019 at 3:05 PM Yuyun Yang > wrote: >> Actually, I tried just creating a domain object (since the address sanitizer was complaining about that code to start with). Simply creating that object gave me a core dump, so I suppose the issue must be there. I got the following message when running the code with -objects_dump flag on the command line, but couldn?t find a problem with the code (I?ve attached it here with only the relevant functions). >> >> I think what we are going to need from you is a minimal example that >> has the error. I am guessing you have a logic bug in the C++, which we cannot debug by looking. >> >> Thanks, >> >> Matt >> >> Thanks a lot for your help! 
>> >> The following objects were never freed >> ----------------------------------------- >> [0] Vec seq y >> [0] VecCreate() in >> /home/yyy910805/petsc/src/vec/vec/interface/veccreate.c >> [0] Vec seq Vec_0x84000000_0 >> [0] VecCreate() in >> /home/yyy910805/petsc/src/vec/vec/interface/veccreate.c >> [0] Vec seq Vec_0x84000000_1 >> [0] VecCreate() in >> /home/yyy910805/petsc/src/vec/vec/interface/veccreate.c >> [0] VecScatter seq VecScatter_0x84000000_2 >> [0] VecScatterCreate() in >> /home/yyy910805/petsc/src/vec/vscat/interface/vscreate.c >> [0] VecScatter seq VecScatter_0x84000000_3 >> [0] VecScatterCreate() in >> /home/yyy910805/petsc/src/vec/vscat/interface/vscreate.c >> [0] VecScatter seq VecScatter_0x84000000_4 >> [0] VecScatterCreate() in >> /home/yyy910805/petsc/src/vec/vscat/interface/vscreate.c >> [0] VecScatter seq VecScatter_0x84000000_5 >> [0] VecScatterCreate() in >> /home/yyy910805/petsc/src/vec/vscat/interface/vscreate.c >> Attempting to use an MPI routine after finalizing MPICH >> >> ---------------------------------------------------------------------- >> ---------------------------------------------------------------------- >> ----- >> From: Matthew Knepley > >> Sent: Sunday, March 3, 2019 11:28 AM >> To: Yuyun Yang > >> Cc: Zhang, Junchao >; >> petsc-users at mcs.anl.gov >> Subject: Re: [petsc-users] AddressSanitizer: attempting free on >> address which was not malloc()-ed >> >> On Sun, Mar 3, 2019 at 1:03 PM Yuyun Yang via petsc-users > wrote: >> I tried compiling without the sanitizer and running on valgrind. Got a bunch of errors ?Uninitialised value was created by a stack allocation at 0x41B280: ComputeVel_qd::computeVel(double*, double, int&, int)?. >> >> There is no memory management code here, so other parts of the code must be relevant. >> >> Thanks, >> >> Matt >> >> >> HEAP SUMMARY: >> ==74== in use at exit: 96,637 bytes in 91 blocks >> ==74== total heap usage: 47,774 allocs, 47,522 frees, 308,253,653 bytes allocated >> LEAK SUMMARY: >> ==74== definitely lost: 0 bytes in 0 blocks >> ==74== indirectly lost: 0 bytes in 0 blocks >> ==74== possibly lost: 0 bytes in 0 blocks >> ==74== still reachable: 96,637 bytes in 91 blocks >> ==74== suppressed: 0 bytes in 0 blocks >> >> The error is located in the attached code (I?ve extracted only the relevant functions), but I couldn?t figure out what is wrong. Is this causing the memory corruption/double free error that happens when I execute the code? >> >> Thanks a lot for your help. >> >> Best regards, >> Yuyun >> >> From: Zhang, Junchao > >> Sent: Friday, March 1, 2019 7:36 AM >> To: Yuyun Yang > >> Subject: Re: [petsc-users] AddressSanitizer: attempting free on >> address which was not malloc()-ed >> >> >> On Fri, Mar 1, 2019 at 1:02 AM Yuyun Yang > wrote: >> Actually, I also saw a line at the beginning of valgrind saying "shadow memory range interleaves with an existing memory mapping. ASan cannot proceed properly. ABORTING." I guess the code didn't really run through valgrind since it aborted. Should I remove the address sanitizer flag when compiling? >> From the message, it seems ASan (not valgrind) aborted. You can try to compile without sanitizer and then run with valgrind. If no problem, then it is probably a sanitizer issue. 
>> >> >> Get Outlook for iOS >> ________________________________ >> From: Yuyun Yang >> Sent: Thursday, February 28, 2019 10:54:57 PM >> To: Zhang, Junchao >> Subject: Re: [petsc-users] AddressSanitizer: attempting free on >> address which was not malloc()-ed >> >> Hmm, still getting the same error from address sanitizer even though valgrind shows no errors and no leaks are possible. >> >> Should I ignore that error? My results did run alright. >> >> Best, >> Yuyun >> >> Get Outlook for iOS >> ________________________________ >> From: Zhang, Junchao > >> Sent: Wednesday, February 27, 2019 8:27:17 PM >> To: Yuyun Yang >> Cc: Matthew Knepley; >> petsc-users at mcs.anl.gov >> Subject: Re: [petsc-users] AddressSanitizer: attempting free on >> address which was not malloc()-ed >> >> Try the following to see if you can catch the bug easily: 1) Get error >> code for each petsc function and check it with CHKERRQ; 2) Link your >> code with a petsc library with debugging enabled (configured with >> --with-debugging=1); 3) Run your code with valgrind >> >> --Junchao Zhang >> >> >> On Wed, Feb 27, 2019 at 9:04 PM Yuyun Yang > wrote: >> Hi Junchao, >> >> This code actually involves a lot of classes and is pretty big. Might be an overkill for me to send everything to you. I'd like to know if I see this sort of error message, which points to this domain file, is it possible that the problem happens in another file (whose operations are linked to this one)? If so, I'll debug a little more and maybe send you more useful information later. >> >> Best regards, >> Yuyun >> >> Get Outlook for iOS >> ________________________________ >> From: Zhang, Junchao > >> Sent: Wednesday, February 27, 2019 6:24:13 PM >> To: Yuyun Yang >> Cc: Matthew Knepley; >> petsc-users at mcs.anl.gov >> Subject: Re: [petsc-users] AddressSanitizer: attempting free on >> address which was not malloc()-ed >> >> Could you provide a compilable and runnable test so I can try it? >> --Junchao Zhang >> >> >> On Wed, Feb 27, 2019 at 7:34 PM Yuyun Yang > wrote: >> Thanks, I fixed that, but I?m not actually calling the testScatters() function in my implementation (in the constructor, the only functions I called are setFields and setScatters). So the problem couldn?t have been that? >> >> Best, >> Yuyun >> >> From: Zhang, Junchao > >> Sent: Wednesday, February 27, 2019 10:50 AM >> To: Yuyun Yang > >> Cc: Matthew Knepley >; >> petsc-users at mcs.anl.gov >> Subject: Re: [petsc-users] AddressSanitizer: attempting free on >> address which was not malloc()-ed >> >> >> On Wed, Feb 27, 2019 at 10:41 AM Yuyun Yang via petsc-users > wrote: >> I called VecDestroy() in the destructor for this object ? is that not the right way to do it? >> In Domain::testScatters(), you have many VecDuplicate(,&out), You need >> to VecDestroy(&out) before doing new VecDuplicate(,&out); How do I implement CHECK ALL RETURN CODES? >> For each PETSc function, do ierr = ...; CHKERRQ(ierr); >> >> From: Matthew Knepley > >> Sent: Wednesday, February 27, 2019 7:24 AM >> To: Yuyun Yang > >> Cc: petsc-users at mcs.anl.gov >> Subject: Re: [petsc-users] AddressSanitizer: attempting free on >> address which was not malloc()-ed >> >> You call VecDuplicate() a bunch, but VecDestroy() only once in the bottom function. This is wrong. >> Also, CHECK ALL RETURN CODES. This is the fastest way to find errors. >> >> Matt >> >> On Wed, Feb 27, 2019 at 2:06 AM Yuyun Yang via petsc-users > wrote: >> Hello team, >> >> I ran into the address sanitizer error that I hope you could help me with. 
I don?t really know what?s wrong with the way the code frees memory. The relevant code file is attached. The line number following domain.cpp specifically referenced to the vector _q, which seems a little odd, since some other vectors are constructed and freed the same way. >> >> ==1719==ERROR: AddressSanitizer: attempting free on address which was not malloc()-ed: 0x61f0000076c0 in thread T0 >> #0 0x7fbf195282ca in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x982ca) >> #1 0x7fbf1706f895 in PetscFreeAlign /home/yyy910805/petsc/src/sys/memory/mal.c:87 >> #2 0x7fbf1731a898 in VecDestroy_Seq /home/yyy910805/petsc/src/vec/vec/impls/seq/bvec2.c:788 >> #3 0x7fbf1735f795 in VecDestroy /home/yyy910805/petsc/src/vec/vec/interface/vector.c:408 >> #4 0x40dd0a in Domain::~Domain() /home/yyy910805/scycle/source/domain.cpp:132 >> #5 0x40b479 in main /home/yyy910805/scycle/source/main.cpp:242 >> #6 0x7fbf14d2082f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) >> #7 0x4075d8 in _start >> (/home/yyy910805/scycle/source/main+0x4075d8) >> >> 0x61f0000076c0 is located 1600 bytes inside of 3220-byte region >> [0x61f000007080,0x61f000007d14) allocated by thread T0 here: >> #0 0x7fbf19528b32 in __interceptor_memalign (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98b32) >> #1 0x7fbf1706f7e0 in PetscMallocAlign /home/yyy910805/petsc/src/sys/memory/mal.c:41 >> #2 0x7fbf17073022 in PetscTrMallocDefault /home/yyy910805/petsc/src/sys/memory/mtr.c:183 >> #3 0x7fbf170710a1 in PetscMallocA /home/yyy910805/petsc/src/sys/memory/mal.c:397 >> #4 0x7fbf17326fb0 in VecCreate_Seq /home/yyy910805/petsc/src/vec/vec/impls/seq/bvec3.c:35 >> #5 0x7fbf1736f560 in VecSetType /home/yyy910805/petsc/src/vec/vec/interface/vecreg.c:51 >> #6 0x7fbf1731afae in VecDuplicate_Seq /home/yyy910805/petsc/src/vec/vec/impls/seq/bvec2.c:807 >> #7 0x7fbf1735eff7 in VecDuplicate /home/yyy910805/petsc/src/vec/vec/interface/vector.c:379 >> #8 0x4130de in Domain::setFields() /home/yyy910805/scycle/source/domain.cpp:431 >> #9 0x40c60a in Domain::Domain(char const*) /home/yyy910805/scycle/source/domain.cpp:57 >> #10 0x40b433 in main /home/yyy910805/scycle/source/main.cpp:242 >> #11 0x7fbf14d2082f in __libc_start_main >> (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) >> >> SUMMARY: AddressSanitizer: bad-free ??:0 __interceptor_free >> ==1719==ABORTING >> >> Thanks very much! >> Yuyun >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/> ley/> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/> ley/> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/> ley/> >> #include "domain.hpp" >> >> using namespace std; >> >> Domain::Domain() >> : _sbpType("mfc_coordTrans"),_Ny(201),_Nz(1),_Ly(30),_Lz(30), >> _q(NULL),_r(NULL),_y(NULL),_z(NULL),_y0(NULL),_z0(NULL),_dq(-1),_dr(-1), >> _bCoordTrans(5) >> { >> if (_Ny > 1) { >> _dq = 1.0 / (_Ny - 1.0); >> } >> else { >> _dq = 1; >> } >> >> if (_Nz > 1) { >> _dr = 1.0 / (_Nz - 1.0); >> } >> else { >> _dr = 1; >> } >> } >> >> // destructor >> Domain::~Domain() >> { >> // free memory >> VecDestroy(&_q); >> VecDestroy(&_r); >> VecDestroy(&_y); >> VecDestroy(&_z); >> VecDestroy(&_y0); >> VecDestroy(&_z0); >> >> // set map iterator, free memory from VecScatter >> map::iterator it; >> for (it = _scatters.begin(); it != _scatters.end(); it++) { >> VecScatterDestroy(&(it->second)); >> } >> } >> >> // construct coordinate transform, setting vectors q, r, y, z >> PetscErrorCode Domain::setFields() { >> PetscErrorCode ierr = 0; >> >> // generate vector _y with size _Ny*_Nz >> ierr = VecCreate(PETSC_COMM_WORLD,&_y); CHKERRQ(ierr); >> ierr = VecSetSizes(_y,PETSC_DECIDE,_Ny*_Nz); CHKERRQ(ierr); >> ierr = VecSetFromOptions(_y); CHKERRQ(ierr); >> ierr = PetscObjectSetName((PetscObject) _y, "y"); CHKERRQ(ierr); >> >> // duplicate _y into _z, _q, _r >> ierr = VecDuplicate(_y,&_z); CHKERRQ(ierr); >> ierr = PetscObjectSetName((PetscObject) _z, "z"); CHKERRQ(ierr); >> ierr = VecDuplicate(_y,&_q); CHKERRQ(ierr); >> ierr = PetscObjectSetName((PetscObject) _q, "q"); CHKERRQ(ierr); >> ierr = VecDuplicate(_y,&_r); CHKERRQ(ierr); >> ierr = PetscObjectSetName((PetscObject) _r, "r"); CHKERRQ(ierr); >> >> // construct coordinate transform >> PetscInt Ii,Istart,Iend,Jj = 0; >> PetscScalar *y,*z,*q,*r; >> ierr = VecGetOwnershipRange(_q,&Istart,&Iend);CHKERRQ(ierr); >> >> // return pointers to local data arrays (the processor's portion of vector data) >> ierr = VecGetArray(_y,&y); CHKERRQ(ierr); >> ierr = VecGetArray(_z,&z); CHKERRQ(ierr); >> ierr = VecGetArray(_q,&q); CHKERRQ(ierr); >> ierr = VecGetArray(_r,&r); CHKERRQ(ierr); >> >> // set vector entries for q, r (coordinate transform) and y, z (no transform) >> for (Ii=Istart; Ii> q[Jj] = _dq*(Ii/_Nz); >> r[Jj] = _dr*(Ii-_Nz*(Ii/_Nz)); >> >> // matrix-based, fully compatible, allows curvilinear coordinate transformation >> if (_sbpType.compare("mfc_coordTrans") ) { >> y[Jj] = (_dq*_Ly)*(Ii/_Nz); >> z[Jj] = (_dr*_Lz)*(Ii-_Nz*(Ii/_Nz)); >> } >> else { >> // hardcoded transformation (not available for z) >> if (_bCoordTrans > 0) { >> y[Jj] = _Ly * sinh(_bCoordTrans * q[Jj]) / sinh(_bCoordTrans); >> } >> // no transformation >> y[Jj] = q[Jj]*_Ly; >> z[Jj] = r[Jj]*_Lz; >> } >> Jj++; >> } >> >> // restore arrays >> ierr = VecRestoreArray(_y,&y); CHKERRQ(ierr); >> ierr = VecRestoreArray(_z,&z); CHKERRQ(ierr); >> ierr = VecRestoreArray(_q,&q); CHKERRQ(ierr); >> ierr = VecRestoreArray(_r,&r); CHKERRQ(ierr); >> >> return ierr; >> } >> >> >> // scatters values from one vector to another PetscErrorCode >> Domain::setScatters() { >> PetscErrorCode ierr = 0; >> >> ierr = VecCreate(PETSC_COMM_WORLD,&_y0); CHKERRQ(ierr); >> ierr = VecSetSizes(_y0,PETSC_DECIDE,_Nz); CHKERRQ(ierr); >> ierr = VecSetFromOptions(_y0); CHKERRQ(ierr); >> ierr = VecSet(_y0,0.0); CHKERRQ(ierr); >> >> ierr = VecCreate(PETSC_COMM_WORLD,&_z0); CHKERRQ(ierr); >> ierr = VecSetSizes(_z0,PETSC_DECIDE,_Ny); CHKERRQ(ierr); >> ierr = VecSetFromOptions(_z0); CHKERRQ(ierr); >> ierr = VecSet(_z0,0.0); CHKERRQ(ierr); >> >> PetscInt *indices; >> IS is; >> ierr = 
PetscMalloc1(_Nz,&indices); CHKERRQ(ierr); >> >> // we want to scatter from index 0 to _Nz - 1, i.e. take the first _Nz components of the vector to scatter from >> for (PetscInt Ii = 0; Ii<_Nz; Ii++) { >> indices[Ii] = Ii; >> } >> >> ierr = ISCreateGeneral(PETSC_COMM_WORLD, _Nz, indices, PETSC_COPY_VALUES, &is); CHKERRQ(ierr); >> ierr = VecScatterCreate(_y, is, _y0, is, &_scatters["body2L"]); >> CHKERRQ(ierr); >> >> // free memory >> ierr = PetscFree(indices); CHKERRQ(ierr); >> ierr = ISDestroy(&is); CHKERRQ(ierr); >> >> //=============================================================================== >> // set up scatter context to take values for y = Ly from body field and put them on a Vec of size Nz >> PetscInt *fi; >> IS isf; >> ierr = PetscMalloc1(_Nz,&fi); CHKERRQ(ierr); >> >> for (PetscInt Ii = 0; Ii<_Nz; Ii++) { >> fi[Ii] = Ii + (_Ny*_Nz-_Nz); >> } >> ierr = ISCreateGeneral(PETSC_COMM_WORLD, _Nz, fi, PETSC_COPY_VALUES, >> &isf); CHKERRQ(ierr); >> >> PetscInt *ti; >> IS ist; >> ierr = PetscMalloc1(_Nz,&ti); CHKERRQ(ierr); >> for (PetscInt Ii = 0; Ii<_Nz; Ii++) { >> ti[Ii] = Ii; >> } >> ierr = ISCreateGeneral(PETSC_COMM_WORLD, _Nz, ti, PETSC_COPY_VALUES, &ist); CHKERRQ(ierr); >> ierr = VecScatterCreate(_y, isf, _y0, ist, &_scatters["body2R"]); >> CHKERRQ(ierr); >> >> // free memory >> ierr = PetscFree(fi); CHKERRQ(ierr); >> ierr = PetscFree(ti); CHKERRQ(ierr); >> ierr = ISDestroy(&isf); CHKERRQ(ierr); >> ierr = ISDestroy(&ist); CHKERRQ(ierr); >> >> >> //============================================================================== >> IS isf2; >> ierr = ISCreateStride(PETSC_COMM_WORLD, _Ny, 0, _Nz, &isf2); >> CHKERRQ(ierr); >> >> PetscInt *ti2; >> IS ist2; >> ierr = PetscMalloc1(_Ny,&ti2); CHKERRQ(ierr); >> >> for (PetscInt Ii=0; Ii<_Ny; Ii++) { >> ti2[Ii] = Ii; >> } >> ierr = ISCreateGeneral(PETSC_COMM_WORLD, _Ny, ti2, PETSC_COPY_VALUES, &ist2); CHKERRQ(ierr); >> ierr = VecScatterCreate(_y, isf2, _z0, ist2, &_scatters["body2T"]); >> CHKERRQ(ierr); >> >> // free memory >> ierr = PetscFree(ti2); CHKERRQ(ierr); >> ierr = ISDestroy(&isf2); CHKERRQ(ierr); >> ierr = ISDestroy(&ist2); CHKERRQ(ierr); >> >> >> //============================================================================== >> IS isf3; >> ierr = ISCreateStride(PETSC_COMM_WORLD, _Ny, _Nz - 1, _Nz, &isf3); >> CHKERRQ(ierr); >> >> PetscInt *ti3; >> IS ist3; >> ierr = PetscMalloc1(_Ny,&ti3); CHKERRQ(ierr); >> for (PetscInt Ii = 0; Ii<_Ny; Ii++) { >> ti3[Ii] = Ii; >> } >> ierr = ISCreateGeneral(PETSC_COMM_WORLD, _Ny, ti3, PETSC_COPY_VALUES, &ist3); CHKERRQ(ierr); >> ierr = VecScatterCreate(_y, isf3, _z0, ist3, &_scatters["body2B"]); >> CHKERRQ(ierr); >> >> // free memory >> ierr = PetscFree(ti3); CHKERRQ(ierr); >> ierr = ISDestroy(&isf3); CHKERRQ(ierr); >> ierr = ISDestroy(&ist3); CHKERRQ(ierr); >> >> return ierr; >> } >> #include "domain.hpp" >> >> using namespace std; >> >> // creates a domain object >> int main(int argc, char **argv) { >> >> PetscErrorCode ierr = 0; >> PetscInitialize(&argc, &argv, NULL, NULL); >> >> Domain d; >> >> ierr = d.setFields(); CHKERRQ(ierr); >> ierr = d.setScatters(); CHKERRQ(ierr); >> >> PetscFinalize(); >> return ierr; >> } From bsmith at mcs.anl.gov Sun Mar 3 17:51:41 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) 
Date: Sun, 3 Mar 2019 23:51:41 +0000 Subject: [petsc-users] Error Norm_2 In-Reply-To: References: <11F74DFF-34F4-4179-A3EB-6306730199CE@anl.gov> Message-ID: Note that your sum is over more and more values (as n gets larger) so unless the individual values are decreasing very rapidly to zero the sum will increase. You need the L2 weighted norm. If your problem is in one dimension, which it appears to be, you need to divide the VecNorm() result by the square root of n Barry Read up on some book on finite difference methods where they discuss the convergence rates and note how they define the weighted norm. > On Feb 25, 2019, at 8:37 PM, Fazlul Huq wrote: > > Hello PETSc Developers, > > Thanks for the response! > > To calculate error, I first calculate the analytical solution and put it inside vector s. > Then I took difference between analytical solution and numerical solution and put it inside vector x. > Then I calculate the NORM_2 of x. > The code is as follows: > /* > Check the error > */ > for (i = 0; i < n; i++) > { > k1 = (float) (i+1)/(n+1); > k2 = -0.5 * k1 * k1 + 5.5 * k1 + 10; > ierr = VecSetValues(s, 1, &i, &k2, INSERT_VALUES);CHKERRQ(ierr); > } > ierr = VecAXPY(x,-1.0,s);CHKERRQ(ierr); > ierr = VecNorm(x,NORM_2,&norm);CHKERRQ(ierr); > if (norm > tol) { > ierr = PetscPrintf(PETSC_COMM_WORLD,"Second Norm of error %g\n",(double)norm);CHKERRQ(ierr); > } > > Thanks again. > Sincerely, > Huq > > On Mon, Feb 25, 2019 at 5:15 PM Smith, Barry F. wrote: > > How are you computing the error norm? > > You need to use the L2 norm in the computations, not the l2 norm. > > Also you need to make sure the convergence criteria you use for the algebraic system is smaller than the descritzation error > > > Barry > > > > On Feb 25, 2019, at 1:55 PM, Fazlul Huq via petsc-users wrote: > > > > Hello PETSc Developers, > > > > I have solved a very simple poisson problem with different matrix sizes (10 to 10^7). > > But when I have compared error norm_2 for the solution, I got the attached curve. > > It looks like error norm_2 increases with increasing matrix size. Shouldn't it decrease rather with increasing matrix size? > > > > Thanks. > > > > Sincerely, > > Huq > > -- > > > > Fazlul Huq > > Graduate Research Assistant > > Department of Nuclear, Plasma & Radiological Engineering (NPRE) > > University of Illinois at Urbana-Champaign (UIUC) > > E-mail: huq2090 at gmail.com > > > > > > -- > > Fazlul Huq > Graduate Research Assistant > Department of Nuclear, Plasma & Radiological Engineering (NPRE) > University of Illinois at Urbana-Champaign (UIUC) > E-mail: huq2090 at gmail.com From k_burkart at yahoo.com Mon Mar 4 07:03:13 2019 From: k_burkart at yahoo.com (Klaus Burkart) Date: Mon, 4 Mar 2019 13:03:13 +0000 (UTC) Subject: [petsc-users] Loading only upper + MatSetOption(A,MAT_SYMMETRIC,PETSC_TRUE); References: <232628624.8168576.1551704593241.ref@mail.yahoo.com> Message-ID: <232628624.8168576.1551704593241@mail.yahoo.com> Hello, I want to solve many symmetric linear systems one after another in parallel using boomerAMG + KSPCG? and need to make the matrix transfer more efficient. Matrices are symmetric in structure and values. boomerAMG + KSPCG work fine. So far I have been loading the entire matrices but I read in a paper, that it's sufficient to load the upper part only and tell petsc that the matrix is symmetric using MatSetOption(A,MAT_SYMMETRIC,PETSC_TRUE); Unfortunately all computations fail if I load only the upper values and use MatSetOption(A,MAT_SYMMETRIC,PETSC_TRUE); The idea is: ??? 
if (matrix_.symmetric()) ??? { ??????? MatSetOption(A,MAT_SYMMETRIC,PETSC_TRUE); ?????? //load only upper part of the matrix MatSetValues(...) ?? }else //asymmetric matrix ?????????? {?????????????? //load the entire matrix MatSetValues(...) ?????????? } Is it possible at all? Klaus -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Mar 4 07:45:03 2019 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 4 Mar 2019 08:45:03 -0500 Subject: [petsc-users] Loading only upper + MatSetOption(A,MAT_SYMMETRIC,PETSC_TRUE); In-Reply-To: <232628624.8168576.1551704593241@mail.yahoo.com> References: <232628624.8168576.1551704593241.ref@mail.yahoo.com> <232628624.8168576.1551704593241@mail.yahoo.com> Message-ID: On Mon, Mar 4, 2019 at 8:03 AM Klaus Burkart via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello, > > I want to solve many symmetric linear systems one after another in > parallel using boomerAMG + KSPCG and need to make the matrix transfer more > efficient. Matrices are symmetric in structure and values. boomerAMG + > KSPCG work fine. > Can you be more specific about where you want to save time? The best thing would be to send the output of -log_view and show us. I ask because loading a matrix from memory should be much faster than BAMG. Loading it from disk could possibly be slower, but I do not think that BAMG handles a symmetric format, so you would load a symmetric matrix and convert to AIJ. Thanks, Matt > > So far I have been loading the entire matrices but I read in a paper, that > it's sufficient to load the upper part only and tell petsc that the matrix > is symmetric using MatSetOption(A,MAT_SYMMETRIC,PETSC_TRUE); Unfortunately > all computations fail if I load only the upper values and use > MatSetOption(A,MAT_SYMMETRIC,PETSC_TRUE); > > The idea is: > > if (matrix_.symmetric()) > { > MatSetOption(A,MAT_SYMMETRIC,PETSC_TRUE); > //load only upper part of the matrix MatSetValues(...) > > }else //asymmetric matrix > { > //load the entire matrix MatSetValues(...) > } > > Is it possible at all? > > Klaus > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From cyrill.von.planta at usi.ch Mon Mar 4 10:28:04 2019 From: cyrill.von.planta at usi.ch (Cyrill Vonplanta) Date: Mon, 4 Mar 2019 16:28:04 +0000 Subject: [petsc-users] Compute the sum of the absolute values of the off-block diagonal entries of each row Message-ID: <8F0CF764-20D3-434E-831F-D1E8C8A4BFF9@usi.ch> Dear Petsc Users, I am trying to implement a variant of the $l^1$-Gauss-Seidel smoother from https://doi.org/10.1137/100798806 (eq. 6.1 and below). One of the main issues is that I need to compute the sum $\sum_j |a_{i_j}|$ of the matrix entries that are not part of the local diagonal block. I was looking for something like MatGetRowSumAbs but it looks like it hasn't been made yet. I guess i have to come up with something myself, but would you know of some workaround for this without going too deep into PETCs? 
Best Cyrill From knepley at gmail.com Mon Mar 4 10:38:01 2019 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 4 Mar 2019 11:38:01 -0500 Subject: [petsc-users] Compute the sum of the absolute values of the off-block diagonal entries of each row In-Reply-To: <8F0CF764-20D3-434E-831F-D1E8C8A4BFF9@usi.ch> References: <8F0CF764-20D3-434E-831F-D1E8C8A4BFF9@usi.ch> Message-ID: On Mon, Mar 4, 2019 at 11:28 AM Cyrill Vonplanta via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear Petsc Users, > > I am trying to implement a variant of the $l^1$-Gauss-Seidel smoother from > https://doi.org/10.1137/100798806 (eq. 6.1 and below). One of the main > issues is that I need to compute the sum $\sum_j |a_{i_j}|$ of the matrix > entries that are not part of the local diagonal block. I was looking for > something like MatGetRowSumAbs but it looks like it hasn't been made yet. > > I guess i have to come up with something myself, but would you know of > some workaround for this without going too deep into PETCs? > MatGetOwnershipRange(A, &rS, &rE); for (r = rS; r < rE; ++r) { sum = 0.0; MatGetRow(A, r, &ncols, &cols, &vals); for (c = 0; c < ncols; ++c) if ((cols[c] < rS) || (cols[c] >= rE)) sum += PetscAbsScalar(vals[c]); } Thanks, Matt > Best Cyrill > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jczhang at mcs.anl.gov Mon Mar 4 11:21:43 2019 From: jczhang at mcs.anl.gov (Zhang, Junchao) Date: Mon, 4 Mar 2019 17:21:43 +0000 Subject: [petsc-users] Compute the sum of the absolute values of the off-block diagonal entries of each row In-Reply-To: References: <8F0CF764-20D3-434E-831F-D1E8C8A4BFF9@usi.ch> Message-ID: On Mon, Mar 4, 2019 at 10:39 AM Matthew Knepley via petsc-users > wrote: On Mon, Mar 4, 2019 at 11:28 AM Cyrill Vonplanta via petsc-users > wrote: Dear Petsc Users, I am trying to implement a variant of the $l^1$-Gauss-Seidel smoother from https://doi.org/10.1137/100798806 (eq. 6.1 and below). One of the main issues is that I need to compute the sum $\sum_j |a_{i_j}|$ of the matrix entries that are not part of the local diagonal block. I was looking for something like MatGetRowSumAbs but it looks like it hasn't been made yet. I guess i have to come up with something myself, but would you know of some workaround for this without going too deep into PETCs? MatGetOwnershipRange(A, &rS, &rE); for (r = rS; r < rE; ++r) { sum = 0.0; MatGetRow(A, r, &ncols, &cols, &vals); for (c = 0; c < ncols; ++c) if ((cols[c] < rS) || (cols[c] >= rE)) sum += PetscAbsScalar(vals[c]); } Perhaps PETSc should have a MatGetRemoteRow (or MatGetRowOffDiagonalBlock) (A, r, &ncols, &cols, &vals). MatGetRow() internally has to allocate memory and sort indices and values from local diagonal block and off-diagonal block. It is totally a waste in this case -- users do not care column indices and the local block. With MatGetRemoteRow(A, r, &ncols, NULL, &vals), PETSc just needs to set an integer and a pointer. Thanks, Matt Best Cyrill -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jed at jedbrown.org Mon Mar 4 16:57:32 2019 From: jed at jedbrown.org (Jed Brown) Date: Mon, 04 Mar 2019 15:57:32 -0700 Subject: [petsc-users] Compute the sum of the absolute values of the off-block diagonal entries of each row In-Reply-To: References: <8F0CF764-20D3-434E-831F-D1E8C8A4BFF9@usi.ch> Message-ID: <871s3mtfhf.fsf@jedbrown.org> "Zhang, Junchao via petsc-users" writes: > Perhaps PETSc should have a MatGetRemoteRow (or > MatGetRowOffDiagonalBlock) (A, r, &ncols, &cols, &vals). MatGetRow() > internally has to allocate memory and sort indices and values from > local diagonal block and off-diagonal block. It is totally a waste in > this case -- users do not care column indices and the local block. > With MatGetRemoteRow(A, r, &ncols, NULL, &vals), PETSc just needs to > set an integer and a pointer. I'm not wild about the resulting programming model, which is so intimately tied to PETSc *AIJ storage conventions yet also likely not efficient for operations like SOR. Perhaps PETSc MatSOR should be taught about a supplemental diagonal, such as produced by the "l^1" scheme? From bsmith at mcs.anl.gov Mon Mar 4 17:10:46 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Mon, 4 Mar 2019 23:10:46 +0000 Subject: [petsc-users] Compute the sum of the absolute values of the off-block diagonal entries of each row In-Reply-To: References: <8F0CF764-20D3-434E-831F-D1E8C8A4BFF9@usi.ch> Message-ID: <240BC6C7-2AE9-4EB4-80BC-B9E6A85D2BD0@anl.gov> How about something like, MatMPIAIJGetSeqAIJ(A,NULL,&Ao,NULL); > MatGetOwnershipRange(A, &rS, &rE); > for (r = 0; r < rE-rS; ++r) { > sum = 0.0; > MatGetRow(Ao, r, &ncols, NULL, &vals); > for (c = 0; c < ncols; ++c) sum += PetscAbsScalar(vals[c]); // do what you need with sum > } Barry > On Mar 4, 2019, at 10:38 AM, Matthew Knepley via petsc-users wrote: > > On Mon, Mar 4, 2019 at 11:28 AM Cyrill Vonplanta via petsc-users wrote: > Dear Petsc Users, > > I am trying to implement a variant of the $l^1$-Gauss-Seidel smoother from https://doi.org/10.1137/100798806 (eq. 6.1 and below). One of the main issues is that I need to compute the sum $\sum_j |a_{i_j}|$ of the matrix entries that are not part of the local diagonal block. I was looking for something like MatGetRowSumAbs but it looks like it hasn't been made yet. > > I guess i have to come up with something myself, but would you know of some workaround for this without going too deep into PETCs? > > MatGetOwnershipRange(A, &rS, &rE); > for (r = rS; r < rE; ++r) { > sum = 0.0; > MatGetRow(A, r, &ncols, &cols, &vals); > for (c = 0; c < ncols; ++c) if ((cols[c] < rS) || (cols[c] >= rE)) sum += PetscAbsScalar(vals[c]); > } > > Thanks, > > Matt > > Best Cyrill > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From cyrill.von.planta at usi.ch Tue Mar 5 03:08:45 2019 From: cyrill.von.planta at usi.ch (Cyrill Vonplanta) Date: Tue, 5 Mar 2019 09:08:45 +0000 Subject: [petsc-users] Compute the sum of the absolute values of the off-block diagonal entries of each row In-Reply-To: <240BC6C7-2AE9-4EB4-80BC-B9E6A85D2BD0@anl.gov> References: <8F0CF764-20D3-434E-831F-D1E8C8A4BFF9@usi.ch> <240BC6C7-2AE9-4EB4-80BC-B9E6A85D2BD0@anl.gov> Message-ID: <60DE852B-A7F4-46EB-BEEC-6CAF6E8A5A46@usi.ch> Yes, this does the trick for me. Thanks. Thx Cyrill > On 5 Mar 2019, at 00:10, Smith, Barry F. 
wrote: > > > How about something like, > > MatMPIAIJGetSeqAIJ(A,NULL,&Ao,NULL); > >> MatGetOwnershipRange(A, &rS, &rE); >> for (r = 0; r < rE-rS; ++r) { >> sum = 0.0; >> MatGetRow(Ao, r, &ncols, NULL, &vals); >> for (c = 0; c < ncols; ++c) sum += PetscAbsScalar(vals[c]); > // do what you need with sum >> } > > > Barry > > > >> On Mar 4, 2019, at 10:38 AM, Matthew Knepley via petsc-users wrote: >> >> On Mon, Mar 4, 2019 at 11:28 AM Cyrill Vonplanta via petsc-users wrote: >> Dear Petsc Users, >> >> I am trying to implement a variant of the $l^1$-Gauss-Seidel smoother from https://doi.org/10.1137/100798806 (eq. 6.1 and below). One of the main issues is that I need to compute the sum $\sum_j |a_{i_j}|$ of the matrix entries that are not part of the local diagonal block. I was looking for something like MatGetRowSumAbs but it looks like it hasn't been made yet. >> >> I guess i have to come up with something myself, but would you know of some workaround for this without going too deep into PETCs? >> >> MatGetOwnershipRange(A, &rS, &rE); >> for (r = rS; r < rE; ++r) { >> sum = 0.0; >> MatGetRow(A, r, &ncols, &cols, &vals); >> for (c = 0; c < ncols; ++c) if ((cols[c] < rS) || (cols[c] >= rE)) sum += PetscAbsScalar(vals[c]); >> } >> >> Thanks, >> >> Matt >> >> Best Cyrill >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ > From knepley at gmail.com Tue Mar 5 07:06:41 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 5 Mar 2019 08:06:41 -0500 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: References: Message-ID: On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette < myriam.peyrounette at idris.fr> wrote: > Hi Matt, > > I plotted the memory scalings using different threshold values. The two > scalings are slightly translated (from -22 to -88 mB) but this gain is > neglectable. The 3.6-scaling keeps being robust while the 3.10-scaling > deteriorates. > > Do you have any other suggestion? > > Mark, what is the option she can give to output all the GAMG data? Also, run using -ksp_view. GAMG will report all the sizes of its grids, so it should be easy to see if the coarse grid sizes are increasing, and also what the effect of the threshold value is. Thanks, Matt > Thanks > Myriam > > Le 03/02/19 ? 02:27, Matthew Knepley a ?crit : > > On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Hi, >> >> I used to run my code with PETSc 3.6. Since I upgraded the PETSc version >> to 3.10, this code has a bad memory scaling. >> >> To report this issue, I took the PETSc script ex42.c and slightly >> modified it so that the KSP and PC configurations are the same as in my >> code. In particular, I use a "personnalised" multi-grid method. The >> modifications are indicated by the keyword "TopBridge" in the attached >> scripts. >> >> To plot the memory (weak) scaling, I ran four calculations for each >> script with increasing problem sizes and computations cores: >> >> 1. 100,000 elts on 4 cores >> 2. 1 million elts on 40 cores >> 3. 10 millions elts on 400 cores >> 4. 100 millions elts on 4,000 cores >> >> The resulting graph is also attached. The scaling using PETSc 3.10 >> clearly deteriorates for large cases, while the one using PETSc 3.6 is >> robust. 
>> >> After a few tests, I found that the scaling is mostly sensitive to the >> use of the AMG method for the coarse grid (line 1780 in >> main_ex42_petsc36.cc). In particular, the performance strongly >> deteriorates when commenting lines 1777 to 1790 (in main_ex42_petsc36.cc). >> >> Do you have any idea of what changed between version 3.6 and version >> 3.10 that may imply such degradation? >> > > I believe the default values for PCGAMG changed between versions. It > sounds like the coarsening rate > is not great enough, so that these grids are too large. This can be set > using: > > > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html > > There is some explanation of this effect on that page. Let us know if > setting this does not correct the situation. > > Thanks, > > Matt > > >> Let me know if you need further information. >> >> Best, >> >> Myriam Peyrounette >> >> >> -- >> Myriam Peyrounette >> CNRS/IDRIS - HLST >> -- >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -- > Myriam Peyrounette > CNRS/IDRIS - HLST > -- > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From myriam.peyrounette at idris.fr Tue Mar 5 06:14:09 2019 From: myriam.peyrounette at idris.fr (Myriam Peyrounette) Date: Tue, 5 Mar 2019 13:14:09 +0100 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: References: Message-ID: Hi Matt, I plotted the memory scalings using different threshold values. The two scalings are slightly translated (from -22 to -88 mB) but this gain is neglectable. The 3.6-scaling keeps being robust while the 3.10-scaling deteriorates. Do you have any other suggestion? Thanks Myriam Le 03/02/19 ? 02:27, Matthew Knepley a ?crit?: > On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via petsc-users > > wrote: > > Hi, > > I used to run my code with PETSc 3.6. Since I upgraded the PETSc > version > to 3.10, this code has a bad memory scaling. > > To report this issue, I took the PETSc script ex42.c and slightly > modified it so that the KSP and PC configurations are the same as > in my > code. In particular, I use a "personnalised" multi-grid method. The > modifications are indicated by the keyword "TopBridge" in the attached > scripts. > > To plot the memory (weak) scaling, I ran four calculations for each > script with increasing problem sizes and computations cores: > > 1. 100,000 elts on 4 cores > 2. 1 million elts on 40 cores > 3. 10 millions elts on 400 cores > 4. 100 millions elts on 4,000 cores > > The resulting graph is also attached. The scaling using PETSc 3.10 > clearly deteriorates for large cases, while the one using PETSc 3.6 is > robust. > > After a few tests, I found that the scaling is mostly sensitive to the > use of the AMG method for the coarse grid (line 1780 in > main_ex42_petsc36.cc). In particular, the performance strongly > deteriorates when commenting lines 1777 to 1790 (in > main_ex42_petsc36.cc). > > Do you have any idea of what changed between version 3.6 and version > 3.10 that may imply such degradation? > > > I believe the default values for PCGAMG changed between versions. 
It > sounds like the coarsening rate > is not great enough, so that these grids are too large. This can be > set using: > > ??https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html > > There is some explanation of this effect on that page. Let us know if > setting this does not correct the situation. > > ? Thanks, > > ? ? ?Matt > ? > > Let me know if you need further information. > > Best, > > Myriam Peyrounette > > > -- > Myriam Peyrounette > CNRS/IDRIS - HLST > -- > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -- Myriam Peyrounette CNRS/IDRIS - HLST -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2975 bytes Desc: Signature cryptographique S/MIME URL: From gang.lu at geo.uib.no Mon Mar 4 04:59:48 2019 From: gang.lu at geo.uib.no (GangLu) Date: Mon, 4 Mar 2019 11:59:48 +0100 Subject: [petsc-users] streams test on hpc Message-ID: Hi all, When installing petsc, there is a stream test that is quite useful. Is it possible to run such test in batch mode, e.g. using pbs script? Thanks. cheers, Gang From jed at jedbrown.org Tue Mar 5 09:21:47 2019 From: jed at jedbrown.org (Jed Brown) Date: Tue, 05 Mar 2019 08:21:47 -0700 Subject: [petsc-users] streams test on hpc In-Reply-To: References: Message-ID: <87woldqrck.fsf@jedbrown.org> Of course, just as you would run any other MPI application. GangLu via petsc-users writes: > Hi all, > > When installing petsc, there is a stream test that is quite useful. > > Is it possible to run such test in batch mode, e.g. using pbs script? > > Thanks. > > cheers, > > Gang From knepley at gmail.com Tue Mar 5 09:26:05 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 5 Mar 2019 10:26:05 -0500 Subject: [petsc-users] streams test on hpc In-Reply-To: <87woldqrck.fsf@jedbrown.org> References: <87woldqrck.fsf@jedbrown.org> Message-ID: There is a make target in that benchmarks directory that just builds the executable. Then you can submit that. Matt On Tue, Mar 5, 2019 at 10:21 AM Jed Brown via petsc-users < petsc-users at mcs.anl.gov> wrote: > Of course, just as you would run any other MPI application. > > GangLu via petsc-users writes: > > > Hi all, > > > > When installing petsc, there is a stream test that is quite useful. > > > > Is it possible to run such test in batch mode, e.g. using pbs script? > > > > Thanks. > > > > cheers, > > > > Gang > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From myriam.peyrounette at idris.fr Tue Mar 5 10:53:47 2019 From: myriam.peyrounette at idris.fr (Myriam Peyrounette) Date: Tue, 5 Mar 2019 17:53:47 +0100 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: References: Message-ID: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> I used PCView to display the size of the linear system in each level of the MG. You'll find the outputs attached to this mail (zip file) for both the default threshold value and a value of 0.1, and for both 3.6 and 3.10 PETSc versions. 
For convenience, I summarized the information in a graph, also attached (png file). As you can see, there are slight differences between the two versions but none is critical, in my opinion. Do you see anything suspicious in the outputs? + I can't find the default threshold value. Do you know where I can find it? Thanks for the follow-up Myriam Le 03/05/19 ? 14:06, Matthew Knepley a ?crit?: > On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette > > wrote: > > Hi Matt, > > I plotted the memory scalings using different threshold values. > The two scalings are slightly translated (from -22 to -88 mB) but > this gain is neglectable. The 3.6-scaling keeps being robust while > the 3.10-scaling deteriorates. > > Do you have any other suggestion? > > Mark, what is the option she can give to output all the GAMG data? > > Also, run using -ksp_view. GAMG will report all the sizes of its > grids, so it should be easy to see > if the coarse grid sizes are increasing, and also what the effect of > the threshold value is. > > ? Thanks, > > ? ? ?Matt? > > Thanks > > Myriam > > Le 03/02/19 ? 02:27, Matthew Knepley a ?crit?: >> On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via >> petsc-users > > wrote: >> >> Hi, >> >> I used to run my code with PETSc 3.6. Since I upgraded the >> PETSc version >> to 3.10, this code has a bad memory scaling. >> >> To report this issue, I took the PETSc script ex42.c and slightly >> modified it so that the KSP and PC configurations are the >> same as in my >> code. In particular, I use a "personnalised" multi-grid >> method. The >> modifications are indicated by the keyword "TopBridge" in the >> attached >> scripts. >> >> To plot the memory (weak) scaling, I ran four calculations >> for each >> script with increasing problem sizes and computations cores: >> >> 1. 100,000 elts on 4 cores >> 2. 1 million elts on 40 cores >> 3. 10 millions elts on 400 cores >> 4. 100 millions elts on 4,000 cores >> >> The resulting graph is also attached. The scaling using PETSc >> 3.10 >> clearly deteriorates for large cases, while the one using >> PETSc 3.6 is >> robust. >> >> After a few tests, I found that the scaling is mostly >> sensitive to the >> use of the AMG method for the coarse grid (line 1780 in >> main_ex42_petsc36.cc). In particular, the performance strongly >> deteriorates when commenting lines 1777 to 1790 (in >> main_ex42_petsc36.cc). >> >> Do you have any idea of what changed between version 3.6 and >> version >> 3.10 that may imply such degradation? >> >> >> I believe the default values for PCGAMG changed between versions. >> It sounds like the coarsening rate >> is not great enough, so that these grids are too large. This can >> be set using: >> >> ??https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html >> >> There is some explanation of this effect on that page. Let us >> know if setting this does not correct the situation. >> >> ? Thanks, >> >> ? ? ?Matt >> ? >> >> Let me know if you need further information. >> >> Best, >> >> Myriam Peyrounette >> >> >> -- >> Myriam Peyrounette >> CNRS/IDRIS - HLST >> -- >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which their experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> > > -- > Myriam Peyrounette > CNRS/IDRIS - HLST > -- > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -- Myriam Peyrounette CNRS/IDRIS - HLST -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: PCView_GAMG_threshold_investigation.zip Type: application/zip Size: 6924 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: pcview_gamg_threshold_investigation.png Type: image/png Size: 23243 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2975 bytes Desc: Signature cryptographique S/MIME URL: From knepley at gmail.com Tue Mar 5 12:42:59 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 5 Mar 2019 13:42:59 -0500 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> References: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> Message-ID: On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette < myriam.peyrounette at idris.fr> wrote: > I used PCView to display the size of the linear system in each level of > the MG. You'll find the outputs attached to this mail (zip file) for both > the default threshold value and a value of 0.1, and for both 3.6 and 3.10 > PETSc versions. > > For convenience, I summarized the information in a graph, also attached > (png file). > > Great! Can you draw lines for the different runs you did? My interpretation was that memory was increasing as you did larger runs, and that you though that was coming from GAMG. That means the curves should be pushed up for larger runs. Do you see that? Thanks, Matt > As you can see, there are slight differences between the two versions but > none is critical, in my opinion. Do you see anything suspicious in the > outputs? > > + I can't find the default threshold value. Do you know where I can find > it? > > Thanks for the follow-up > > Myriam > > Le 03/05/19 ? 14:06, Matthew Knepley a ?crit : > > On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette < > myriam.peyrounette at idris.fr> wrote: > >> Hi Matt, >> >> I plotted the memory scalings using different threshold values. The two >> scalings are slightly translated (from -22 to -88 mB) but this gain is >> neglectable. The 3.6-scaling keeps being robust while the 3.10-scaling >> deteriorates. >> >> Do you have any other suggestion? >> > Mark, what is the option she can give to output all the GAMG data? > > Also, run using -ksp_view. GAMG will report all the sizes of its grids, so > it should be easy to see > if the coarse grid sizes are increasing, and also what the effect of the > threshold value is. > > Thanks, > > Matt > >> Thanks >> Myriam >> >> Le 03/02/19 ? 02:27, Matthew Knepley a ?crit : >> >> On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> Hi, >>> >>> I used to run my code with PETSc 3.6. Since I upgraded the PETSc version >>> to 3.10, this code has a bad memory scaling. 
>>> >>> To report this issue, I took the PETSc script ex42.c and slightly >>> modified it so that the KSP and PC configurations are the same as in my >>> code. In particular, I use a "personnalised" multi-grid method. The >>> modifications are indicated by the keyword "TopBridge" in the attached >>> scripts. >>> >>> To plot the memory (weak) scaling, I ran four calculations for each >>> script with increasing problem sizes and computations cores: >>> >>> 1. 100,000 elts on 4 cores >>> 2. 1 million elts on 40 cores >>> 3. 10 millions elts on 400 cores >>> 4. 100 millions elts on 4,000 cores >>> >>> The resulting graph is also attached. The scaling using PETSc 3.10 >>> clearly deteriorates for large cases, while the one using PETSc 3.6 is >>> robust. >>> >>> After a few tests, I found that the scaling is mostly sensitive to the >>> use of the AMG method for the coarse grid (line 1780 in >>> main_ex42_petsc36.cc). In particular, the performance strongly >>> deteriorates when commenting lines 1777 to 1790 (in >>> main_ex42_petsc36.cc). >>> >>> Do you have any idea of what changed between version 3.6 and version >>> 3.10 that may imply such degradation? >>> >> >> I believe the default values for PCGAMG changed between versions. It >> sounds like the coarsening rate >> is not great enough, so that these grids are too large. This can be set >> using: >> >> >> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html >> >> There is some explanation of this effect on that page. Let us know if >> setting this does not correct the situation. >> >> Thanks, >> >> Matt >> >> >>> Let me know if you need further information. >>> >>> Best, >>> >>> Myriam Peyrounette >>> >>> >>> -- >>> Myriam Peyrounette >>> CNRS/IDRIS - HLST >>> -- >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> -- >> Myriam Peyrounette >> CNRS/IDRIS - HLST >> -- >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -- > Myriam Peyrounette > CNRS/IDRIS - HLST > -- > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mbuerkle at web.de Tue Mar 5 18:19:40 2019 From: mbuerkle at web.de (Marius Buerkle) Date: Wed, 6 Mar 2019 01:19:40 +0100 Subject: [petsc-users] MatMatMult densemat Message-ID: An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Mar 5 18:58:29 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Wed, 6 Mar 2019 00:58:29 +0000 Subject: [petsc-users] MatMatMult densemat In-Reply-To: References: Message-ID: <8457F86F-6A72-4D60-A1AB-F8460DDE9C79@anl.gov> Marius, The reason this is happening is because the routine MatMatMultSymbolic_MPIDense_MPIDense() works by converting the matrix to elemental format, doing the product and then converting back. Elemental format has some block cyclic storage format and so the row ownership knowledge is lost along the way. I do not have any suggestions, though others might. 
Barry > On Mar 5, 2019, at 6:19 PM, Marius Buerkle via petsc-users wrote: > > Hi, > > > I have a question regarding MatMatMult for MPIDENSE matrices. I have two dense matrices A and B for which I set up the number of local rows each processor owns manually (same for A and B) when creating them with MatCreateDense (which is different from what PETSC_DECIDE would do). When I calculate A*B=C using MatMatMult with MAT_INITIAL_MATRIX the resulting matrix C has a different distribution of the rows among the processes. Is this normal? I would have expected that C inherits the local row structure from A and B. Later on, I want to multiply C, let's say with A, which then gives an error that the local sizes do not conform. > > If on the other hand A is MATMPIAIJ then C has the same local row structure. > > > Best, > > Marius > From bsmith at mcs.anl.gov Tue Mar 5 19:21:24 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Wed, 6 Mar 2019 01:21:24 +0000 Subject: [petsc-users] Loading only upper + MatSetOption(A,MAT_SYMMETRIC,PETSC_TRUE); In-Reply-To: <232628624.8168576.1551704593241@mail.yahoo.com> References: <232628624.8168576.1551704593241.ref@mail.yahoo.com> <232628624.8168576.1551704593241@mail.yahoo.com> Message-ID: > On Mar 4, 2019, at 7:03 AM, Klaus Burkart via petsc-users wrote: > > Hello, > > I want to solve many symmetric linear systems one after another in parallel using boomerAMG + KSPCG and need to make the matrix transfer more efficient. Matrices are symmetric in structure and values. boomerAMG + KSPCG work fine. > > So far I have been loading the entire matrices but I read in a paper, that it's sufficient to load the upper part only and tell petsc that the matrix is symmetric using MatSetOption(A,MAT_SYMMETRIC,PETSC_TRUE); Unfortunately all computations fail if I load only the upper values and use MatSetOption(A,MAT_SYMMETRIC,PETSC_TRUE); Unfortunately it doesn't work that way. The MatSetOption() is just a marker that indicates that the user states the matrix is symmetric. It doesn't change the format or values. You could write a custom MatLoad() that reads in just the symmetric part and builds the entire MPIAIJ (since BoomerAMG requires the full MPIAIJ) but that is a good amount of work because you have to do the careful communication needed to "fill in" the "transpose" part of the matrix. Basically read in the matrix and do a parallel transpose on the fly (also requires care in getting the matrix preallocation correct). 
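A very rough, untested sketch of that "fill in the transpose part" idea (illustrative only, not a full custom MatLoad(): it assumes each rank already holds its share of the upper-triangular entries (i <= j) as COO arrays ai[], aj[], av[] of length nz_local with a known global size N, skips preallocation, and drops the usual ierr/CHKERRQ error checking for brevity):

  /* Assemble the full MPIAIJ matrix from upper-triangular entries only.  */
  /* Off-process insertions are stashed and communicated during assembly, */
  /* so the mirrored "transpose" half lands on the right ranks.           */
  Mat      A;
  PetscInt k;
  MatCreate(PETSC_COMM_WORLD,&A);
  MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,N,N);
  MatSetType(A,MATMPIAIJ);
  MatSetUp(A);                     /* no preallocation: fine for a test, slow for large matrices */
  for (k = 0; k < nz_local; k++) {
    MatSetValue(A,ai[k],aj[k],av[k],INSERT_VALUES);
    if (ai[k] != aj[k]) MatSetValue(A,aj[k],ai[k],av[k],INSERT_VALUES);  /* mirror into the lower triangle */
  }
  MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);
  MatSetOption(A,MAT_SYMMETRIC,PETSC_TRUE);  /* marks symmetry for solvers; it does not fill in any values */

For production use the preallocation really does need per-row counts that include the mirrored entries, which is where the extra work mentioned above comes in.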
Basically you tell git bisect the git commit of the code that is "good" and the git commit of the code that is "bad" and it gives you additional git commits for you to check your code on each time telling git if it is "good" or "bad", eventually git bisect tells you exactly the git commit that "broke" the code. No guess work, no endless speculation. The draw back is that you have to ./configure && make PETSc for each "test" commit and then compile and run your code for that commit. I can understand if you have to run your code on 10,000 processes to check if it is "good" or "bad" that can be very daunting. But all I can suggest is to find a problem size that is manageable and do the git bisect process (yeah it may take several hours but that beats days of head banging). Good luck, Barry > On Mar 5, 2019, at 12:42 PM, Matthew Knepley via petsc-users wrote: > > On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette wrote: > I used PCView to display the size of the linear system in each level of the MG. You'll find the outputs attached to this mail (zip file) for both the default threshold value and a value of 0.1, and for both 3.6 and 3.10 PETSc versions. > > For convenience, I summarized the information in a graph, also attached (png file). > > > Great! Can you draw lines for the different runs you did? My interpretation was that memory was increasing > as you did larger runs, and that you though that was coming from GAMG. That means the curves should > be pushed up for larger runs. Do you see that? > > Thanks, > > Matt > As you can see, there are slight differences between the two versions but none is critical, in my opinion. Do you see anything suspicious in the outputs? > > + I can't find the default threshold value. Do you know where I can find it? > > Thanks for the follow-up > > Myriam > > > Le 03/05/19 ? 14:06, Matthew Knepley a ?crit : >> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette wrote: >> Hi Matt, >> >> I plotted the memory scalings using different threshold values. The two scalings are slightly translated (from -22 to -88 mB) but this gain is neglectable. The 3.6-scaling keeps being robust while the 3.10-scaling deteriorates. >> >> Do you have any other suggestion? >> >> Mark, what is the option she can give to output all the GAMG data? >> >> Also, run using -ksp_view. GAMG will report all the sizes of its grids, so it should be easy to see >> if the coarse grid sizes are increasing, and also what the effect of the threshold value is. >> >> Thanks, >> >> Matt >> Thanks >> >> Myriam >> >> Le 03/02/19 ? 02:27, Matthew Knepley a ?crit : >>> On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via petsc-users wrote: >>> Hi, >>> >>> I used to run my code with PETSc 3.6. Since I upgraded the PETSc version >>> to 3.10, this code has a bad memory scaling. >>> >>> To report this issue, I took the PETSc script ex42.c and slightly >>> modified it so that the KSP and PC configurations are the same as in my >>> code. In particular, I use a "personnalised" multi-grid method. The >>> modifications are indicated by the keyword "TopBridge" in the attached >>> scripts. >>> >>> To plot the memory (weak) scaling, I ran four calculations for each >>> script with increasing problem sizes and computations cores: >>> >>> 1. 100,000 elts on 4 cores >>> 2. 1 million elts on 40 cores >>> 3. 10 millions elts on 400 cores >>> 4. 100 millions elts on 4,000 cores >>> >>> The resulting graph is also attached. 
The scaling using PETSc 3.10 >>> clearly deteriorates for large cases, while the one using PETSc 3.6 is >>> robust. >>> >>> After a few tests, I found that the scaling is mostly sensitive to the >>> use of the AMG method for the coarse grid (line 1780 in >>> main_ex42_petsc36.cc). In particular, the performance strongly >>> deteriorates when commenting lines 1777 to 1790 (in main_ex42_petsc36.cc). >>> >>> Do you have any idea of what changed between version 3.6 and version >>> 3.10 that may imply such degradation? >>> >>> I believe the default values for PCGAMG changed between versions. It sounds like the coarsening rate >>> is not great enough, so that these grids are too large. This can be set using: >>> >>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html >>> >>> There is some explanation of this effect on that page. Let us know if setting this does not correct the situation. >>> >>> Thanks, >>> >>> Matt >>> >>> Let me know if you need further information. >>> >>> Best, >>> >>> Myriam Peyrounette >>> >>> >>> -- >>> Myriam Peyrounette >>> CNRS/IDRIS - HLST >>> -- >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >> >> -- >> Myriam Peyrounette >> CNRS/IDRIS - HLST >> -- >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ > > -- > Myriam Peyrounette > CNRS/IDRIS - HLST > -- > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From hongzhang at anl.gov Tue Mar 5 21:47:18 2019 From: hongzhang at anl.gov (Zhang, Hong) Date: Wed, 6 Mar 2019 03:47:18 +0000 Subject: [petsc-users] MatMatMult densemat In-Reply-To: References: Message-ID: <6BF45BFC-C159-4644-814E-50091B676F1C@anl.gov> What happens if you try to preallocate C matrix (in the same way as A and B) and use MatMatMult with MAT_REUSE_MATRIX? Hong (Mr.) On Mar 5, 2019, at 6:19 PM, Marius Buerkle via petsc-users > wrote: Hi, I have a question regarding MatMatMult for MPIDENSE matrices. I have two dense matrices A and B for which I set the number up the number of local rows each processor owns manually (same for A and B) when creating them with MatCreateDense (which is different from what PETSC_DECIDE what do). When I calculate A*B=C using MatMatMult with MAT_INITIAL_MATRIX the resulting matrix C has a different distribution of the rows among the processes. Is this normal? I would have expected that C inherits the local row structure from A and B. Later on, I want to multiply C let?s say with A which gives then accordingly an error that the local size is not conform. If on the other hand A is MATMPIAIJ then C has the same local row structure. Best, Marius -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jed at jedbrown.org Tue Mar 5 22:35:36 2019 From: jed at jedbrown.org (Jed Brown) Date: Tue, 05 Mar 2019 21:35:36 -0700 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: References: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> Message-ID: <87sgw0pqlj.fsf@jedbrown.org> Myriam, in your first message, there was a significant (about 50%) increase in memory consumption already on 4 cores. Before attacking scaling, it may be useful to trace memory usage for that base case. Even better if you can reduce to one process. Anyway, I would start by running both cases with -log_view and looking at the memory summary. I would then use Massif (the memory profiler/tracer component in Valgrind) to obtain stack traces for the large allocations. Comparing those traces should help narrow down which part of the code has significantly different memory allocation behavior. It might also point to the unacceptable memory consumption under weak scaling, but it's something we should try to fix. If I had to guess, it may be in intermediate data structures for the different PtAP algorithms in GAMG. The option "-matptap_via scalable" may be helpful. "Smith, Barry F. via petsc-users" writes: > Myriam, > > Sorry we have not been able to resolve this problem with memory scaling yet. > > The best tool to determine the change in a code that results in large differences in a program's run is git bisect. Basically you tell git bisect > the git commit of the code that is "good" and the git commit of the code that is "bad" and it gives you additional git commits for you to check your code on each time telling git if it is "good" or "bad", eventually git bisect tells you exactly the git commit that "broke" the code. No guess work, no endless speculation. > > The draw back is that you have to ./configure && make PETSc for each "test" commit and then compile and run your code for that commit. I can understand if you have to run your code on 10,000 processes to check if it is "good" or "bad" that can be very daunting. But all I can suggest is to find a problem size that is manageable and do the git bisect process (yeah it may take several hours but that beats days of head banging). > > Good luck, > > Barry > > >> On Mar 5, 2019, at 12:42 PM, Matthew Knepley via petsc-users wrote: >> >> On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette wrote: >> I used PCView to display the size of the linear system in each level of the MG. You'll find the outputs attached to this mail (zip file) for both the default threshold value and a value of 0.1, and for both 3.6 and 3.10 PETSc versions. >> >> For convenience, I summarized the information in a graph, also attached (png file). >> >> >> Great! Can you draw lines for the different runs you did? My interpretation was that memory was increasing >> as you did larger runs, and that you though that was coming from GAMG. That means the curves should >> be pushed up for larger runs. Do you see that? >> >> Thanks, >> >> Matt >> As you can see, there are slight differences between the two versions but none is critical, in my opinion. Do you see anything suspicious in the outputs? >> >> + I can't find the default threshold value. Do you know where I can find it? >> >> Thanks for the follow-up >> >> Myriam >> >> >> Le 03/05/19 ? 14:06, Matthew Knepley a ?crit : >>> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette wrote: >>> Hi Matt, >>> >>> I plotted the memory scalings using different threshold values. 
The two scalings are slightly translated (from -22 to -88 mB) but this gain is neglectable. The 3.6-scaling keeps being robust while the 3.10-scaling deteriorates. >>> >>> Do you have any other suggestion? >>> >>> Mark, what is the option she can give to output all the GAMG data? >>> >>> Also, run using -ksp_view. GAMG will report all the sizes of its grids, so it should be easy to see >>> if the coarse grid sizes are increasing, and also what the effect of the threshold value is. >>> >>> Thanks, >>> >>> Matt >>> Thanks >>> >>> Myriam >>> >>> Le 03/02/19 ? 02:27, Matthew Knepley a ?crit : >>>> On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via petsc-users wrote: >>>> Hi, >>>> >>>> I used to run my code with PETSc 3.6. Since I upgraded the PETSc version >>>> to 3.10, this code has a bad memory scaling. >>>> >>>> To report this issue, I took the PETSc script ex42.c and slightly >>>> modified it so that the KSP and PC configurations are the same as in my >>>> code. In particular, I use a "personnalised" multi-grid method. The >>>> modifications are indicated by the keyword "TopBridge" in the attached >>>> scripts. >>>> >>>> To plot the memory (weak) scaling, I ran four calculations for each >>>> script with increasing problem sizes and computations cores: >>>> >>>> 1. 100,000 elts on 4 cores >>>> 2. 1 million elts on 40 cores >>>> 3. 10 millions elts on 400 cores >>>> 4. 100 millions elts on 4,000 cores >>>> >>>> The resulting graph is also attached. The scaling using PETSc 3.10 >>>> clearly deteriorates for large cases, while the one using PETSc 3.6 is >>>> robust. >>>> >>>> After a few tests, I found that the scaling is mostly sensitive to the >>>> use of the AMG method for the coarse grid (line 1780 in >>>> main_ex42_petsc36.cc). In particular, the performance strongly >>>> deteriorates when commenting lines 1777 to 1790 (in main_ex42_petsc36.cc). >>>> >>>> Do you have any idea of what changed between version 3.6 and version >>>> 3.10 that may imply such degradation? >>>> >>>> I believe the default values for PCGAMG changed between versions. It sounds like the coarsening rate >>>> is not great enough, so that these grids are too large. This can be set using: >>>> >>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html >>>> >>>> There is some explanation of this effect on that page. Let us know if setting this does not correct the situation. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> Let me know if you need further information. >>>> >>>> Best, >>>> >>>> Myriam Peyrounette >>>> >>>> >>>> -- >>>> Myriam Peyrounette >>>> CNRS/IDRIS - HLST >>>> -- >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>> >>> -- >>> Myriam Peyrounette >>> CNRS/IDRIS - HLST >>> -- >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >> >> -- >> Myriam Peyrounette >> CNRS/IDRIS - HLST >> -- >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ From mbuerkle at web.de Tue Mar 5 22:58:14 2019 From: mbuerkle at web.de (Marius Buerkle) Date: Wed, 6 Mar 2019 05:58:14 +0100 Subject: [petsc-users] MatMatMult densemat In-Reply-To: <6BF45BFC-C159-4644-814E-50091B676F1C@anl.gov> References: <6BF45BFC-C159-4644-814E-50091B676F1C@anl.gov> Message-ID: An HTML attachment was scrubbed... URL: From mbuerkle at web.de Tue Mar 5 23:00:19 2019 From: mbuerkle at web.de (Marius Buerkle) Date: Wed, 6 Mar 2019 06:00:19 +0100 Subject: [petsc-users] MatMatMult densemat In-Reply-To: <8457F86F-6A72-4D60-A1AB-F8460DDE9C79@anl.gov> References: <8457F86F-6A72-4D60-A1AB-F8460DDE9C79@anl.gov> Message-ID: An HTML attachment was scrubbed... URL: From eda.oktay at metu.edu.tr Thu Mar 7 05:45:13 2019 From: eda.oktay at metu.edu.tr (Eda Oktay) Date: Thu, 7 Mar 2019 14:45:13 +0300 Subject: [petsc-users] EPSGetEigenpair Problem Message-ID: Hello, I have a code finding Laplacian of a matrix and then I need to find this Laplacian's second smallest eigenpair. I used EPSGetEigenpair code but I get the values like "0." or "4." (I could not get decimals) when I used PetscPrintf(PETSC_COMM_WORLD," The second smallest eigenvalue: %g\n",kr) line. What could be the problem? Best regards, Eda -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Mar 7 06:07:44 2019 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 7 Mar 2019 07:07:44 -0500 Subject: [petsc-users] EPSGetEigenpair Problem In-Reply-To: References: Message-ID: On Thu, Mar 7, 2019 at 6:45 AM Eda Oktay via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello, > > I have a code finding Laplacian of a matrix and then I need to find this > Laplacian's second smallest eigenpair. I used EPSGetEigenpair code but I > get the values like "0." or "4." (I could not get decimals) when I used > PetscPrintf(PETSC_COMM_WORLD," The second smallest eigenvalue: %g\n",kr) > line. > > What could be the problem? > Hi Eda, I have an example that does just this, and I am getting the correct result. I have not yet checked it in, but i attach it here. Are you setting the right target? Thanks, Matt > Best regards, > > Eda > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex6.c Type: application/octet-stream Size: 14305 bytes Desc: not available URL: From myriam.peyrounette at idris.fr Fri Mar 8 03:10:30 2019 From: myriam.peyrounette at idris.fr (Myriam Peyrounette) Date: Fri, 8 Mar 2019 10:10:30 +0100 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: <87sgw0pqlj.fsf@jedbrown.org> References: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> <87sgw0pqlj.fsf@jedbrown.org> Message-ID: Hi all, thanks a lot for all your suggestions. I followed Jed's advice and focused on the smallest case (problem size = 1e5) to find the origin of the memory gap. And I ran my jobs on a single core. I first used a homemade script to plot memory and time consumption: see mem_consumption.png and time_consumption.png attached. The steps displayed correspond to checkpoints I defined in the main file (search for keyword "STEP" in the attached main file if you need to locate them). 
We can see that both the time and the memory needs increase while using petsc 3.10. With regard to memory consumption, the memory gap (of ~135,000,000B) is not critical yet for such a problem size, but will be for larger problems. According to these graphs, something clearly happens while calling KSPSolve. The code also spends more time building the matrix. To dig deeper, I printed the LogView outputs (logView_petsc3XX.log). In particular, we see that the memory distribution among the petsc object types is very different depending on the petsc version. I highlighted this in log_view_mem_distr.pdf by sorting the petsc object types according to their memory use, and computing the difference between petsc 3.10 and petsc 3.6? (column on the rigth). But I don't know how to understand that... Finally, I used Massif (massif_petscXX.out). The total memory gap of ~135,000,000B is verified. The outputs further indicate that most of the memory space is required by DMCreateMatrix (1,174,436,836B). This value is almost the same is both versions so I don't think this is the problem.?? You mentioned PtAP: MatPtAPSymbolic_SeqAIJ_SeqMAIJ needs 56,277,776B with petsc 3.6, and 112,562,000B in petsc 3.10 (twice more). So it is a good start but it does not correspond to the total memory gap. I will try the option "-matptap via scalable" to see if there is an improvement. I also find a gap at KSPSolve, which needs up to 173,253,112B with petsc 3.10 (line 461 in massif_petsc310.out), and no more than? 146,825,952B (line 1534 in massif_petsc36.out) with petsc 3.6. But again, the sum of these gaps does not fit the total gap.? I am not very familiar with massif, so maybe you'll see additional relevant information? Best, Myriam Le 03/06/19 ? 05:35, Jed Brown a ?crit?: > Myriam, in your first message, there was a significant (about 50%) > increase in memory consumption already on 4 cores. Before attacking > scaling, it may be useful to trace memory usage for that base case. > Even better if you can reduce to one process. Anyway, I would start by > running both cases with -log_view and looking at the memory summary. I > would then use Massif (the memory profiler/tracer component in Valgrind) > to obtain stack traces for the large allocations. Comparing those > traces should help narrow down which part of the code has significantly > different memory allocation behavior. It might also point to the > unacceptable memory consumption under weak scaling, but it's something > we should try to fix. > > If I had to guess, it may be in intermediate data structures for the > different PtAP algorithms in GAMG. The option "-matptap_via scalable" > may be helpful. > > "Smith, Barry F. via petsc-users" writes: > >> Myriam, >> >> Sorry we have not been able to resolve this problem with memory scaling yet. >> >> The best tool to determine the change in a code that results in large differences in a program's run is git bisect. Basically you tell git bisect >> the git commit of the code that is "good" and the git commit of the code that is "bad" and it gives you additional git commits for you to check your code on each time telling git if it is "good" or "bad", eventually git bisect tells you exactly the git commit that "broke" the code. No guess work, no endless speculation. >> >> The draw back is that you have to ./configure && make PETSc for each "test" commit and then compile and run your code for that commit. 
I can understand if you have to run your code on 10,000 processes to check if it is "good" or "bad" that can be very daunting. But all I can suggest is to find a problem size that is manageable and do the git bisect process (yeah it may take several hours but that beats days of head banging). >> >> Good luck, >> >> Barry >> >> >>> On Mar 5, 2019, at 12:42 PM, Matthew Knepley via petsc-users wrote: >>> >>> On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette wrote: >>> I used PCView to display the size of the linear system in each level of the MG. You'll find the outputs attached to this mail (zip file) for both the default threshold value and a value of 0.1, and for both 3.6 and 3.10 PETSc versions. >>> >>> For convenience, I summarized the information in a graph, also attached (png file). >>> >>> >>> Great! Can you draw lines for the different runs you did? My interpretation was that memory was increasing >>> as you did larger runs, and that you though that was coming from GAMG. That means the curves should >>> be pushed up for larger runs. Do you see that? >>> >>> Thanks, >>> >>> Matt >>> As you can see, there are slight differences between the two versions but none is critical, in my opinion. Do you see anything suspicious in the outputs? >>> >>> + I can't find the default threshold value. Do you know where I can find it? >>> >>> Thanks for the follow-up >>> >>> Myriam >>> >>> >>> Le 03/05/19 ? 14:06, Matthew Knepley a ?crit : >>>> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette wrote: >>>> Hi Matt, >>>> >>>> I plotted the memory scalings using different threshold values. The two scalings are slightly translated (from -22 to -88 mB) but this gain is neglectable. The 3.6-scaling keeps being robust while the 3.10-scaling deteriorates. >>>> >>>> Do you have any other suggestion? >>>> >>>> Mark, what is the option she can give to output all the GAMG data? >>>> >>>> Also, run using -ksp_view. GAMG will report all the sizes of its grids, so it should be easy to see >>>> if the coarse grid sizes are increasing, and also what the effect of the threshold value is. >>>> >>>> Thanks, >>>> >>>> Matt >>>> Thanks >>>> >>>> Myriam >>>> >>>> Le 03/02/19 ? 02:27, Matthew Knepley a ?crit : >>>>> On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via petsc-users wrote: >>>>> Hi, >>>>> >>>>> I used to run my code with PETSc 3.6. Since I upgraded the PETSc version >>>>> to 3.10, this code has a bad memory scaling. >>>>> >>>>> To report this issue, I took the PETSc script ex42.c and slightly >>>>> modified it so that the KSP and PC configurations are the same as in my >>>>> code. In particular, I use a "personnalised" multi-grid method. The >>>>> modifications are indicated by the keyword "TopBridge" in the attached >>>>> scripts. >>>>> >>>>> To plot the memory (weak) scaling, I ran four calculations for each >>>>> script with increasing problem sizes and computations cores: >>>>> >>>>> 1. 100,000 elts on 4 cores >>>>> 2. 1 million elts on 40 cores >>>>> 3. 10 millions elts on 400 cores >>>>> 4. 100 millions elts on 4,000 cores >>>>> >>>>> The resulting graph is also attached. The scaling using PETSc 3.10 >>>>> clearly deteriorates for large cases, while the one using PETSc 3.6 is >>>>> robust. >>>>> >>>>> After a few tests, I found that the scaling is mostly sensitive to the >>>>> use of the AMG method for the coarse grid (line 1780 in >>>>> main_ex42_petsc36.cc). In particular, the performance strongly >>>>> deteriorates when commenting lines 1777 to 1790 (in main_ex42_petsc36.cc). 
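A note for readers of the archive: the configuration described in the quoted text above (an algebraic multigrid coarse solve inside a PCMG hierarchy) looks roughly like the generic sketch below. This is an illustration, not the contents of lines 1777 to 1790 of the attached main_ex42 files; it assumes ksp and ierr are declared in the surrounding solver setup and that the KSP has the DMDA attached via KSPSetDM() so PCMG can build the interpolation between levels.

  KSP coarse_ksp;
  PC  pc,coarse_pc;

  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCMG);CHKERRQ(ierr);
  ierr = PCMGSetLevels(pc,3,NULL);CHKERRQ(ierr);            /* e.g. three levels */
  ierr = PCMGGetCoarseSolve(pc,&coarse_ksp);CHKERRQ(ierr);
  ierr = KSPGetPC(coarse_ksp,&coarse_pc);CHKERRQ(ierr);
  ierr = PCSetType(coarse_pc,PCGAMG);CHKERRQ(ierr);         /* AMG on the coarse grid */
  ierr = PCGAMGSetType(coarse_pc,PCGAMGAGG);CHKERRQ(ierr);  /* smoothed-aggregation GAMG */

On the command line the coarse level is reachable through the mg_coarse_ prefix, e.g. -mg_coarse_pc_type gamg, and the coarsening threshold discussed further below would then be set with something like -mg_coarse_pc_gamg_threshold; the exact option names should be checked against the PETSc version in use.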
>>>>> >>>>> Do you have any idea of what changed between version 3.6 and version >>>>> 3.10 that may imply such degradation? >>>>> >>>>> I believe the default values for PCGAMG changed between versions. It sounds like the coarsening rate >>>>> is not great enough, so that these grids are too large. This can be set using: >>>>> >>>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html >>>>> >>>>> There is some explanation of this effect on that page. Let us know if setting this does not correct the situation. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> Let me know if you need further information. >>>>> >>>>> Best, >>>>> >>>>> Myriam Peyrounette >>>>> >>>>> >>>>> -- >>>>> Myriam Peyrounette >>>>> CNRS/IDRIS - HLST >>>>> -- >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>> -- >>>> Myriam Peyrounette >>>> CNRS/IDRIS - HLST >>>> -- >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>> -- >>> Myriam Peyrounette >>> CNRS/IDRIS - HLST >>> -- >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ -- Myriam Peyrounette CNRS/IDRIS - HLST -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: mem_consumption.png Type: image/png Size: 24499 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: time_consumption.png Type: image/png Size: 19484 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: main_petsc36.cc Type: text/x-c++src Size: 92061 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: logView_petsc36.log Type: text/x-log Size: 18317 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: logView_petsc310.log Type: text/x-log Size: 18288 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: log_view_mem_distr.pdf Type: application/pdf Size: 20293 bytes Desc: not available URL: -------------- next part -------------- -------------------------------------------------------------------------------- Command: ./ex42 -mx 48 -my 48 -mz 48 -stokes_ksp_monitor_blocks 0 -model 0 -levels 3 Massif arguments: (none) ms_print arguments: massif.out.33309 -------------------------------------------------------------------------------- GB 1.547^ : | @:::::::::::::@::::::::::::#::::: | @@@:@::::::::::::@ @ # : | :@@@ :@ @ @ # : | :::@@@ :@ @ @ # : | @@@@:::::::::::: :@@@ :@ @ @ # : | @ : : :: :@@@ :@ @ @ # : | @ : : :: :@@@ :@ @ @ # : | @ : : :: :@@@ :@ @ @ # : | @ : : :: :@@@ :@ @ @ # : | @ : : :: :@@@ :@ @ @ # : | @ : : :: :@@@ :@ @ @ # : |@@@@ : : :: :@@@ :@ @ @ # : |@ @ : : :: :@@@ :@ @ @ # : |@ @ : : :: :@@@ :@ @ @ # : |@ @ : : :: :@@@ :@ @ @ # : |@ @ : : :: :@@@ :@ @ @ # : |@ @ : : :: :@@@ :@ @ @ # : |@ @ : : :: :@@@ :@ @ @ # : |@ @ : : :: :@@@ :@ @ @ # : 0 +----------------------------------------------------------------------->Gi 0 304.7 Number of snapshots: 82 Detailed snapshots: [3, 5, 11, 37, 40, 43, 46, 50, 53, 57, 60, 64, 67 (peak), 77] -------------------------------------------------------------------------------- n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) -------------------------------------------------------------------------------- 0 0 0 0 0 0 1 35,834,612 71,737,416 71,723,119 14,297 0 2 327,760,951 71,715,056 71,700,775 14,281 0 3 328,278,660 75,486,720 75,471,765 14,955 0 99.98% (75,471,765B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc. ->88.82% (67,049,927B) 0x4F3E598: PetscMallocAlign (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | ->75.01% (56,623,104B) 0x417587: CellPropertiesCreate(_p_DM*, _p_CellProperties**) (main_petsc36.cc:75) | | ->75.01% (56,623,104B) 0x404CBE: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:1860) | | ->75.01% (56,623,104B) 0x4048AA: main (main_petsc36.cc:2255) | | | ->07.48% (5,647,152B) 0x4FF7CFF: VecCreate_Seq (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->07.48% (5,647,152B) 0x5008681: VecSetType (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->07.48% (5,647,152B) 0x4FF8EA3: VecCreate_Standard (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->07.48% (5,647,152B) 0x5008681: VecSetType (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.74% (2,823,576B) 0x53C0070: DMCreateGlobalVector_DA (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->03.74% (2,823,576B) 0x553D255: DMCreateGlobalVector (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->03.74% (2,823,576B) 0x53C5B4C: DMDASetUniformCoordinates (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->03.74% (2,823,576B) 0x404C6C: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:1854) | | | ->03.74% (2,823,576B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | ->03.74% (2,823,576B) 0x53B5985: DMCreateLocalVector_DA (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.74% (2,823,576B) 0x5542B05: DMCreateLocalVector (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.74% (2,823,576B) 0x5543B56: DMGetCoordinatesLocal (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.74% (2,823,576B) 0x404D00: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:1864) | | ->03.74% 
(2,823,576B) 0x4048AA: main (main_petsc36.cc:2255) | | | ->04.99% (3,764,783B) 0x53EE8A9: DMCreateMatrix_DA_3d_MPIAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->04.99% (3,764,783B) 0x53E4FB5: DMCreateMatrix_DA (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->04.99% (3,764,783B) 0x553AA24: DMCreateMatrix (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->04.99% (3,764,783B) 0x40614B: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2028) | | ->04.99% (3,764,783B) 0x4048AA: main (main_petsc36.cc:2255) | | | ->01.25% (941,192B) 0x53A4C31: DMSetUp_DA_3D (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.25% (941,192B) 0x53E0AB6: DMSetUp_DA (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.25% (941,192B) 0x553972E: DMSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.25% (941,192B) 0x53AE57B: DMDACreate3d (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.25% (941,192B) in 2 places, all below massif's threshold (01.00%) | | | ->00.10% (73,696B) in 1+ places, all below ms_print's threshold (01.00%) | ->11.12% (8,391,248B) 0xA051F04: MPIDI_CH3I_Seg_commit (in /gpfs4l/smplocal/intel/impi/4.1.1.036/intel64/lib/libmpi.so.4.1) | ->11.12% (8,391,248B) 0xA05E0D2: MPID_nem_init_ckpt (in /gpfs4l/smplocal/intel/impi/4.1.1.036/intel64/lib/libmpi.so.4.1) | ->11.12% (8,391,248B) 0xA05FCA5: MPID_nem_init (in /gpfs4l/smplocal/intel/impi/4.1.1.036/intel64/lib/libmpi.so.4.1) | ->11.12% (8,391,248B) 0x9F1106E: MPIDI_CH3_Init (in /gpfs4l/smplocal/intel/impi/4.1.1.036/intel64/lib/libmpi.so.4.1) | ->11.12% (8,391,248B) 0xA04EBAB: MPID_Init (in /gpfs4l/smplocal/intel/impi/4.1.1.036/intel64/lib/libmpi.so.4.1) | ->11.12% (8,391,248B) 0xA02538D: MPIR_Init_thread (in /gpfs4l/smplocal/intel/impi/4.1.1.036/intel64/lib/libmpi.so.4.1) | ->11.12% (8,391,248B) 0xA024D98: PMPI_Init_thread (in /gpfs4l/smplocal/intel/impi/4.1.1.036/intel64/lib/libmpi.so.4.1) | ->11.12% (8,391,248B) 0x4F1D607: PetscInitialize (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | ->11.12% (8,391,248B) 0x404761: main (main_petsc36.cc:2242) | ->00.04% (30,590B) in 1+ places, all below ms_print's threshold (01.00%) -------------------------------------------------------------------------------- n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) -------------------------------------------------------------------------------- 4 1,499,136,193 75,486,696 75,471,765 14,931 0 5 1,516,554,677 666,470,400 666,454,974 15,426 0 100.00% (666,454,974B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc. 
->98.73% (658,033,144B) 0x4F3E598: PetscMallocAlign (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | ->88.11% (587,218,418B) 0x52B1D78: MatSeqAIJSetPreallocation_SeqAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->88.11% (587,218,418B) 0x52B3D19: MatSeqAIJSetPreallocation (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->88.11% (587,218,418B) 0x53EEE0C: DMCreateMatrix_DA_3d_MPIAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->88.11% (587,218,418B) 0x53E4FB5: DMCreateMatrix_DA (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->88.11% (587,218,418B) 0x553AA24: DMCreateMatrix (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->88.11% (587,218,418B) 0x40614B: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2028) | | ->88.11% (587,218,418B) 0x4048AA: main (main_petsc36.cc:2255) | | | ->08.50% (56,623,104B) 0x417587: CellPropertiesCreate(_p_DM*, _p_CellProperties**) (main_petsc36.cc:75) | | ->08.50% (56,623,104B) 0x404CBE: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:1860) | | ->08.50% (56,623,104B) 0x4048AA: main (main_petsc36.cc:2255) | | | ->02.13% (14,191,622B) in 69 places, all below massif's threshold (01.00%) | ->01.26% (8,391,248B) 0xA051F04: MPIDI_CH3I_Seg_commit (in /gpfs4l/smplocal/intel/impi/4.1.1.036/intel64/lib/libmpi.so.4.1) | ->01.26% (8,391,248B) 0xA05E0D2: MPID_nem_init_ckpt (in /gpfs4l/smplocal/intel/impi/4.1.1.036/intel64/lib/libmpi.so.4.1) | ->01.26% (8,391,248B) 0xA05FCA5: MPID_nem_init (in /gpfs4l/smplocal/intel/impi/4.1.1.036/intel64/lib/libmpi.so.4.1) | ->01.26% (8,391,248B) 0x9F1106E: MPIDI_CH3_Init (in /gpfs4l/smplocal/intel/impi/4.1.1.036/intel64/lib/libmpi.so.4.1) | ->01.26% (8,391,248B) 0xA04EBAB: MPID_Init (in /gpfs4l/smplocal/intel/impi/4.1.1.036/intel64/lib/libmpi.so.4.1) | ->01.26% (8,391,248B) 0xA02538D: MPIR_Init_thread (in /gpfs4l/smplocal/intel/impi/4.1.1.036/intel64/lib/libmpi.so.4.1) | ->01.26% (8,391,248B) 0xA024D98: PMPI_Init_thread (in /gpfs4l/smplocal/intel/impi/4.1.1.036/intel64/lib/libmpi.so.4.1) | ->01.26% (8,391,248B) 0x4F1D607: PetscInitialize (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | ->01.26% (8,391,248B) 0x404761: main (main_petsc36.cc:2242) | ->00.00% (30,582B) in 1+ places, all below ms_print's threshold (01.00%) -------------------------------------------------------------------------------- n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) -------------------------------------------------------------------------------- 6 16,261,614,710 662,705,608 662,690,191 15,417 0 7 16,273,177,611 664,588,000 664,572,575 15,425 0 8 16,453,080,876 666,470,408 666,454,963 15,445 0 9 16,592,550,759 670,240,048 670,224,014 16,034 0 10 17,763,405,488 670,240,024 670,224,014 16,010 0 11 17,780,820,810 1,261,223,728 1,261,207,223 16,505 0 100.00% (1,261,207,223B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc. 
->99.33% (1,252,785,393B) 0x4F3E598: PetscMallocAlign (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | ->93.12% (1,174,436,836B) 0x52B1D78: MatSeqAIJSetPreallocation_SeqAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->93.12% (1,174,436,836B) 0x52B3D19: MatSeqAIJSetPreallocation (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->93.12% (1,174,436,836B) 0x53EEE0C: DMCreateMatrix_DA_3d_MPIAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->93.12% (1,174,436,836B) 0x53E4FB5: DMCreateMatrix_DA (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->93.12% (1,174,436,836B) 0x553AA24: DMCreateMatrix (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->46.56% (587,218,418B) 0x40614B: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2028) | | | ->46.56% (587,218,418B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | ->46.56% (587,218,418B) 0x40616C: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2029) | | ->46.56% (587,218,418B) 0x4048AA: main (main_petsc36.cc:2255) | | | ->04.49% (56,623,104B) 0x417587: CellPropertiesCreate(_p_DM*, _p_CellProperties**) (main_petsc36.cc:75) | | ->04.49% (56,623,104B) 0x404CBE: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:1860) | | ->04.49% (56,623,104B) 0x4048AA: main (main_petsc36.cc:2255) | | | ->01.72% (21,725,453B) in 72 places, all below massif's threshold (01.00%) | ->00.67% (8,421,830B) in 1+ places, all below ms_print's threshold (01.00%) -------------------------------------------------------------------------------- n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) -------------------------------------------------------------------------------- 12 32,525,877,843 1,257,458,936 1,257,442,440 16,496 0 13 32,717,337,805 1,261,223,736 1,261,207,212 16,524 0 14 32,856,396,640 1,261,221,968 1,261,205,453 16,515 0 15 32,930,640,810 1,268,778,168 1,268,761,389 16,779 0 16 58,569,488,720 1,268,755,808 1,268,739,045 16,763 0 17 80,399,203,865 1,268,755,808 1,268,739,045 16,763 0 18 80,399,942,978 1,272,567,448 1,272,550,529 16,919 0 19 81,763,811,939 1,272,545,088 1,272,528,185 16,903 0 20 81,893,298,331 1,305,332,968 1,305,305,169 27,799 0 21 85,163,552,026 1,356,110,968 1,356,077,261 33,707 0 22 85,215,698,411 1,376,145,504 1,376,117,065 28,439 0 23 93,174,834,293 1,375,145,464 1,375,117,035 28,429 0 24 93,176,062,194 1,375,962,496 1,375,933,984 28,512 0 25 93,593,897,813 1,382,701,952 1,382,673,404 28,548 0 26 93,600,805,445 1,385,254,232 1,385,224,814 29,418 0 27 94,611,634,258 1,385,051,600 1,385,022,202 29,398 0 28 94,632,051,286 1,404,694,096 1,404,662,417 31,679 0 29 94,883,287,158 1,404,702,920 1,404,671,205 31,715 0 30 95,007,514,066 1,407,559,152 1,407,523,721 35,431 0 31 95,167,535,043 1,405,947,776 1,405,915,815 31,961 0 32 95,349,461,781 1,413,587,144 1,413,552,928 34,216 0 33 95,639,044,739 1,405,736,760 1,405,704,162 32,598 0 34 95,674,406,759 1,458,922,280 1,458,880,871 41,409 0 35 96,600,728,596 1,458,922,544 1,458,881,127 41,417 0 36 97,102,450,577 1,458,922,632 1,458,881,207 41,425 0 37 97,105,663,401 1,492,821,480 1,492,779,339 42,141 0 100.00% (1,492,779,339B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc. 
->99.43% (1,484,354,449B) 0x4F3E598: PetscMallocAlign (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | ->79.06% (1,180,246,040B) 0x52B1D78: MatSeqAIJSetPreallocation_SeqAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->79.06% (1,180,246,040B) 0x52B3D19: MatSeqAIJSetPreallocation (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->78.67% (1,174,436,836B) 0x53EEE0C: DMCreateMatrix_DA_3d_MPIAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->78.67% (1,174,436,836B) 0x53E4FB5: DMCreateMatrix_DA (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->78.67% (1,174,436,836B) 0x553AA24: DMCreateMatrix (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->39.34% (587,218,418B) 0x40614B: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2028) | | | | ->39.34% (587,218,418B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | | | ->39.34% (587,218,418B) 0x40616C: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2029) | | | ->39.34% (587,218,418B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | ->00.39% (5,809,204B) in 1+ places, all below ms_print's threshold (01.00%) | | | ->07.18% (107,177,936B) 0x4FF7CFF: VecCreate_Seq (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->07.18% (107,177,936B) 0x5008681: VecSetType (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->04.78% (71,412,640B) 0x4FF1C5F: VecDuplicate_Seq (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->04.78% (71,412,640B) 0x5007F69: VecDuplicate (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->04.71% (70,337,984B) 0x500832F: VecDuplicateVecs_Default (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->04.71% (70,337,984B) 0x50050D5: VecDuplicateVecs (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->04.71% (70,337,984B) 0x570A400: KSPCreateVecs (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->02.27% (33,882,912B) 0x56AF984: KSPGMRESGetNewVectors (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->02.27% (33,882,912B) 0x56AEFB0: KSPGMRESCycle (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->02.27% (33,882,912B) 0x56AEC7A: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->02.27% (33,882,912B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->02.27% (33,882,912B) 0x56F6F73: KSPSolve_Chebyshev (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->02.27% (33,882,912B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->02.27% (33,882,912B) 0x56818FA: PCMGMCycle_Private (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->02.27% (33,882,912B) 0x5682B76: PCApply_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->02.27% (33,882,912B) 0x5698127: PCApply (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->02.27% (33,882,912B) 0x570D0E0: KSPInitialResidual (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->02.27% (33,882,912B) 0x56AEC65: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->02.27% (33,882,912B) 
0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->02.27% (33,882,912B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | | | | | ->02.27% (33,882,912B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | | | | | | | ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | ->01.26% (18,825,120B) 0x56AE79B: KSPSetUp_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->01.26% (18,825,120B) 0x5701286: KSPSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->01.26% (18,823,840B) 0x56FDBFB: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->01.26% (18,823,840B) 0x56F6F73: KSPSolve_Chebyshev (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->01.26% (18,823,840B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->01.26% (18,823,840B) 0x56818FA: PCMGMCycle_Private (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->01.26% (18,823,840B) 0x5682B76: PCApply_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->01.26% (18,823,840B) 0x5698127: PCApply (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->01.26% (18,823,840B) 0x570D0E0: KSPInitialResidual (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->01.26% (18,823,840B) 0x56AEC65: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->01.26% (18,823,840B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->01.26% (18,823,840B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | | | | | | ->01.26% (18,823,840B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | | | | | | | | | ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | | | ->00.00% (1,280B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | ->01.18% (17,629,952B) in 3 places, all below massif's threshold (01.00%) | | | | | | | ->00.07% (1,074,656B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | ->02.40% (35,765,296B) 0x4FF8EA3: VecCreate_Standard (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.40% (35,765,296B) 0x5008681: VecSetType (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.14% (32,000,528B) 0x53C0070: DMCreateGlobalVector_DA (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->02.14% (32,000,528B) 0x553D255: DMCreateGlobalVector (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->01.26% (18,823,840B) 0x53BFE5F: VecDuplicate_MPI_DA (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.26% (18,823,840B) 0x5007F69: VecDuplicate (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.26% (18,823,840B) 0x500832F: VecDuplicateVecs_Default (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.26% (18,823,840B) 0x50050D5: VecDuplicateVecs (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.26% (18,823,840B) 0x570A400: KSPCreateVecs (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.26% (18,823,840B) 0x56AE79B: KSPSetUp_GMRES (in 
/gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.26% (18,823,840B) 0x5701286: KSPSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.26% (18,823,840B) 0x56FDBFB: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.26% (18,823,840B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | | | ->01.26% (18,823,840B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | | | ->00.88% (13,176,688B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | ->00.25% (3,764,768B) in 1+ places, all below ms_print's threshold (01.00%) | | | ->03.79% (56,623,104B) 0x417587: CellPropertiesCreate(_p_DM*, _p_CellProperties**) (main_petsc36.cc:75) | | ->03.79% (56,623,104B) 0x404CBE: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:1860) | | ->03.79% (56,623,104B) 0x4048AA: main (main_petsc36.cc:2255) | | | ->03.77% (56,277,776B) 0x51F256F: MatPtAPSymbolic_SeqAIJ_SeqMAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.77% (56,277,776B) 0x51F192E: MatPtAP_SeqAIJ_SeqMAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.77% (56,277,776B) 0x509B68C: MatPtAP (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.77% (56,277,776B) 0x568798E: PCSetUp_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.77% (56,277,776B) 0x5693D68: PCSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.77% (56,277,776B) 0x57014F7: KSPSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.77% (56,277,776B) 0x56FDBFB: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.77% (56,277,776B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | ->03.77% (56,277,776B) 0x4048AA: main (main_petsc36.cc:2255) | | | ->02.27% (33,882,942B) 0x5307655: MatSOR_SeqAIJ_Inode (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.27% (33,882,942B) 0x508F2F7: MatSOR (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.27% (33,882,942B) 0x5603367: PCApply_SOR (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.27% (33,882,942B) 0x5698127: PCApply (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.27% (33,882,942B) 0x570D0E0: KSPInitialResidual (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.27% (33,882,942B) 0x56AEC65: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.27% (33,882,942B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.27% (33,882,942B) 0x56F6F73: KSPSolve_Chebyshev (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.27% (33,882,942B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.27% (33,882,942B) 0x56818FA: PCMGMCycle_Private (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.27% (33,882,942B) 0x5682B76: PCApply_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.27% (33,882,942B) 0x5698127: PCApply (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.27% (33,882,942B) 0x570D0E0: KSPInitialResidual (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.27% (33,882,942B) 0x56AEC65: 
KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.27% (33,882,942B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.27% (33,882,942B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | ->02.27% (33,882,942B) 0x4048AA: main (main_petsc36.cc:2255) | | | ->01.88% (28,138,888B) 0x51F2440: MatPtAPSymbolic_SeqAIJ_SeqMAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.88% (28,138,888B) 0x51F192E: MatPtAP_SeqAIJ_SeqMAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.88% (28,138,888B) 0x509B68C: MatPtAP (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.88% (28,138,888B) 0x568798E: PCSetUp_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.88% (28,138,888B) 0x5693D68: PCSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.88% (28,138,888B) 0x57014F7: KSPSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.88% (28,138,888B) 0x56FDBFB: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.88% (28,138,888B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | ->01.88% (28,138,888B) 0x4048AA: main (main_petsc36.cc:2255) | | | ->01.47% (22,007,763B) in 162 places, all below massif's threshold (01.00%) | ->00.56% (8,424,890B) in 1+ places, all below ms_print's threshold (01.00%) -------------------------------------------------------------------------------- n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) -------------------------------------------------------------------------------- 38 101,238,965,269 1,496,588,096 1,496,545,902 42,194 0 39 103,718,453,406 1,496,637,272 1,496,595,059 42,213 0 40 103,903,351,997 1,508,176,464 1,508,131,617 44,847 0 100.00% (1,508,131,617B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc. 
->99.44% (1,499,673,831B) 0x4F3E598: PetscMallocAlign (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | ->78.26% (1,180,246,040B) 0x52B1D78: MatSeqAIJSetPreallocation_SeqAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->78.26% (1,180,246,040B) 0x52B3D19: MatSeqAIJSetPreallocation (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->77.87% (1,174,436,836B) 0x53EEE0C: DMCreateMatrix_DA_3d_MPIAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->77.87% (1,174,436,836B) 0x53E4FB5: DMCreateMatrix_DA (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->77.87% (1,174,436,836B) 0x553AA24: DMCreateMatrix (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->38.94% (587,218,418B) 0x40614B: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2028) | | | | ->38.94% (587,218,418B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | | | ->38.94% (587,218,418B) 0x40616C: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2029) | | | ->38.94% (587,218,418B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | ->00.39% (5,809,204B) in 1+ places, all below ms_print's threshold (01.00%) | | | ->07.82% (117,942,704B) 0x4FF7CFF: VecCreate_Seq (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->07.82% (117,942,704B) 0x5008681: VecSetType (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->05.45% (82,177,408B) 0x4FF1C5F: VecDuplicate_Seq (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->05.45% (82,177,408B) 0x5007F69: VecDuplicate (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->05.38% (81,102,752B) 0x500832F: VecDuplicateVecs_Default (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->05.38% (81,102,752B) 0x50050D5: VecDuplicateVecs (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->05.38% (81,102,752B) 0x570A400: KSPCreateVecs (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->02.79% (42,147,680B) 0x56AF984: KSPGMRESGetNewVectors (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->02.79% (42,147,680B) 0x56AEFB0: KSPGMRESCycle (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->02.79% (42,147,680B) 0x56AEC7A: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->02.79% (42,147,680B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->02.79% (42,147,680B) 0x56F6F73: KSPSolve_Chebyshev (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->02.79% (42,147,680B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->02.79% (42,147,680B) 0x56818FA: PCMGMCycle_Private (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->02.50% (37,647,680B) 0x5682B76: PCApply_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->02.50% (37,647,680B) 0x5698127: PCApply (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->02.50% (37,647,680B) 0x570D0E0: KSPInitialResidual (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->02.50% (37,647,680B) 0x56AEC65: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->02.50% 
(37,647,680B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->02.50% (37,647,680B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | | | | | | ->02.50% (37,647,680B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | | | | | | | | | ->00.30% (4,500,000B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | | | ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | ->01.41% (21,325,120B) 0x56AE79B: KSPSetUp_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->01.41% (21,325,120B) 0x5701286: KSPSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->01.41% (21,323,840B) 0x56FDBFB: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->01.41% (21,323,840B) 0x56F6F73: KSPSolve_Chebyshev (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->01.41% (21,323,840B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->01.41% (21,323,840B) 0x56818FA: PCMGMCycle_Private (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->01.25% (18,823,840B) 0x5682B76: PCApply_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | | ->01.25% (18,823,840B) 0x5698127: PCApply (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | | ->01.25% (18,823,840B) 0x570D0E0: KSPInitialResidual (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | | ->01.25% (18,823,840B) 0x56AEC65: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | | ->01.25% (18,823,840B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | | ->01.25% (18,823,840B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | | | | | | | ->01.25% (18,823,840B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | | | | | | | | | | | ->00.17% (2,500,000B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | | | | | ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | | | ->00.00% (1,280B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | ->01.17% (17,629,952B) in 3 places, all below massif's threshold (01.00%) | | | | | | | ->00.07% (1,074,656B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | ->02.37% (35,765,296B) 0x4FF8EA3: VecCreate_Standard (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.37% (35,765,296B) 0x5008681: VecSetType (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.12% (32,000,528B) 0x53C0070: DMCreateGlobalVector_DA (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->02.12% (32,000,528B) 0x553D255: DMCreateGlobalVector (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->01.25% (18,823,840B) 0x53BFE5F: VecDuplicate_MPI_DA (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.25% (18,823,840B) 0x5007F69: VecDuplicate (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.25% (18,823,840B) 0x500832F: VecDuplicateVecs_Default (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.25% (18,823,840B) 0x50050D5: VecDuplicateVecs (in 
/gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.25% (18,823,840B) 0x570A400: KSPCreateVecs (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.25% (18,823,840B) 0x56AE79B: KSPSetUp_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.25% (18,823,840B) 0x5701286: KSPSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.25% (18,823,840B) 0x56FDBFB: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.25% (18,823,840B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | | | ->01.25% (18,823,840B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | | | ->00.87% (13,176,688B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | ->00.25% (3,764,768B) in 1+ places, all below ms_print's threshold (01.00%) | | | ->03.75% (56,623,104B) 0x417587: CellPropertiesCreate(_p_DM*, _p_CellProperties**) (main_petsc36.cc:75) | | ->03.75% (56,623,104B) 0x404CBE: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:1860) | | ->03.75% (56,623,104B) 0x4048AA: main (main_petsc36.cc:2255) | | | ->03.73% (56,277,776B) 0x51F256F: MatPtAPSymbolic_SeqAIJ_SeqMAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.73% (56,277,776B) 0x51F192E: MatPtAP_SeqAIJ_SeqMAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.73% (56,277,776B) 0x509B68C: MatPtAP (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.73% (56,277,776B) 0x568798E: PCSetUp_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.73% (56,277,776B) 0x5693D68: PCSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.73% (56,277,776B) 0x57014F7: KSPSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.73% (56,277,776B) 0x56FDBFB: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.73% (56,277,776B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | ->03.73% (56,277,776B) 0x4048AA: main (main_petsc36.cc:2255) | | | ->02.54% (38,382,972B) 0x5307655: MatSOR_SeqAIJ_Inode (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.54% (38,382,972B) 0x508F2F7: MatSOR (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.54% (38,382,972B) 0x5603367: PCApply_SOR (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.54% (38,382,972B) 0x5698127: PCApply (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.54% (38,382,972B) 0x570D0E0: KSPInitialResidual (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.54% (38,382,972B) 0x56AEC65: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.54% (38,382,972B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.54% (38,382,972B) 0x56F6F73: KSPSolve_Chebyshev (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.54% (38,382,972B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.54% (38,382,972B) 0x56818FA: PCMGMCycle_Private (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.25% (33,882,942B) 0x5682B76: PCApply_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->02.25% (33,882,942B) 
0x5698127: PCApply (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->02.25% (33,882,942B) 0x570D0E0: KSPInitialResidual (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->02.25% (33,882,942B) 0x56AEC65: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->02.25% (33,882,942B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->02.25% (33,882,942B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | | ->02.25% (33,882,942B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | ->00.30% (4,500,030B) in 1+ places, all below ms_print's threshold (01.00%) | | | ->01.87% (28,138,888B) 0x51F2440: MatPtAPSymbolic_SeqAIJ_SeqMAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.87% (28,138,888B) 0x51F192E: MatPtAP_SeqAIJ_SeqMAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.87% (28,138,888B) 0x509B68C: MatPtAP (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.87% (28,138,888B) 0x568798E: PCSetUp_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.87% (28,138,888B) 0x5693D68: PCSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.87% (28,138,888B) 0x57014F7: KSPSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.87% (28,138,888B) 0x56FDBFB: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.87% (28,138,888B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | ->01.87% (28,138,888B) 0x4048AA: main (main_petsc36.cc:2255) | | | ->01.46% (22,062,347B) in 164 places, all below massif's threshold (01.00%) | ->00.56% (8,457,786B) in 1+ places, all below ms_print's threshold (01.00%) -------------------------------------------------------------------------------- n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) -------------------------------------------------------------------------------- 41 104,434,019,902 1,508,678,312 1,508,633,412 44,900 0 42 104,751,975,500 1,508,694,584 1,508,649,673 44,911 0 43 104,877,346,163 1,510,547,792 1,510,499,343 48,449 0 100.00% (1,510,499,343B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc. 
->99.44% (1,502,041,557B) 0x4F3E598: PetscMallocAlign (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | ->78.13% (1,180,246,040B) 0x52B1D78: MatSeqAIJSetPreallocation_SeqAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->78.13% (1,180,246,040B) 0x52B3D19: MatSeqAIJSetPreallocation (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->77.75% (1,174,436,836B) 0x53EEE0C: DMCreateMatrix_DA_3d_MPIAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->77.75% (1,174,436,836B) 0x53E4FB5: DMCreateMatrix_DA (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->77.75% (1,174,436,836B) 0x553AA24: DMCreateMatrix (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->38.87% (587,218,418B) 0x40614B: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2028) | | | | ->38.87% (587,218,418B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | | | ->38.87% (587,218,418B) 0x40616C: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2029) | | | ->38.87% (587,218,418B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | ->00.38% (5,809,204B) in 1+ places, all below ms_print's threshold (01.00%) | | | ->07.91% (119,527,984B) 0x4FF7CFF: VecCreate_Seq (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->07.91% (119,527,984B) 0x5008681: VecSetType (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->05.55% (83,762,688B) 0x4FF1C5F: VecDuplicate_Seq (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->05.55% (83,762,688B) 0x5007F69: VecDuplicate (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->05.47% (82,688,032B) 0x500832F: VecDuplicateVecs_Default (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->05.47% (82,688,032B) 0x50050D5: VecDuplicateVecs (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->05.47% (82,688,032B) 0x570A400: KSPCreateVecs (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->02.87% (43,371,200B) 0x56AF984: KSPGMRESGetNewVectors (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->02.87% (43,371,200B) 0x56AEFB0: KSPGMRESCycle (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->02.87% (43,371,200B) 0x56AEC7A: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->02.87% (43,371,200B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->02.87% (43,371,200B) 0x56F6F73: KSPSolve_Chebyshev (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->02.87% (43,371,200B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->02.87% (43,371,200B) 0x56818FA: PCMGMCycle_Private (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->02.54% (38,350,720B) 0x5682B76: PCApply_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->02.54% (38,350,720B) 0x5698127: PCApply (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->02.49% (37,647,680B) 0x570D0E0: KSPInitialResidual (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | | ->02.49% (37,647,680B) 0x56AEC65: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | | ->02.49% 
(37,647,680B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | | ->02.49% (37,647,680B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | | | | | | | ->02.49% (37,647,680B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | | | | | | | | | | | ->00.05% (703,040B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | | | | | ->00.33% (5,020,480B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | | | ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | ->01.44% (21,686,880B) 0x56AE79B: KSPSetUp_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->01.44% (21,686,880B) 0x5701286: KSPSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->01.44% (21,685,600B) 0x56FDBFB: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->01.44% (21,685,600B) 0x56F6F73: KSPSolve_Chebyshev (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->01.44% (21,685,600B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->01.44% (21,685,600B) 0x56818FA: PCMGMCycle_Private (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->01.27% (19,175,360B) 0x5682B76: PCApply_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | | ->01.27% (19,175,360B) 0x5698127: PCApply (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | | ->01.25% (18,823,840B) 0x570D0E0: KSPInitialResidual (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | | | ->01.25% (18,823,840B) 0x56AEC65: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | | | ->01.25% (18,823,840B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | | | ->01.25% (18,823,840B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | | | | | | | | ->01.25% (18,823,840B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | | | | | | | | | | | | | ->00.02% (351,520B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | | | | | | | ->00.17% (2,510,240B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | | | | | ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | | | ->00.00% (1,280B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | ->01.17% (17,629,952B) in 3 places, all below massif's threshold (01.00%) | | | | | | | ->00.07% (1,074,656B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | ->02.37% (35,765,296B) 0x4FF8EA3: VecCreate_Standard (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.37% (35,765,296B) 0x5008681: VecSetType (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.12% (32,000,528B) 0x53C0070: DMCreateGlobalVector_DA (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->02.12% (32,000,528B) 0x553D255: DMCreateGlobalVector (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->01.25% (18,823,840B) 0x53BFE5F: VecDuplicate_MPI_DA (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.25% (18,823,840B) 0x5007F69: VecDuplicate (in 
/gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.25% (18,823,840B) 0x500832F: VecDuplicateVecs_Default (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.25% (18,823,840B) 0x50050D5: VecDuplicateVecs (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.25% (18,823,840B) 0x570A400: KSPCreateVecs (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.25% (18,823,840B) 0x56AE79B: KSPSetUp_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.25% (18,823,840B) 0x5701286: KSPSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.25% (18,823,840B) 0x56FDBFB: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.25% (18,823,840B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | | | ->01.25% (18,823,840B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | | | ->00.87% (13,176,688B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | ->00.25% (3,764,768B) in 1+ places, all below ms_print's threshold (01.00%) | | | ->03.75% (56,623,104B) 0x417587: CellPropertiesCreate(_p_DM*, _p_CellProperties**) (main_petsc36.cc:75) | | ->03.75% (56,623,104B) 0x404CBE: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:1860) | | ->03.75% (56,623,104B) 0x4048AA: main (main_petsc36.cc:2255) | | | ->03.73% (56,277,776B) 0x51F256F: MatPtAPSymbolic_SeqAIJ_SeqMAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.73% (56,277,776B) 0x51F192E: MatPtAP_SeqAIJ_SeqMAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.73% (56,277,776B) 0x509B68C: MatPtAP (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.73% (56,277,776B) 0x568798E: PCSetUp_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.73% (56,277,776B) 0x5693D68: PCSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.73% (56,277,776B) 0x57014F7: KSPSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.73% (56,277,776B) 0x56FDBFB: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.73% (56,277,776B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | ->03.73% (56,277,776B) 0x4048AA: main (main_petsc36.cc:2255) | | | ->02.58% (39,034,200B) 0x5307655: MatSOR_SeqAIJ_Inode (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.58% (39,034,200B) 0x508F2F7: MatSOR (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.58% (39,034,200B) 0x5603367: PCApply_SOR (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.58% (39,034,200B) 0x5698127: PCApply (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.58% (39,034,200B) 0x570D0E0: KSPInitialResidual (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.58% (39,034,200B) 0x56AEC65: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.58% (39,034,200B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.58% (39,034,200B) 0x56F6F73: KSPSolve_Chebyshev (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.58% (39,034,200B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.58% 
(39,034,200B) 0x56818FA: PCMGMCycle_Private (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.28% (34,515,708B) 0x5682B76: PCApply_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->02.28% (34,515,708B) 0x5698127: PCApply (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->02.24% (33,882,942B) 0x570D0E0: KSPInitialResidual (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->02.24% (33,882,942B) 0x56AEC65: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->02.24% (33,882,942B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->02.24% (33,882,942B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | | | ->02.24% (33,882,942B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | | | ->00.04% (632,766B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | ->00.30% (4,518,492B) in 1+ places, all below ms_print's threshold (01.00%) | | | ->01.86% (28,138,888B) 0x51F2440: MatPtAPSymbolic_SeqAIJ_SeqMAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.86% (28,138,888B) 0x51F192E: MatPtAP_SeqAIJ_SeqMAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.86% (28,138,888B) 0x509B68C: MatPtAP (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.86% (28,138,888B) 0x568798E: PCSetUp_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.86% (28,138,888B) 0x5693D68: PCSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.86% (28,138,888B) 0x57014F7: KSPSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.86% (28,138,888B) 0x56FDBFB: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.86% (28,138,888B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | ->01.86% (28,138,888B) 0x4048AA: main (main_petsc36.cc:2255) | | | ->01.47% (22,193,565B) in 189 places, all below massif's threshold (01.00%) | ->00.56% (8,457,786B) in 1+ places, all below ms_print's threshold (01.00%) -------------------------------------------------------------------------------- n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) -------------------------------------------------------------------------------- 44 114,000,025,261 1,510,549,016 1,510,500,559 48,457 0 45 120,086,777,602 1,510,549,104 1,510,500,639 48,465 0 46 120,089,651,044 1,544,450,152 1,544,400,843 49,309 0 100.00% (1,544,400,843B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc. 
->99.45% (1,535,943,053B) PetscMallocAlign (in libpetsc.so.3.6.2)
| ->76.42% (1,180,246,040B) MatSeqAIJSetPreallocation_SeqAIJ <- MatSeqAIJSetPreallocation
| | ->76.04% (1,174,436,836B) DMCreateMatrix_DA_3d_MPIAIJ <- DMCreateMatrix_DA <- DMCreateMatrix
| | | ->38.02% (587,218,418B) solve_stokes_3d_coupled (main_petsc36.cc:2028) <- main (main_petsc36.cc:2255)
| | | ->38.02% (587,218,418B) solve_stokes_3d_coupled (main_petsc36.cc:2029) <- main (main_petsc36.cc:2255)
| | ->00.38% (5,809,204B) in 1+ places, all below ms_print's threshold (01.00%)
| ->09.93% (153,410,896B) VecCreate_Seq <- VecSetType
| | ->05.42% (83,762,688B) VecDuplicate_Seq <- VecDuplicate <- VecDuplicateVecs_Default <- VecDuplicateVecs <- KSPCreateVecs
| | | ->02.81% (43,371,200B) KSPGMRESGetNewVectors <- KSPGMRESCycle <- KSPSolve_GMRES <- KSPSolve <- KSPSolve_Chebyshev <- KSPSolve <- PCMGMCycle_Private <- PCApply_MG <- PCApply
| | | ->01.40% (21,686,880B) KSPSetUp_GMRES <- KSPSetUp <- KSPSolve <- KSPSolve_Chebyshev <- KSPSolve <- PCMGMCycle_Private <- PCApply_MG <- PCApply
| | | ->01.14% (17,629,952B) in 3 places, all below massif's threshold (01.00%)
| | ->04.51% (69,648,208B) VecCreate_Standard <- VecSetType
| | | ->04.27% (65,883,440B) DMCreateGlobalVector_DA <- DMCreateGlobalVector
| | | | ->03.41% (52,706,752B) VecDuplicate_MPI_DA <- VecDuplicate <- VecDuplicateVecs_Default <- VecDuplicateVecs <- KSPCreateVecs
| | | | | ->02.19% (33,882,912B) KSPGMRESGetNewVectors <- KSPGMRESCycle <- KSPSolve_GMRES <- KSPSolve <- solve_stokes_3d_coupled (main_petsc36.cc:2133) <- main (main_petsc36.cc:2255)
| | | | | ->01.22% (18,823,840B) KSPSetUp_GMRES <- KSPSetUp <- KSPSolve <- solve_stokes_3d_coupled (main_petsc36.cc:2133) <- main (main_petsc36.cc:2255)
| ->03.67% (56,623,104B) CellPropertiesCreate (main_petsc36.cc:75) <- solve_stokes_3d_coupled (main_petsc36.cc:1860) <- main (main_petsc36.cc:2255)
| ->03.64% (56,277,776B) MatPtAPSymbolic_SeqAIJ_SeqMAIJ <- MatPtAP_SeqAIJ_SeqMAIJ <- MatPtAP <- PCSetUp_MG <- PCSetUp <- KSPSetUp <- KSPSolve <- solve_stokes_3d_coupled (main_petsc36.cc:2133) <- main (main_petsc36.cc:2255)
| ->02.53% (39,034,200B) MatSOR_SeqAIJ_Inode <- MatSOR <- PCApply_SOR <- PCApply <- KSPInitialResidual <- KSPSolve_GMRES <- KSPSolve <- KSPSolve_Chebyshev <- KSPSolve <- PCMGMCycle_Private
| ->01.82% (28,138,888B) MatPtAPSymbolic_SeqAIJ_SeqMAIJ (second call site) <- MatPtAP_SeqAIJ_SeqMAIJ <- MatPtAP <- PCSetUp_MG <- PCSetUp <- KSPSetUp <- KSPSolve <- solve_stokes_3d_coupled (main_petsc36.cc:2133) <- main (main_petsc36.cc:2255)
| ->01.44% (22,212,149B) in 189 places, all below massif's threshold (01.00%)
->00.55% (8,457,790B) in 1+ places, all below ms_print's threshold (01.00%)
--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
 47 120,090,364,653    1,548,217,088    1,548,167,663        49,425            0
 48 180,882,674,722    1,548,217,448    1,548,168,039        49,409            0
 49 180,883,391,417    1,555,750,808    1,555,701,207        49,601            0
 50 180,883,754,195    1,563,284,344    1,563,234,583        49,761            0
100.00% (1,563,234,583B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->99.46% (1,554,776,793B) PetscMallocAlign
| ->75.50% (1,180,246,040B) MatSeqAIJSetPreallocation_SeqAIJ <- MatSeqAIJSetPreallocation <- DMCreateMatrix_DA_3d_MPIAIJ <- DMCreateMatrix (75.13%, 1,174,436,836B; 37.56% / 587,218,418B each from main_petsc36.cc:2028 and :2029)
| ->11.02% (172,234,736B) VecCreate_Seq <- VecSetType
| | ->05.66% (88,472,048B) VecCreate_Standard <- DMCreateGlobalVector_DA <- VecDuplicate_MPI_DA <- VecDuplicateVecs <- KSPCreateVecs (KSPGMRESGetNewVectors 03.37%, 52,706,752B; KSPSetUp_GMRES 01.20%, 18,823,840B)
| | ->05.36% (83,762,688B) VecDuplicate_Seq <- VecDuplicateVecs <- KSPCreateVecs (KSPGMRESGetNewVectors 02.77%, 43,371,200B; KSPSetUp_GMRES 01.39%, 21,686,880B)
| ->03.62% (56,623,104B) CellPropertiesCreate (main_petsc36.cc:75) <- solve_stokes_3d_coupled (main_petsc36.cc:1860)
| ->03.60% (56,277,776B) MatPtAPSymbolic_SeqAIJ_SeqMAIJ <- MatPtAP <- PCSetUp_MG <- PCSetUp <- KSPSetUp <- KSPSolve (main_petsc36.cc:2133)
| ->02.50% (39,034,200B) MatSOR_SeqAIJ_Inode <- MatSOR <- PCApply_SOR <- PCApply
| ->01.80% (28,138,888B) MatPtAPSymbolic_SeqAIJ_SeqMAIJ (second call site) <- MatPtAP <- PCSetUp_MG
| ->01.42% (22,222,049B) in 189 places, all below massif's threshold (01.00%)
->00.54% (8,457,790B) in 1+ places, all below ms_print's threshold (01.00%)
--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
 51 180,884,467,808    1,567,051,280    1,567,001,403        49,877            0
 52 180,885,185,402    1,574,584,928    1,574,534,867        50,061            0
 53 180,885,548,180    1,582,118,464    1,582,068,243        50,221            0
100.00% (1,582,068,243B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->99.46% (1,573,610,453B) PetscMallocAlign
| ->74.60% (1,180,246,040B) MatSeqAIJSetPreallocation_SeqAIJ <- MatSeqAIJSetPreallocation <- DMCreateMatrix_DA_3d_MPIAIJ <- DMCreateMatrix (74.23%, 1,174,436,836B; 37.12% / 587,218,418B each from main_petsc36.cc:2028 and :2029)
| ->12.08% (191,058,576B) VecCreate_Seq <- VecSetType
| | ->06.78% (107,295,888B) VecCreate_Standard <- DMCreateGlobalVector_DA <- VecDuplicate_MPI_DA <- VecDuplicateVecs <- KSPCreateVecs (KSPGMRESGetNewVectors 04.52%, 71,530,592B; KSPSetUp_GMRES 01.19%, 18,823,840B)
| | ->05.29% (83,762,688B) VecDuplicate_Seq <- VecDuplicateVecs <- KSPCreateVecs (KSPGMRESGetNewVectors 02.74%, 43,371,200B; KSPSetUp_GMRES 01.37%, 21,686,880B)
| ->03.58% (56,623,104B) CellPropertiesCreate (main_petsc36.cc:75) <- solve_stokes_3d_coupled (main_petsc36.cc:1860)
| ->03.56% (56,277,776B) MatPtAPSymbolic_SeqAIJ_SeqMAIJ <- MatPtAP <- PCSetUp_MG <- PCSetUp <- KSPSetUp <- KSPSolve (main_petsc36.cc:2133)
| ->02.47% (39,034,200B) MatSOR_SeqAIJ_Inode <- MatSOR <- PCApply_SOR <- PCApply
| ->01.78% (28,138,888B) MatPtAPSymbolic_SeqAIJ_SeqMAIJ (second call site) <- MatPtAP <- PCSetUp_MG
| ->01.41% (22,231,869B) in 189 places, all below massif's threshold (01.00%)
->00.53% (8,457,790B) in 1+ places, all below ms_print's threshold (01.00%)
--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
 54 180,886,261,793    1,585,885,400    1,585,835,063        50,337            0
 55 241,478,135,876    1,585,885,776    1,585,835,439        50,337            0
 56 241,478,852,571    1,593,419,136    1,593,368,607        50,529            0
 57 241,479,215,349    1,600,952,672    1,600,901,983        50,689            0
100.00% (1,600,901,983B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->99.47% (1,592,444,193B) PetscMallocAlign
| ->73.72% (1,180,246,040B) MatSeqAIJSetPreallocation_SeqAIJ <- MatSeqAIJSetPreallocation <- DMCreateMatrix_DA_3d_MPIAIJ <- DMCreateMatrix (73.36%, 1,174,436,836B; 36.68% / 587,218,418B each from main_petsc36.cc:2028 and :2029)
| ->13.11% (209,882,416B) VecCreate_Seq <- VecSetType
| | ->07.88% (126,119,728B) VecCreate_Standard <- DMCreateGlobalVector_DA <- VecDuplicate_MPI_DA <- VecDuplicateVecs <- KSPCreateVecs (KSPGMRESGetNewVectors 05.64%, 90,354,432B; KSPSetUp_GMRES 01.18%, 18,823,840B)
| | ->05.23% (83,762,688B) VecDuplicate_Seq <- VecDuplicateVecs <- KSPCreateVecs (KSPGMRESGetNewVectors 02.71%, 43,371,200B; KSPSetUp_GMRES 01.35%, 21,686,880B)
| ->03.54% (56,623,104B) CellPropertiesCreate (main_petsc36.cc:75) <- solve_stokes_3d_coupled (main_petsc36.cc:1860)
| ->03.52% (56,277,776B) MatPtAPSymbolic_SeqAIJ_SeqMAIJ <- MatPtAP <- PCSetUp_MG <- PCSetUp <- KSPSetUp <- KSPSolve (main_petsc36.cc:2133)
| ->02.44% (39,034,200B) MatSOR_SeqAIJ_Inode <- MatSOR <- PCApply_SOR <- PCApply
| ->01.76% (28,138,888B) MatPtAPSymbolic_SeqAIJ_SeqMAIJ (second call site) <- MatPtAP <- PCSetUp_MG
| ->01.39% (22,241,769B) in 189 places, all below massif's threshold (01.00%)
->00.53% (8,457,790B) in 1+ places, all below ms_print's threshold (01.00%)
--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
 58 241,479,928,962    1,604,719,608    1,604,668,803        50,805            0
 59 241,480,646,554    1,612,253,256    1,612,202,267        50,989            0
 60 241,481,009,330    1,619,786,792    1,619,735,643        51,149            0
100.00% (1,619,735,643B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->99.47% (1,611,277,853B) PetscMallocAlign
| ->72.86% (1,180,246,040B) MatSeqAIJSetPreallocation_SeqAIJ <- MatSeqAIJSetPreallocation <- DMCreateMatrix_DA_3d_MPIAIJ <- DMCreateMatrix (72.51%, 1,174,436,836B; 36.25% / 587,218,418B each from main_petsc36.cc:2028 and :2029)
| ->14.12% (228,706,256B) VecCreate_Seq <- VecSetType
| | ->08.95% (144,943,568B) VecCreate_Standard <- DMCreateGlobalVector_DA <- VecDuplicate_MPI_DA <- VecDuplicateVecs <- KSPCreateVecs (KSPGMRESGetNewVectors 06.74%, 109,178,272B; KSPSetUp_GMRES 01.16%, 18,823,840B)
| | ->05.17% (83,762,688B) VecDuplicate_Seq <- VecDuplicateVecs <- KSPCreateVecs (KSPGMRESGetNewVectors 02.68%, 43,371,200B; KSPSetUp_GMRES 01.34%, 21,686,880B)
->01.34% (21,685,600B) 0x56F6F73: KSPSolve_Chebyshev (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->01.34% (21,685,600B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->01.34% (21,685,600B) 0x56818FA: PCMGMCycle_Private (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->01.18% (19,175,360B) 0x5682B76: PCApply_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->01.18% (19,175,360B) 0x5698127: PCApply (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->01.16% (18,823,840B) 0x570D0E0: KSPInitialResidual (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | | ->01.16% (18,823,840B) 0x56AEC65: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | | ->01.16% (18,823,840B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | | ->01.16% (18,823,840B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | | | | | | | ->01.16% (18,823,840B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | | | | | | | | | | | ->00.02% (351,520B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | | | | | ->00.15% (2,510,240B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | | | ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | ->00.00% (1,280B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | ->01.09% (17,629,952B) in 3 places, all below massif's threshold (01.00%) | | | | | ->00.07% (1,074,656B) in 1+ places, all below ms_print's threshold (01.00%) | | | ->03.50% (56,623,104B) 0x417587: CellPropertiesCreate(_p_DM*, _p_CellProperties**) (main_petsc36.cc:75) | | ->03.50% (56,623,104B) 0x404CBE: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:1860) | | ->03.50% (56,623,104B) 0x4048AA: main (main_petsc36.cc:2255) | | | ->03.47% (56,277,776B) 0x51F256F: MatPtAPSymbolic_SeqAIJ_SeqMAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.47% (56,277,776B) 0x51F192E: MatPtAP_SeqAIJ_SeqMAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.47% (56,277,776B) 0x509B68C: MatPtAP (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.47% (56,277,776B) 0x568798E: PCSetUp_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.47% (56,277,776B) 0x5693D68: PCSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.47% (56,277,776B) 0x57014F7: KSPSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.47% (56,277,776B) 0x56FDBFB: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.47% (56,277,776B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | ->03.47% (56,277,776B) 0x4048AA: main (main_petsc36.cc:2255) | | | ->02.41% (39,034,200B) 0x5307655: MatSOR_SeqAIJ_Inode (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.41% (39,034,200B) 0x508F2F7: MatSOR (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.41% (39,034,200B) 0x5603367: PCApply_SOR (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.41% (39,034,200B) 0x5698127: PCApply (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.41% 
(39,034,200B) 0x570D0E0: KSPInitialResidual (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.41% (39,034,200B) 0x56AEC65: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.41% (39,034,200B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.41% (39,034,200B) 0x56F6F73: KSPSolve_Chebyshev (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.41% (39,034,200B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.41% (39,034,200B) 0x56818FA: PCMGMCycle_Private (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.13% (34,515,708B) 0x5682B76: PCApply_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->02.13% (34,515,708B) 0x5698127: PCApply (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->02.09% (33,882,942B) 0x570D0E0: KSPInitialResidual (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->02.09% (33,882,942B) 0x56AEC65: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->02.09% (33,882,942B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->02.09% (33,882,942B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | | | ->02.09% (33,882,942B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | | | ->00.04% (632,766B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | ->00.28% (4,518,492B) in 1+ places, all below ms_print's threshold (01.00%) | | | ->01.74% (28,138,888B) 0x51F2440: MatPtAPSymbolic_SeqAIJ_SeqMAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.74% (28,138,888B) 0x51F192E: MatPtAP_SeqAIJ_SeqMAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.74% (28,138,888B) 0x509B68C: MatPtAP (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.74% (28,138,888B) 0x568798E: PCSetUp_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.74% (28,138,888B) 0x5693D68: PCSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.74% (28,138,888B) 0x57014F7: KSPSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.74% (28,138,888B) 0x56FDBFB: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.74% (28,138,888B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | ->01.74% (28,138,888B) 0x4048AA: main (main_petsc36.cc:2255) | | | ->01.37% (22,251,589B) in 189 places, all below massif's threshold (01.00%) | ->00.52% (8,457,790B) in 1+ places, all below ms_print's threshold (01.00%) -------------------------------------------------------------------------------- n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) -------------------------------------------------------------------------------- 61 241,481,722,939 1,623,553,728 1,623,502,463 51,265 0 62 302,564,355,117 1,623,554,104 1,623,502,839 51,265 0 63 302,565,071,808 1,631,087,464 1,631,036,007 51,457 0 64 302,565,434,584 1,638,621,000 1,638,569,383 51,617 0 100.00% (1,638,569,383B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc. 
->99.48% (1,630,111,593B) 0x4F3E598: PetscMallocAlign (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | ->72.03% (1,180,246,040B) 0x52B1D78: MatSeqAIJSetPreallocation_SeqAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->72.03% (1,180,246,040B) 0x52B3D19: MatSeqAIJSetPreallocation (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->71.67% (1,174,436,836B) 0x53EEE0C: DMCreateMatrix_DA_3d_MPIAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->71.67% (1,174,436,836B) 0x53E4FB5: DMCreateMatrix_DA (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->71.67% (1,174,436,836B) 0x553AA24: DMCreateMatrix (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->35.84% (587,218,418B) 0x40614B: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2028) | | | | ->35.84% (587,218,418B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | | | ->35.84% (587,218,418B) 0x40616C: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2029) | | | ->35.84% (587,218,418B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | ->00.35% (5,809,204B) in 1+ places, all below ms_print's threshold (01.00%) | | | ->15.11% (247,530,096B) 0x4FF7CFF: VecCreate_Seq (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->15.11% (247,530,096B) 0x5008681: VecSetType (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->09.99% (163,767,408B) 0x4FF8EA3: VecCreate_Standard (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->09.99% (163,767,408B) 0x5008681: VecSetType (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->09.76% (160,002,640B) 0x53C0070: DMCreateGlobalVector_DA (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->09.76% (160,002,640B) 0x553D255: DMCreateGlobalVector (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->08.96% (146,825,952B) 0x53BFE5F: VecDuplicate_MPI_DA (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->08.96% (146,825,952B) 0x5007F69: VecDuplicate (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->08.96% (146,825,952B) 0x500832F: VecDuplicateVecs_Default (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->08.96% (146,825,952B) 0x50050D5: VecDuplicateVecs (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->08.96% (146,825,952B) 0x570A400: KSPCreateVecs (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->07.81% (128,002,112B) 0x56AF984: KSPGMRESGetNewVectors (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->07.81% (128,002,112B) 0x56AEFB0: KSPGMRESCycle (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->07.81% (128,002,112B) 0x56AEC7A: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->07.81% (128,002,112B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->07.81% (128,002,112B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | | | | | ->07.81% (128,002,112B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | | | | | | | ->01.15% (18,823,840B) 0x56AE79B: KSPSetUp_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->01.15% (18,823,840B) 0x5701286: KSPSetUp (in 
/gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->01.15% (18,823,840B) 0x56FDBFB: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->01.15% (18,823,840B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | | | | ->01.15% (18,823,840B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | | | | | ->00.80% (13,176,688B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | ->00.23% (3,764,768B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | ->05.11% (83,762,688B) 0x4FF1C5F: VecDuplicate_Seq (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->05.11% (83,762,688B) 0x5007F69: VecDuplicate (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->05.05% (82,688,032B) 0x500832F: VecDuplicateVecs_Default (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->05.05% (82,688,032B) 0x50050D5: VecDuplicateVecs (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->05.05% (82,688,032B) 0x570A400: KSPCreateVecs (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->02.65% (43,371,200B) 0x56AF984: KSPGMRESGetNewVectors (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->02.65% (43,371,200B) 0x56AEFB0: KSPGMRESCycle (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->02.65% (43,371,200B) 0x56AEC7A: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->02.65% (43,371,200B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->02.65% (43,371,200B) 0x56F6F73: KSPSolve_Chebyshev (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->02.65% (43,371,200B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->02.65% (43,371,200B) 0x56818FA: PCMGMCycle_Private (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->02.34% (38,350,720B) 0x5682B76: PCApply_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->02.34% (38,350,720B) 0x5698127: PCApply (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->02.30% (37,647,680B) 0x570D0E0: KSPInitialResidual (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->02.30% (37,647,680B) 0x56AEC65: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->02.30% (37,647,680B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->02.30% (37,647,680B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | | | | | | ->02.30% (37,647,680B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | | | | | | | | | ->00.04% (703,040B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | | | ->00.31% (5,020,480B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | ->01.32% (21,686,880B) 0x56AE79B: KSPSetUp_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.32% (21,686,880B) 0x5701286: KSPSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.32% (21,685,600B) 0x56FDBFB: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | 
->01.32% (21,685,600B) 0x56F6F73: KSPSolve_Chebyshev (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->01.32% (21,685,600B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->01.32% (21,685,600B) 0x56818FA: PCMGMCycle_Private (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->01.17% (19,175,360B) 0x5682B76: PCApply_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->01.17% (19,175,360B) 0x5698127: PCApply (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->01.15% (18,823,840B) 0x570D0E0: KSPInitialResidual (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | | ->01.15% (18,823,840B) 0x56AEC65: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | | ->01.15% (18,823,840B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | | ->01.15% (18,823,840B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | | | | | | | ->01.15% (18,823,840B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | | | | | | | | | | | ->00.02% (351,520B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | | | | | ->00.15% (2,510,240B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | | | ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | ->00.00% (1,280B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | ->01.08% (17,629,952B) in 3 places, all below massif's threshold (01.00%) | | | | | ->00.07% (1,074,656B) in 1+ places, all below ms_print's threshold (01.00%) | | | ->03.46% (56,623,104B) 0x417587: CellPropertiesCreate(_p_DM*, _p_CellProperties**) (main_petsc36.cc:75) | | ->03.46% (56,623,104B) 0x404CBE: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:1860) | | ->03.46% (56,623,104B) 0x4048AA: main (main_petsc36.cc:2255) | | | ->03.43% (56,277,776B) 0x51F256F: MatPtAPSymbolic_SeqAIJ_SeqMAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.43% (56,277,776B) 0x51F192E: MatPtAP_SeqAIJ_SeqMAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.43% (56,277,776B) 0x509B68C: MatPtAP (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.43% (56,277,776B) 0x568798E: PCSetUp_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.43% (56,277,776B) 0x5693D68: PCSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.43% (56,277,776B) 0x57014F7: KSPSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.43% (56,277,776B) 0x56FDBFB: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.43% (56,277,776B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | ->03.43% (56,277,776B) 0x4048AA: main (main_petsc36.cc:2255) | | | ->02.38% (39,034,200B) 0x5307655: MatSOR_SeqAIJ_Inode (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.38% (39,034,200B) 0x508F2F7: MatSOR (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.38% (39,034,200B) 0x5603367: PCApply_SOR (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.38% (39,034,200B) 0x5698127: PCApply (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.38% 
(39,034,200B) 0x570D0E0: KSPInitialResidual (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.38% (39,034,200B) 0x56AEC65: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.38% (39,034,200B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.38% (39,034,200B) 0x56F6F73: KSPSolve_Chebyshev (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.38% (39,034,200B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.38% (39,034,200B) 0x56818FA: PCMGMCycle_Private (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.11% (34,515,708B) 0x5682B76: PCApply_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->02.11% (34,515,708B) 0x5698127: PCApply (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->02.07% (33,882,942B) 0x570D0E0: KSPInitialResidual (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->02.07% (33,882,942B) 0x56AEC65: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->02.07% (33,882,942B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->02.07% (33,882,942B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | | | ->02.07% (33,882,942B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | | | ->00.04% (632,766B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | ->00.28% (4,518,492B) in 1+ places, all below ms_print's threshold (01.00%) | | | ->01.72% (28,138,888B) 0x51F2440: MatPtAPSymbolic_SeqAIJ_SeqMAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.72% (28,138,888B) 0x51F192E: MatPtAP_SeqAIJ_SeqMAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.72% (28,138,888B) 0x509B68C: MatPtAP (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.72% (28,138,888B) 0x568798E: PCSetUp_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.72% (28,138,888B) 0x5693D68: PCSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.72% (28,138,888B) 0x57014F7: KSPSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.72% (28,138,888B) 0x56FDBFB: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.72% (28,138,888B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | ->01.72% (28,138,888B) 0x4048AA: main (main_petsc36.cc:2255) | | | ->01.36% (22,261,489B) in 189 places, all below massif's threshold (01.00%) | ->00.52% (8,457,790B) in 1+ places, all below ms_print's threshold (01.00%) -------------------------------------------------------------------------------- n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) -------------------------------------------------------------------------------- 65 302,566,148,193 1,642,387,936 1,642,336,203 51,733 0 66 302,566,865,783 1,649,921,584 1,649,869,667 51,917 0 67 302,567,228,559 1,657,455,120 1,657,403,043 52,077 0 100.00% (1,657,403,043B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc. 
->99.49% (1,648,945,253B) 0x4F3E598: PetscMallocAlign (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | ->71.21% (1,180,246,040B) 0x52B1D78: MatSeqAIJSetPreallocation_SeqAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->71.21% (1,180,246,040B) 0x52B3D19: MatSeqAIJSetPreallocation (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->70.86% (1,174,436,836B) 0x53EEE0C: DMCreateMatrix_DA_3d_MPIAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->70.86% (1,174,436,836B) 0x53E4FB5: DMCreateMatrix_DA (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->70.86% (1,174,436,836B) 0x553AA24: DMCreateMatrix (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->35.43% (587,218,418B) 0x40614B: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2028) | | | | ->35.43% (587,218,418B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | | | ->35.43% (587,218,418B) 0x40616C: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2029) | | | ->35.43% (587,218,418B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | ->00.35% (5,809,204B) in 1+ places, all below ms_print's threshold (01.00%) | | | ->16.07% (266,353,936B) 0x4FF7CFF: VecCreate_Seq (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->16.07% (266,353,936B) 0x5008681: VecSetType (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->11.02% (182,591,248B) 0x4FF8EA3: VecCreate_Standard (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->11.02% (182,591,248B) 0x5008681: VecSetType (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->10.79% (178,826,480B) 0x53C0070: DMCreateGlobalVector_DA (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->10.79% (178,826,480B) 0x553D255: DMCreateGlobalVector (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->09.99% (165,649,792B) 0x53BFE5F: VecDuplicate_MPI_DA (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->09.99% (165,649,792B) 0x5007F69: VecDuplicate (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->09.99% (165,649,792B) 0x500832F: VecDuplicateVecs_Default (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->09.99% (165,649,792B) 0x50050D5: VecDuplicateVecs (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->09.99% (165,649,792B) 0x570A400: KSPCreateVecs (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->08.86% (146,825,952B) 0x56AF984: KSPGMRESGetNewVectors (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->08.86% (146,825,952B) 0x56AEFB0: KSPGMRESCycle (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->08.86% (146,825,952B) 0x56AEC7A: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->08.86% (146,825,952B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->08.86% (146,825,952B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | | | | | ->08.86% (146,825,952B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | | | | | | | ->01.14% (18,823,840B) 0x56AE79B: KSPSetUp_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->01.14% (18,823,840B) 0x5701286: KSPSetUp (in 
/gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->01.14% (18,823,840B) 0x56FDBFB: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->01.14% (18,823,840B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | | | | ->01.14% (18,823,840B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | | | | | ->00.79% (13,176,688B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | ->00.23% (3,764,768B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | ->05.05% (83,762,688B) 0x4FF1C5F: VecDuplicate_Seq (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->05.05% (83,762,688B) 0x5007F69: VecDuplicate (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->04.99% (82,688,032B) 0x500832F: VecDuplicateVecs_Default (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->04.99% (82,688,032B) 0x50050D5: VecDuplicateVecs (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->04.99% (82,688,032B) 0x570A400: KSPCreateVecs (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->02.62% (43,371,200B) 0x56AF984: KSPGMRESGetNewVectors (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->02.62% (43,371,200B) 0x56AEFB0: KSPGMRESCycle (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->02.62% (43,371,200B) 0x56AEC7A: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->02.62% (43,371,200B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->02.62% (43,371,200B) 0x56F6F73: KSPSolve_Chebyshev (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->02.62% (43,371,200B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->02.62% (43,371,200B) 0x56818FA: PCMGMCycle_Private (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | ->02.31% (38,350,720B) 0x5682B76: PCApply_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->02.31% (38,350,720B) 0x5698127: PCApply (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->02.27% (37,647,680B) 0x570D0E0: KSPInitialResidual (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->02.27% (37,647,680B) 0x56AEC65: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->02.27% (37,647,680B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->02.27% (37,647,680B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | | | | | | ->02.27% (37,647,680B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | | | | | | | | | ->00.04% (703,040B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | | | ->00.30% (5,020,480B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | ->01.31% (21,686,880B) 0x56AE79B: KSPSetUp_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.31% (21,686,880B) 0x5701286: KSPSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->01.31% (21,685,600B) 0x56FDBFB: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | 
->01.31% (21,685,600B) 0x56F6F73: KSPSolve_Chebyshev (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->01.31% (21,685,600B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->01.31% (21,685,600B) 0x56818FA: PCMGMCycle_Private (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | ->01.16% (19,175,360B) 0x5682B76: PCApply_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->01.16% (19,175,360B) 0x5698127: PCApply (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | ->01.14% (18,823,840B) 0x570D0E0: KSPInitialResidual (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | | ->01.14% (18,823,840B) 0x56AEC65: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | | ->01.14% (18,823,840B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | | | | | ->01.14% (18,823,840B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | | | | | | | ->01.14% (18,823,840B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | | | | | | | | | | | ->00.02% (351,520B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | | | | | ->00.15% (2,510,240B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | | | ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | ->00.00% (1,280B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | ->01.06% (17,629,952B) in 3 places, all below massif's threshold (01.00%) | | | | | ->00.06% (1,074,656B) in 1+ places, all below ms_print's threshold (01.00%) | | | ->03.42% (56,623,104B) 0x417587: CellPropertiesCreate(_p_DM*, _p_CellProperties**) (main_petsc36.cc:75) | | ->03.42% (56,623,104B) 0x404CBE: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:1860) | | ->03.42% (56,623,104B) 0x4048AA: main (main_petsc36.cc:2255) | | | ->03.40% (56,277,776B) 0x51F256F: MatPtAPSymbolic_SeqAIJ_SeqMAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.40% (56,277,776B) 0x51F192E: MatPtAP_SeqAIJ_SeqMAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.40% (56,277,776B) 0x509B68C: MatPtAP (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.40% (56,277,776B) 0x568798E: PCSetUp_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.40% (56,277,776B) 0x5693D68: PCSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.40% (56,277,776B) 0x57014F7: KSPSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.40% (56,277,776B) 0x56FDBFB: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->03.40% (56,277,776B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | ->03.40% (56,277,776B) 0x4048AA: main (main_petsc36.cc:2255) | | | ->02.36% (39,034,200B) 0x5307655: MatSOR_SeqAIJ_Inode (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.36% (39,034,200B) 0x508F2F7: MatSOR (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.36% (39,034,200B) 0x5603367: PCApply_SOR (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.36% (39,034,200B) 0x5698127: PCApply (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.36% 
(39,034,200B) 0x570D0E0: KSPInitialResidual (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.36% (39,034,200B) 0x56AEC65: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.36% (39,034,200B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.36% (39,034,200B) 0x56F6F73: KSPSolve_Chebyshev (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.36% (39,034,200B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.36% (39,034,200B) 0x56818FA: PCMGMCycle_Private (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->02.08% (34,515,708B) 0x5682B76: PCApply_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->02.08% (34,515,708B) 0x5698127: PCApply (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | ->02.04% (33,882,942B) 0x570D0E0: KSPInitialResidual (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->02.04% (33,882,942B) 0x56AEC65: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->02.04% (33,882,942B) 0x56FDF2E: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | | | ->02.04% (33,882,942B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | | | ->02.04% (33,882,942B) 0x4048AA: main (main_petsc36.cc:2255) | | | | | | | ->00.04% (632,766B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | ->00.27% (4,518,492B) in 1+ places, all below ms_print's threshold (01.00%) | | | ->01.70% (28,138,888B) 0x51F2440: MatPtAPSymbolic_SeqAIJ_SeqMAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.70% (28,138,888B) 0x51F192E: MatPtAP_SeqAIJ_SeqMAIJ (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.70% (28,138,888B) 0x509B68C: MatPtAP (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.70% (28,138,888B) 0x568798E: PCSetUp_MG (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.70% (28,138,888B) 0x5693D68: PCSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.70% (28,138,888B) 0x57014F7: KSPSetUp (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.70% (28,138,888B) 0x56FDBFB: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2) | | ->01.70% (28,138,888B) 0x4069AB: solve_stokes_3d_coupled(int, int, int) (main_petsc36.cc:2133) | | ->01.70% (28,138,888B) 0x4048AA: main (main_petsc36.cc:2255) | | | ->01.34% (22,271,309B) in 189 places, all below massif's threshold (01.00%) | ->00.51% (8,457,790B) in 1+ places, all below ms_print's threshold (01.00%) -------------------------------------------------------------------------------- n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) -------------------------------------------------------------------------------- 68 302,567,942,168 1,661,222,056 1,661,169,863 52,193 0 69 327,133,263,563 1,661,222,384 1,661,170,189 52,195 0 70 327,134,801,224 1,661,225,528 1,661,172,873 52,655 0 71 327,135,484,200 1,661,226,648 1,661,173,805 52,843 0 72 327,136,002,907 1,320,812,376 1,320,775,719 36,657 0 73 327,136,533,387 8,543,760 8,531,809 11,951 0 74 327,137,076,493 8,543,856 8,531,825 12,031 0 75 327,137,613,732 8,543,856 8,531,825 12,031 0 76 327,138,154,123 8,543,856 8,531,825 12,031 0 77 
 327,138,673,343        8,543,856        8,531,825        12,031            0

99.86% (8,531,825B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->98.21% (8,391,248B) 0xA051F04: MPIDI_CH3I_Seg_commit (in /gpfs4l/smplocal/intel/impi/4.1.1.036/intel64/lib/libmpi.so.4.1)
| ->98.21% (8,391,248B) 0xA05E0D2: MPID_nem_init_ckpt (in /gpfs4l/smplocal/intel/impi/4.1.1.036/intel64/lib/libmpi.so.4.1)
| ->98.21% (8,391,248B) 0xA05FCA5: MPID_nem_init (in /gpfs4l/smplocal/intel/impi/4.1.1.036/intel64/lib/libmpi.so.4.1)
| ->98.21% (8,391,248B) 0x9F1106E: MPIDI_CH3_Init (in /gpfs4l/smplocal/intel/impi/4.1.1.036/intel64/lib/libmpi.so.4.1)
| ->98.21% (8,391,248B) 0xA04EBAB: MPID_Init (in /gpfs4l/smplocal/intel/impi/4.1.1.036/intel64/lib/libmpi.so.4.1)
| ->98.21% (8,391,248B) 0xA02538D: MPIR_Init_thread (in /gpfs4l/smplocal/intel/impi/4.1.1.036/intel64/lib/libmpi.so.4.1)
| ->98.21% (8,391,248B) 0xA024D98: PMPI_Init_thread (in /gpfs4l/smplocal/intel/impi/4.1.1.036/intel64/lib/libmpi.so.4.1)
| ->98.21% (8,391,248B) 0x4F1D607: PetscInitialize (in /gpfs4l/smplocal/pub/PETSc/3.6.2/ada-real/lib/libpetsc.so.3.6.2)
| ->98.21% (8,391,248B) 0x404761: main (main_petsc36.cc:2242)
|
->01.65% (140,577B) in 80 places, all below massif's threshold (01.00%)

--------------------------------------------------------------------------------
  n       time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
 78 327,139,191,175        8,543,832        8,531,817        12,015            0
 79 327,139,714,049        8,543,856        8,531,825        12,031            0
 80 327,140,232,002        8,543,832        8,531,817        12,015            0
 81 327,140,755,852           49,600           47,540         2,060            0
-------------- next part --------------
--------------------------------------------------------------------------------
Command:            ./ex42 -mx 48 -my 48 -mz 48 -stokes_ksp_monitor_blocks 0 -model 0 -levels 3
Massif arguments:   (none)
ms_print arguments: massif.out.47240
--------------------------------------------------------------------------------
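For reference, a report in this format is produced by running the executable under Valgrind's massif heap profiler and then rendering the per-process data file with ms_print. The sketch below is a minimal example of that workflow; the MPI launcher and the rank count are assumptions (massif writes one massif.out.<pid> file per process, and the file named in the header above is massif.out.47240):

  # collect a heap profile for each rank of the ex42 run (launcher and rank count assumed here)
  mpiexec -n 4 valgrind --tool=massif ./ex42 -mx 48 -my 48 -mz 48 -stokes_ksp_monitor_blocks 0 -model 0 -levels 3

  # render one rank's data file as the text report that follows (chart, snapshot table, allocation trees)
  ms_print massif.out.47240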
[massif heap-usage chart: y-axis peaks at 1.674 GB; x-axis runs from 0 to 1.100 Ti instructions; the peak snapshot is marked '#']

Number of snapshots: 75
 Detailed snapshots: [22, 36, 45 (peak), 54, 64, 74]

--------------------------------------------------------------------------------
  n       time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
  0               0                0                0             0            0
  1   4,980,608,775      661,628,576      661,566,530        62,046            0
  2  53,659,329,974      661,628,576      661,566,530        62,046            0
  3  58,343,759,694    1,258,402,064    1,258,334,906        67,158            0
  4 107,022,473,279    1,258,386,616    1,258,319,478        67,138            0
  5 107,808,983,916    1,269,736,560    1,269,669,042        67,518            0
  6 262,297,899,894    1,269,712,584    1,269,645,086        67,498            0
  7 262,297,915,219    1,269,736,560    1,269,669,042        67,518            0
  8 395,956,080,886    1,269,712,584    1,269,645,086        67,498            0
  9 410,013,623,726    1,379,141,240    1,379,051,510        89,730            0
 10 436,485,513,646    1,379,894,488    1,379,804,718        89,770            0
 11 444,589,783,484    1,455,207,200    1,455,111,954        95,246            0
 12 471,061,673,275    1,455,960,448    1,455,865,162        95,286            0
 13 490,475,678,488    1,593,525,776    1,593,399,454       126,322            0
 14 513,251,901,897    1,608,152,744    1,608,018,094       134,650            0
 15 531,294,375,247    1,608,155,584    1,608,020,914       134,670            0
 16 558,853,194,437    1,645,938,808    1,645,802,718       136,090            0
 17 576,440,011,933    1,645,938,808    1,645,802,718       136,090            0
 18 600,336,201,777    1,645,938,808    1,645,802,718       136,090            0
 19 614,937,032,875    1,645,938,808    1,645,802,718       136,090            0
 20 637,484,776,453    1,645,938,808    1,645,802,718       136,090            0
 21 659,407,063,651    1,645,938,808    1,645,802,718       136,090            0
 22 676,265,624,554    1,668,604,104    1,668,467,230       136,874            0

99.99% (1,668,467,230B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->99.73% (1,664,055,560B) 0x4FFCEA1: PetscMallocAlign (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | ->99.73% (1,664,055,560B) 0x50006E8: PetscTrMallocDefault (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | ->99.62% (1,662,324,180B) 0x4FFE767: PetscMallocA (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | ->70.82% (1,181,786,648B) 0x583AC5B: MatSeqAIJSetPreallocation_SeqAIJ (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | ->70.73% (1,180,265,248B) 0x5839FF5: MatSeqAIJSetPreallocation (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | ->70.38% (1,174,446,424B) 0x5B6327B: DMCreateMatrix_DA_3d_MPIAIJ (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | ->70.38% (1,174,446,424B) 0x5B5951F: DMCreateMatrix_DA (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | ->70.38% (1,174,446,424B) 0x5EBC894: DMCreateMatrix (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | ->35.19% (587,223,212B) 0x417497: solve_stokes_3d_coupled(int, int, int) (main_petsc310.cc:2026) | | | | | | ->35.19% (587,223,212B) 0x419ADD: main (main_petsc310.cc:2254) | | | | | | | | | | | ->35.19% (587,223,212B) 0x41750B: solve_stokes_3d_coupled(int, int, int) (main_petsc310.cc:2027) | | | | | ->35.19% (587,223,212B) 0x419ADD: main (main_petsc310.cc:2254) | | | | | | | | | ->00.35% (5,818,824B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | ->00.09% (1,521,400B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | ->10.84% (180,804,084B) 0x52A9E6E: VecCreate_Seq (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | ->10.84% (180,804,084B) 0x52EC6A1: VecSetType (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | ->05.81% (96,905,208B) 0x52C2880: VecCreate_Standard (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | ->05.81% (96,905,208B) 0x52EC6A1: VecSetType (in
/gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | ->05.58% (93,138,836B) 0x5B0C608: DMCreateGlobalVector_DA (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | ->05.58% (93,138,836B) 0x5EB8BE6: DMCreateGlobalVector (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | ->04.74% (79,093,812B) 0x5B0BD6D: VecDuplicate_MPI_DA (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | | ->04.74% (79,093,812B) 0x52DBF2E: VecDuplicate (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | | ->04.74% (79,093,812B) 0x52E058A: VecDuplicateVecs_Default (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | | ->04.74% (79,093,812B) 0x52DCD2D: VecDuplicateVecs (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | | ->04.74% (79,093,812B) 0x635A9B1: KSPCreateVecs (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | | ->03.61% (60,261,952B) 0x62C4F9F: KSPGMRESGetNewVectors (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | | | ->03.61% (60,261,952B) 0x62C19CD: KSPGMRESCycle (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | | | ->03.61% (60,261,952B) 0x62C2540: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | | | ->03.61% (60,261,952B) 0x633504F: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | | | ->03.61% (60,261,952B) 0x4184F1: solve_stokes_3d_coupled(int, int, int) (main_petsc310.cc:2130) | | | | | | | ->03.61% (60,261,952B) 0x419ADD: main (main_petsc310.cc:2254) | | | | | | | | | | | | | ->01.13% (18,831,860B) 0x62C0C8D: KSPSetUp_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | | ->01.13% (18,831,860B) 0x633021B: KSPSetUp (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | | ->01.13% (18,831,860B) 0x633452D: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | | ->01.13% (18,831,860B) 0x4184F1: solve_stokes_3d_coupled(int, int, int) (main_petsc310.cc:2130) | | | | | | ->01.13% (18,831,860B) 0x419ADD: main (main_petsc310.cc:2254) | | | | | | | | | | | ->00.84% (14,045,024B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | ->00.23% (3,766,372B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | ->05.03% (83,898,876B) 0x529E375: VecDuplicate_Seq (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | ->05.03% (83,898,876B) 0x52DBF2E: VecDuplicate (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | ->04.96% (82,814,820B) 0x52E058A: VecDuplicateVecs_Default (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | ->04.96% (82,814,820B) 0x52DCD2D: VecDuplicateVecs (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | ->04.96% (82,814,820B) 0x635A9B1: KSPCreateVecs (in 
/gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | ->02.60% (43,434,400B) 0x62C4F9F: KSPGMRESGetNewVectors (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | ->02.60% (43,434,400B) 0x62C19CD: KSPGMRESCycle (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | ->02.60% (43,434,400B) 0x62C2540: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | ->02.60% (43,434,400B) 0x633504F: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | ->02.60% (43,434,400B) 0x62A6082: KSPSolve_Chebyshev (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | | ->02.60% (43,434,400B) 0x633504F: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | | ->02.60% (43,434,400B) 0x61990DE: PCMGMCycle_Private (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | | ->02.30% (38,382,800B) 0x619E2A5: PCApply_MG (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | | | ->02.30% (38,382,800B) 0x61E665F: PCApply (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | | | ->02.26% (37,663,720B) 0x635E4EB: KSP_PCApply (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | | | | ->02.26% (37,663,720B) 0x635F3D2: KSPInitialResidual (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | | | | ->02.26% (37,663,720B) 0x62C24DD: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | | | | ->02.26% (37,663,720B) 0x633504F: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | | | | ->02.26% (37,663,720B) 0x4184F1: solve_stokes_3d_coupled(int, int, int) (main_petsc310.cc:2130) | | | | | | | | ->02.26% (37,663,720B) 0x419ADD: main (main_petsc310.cc:2254) | | | | | | | | | | | | | | | ->00.04% (719,080B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | | | | | ->00.30% (5,051,600B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | | | ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%) | | | | | | | | | ->01.30% (21,717,200B) 0x62C0C8D: KSPSetUp_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | ->01.30% (21,717,200B) 0x633021B: KSPSetUp (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | ->01.30% (21,717,200B) 0x633452D: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | ->01.30% (21,717,200B) 0x62A6082: KSPSolve_Chebyshev (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | | ->01.30% (21,717,200B) 0x633504F: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | | ->01.30% (21,717,200B) 0x61990DE: PCMGMCycle_Private (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | | | | | ->01.15% (19,191,400B) 0x619E2A5: PCApply_MG (in 
[Valgrind massif / ms_print heap profile of main_petsc310.cc (PETSc 3.10.2, real-debug-mpich-mumps-hypre build), condensed to the snapshot totals and the dominant allocation paths.]

Heap snapshots (total(B) column; useful-heap is within ~0.01% of total):

  snapshots 23-29   1,683,722,032
  snapshots 30-35   1,721,505,256
  snapshot  36      1,744,170,552
  snapshots 37-44   1,759,288,480
  snapshot  45      1,781,953,776
  snapshots 46-48   1,797,071,704   (peak, ~1.8 GB)
  snapshots 49-67   5,474,208 falling to 4,352,664   (solver and DM objects freed)
  snapshots 68-74   266,744 falling to 88,184

At the large snapshots, 99.99% of the heap is reached through
PetscMallocAlign -> PetscTrMallocDefault -> PetscMallocA. The main contributors:

  - 1,181,786,648 B (66-68%): MatSeqAIJSetPreallocation_SeqAIJ
      <- MatSeqAIJSetPreallocation <- DMCreateMatrix_DA_3d_MPIAIJ <- DMCreateMatrix_DA
      <- DMCreateMatrix, called from solve_stokes_3d_coupled() at main_petsc310.cc:2026
      and main_petsc310.cc:2027 (587,223,212 B from each call site).

  - 256,131,524 B growing to 293,795,244 B (15-16%): VecCreate_Seq <- VecSetType.
      Most of this comes through VecCreate_Standard <- DMCreateGlobalVector_DA
      <- VecDuplicate_MPI_DA <- VecDuplicateVecs <- KSPCreateVecs, i.e. GMRES work
      vectors: KSPGMRESGetNewVectors inside KSPSolve_GMRES grows from 135,589,392 B
      to 173,253,112 B across the snapshots, plus 18,831,860 B from KSPSetUp_GMRES.
      Another 82,814,820 B arrives through VecDuplicate_Seq <- VecDuplicateVecs
      <- KSPCreateVecs for the GMRES/Chebyshev smoothers inside PCMGMCycle_Private.

  - 112,562,000 B plus 56,284,240 B (9-10% combined): MatPtAPSymbolic_SeqAIJ_SeqMAIJ
      <- MatPtAP_SeqAIJ_SeqMAIJ <- MatPtAP <- MatGalerkin <- PCSetUp_MG <- PCSetUp
      <- KSPSetUp <- KSPSolve (Galerkin coarse operators; KSPSolve is called at
      main_petsc310.cc:2130).

  - 56,624,708 B (~3.3%): CellPropertiesCreate() at main_petsc310.cc:75, called from
      solve_stokes_3d_coupled() at main_petsc310.cc:1858.

  - 39,052,464 B (~2.3%): MatSOR_SeqAIJ_Inode <- MatSOR <- PCApply_SOR <- PCApply,
      inside the multigrid smoother (PCMGMCycle_Private / KSPSolve_Chebyshev).

After the solve (snapshots 49 onward) only a few MB remain:

  - 2,752,512 B + 1,310,720 B from MPIDI_RMA_init <- MPID_Init <- MPIR_Init_thread
    <- PMPI_Init_thread <- PetscInitialize (main_petsc310.cc:2241);
  - ~0.5 MB of PETSc logging and registration strings (PetscStrallocpy,
    PetscEventRegLogRegister) plus the PetscViewerASCIIOpen viewers created at
    main_petsc310.cc:2169 and main_petsc310.cc:2178;
  - 33,536 B of MKL workspace (mkl_serv_allocate <- mkl_lapack_dgehrd
    <- mkl_lapack_dgeev <- DGEEV) from KSPComputeEigenvalues_GMRES
/gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | ->37.30% (32,896B) 0x632E270: KSPComputeEigenvalues (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | ->37.30% (32,896B) 0x62A54C2: KSPChebyshevComputeExtremeEigenvalues_Private (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | ->37.30% (32,896B) 0x62A64CB: KSPSolve_Chebyshev (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | ->37.30% (32,896B) 0x633504F: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | ->37.30% (32,896B) 0x61990DE: PCMGMCycle_Private (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | ->37.30% (32,896B) 0x619E2A5: PCApply_MG (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | ->37.30% (32,896B) 0x61E665F: PCApply (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | ->37.30% (32,896B) 0x635E4EB: KSP_PCApply (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | ->37.30% (32,896B) 0x635F3D2: KSPInitialResidual (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | ->37.30% (32,896B) 0x62C24DD: KSPSolve_GMRES (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | ->37.30% (32,896B) 0x633504F: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | ->37.30% (32,896B) 0x4184F1: solve_stokes_3d_coupled(int, int, int) (main_petsc310.cc:2130) | | ->37.30% (32,896B) 0x419ADD: main (main_petsc310.cc:2254) | | | ->00.73% (640B) in 1+ places, all below ms_print's threshold (01.00%) | ->02.73% (2,406B) 0x400B13D: _dl_new_object (in /lib64/ld-2.12.so) | ->02.73% (2,406B) 0x400745C: _dl_map_object_from_fd (in /lib64/ld-2.12.so) | ->02.73% (2,406B) 0x4008660: _dl_map_object (in /lib64/ld-2.12.so) | ->02.73% (2,406B) 0x4012EF3: dl_open_worker (in /lib64/ld-2.12.so) | ->02.73% (2,406B) 0x400E5E4: _dl_catch_error (in /lib64/ld-2.12.so) | ->02.73% (2,406B) 0x4012998: _dl_open (in /lib64/ld-2.12.so) | ->01.39% (1,228B) 0xAFB4F64: dlopen_doit (in /lib64/libdl-2.12.so) | | ->01.39% (1,228B) 0x400E5E4: _dl_catch_error (in /lib64/ld-2.12.so) | | ->01.39% (1,228B) 0xAFB529A: _dlerror_run (in /lib64/libdl-2.12.so) | | ->01.39% (1,228B) 0xAFB4EDF: dlopen@@GLIBC_2.2.5 (in /lib64/libdl-2.12.so) | | ->01.39% (1,228B) 0x8C83F19: mkl_serv_load_dll (in /gpfs4l/smplocal/intel/composer_xe_2013_sp1.1.106/mkl/lib/intel64/libmkl_core.so) | | ->01.39% (1,228B) 0x8CD46DC: mkl_blas_dnrm2 (in /gpfs4l/smplocal/intel/composer_xe_2013_sp1.1.106/mkl/lib/intel64/libmkl_core.so) | | ->01.39% (1,228B) 0x918D552: mkl_lapack_dlarfg (in /gpfs4l/smplocal/intel/composer_xe_2013_sp1.1.106/mkl/lib/intel64/libmkl_core.so) | | ->01.39% (1,228B) 0x90FF89F: mkl_lapack_dgeqr2 (in /gpfs4l/smplocal/intel/composer_xe_2013_sp1.1.106/mkl/lib/intel64/libmkl_core.so) | | ->01.39% (1,228B) 0x958109D: mkl_lapack_xdgeqrf (in /gpfs4l/smplocal/intel/composer_xe_2013_sp1.1.106/mkl/lib/intel64/libmkl_core.so) | | ->01.39% (1,228B) 0x8714BB6: mkl_lapack_dgeqrf (in /gpfs4l/smplocal/intel/composer_xe_2013_sp1.1.106/mkl/lib/intel64/libmkl_sequential.so) | | ->01.39% (1,228B) 0x7F37955: DGEQRF (in /gpfs4l/smplocal/intel/composer_xe_2013_sp1.1.106/mkl/lib/intel64/libmkl_intel_lp64.so) | 
| ->01.39% (1,228B) 0x60F2D40: formProl0 (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | ->01.39% (1,228B) 0x60F74C2: PCGAMGProlongator_AGG (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | ->01.39% (1,228B) 0x60DFD80: PCSetUp_GAMG (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | ->01.39% (1,228B) 0x61ED10C: PCSetUp (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | ->01.39% (1,228B) 0x6330A89: KSPSetUp (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | ->01.39% (1,228B) 0x61A45F2: PCSetUp_MG (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | ->01.39% (1,228B) 0x61ED10C: PCSetUp (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | ->01.39% (1,228B) 0x6330A89: KSPSetUp (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | ->01.39% (1,228B) 0x633452D: KSPSolve (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | ->01.39% (1,228B) 0x4184F1: solve_stokes_3d_coupled(int, int, int) (main_petsc310.cc:2130) | | ->01.39% (1,228B) 0x419ADD: main (main_petsc310.cc:2254) | | | ->01.34% (1,178B) 0xC347EFE: do_dlopen (in /lib64/libc-2.12.so) | ->01.34% (1,178B) 0x400E5E4: _dl_catch_error (in /lib64/ld-2.12.so) | ->01.34% (1,178B) 0xC348055: __libc_dlopen_mode (in /lib64/libc-2.12.so) | ->01.34% (1,178B) 0xC31D571: __nss_lookup_function (in /lib64/libc-2.12.so) | ->01.34% (1,178B) 0xC31D5DA: __nss_lookup (in /lib64/libc-2.12.so) | ->01.34% (1,178B) 0xC324DEE: gethostbyname_r@@GLIBC_2.2.5 (in /lib64/libc-2.12.so) | ->01.34% (1,178B) 0xC324491: gethostbyname (in /lib64/libc-2.12.so) | ->01.34% (1,178B) 0xB6BC049: MPIDU_CH3U_GetSockInterfaceAddr (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libmpi.so.0.0.0) | ->01.34% (1,178B) 0xB6BA105: MPIDI_CH3U_Get_business_card_sock (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libmpi.so.0.0.0) | ->01.34% (1,178B) 0xB6B92D0: MPIDI_CH3U_Init_sock (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libmpi.so.0.0.0) | ->01.34% (1,178B) 0xB6C3B8C: MPIDI_CH3_Init (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libmpi.so.0.0.0) | ->01.34% (1,178B) 0xB69B394: MPID_Init (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libmpi.so.0.0.0) | ->01.34% (1,178B) 0xB5031EE: MPIR_Init_thread (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libmpi.so.0.0.0) | ->01.34% (1,178B) 0xB5036EE: PMPI_Init_thread (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libmpi.so.0.0.0) | ->01.34% (1,178B) 0x4FAB272: PetscInitialize (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | ->01.34% (1,178B) 0x41983C: main (main_petsc310.cc:2241) | ->02.18% (1,923B) 0xC2BF7CA: __tzfile_read (in /lib64/libc-2.12.so) | ->02.18% (1,923B) 0xC2BE962: tzset_internal (in /lib64/libc-2.12.so) | ->02.18% (1,923B) 0xC2BEAC7: __tz_convert (in /lib64/libc-2.12.so) | ->02.18% (1,923B) 0x4F6EBAB: PetscGetDate (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | ->02.18% (1,923B) 0x511FB42: PetscErrorPrintfInitialize (in 
/gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | ->02.18% (1,923B) 0x4FAB5F8: PetscInitialize (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | ->02.18% (1,923B) 0x41983C: main (main_petsc310.cc:2241) | ->01.79% (1,582B) in 129 places, all below massif's threshold (01.00%) | ->01.29% (1,136B) 0xC287D39: __fopen_internal (in /lib64/libc-2.12.so) | ->01.29% (1,136B) 0x50CFF95: PetscViewerFileSetName_ASCII (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | ->01.29% (1,136B) 0x50CED56: PetscViewerFileSetName (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | ->01.29% (1,136B) 0x50D4EEF: PetscViewerASCIIOpen (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | | ->01.29% (1,136B) in 2 places, all below massif's threshold (01.00%) | | | ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%) | ->01.17% (1,034B) 0xC31D739: nss_parse_service_list (in /lib64/libc-2.12.so) | ->01.17% (1,034B) 0xC31E1FF: __nss_database_lookup (in /lib64/libc-2.12.so) | ->01.17% (1,034B) 0xC31F356: __nss_hosts_lookup2 (in /lib64/libc-2.12.so) | ->01.17% (1,034B) 0xC324DEE: gethostbyname_r@@GLIBC_2.2.5 (in /lib64/libc-2.12.so) | ->01.17% (1,034B) 0xC324491: gethostbyname (in /lib64/libc-2.12.so) | ->01.17% (1,034B) 0xB6BC049: MPIDU_CH3U_GetSockInterfaceAddr (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libmpi.so.0.0.0) | ->01.17% (1,034B) 0xB6BA105: MPIDI_CH3U_Get_business_card_sock (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libmpi.so.0.0.0) | ->01.17% (1,034B) 0xB6B92D0: MPIDI_CH3U_Init_sock (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libmpi.so.0.0.0) | ->01.17% (1,034B) 0xB6C3B8C: MPIDI_CH3_Init (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libmpi.so.0.0.0) | ->01.17% (1,034B) 0xB69B394: MPID_Init (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libmpi.so.0.0.0) | ->01.17% (1,034B) 0xB5031EE: MPIR_Init_thread (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libmpi.so.0.0.0) | ->01.17% (1,034B) 0xB5036EE: PMPI_Init_thread (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libmpi.so.0.0.0) | ->01.17% (1,034B) 0x4FAB272: PetscInitialize (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) | ->01.17% (1,034B) 0x41983C: main (main_petsc310.cc:2241) | ->01.16% (1,024B) 0xC324563: gethostbyname (in /lib64/libc-2.12.so) ->01.16% (1,024B) 0xB6BC049: MPIDU_CH3U_GetSockInterfaceAddr (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libmpi.so.0.0.0) ->01.16% (1,024B) 0xB6BA105: MPIDI_CH3U_Get_business_card_sock (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libmpi.so.0.0.0) ->01.16% (1,024B) 0xB6B92D0: MPIDI_CH3U_Init_sock (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libmpi.so.0.0.0) ->01.16% (1,024B) 0xB6C3B8C: MPIDI_CH3_Init (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libmpi.so.0.0.0) ->01.16% (1,024B) 0xB69B394: MPID_Init (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libmpi.so.0.0.0) ->01.16% (1,024B) 0xB5031EE: MPIR_Init_thread (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libmpi.so.0.0.0) ->01.16% (1,024B) 0xB5036EE: PMPI_Init_thread (in 
/gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libmpi.so.0.0.0) ->01.16% (1,024B) 0x4FAB272: PetscInitialize (in /gpfs4l/smplocal/pub/PETSc/3.10.2/real-debug-mpich-mumps-hypre/lib/libpetsc.so.3.10.2) ->01.16% (1,024B) 0x41983C: main (main_petsc310.cc:2241) -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2975 bytes Desc: Signature cryptographique S/MIME URL: From eda.oktay at metu.edu.tr Fri Mar 8 07:37:48 2019 From: eda.oktay at metu.edu.tr (Eda Oktay) Date: Fri, 8 Mar 2019 16:37:48 +0300 Subject: [petsc-users] EPSGetEigenpair Problem In-Reply-To: References: Message-ID: Dear Matt, Thank you for your answer but I get an error even before compiling ex6. The error is: error: too many arguments to function ?DMSetField? ierr = DMSetField(dm, 0, NULL, (PetscObject) fe);CHKERRQ(ierr); Eda Matthew Knepley , 7 Mar 2019 Per, 15:07 tarihinde ?unu yazd?: > On Thu, Mar 7, 2019 at 6:45 AM Eda Oktay via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Hello, >> >> I have a code finding Laplacian of a matrix and then I need to find this >> Laplacian's second smallest eigenpair. I used EPSGetEigenpair code but I >> get the values like "0." or "4." (I could not get decimals) when I used >> PetscPrintf(PETSC_COMM_WORLD," The second smallest eigenvalue: %g\n",kr) >> line. >> >> What could be the problem? >> > > Hi Eda, > > I have an example that does just this, and I am getting the correct > result. I have not yet checked it in, > but i attach it here. Are you setting the right target? > > Thanks, > > Matt > > >> Best regards, >> >> Eda >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Mar 8 07:40:46 2019 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 8 Mar 2019 08:40:46 -0500 Subject: [petsc-users] EPSGetEigenpair Problem In-Reply-To: References: Message-ID: On Fri, Mar 8, 2019 at 8:38 AM Eda Oktay wrote: > Dear Matt, > > Thank you for your answer but I get an error even before compiling ex6. > The error is: > > error: too many arguments to function ?DMSetField? > ierr = DMSetField(dm, 0, NULL, (PetscObject) fe);CHKERRQ(ierr); > You need PETSc master and slepc-dev. Thanks, Matt > Eda > > > Matthew Knepley , 7 Mar 2019 Per, 15:07 tarihinde ?unu > yazd?: > >> On Thu, Mar 7, 2019 at 6:45 AM Eda Oktay via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> Hello, >>> >>> I have a code finding Laplacian of a matrix and then I need to find this >>> Laplacian's second smallest eigenpair. I used EPSGetEigenpair code but I >>> get the values like "0." or "4." (I could not get decimals) when I used >>> PetscPrintf(PETSC_COMM_WORLD," The second smallest eigenvalue: %g\n",kr) >>> line. >>> >>> What could be the problem? >>> >> >> Hi Eda, >> >> I have an example that does just this, and I am getting the correct >> result. I have not yet checked it in, >> but i attach it here. Are you setting the right target? >> >> Thanks, >> >> Matt >> >> >>> Best regards, >>> >>> Eda >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From amfoggia at gmail.com Fri Mar 8 09:19:46 2019 From: amfoggia at gmail.com (Ale Foggia) Date: Fri, 8 Mar 2019 16:19:46 +0100 Subject: [petsc-users] MatCreate performance Message-ID: Hello all, I have a problem with the scaling of the MatCreate() function. I wrote a code to diagonalize sparse matrices and I'm running it in parallel. I've observed a very bad speedup of the code and it's given by the MatCreate part of it: for a fixed matrix size, when I increase the number of processes the time taken by the function also increases. I wanted to know if you expect this behavior or if maybe there's something wrong with my code. When I go to (what I consider) very big matrix sizes, and depending on the number of mpi processes, in some cases, MatCreate takes more time than the time the solver takes to solve the system for one eigenvalue or the time it takes to set up the values. Ale -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Fri Mar 8 10:00:53 2019 From: jed at jedbrown.org (Jed Brown) Date: Fri, 08 Mar 2019 09:00:53 -0700 Subject: [petsc-users] MatCreate performance In-Reply-To: References: Message-ID: <87o96l9wzu.fsf@jedbrown.org> This is very unusual. MatCreate() does no work, merely dup'ing a communicator (or referencing an inner communicator if this is not the first PetscObject on the provided communicator). What size matrices are you working with? Can you send some performance data and (if feasible) a reproducer? Ale Foggia via petsc-users writes: > Hello all, > > I have a problem with the scaling of the MatCreate() function. I wrote a > code to diagonalize sparse matrices and I'm running it in parallel. I've > observed a very bad speedup of the code and it's given by the MatCreate > part of it: for a fixed matrix size, when I increase the number of > processes the time taken by the function also increases. I wanted to know > if you expect this behavior or if maybe there's something wrong with my > code. When I go to (what I consider) very big matrix sizes, and depending > on the number of mpi processes, in some cases, MatCreate takes more time > than the time the solver takes to solve the system for one eigenvalue or > the time it takes to set up the values. > > Ale From finnkochinski at keemail.me Fri Mar 8 10:01:55 2019 From: finnkochinski at keemail.me (finnkochinski at keemail.me) Date: Fri, 8 Mar 2019 17:01:55 +0100 (CET) Subject: [petsc-users] [petsc4py] DMPlexCreateFromDAG and other missing functions Message-ID: Dear petsc4py experts, I'd like to ask why several PETSc functions are not wrapped in petsc4py. I'd need to use DMPlexCreateFromDAG from python. Could you explain with this function as an example why there is no python wrapper available? Do I have to expect severe difficulties when I try this myself - impossible data structures, memory management or something else? Then, if it was just lack of time that prevented these functions from being available in petsc4py but if it could be done easily: Is the wrapping process of petsc4py documented somewhere? Or do I have to browse the sources to get an understanding? 
Do you use swig, clif, boost.python or something else? Is it possible to write another (small) python extension for the missing functions independent from petsc4py that allows me to pass PETSc structures back and forth between the two? Or is it necessary to have /one/ complete wrapper, because interoperability is not possible otherwise? regards Chris -- Securely sent with Tutanota. Get your own encrypted, ad-free mailbox: https://tutanota.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Fri Mar 8 10:55:49 2019 From: jed at jedbrown.org (Jed Brown) Date: Fri, 08 Mar 2019 09:55:49 -0700 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: References: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> <87sgw0pqlj.fsf@jedbrown.org> Message-ID: <87ftrx9uga.fsf@jedbrown.org> It may not address the memory issue, but can you build 3.10 with the same options you used for 3.6? It is currently a debugging build: ########################################################## # # # WARNING!!! # # # # This code was compiled with a debugging option. # # To get timing results run ./configure # # using --with-debugging=no, the performance will # # be generally two or three times faster. # # # ########################################################## From knepley at gmail.com Fri Mar 8 12:38:01 2019 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 8 Mar 2019 13:38:01 -0500 Subject: [petsc-users] [petsc4py] DMPlexCreateFromDAG and other missing functions In-Reply-To: References: Message-ID: On Fri, Mar 8, 2019 at 11:02 AM Chris Finn via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear petsc4py experts, > I'd like to ask why several PETSc functions are not wrapped in petsc4py. > I'd need to use DMPlexCreateFromDAG from python. Could you explain with > this function as an example why there is no python wrapper available? Do I > have to expect severe difficulties when I try this myself - impossible data > structures, memory management or something else? > Lisandro is the expert, but I will try answering. The main problem is just time. There is no documentation for contributing, but what I do is copy a function that is pretty much like the one I want. So I think DMPlexCreateFromCellList() is wrapped, and it looks almost the same. Thanks, Matt > Then, if it was just lack of time that prevented these functions from > being available in petsc4py but if it could be done easily: > Is the wrapping process of petsc4py documented somewhere? Or do I have to > browse the sources to get an understanding? Do you use swig, clif, > boost.python or something else? > > Is it possible to write another (small) python extension for the missing > functions independent from petsc4py that allows me to pass PETSc structures > back and forth between the two? Or is it necessary to have /one/ complete > wrapper, because interoperability is not possible otherwise? > > regards > Chris > > -- > Securely sent with Tutanota. Get your own encrypted, ad-free mailbox: > https://tutanota.com > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Mar 8 14:27:15 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) 
Date: Fri, 8 Mar 2019 20:27:15 +0000 Subject: [petsc-users] MatCreate performance In-Reply-To: References: Message-ID: <604EAAC5-CE26-4868-BB95-41976BE3DBF8@anl.gov> https://www.mcs.anl.gov/petsc/documentation/faq.html#efficient-assembly > On Mar 8, 2019, at 9:19 AM, Ale Foggia via petsc-users wrote: > > Hello all, > > I have a problem with the scaling of the MatCreate() function. I wrote a code to diagonalize sparse matrices and I'm running it in parallel. I've observed a very bad speedup of the code and it's given by the MatCreate part of it: for a fixed matrix size, when I increase the number of processes the time taken by the function also increases. I wanted to know if you expect this behavior or if maybe there's something wrong with my code. When I go to (what I consider) very big matrix sizes, and depending on the number of mpi processes, in some cases, MatCreate takes more time than the time the solver takes to solve the system for one eigenvalue or the time it takes to set up the values. > > Ale From mfadams at lbl.gov Fri Mar 8 16:16:15 2019 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 8 Mar 2019 17:16:15 -0500 Subject: [petsc-users] MatCreate performance In-Reply-To: References: Message-ID: MatCreate is collective so you want to check that it is not seeing load imbalance from earlier code. And duplicating communicators can be expensive on some systems. On Fri, Mar 8, 2019 at 10:21 AM Ale Foggia via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello all, > > I have a problem with the scaling of the MatCreate() function. I wrote a > code to diagonalize sparse matrices and I'm running it in parallel. I've > observed a very bad speedup of the code and it's given by the MatCreate > part of it: for a fixed matrix size, when I increase the number of > processes the time taken by the function also increases. I wanted to know > if you expect this behavior or if maybe there's something wrong with my > code. When I go to (what I consider) very big matrix sizes, and depending > on the number of mpi processes, in some cases, MatCreate takes more time > than the time the solver takes to solve the system for one eigenvalue or > the time it takes to set up the values. > > Ale > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Mar 8 16:23:12 2019 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 8 Mar 2019 17:23:12 -0500 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: <87ftrx9uga.fsf@jedbrown.org> References: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> <87sgw0pqlj.fsf@jedbrown.org> <87ftrx9uga.fsf@jedbrown.org> Message-ID: Just seeing this now. It is hard to imagine how bad GAMG could be on a coarse grid, but you can run with -info and grep on GAMG and send that. You will see listing of levels, number of equations and number of non-zeros (nnz). You can send that and I can get some sense of GAMG is going nuts. Mark On Fri, Mar 8, 2019 at 11:56 AM Jed Brown via petsc-users < petsc-users at mcs.anl.gov> wrote: > It may not address the memory issue, but can you build 3.10 with the > same options you used for 3.6? It is currently a debugging build: > > ########################################################## > # # > # WARNING!!! # > # # > # This code was compiled with a debugging option. # > # To get timing results run ./configure # > # using --with-debugging=no, the performance will # > # be generally two or three times faster. 
# > # # > ########################################################## > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yyang85 at stanford.edu Fri Mar 8 16:52:55 2019 From: yyang85 at stanford.edu (Yuyun Yang) Date: Fri, 8 Mar 2019 22:52:55 +0000 Subject: [petsc-users] Append vector to existing file Message-ID: Hello team, This is a very simple question, but I just want to be sure I understand how the viewer works. I have a file with some vectors already written into it. Now I'm calling PetscViewerASCIIOpen, setting the format and then the file mode into append: PetscViewer viewer; ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD, filename.c_str(), &viewer); ierr = PetscViewerPushFormat(viewer, PETSC_VIEWER_ASCII_MATLAB); CHKERRQ(ierr); ierr = PetscViewerFileSetMode(viewer, FILE_MODE_APPEND); CHKERRQ(ierr); After which I call VecView to append some new vectors to the existing file. But this operation is clearing out my existing file and writing it with new information. Does the append mode happen to not work with this type of viewer, or am I missing something here? Thanks a lot, Yuyun -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Mar 8 18:50:59 2019 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 8 Mar 2019 19:50:59 -0500 Subject: [petsc-users] Append vector to existing file In-Reply-To: References: Message-ID: On Fri, Mar 8, 2019 at 5:53 PM Yuyun Yang via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello team, > > > > This is a very simple question, but I just want to be sure I understand > how the viewer works. I have a file with some vectors already written into > it. Now I?m calling PetscViewerASCIIOpen, setting the format and then the > file mode into append: > > PetscViewer viewer; > > ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD, filename.c_str(), &viewer); > ASCIIOpen() is a simplified interface. For append, I think you need PetscViewerCreate() PetscViewerSetType() PetscViewerFileSetMode() PetscViewerFileSetName() PetscViewerPushFormat() Matt > ierr = PetscViewerPushFormat(viewer, PETSC_VIEWER_ASCII_MATLAB); > CHKERRQ(ierr); > > ierr = PetscViewerFileSetMode(viewer, FILE_MODE_APPEND); CHKERRQ(ierr); > > > > After which I call VecView to append some new vectors to the existing > file. But this operation is clearing out my existing file and writing it > with new information. Does the append mode happen to not work with this > type of viewer, or am I missing something here? > > > > Thanks a lot, > > Yuyun > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yyang85 at stanford.edu Fri Mar 8 19:10:41 2019 From: yyang85 at stanford.edu (Yuyun Yang) Date: Sat, 9 Mar 2019 01:10:41 +0000 Subject: [petsc-users] Append vector to existing file In-Reply-To: References: Message-ID: I see, thank you! Yuyun From: Matthew Knepley Sent: Friday, March 8, 2019 4:51 PM To: Yuyun Yang Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Append vector to existing file On Fri, Mar 8, 2019 at 5:53 PM Yuyun Yang via petsc-users > wrote: Hello team, This is a very simple question, but I just want to be sure I understand how the viewer works. I have a file with some vectors already written into it. 
Now I?m calling PetscViewerASCIIOpen, setting the format and then the file mode into append: PetscViewer viewer; ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD, filename.c_str(), &viewer); ASCIIOpen() is a simplified interface. For append, I think you need PetscViewerCreate() PetscViewerSetType() PetscViewerFileSetMode() PetscViewerFileSetName() PetscViewerPushFormat() Matt ierr = PetscViewerPushFormat(viewer, PETSC_VIEWER_ASCII_MATLAB); CHKERRQ(ierr); ierr = PetscViewerFileSetMode(viewer, FILE_MODE_APPEND); CHKERRQ(ierr); After which I call VecView to append some new vectors to the existing file. But this operation is clearing out my existing file and writing it with new information. Does the append mode happen to not work with this type of viewer, or am I missing something here? Thanks a lot, Yuyun -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From eda.oktay at metu.edu.tr Sat Mar 9 09:01:20 2019 From: eda.oktay at metu.edu.tr (Eda Oktay) Date: Sat, 9 Mar 2019 18:01:20 +0300 Subject: [petsc-users] EPSGetEigenpair Problem In-Reply-To: References: Message-ID: Dear Matt, Thank you for your example. I didn't compile your program since I fixed my problem, now I get a value. However, now I get a different error: [0]PETSC ERROR: Argument out of range [0]PETSC ERROR: Argument 2 out of range [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.10.3, Dec, 18, 2018 [0]PETSC ERROR: ./ENYENI_FINAL on a arch-linux2-c-debug named localhost.localdomain by edaoktay Sat Mar 9 17:55:23 2019 [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-cxx-dialect=C++11 --download-openblas --download-metis --download-parmetis --download-superlu_dist --download-slepc --download-mpich [0]PETSC ERROR: #1 EPSGetEigenpair() line 398 in /home/edaoktay/petsc-3.10.3/arch-linux2-c-debug/externalpackages/git.slepc/src/eps/interface/epssolve.c I didn't understand why the arguments are out of range. I used this function as: ierr = EPSGetEigenpair(eps,0,&kr,NULL,vr,NULL); By the way, even though I get this error, I think EPSGetEigenpair computes something since I get 4.07265e-314 (which is obviously not true) as the second smallest eigenvalue. Thanks, Eda Matthew Knepley , 8 Mar 2019 Cum, 16:41 tarihinde ?unu yazd?: > On Fri, Mar 8, 2019 at 8:38 AM Eda Oktay wrote: > >> Dear Matt, >> >> Thank you for your answer but I get an error even before compiling ex6. >> The error is: >> >> error: too many arguments to function ?DMSetField? >> ierr = DMSetField(dm, 0, NULL, (PetscObject) fe);CHKERRQ(ierr); >> > > You need PETSc master and slepc-dev. > > Thanks, > > Matt > > >> Eda >> >> >> Matthew Knepley , 7 Mar 2019 Per, 15:07 tarihinde >> ?unu yazd?: >> >>> On Thu, Mar 7, 2019 at 6:45 AM Eda Oktay via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> >>>> Hello, >>>> >>>> I have a code finding Laplacian of a matrix and then I need to find >>>> this Laplacian's second smallest eigenpair. I used EPSGetEigenpair code but >>>> I get the values like "0." or "4." (I could not get decimals) when I used >>>> PetscPrintf(PETSC_COMM_WORLD," The second smallest eigenvalue: %g\n",kr) >>>> line. >>>> >>>> What could be the problem? 
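In case it helps, the surrounding call sequence is roughly the following (a simplified sketch with illustrative declarations, not my exact code):

  EPS         eps;
  PetscScalar kr;
  Vec         vr;
  PetscInt    nconv;
  ierr = EPSCreate(PETSC_COMM_WORLD,&eps);CHKERRQ(ierr);
  ierr = EPSSetOperators(eps,L,NULL);CHKERRQ(ierr);        /* L is the Laplacian matrix */
  ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRQ(ierr);
  ierr = EPSSetWhichEigenpairs(eps,EPS_SMALLEST_REAL);CHKERRQ(ierr);
  ierr = EPSSetFromOptions(eps);CHKERRQ(ierr);
  ierr = MatCreateVecs(L,&vr,NULL);CHKERRQ(ierr);
  ierr = EPSSolve(eps);CHKERRQ(ierr);
  ierr = EPSGetConverged(eps,&nconv);CHKERRQ(ierr);        /* EPSGetEigenpair is only valid for an index below nconv */
  ierr = EPSGetEigenpair(eps,0,&kr,NULL,vr,NULL);CHKERRQ(ierr);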
>>>> >>> >>> Hi Eda, >>> >>> I have an example that does just this, and I am getting the correct >>> result. I have not yet checked it in, >>> but i attach it here. Are you setting the right target? >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Best regards, >>>> >>>> Eda >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Mar 9 12:11:11 2019 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 9 Mar 2019 13:11:11 -0500 Subject: [petsc-users] EPSGetEigenpair Problem In-Reply-To: References: Message-ID: On Sat, Mar 9, 2019 at 10:01 AM Eda Oktay wrote: > Dear Matt, > > Thank you for your example. I didn't compile your program since I fixed my > problem, now I get a value. However, now I get a different error: > > [0]PETSC ERROR: Argument out of range > [0]PETSC ERROR: Argument 2 out of range > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.10.3, Dec, 18, 2018 > [0]PETSC ERROR: ./ENYENI_FINAL on a arch-linux2-c-debug named > localhost.localdomain by edaoktay Sat Mar 9 17:55:23 2019 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-cxx-dialect=C++11 --download-openblas > --download-metis --download-parmetis --download-superlu_dist > --download-slepc --download-mpich > [0]PETSC ERROR: #1 EPSGetEigenpair() line 398 in > /home/edaoktay/petsc-3.10.3/arch-linux2-c-debug/externalpackages/git.slepc/src/eps/interface/epssolve.c > > I didn't understand why the arguments are out of range. I used this > function as: > > ierr = EPSGetEigenpair(eps,0,&kr,NULL,vr,NULL); > > By the way, even though I get this error, I think EPSGetEigenpair computes > something since I get 4.07265e-314 (which is obviously not true) as the > second smallest eigenvalue. > The error message makes it sound like you have no converged eigenvalues. Then its no surprise that its not setting a value. You need to use CHKERRQ() for all return values, so that it stops after an error since the values will not be set. Matt > Thanks, > > Eda > > > Matthew Knepley , 8 Mar 2019 Cum, 16:41 tarihinde ?unu > yazd?: > >> On Fri, Mar 8, 2019 at 8:38 AM Eda Oktay wrote: >> >>> Dear Matt, >>> >>> Thank you for your answer but I get an error even before compiling ex6. >>> The error is: >>> >>> error: too many arguments to function ?DMSetField? >>> ierr = DMSetField(dm, 0, NULL, (PetscObject) fe);CHKERRQ(ierr); >>> >> >> You need PETSc master and slepc-dev. >> >> Thanks, >> >> Matt >> >> >>> Eda >>> >>> >>> Matthew Knepley , 7 Mar 2019 Per, 15:07 tarihinde >>> ?unu yazd?: >>> >>>> On Thu, Mar 7, 2019 at 6:45 AM Eda Oktay via petsc-users < >>>> petsc-users at mcs.anl.gov> wrote: >>>> >>>>> Hello, >>>>> >>>>> I have a code finding Laplacian of a matrix and then I need to find >>>>> this Laplacian's second smallest eigenpair. I used EPSGetEigenpair code but >>>>> I get the values like "0." or "4." 
(I could not get decimals) when I used >>>>> PetscPrintf(PETSC_COMM_WORLD," The second smallest eigenvalue: %g\n",kr) >>>>> line. >>>>> >>>>> What could be the problem? >>>>> >>>> >>>> Hi Eda, >>>> >>>> I have an example that does just this, and I am getting the correct >>>> result. I have not yet checked it in, >>>> but i attach it here. Are you setting the right target? >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Best regards, >>>>> >>>>> Eda >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From myriam.peyrounette at idris.fr Mon Mar 11 05:32:26 2019 From: myriam.peyrounette at idris.fr (Myriam Peyrounette) Date: Mon, 11 Mar 2019 11:32:26 +0100 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: References: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> <87sgw0pqlj.fsf@jedbrown.org> <87ftrx9uga.fsf@jedbrown.org> Message-ID: Hi, good point, I changed the 3.10 version so that it is configured with --with-debugging=0. You'll find attached the output of the new LogView. The execution time is reduced (although still not as good as 3.6) but I can't see any improvement with regard to memory. You'll also find attached the grep GAMG on -info outputs for both versions. There are slight differences in grid dimensions or nnz values, but is it significant? Thanks, Myriam Le 03/08/19 ? 23:23, Mark Adams a ?crit?: > Just seeing this now. It is hard to imagine how bad GAMG could be on a > coarse grid, but you can run with -info and grep on GAMG and send > that. You will see listing of levels, number of equations and number > of non-zeros (nnz). You can send that and I can get some sense of GAMG > is going nuts. > > Mark > > On Fri, Mar 8, 2019 at 11:56 AM Jed Brown via petsc-users > > wrote: > > It may not address the memory issue, but can you build 3.10 with the > same options you used for 3.6?? It is currently a debugging build: > > ? ? ? ? ? ? ? > ########################################################## > ? ? ? ? ? ? ? #? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? > ? ? # > ? ? ? ? ? ? ? #? ? ? ? ? ? ? ? ? ? ? ?WARNING!!!? ? ? ? ? ? ? ? ? > ? ? ?# > ? ? ? ? ? ? ? #? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? > ? ? # > ? ? ? ? ? ? ? #? ?This code was compiled with a debugging option.? > ? ? # > ? ? ? ? ? ? ? #? ?To get timing results run ./configure? ? ? ? ? ? > ? ? # > ? ? ? ? ? ? ? #? ?using --with-debugging=no, the performance will? > ? ? # > ? ? ? ? ? ? ? #? ?be generally two or three times faster.? ? ? ? ? > ? ? # > ? ? ? ? ? ? ? #? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? > ? ? # > ? ? ? ? ? ? ? > ########################################################## > -- Myriam Peyrounette CNRS/IDRIS - HLST -- -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: logView_48x48x48_petsc310_noDebug.log Type: text/x-log Size: 16537 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: grepGAMG_petsc310.log Type: text/x-log Size: 836 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: grepGAMG_petsc36.log Type: text/x-log Size: 831 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2975 bytes Desc: Signature cryptographique S/MIME URL: From m.colera at upm.es Mon Mar 11 06:56:36 2019 From: m.colera at upm.es (Manuel Colera Rico) Date: Mon, 11 Mar 2019 12:56:36 +0100 Subject: [petsc-users] PCFieldSplit with MatNest Message-ID: Hello, I need to solve a 2*2 block linear system. The matrices A_00, A_01, A_10, A_11 are constructed separately via MatCreateSeqAIJWithArrays and MatCreateSeqSBAIJWithArrays. Then, I construct the full system matrix with MatCreateNest, and use MatNestGetISs and PCFieldSplitSetIS to set up the PC, trying to follow the procedure described here: https://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex70.c.html. However, when I run the code with Leak Sanitizer, I get the following error: ================================================================= ==54927==ERROR: AddressSanitizer: attempting free on address which was not malloc()-ed: 0x627000051ab8 in thread T0 ??? #0 0x7fbd95c08f30 in __interceptor_free ../../../../gcc-8.1.0/libsanitizer/asan/asan_malloc_linux.cc:66 ??? #1 0x7fbd92b99dcd in PetscFreeAlign (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x146dcd) ??? #2 0x7fbd92ce0178 in VecRestoreArray_Nest (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x28d178) ??? #3 0x7fbd92cd627d in VecRestoreArrayRead (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x28327d) ??? #4 0x7fbd92d1189e in VecScatterBegin_SSToSS (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x2be89e) ??? #5 0x7fbd92d1a414 in VecScatterBegin (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x2c7414) ??? #6 0x7fbd934a999c in PCApply_FieldSplit (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa5699c) ??? #7 0x7fbd93369071 in PCApply (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x916071) ??? #8 0x7fbd934efe77 in KSPInitialResidual (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa9ce77) ??? #9 0x7fbd9350272c in KSPSolve_GMRES (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xaaf72c) ??? #10 0x7fbd934e3c01 in KSPSolve (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa90c01) Disabling Leak Sanitizer also outputs an "invalid pointer" error. Did I forget something when writing the code? Thank you, Manuel --- From amfoggia at gmail.com Mon Mar 11 07:22:39 2019 From: amfoggia at gmail.com (Ale Foggia) Date: Mon, 11 Mar 2019 13:22:39 +0100 Subject: [petsc-users] MatCreate performance In-Reply-To: <87o96l9wzu.fsf@jedbrown.org> References: <87o96l9wzu.fsf@jedbrown.org> Message-ID: Hello all, Thanks for your answers. 1) I'm working with a matrix with a linear size of 2**34, but it's a sparse matrix, and the number of elements different from zero is 43,207,072,74. 
I know that the distribution of these elements is not balanced between the processes, the matrix is more populated in the middle part. 2) I initialize Slepc. Then I create the basis elements of the system (this part does not involve Petsc/Slepc, and every process is just computing -and owns- an equal amount of basis elements). Then I call: ierr = MatCreate(PETSC_COMM_WORLD, &A); CHKERRQ(ierr); ierr = MatSetType(A, MATMPIAIJ); CHKERRQ(ierr); ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, size, size); CHKERRQ(ierr); ierr = MatMPIAIJSetPreallocation(A, 0, d_nnz, 0, o_nnz); CHKERRQ(ierr); ierr = MatZeroEntries(A); CHKERRQ(ierr); After this, I compute the elements of the matrix and set the values with MatSetValues. The I call EPSSolve (with KrylovSchur and setting the type as EPS_HEP). 3) There are a few more things that are strange to me. I measure the execution time of these parts both with a PetscLogStage and with a std::chrono (in nanoseconds) clock. I understand that the time given by the Log is an average over the processes right? In the case of the std::chrono, I'm only printing the times from process 0 (no average over processes). What I see is the following: 1024 procs 2048 procs 4096 procs 8192 procs Log std Log std Log std Log std MatCreate 68.42 122.7 67.08 121.2 62.29 116 73.36 127.4 preallocation 140.36 140.3 76.45 76.45 40.31 40.3 21.13 21.12 MatSetValues 237.79 237.7 116.6 116.6 60.59 60.59 35.32 35.32 ESPSolve 162.8 160 95.8 94.2 62.17 60.63 41.16 40.24 - So, all the times (including the total execution time that I'm not showing here) are the same between PetscLogStage and the std::chrono clock, except for the part of MatCreate. Maybe that part is very unbalanced? - The time of the MatCreate given by the PetscLogStage is not changing. Ale El vie., 8 mar. 2019 a las 17:00, Jed Brown () escribi?: > This is very unusual. MatCreate() does no work, merely dup'ing a > communicator (or referencing an inner communicator if this is not the > first PetscObject on the provided communicator). What size matrices are > you working with? Can you send some performance data and (if feasible) > a reproducer? > > Ale Foggia via petsc-users writes: > > > Hello all, > > > > I have a problem with the scaling of the MatCreate() function. I wrote a > > code to diagonalize sparse matrices and I'm running it in parallel. I've > > observed a very bad speedup of the code and it's given by the MatCreate > > part of it: for a fixed matrix size, when I increase the number of > > processes the time taken by the function also increases. I wanted to know > > if you expect this behavior or if maybe there's something wrong with my > > code. When I go to (what I consider) very big matrix sizes, and depending > > on the number of mpi processes, in some cases, MatCreate takes more time > > than the time the solver takes to solve the system for one eigenvalue or > > the time it takes to set up the values. > > > > Ale > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Mon Mar 11 07:24:31 2019 From: mfadams at lbl.gov (Mark Adams) Date: Mon, 11 Mar 2019 08:24:31 -0400 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: References: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> <87sgw0pqlj.fsf@jedbrown.org> <87ftrx9uga.fsf@jedbrown.org> Message-ID: GAMG look fine here but the convergence rate looks terrible, like 4k+ iterations. You have 4 degrees of freedom per vertex. What equations and discretization are you using? 
Your eigen estimates are a little high, but not crazy. I assume this system is not symmetric. AMG is oriented toward the laplacian (and elasticity). It looks like you are solving Stokes. AMG will not work on the whole system out of the box. You can use it for a 3 dof velocity solve in a FieldSolit solver On Mon, Mar 11, 2019 at 6:32 AM Myriam Peyrounette < myriam.peyrounette at idris.fr> wrote: > Hi, > > good point, I changed the 3.10 version so that it is configured with > --with-debugging=0. You'll find attached the output of the new LogView. The > execution time is reduced (although still not as good as 3.6) but I can't > see any improvement with regard to memory. > > You'll also find attached the grep GAMG on -info outputs for both > versions. There are slight differences in grid dimensions or nnz values, > but is it significant? > > Thanks, > > Myriam > > > > Le 03/08/19 ? 23:23, Mark Adams a ?crit : > > Just seeing this now. It is hard to imagine how bad GAMG could be on a > coarse grid, but you can run with -info and grep on GAMG and send that. You > will see listing of levels, number of equations and number of non-zeros > (nnz). You can send that and I can get some sense of GAMG is going nuts. > > Mark > > On Fri, Mar 8, 2019 at 11:56 AM Jed Brown via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> It may not address the memory issue, but can you build 3.10 with the >> same options you used for 3.6? It is currently a debugging build: >> >> ########################################################## >> # # >> # WARNING!!! # >> # # >> # This code was compiled with a debugging option. # >> # To get timing results run ./configure # >> # using --with-debugging=no, the performance will # >> # be generally two or three times faster. # >> # # >> ########################################################## >> >> > -- > Myriam Peyrounette > CNRS/IDRIS - HLST > -- > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eda.oktay at metu.edu.tr Mon Mar 11 07:26:46 2019 From: eda.oktay at metu.edu.tr (Eda Oktay) Date: Mon, 11 Mar 2019 15:26:46 +0300 Subject: [petsc-users] Problem in MatSetValues Message-ID: Hello, I have a following part of a code which tries to change the nonzero values of matrix L with -1. However in MatSetValues line, something happens and some of the values in matrix turns into 1.99665e-314 instead of -1. Type of arr is defined as PetscScalar and arr is produced correctly. What can be the problem, is there a mistake about types? Thanks, Eda for(rw = mm; rw From mfadams at lbl.gov Mon Mar 11 07:52:16 2019 From: mfadams at lbl.gov (Mark Adams) Date: Mon, 11 Mar 2019 08:52:16 -0400 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> References: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> Message-ID: In looking at this larger scale run ... * Your eigen estimates are much lower than your tiny test problem. But this is Stokes apparently and it should not work anyway. Maybe you have a small time step that adds a lot of mass that brings the eigen estimates down. And your min eigenvalue (not used) is positive. I would expect negative for Stokes ... * You seem to be setting a threshold value of 0.1 -- that is very high * v3.6 says "using nonzero initial guess" but this is not in v3.10. Maybe we just stopped printing that. * There were some changes to coasening parameters in going from v3.6 but it does not look like your problem was effected. 
(The coarsening algo is non-deterministic by default and you can see small difference on different runs) * We may have also added a "noisy" RHS for eigen estimates by default from v3.6. * And for non-symetric problems you can try -pc_gamg_agg_nsmooths 0, but again GAMG is not built for Stokes anyway. On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette < myriam.peyrounette at idris.fr> wrote: > I used PCView to display the size of the linear system in each level of > the MG. You'll find the outputs attached to this mail (zip file) for both > the default threshold value and a value of 0.1, and for both 3.6 and 3.10 > PETSc versions. > > For convenience, I summarized the information in a graph, also attached > (png file). > > As you can see, there are slight differences between the two versions but > none is critical, in my opinion. Do you see anything suspicious in the > outputs? > > + I can't find the default threshold value. Do you know where I can find > it? > > Thanks for the follow-up > > Myriam > > Le 03/05/19 ? 14:06, Matthew Knepley a ?crit : > > On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette < > myriam.peyrounette at idris.fr> wrote: > >> Hi Matt, >> >> I plotted the memory scalings using different threshold values. The two >> scalings are slightly translated (from -22 to -88 mB) but this gain is >> neglectable. The 3.6-scaling keeps being robust while the 3.10-scaling >> deteriorates. >> >> Do you have any other suggestion? >> > Mark, what is the option she can give to output all the GAMG data? > > Also, run using -ksp_view. GAMG will report all the sizes of its grids, so > it should be easy to see > if the coarse grid sizes are increasing, and also what the effect of the > threshold value is. > > Thanks, > > Matt > >> Thanks >> Myriam >> >> Le 03/02/19 ? 02:27, Matthew Knepley a ?crit : >> >> On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> Hi, >>> >>> I used to run my code with PETSc 3.6. Since I upgraded the PETSc version >>> to 3.10, this code has a bad memory scaling. >>> >>> To report this issue, I took the PETSc script ex42.c and slightly >>> modified it so that the KSP and PC configurations are the same as in my >>> code. In particular, I use a "personnalised" multi-grid method. The >>> modifications are indicated by the keyword "TopBridge" in the attached >>> scripts. >>> >>> To plot the memory (weak) scaling, I ran four calculations for each >>> script with increasing problem sizes and computations cores: >>> >>> 1. 100,000 elts on 4 cores >>> 2. 1 million elts on 40 cores >>> 3. 10 millions elts on 400 cores >>> 4. 100 millions elts on 4,000 cores >>> >>> The resulting graph is also attached. The scaling using PETSc 3.10 >>> clearly deteriorates for large cases, while the one using PETSc 3.6 is >>> robust. >>> >>> After a few tests, I found that the scaling is mostly sensitive to the >>> use of the AMG method for the coarse grid (line 1780 in >>> main_ex42_petsc36.cc). In particular, the performance strongly >>> deteriorates when commenting lines 1777 to 1790 (in >>> main_ex42_petsc36.cc). >>> >>> Do you have any idea of what changed between version 3.6 and version >>> 3.10 that may imply such degradation? >>> >> >> I believe the default values for PCGAMG changed between versions. It >> sounds like the coarsening rate >> is not great enough, so that these grids are too large. 
This can be set >> using: >> >> >> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html >> >> There is some explanation of this effect on that page. Let us know if >> setting this does not correct the situation. >> >> Thanks, >> >> Matt >> >> >>> Let me know if you need further information. >>> >>> Best, >>> >>> Myriam Peyrounette >>> >>> >>> -- >>> Myriam Peyrounette >>> CNRS/IDRIS - HLST >>> -- >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> -- >> Myriam Peyrounette >> CNRS/IDRIS - HLST >> -- >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -- > Myriam Peyrounette > CNRS/IDRIS - HLST > -- > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Mar 11 07:54:57 2019 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 11 Mar 2019 08:54:57 -0400 Subject: [petsc-users] MatCreate performance In-Reply-To: References: <87o96l9wzu.fsf@jedbrown.org> Message-ID: On Mon, Mar 11, 2019 at 8:23 AM Ale Foggia via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello all, > > Thanks for your answers. > > 1) I'm working with a matrix with a linear size of 2**34, but it's a > sparse matrix, and the number of elements different from zero is > 43,207,072,74. I know that the distribution of these elements is not > balanced between the processes, the matrix is more populated in the middle > part. > > 2) I initialize Slepc. Then I create the basis elements of the system > (this part does not involve Petsc/Slepc, and every process is just > computing -and owns- an equal amount of basis elements). Then I call: > ierr = MatCreate(PETSC_COMM_WORLD, &A); CHKERRQ(ierr); > ierr = MatSetType(A, MATMPIAIJ); CHKERRQ(ierr); > ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, size, size); > CHKERRQ(ierr); > ierr = MatMPIAIJSetPreallocation(A, 0, d_nnz, 0, o_nnz); CHKERRQ(ierr); > ierr = MatZeroEntries(A); CHKERRQ(ierr); > After this, I compute the elements of the matrix and set the values with > MatSetValues. The I call EPSSolve (with KrylovSchur and setting the type as > EPS_HEP). > > 3) There are a few more things that are strange to me. I measure the > execution time of these parts both with a PetscLogStage and with a > std::chrono (in nanoseconds) clock. I understand that the time given by the > Log is an average over the processes right? In the case of the std::chrono, > I'm only printing the times from process 0 (no average over processes). > What I see is the following: > 1024 procs 2048 procs 4096 > procs 8192 procs > Log std Log std > Log std Log std > MatCreate 68.42 122.7 67.08 121.2 62.29 116 > 73.36 127.4 > preallocation 140.36 140.3 76.45 76.45 40.31 > 40.3 21.13 21.12 > MatSetValues 237.79 237.7 116.6 116.6 60.59 60.59 > 35.32 35.32 > ESPSolve 162.8 160 95.8 94.2 62.17 > 60.63 41.16 40.24 > > - So, all the times (including the total execution time that I'm not > showing here) are the same between PetscLogStage and the std::chrono clock, > except for the part of MatCreate. Maybe that part is very unbalanced? > MatCreate() does nothing at all, but it does have a synchronization (to check the comm). 
So you must be very imbalanced _coming into_ MatCreate. It also appears that 0 is more imbalanced than the rest, so maybe you are doing serial work on 0 that no one else does before you call MatCreate. Matt > - The time of the MatCreate given by the PetscLogStage is not changing. > > Ale > > El vie., 8 mar. 2019 a las 17:00, Jed Brown () escribi?: > >> This is very unusual. MatCreate() does no work, merely dup'ing a >> communicator (or referencing an inner communicator if this is not the >> first PetscObject on the provided communicator). What size matrices are >> you working with? Can you send some performance data and (if feasible) >> a reproducer? >> >> Ale Foggia via petsc-users writes: >> >> > Hello all, >> > >> > I have a problem with the scaling of the MatCreate() function. I wrote a >> > code to diagonalize sparse matrices and I'm running it in parallel. I've >> > observed a very bad speedup of the code and it's given by the MatCreate >> > part of it: for a fixed matrix size, when I increase the number of >> > processes the time taken by the function also increases. I wanted to >> know >> > if you expect this behavior or if maybe there's something wrong with my >> > code. When I go to (what I consider) very big matrix sizes, and >> depending >> > on the number of mpi processes, in some cases, MatCreate takes more time >> > than the time the solver takes to solve the system for one eigenvalue or >> > the time it takes to set up the values. >> > >> > Ale >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Mar 11 07:56:11 2019 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 11 Mar 2019 08:56:11 -0400 Subject: [petsc-users] Problem in MatSetValues In-Reply-To: References: Message-ID: On Mon, Mar 11, 2019 at 8:27 AM Eda Oktay via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello, > > I have a following part of a code which tries to change the nonzero values > of matrix L with -1. However in MatSetValues line, something happens and > some of the values in matrix turns into 1.99665e-314 instead of -1. Type of > arr is defined as PetscScalar and arr is produced correctly. What can be > the problem, is there a mistake about types? > > Thanks, > > Eda > > > for(rw = mm; rw > ierr = MatGetRow(L,rw,&ncols,&cols,&vals);CHKERRQ(ierr); > > s = sizeof(vals); > This is wrong. sizeof(vals) gives bytes, not entries. Why don't you just use ncols here? Matt > ierr = PetscMalloc1(s,&arr);CHKERRQ(ierr); > > for(j=0;j > arr[j]=-1.0; > } > ierr = > MatSetValues(NSymmA,1,&rw,ncols,cols,arr,INSERT_VALUES);CHKERRQ(ierr); > ierr = MatRestoreRow(L,rw,&ncols,&cols,&vals);CHKERRQ(ierr); > } > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From eda.oktay at metu.edu.tr Mon Mar 11 08:05:23 2019 From: eda.oktay at metu.edu.tr (Eda Oktay) Date: Mon, 11 Mar 2019 16:05:23 +0300 Subject: [petsc-users] Problem in MatSetValues In-Reply-To: References: Message-ID: Dear Matt, Thank you for answering. First of all, sizeof(vals) returns to number of entries, I checked. 
Secondly, I found a problem: ncols gives me 6.95328e-310. However, I checked the matrix L, it was computed properly. Why can ncols give such a value? Thanks, Eda Matthew Knepley , 11 Mar 2019 Pzt, 15:56 tarihinde ?unu yazd?: > On Mon, Mar 11, 2019 at 8:27 AM Eda Oktay via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Hello, >> >> I have a following part of a code which tries to change the nonzero >> values of matrix L with -1. However in MatSetValues line, something happens >> and some of the values in matrix turns into 1.99665e-314 instead of -1. >> Type of arr is defined as PetscScalar and arr is produced correctly. What >> can be the problem, is there a mistake about types? >> >> Thanks, >> >> Eda >> >> >> for(rw = mm; rw> >> ierr = MatGetRow(L,rw,&ncols,&cols,&vals);CHKERRQ(ierr); >> >> s = sizeof(vals); >> > > This is wrong. sizeof(vals) gives bytes, not entries. Why don't you just > use ncols here? > > Matt > > >> ierr = PetscMalloc1(s,&arr);CHKERRQ(ierr); >> >> for(j=0;j> >> arr[j]=-1.0; >> } >> ierr = >> MatSetValues(NSymmA,1,&rw,ncols,cols,arr,INSERT_VALUES);CHKERRQ(ierr); >> ierr = MatRestoreRow(L,rw,&ncols,&cols,&vals);CHKERRQ(ierr); >> } >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eda.oktay at metu.edu.tr Mon Mar 11 08:07:04 2019 From: eda.oktay at metu.edu.tr (Eda Oktay) Date: Mon, 11 Mar 2019 16:07:04 +0300 Subject: [petsc-users] Problem in MatSetValues In-Reply-To: References: Message-ID: Dear Matt, I printed in wrong state, ncols gives right solution. But I still can't understand the first problem. Eda Eda Oktay , 11 Mar 2019 Pzt, 16:05 tarihinde ?unu yazd?: > Dear Matt, > > Thank you for answering. First of all, sizeof(vals) returns to number of > entries, I checked. Secondly, I found a problem: > > ncols gives me 6.95328e-310. However, I checked the matrix L, it was > computed properly. > > Why can ncols give such a value? > > Thanks, > > Eda > > Matthew Knepley , 11 Mar 2019 Pzt, 15:56 tarihinde > ?unu yazd?: > >> On Mon, Mar 11, 2019 at 8:27 AM Eda Oktay via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> Hello, >>> >>> I have a following part of a code which tries to change the nonzero >>> values of matrix L with -1. However in MatSetValues line, something happens >>> and some of the values in matrix turns into 1.99665e-314 instead of -1. >>> Type of arr is defined as PetscScalar and arr is produced correctly. What >>> can be the problem, is there a mistake about types? >>> >>> Thanks, >>> >>> Eda >>> >>> >>> for(rw = mm; rw>> >>> ierr = MatGetRow(L,rw,&ncols,&cols,&vals);CHKERRQ(ierr); >>> >>> s = sizeof(vals); >>> >> >> This is wrong. sizeof(vals) gives bytes, not entries. Why don't you just >> use ncols here? >> >> Matt >> >> >>> ierr = PetscMalloc1(s,&arr);CHKERRQ(ierr); >>> >>> for(j=0;j>> >>> arr[j]=-1.0; >>> } >>> ierr = >>> MatSetValues(NSymmA,1,&rw,ncols,cols,arr,INSERT_VALUES);CHKERRQ(ierr); >>> ierr = MatRestoreRow(L,rw,&ncols,&cols,&vals);CHKERRQ(ierr); >>> } >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eda.oktay at metu.edu.tr Mon Mar 11 08:22:27 2019 From: eda.oktay at metu.edu.tr (Eda Oktay) Date: Mon, 11 Mar 2019 16:22:27 +0300 Subject: [petsc-users] Problem in MatSetValues In-Reply-To: References: Message-ID: Dear Matt, I understood that you are right. I changed sizeof(values) with ncols, it gives matrix correctly. However, now I get an error in EPSGetEigenpair: 0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Argument out of range [0]PETSC ERROR: Argument 2 out of range [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.10.3, Dec, 18, 2018 [0]PETSC ERROR: ./ENYENI_FINAL on a arch-linux2-c-debug named 70a.wls.metu.edu.tr by edaoktay Mon Mar 11 16:17:25 2019 [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-cxx-dialect=C++11 --download-openblas --download-metis --download-parmetis --download-superlu_dist --download-slepc --download-mpich [0]PETSC ERROR: #1 EPSGetEigenpair() line 398 in /home/edaoktay/petsc-3.10.3/arch-linux2-c-debug/externalpackages/git.slepc/src/eps/interface/epssolve.c I understood from this error that the matrix does not converge. However, I tested in MATLAB and it converges. The matrix is true now, so this cannot be from the matrix. Thanks, Eda Eda Oktay , 11 Mar 2019 Pzt, 16:07 tarihinde ?unu yazd?: > Dear Matt, > > I printed in wrong state, ncols gives right solution. > > But I still can't understand the first problem. > > Eda > > Eda Oktay , 11 Mar 2019 Pzt, 16:05 tarihinde ?unu > yazd?: > >> Dear Matt, >> >> Thank you for answering. First of all, sizeof(vals) returns to number of >> entries, I checked. Secondly, I found a problem: >> >> ncols gives me 6.95328e-310. However, I checked the matrix L, it was >> computed properly. >> >> Why can ncols give such a value? >> >> Thanks, >> >> Eda >> >> Matthew Knepley , 11 Mar 2019 Pzt, 15:56 tarihinde >> ?unu yazd?: >> >>> On Mon, Mar 11, 2019 at 8:27 AM Eda Oktay via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> >>>> Hello, >>>> >>>> I have a following part of a code which tries to change the nonzero >>>> values of matrix L with -1. However in MatSetValues line, something happens >>>> and some of the values in matrix turns into 1.99665e-314 instead of -1. >>>> Type of arr is defined as PetscScalar and arr is produced correctly. What >>>> can be the problem, is there a mistake about types? >>>> >>>> Thanks, >>>> >>>> Eda >>>> >>>> >>>> for(rw = mm; rw>>> >>>> ierr = MatGetRow(L,rw,&ncols,&cols,&vals);CHKERRQ(ierr); >>>> >>>> s = sizeof(vals); >>>> >>> >>> This is wrong. sizeof(vals) gives bytes, not entries. Why don't you just >>> use ncols here? >>> >>> Matt >>> >>> >>>> ierr = PetscMalloc1(s,&arr);CHKERRQ(ierr); >>>> >>>> for(j=0;j>>> >>>> arr[j]=-1.0; >>>> } >>>> ierr = >>>> MatSetValues(NSymmA,1,&rw,ncols,cols,arr,INSERT_VALUES);CHKERRQ(ierr); >>>> ierr = MatRestoreRow(L,rw,&ncols,&cols,&vals);CHKERRQ(ierr); >>>> } >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. 
>>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Mar 11 08:28:12 2019 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 11 Mar 2019 09:28:12 -0400 Subject: [petsc-users] Problem in MatSetValues In-Reply-To: References: Message-ID: On Mon, Mar 11, 2019 at 9:22 AM Eda Oktay wrote: > Dear Matt, > > I understood that you are right. I changed sizeof(values) with ncols, it > gives matrix correctly. > > However, now I get an error in EPSGetEigenpair: > You have to check how many _eigenvalues_ converged. It sounds like it is less than 2. Matt > 0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Argument out of range > [0]PETSC ERROR: Argument 2 out of range > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.10.3, Dec, 18, 2018 > [0]PETSC ERROR: ./ENYENI_FINAL on a arch-linux2-c-debug named > 70a.wls.metu.edu.tr by edaoktay Mon Mar 11 16:17:25 2019 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-cxx-dialect=C++11 --download-openblas > --download-metis --download-parmetis --download-superlu_dist > --download-slepc --download-mpich > [0]PETSC ERROR: #1 EPSGetEigenpair() line 398 in > /home/edaoktay/petsc-3.10.3/arch-linux2-c-debug/externalpackages/git.slepc/src/eps/interface/epssolve.c > > I understood from this error that the matrix does not converge. However, I > tested in MATLAB and it converges. The matrix is true now, so this cannot > be from the matrix. > > Thanks, > > Eda > > Eda Oktay , 11 Mar 2019 Pzt, 16:07 tarihinde ?unu > yazd?: > >> Dear Matt, >> >> I printed in wrong state, ncols gives right solution. >> >> But I still can't understand the first problem. >> >> Eda >> >> Eda Oktay , 11 Mar 2019 Pzt, 16:05 tarihinde ?unu >> yazd?: >> >>> Dear Matt, >>> >>> Thank you for answering. First of all, sizeof(vals) returns to number of >>> entries, I checked. Secondly, I found a problem: >>> >>> ncols gives me 6.95328e-310. However, I checked the matrix L, it was >>> computed properly. >>> >>> Why can ncols give such a value? >>> >>> Thanks, >>> >>> Eda >>> >>> Matthew Knepley , 11 Mar 2019 Pzt, 15:56 tarihinde >>> ?unu yazd?: >>> >>>> On Mon, Mar 11, 2019 at 8:27 AM Eda Oktay via petsc-users < >>>> petsc-users at mcs.anl.gov> wrote: >>>> >>>>> Hello, >>>>> >>>>> I have a following part of a code which tries to change the nonzero >>>>> values of matrix L with -1. However in MatSetValues line, something happens >>>>> and some of the values in matrix turns into 1.99665e-314 instead of -1. >>>>> Type of arr is defined as PetscScalar and arr is produced correctly. What >>>>> can be the problem, is there a mistake about types? >>>>> >>>>> Thanks, >>>>> >>>>> Eda >>>>> >>>>> >>>>> for(rw = mm; rw>>>> >>>>> ierr = MatGetRow(L,rw,&ncols,&cols,&vals);CHKERRQ(ierr); >>>>> >>>>> s = sizeof(vals); >>>>> >>>> >>>> This is wrong. sizeof(vals) gives bytes, not entries. Why don't you >>>> just use ncols here? 
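(For reference, a corrected version of that loop along the lines suggested above could look as follows. This is only a sketch reusing the variable names from the original snippet; the upper loop bound nn is an assumption, since the original bound was lost in the e-mail formatting:

  for (rw = mm; rw < nn; rw++) {
    ierr = MatGetRow(L, rw, &ncols, &cols, &vals);CHKERRQ(ierr);
    ierr = PetscMalloc1(ncols, &arr);CHKERRQ(ierr);   /* one entry per nonzero in this row */
    for (j = 0; j < ncols; j++) arr[j] = -1.0;
    ierr = MatSetValues(NSymmA, 1, &rw, ncols, cols, arr, INSERT_VALUES);CHKERRQ(ierr);
    ierr = PetscFree(arr);CHKERRQ(ierr);
    ierr = MatRestoreRow(L, rw, &ncols, &cols, &vals);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(NSymmA, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(NSymmA, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

Allocating with ncols instead of sizeof(vals) sizes arr correctly, and the assembly calls are needed once all values have been inserted.)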
>>>> >>>> Matt >>>> >>>> >>>>> ierr = PetscMalloc1(s,&arr);CHKERRQ(ierr); >>>>> >>>>> for(j=0;j>>>> >>>>> arr[j]=-1.0; >>>>> } >>>>> ierr = >>>>> MatSetValues(NSymmA,1,&rw,ncols,cols,arr,INSERT_VALUES);CHKERRQ(ierr); >>>>> ierr = >>>>> MatRestoreRow(L,rw,&ncols,&cols,&vals);CHKERRQ(ierr); >>>>> } >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From myriam.peyrounette at idris.fr Mon Mar 11 08:32:07 2019 From: myriam.peyrounette at idris.fr (Myriam Peyrounette) Date: Mon, 11 Mar 2019 14:32:07 +0100 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: References: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> Message-ID: <7b104336-a067-e679-23ec-2a89e0ba9bc4@idris.fr> The code I am using here is the example 42 of PETSc (https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html). Indeed it solves the Stokes equation. I thought it was a good idea to use an example you might know (and didn't find any that uses GAMG functions). I just changed the PCMG setup so that the memory problem appears. And it appears when adding PCGAMG. I don't care about the performance or even the result rightness here, but only about the difference in memory use between 3.6 and 3.10. Do you think finding a more adapted script would help? I used the threshold of 0.1 only once, at the beginning, to test its influence. I used the default threshold (of 0, I guess) for all the other runs. Myriam Le 03/11/19 ? 13:52, Mark Adams a ?crit?: > In looking at this larger scale run ... > > * Your eigen estimates are much lower than your tiny test problem.? > But this is Stokes apparently and it should not work anyway. Maybe you > have a small time step that adds a lot of mass that brings the eigen > estimates down. And your min eigenvalue (not used) is positive. I > would expect negative for Stokes ... > > * You seem to be setting a threshold value of 0.1 -- that is very high > > * v3.6 says "using nonzero initial guess" but this is not in v3.10. > Maybe we just stopped printing that. > > * There were some changes to coasening parameters in going from v3.6 > but it does not look like your problem was effected. (The coarsening > algo is non-deterministic by default and you can see small difference > on different runs) > > * We may have also added a "noisy" RHS for eigen estimates by default > from v3.6. > > * And for non-symetric problems you can try -pc_gamg_agg_nsmooths 0, > but again GAMG is not built for Stokes anyway. > > > On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette > > wrote: > > I used PCView to display the size of the linear system in each > level of the MG. You'll find the outputs attached to this mail > (zip file) for both the default threshold value and a value of > 0.1, and for both 3.6 and 3.10 PETSc versions. > > For convenience, I summarized the information in a graph, also > attached (png file). > > As you can see, there are slight differences between the two > versions but none is critical, in my opinion. 
Do you see anything > suspicious in the outputs? > > + I can't find the default threshold value. Do you know where I > can find it? > > Thanks for the follow-up > > Myriam > > > Le 03/05/19 ? 14:06, Matthew Knepley a ?crit?: >> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette >> > > wrote: >> >> Hi Matt, >> >> I plotted the memory scalings using different threshold >> values. The two scalings are slightly translated (from -22 to >> -88 mB) but this gain is neglectable. The 3.6-scaling keeps >> being robust while the 3.10-scaling deteriorates. >> >> Do you have any other suggestion? >> >> Mark, what is the option she can give to output all the GAMG data? >> >> Also, run using -ksp_view. GAMG will report all the sizes of its >> grids, so it should be easy to see >> if the coarse grid sizes are increasing, and also what the effect >> of the threshold value is. >> >> ? Thanks, >> >> ? ? ?Matt? >> >> Thanks >> >> Myriam >> >> Le 03/02/19 ? 02:27, Matthew Knepley a ?crit?: >>> On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via >>> petsc-users >> > wrote: >>> >>> Hi, >>> >>> I used to run my code with PETSc 3.6. Since I upgraded >>> the PETSc version >>> to 3.10, this code has a bad memory scaling. >>> >>> To report this issue, I took the PETSc script ex42.c and >>> slightly >>> modified it so that the KSP and PC configurations are >>> the same as in my >>> code. In particular, I use a "personnalised" multi-grid >>> method. The >>> modifications are indicated by the keyword "TopBridge" >>> in the attached >>> scripts. >>> >>> To plot the memory (weak) scaling, I ran four >>> calculations for each >>> script with increasing problem sizes and computations cores: >>> >>> 1. 100,000 elts on 4 cores >>> 2. 1 million elts on 40 cores >>> 3. 10 millions elts on 400 cores >>> 4. 100 millions elts on 4,000 cores >>> >>> The resulting graph is also attached. The scaling using >>> PETSc 3.10 >>> clearly deteriorates for large cases, while the one >>> using PETSc 3.6 is >>> robust. >>> >>> After a few tests, I found that the scaling is mostly >>> sensitive to the >>> use of the AMG method for the coarse grid (line 1780 in >>> main_ex42_petsc36.cc). In particular, the performance >>> strongly >>> deteriorates when commenting lines 1777 to 1790 (in >>> main_ex42_petsc36.cc). >>> >>> Do you have any idea of what changed between version 3.6 >>> and version >>> 3.10 that may imply such degradation? >>> >>> >>> I believe the default values for PCGAMG changed between >>> versions. It sounds like the coarsening rate >>> is not great enough, so that these grids are too large. This >>> can be set using: >>> >>> ??https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html >>> >>> There is some explanation of this effect on that page. Let >>> us know if setting this does not correct the situation. >>> >>> ? Thanks, >>> >>> ? ? ?Matt >>> ? >>> >>> Let me know if you need further information. >>> >>> Best, >>> >>> Myriam Peyrounette >>> >>> >>> -- >>> Myriam Peyrounette >>> CNRS/IDRIS - HLST >>> -- >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin >>> their experiments is infinitely more interesting than any >>> results to which their experiments lead. 
>>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >> >> -- >> Myriam Peyrounette >> CNRS/IDRIS - HLST >> -- >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> > > -- > Myriam Peyrounette > CNRS/IDRIS - HLST > -- > -- Myriam Peyrounette CNRS/IDRIS - HLST -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2975 bytes Desc: Signature cryptographique S/MIME URL: From mfadams at lbl.gov Mon Mar 11 08:36:54 2019 From: mfadams at lbl.gov (Mark Adams) Date: Mon, 11 Mar 2019 09:36:54 -0400 Subject: [petsc-users] MatCreate performance In-Reply-To: References: <87o96l9wzu.fsf@jedbrown.org> Message-ID: The PETSc logs print the max time and the ratio max/min. On Mon, Mar 11, 2019 at 8:24 AM Ale Foggia via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello all, > > Thanks for your answers. > > 1) I'm working with a matrix with a linear size of 2**34, but it's a > sparse matrix, and the number of elements different from zero is > 43,207,072,74. I know that the distribution of these elements is not > balanced between the processes, the matrix is more populated in the middle > part. > > 2) I initialize Slepc. Then I create the basis elements of the system > (this part does not involve Petsc/Slepc, and every process is just > computing -and owns- an equal amount of basis elements). Then I call: > ierr = MatCreate(PETSC_COMM_WORLD, &A); CHKERRQ(ierr); > ierr = MatSetType(A, MATMPIAIJ); CHKERRQ(ierr); > ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, size, size); > CHKERRQ(ierr); > ierr = MatMPIAIJSetPreallocation(A, 0, d_nnz, 0, o_nnz); CHKERRQ(ierr); > ierr = MatZeroEntries(A); CHKERRQ(ierr); > After this, I compute the elements of the matrix and set the values with > MatSetValues. The I call EPSSolve (with KrylovSchur and setting the type as > EPS_HEP). > > 3) There are a few more things that are strange to me. I measure the > execution time of these parts both with a PetscLogStage and with a > std::chrono (in nanoseconds) clock. I understand that the time given by the > Log is an average over the processes right? In the case of the std::chrono, > I'm only printing the times from process 0 (no average over processes). > What I see is the following: > 1024 procs 2048 procs 4096 > procs 8192 procs > Log std Log std > Log std Log std > MatCreate 68.42 122.7 67.08 121.2 62.29 116 > 73.36 127.4 > preallocation 140.36 140.3 76.45 76.45 40.31 > 40.3 21.13 21.12 > MatSetValues 237.79 237.7 116.6 116.6 60.59 60.59 > 35.32 35.32 > ESPSolve 162.8 160 95.8 94.2 62.17 > 60.63 41.16 40.24 > > - So, all the times (including the total execution time that I'm not > showing here) are the same between PetscLogStage and the std::chrono clock, > except for the part of MatCreate. Maybe that part is very unbalanced? > - The time of the MatCreate given by the PetscLogStage is not changing. > > Ale > > El vie., 8 mar. 2019 a las 17:00, Jed Brown () escribi?: > >> This is very unusual. MatCreate() does no work, merely dup'ing a >> communicator (or referencing an inner communicator if this is not the >> first PetscObject on the provided communicator). What size matrices are >> you working with? 
Can you send some performance data and (if feasible) >> a reproducer? >> >> Ale Foggia via petsc-users writes: >> >> > Hello all, >> > >> > I have a problem with the scaling of the MatCreate() function. I wrote a >> > code to diagonalize sparse matrices and I'm running it in parallel. I've >> > observed a very bad speedup of the code and it's given by the MatCreate >> > part of it: for a fixed matrix size, when I increase the number of >> > processes the time taken by the function also increases. I wanted to >> know >> > if you expect this behavior or if maybe there's something wrong with my >> > code. When I go to (what I consider) very big matrix sizes, and >> depending >> > on the number of mpi processes, in some cases, MatCreate takes more time >> > than the time the solver takes to solve the system for one eigenvalue or >> > the time it takes to set up the values. >> > >> > Ale >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Mon Mar 11 08:40:40 2019 From: mfadams at lbl.gov (Mark Adams) Date: Mon, 11 Mar 2019 09:40:40 -0400 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: <7b104336-a067-e679-23ec-2a89e0ba9bc4@idris.fr> References: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> <7b104336-a067-e679-23ec-2a89e0ba9bc4@idris.fr> Message-ID: Is there a difference in memory usage on your tiny problem? I assume no. I don't see anything that could come from GAMG other than the RAP stuff that you have discussed already. On Mon, Mar 11, 2019 at 9:32 AM Myriam Peyrounette < myriam.peyrounette at idris.fr> wrote: > The code I am using here is the example 42 of PETSc ( > https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html). > Indeed it solves the Stokes equation. I thought it was a good idea to use > an example you might know (and didn't find any that uses GAMG functions). I > just changed the PCMG setup so that the memory problem appears. And it > appears when adding PCGAMG. > > I don't care about the performance or even the result rightness here, but > only about the difference in memory use between 3.6 and 3.10. Do you think > finding a more adapted script would help? > > I used the threshold of 0.1 only once, at the beginning, to test its > influence. I used the default threshold (of 0, I guess) for all the other > runs. > > Myriam > > Le 03/11/19 ? 13:52, Mark Adams a ?crit : > > In looking at this larger scale run ... > > * Your eigen estimates are much lower than your tiny test problem. But > this is Stokes apparently and it should not work anyway. Maybe you have a > small time step that adds a lot of mass that brings the eigen estimates > down. And your min eigenvalue (not used) is positive. I would expect > negative for Stokes ... > > * You seem to be setting a threshold value of 0.1 -- that is very high > > * v3.6 says "using nonzero initial guess" but this is not in v3.10. Maybe > we just stopped printing that. > > * There were some changes to coasening parameters in going from v3.6 but > it does not look like your problem was effected. (The coarsening algo is > non-deterministic by default and you can see small difference on different > runs) > > * We may have also added a "noisy" RHS for eigen estimates by default from > v3.6. > > * And for non-symetric problems you can try -pc_gamg_agg_nsmooths 0, but > again GAMG is not built for Stokes anyway. 
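(One way to quantify the memory difference between the two builds on the tiny problem, without changing the code, is to let PETSc report it directly, e.g. something along the lines of

  mpiexec -n 4 ./ex42 <same options as before> -memory_view -log_view

and then diff the resulting summaries from the 3.6 and 3.10 runs. The executable name and option list are placeholders for whatever is already being run; -memory_view and -log_view are standard PETSc runtime options, not something specific to this example.)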
> > > On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette < > myriam.peyrounette at idris.fr> wrote: > >> I used PCView to display the size of the linear system in each level of >> the MG. You'll find the outputs attached to this mail (zip file) for both >> the default threshold value and a value of 0.1, and for both 3.6 and 3.10 >> PETSc versions. >> >> For convenience, I summarized the information in a graph, also attached >> (png file). >> >> As you can see, there are slight differences between the two versions but >> none is critical, in my opinion. Do you see anything suspicious in the >> outputs? >> >> + I can't find the default threshold value. Do you know where I can find >> it? >> >> Thanks for the follow-up >> >> Myriam >> >> Le 03/05/19 ? 14:06, Matthew Knepley a ?crit : >> >> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette < >> myriam.peyrounette at idris.fr> wrote: >> >>> Hi Matt, >>> >>> I plotted the memory scalings using different threshold values. The two >>> scalings are slightly translated (from -22 to -88 mB) but this gain is >>> neglectable. The 3.6-scaling keeps being robust while the 3.10-scaling >>> deteriorates. >>> >>> Do you have any other suggestion? >>> >> Mark, what is the option she can give to output all the GAMG data? >> >> Also, run using -ksp_view. GAMG will report all the sizes of its grids, >> so it should be easy to see >> if the coarse grid sizes are increasing, and also what the effect of the >> threshold value is. >> >> Thanks, >> >> Matt >> >>> Thanks >>> Myriam >>> >>> Le 03/02/19 ? 02:27, Matthew Knepley a ?crit : >>> >>> On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> >>>> Hi, >>>> >>>> I used to run my code with PETSc 3.6. Since I upgraded the PETSc version >>>> to 3.10, this code has a bad memory scaling. >>>> >>>> To report this issue, I took the PETSc script ex42.c and slightly >>>> modified it so that the KSP and PC configurations are the same as in my >>>> code. In particular, I use a "personnalised" multi-grid method. The >>>> modifications are indicated by the keyword "TopBridge" in the attached >>>> scripts. >>>> >>>> To plot the memory (weak) scaling, I ran four calculations for each >>>> script with increasing problem sizes and computations cores: >>>> >>>> 1. 100,000 elts on 4 cores >>>> 2. 1 million elts on 40 cores >>>> 3. 10 millions elts on 400 cores >>>> 4. 100 millions elts on 4,000 cores >>>> >>>> The resulting graph is also attached. The scaling using PETSc 3.10 >>>> clearly deteriorates for large cases, while the one using PETSc 3.6 is >>>> robust. >>>> >>>> After a few tests, I found that the scaling is mostly sensitive to the >>>> use of the AMG method for the coarse grid (line 1780 in >>>> main_ex42_petsc36.cc). In particular, the performance strongly >>>> deteriorates when commenting lines 1777 to 1790 (in >>>> main_ex42_petsc36.cc). >>>> >>>> Do you have any idea of what changed between version 3.6 and version >>>> 3.10 that may imply such degradation? >>>> >>> >>> I believe the default values for PCGAMG changed between versions. It >>> sounds like the coarsening rate >>> is not great enough, so that these grids are too large. This can be set >>> using: >>> >>> >>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html >>> >>> There is some explanation of this effect on that page. Let us know if >>> setting this does not correct the situation. 
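(To make the suggestion above concrete: with the ex42-based setup the coarsening threshold can be changed at run time, without modifying the code, along the lines of

  mpiexec -n 4 ./ex42 <existing options> -pc_gamg_threshold 0.05 -ksp_view

where 0.05 is just an illustrative value and -ksp_view prints the grid size of every multigrid level, so the effect on the coarsening rate is visible directly. Running once with -help and searching the output for gamg_threshold also shows the current default, which answers the earlier question about where to find it. The threshold can equally be set in code with PCGAMGSetThreshold(), but note that its calling sequence changed between the 3.6 and 3.10 releases, so the options database is the more portable route for this comparison.)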
>>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Let me know if you need further information. >>>> >>>> Best, >>>> >>>> Myriam Peyrounette >>>> >>>> >>>> -- >>>> Myriam Peyrounette >>>> CNRS/IDRIS - HLST >>>> -- >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >>> >>> -- >>> Myriam Peyrounette >>> CNRS/IDRIS - HLST >>> -- >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> -- >> Myriam Peyrounette >> CNRS/IDRIS - HLST >> -- >> >> > -- > Myriam Peyrounette > CNRS/IDRIS - HLST > -- > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From myriam.peyrounette at idris.fr Mon Mar 11 08:53:22 2019 From: myriam.peyrounette at idris.fr (Myriam Peyrounette) Date: Mon, 11 Mar 2019 14:53:22 +0100 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: References: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> <7b104336-a067-e679-23ec-2a89e0ba9bc4@idris.fr> Message-ID: There is a small difference in memory usage already (of 135mB). It is not a big deal but it will be for larger problems (as shown by the memory scaling). If we find the origin of this small gap for a small case, we probably find the reason why the memory scaling is so bad with 3.10. I am currently looking for the exact commit where the problem arises, using git bisect. I'll let you know about the result. Le 03/11/19 ? 14:40, Mark Adams a ?crit?: > Is there a difference in memory usage on your tiny problem? I assume no. > > I don't see anything that could come from GAMG other than the RAP > stuff that you have discussed already. > > On Mon, Mar 11, 2019 at 9:32 AM Myriam Peyrounette > > wrote: > > The code I am using here is the example 42 of PETSc > (https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html). > Indeed it solves the Stokes equation. I thought it was a good idea > to use an example you might know (and didn't find any that uses > GAMG functions). I just changed the PCMG setup so that the memory > problem appears. And it appears when adding PCGAMG. > > I don't care about the performance or even the result rightness > here, but only about the difference in memory use between 3.6 and > 3.10. Do you think finding a more adapted script would help? > > I used the threshold of 0.1 only once, at the beginning, to test > its influence. I used the default threshold (of 0, I guess) for > all the other runs. > > Myriam > > > Le 03/11/19 ? 13:52, Mark Adams a ?crit?: >> In looking at this larger scale run ... >> >> * Your eigen estimates are much lower than your tiny test >> problem.? But this is Stokes apparently and it should not work >> anyway. Maybe you have a small time step that adds a lot of mass >> that brings the eigen estimates down. And your min eigenvalue >> (not used) is positive. I would expect negative for Stokes ... >> >> * You seem to be setting a threshold value of 0.1 -- that is very >> high >> >> * v3.6 says "using nonzero initial guess" but this is not in >> v3.10. Maybe we just stopped printing that. >> >> * There were some changes to coasening parameters in going from >> v3.6 but it does not look like your problem was effected. 
(The >> coarsening algo is non-deterministic by default and you can see >> small difference on different runs) >> >> * We may have also added a "noisy" RHS for eigen estimates by >> default from v3.6. >> >> * And for non-symetric problems you can try -pc_gamg_agg_nsmooths >> 0, but again GAMG is not built for Stokes anyway. >> >> >> On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette >> > > wrote: >> >> I used PCView to display the size of the linear system in >> each level of the MG. You'll find the outputs attached to >> this mail (zip file) for both the default threshold value and >> a value of 0.1, and for both 3.6 and 3.10 PETSc versions. >> >> For convenience, I summarized the information in a graph, >> also attached (png file). >> >> As you can see, there are slight differences between the two >> versions but none is critical, in my opinion. Do you see >> anything suspicious in the outputs? >> >> + I can't find the default threshold value. Do you know where >> I can find it? >> >> Thanks for the follow-up >> >> Myriam >> >> >> Le 03/05/19 ? 14:06, Matthew Knepley a ?crit?: >>> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette >>> >> > wrote: >>> >>> Hi Matt, >>> >>> I plotted the memory scalings using different threshold >>> values. The two scalings are slightly translated (from >>> -22 to -88 mB) but this gain is neglectable. The >>> 3.6-scaling keeps being robust while the 3.10-scaling >>> deteriorates. >>> >>> Do you have any other suggestion? >>> >>> Mark, what is the option she can give to output all the GAMG >>> data? >>> >>> Also, run using -ksp_view. GAMG will report all the sizes of >>> its grids, so it should be easy to see >>> if the coarse grid sizes are increasing, and also what the >>> effect of the threshold value is. >>> >>> ? Thanks, >>> >>> ? ? ?Matt? >>> >>> Thanks >>> >>> Myriam >>> >>> Le 03/02/19 ? 02:27, Matthew Knepley a ?crit?: >>>> On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via >>>> petsc-users >>> > wrote: >>>> >>>> Hi, >>>> >>>> I used to run my code with PETSc 3.6. Since I >>>> upgraded the PETSc version >>>> to 3.10, this code has a bad memory scaling. >>>> >>>> To report this issue, I took the PETSc script >>>> ex42.c and slightly >>>> modified it so that the KSP and PC configurations >>>> are the same as in my >>>> code. In particular, I use a "personnalised" >>>> multi-grid method. The >>>> modifications are indicated by the keyword >>>> "TopBridge" in the attached >>>> scripts. >>>> >>>> To plot the memory (weak) scaling, I ran four >>>> calculations for each >>>> script with increasing problem sizes and >>>> computations cores: >>>> >>>> 1. 100,000 elts on 4 cores >>>> 2. 1 million elts on 40 cores >>>> 3. 10 millions elts on 400 cores >>>> 4. 100 millions elts on 4,000 cores >>>> >>>> The resulting graph is also attached. The scaling >>>> using PETSc 3.10 >>>> clearly deteriorates for large cases, while the one >>>> using PETSc 3.6 is >>>> robust. >>>> >>>> After a few tests, I found that the scaling is >>>> mostly sensitive to the >>>> use of the AMG method for the coarse grid (line 1780 in >>>> main_ex42_petsc36.cc). In particular, the >>>> performance strongly >>>> deteriorates when commenting lines 1777 to 1790 (in >>>> main_ex42_petsc36.cc). >>>> >>>> Do you have any idea of what changed between >>>> version 3.6 and version >>>> 3.10 that may imply such degradation? >>>> >>>> >>>> I believe the default values for PCGAMG changed between >>>> versions. 
It sounds like the coarsening rate >>>> is not great enough, so that these grids are too large. >>>> This can be set using: >>>> >>>> ??https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html >>>> >>>> There is some explanation of this effect on that page. >>>> Let us know if setting this does not correct the situation. >>>> >>>> ? Thanks, >>>> >>>> ? ? ?Matt >>>> ? >>>> >>>> Let me know if you need further information. >>>> >>>> Best, >>>> >>>> Myriam Peyrounette >>>> >>>> >>>> -- >>>> Myriam Peyrounette >>>> CNRS/IDRIS - HLST >>>> -- >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they >>>> begin their experiments is infinitely more interesting >>>> than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>> >>> -- >>> Myriam Peyrounette >>> CNRS/IDRIS - HLST >>> -- >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin >>> their experiments is infinitely more interesting than any >>> results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >> >> -- >> Myriam Peyrounette >> CNRS/IDRIS - HLST >> -- >> > > -- > Myriam Peyrounette > CNRS/IDRIS - HLST > -- > -- Myriam Peyrounette CNRS/IDRIS - HLST -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2975 bytes Desc: Signature cryptographique S/MIME URL: From pietro.benedusi at usi.ch Mon Mar 11 09:42:21 2019 From: pietro.benedusi at usi.ch (Pietro Benedusi) Date: Mon, 11 Mar 2019 14:42:21 +0000 Subject: [petsc-users] Preconditioner in multigrid solver Message-ID: <962D5726-4295-4CBD-B88D-55D189D6ABFC@usi.ch> Dear Petsc team, I have a question about the setting up of a multigrid solver. I would like yo use a PCG smoother, preconditioned with a mass matrix, just on the fine level. But when add the line for preconditioning the CG with the mass matrix my MG diverges. I have implemented the same solver in MATLAB and it converges fine. Also the operators in PETSc are the same and the PCG applied directly on the problem (without MG) works the same in both PETSC and MATLAB. 
This is what I do in PETSC for 2 levels: KSP space_solver; ierr = KSPCreate(PETSC_COMM_WORLD,&space_solver);CHKERRQ(ierr); ierr = KSPSetTolerances(space_solver,1e-8,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr); ierr = KSPSetOperators(space_solver, K, K);CHKERRQ(ierr); ierr = KSPSetNormType(space_solver, KSP_NORM_UNPRECONDITIONED );CHKERRQ(ierr); ierr = KSPSetType(space_solver,KSPRICHARDSON);CHKERRQ(ierr); ierr = KSPSetFromOptions(space_solver);CHKERRQ(ierr); ierr = KSPSetUp(space_solver);CHKERRQ(ierr); PC pcmg; ierr = KSPGetPC(space_solver, &pcmg); ierr = PCSetType(pcmg, PCMG); ierr = PCMGSetLevels(pcmg,levels, NULL);CHKERRQ(ierr); ierr = PCMGSetGalerkin(pcmg,PC_MG_GALERKIN_BOTH);CHKERRQ(ierr); // smoothers for (int i = 1; i < levels; ++i) { KSP smoother; ierr = PCMGGetSmoother(pcmg, i, &smoother);CHKERRQ(ierr); ierr = KSPSetType(smoother, KSPCG);CHKERRQ(ierr); ierr = KSPSetOperators(smoother, K, M);CHKERRQ(ierr); // ierr = KSPSetUp(smoother);CHKERRQ(ierr); ierr = KSPSetTolerances(smoother,1e-12,PETSC_DEFAULT,PETSC_DEFAULT,s_p);CHKERRQ(ierr); ierr = KSPSetNormType(smoother, KSP_NORM_NONE);CHKERRQ(ierr); PC sm; ierr = KSPGetPC(smoother, &sm);CHKERRQ(ierr); ierr = PCSetType(sm, PCLU);CHKERRQ(ierr); ierr = PCMGSetInterpolation(pcmg, i, interpolation_operators[i-1]);CHKERRQ(ierr); } I think there is a problem with the PETSc syntax, because I checked everything else and it is fine. Do you any ideas? Thank you very much! Best, Pietro ~~~~~~~~~~~~ Pietro Benedusi Numerical Simulation in Science, Medicine and Engineering research group ICS, Institute of Computational Science USI, Universit? della Svizzera Italiana Via Giuseppe Buffi, 13 CH - 6900 Lugano benedp at usi.ch -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Mar 11 11:28:11 2019 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 11 Mar 2019 12:28:11 -0400 Subject: [petsc-users] Preconditioner in multigrid solver In-Reply-To: <962D5726-4295-4CBD-B88D-55D189D6ABFC@usi.ch> References: <962D5726-4295-4CBD-B88D-55D189D6ABFC@usi.ch> Message-ID: On Mon, Mar 11, 2019 at 10:42 AM Pietro Benedusi via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear Petsc team, > > I have a question about the setting up of a multigrid solver. > > I would like yo use a PCG smoother, preconditioned with a mass matrix, > just on the fine level. > But when add the line for preconditioning the CG with the mass matrix my > MG diverges. > > I have implemented the same solver in MATLAB and it converges fine. Also > the operators in PETSc are the same and the PCG applied directly on the > problem (without MG) works the same in both PETSC and MATLAB. > > This is what I do in PETSC for 2 levels: > Send the output of -ksp_view so we can see what the exact solver is. It will not tell whether the matrices you set are the correct ones, but at least we can see the solver. Also, did you look at the residual decrease? Is the smoother working on every level but the finest? Is the coarse grid correction effective? 
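(A convenient way to collect that information in a single run, assuming the outer KSP uses the default options prefix, is something like

  ./your_solver -ksp_view -ksp_monitor_true_residual -mg_levels_ksp_monitor -mg_coarse_ksp_monitor

which prints the complete solver configuration, the outer residual history, and the residual histories of the level smoothers and of the coarse solve. The executable name is a placeholder here, and with a prefixed outer KSP the options need that prefix prepended.)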
Thanks, Matt > > KSP space_solver; > > ierr = KSPCreate(PETSC_COMM_WORLD,&space_solver);CHKERRQ(ierr); > ierr = > KSPSetTolerances(space_solver,1e-8,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr); > > ierr = KSPSetOperators(space_solver, K, K);CHKERRQ(ierr); > ierr = KSPSetNormType(space_solver, KSP_NORM_UNPRECONDITIONED > );CHKERRQ(ierr); > > ierr = KSPSetType(space_solver,KSPRICHARDSON);CHKERRQ(ierr); > ierr = KSPSetFromOptions(space_solver);CHKERRQ(ierr); > ierr = KSPSetUp(space_solver);CHKERRQ(ierr); > > PC pcmg; > ierr = KSPGetPC(space_solver, &pcmg); > ierr = PCSetType(pcmg, PCMG); > ierr = PCMGSetLevels(pcmg,levels, NULL);CHKERRQ(ierr); > ierr = PCMGSetGalerkin(pcmg,PC_MG_GALERKIN_BOTH);CHKERRQ(ierr); > > // smoothers > for (int i = 1; i < levels; ++i) > { > KSP smoother; > ierr = PCMGGetSmoother(pcmg, i, &smoother);CHKERRQ(ierr); > > ierr = KSPSetType(smoother, KSPCG);CHKERRQ(ierr); > ierr = KSPSetOperators(smoother, K, M);CHKERRQ(ierr); > > > // ierr = KSPSetUp(smoother);CHKERRQ(ierr); > > ierr = > KSPSetTolerances(smoother,1e-12,PETSC_DEFAULT,PETSC_DEFAULT,s_p);CHKERRQ(ierr); > > ierr = KSPSetNormType(smoother, KSP_NORM_NONE);CHKERRQ(ierr); > > PC sm; > ierr = KSPGetPC(smoother, &sm);CHKERRQ(ierr); > ierr = PCSetType(sm, PCLU);CHKERRQ(ierr); > > ierr = PCMGSetInterpolation(pcmg, i, > interpolation_operators[i-1]);CHKERRQ(ierr); > } > > > > I think there is a problem with the PETSc syntax, because I checked > everything else and it is fine. > > Do you any ideas? > > Thank you very much! > > Best, > Pietro > > ~~~~~~~~~~~~ > Pietro Benedusi > > Numerical Simulation in Science, > Medicine and Engineering research group > ICS, Institute of Computational Science > USI, Universit? della Svizzera Italiana > Via Giuseppe Buffi, 13 > CH - 6900 Lugano > benedp at usi.ch > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Mon Mar 11 12:38:14 2019 From: mfadams at lbl.gov (Mark Adams) Date: Mon, 11 Mar 2019 13:38:14 -0400 Subject: [petsc-users] Preconditioner in multigrid solver In-Reply-To: <962D5726-4295-4CBD-B88D-55D189D6ABFC@usi.ch> References: <962D5726-4295-4CBD-B88D-55D189D6ABFC@usi.ch> Message-ID: You are giving all levels the same matrices (K & M). This code should not work. You are using LU as the smother. This will solve the problem immediately. If MG is setup correctly then you will just have zero residuals and corrections for the rest of the solve. And you set the relative tolerance to 1.e-12, which will solve the problem with whatever smoother you use. On Mon, Mar 11, 2019 at 10:42 AM Pietro Benedusi via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear Petsc team, > > I have a question about the setting up of a multigrid solver. > > I would like yo use a PCG smoother, preconditioned with a mass matrix, > just on the fine level. > But when add the line for preconditioning the CG with the mass matrix my > MG diverges. > > I have implemented the same solver in MATLAB and it converges fine. Also > the operators in PETSc are the same and the PCG applied directly on the > problem (without MG) works the same in both PETSC and MATLAB. 
> > This is what I do in PETSC for 2 levels: > > > KSP space_solver; > > ierr = KSPCreate(PETSC_COMM_WORLD,&space_solver);CHKERRQ(ierr); > ierr = > KSPSetTolerances(space_solver,1e-8,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr); > > ierr = KSPSetOperators(space_solver, K, K);CHKERRQ(ierr); > ierr = KSPSetNormType(space_solver, KSP_NORM_UNPRECONDITIONED > );CHKERRQ(ierr); > > ierr = KSPSetType(space_solver,KSPRICHARDSON);CHKERRQ(ierr); > ierr = KSPSetFromOptions(space_solver);CHKERRQ(ierr); > ierr = KSPSetUp(space_solver);CHKERRQ(ierr); > > PC pcmg; > ierr = KSPGetPC(space_solver, &pcmg); > ierr = PCSetType(pcmg, PCMG); > ierr = PCMGSetLevels(pcmg,levels, NULL);CHKERRQ(ierr); > ierr = PCMGSetGalerkin(pcmg,PC_MG_GALERKIN_BOTH);CHKERRQ(ierr); > > // smoothers > for (int i = 1; i < levels; ++i) > { > KSP smoother; > ierr = PCMGGetSmoother(pcmg, i, &smoother);CHKERRQ(ierr); > > ierr = KSPSetType(smoother, KSPCG);CHKERRQ(ierr); > ierr = KSPSetOperators(smoother, K, M);CHKERRQ(ierr); > > > // ierr = KSPSetUp(smoother);CHKERRQ(ierr); > > ierr = > KSPSetTolerances(smoother,1e-12,PETSC_DEFAULT,PETSC_DEFAULT,s_p);CHKERRQ(ierr); > > ierr = KSPSetNormType(smoother, KSP_NORM_NONE);CHKERRQ(ierr); > > PC sm; > ierr = KSPGetPC(smoother, &sm);CHKERRQ(ierr); > ierr = PCSetType(sm, PCLU);CHKERRQ(ierr); > > ierr = PCMGSetInterpolation(pcmg, i, > interpolation_operators[i-1]);CHKERRQ(ierr); > } > > > > I think there is a problem with the PETSc syntax, because I checked > everything else and it is fine. > > Do you any ideas? > > Thank you very much! > > Best, > Pietro > > ~~~~~~~~~~~~ > Pietro Benedusi > > Numerical Simulation in Science, > Medicine and Engineering research group > ICS, Institute of Computational Science > USI, Universit? della Svizzera Italiana > Via Giuseppe Buffi, 13 > CH - 6900 Lugano > benedp at usi.ch > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Mar 11 13:05:39 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Mon, 11 Mar 2019 18:05:39 +0000 Subject: [petsc-users] Preconditioner in multigrid solver In-Reply-To: <962D5726-4295-4CBD-B88D-55D189D6ABFC@usi.ch> References: <962D5726-4295-4CBD-B88D-55D189D6ABFC@usi.ch> Message-ID: > On Mar 11, 2019, at 9:42 AM, Pietro Benedusi via petsc-users wrote: > > Dear Petsc team, > > I have a question about the setting up of a multigrid solver. > > I would like yo use a PCG smoother, preconditioned with a mass matrix, just on the fine level. > But when add the line for preconditioning the CG with the mass matrix my MG diverges. > > I have implemented the same solver in MATLAB and it converges fine. Also the operators in PETSc are the same and the PCG applied directly on the problem (without MG) works the same in both PETSC and MATLAB. 
> > This is what I do in PETSC for 2 levels: > > > KSP space_solver; > > ierr = KSPCreate(PETSC_COMM_WORLD,&space_solver);CHKERRQ(ierr); > ierr = KSPSetTolerances(space_solver,1e-8,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr); > ierr = KSPSetOperators(space_solver, K, K);CHKERRQ(ierr); > ierr = KSPSetNormType(space_solver, KSP_NORM_UNPRECONDITIONED );CHKERRQ(ierr); > > ierr = KSPSetType(space_solver,KSPRICHARDSON);CHKERRQ(ierr); > ierr = KSPSetFromOptions(space_solver);CHKERRQ(ierr); > ierr = KSPSetUp(space_solver);CHKERRQ(ierr); > > PC pcmg; > ierr = KSPGetPC(space_solver, &pcmg); > ierr = PCSetType(pcmg, PCMG); > ierr = PCMGSetLevels(pcmg,levels, NULL);CHKERRQ(ierr); > ierr = PCMGSetGalerkin(pcmg,PC_MG_GALERKIN_BOTH);CHKERRQ(ierr); > > // smoothers > for (int i = 1; i < levels; ++i) > { > KSP smoother; > ierr = PCMGGetSmoother(pcmg, i, &smoother);CHKERRQ(ierr); > > ierr = KSPSetType(smoother, KSPCG);CHKERRQ(ierr); > ierr = KSPSetOperators(smoother, K, M);CHKERRQ(ierr); I'm not sure what you mean by "preconditioned with a mass matrix" but I don't think this line makes sense. The last argument to this call is the matrix from which the preconditioner is constructed. I don't think it ever makes sense to precondition a Laplacian with a mass matrix; they are very different beasts. Barry > > > // ierr = KSPSetUp(smoother);CHKERRQ(ierr); > > ierr = KSPSetTolerances(smoother,1e-12,PETSC_DEFAULT,PETSC_DEFAULT,s_p);CHKERRQ(ierr); > ierr = KSPSetNormType(smoother, KSP_NORM_NONE);CHKERRQ(ierr); > > PC sm; > ierr = KSPGetPC(smoother, &sm);CHKERRQ(ierr); > ierr = PCSetType(sm, PCLU);CHKERRQ(ierr); > > ierr = PCMGSetInterpolation(pcmg, i, interpolation_operators[i-1]);CHKERRQ(ierr); > } > > > > I think there is a problem with the PETSc syntax, because I checked everything else and it is fine. > > Do you any ideas? > > Thank you very much! > > Best, > Pietro > > ~~~~~~~~~~~~ > Pietro Benedusi > > Numerical Simulation in Science, > Medicine and Engineering research group > ICS, Institute of Computational Science > USI, Universit? della Svizzera Italiana > Via Giuseppe Buffi, 13 > CH - 6900 Lugano > benedp at usi.ch > > From yyang85 at stanford.edu Mon Mar 11 15:17:46 2019 From: yyang85 at stanford.edu (Yuyun Yang) Date: Mon, 11 Mar 2019 20:17:46 +0000 Subject: [petsc-users] Conceptual question about DMDA Message-ID: Hello team, May I know for what types of computations is DMDA better to use compared to regular Vec/Mat? It is more complicated in terms of usage, thus so far I've only used Vec/Mat. Would DMDA improve the performance of solving large linear systems (say for variable grid spacing as a result of coordinate transforms, with finite difference method)? What considerations should go into implementing it? Thank you very much! Yuyun -------------- next part -------------- An HTML attachment was scrubbed... URL: From maahi.buet at gmail.com Mon Mar 11 15:17:32 2019 From: maahi.buet at gmail.com (Maahi Talukder) Date: Mon, 11 Mar 2019 16:17:32 -0400 Subject: [petsc-users] DMDA and ksp(ex46.c & ex22f.F90) Message-ID: Hello all, I am trying to solve Poisson Equation on structured grid using 9-point stencil in 2D. Now to setup my matrix, I came across C structure MatStencil in ex22f.F90 ........................................................................................................... 
call DMDAGetCorners (da,xs,ys,zs,xm,ym,zm,ierr) 107: do 10,k=zs,zs+zm-1108: do 20,j=ys,ys+ym-1109: do 30,i=xs,xs+xm-1110: row(MatStencil_i) = i111: row(MatStencil_j) = j112: row(MatStencil_k) = k113: if (i.eq.0 .or. j.eq.0 .or. k.eq.0 .or. i.eq.mx-1 .or. j.eq.my-1 .or. k.eq.mz-1) then114: v(1) = 2.0*(HxHydHz + HxHzdHy + HyHzdHx)115: call MatSetValuesStencil (jac,i1,row,i1,row,v,INSERT_VALUES ,ierr)116: else117: v(1) = -HxHydHz118: col(MatStencil_i,1) = i119: col(MatStencil_j,1) = j120: col(MatStencil_k,1) = k-1121: v(2) = -HxHzdHy122: col(MatStencil_i,2) = i123: col(MatStencil_j,2) = j-1124: col(MatStencil_k,2) = k125: v(3) = -HyHzdHx126: col(MatStencil_i,3) = i-1127: col(MatStencil_j,3) = j128: col(MatStencil_k,3) = k129: v(4) = 2.0*(HxHydHz + HxHzdHy + HyHzdHx)130: col(MatStencil_i,4) = i131: col(MatStencil_j,4) = j132: col(MatStencil_k,4) = k133: v(5) = -HyHzdHx134: col(MatStencil_i,5) = i+1135: col(MatStencil_j,5) = j136: col(MatStencil_k,5) = k137: v(6) = -HxHzdHy138: col(MatStencil_i,6) = i139: col(MatStencil_j,6) = j+1140: col(MatStencil_k,6) = k141: v(7) = -HxHydHz142: col(MatStencil_i,7) = i143: col(MatStencil_j,7) = j144: col(MatStencil_k,7) = k+1145: call MatSetValuesStencil (jac,i1,row,i7,col,v,INSERT_VALUES ,ierr)146: endif ..................................................................................... What I am confused about is what it means to have the value of row in i and j directions(row(MatStencil_i,1) & row(MatStencil_j,1)). Same confusion goes for the column values as well. I mean generally in a 2D Matrix row values are in j/y direction and column values are in i/x direction. Could you please explain that? Regards, Maahi Talukder Department of Mechanical Engineering Clarkson University -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrick.sanan at gmail.com Mon Mar 11 15:40:44 2019 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Mon, 11 Mar 2019 21:40:44 +0100 Subject: [petsc-users] DMDA and ksp(ex46.c & ex22f.F90) In-Reply-To: References: Message-ID: There are two different types of rows and columns: 1. Rows and columns in a grid 2. Rows and columns in a matrix "i" and "j" refer to rows and columns in the grid, but "row" and "col" refer to rows and columns in the matrix. Am Mo., 11. M?rz 2019 um 21:18 Uhr schrieb Maahi Talukder via petsc-users < petsc-users at mcs.anl.gov>: > Hello all, > > I am trying to solve Poisson Equation on structured grid using 9-point > stencil in 2D. Now to setup my matrix, I came across C structure MatStencil > in ex22f.F90 > > > ........................................................................................................... > > call DMDAGetCorners (da,xs,ys,zs,xm,ym,zm,ierr) > 107: do 10,k=zs,zs+zm-1108: do 20,j=ys,ys+ym-1109: do 30,i=xs,xs+xm-1110: row(MatStencil_i) = i111: row(MatStencil_j) = j112: row(MatStencil_k) = k113: if (i.eq.0 .or. j.eq.0 .or. k.eq.0 .or. i.eq.mx-1 .or. j.eq.my-1 .or. 
k.eq.mz-1) then114: v(1) = 2.0*(HxHydHz + HxHzdHy + HyHzdHx)115: call MatSetValuesStencil (jac,i1,row,i1,row,v,INSERT_VALUES ,ierr)116: else117: v(1) = -HxHydHz118: col(MatStencil_i,1) = i119: col(MatStencil_j,1) = j120: col(MatStencil_k,1) = k-1121: v(2) = -HxHzdHy122: col(MatStencil_i,2) = i123: col(MatStencil_j,2) = j-1124: col(MatStencil_k,2) = k125: v(3) = -HyHzdHx126: col(MatStencil_i,3) = i-1127: col(MatStencil_j,3) = j128: col(MatStencil_k,3) = k129: v(4) = 2.0*(HxHydHz + HxHzdHy + HyHzdHx)130: col(MatStencil_i,4) = i131: col(MatStencil_j,4) = j132: col(MatStencil_k,4) = k133: v(5) = -HyHzdHx134: col(MatStencil_i,5) = i+1135: col(MatStencil_j,5) = j136: col(MatStencil_k,5) = k137: v(6) = -HxHzdHy138: col(MatStencil_i,6) = i139: col(MatStencil_j,6) = j+1140: col(MatStencil_k,6) = k141: v(7) = -HxHydHz142: col(MatStencil_i,7) = i143: col(MatStencil_j,7) = j144: col(MatStencil_k,7) = k+1145: call MatSetValuesStencil (jac,i1,row,i7,col,v,INSERT_VALUES ,ierr)146: endif > > ..................................................................................... > > What I am confused about is what it means to have the value of row in i and j directions(row(MatStencil_i,1) & row(MatStencil_j,1)). > > Same confusion goes for the column values as well. I mean generally in a 2D Matrix row values are in j/y direction and column values are in i/x direction. > > Could you please explain that? > > > Regards, > > Maahi Talukder > > Department of Mechanical Engineering > > Clarkson University > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Mar 11 16:11:27 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Mon, 11 Mar 2019 21:11:27 +0000 Subject: [petsc-users] Conceptual question about DMDA In-Reply-To: References: Message-ID: Yuyun, DMDA is an add-on on top of Vec/Mat (it doesn't replace anything in them) DMDA manages the parallel layout of your structured grid across the processes so you don't have to manage that yourself. So, for structured grids using DMDA is actually easier than you having to manage partitioning the domain and setting up communication yourself. DMDA does it for you and choices partitions to minimize communications needed. You should look at the examples in src/ksp/ksp/examples/tutorials for linear problems and src/snes/examples/tutorials/ for nonlinear problems that use DMDACreate to see what the code looks like. Pick an example that is like your problem and copy it as a starting point. Barry > On Mar 11, 2019, at 3:17 PM, Yuyun Yang via petsc-users wrote: > > Hello team, > > May I know for what types of computations is DMDA better to use compared to regular Vec/Mat? It is more complicated in terms of usage, thus so far I?ve only used Vec/Mat. Would DMDA improve the performance of solving large linear systems (say for variable grid spacing as a result of coordinate transforms, with finite difference method)? What considerations should go into implementing it? > > Thank you very much! > Yuyun From edoardo.alinovi at gmail.com Mon Mar 11 16:13:46 2019 From: edoardo.alinovi at gmail.com (Edoardo alinovi) Date: Mon, 11 Mar 2019 22:13:46 +0100 Subject: [petsc-users] Error during install check - intel compiler Message-ID: Dear all, It's a while that I do not write here, hope you are doing all very well! I am compiling petsc with intel, everything seems fine druing configuration and make steps but when I perform the install check I get several errors (attached the log file). 
Are they dangerous (they are not very encouraging segmentation faults) ? What can I do to fix a bit the thing in the case? Configuration options are: ./configure PETSC_ARCH=arch-intel-opt --with-debugging=no --with-cxx-dialect=C++11 --download-superlu_dist --download-mumps --download-hypre --download-metis --download-parmetis --download-scalapack --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort --with-debugging=0 COPTFLAGS='-ipo -03 -xHost' CXXOPTFLAGS='-ipo -03 -xHost' FOPTFLAGS='-ipo -03 -xHost' --with-blas-lapack-dir=$HOME/intel/mkl --with-mkl_pardiso-dir=$HOME/intel/mkl --with-mkl_cpardiso-dir=$HOME/intel/mkl The compiler that I m using comes with intel parallel studio 2019. As always every hints is hightly appreciated. Thank you very much! ----- Edoardo Alinovi, Ph.D. DICCA, Scuola Politecnica, Universita' degli Studi di Genova, 1, via Montallegro, 16145 Genova, Italy Email: edoardo.alinovi at dicca.unige.it Tel: +39 010 353 2540 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: log.errors Type: application/octet-stream Size: 2881 bytes Desc: not available URL: From bsmith at mcs.anl.gov Mon Mar 11 16:33:44 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Mon, 11 Mar 2019 21:33:44 +0000 Subject: [petsc-users] Error during install check - intel compiler In-Reply-To: References: Message-ID: <9237A282-39D1-4A9D-91F4-4E0359DD8D90@anl.gov> Not good. Run the test manually, then in the debugger to see exactly where it is crashing. cd src/snes/examples/tutorials make ex19 ./ex19 gdb (or the intel debugger) ./ex19 run Send all the output > On Mar 11, 2019, at 4:13 PM, Edoardo alinovi via petsc-users wrote: > > Dear all, > > It's a while that I do not write here, hope you are doing all very well! > > I am compiling petsc with intel, everything seems fine druing configuration and make steps but when I perform the install check I get several errors (attached the log file). Are they dangerous (they are not very encouraging segmentation faults) ? What can I do to fix a bit the thing in the case? > > Configuration options are: > > ./configure PETSC_ARCH=arch-intel-opt --with-debugging=no --with-cxx-dialect=C++11 --download-superlu_dist --download-mumps --download-hypre --download-metis --download-parmetis --download-scalapack --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort --with-debugging=0 COPTFLAGS='-ipo -03 -xHost' CXXOPTFLAGS='-ipo -03 -xHost' FOPTFLAGS='-ipo -03 -xHost' --with-blas-lapack-dir=$HOME/intel/mkl --with-mkl_pardiso-dir=$HOME/intel/mkl --with-mkl_cpardiso-dir=$HOME/intel/mkl > > The compiler that I m using comes with intel parallel studio 2019. > > As always every hints is hightly appreciated. Thank you very much! > > ----- > > Edoardo Alinovi, Ph.D. > > DICCA, Scuola Politecnica, > Universita' degli Studi di Genova, > 1, via Montallegro, > 16145 Genova, Italy > > Email: edoardo.alinovi at dicca.unige.it > Tel: +39 010 353 2540 > > > > > From bsmith at mcs.anl.gov Mon Mar 11 17:03:06 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Mon, 11 Mar 2019 22:03:06 +0000 Subject: [petsc-users] Error during install check - intel compiler In-Reply-To: References: <9237A282-39D1-4A9D-91F4-4E0359DD8D90@anl.gov> Message-ID: Seems a problem with the MPI install. You could try compiling and running a same MPI (only) code to see if MPI_Init() then MPI_Finalize() succeed or not. 
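(For reference, a minimal MPI-only test of this kind could look like the sketch below; the file name test_mpi.c is only an illustration, not a file from this thread. Compiled with mpiicc test_mpi.c -o test_mpi and run with mpirun (or mpiexec) -n 2 ./test_mpi, it should print one line per rank if the MPI installation is healthy.)

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  int rank, size;
  /* If MPI_Init() itself hangs or segfaults, the problem is the MPI installation, not PETSc */
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  printf("Hello from rank %d of %d\n", rank, size);
  MPI_Finalize();
  return 0;
}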
Barry > On Mar 11, 2019, at 4:47 PM, Edoardo alinovi wrote: > > Thanks Barry for the help as usual. > > Attached the errors. Maybe intel is not installed correctly? (I have used the gui provided by intel so I do not know if this is the case...) > ------ > > Edoardo Alinovi, Ph.D. > > DICCA, Scuola Politecnica, > Universita' degli Studi di Genova, > 1, via Montallegro, > 16145 Genova, Italy > > Email: edoardo.alinovi at dicca.unige.it > Tel: +39 010 353 2540 > > > > > > Il giorno lun 11 mar 2019 alle ore 22:33 Smith, Barry F. ha scritto: > > Not good. Run the test manually, then in the debugger to see exactly where it is crashing. > > cd src/snes/examples/tutorials > make ex19 > ./ex19 > > gdb (or the intel debugger) ./ex19 > run > > Send all the output > > > > On Mar 11, 2019, at 4:13 PM, Edoardo alinovi via petsc-users wrote: > > > > Dear all, > > > > It's a while that I do not write here, hope you are doing all very well! > > > > I am compiling petsc with intel, everything seems fine druing configuration and make steps but when I perform the install check I get several errors (attached the log file). Are they dangerous (they are not very encouraging segmentation faults) ? What can I do to fix a bit the thing in the case? > > > > Configuration options are: > > > > ./configure PETSC_ARCH=arch-intel-opt --with-debugging=no --with-cxx-dialect=C++11 --download-superlu_dist --download-mumps --download-hypre --download-metis --download-parmetis --download-scalapack --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort --with-debugging=0 COPTFLAGS='-ipo -03 -xHost' CXXOPTFLAGS='-ipo -03 -xHost' FOPTFLAGS='-ipo -03 -xHost' --with-blas-lapack-dir=$HOME/intel/mkl --with-mkl_pardiso-dir=$HOME/intel/mkl --with-mkl_cpardiso-dir=$HOME/intel/mkl > > > > The compiler that I m using comes with intel parallel studio 2019. > > > > As always every hints is hightly appreciated. Thank you very much! > > > > ----- > > > > Edoardo Alinovi, Ph.D. > > > > DICCA, Scuola Politecnica, > > Universita' degli Studi di Genova, > > 1, via Montallegro, > > 16145 Genova, Italy > > > > Email: edoardo.alinovi at dicca.unige.it > > Tel: +39 010 353 2540 > > > > > > > > > > > > From maahi.buet at gmail.com Mon Mar 11 19:07:59 2019 From: maahi.buet at gmail.com (Maahi Talukder) Date: Mon, 11 Mar 2019 20:07:59 -0400 Subject: [petsc-users] DMDA and ksp(ex46.c & ex22f.F90) In-Reply-To: References: Message-ID: Thank you for your reply. I still have some confusion. So if (i,j) is a point on the structured grid( Where "i" is the column and "j" is the row), and the information associated with the (i,j) point on the grid is stored in some (m,n) location of the matrix A (Where Ax =b), I still don't understand why both of row(MatStencil_i,1) and row(MatStencil_j,1) are necessary? I mean is it something like mapping "i" from grid to its location in the matrix? Would you please explain that? Regards, Maahi On Mon, Mar 11, 2019 at 4:41 PM Patrick Sanan wrote: > There are two different types of rows and columns: > 1. Rows and columns in a grid > 2. Rows and columns in a matrix > > "i" and "j" refer to rows and columns in the grid, but "row" and "col" > refer to rows and columns in the matrix. > > > > Am Mo., 11. M?rz 2019 um 21:18 Uhr schrieb Maahi Talukder via petsc-users < > petsc-users at mcs.anl.gov>: > >> Hello all, >> >> I am trying to solve Poisson Equation on structured grid using 9-point >> stencil in 2D. 
>> Now, to set up my matrix, I came across the C structure MatStencil in ex22f.F90
>>
>> ...........................................................................................................
>>
>> call DMDAGetCorners(da,xs,ys,zs,xm,ym,zm,ierr)
>> do 10,k=zs,zs+zm-1
>>   do 20,j=ys,ys+ym-1
>>     do 30,i=xs,xs+xm-1
>>       row(MatStencil_i) = i
>>       row(MatStencil_j) = j
>>       row(MatStencil_k) = k
>>       if (i.eq.0 .or. j.eq.0 .or. k.eq.0 .or. i.eq.mx-1 .or. j.eq.my-1 .or. k.eq.mz-1) then
>>         v(1) = 2.0*(HxHydHz + HxHzdHy + HyHzdHx)
>>         call MatSetValuesStencil(jac,i1,row,i1,row,v,INSERT_VALUES,ierr)
>>       else
>>         v(1) = -HxHydHz
>>         col(MatStencil_i,1) = i
>>         col(MatStencil_j,1) = j
>>         col(MatStencil_k,1) = k-1
>>         v(2) = -HxHzdHy
>>         col(MatStencil_i,2) = i
>>         col(MatStencil_j,2) = j-1
>>         col(MatStencil_k,2) = k
>>         v(3) = -HyHzdHx
>>         col(MatStencil_i,3) = i-1
>>         col(MatStencil_j,3) = j
>>         col(MatStencil_k,3) = k
>>         v(4) = 2.0*(HxHydHz + HxHzdHy + HyHzdHx)
>>         col(MatStencil_i,4) = i
>>         col(MatStencil_j,4) = j
>>         col(MatStencil_k,4) = k
>>         v(5) = -HyHzdHx
>>         col(MatStencil_i,5) = i+1
>>         col(MatStencil_j,5) = j
>>         col(MatStencil_k,5) = k
>>         v(6) = -HxHzdHy
>>         col(MatStencil_i,6) = i
>>         col(MatStencil_j,6) = j+1
>>         col(MatStencil_k,6) = k
>>         v(7) = -HxHydHz
>>         col(MatStencil_i,7) = i
>>         col(MatStencil_j,7) = j
>>         col(MatStencil_k,7) = k+1
>>         call MatSetValuesStencil(jac,i1,row,i7,col,v,INSERT_VALUES,ierr)
>>       endif
>>
>> .....................................................................................
>>
>> What I am confused about is what it means to give the row a value in both the i and j directions (row(MatStencil_i) and row(MatStencil_j)). The same confusion applies to the column values. I mean, generally in a 2D matrix the row index runs in the j/y direction and the column index in the i/x direction.
>>
>> Could you please explain that?
>>
>> Regards,
>> Maahi Talukder
>> Department of Mechanical Engineering
>> Clarkson University
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From bsmith at mcs.anl.gov Mon Mar 11 20:46:44 2019
From: bsmith at mcs.anl.gov (Smith, Barry F.)
Date: Tue, 12 Mar 2019 01:46:44 +0000
Subject: [petsc-users] DMDA and ksp(ex46.c & ex22f.F90)
In-Reply-To: References: Message-ID:

> On Mar 11, 2019, at 7:07 PM, Maahi Talukder via petsc-users wrote:
>
> Thank you for your reply.
>
> I still have some confusion. So if (i,j) is a point on the structured grid (where "i" is the column and "j" is the row), and the information associated with the (i,j) point on the grid is stored in some (m,n) location of the matrix A (where Ax = b),

   Almost. Think about the vector (not the matrix) first. The point (i,j) on the mesh has location m in the vector (m is a function of both i and j). Now think about another point on the mesh (say next to the first point), call it (i',j'); this point also has a location in the vector, call it n.

   Now consider the matrix entry containing the value that connects m and n. On the mesh it is associated with the point (i,j) AND the point (i',j'), hence the row in the matrix is a function of the first point (i,j) while the column is a function of the second point (i',j').

   Now the diagonal entry a_{mm} has (i,j) for both the "row" and "column" stencils, but an off-diagonal entry a_{mn} has (i,j) for the row and (i',j') for the column.

> I still don't understand why both of row(MatStencil_i,1) and row(MatStencil_j,1) are necessary?
I mean is it something like mapping "i" from grid to its location in the matrix? > Would you please explain that? > > Regards, > Maahi > > On Mon, Mar 11, 2019 at 4:41 PM Patrick Sanan wrote: > There are two different types of rows and columns: > 1. Rows and columns in a grid > 2. Rows and columns in a matrix > > "i" and "j" refer to rows and columns in the grid, but "row" and "col" refer to rows and columns in the matrix. > > > > Am Mo., 11. M?rz 2019 um 21:18 Uhr schrieb Maahi Talukder via petsc-users : > Hello all, > > I am trying to solve Poisson Equation on structured grid using 9-point stencil in 2D. Now to setup my matrix, I came across C structure MatStencil in ex22f.F90 > > ........................................................................................................... > call DMDAGetCorners > (da,xs,ys,zs,xm,ym,zm,ierr) > > > 107: do > 10,k=zs,zs+zm-1 > > 108: do > 20,j=ys,ys+ym-1 > > 109: do > 30,i=xs,xs+xm-1 > > 110: > row(MatStencil_i) = i > > 111: > row(MatStencil_j) = j > > 112: > row(MatStencil_k) = k > > 113: if > (i.eq.0 .or. j.eq.0 .or. k.eq.0 .or. i.eq.mx-1 .or. j.eq.my-1 .or. k.eq.mz-1) then > > 114: > v(1) = 2.0*(HxHydHz + HxHzdHy + HyHzdHx) > > 115: call MatSetValuesStencil(jac,i1,row,i1,row,v,INSERT_VALUES > ,ierr) > > 116: else > 117: > v(1) = -HxHydHz > > 118: > col(MatStencil_i,1) = i > > 119: > col(MatStencil_j,1) = j > > 120: > col(MatStencil_k,1) = k-1 > > 121: > v(2) = -HxHzdHy > > 122: > col(MatStencil_i,2) = i > > 123: > col(MatStencil_j,2) = j-1 > > 124: > col(MatStencil_k,2) = k > > 125: > v(3) = -HyHzdHx > > 126: > col(MatStencil_i,3) = i-1 > > 127: > col(MatStencil_j,3) = j > > 128: > col(MatStencil_k,3) = k > > 129: > v(4) = 2.0*(HxHydHz + HxHzdHy + HyHzdHx) > > 130: > col(MatStencil_i,4) = i > > 131: > col(MatStencil_j,4) = j > > 132: > col(MatStencil_k,4) = k > > 133: > v(5) = -HyHzdHx > > 134: > col(MatStencil_i,5) = i+1 > > 135: > col(MatStencil_j,5) = j > > 136: > col(MatStencil_k,5) = k > > 137: > v(6) = -HxHzdHy > > 138: > col(MatStencil_i,6) = i > > 139: > col(MatStencil_j,6) = j+1 > > 140: > col(MatStencil_k,6) = k > > 141: > v(7) = -HxHydHz > > 142: > col(MatStencil_i,7) = i > > 143: > col(MatStencil_j,7) = j > > 144: > col(MatStencil_k,7) = k+1 > > 145: call MatSetValuesStencil(jac,i1,row,i7,col,v,INSERT_VALUES > ,ierr) > > 146: endif > ..................................................................................... > What I am confused about is what it means to have the value of row in i and j directions(row(MatStencil_i,1) & row(MatStencil_j,1)). > Same confusion goes for the column values as well. I mean generally in a 2D Matrix row values are in j/y direction and column values are in i/x direction. > Could you please explain that? > > Regards, > Maahi Talukder > Department of Mechanical Engineering > Clarkson University From knepley at gmail.com Mon Mar 11 20:56:08 2019 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 11 Mar 2019 21:56:08 -0400 Subject: [petsc-users] DMDA and ksp(ex46.c & ex22f.F90) In-Reply-To: References: Message-ID: On Mon, Mar 11, 2019 at 8:09 PM Maahi Talukder via petsc-users < petsc-users at mcs.anl.gov> wrote: > > Thank you for your reply. > > I still have some confusion. So if (i,j) is a point on the structured > grid( Where "i" is the column and "j" is the row), and the information > associated with the (i,j) point on the grid is stored in some (m,n) > location of the matrix A (Where Ax =b), I still don't > understand why both of row(MatStencil_i,1) and row(MatStencil_j,1) are > necessary? 
I mean is it something like mapping "i" from grid to its > location in the matrix? Would you please explain that? > I don't think you are understanding. (i, j) is a grid location, so it corresponds to a dof number n. A location (m, n) in the Jacobian relates two variables, one at some location (i_m, j_m) in the grid and another at location (i_n, j_n) in the grid. The matrix grid and spatial grid are completely different things. For example, you could have a 3D spatial grid, but you still have a 2D matrix. Matt > Regards, > Maahi > > On Mon, Mar 11, 2019 at 4:41 PM Patrick Sanan > wrote: > >> There are two different types of rows and columns: >> 1. Rows and columns in a grid >> 2. Rows and columns in a matrix >> >> "i" and "j" refer to rows and columns in the grid, but "row" and "col" >> refer to rows and columns in the matrix. >> >> >> >> Am Mo., 11. M?rz 2019 um 21:18 Uhr schrieb Maahi Talukder via petsc-users >> : >> >>> Hello all, >>> >>> I am trying to solve Poisson Equation on structured grid using 9-point >>> stencil in 2D. Now to setup my matrix, I came across C structure MatStencil >>> in ex22f.F90 >>> >>> >>> ........................................................................................................... >>> >>> call DMDAGetCorners (da,xs,ys,zs,xm,ym,zm,ierr) >>> 107: do 10,k=zs,zs+zm-1108: do 20,j=ys,ys+ym-1109: do 30,i=xs,xs+xm-1110: row(MatStencil_i) = i111: row(MatStencil_j) = j112: row(MatStencil_k) = k113: if (i.eq.0 .or. j.eq.0 .or. k.eq.0 .or. i.eq.mx-1 .or. j.eq.my-1 .or. k.eq.mz-1) then114: v(1) = 2.0*(HxHydHz + HxHzdHy + HyHzdHx)115: call MatSetValuesStencil (jac,i1,row,i1,row,v,INSERT_VALUES ,ierr)116: else117: v(1) = -HxHydHz118: col(MatStencil_i,1) = i119: col(MatStencil_j,1) = j120: col(MatStencil_k,1) = k-1121: v(2) = -HxHzdHy122: col(MatStencil_i,2) = i123: col(MatStencil_j,2) = j-1124: col(MatStencil_k,2) = k125: v(3) = -HyHzdHx126: col(MatStencil_i,3) = i-1127: col(MatStencil_j,3) = j128: col(MatStencil_k,3) = k129: v(4) = 2.0*(HxHydHz + HxHzdHy + HyHzdHx)130: col(MatStencil_i,4) = i131: col(MatStencil_j,4) = j132: col(MatStencil_k,4) = k133: v(5) = -HyHzdHx134: col(MatStencil_i,5) = i+1135: col(MatStencil_j,5) = j136: col(MatStencil_k,5) = k137: v(6) = -HxHzdHy138: col(MatStencil_i,6) = i139: col(MatStencil_j,6) = j+1140: col(MatStencil_k,6) = k141: v(7) = -HxHydHz142: col(MatStencil_i,7) = i143: col(MatStencil_j,7) = j144: col(MatStencil_k,7) = k+1145: call MatSetValuesStencil (jac,i1,row,i7,col,v,INSERT_VALUES ,ierr)146: endif >>> >>> ..................................................................................... >>> >>> What I am confused about is what it means to have the value of row in i and j directions(row(MatStencil_i,1) & row(MatStencil_j,1)). >>> >>> Same confusion goes for the column values as well. I mean generally in a 2D Matrix row values are in j/y direction and column values are in i/x direction. >>> >>> Could you please explain that? >>> >>> >>> Regards, >>> >>> Maahi Talukder >>> >>> Department of Mechanical Engineering >>> >>> Clarkson University >>> >>> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Mar 11 22:11:30 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) 
Date: Tue, 12 Mar 2019 03:11:30 +0000 Subject: [petsc-users] DMDA and ksp(ex46.c & ex22f.F90) In-Reply-To: References: Message-ID: > On Mar 11, 2019, at 10:01 PM, Maahi Talukder wrote: > > Hi > Thank you for your explanation. > So is it so that it always connects two points? Entries in matrices represent the connection between points in a vector (including the diagonal entries that are connections between a point and itself). I think you are "over-thinking" MatStencil. It is just a handy way of managing the mapping from a grid point to an entry in a vector. And similarly from two grid points and an entry in a matrix. > I am still hazy on the concept. Could you suggest me some more elaborate book or something to go through so that I can grasp the whole concept? How about your book ' Domain Decomposition'? Does it cover this topic? No, it doesn't discuss this issue. Barry > Please let me know > > Regards, > Maahi Talukder > > On Mon, Mar 11, 2019 at 9:46 PM Smith, Barry F. wrote: > > > > On Mar 11, 2019, at 7:07 PM, Maahi Talukder via petsc-users wrote: > > > > > > Thank you for your reply. > > > > I still have some confusion. So if (i,j) is a point on the structured grid( Where "i" is the column and "j" is the row), and the information associated with the (i,j) point on the grid is stored in some (m,n) location of the matrix A (Where Ax =b), > > Almost. Think about the vector (not the matrix) first. The point (i,j) on the mesh has location m in the vector (m is a function of both i and j) Now think about another point on the mesh (say next to the first point), call it (i',j') this point also has a location in the vector, call it n. Now consider the matrix entry containing the value that connects m and n. On the mesh it is associated with the point (i,j) AND the point (i',j') hence the row in the matrix is a function of the first point (i,j) while the column is a function of the second point (i',j'). > > Now the diagonal entry in matrix a_{mm} has (i,j) for both "row" and "column" stencils, but the off-diagonal entries a_{mn} has (i,j) for the row but (i',j') for the column. > > > I still don't > > understand why both of row(MatStencil_i,1) and row(MatStencil_j,1) are necessary? I mean is it something like mapping "i" from grid to its location in the matrix? > > > > Would you please explain that? > > > > > > Regards, > > Maahi > > > > On Mon, Mar 11, 2019 at 4:41 PM Patrick Sanan wrote: > > There are two different types of rows and columns: > > 1. Rows and columns in a grid > > 2. Rows and columns in a matrix > > > > "i" and "j" refer to rows and columns in the grid, but "row" and "col" refer to rows and columns in the matrix. > > > > > > > > Am Mo., 11. M?rz 2019 um 21:18 Uhr schrieb Maahi Talukder via petsc-users : > > Hello all, > > > > I am trying to solve Poisson Equation on structured grid using 9-point stencil in 2D. Now to setup my matrix, I came across C structure MatStencil in ex22f.F90 > > > > ........................................................................................................... > > call DMDAGetCorners > > (da,xs,ys,zs,xm,ym,zm,ierr) > > > > > > 107: do > > 10,k=zs,zs+zm-1 > > > > 108: do > > 20,j=ys,ys+ym-1 > > > > 109: do > > 30,i=xs,xs+xm-1 > > > > 110: > > row(MatStencil_i) = i > > > > 111: > > row(MatStencil_j) = j > > > > 112: > > row(MatStencil_k) = k > > > > 113: if > > (i.eq.0 .or. j.eq.0 .or. k.eq.0 .or. i.eq.mx-1 .or. j.eq.my-1 .or. 
k.eq.mz-1) then > > > > 114: > > v(1) = 2.0*(HxHydHz + HxHzdHy + HyHzdHx) > > > > 115: call MatSetValuesStencil(jac,i1,row,i1,row,v,INSERT_VALUES > > ,ierr) > > > > 116: else > > 117: > > v(1) = -HxHydHz > > > > 118: > > col(MatStencil_i,1) = i > > > > 119: > > col(MatStencil_j,1) = j > > > > 120: > > col(MatStencil_k,1) = k-1 > > > > 121: > > v(2) = -HxHzdHy > > > > 122: > > col(MatStencil_i,2) = i > > > > 123: > > col(MatStencil_j,2) = j-1 > > > > 124: > > col(MatStencil_k,2) = k > > > > 125: > > v(3) = -HyHzdHx > > > > 126: > > col(MatStencil_i,3) = i-1 > > > > 127: > > col(MatStencil_j,3) = j > > > > 128: > > col(MatStencil_k,3) = k > > > > 129: > > v(4) = 2.0*(HxHydHz + HxHzdHy + HyHzdHx) > > > > 130: > > col(MatStencil_i,4) = i > > > > 131: > > col(MatStencil_j,4) = j > > > > 132: > > col(MatStencil_k,4) = k > > > > 133: > > v(5) = -HyHzdHx > > > > 134: > > col(MatStencil_i,5) = i+1 > > > > 135: > > col(MatStencil_j,5) = j > > > > 136: > > col(MatStencil_k,5) = k > > > > 137: > > v(6) = -HxHzdHy > > > > 138: > > col(MatStencil_i,6) = i > > > > 139: > > col(MatStencil_j,6) = j+1 > > > > 140: > > col(MatStencil_k,6) = k > > > > 141: > > v(7) = -HxHydHz > > > > 142: > > col(MatStencil_i,7) = i > > > > 143: > > col(MatStencil_j,7) = j > > > > 144: > > col(MatStencil_k,7) = k+1 > > > > 145: call MatSetValuesStencil(jac,i1,row,i7,col,v,INSERT_VALUES > > ,ierr) > > > > 146: endif > > ..................................................................................... > > What I am confused about is what it means to have the value of row in i and j directions(row(MatStencil_i,1) & row(MatStencil_j,1)). > > Same confusion goes for the column values as well. I mean generally in a 2D Matrix row values are in j/y direction and column values are in i/x direction. > > Could you please explain that? > > > > Regards, > > Maahi Talukder > > Department of Mechanical Engineering > > Clarkson University > From bsmith at mcs.anl.gov Mon Mar 11 23:24:54 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 12 Mar 2019 04:24:54 +0000 Subject: [petsc-users] DMDA and ksp(ex46.c & ex22f.F90) In-Reply-To: References: Message-ID: <244811F3-EA04-4837-B16D-6997CE5C4319@mcs.anl.gov> > On Mar 11, 2019, at 11:11 PM, Maahi Talukder wrote: > > Hi > Thank you so much for explanation. > > So when you say connecting two points on the grid in the matrix entry, what do you mean by that? Do you mean that matrix entry is calculated using two points ( (i,j) and (i',j') ) on the grid? I mean the value of the finite difference "stencil" that connects the two points. Take a look at src/ksp/ksp/examples/tutorials/ex22f.F90. The inner most loop do 30,i=xs,xs+xm-1 row(MatStencil_i) = i row(MatStencil_j) = j row(MatStencil_k) = k .... sets one row of the matrix which has 7 nonzero columns corresponding to the 6 neighbors of the (i,j,k) point plus the diagonal entry which connects with itself. > And when you talked about thinking about the vector, how is it that the entry in the vector is a function of both i and j? For each (i,j) point on the mesh there is a vector entry. When indexing into vectors we use a single index value m; that has to be computed based on the two values of the logical coordinates on the mesh i and j. > Will it be the same if my dof = 1 ie solving for only one thing at each grid point? Yes it doesn't have anything to do with dof. The stencil_c value is used for multiple degrees of freedom per grid point. > > > Regards, > Maahi Talukder > > On Mon, Mar 11, 2019 at 11:11 PM Smith, Barry F. 
wrote: > > > > On Mar 11, 2019, at 10:01 PM, Maahi Talukder wrote: > > > > Hi > > Thank you for your explanation. > > So is it so that it always connects two points? > > Entries in matrices represent the connection between points in a vector (including the diagonal entries that are connections between a point and itself). > > I think you are "over-thinking" MatStencil. It is just a handy way of managing the mapping from a grid point to an entry in a vector. And similarly from two grid points and an entry in a matrix. > > > I am still hazy on the concept. Could you suggest me some more elaborate book or something to go through so that I can grasp the whole concept? How about your book ' Domain Decomposition'? Does it cover this topic? > > No, it doesn't discuss this issue. > > Barry > > > Please let me know > > > > Regards, > > Maahi Talukder > > > > On Mon, Mar 11, 2019 at 9:46 PM Smith, Barry F. wrote: > > > > > > > On Mar 11, 2019, at 7:07 PM, Maahi Talukder via petsc-users wrote: > > > > > > > > > Thank you for your reply. > > > > > > I still have some confusion. So if (i,j) is a point on the structured grid( Where "i" is the column and "j" is the row), and the information associated with the (i,j) point on the grid is stored in some (m,n) location of the matrix A (Where Ax =b), > > > > Almost. Think about the vector (not the matrix) first. The point (i,j) on the mesh has location m in the vector (m is a function of both i and j) Now think about another point on the mesh (say next to the first point), call it (i',j') this point also has a location in the vector, call it n. Now consider the matrix entry containing the value that connects m and n. On the mesh it is associated with the point (i,j) AND the point (i',j') hence the row in the matrix is a function of the first point (i,j) while the column is a function of the second point (i',j'). > > > > Now the diagonal entry in matrix a_{mm} has (i,j) for both "row" and "column" stencils, but the off-diagonal entries a_{mn} has (i,j) for the row but (i',j') for the column. > > > > > I still don't > > > understand why both of row(MatStencil_i,1) and row(MatStencil_j,1) are necessary? I mean is it something like mapping "i" from grid to its location in the matrix? > > > > > > > Would you please explain that? > > > > > > > > > > Regards, > > > Maahi > > > > > > On Mon, Mar 11, 2019 at 4:41 PM Patrick Sanan wrote: > > > There are two different types of rows and columns: > > > 1. Rows and columns in a grid > > > 2. Rows and columns in a matrix > > > > > > "i" and "j" refer to rows and columns in the grid, but "row" and "col" refer to rows and columns in the matrix. > > > > > > > > > > > > Am Mo., 11. M?rz 2019 um 21:18 Uhr schrieb Maahi Talukder via petsc-users : > > > Hello all, > > > > > > I am trying to solve Poisson Equation on structured grid using 9-point stencil in 2D. Now to setup my matrix, I came across C structure MatStencil in ex22f.F90 > > > > > > ........................................................................................................... > > > call DMDAGetCorners > > > (da,xs,ys,zs,xm,ym,zm,ierr) > > > > > > > > > 107: do > > > 10,k=zs,zs+zm-1 > > > > > > 108: do > > > 20,j=ys,ys+ym-1 > > > > > > 109: do > > > 30,i=xs,xs+xm-1 > > > > > > 110: > > > row(MatStencil_i) = i > > > > > > 111: > > > row(MatStencil_j) = j > > > > > > 112: > > > row(MatStencil_k) = k > > > > > > 113: if > > > (i.eq.0 .or. j.eq.0 .or. k.eq.0 .or. i.eq.mx-1 .or. j.eq.my-1 .or. 
k.eq.mz-1) then > > > > > > 114: > > > v(1) = 2.0*(HxHydHz + HxHzdHy + HyHzdHx) > > > > > > 115: call MatSetValuesStencil(jac,i1,row,i1,row,v,INSERT_VALUES > > > ,ierr) > > > > > > 116: else > > > 117: > > > v(1) = -HxHydHz > > > > > > 118: > > > col(MatStencil_i,1) = i > > > > > > 119: > > > col(MatStencil_j,1) = j > > > > > > 120: > > > col(MatStencil_k,1) = k-1 > > > > > > 121: > > > v(2) = -HxHzdHy > > > > > > 122: > > > col(MatStencil_i,2) = i > > > > > > 123: > > > col(MatStencil_j,2) = j-1 > > > > > > 124: > > > col(MatStencil_k,2) = k > > > > > > 125: > > > v(3) = -HyHzdHx > > > > > > 126: > > > col(MatStencil_i,3) = i-1 > > > > > > 127: > > > col(MatStencil_j,3) = j > > > > > > 128: > > > col(MatStencil_k,3) = k > > > > > > 129: > > > v(4) = 2.0*(HxHydHz + HxHzdHy + HyHzdHx) > > > > > > 130: > > > col(MatStencil_i,4) = i > > > > > > 131: > > > col(MatStencil_j,4) = j > > > > > > 132: > > > col(MatStencil_k,4) = k > > > > > > 133: > > > v(5) = -HyHzdHx > > > > > > 134: > > > col(MatStencil_i,5) = i+1 > > > > > > 135: > > > col(MatStencil_j,5) = j > > > > > > 136: > > > col(MatStencil_k,5) = k > > > > > > 137: > > > v(6) = -HxHzdHy > > > > > > 138: > > > col(MatStencil_i,6) = i > > > > > > 139: > > > col(MatStencil_j,6) = j+1 > > > > > > 140: > > > col(MatStencil_k,6) = k > > > > > > 141: > > > v(7) = -HxHydHz > > > > > > 142: > > > col(MatStencil_i,7) = i > > > > > > 143: > > > col(MatStencil_j,7) = j > > > > > > 144: > > > col(MatStencil_k,7) = k+1 > > > > > > 145: call MatSetValuesStencil(jac,i1,row,i7,col,v,INSERT_VALUES > > > ,ierr) > > > > > > 146: endif > > > ..................................................................................... > > > What I am confused about is what it means to have the value of row in i and j directions(row(MatStencil_i,1) & row(MatStencil_j,1)). > > > Same confusion goes for the column values as well. I mean generally in a 2D Matrix row values are in j/y direction and column values are in i/x direction. > > > Could you please explain that? > > > > > > Regards, > > > Maahi Talukder > > > Department of Mechanical Engineering > > > Clarkson University > > > From eda.oktay at metu.edu.tr Tue Mar 12 03:03:54 2019 From: eda.oktay at metu.edu.tr (Eda Oktay) Date: Tue, 12 Mar 2019 11:03:54 +0300 Subject: [petsc-users] Matrix Partitioning using PARMETIS Message-ID: Hello, I have a Laplacian matrix PL of matrix A and I try to partition A using PARMETIS. Since PL is sequential and not adjacency matrix, I converted PL to AL, then write the following code: ierr = MatConvert(PL,MATMPIADJ,MAT_INITIAL_MATRIX,&AL);CHKERRQ(ierr); ierr = MatMeshToCellGraph(AL,2,&dual);CHKERRQ(ierr); ierr = MatPartitioningCreate(MPI_COMM_WORLD,&part);CHKERRQ(ierr); ierr = MatPartitioningSetAdjacency(part,dual);CHKERRQ(ierr); ierr = MatPartitioningSetFromOptions(part);CHKERRQ(ierr); ierr = MatPartitioningApply(part,&partitioning);CHKERRQ(ierr); ierr = ISView(partitioning,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr); ierr = ISDestroy(&partitioning);CHKERRQ(ierr); ierr = MatPartitioningDestroy(&part);CHKERRQ(ierr); However, when I look at partitioning with ISView, the index set consists of zeros only. Is that because I have only one processor and my codes are written for only one processor, or is there another problem? I ran my code with -mat_partitioning_type parmetis. Thanks, Eda -------------- next part -------------- An HTML attachment was scrubbed... 
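(A note on the all-zeros result above: MatPartitioningApply() returns, for each local vertex of the adjacency graph, the number of the part it has been assigned to, and by default the number of parts equals the number of MPI processes. On a single process everything therefore lands in part 0, so an index set of all zeros is the expected output rather than necessarily a sign of an error. Below is a minimal, self-contained sketch of the same MatPartitioning flow run on one process; the 4-vertex ring graph is purely illustrative, and MatPartitioningSetNParts() is used to request two parts so that the output is non-trivial.)

#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat             adj;
  MatPartitioning part;
  IS              is;
  PetscInt        *ia, *ja;
  PetscErrorCode  ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
  /* CSR connectivity of a 4-vertex ring; MatCreateMPIAdj takes ownership of ia/ja,
     so they must be allocated with PetscMalloc */
  ierr = PetscMalloc1(5, &ia);CHKERRQ(ierr);
  ierr = PetscMalloc1(8, &ja);CHKERRQ(ierr);
  ia[0] = 0; ia[1] = 2; ia[2] = 4; ia[3] = 6; ia[4] = 8;
  ja[0] = 1; ja[1] = 3; ja[2] = 0; ja[3] = 2; ja[4] = 1; ja[5] = 3; ja[6] = 0; ja[7] = 2;
  ierr = MatCreateMPIAdj(PETSC_COMM_WORLD, 4, 4, ia, ja, NULL, &adj);CHKERRQ(ierr);

  ierr = MatPartitioningCreate(PETSC_COMM_WORLD, &part);CHKERRQ(ierr);
  ierr = MatPartitioningSetAdjacency(part, adj);CHKERRQ(ierr);
  ierr = MatPartitioningSetNParts(part, 2);CHKERRQ(ierr);    /* without this, 1 process -> 1 part -> all zeros */
  ierr = MatPartitioningSetFromOptions(part);CHKERRQ(ierr);  /* e.g. -mat_partitioning_type parmetis */
  ierr = MatPartitioningApply(part, &is);CHKERRQ(ierr);      /* entry v of the IS = part assigned to vertex v */
  ierr = ISView(is, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);

  ierr = ISDestroy(&is);CHKERRQ(ierr);
  ierr = MatPartitioningDestroy(&part);CHKERRQ(ierr);
  ierr = MatDestroy(&adj);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}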
URL: 
From amfoggia at gmail.com Tue Mar 12 03:36:13 2019
From: amfoggia at gmail.com (Ale Foggia)
Date: Tue, 12 Mar 2019 09:36:13 +0100
Subject: [petsc-users] MatCreate performance
In-Reply-To: References: <87o96l9wzu.fsf@jedbrown.org>
Message-ID:

Hello,
I've checked very thoroughly the previous part of the code and I found where the problem is. As you said, it's due to an imbalance in the previous part, before MatCreate. Thank you so much for your answers, they helped me understand.

El lun., 11 mar. 2019 a las 14:37, Mark Adams () escribió:

> The PETSc logs print the max time and the ratio max/min.
>
> On Mon, Mar 11, 2019 at 8:24 AM Ale Foggia via petsc-users < petsc-users at mcs.anl.gov> wrote:
>
>> Hello all,
>>
>> Thanks for your answers.
>>
>> 1) I'm working with a matrix with a linear size of 2**34, but it's a sparse matrix, and the number of elements different from zero is 43,207,072,74. I know that the distribution of these elements is not balanced between the processes; the matrix is more populated in the middle part.
>>
>> 2) I initialize Slepc. Then I create the basis elements of the system (this part does not involve Petsc/Slepc, and every process is just computing -and owns- an equal amount of basis elements). Then I call:
>> ierr = MatCreate(PETSC_COMM_WORLD, &A); CHKERRQ(ierr);
>> ierr = MatSetType(A, MATMPIAIJ); CHKERRQ(ierr);
>> ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, size, size); CHKERRQ(ierr);
>> ierr = MatMPIAIJSetPreallocation(A, 0, d_nnz, 0, o_nnz); CHKERRQ(ierr);
>> ierr = MatZeroEntries(A); CHKERRQ(ierr);
>> After this, I compute the elements of the matrix and set the values with MatSetValues. Then I call EPSSolve (with KrylovSchur and setting the type as EPS_HEP).
>>
>> 3) There are a few more things that are strange to me. I measure the execution time of these parts both with a PetscLogStage and with a std::chrono (in nanoseconds) clock. I understand that the time given by the Log is an average over the processes, right? In the case of the std::chrono, I'm only printing the times from process 0 (no average over processes). What I see is the following:
>>
>>                  1024 procs        2048 procs        4096 procs        8192 procs
>>                  Log      std      Log      std      Log      std      Log      std
>> MatCreate        68.42    122.7    67.08    121.2    62.29    116      73.36    127.4
>> preallocation    140.36   140.3    76.45    76.45    40.31    40.3     21.13    21.12
>> MatSetValues     237.79   237.7    116.6    116.6    60.59    60.59    35.32    35.32
>> EPSSolve         162.8    160      95.8     94.2     62.17    60.63    41.16    40.24
>>
>> - So, all the times (including the total execution time that I'm not showing here) are the same between PetscLogStage and the std::chrono clock, except for the MatCreate part. Maybe that part is very unbalanced?
>> - The time of the MatCreate given by the PetscLogStage is not changing.
>>
>> Ale
>>
>> El vie., 8 mar. 2019 a las 17:00, Jed Brown () escribió:
>>
>>> This is very unusual. MatCreate() does no work, merely dup'ing a communicator (or referencing an inner communicator if this is not the first PetscObject on the provided communicator). What size matrices are you working with? Can you send some performance data and (if feasible) a reproducer?
>>>
>>> Ale Foggia via petsc-users writes:
>>>
>>> > Hello all,
>>> >
>>> > I have a problem with the scaling of the MatCreate() function. I wrote a
>>> > code to diagonalize sparse matrices and I'm running it in parallel.
I basically want to create copies of a >>>>> matrix from PETSC_COMM_WORLD to subcommunicators, do some work on each >>>>> subcommunicator and than gather the results back to PETSC_COMM_WORLD, >>>>> namely I want to sum the the invidual matrices from the subcommunicatos >>>>> component wise and get the resulting matrix on PETSC_COMM_WORLD. Is this >>>>> somehow possible without going through all the hassle of using MPI >>>>> directly? >>>>> >>>>> marius >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From finnkochinski at keemail.me Tue Mar 12 08:03:15 2019 From: finnkochinski at keemail.me (finnkochinski at keemail.me) Date: Tue, 12 Mar 2019 14:03:15 +0100 (CET) Subject: [petsc-users] DMPlexComputeCellGeometryFVM: "Cannot handle faces with 1 vertices" Message-ID: Hello, with the code below, I create a tetrahedron using DMPlexCreateFromDAG, then I try to run DMPlexComputeCellGeometryFVM on this cell. The latter call fails with output: ... [0]PETSC ERROR: --------------------- Error Message ----------------- [0]PETSC ERROR: Argument out of range [0]PETSC ERROR: Cannot handle faces with 1 vertices... (full output is attached). What am I doing wrong? regards Chris Here is the code: static char help[] = "No help \n"; #include #undef __FUNCT__ #define __FUNCT__ "main" int main(int argc,char **args) { ? PetscErrorCode ierr; ? PetscInitialize(&argc,&args,(char*)0,help); ? DM??????????????? dm; ? int cStart,cEnd; ? int fStart,fEnd; ? int depth = 3; ? int dim = 3; ? PetscInt??? numPoints[4]??????? = {1,4,6,4}; ? PetscInt??? coneSize[15]???????? = {4,3,3,3,3,2,2,2,2,2,2,0,0,0,0}; ? PetscInt??? cones[28]??????????? = {1,2,3,4, 5,9,8, 9,6,10, 10,8,7, 5,6,7, 11,12, 12,13, 13,11, 11,14, 12,14, 13,14}; ? PetscInt??? coneOrientations[28] = {0 }; ? PetscScalar vertexCoords[12]???? = {0,0,0, 1,0,0, 0,1,0, 0,0,1}; ? DMCreate(PETSC_COMM_WORLD, &dm); ? DMSetType(dm, DMPLEX); ? DMSetDimension(dm,dim); ? DMPlexCreateFromDAG(dm, depth, numPoints, coneSize, cones, coneOrientations, vertexCoords); ? DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd); ? DMPlexGetHeightStratum(dm, 1, &fStart, &fEnd); ? for (int k =cStart;k -------------- next part -------------- A non-text attachment was scrubbed... 
Name: log Type: application/octet-stream Size: 4241 bytes Desc: not available URL: From knepley at gmail.com Tue Mar 12 08:08:31 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 12 Mar 2019 09:08:31 -0400 Subject: [petsc-users] DMPlexComputeCellGeometryFVM: "Cannot handle faces with 1 vertices" In-Reply-To: References: Message-ID: On Tue, Mar 12, 2019 at 9:03 AM Chris Finn via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello, > with the code below, I create a tetrahedron using DMPlexCreateFromDAG, > then I try to run DMPlexComputeCellGeometryFVM on this cell. The latter > call fails with output: > All the geometry stuff requires that you interpolate the mesh I think, Just use DMPlexInterpolate(). Thanks, Matt > ... > [0]PETSC ERROR: --------------------- Error Message ----------------- > [0]PETSC ERROR: Argument out of range > [0]PETSC ERROR: Cannot handle faces with 1 vertices > ... > (full output is attached). > > What am I doing wrong? > regards > Chris > > Here is the code: > static char help[] = "No help \n"; > > #include > > #undef __FUNCT__ > #define __FUNCT__ "main" > int main(int argc,char **args) > { > PetscErrorCode ierr; > PetscInitialize(&argc,&args,(char*)0,help); > > DM dm; > int cStart,cEnd; > int fStart,fEnd; > > int depth = 3; > int dim = 3; > > PetscInt numPoints[4] = {1,4,6,4}; > PetscInt coneSize[15] = {4,3,3,3,3,2,2,2,2,2,2,0,0,0,0}; > PetscInt cones[28] = {1,2,3,4, 5,9,8, 9,6,10, 10,8,7, > 5,6,7, 11,12, 12,13, 13,11, 11,14, 12,14, 13,14}; > PetscInt coneOrientations[28] = {0 }; > PetscScalar vertexCoords[12] = {0,0,0, 1,0,0, 0,1,0, 0,0,1}; > > DMCreate(PETSC_COMM_WORLD, &dm); > DMSetType(dm, DMPLEX); > DMSetDimension(dm,dim); > DMPlexCreateFromDAG(dm, depth, numPoints, coneSize, cones, > coneOrientations, vertexCoords); > > DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd); > DMPlexGetHeightStratum(dm, 1, &fStart, &fEnd); > > for (int k =cStart;k double vol; > double centroid[3]; > double normal[3]; > ierr = DMPlexComputeCellGeometryFVM(dm, k, &vol, > centroid,NULL);CHKERRQ(ierr); > printf("FVM: V=%f c=(%f %f %f) n=(%f %f > %f)\n",vol,centroid[0],centroid[1],centroid[2], > normal[0],normal[1],normal[2]); > } > ierr = PetscFinalize(); > return 0; > } > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Mar 12 08:20:12 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 12 Mar 2019 09:20:12 -0400 Subject: [petsc-users] Problems about SNES In-Reply-To: References: Message-ID: On Wed, Jan 16, 2019 at 10:59 PM Yingjie Wu via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear PETSc developers: > Hi, > During the process of testing the program, I found some questions about > SNES. These are some basic questions that I have overlooked. Please help me > to answer them. > 1. Because my program uses - snes_mf_operator, there is no Jacobian > matrix. Linear and non-linear step residuals are different in petsc. The > linear step residuals are r_linear = J*?x-f(x). Since I don't have a > Jacobian matrix, I don't know how to calculate the relative residuals of > linear steps provided in petsc. Do we use the finite difference > approximation matrix vector product when calculating the residuals? > PETSc is using a FD approximation to the Jacobian. > 2. 
Read the user's manual for a brief introduction to the inexact Newton > method, but I am very interested in the use of this method. I want to know > how to use this method in petsc. > Inexact Newton is any method that does not solve the Newton system exactly, so essentially any iterative solver. Do you mean quasi-Newton, which is -snes_type qn > 3. The default line search used by SNES in PETSc is bt, which often fails > in program debugging. I don't know much about linesearch, and I'm curious > to know why it failed. How can I supplement this knowledge? > Usually bt only fails if your search direction is not actually correct. Are you sure your Jacobian is right? Thanks, Matt > Thanks, > Yingjie > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From finnkochinski at keemail.me Tue Mar 12 08:25:50 2019 From: finnkochinski at keemail.me (finnkochinski at keemail.me) Date: Tue, 12 Mar 2019 14:25:50 +0100 (CET) Subject: [petsc-users] DMPlexComputeCellGeometryFVM: "Cannot handle faces with 1 vertices" In-Reply-To: References: <> Message-ID: Thank you, I added DMPlexInterpolate(). Now I get a different error in DMPlexInterpolate(): ... [0]PETSC ERROR: Dimension 0 not supported ... The source code and full output are attached. Anybody able to fix this? regards Chris -- Securely sent with Tutanota. Get your own encrypted, ad-free mailbox: https://tutanota.com Mar 12, 2019, 1:08 PM by knepley at gmail.com: > On Tue, Mar 12, 2019 at 9:03 AM Chris Finn via petsc-users <> petsc-users at mcs.anl.gov > > wrote: > >> Hello, >> with the code below, I create a tetrahedron using DMPlexCreateFromDAG, then I try to run DMPlexComputeCellGeometryFVM on this cell. The latter call fails with output: >> > > All the geometry stuff requires that you interpolate the mesh I think, Just use DMPlexInterpolate(). > > ? Thanks, > > ? ? Matt > ? > >> >> ... >> [0]PETSC ERROR: --------------------- Error Message ----------------- >> [0]PETSC ERROR: Argument out of range >> [0]PETSC ERROR: Cannot handle faces with 1 vertices >> ... >> (full output is attached). >> >> What am I doing wrong? >> regards >> Chris >> >> Here is the code: >> static char help[] = "No help \n"; >> >> #include >> >> #undef __FUNCT__ >> #define __FUNCT__ "main" >> int main(int argc,char **args) >> { >> ? PetscErrorCode ierr; >> ? PetscInitialize(&argc,&args,(char*)0,help); >> >> ? DM??????????????? dm; >> ? int cStart,cEnd; >> ? int fStart,fEnd; >> >> ? int depth = 3; >> ? int dim = 3; >> >> ? PetscInt??? numPoints[4]??????? = {1,4,6,4}; >> ? PetscInt??? coneSize[15]???????? = {4,3,3,3,3,2,2,2,2,2,2,0,0,0,0}; >> ? PetscInt??? cones[28]??????????? = {1,2,3,4, 5,9,8, 9,6,10, 10,8,7, 5,6,7, 11,12, 12,13, 13,11, 11,14, 12,14, 13,14}; >> ? PetscInt??? coneOrientations[28] = {0 }; >> ? PetscScalar vertexCoords[12]???? = {0,0,0, 1,0,0, 0,1,0, 0,0,1}; >> >> ? DMCreate(PETSC_COMM_WORLD, &dm); >> ? DMSetType(dm, DMPLEX); >> ? DMSetDimension(dm,dim); >> ? DMPlexCreateFromDAG(dm, depth, numPoints, coneSize, cones, coneOrientations, vertexCoords); >> >> ? DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd); >> ? DMPlexGetHeightStratum(dm, 1, &fStart, &fEnd); >> >> ? for (int k =cStart;k> ??? double vol; >> ??? double centroid[3]; >> ??? double normal[3]; >> ??? 
ierr = DMPlexComputeCellGeometryFVM(dm, k, &vol, centroid,NULL);CHKERRQ(ierr); >> ??? printf("FVM: V=%f c=(%f %f %f) n=(%f %f %f)\n",vol,centroid[0],centroid[1],centroid[2], >> ????? normal[0],normal[1],normal[2]); >> ? } >> ? ierr = PetscFinalize(); >> ? return 0; >> } >> >> >> > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: testplex.c Type: text/x-csrc Size: 1441 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: log2 Type: application/octet-stream Size: 4483 bytes Desc: not available URL: From knepley at gmail.com Tue Mar 12 08:38:08 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 12 Mar 2019 09:38:08 -0400 Subject: [petsc-users] DMPlexComputeCellGeometryFVM: "Cannot handle faces with 1 vertices" In-Reply-To: References: Message-ID: On Tue, Mar 12, 2019 at 9:25 AM wrote: > Thank you, > I added DMPlexInterpolate(). Now I get a different error in > DMPlexInterpolate(): > ... > Okay, you did not need interpolate. You already specified all the levels. However, your orientations are wrong. Get rid of that code. > [0]PETSC ERROR: Dimension 0 not supported > ... > > The source code and full output are attached. Anybody able to fix this? > The problem is a numbering convention in the library. Plex will accept any consistent DAG. However, if you want to use other things in the library, like geometry routines, then there is a convention on numbering (which makes many things simpler). We require that you number contiguously: Cells: [0, Nc) Vertices: [Nc, Nc+Nv) Edges: [Nc+Nv, Nc+Nv+Ne) Faces: [Nc+Nv+Ne, Nc+Nv+Ne+Nf) You numbered the faces in the vertices slot, so the geometry routines got confused. Thanks, Matt > regards > Chris > > -- > Securely sent with Tutanota. Get your own encrypted, ad-free mailbox: > https://tutanota.com > > > Mar 12, 2019, 1:08 PM by knepley at gmail.com: > > On Tue, Mar 12, 2019 at 9:03 AM Chris Finn via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hello, > with the code below, I create a tetrahedron using DMPlexCreateFromDAG, > then I try to run DMPlexComputeCellGeometryFVM on this cell. The latter > call fails with output: > > > All the geometry stuff requires that you interpolate the mesh I think, > Just use DMPlexInterpolate(). > > Thanks, > > Matt > > > > ... > [0]PETSC ERROR: --------------------- Error Message ----------------- > [0]PETSC ERROR: Argument out of range > [0]PETSC ERROR: Cannot handle faces with 1 vertices > ... > (full output is attached). > > What am I doing wrong? 
> regards > Chris > > Here is the code: > static char help[] = "No help \n"; > > #include > > #undef __FUNCT__ > #define __FUNCT__ "main" > int main(int argc,char **args) > { > PetscErrorCode ierr; > PetscInitialize(&argc,&args,(char*)0,help); > > DM dm; > int cStart,cEnd; > int fStart,fEnd; > > int depth = 3; > int dim = 3; > > PetscInt numPoints[4] = {1,4,6,4}; > PetscInt coneSize[15] = {4,3,3,3,3,2,2,2,2,2,2,0,0,0,0}; > PetscInt cones[28] = {1,2,3,4, 5,9,8, 9,6,10, 10,8,7, > 5,6,7, 11,12, 12,13, 13,11, 11,14, 12,14, 13,14}; > PetscInt coneOrientations[28] = {0 }; > PetscScalar vertexCoords[12] = {0,0,0, 1,0,0, 0,1,0, 0,0,1}; > > DMCreate(PETSC_COMM_WORLD, &dm); > DMSetType(dm, DMPLEX); > DMSetDimension(dm,dim); > DMPlexCreateFromDAG(dm, depth, numPoints, coneSize, cones, > coneOrientations, vertexCoords); > > DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd); > DMPlexGetHeightStratum(dm, 1, &fStart, &fEnd); > > for (int k =cStart;k double vol; > double centroid[3]; > double normal[3]; > ierr = DMPlexComputeCellGeometryFVM(dm, k, &vol, > centroid,NULL);CHKERRQ(ierr); > printf("FVM: V=%f c=(%f %f %f) n=(%f %f > %f)\n",vol,centroid[0],centroid[1],centroid[2], > normal[0],normal[1],normal[2]); > } > ierr = PetscFinalize(); > return 0; > } > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From finnkochinski at keemail.me Tue Mar 12 09:09:57 2019 From: finnkochinski at keemail.me (finnkochinski at keemail.me) Date: Tue, 12 Mar 2019 15:09:57 +0100 (CET) Subject: [petsc-users] DMPlexComputeCellGeometryFVM: "Cannot handle faces with 1 vertices" In-Reply-To: References: <> <> Message-ID: Next try: Vertex and face numbering corrected (I hope), code and full output attached. Orientations still wrong, but the crash is elsewhere: ... reconstructed depth,dim: 1 3 [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Invalid argument [0]PETSC ERROR: Mesh must be interpolated ... DMPlexComputeCellGeometryFVM() checks depth and dim in the beginning, finds they are different and quits because mesh seems not interpolated. Actually, DMPlexGetDepth and DMGetDimension return 1 and 3 with this vertex/face numbering. They returned the correct 3 and 3 with my previous wrong numbering. It can't be so difficult to create a simplex from scratch? regards Chris Mar 12, 2019, 1:38 PM by knepley at gmail.com: > On Tue, Mar 12, 2019 at 9:25 AM <> finnkochinski at keemail.me > > wrote: > >> Thank you, >> I added DMPlexInterpolate(). Now I get a different error in DMPlexInterpolate(): >> ... >> > > Okay, you did not need interpolate. You already specified all the levels. However, your orientations are wrong. > Get rid of that code. > ? > >> >> [0]PETSC ERROR: Dimension 0 not supported >> ... >> >> The source code and full output are attached. Anybody able to fix this? >> > > The problem is a numbering convention in the library. Plex will accept any consistent DAG. 
However, if you > want to use other things in the library, like geometry routines, then there is a convention on numbering (which > makes many things simpler). We require that you number contiguously: > > ? Cells:? ? ? [0, Nc) > ? Vertices: [Nc, Nc+Nv) > ? Edges:? ? [Nc+Nv, Nc+Nv+Ne) > ? Faces:? ? [Nc+Nv+Ne, Nc+Nv+Ne+Nf) > > You numbered the faces in the vertices slot,? so the geometry routines got confused. > > ? Thanks, > > ? ? Matt > ? > >> >> regards >> Chris >> >> -- >> Securely sent with Tutanota. Get your own encrypted, ad-free mailbox: >> https://tutanota.com >> >> >> Mar 12, 2019, 1:08 PM by >> knepley at gmail.com >> : >> >>> On Tue, Mar 12, 2019 at 9:03 AM Chris Finn via petsc-users <>>> petsc-users at mcs.anl.gov >>> > wrote: >>> >>>> Hello, >>>> with the code below, I create a tetrahedron using DMPlexCreateFromDAG, then I try to run DMPlexComputeCellGeometryFVM on this cell. The latter call fails with output: >>>> >>> >>> All the geometry stuff requires that you interpolate the mesh I think, Just use DMPlexInterpolate(). >>> >>> ? Thanks, >>> >>> ? ? Matt >>> ? >>> >>>> >>>> ... >>>> [0]PETSC ERROR: --------------------- Error Message ----------------- >>>> [0]PETSC ERROR: Argument out of range >>>> [0]PETSC ERROR: Cannot handle faces with 1 vertices >>>> ... >>>> (full output is attached). >>>> >>>> What am I doing wrong? >>>> regards >>>> Chris >>>> >>>> Here is the code: >>>> static char help[] = "No help \n"; >>>> >>>> #include >>>> >>>> #undef __FUNCT__ >>>> #define __FUNCT__ "main" >>>> int main(int argc,char **args) >>>> { >>>> ? PetscErrorCode ierr; >>>> ? PetscInitialize(&argc,&args,(char*)0,help); >>>> >>>> ? DM??????????????? dm; >>>> ? int cStart,cEnd; >>>> ? int fStart,fEnd; >>>> >>>> ? int depth = 3; >>>> ? int dim = 3; >>>> >>>> ? PetscInt??? numPoints[4]??????? = {1,4,6,4}; >>>> ? PetscInt??? coneSize[15]???????? = {4,3,3,3,3,2,2,2,2,2,2,0,0,0,0}; >>>> ? PetscInt??? cones[28]??????????? = {1,2,3,4, 5,9,8, 9,6,10, 10,8,7, 5,6,7, 11,12, 12,13, 13,11, 11,14, 12,14, 13,14}; >>>> ? PetscInt??? coneOrientations[28] = {0 }; >>>> ? PetscScalar vertexCoords[12]???? = {0,0,0, 1,0,0, 0,1,0, 0,0,1}; >>>> >>>> ? DMCreate(PETSC_COMM_WORLD, &dm); >>>> ? DMSetType(dm, DMPLEX); >>>> ? DMSetDimension(dm,dim); >>>> ? DMPlexCreateFromDAG(dm, depth, numPoints, coneSize, cones, coneOrientations, vertexCoords); >>>> >>>> ? DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd); >>>> ? DMPlexGetHeightStratum(dm, 1, &fStart, &fEnd); >>>> >>>> ? for (int k =cStart;k>>> ??? double vol; >>>> ??? double centroid[3]; >>>> ??? double normal[3]; >>>> ??? ierr = DMPlexComputeCellGeometryFVM(dm, k, &vol, centroid,NULL);CHKERRQ(ierr); >>>> ??? printf("FVM: V=%f c=(%f %f %f) n=(%f %f %f)\n",vol,centroid[0],centroid[1],centroid[2], >>>> ????? normal[0],normal[1],normal[2]); >>>> ? } >>>> ? ierr = PetscFinalize(); >>>> ? return 0; >>>> } >>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >> >> > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: log3 Type: application/octet-stream Size: 4113 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: testplex.c Type: text/x-csrc Size: 1603 bytes Desc: not available URL: From finnkochinski at keemail.me Tue Mar 12 10:20:06 2019 From: finnkochinski at keemail.me (finnkochinski at keemail.me) Date: Tue, 12 Mar 2019 16:20:06 +0100 (CET) Subject: [petsc-users] DMPlexComputeCellGeometryFVM: "Cannot handle faces with 1 vertices" In-Reply-To: References: <> <> Message-ID: I think I found a hint in src/dm/impls/plex/examples/tests/ex5.c to get my code corrected. Apparently the contiguous numbering is NOT: cells, vertices, edges, faces (as you said), BUT: cells, vertices, faces, edges. Then in numPoints, you don't put depth 0..3 or depth 3..0, BUT: #vertices, #faces, #edges, #cells. This seems completely random, but works in the attached example. Maybe you should finally take the time to document this mess if even the developers no longer understand it. regards Chris Mar 12, 2019, 1:38 PM by knepley at gmail.com: > On Tue, Mar 12, 2019 at 9:25 AM <> finnkochinski at keemail.me > > wrote: > >> Thank you, >> I added DMPlexInterpolate(). Now I get a different error in DMPlexInterpolate(): >> ... >> > > Okay, you did not need interpolate. You already specified all the levels. However, your orientations are wrong. > Get rid of that code. > ? > >> >> [0]PETSC ERROR: Dimension 0 not supported >> ... >> >> The source code and full output are attached. Anybody able to fix this? >> > > The problem is a numbering convention in the library. Plex will accept any consistent DAG. However, if you > want to use other things in the library, like geometry routines, then there is a convention on numbering (which > makes many things simpler). We require that you number contiguously: > > ? Cells:? ? ? [0, Nc) > ? Vertices: [Nc, Nc+Nv) > ? Edges:? ? [Nc+Nv, Nc+Nv+Ne) > ? Faces:? ? [Nc+Nv+Ne, Nc+Nv+Ne+Nf) > > You numbered the faces in the vertices slot,? so the geometry routines got confused. > > ? Thanks, > > ? ? Matt > ? > >> >> regards >> Chris >> >> -- >> Securely sent with Tutanota. Get your own encrypted, ad-free mailbox: >> https://tutanota.com >> >> >> Mar 12, 2019, 1:08 PM by >> knepley at gmail.com >> : >> >>> On Tue, Mar 12, 2019 at 9:03 AM Chris Finn via petsc-users <>>> petsc-users at mcs.anl.gov >>> > wrote: >>> >>>> Hello, >>>> with the code below, I create a tetrahedron using DMPlexCreateFromDAG, then I try to run DMPlexComputeCellGeometryFVM on this cell. The latter call fails with output: >>>> >>> >>> All the geometry stuff requires that you interpolate the mesh I think, Just use DMPlexInterpolate(). >>> >>> ? Thanks, >>> >>> ? ? Matt >>> ? >>> >>>> >>>> ... >>>> [0]PETSC ERROR: --------------------- Error Message ----------------- >>>> [0]PETSC ERROR: Argument out of range >>>> [0]PETSC ERROR: Cannot handle faces with 1 vertices >>>> ... >>>> (full output is attached). >>>> >>>> What am I doing wrong? >>>> regards >>>> Chris >>>> >>>> Here is the code: >>>> static char help[] = "No help \n"; >>>> >>>> #include >>>> >>>> #undef __FUNCT__ >>>> #define __FUNCT__ "main" >>>> int main(int argc,char **args) >>>> { >>>> ? PetscErrorCode ierr; >>>> ? PetscInitialize(&argc,&args,(char*)0,help); >>>> >>>> ? DM??????????????? dm; >>>> ? int cStart,cEnd; >>>> ? int fStart,fEnd; >>>> >>>> ? int depth = 3; >>>> ? int dim = 3; >>>> >>>> ? PetscInt??? numPoints[4]??????? = {1,4,6,4}; >>>> ? PetscInt??? coneSize[15]???????? 
= {4,3,3,3,3,2,2,2,2,2,2,0,0,0,0}; >>>> ? PetscInt??? cones[28]??????????? = {1,2,3,4, 5,9,8, 9,6,10, 10,8,7, 5,6,7, 11,12, 12,13, 13,11, 11,14, 12,14, 13,14}; >>>> ? PetscInt??? coneOrientations[28] = {0 }; >>>> ? PetscScalar vertexCoords[12]???? = {0,0,0, 1,0,0, 0,1,0, 0,0,1}; >>>> >>>> ? DMCreate(PETSC_COMM_WORLD, &dm); >>>> ? DMSetType(dm, DMPLEX); >>>> ? DMSetDimension(dm,dim); >>>> ? DMPlexCreateFromDAG(dm, depth, numPoints, coneSize, cones, coneOrientations, vertexCoords); >>>> >>>> ? DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd); >>>> ? DMPlexGetHeightStratum(dm, 1, &fStart, &fEnd); >>>> >>>> ? for (int k =cStart;k>>> ??? double vol; >>>> ??? double centroid[3]; >>>> ??? double normal[3]; >>>> ??? ierr = DMPlexComputeCellGeometryFVM(dm, k, &vol, centroid,NULL);CHKERRQ(ierr); >>>> ??? printf("FVM: V=%f c=(%f %f %f) n=(%f %f %f)\n",vol,centroid[0],centroid[1],centroid[2], >>>> ????? normal[0],normal[1],normal[2]); >>>> ? } >>>> ? ierr = PetscFinalize(); >>>> ? return 0; >>>> } >>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >> >> > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: testplex.c Type: text/x-csrc Size: 1838 bytes Desc: not available URL: From knepley at gmail.com Tue Mar 12 12:11:26 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 12 Mar 2019 13:11:26 -0400 Subject: [petsc-users] DMPlexComputeCellGeometryFVM: "Cannot handle faces with 1 vertices" In-Reply-To: References: Message-ID: On Tue, Mar 12, 2019 at 10:09 AM wrote: > Next try: > Vertex and face numbering corrected (I hope), code and full output > attached. > Orientations still wrong, but the crash is elsewhere: > ... > reconstructed depth,dim: 1 3 > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Invalid argument > [0]PETSC ERROR: Mesh must be interpolated > ... > > DMPlexComputeCellGeometryFVM() checks depth and dim in the beginning, > finds they are different and quits because mesh seems not interpolated. > Actually, DMPlexGetDepth and DMGetDimension return 1 and 3 with this > vertex/face numbering. They returned the correct 3 and 3 with my previous > wrong numbering. > > It can't be so difficult to create a simplex from scratch? > Well, it is difficult to get everything right. Here is me doing making a reference tetrahedron: https://bitbucket.org/petsc/petsc/src/e551bc0b4a184f2209a31a59ea3fbdc3edbf3863/src/dm/impls/plex/plexcreate.c#lines-3404 I use CreateFormDAG() only to get the topology right, and then call DMInterpolate() to get the edges/faces/orientations correct. I can debug this code if you want, or you can use this method. Either way. Thanks, Matt regards > Chris > > Mar 12, 2019, 1:38 PM by knepley at gmail.com: > > On Tue, Mar 12, 2019 at 9:25 AM wrote: > > Thank you, > I added DMPlexInterpolate(). Now I get a different error in > DMPlexInterpolate(): > ... > > > Okay, you did not need interpolate. You already specified all the levels. 
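[Aside: a minimal self-contained sketch of the interpolation route described above: hand DMPlexCreateFromDAG() only the cell and its vertices (depth 1) and let DMPlexInterpolate() build the faces, edges and orientations. This is not the code attached to this thread; the cone ordering and the unit-tetrahedron coordinates are assumptions patterned on the reference-cell construction linked above, so if the computed volume comes out negative, swap two vertices in the cone.]

static char help[] = "Single tetrahedron: cells+vertices only, then DMPlexInterpolate.\n";

#include <petscdmplex.h>

int main(int argc, char **argv)
{
  DM             dm, idm;
  PetscInt       numPoints[2]        = {4, 1};          /* 4 vertices (depth 0), 1 cell (depth 1) */
  PetscInt       coneSize[5]         = {4, 0, 0, 0, 0}; /* point 0 = cell, points 1..4 = vertices */
  PetscInt       cones[4]            = {1, 3, 2, 4};    /* assumed ordering; permute if volume < 0 */
  PetscInt       coneOrientations[4] = {0, 0, 0, 0};
  PetscScalar    vertexCoords[12]    = {0,0,0, 1,0,0, 0,1,0, 0,0,1};
  PetscReal      vol, centroid[3];
  PetscInt       cStart, cEnd, c;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, help); if (ierr) return ierr;
  ierr = DMCreate(PETSC_COMM_WORLD, &dm);CHKERRQ(ierr);
  ierr = DMSetType(dm, DMPLEX);CHKERRQ(ierr);
  ierr = DMSetDimension(dm, 3);CHKERRQ(ierr);
  /* Depth-1 DAG: no faces or edges given here, interpolation creates them */
  ierr = DMPlexCreateFromDAG(dm, 1, numPoints, coneSize, cones, coneOrientations, vertexCoords);CHKERRQ(ierr);
  ierr = DMPlexInterpolate(dm, &idm);CHKERRQ(ierr);
  ierr = DMPlexCopyCoordinates(dm, idm);CHKERRQ(ierr); /* harmless if interpolate already copied them */
  ierr = DMDestroy(&dm);CHKERRQ(ierr);
  ierr = DMPlexGetHeightStratum(idm, 0, &cStart, &cEnd);CHKERRQ(ierr);
  for (c = cStart; c < cEnd; ++c) {
    ierr = DMPlexComputeCellGeometryFVM(idm, c, &vol, centroid, NULL);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD, "FVM: V=%g c=(%g %g %g)\n", (double)vol,
                       (double)centroid[0], (double)centroid[1], (double)centroid[2]);CHKERRQ(ierr);
  }
  ierr = DMDestroy(&idm);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}
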
> However, your orientations are wrong. > Get rid of that code. > > > > [0]PETSC ERROR: Dimension 0 not supported > ... > > The source code and full output are attached. Anybody able to fix this? > > > The problem is a numbering convention in the library. Plex will accept any > consistent DAG. However, if you > want to use other things in the library, like geometry routines, then > there is a convention on numbering (which > makes many things simpler). We require that you number contiguously: > > Cells: [0, Nc) > Vertices: [Nc, Nc+Nv) > Edges: [Nc+Nv, Nc+Nv+Ne) > Faces: [Nc+Nv+Ne, Nc+Nv+Ne+Nf) > > You numbered the faces in the vertices slot, so the geometry routines got > confused. > > Thanks, > > Matt > > > > regards > Chris > > -- > Securely sent with Tutanota. Get your own encrypted, ad-free mailbox: > https://tutanota.com > > > Mar 12, 2019, 1:08 PM by knepley at gmail.com: > > On Tue, Mar 12, 2019 at 9:03 AM Chris Finn via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hello, > with the code below, I create a tetrahedron using DMPlexCreateFromDAG, > then I try to run DMPlexComputeCellGeometryFVM on this cell. The latter > call fails with output: > > > All the geometry stuff requires that you interpolate the mesh I think, > Just use DMPlexInterpolate(). > > Thanks, > > Matt > > > > ... > [0]PETSC ERROR: --------------------- Error Message ----------------- > [0]PETSC ERROR: Argument out of range > [0]PETSC ERROR: Cannot handle faces with 1 vertices > ... > (full output is attached). > > What am I doing wrong? > regards > Chris > > Here is the code: > static char help[] = "No help \n"; > > #include > > #undef __FUNCT__ > #define __FUNCT__ "main" > int main(int argc,char **args) > { > PetscErrorCode ierr; > PetscInitialize(&argc,&args,(char*)0,help); > > DM dm; > int cStart,cEnd; > int fStart,fEnd; > > int depth = 3; > int dim = 3; > > PetscInt numPoints[4] = {1,4,6,4}; > PetscInt coneSize[15] = {4,3,3,3,3,2,2,2,2,2,2,0,0,0,0}; > PetscInt cones[28] = {1,2,3,4, 5,9,8, 9,6,10, 10,8,7, > 5,6,7, 11,12, 12,13, 13,11, 11,14, 12,14, 13,14}; > PetscInt coneOrientations[28] = {0 }; > PetscScalar vertexCoords[12] = {0,0,0, 1,0,0, 0,1,0, 0,0,1}; > > DMCreate(PETSC_COMM_WORLD, &dm); > DMSetType(dm, DMPLEX); > DMSetDimension(dm,dim); > DMPlexCreateFromDAG(dm, depth, numPoints, coneSize, cones, > coneOrientations, vertexCoords); > > DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd); > DMPlexGetHeightStratum(dm, 1, &fStart, &fEnd); > > for (int k =cStart;k double vol; > double centroid[3]; > double normal[3]; > ierr = DMPlexComputeCellGeometryFVM(dm, k, &vol, > centroid,NULL);CHKERRQ(ierr); > printf("FVM: V=%f c=(%f %f %f) n=(%f %f > %f)\n",vol,centroid[0],centroid[1],centroid[2], > normal[0],normal[1],normal[2]); > } > ierr = PetscFinalize(); > return 0; > } > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Mar 12 12:24:07 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 12 Mar 2019 13:24:07 -0400 Subject: [petsc-users] DMPlexComputeCellGeometryFVM: "Cannot handle faces with 1 vertices" In-Reply-To: References: Message-ID: On Tue, Mar 12, 2019 at 11:20 AM wrote: > I think I found a hint in src/dm/impls/plex/examples/tests/ex5.c to get my > code corrected. > > Apparently the contiguous numbering is > NOT: cells, vertices, edges, faces (as you said), > BUT: cells, vertices, faces, edges. > There is an easy way to check. We can output the whole DAG for a single cell using the ASCII viewer: cd $PETSC_DIR/src/dm/impls/plex/examples/tests make ex1 ./ex1 -dim 3 -cell_simplex 0 -domain_box_sizes 1,1,1 -interpolate -dm_view ::ascii_info_detail and yes, faces are numbered before edges. Sorry about the flip. > Then in numPoints, you don't put depth 0..3 or depth 3..0, > BUT: #vertices, #faces, #edges, #cells. > Actually the order in numPoints does not matter beyond having vertices first, so the documentation is correct: https://bitbucket.org/petsc/petsc/src/e551bc0b4a184f2209a31a59ea3fbdc3edbf3863/src/dm/impls/plex/plexcreate.c#lines-3066 > This seems completely random, but works in the attached example. > Maybe you should finally take the time to document this mess if even the > developers no longer understand it. > There is a chapter in the manual and many manpages. Thanks, Matt > regards > Chris > > Mar 12, 2019, 1:38 PM by knepley at gmail.com: > > On Tue, Mar 12, 2019 at 9:25 AM wrote: > > Thank you, > I added DMPlexInterpolate(). Now I get a different error in > DMPlexInterpolate(): > ... > > > Okay, you did not need interpolate. You already specified all the levels. > However, your orientations are wrong. > Get rid of that code. > > > > [0]PETSC ERROR: Dimension 0 not supported > ... > > The source code and full output are attached. Anybody able to fix this? > > > The problem is a numbering convention in the library. Plex will accept any > consistent DAG. However, if you > want to use other things in the library, like geometry routines, then > there is a convention on numbering (which > makes many things simpler). We require that you number contiguously: > > Cells: [0, Nc) > Vertices: [Nc, Nc+Nv) > Edges: [Nc+Nv, Nc+Nv+Ne) > Faces: [Nc+Nv+Ne, Nc+Nv+Ne+Nf) > > You numbered the faces in the vertices slot, so the geometry routines got > confused. > > Thanks, > > Matt > > > > regards > Chris > > -- > Securely sent with Tutanota. Get your own encrypted, ad-free mailbox: > https://tutanota.com > > > Mar 12, 2019, 1:08 PM by knepley at gmail.com: > > On Tue, Mar 12, 2019 at 9:03 AM Chris Finn via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hello, > with the code below, I create a tetrahedron using DMPlexCreateFromDAG, > then I try to run DMPlexComputeCellGeometryFVM on this cell. The latter > call fails with output: > > > All the geometry stuff requires that you interpolate the mesh I think, > Just use DMPlexInterpolate(). > > Thanks, > > Matt > > > > ... > [0]PETSC ERROR: --------------------- Error Message ----------------- > [0]PETSC ERROR: Argument out of range > [0]PETSC ERROR: Cannot handle faces with 1 vertices > ... > (full output is attached). > > What am I doing wrong? 
> regards > Chris > > Here is the code: > static char help[] = "No help \n"; > > #include > > #undef __FUNCT__ > #define __FUNCT__ "main" > int main(int argc,char **args) > { > PetscErrorCode ierr; > PetscInitialize(&argc,&args,(char*)0,help); > > DM dm; > int cStart,cEnd; > int fStart,fEnd; > > int depth = 3; > int dim = 3; > > PetscInt numPoints[4] = {1,4,6,4}; > PetscInt coneSize[15] = {4,3,3,3,3,2,2,2,2,2,2,0,0,0,0}; > PetscInt cones[28] = {1,2,3,4, 5,9,8, 9,6,10, 10,8,7, > 5,6,7, 11,12, 12,13, 13,11, 11,14, 12,14, 13,14}; > PetscInt coneOrientations[28] = {0 }; > PetscScalar vertexCoords[12] = {0,0,0, 1,0,0, 0,1,0, 0,0,1}; > > DMCreate(PETSC_COMM_WORLD, &dm); > DMSetType(dm, DMPLEX); > DMSetDimension(dm,dim); > DMPlexCreateFromDAG(dm, depth, numPoints, coneSize, cones, > coneOrientations, vertexCoords); > > DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd); > DMPlexGetHeightStratum(dm, 1, &fStart, &fEnd); > > for (int k =cStart;k double vol; > double centroid[3]; > double normal[3]; > ierr = DMPlexComputeCellGeometryFVM(dm, k, &vol, > centroid,NULL);CHKERRQ(ierr); > printf("FVM: V=%f c=(%f %f %f) n=(%f %f > %f)\n",vol,centroid[0],centroid[1],centroid[2], > normal[0],normal[1],normal[2]); > } > ierr = PetscFinalize(); > return 0; > } > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Mar 12 12:41:04 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 12 Mar 2019 17:41:04 +0000 Subject: [petsc-users] Matrix Partitioning using PARMETIS In-Reply-To: References: Message-ID: <72BE401F-7C24-427C-A426-95B7E93F5D98@anl.gov> Yes, by default there is one subdomain per process so if you run on one process you will get all zero indices. Run on two processes and you should see a partitioning. See also MatPartitioningSetNParts() Barry > On Mar 12, 2019, at 3:03 AM, Eda Oktay via petsc-users wrote: > > Hello, > > I have a Laplacian matrix PL of matrix A and I try to partition A using PARMETIS. Since PL is sequential and not adjacency matrix, I converted PL to AL, then write the following code: > > ierr = MatConvert(PL,MATMPIADJ,MAT_INITIAL_MATRIX,&AL);CHKERRQ(ierr); > ierr = MatMeshToCellGraph(AL,2,&dual);CHKERRQ(ierr); > ierr = MatPartitioningCreate(MPI_COMM_WORLD,&part);CHKERRQ(ierr); > ierr = MatPartitioningSetAdjacency(part,dual);CHKERRQ(ierr); > ierr = MatPartitioningSetFromOptions(part);CHKERRQ(ierr); > ierr = MatPartitioningApply(part,&partitioning);CHKERRQ(ierr); > ierr = ISView(partitioning,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr); > ierr = ISDestroy(&partitioning);CHKERRQ(ierr); > ierr = MatPartitioningDestroy(&part);CHKERRQ(ierr); > > However, when I look at partitioning with ISView, the index set consists of zeros only. 
Is that because I have only one processor and my codes are written for only one processor, or is there another problem? I ran my code with -mat_partitioning_type parmetis. > > Thanks, > > Eda From jczhang at mcs.anl.gov Tue Mar 12 13:08:10 2019 From: jczhang at mcs.anl.gov (Zhang, Junchao) Date: Tue, 12 Mar 2019 18:08:10 +0000 Subject: [petsc-users] PCFieldSplit with MatNest In-Reply-To: References: Message-ID: Hi, Manuel, I recently fixed a problem in VecRestoreArrayRead. Basically, I added VecRestoreArrayRead_Nest. Could you try the master branch of PETSc to see if it fixes your problem? Thanks. --Junchao Zhang On Mon, Mar 11, 2019 at 6:56 AM Manuel Colera Rico via petsc-users > wrote: Hello, I need to solve a 2*2 block linear system. The matrices A_00, A_01, A_10, A_11 are constructed separately via MatCreateSeqAIJWithArrays and MatCreateSeqSBAIJWithArrays. Then, I construct the full system matrix with MatCreateNest, and use MatNestGetISs and PCFieldSplitSetIS to set up the PC, trying to follow the procedure described here: https://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex70.c.html. However, when I run the code with Leak Sanitizer, I get the following error: ================================================================= ==54927==ERROR: AddressSanitizer: attempting free on address which was not malloc()-ed: 0x627000051ab8 in thread T0 #0 0x7fbd95c08f30 in __interceptor_free ../../../../gcc-8.1.0/libsanitizer/asan/asan_malloc_linux.cc:66 #1 0x7fbd92b99dcd in PetscFreeAlign (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x146dcd) #2 0x7fbd92ce0178 in VecRestoreArray_Nest (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x28d178) #3 0x7fbd92cd627d in VecRestoreArrayRead (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x28327d) #4 0x7fbd92d1189e in VecScatterBegin_SSToSS (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x2be89e) #5 0x7fbd92d1a414 in VecScatterBegin (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x2c7414) #6 0x7fbd934a999c in PCApply_FieldSplit (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa5699c) #7 0x7fbd93369071 in PCApply (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x916071) #8 0x7fbd934efe77 in KSPInitialResidual (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa9ce77) #9 0x7fbd9350272c in KSPSolve_GMRES (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xaaf72c) #10 0x7fbd934e3c01 in KSPSolve (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa90c01) Disabling Leak Sanitizer also outputs an "invalid pointer" error. Did I forget something when writing the code? Thank you, Manuel --- -------------- next part -------------- An HTML attachment was scrubbed... URL: From mlohry at gmail.com Tue Mar 12 16:48:40 2019 From: mlohry at gmail.com (Mark Lohry) Date: Tue, 12 Mar 2019 17:48:40 -0400 Subject: [petsc-users] GAMG parallel convergence sensitivity Message-ID: Hi all, I've run into an unexpected issue with GAMG stagnating for a certain condition. 
I'm running a 3D high order DG discretization for compressible navier-stokes, using matrix-free gmres+amg, with the relevant petsc configuration: -pc_type gamg -ksp_type fgmres -pc_gamg_agg_nsmooths 0 -mg_levels_ksp_type gmres -mg_levels_pc_type bjacobi -mg_levels_ksp_max_it 20 -mg_levels_ksp_rtol 0.0001 -pc_mg_cycle_type v -pc_mg_type full So FGMRES on top, with AMG using ILU block jacobi + GMRES as a smoother. -ksp_view output pasted at the bottom here. This setup has been working fairly robustly. I'm testing two small mesh resolutions, with 1,536 cells and 6,144 cells each, where in the jacobian each cell is a 50x50 dense block, with 4 off-diagonal block neighbors each. With that, I'm testing 2 configurations of the same problem, one with mach 0.1 and the other with mach 0.01 (where the latter makes system much worse conditioned, a kind of stress test.) In serial everything converges well to relative tolerance 0.01: 1,536 cells, Mach 0.1: 2 iterations 6,144 cells, Mach 0.1: 2 iterations 1,536 cells, Mach 0.01: 5 iterations 6,144 cells, Mach 0.01: 5 iterations In parallel most things converge well, with -np 16 cores here: 1,536 cells, Mach 0.1: 3 iterations 6,144 cells, Mach 0.1: 4 iterations 1,536 cells, Mach 0.01: 11 iterations but for the 6,144 cell Mach 0.01 case, it's catastrophically worse: 0 SNES Function norm 6.934657276072e+05 0 KSP Residual norm 6.934657276072e+05 1 KSP Residual norm 6.934440650708e+05 2 KSP Residual norm 6.934157525695e+05 3 KSP Residual norm 6.934145135179e+05 ... 48 KSP Residual norm 6.830785654915e+05 49 KSP Residual norm 6.821332742917e+05 50 KSP Residual norm 6.807807049444e+05 and quickly stalls entirely and won't converge in 100s of iterations. The exact same case in serial shows nice convergence: 0 SNES Function norm 6.934657276072e+05 0 KSP Residual norm 6.934657276072e+05 1 KSP Residual norm 1.705989154365e+05 2 KSP Residual norm 3.183292610749e+04 3 KSP Residual norm 1.568738082749e+04 4 KSP Residual norm 9.875297457387e+03 5 KSP Residual norm 6.489083537720e+03 Linear solve converged due to CONVERGED_RTOL iterations 5 And the marginally coarser 1,536 cell case with the same physics is also healthy with parallel -np 16: 0 SNES Function norm 2.400990060398e+05 0 KSP Residual norm 2.400990060398e+05 1 KSP Residual norm 2.391625967890e+05 2 KSP Residual norm 1.388195699805e+05 3 KSP Residual norm 3.072388366914e+04 4 KSP Residual norm 2.151010198865e+04 5 KSP Residual norm 1.305330349765e+04 6 KSP Residual norm 8.126579575968e+03 7 KSP Residual norm 6.186198840355e+03 8 KSP Residual norm 4.673764041449e+03 9 KSP Residual norm 3.332141521573e+03 10 KSP Residual norm 2.811481187948e+03 11 KSP Residual norm 2.189632613389e+03 Linear solve converged due to CONVERGED_RTOL iterations 11 Any thoughts here? Is there anything obviously wrong with my setup? Any way to reduce the dependence of the convergence iterations on the parallelism? -- obviously I expect the iteration count to be higher in parallel, but I didn't expect such catastrophic failure. Thanks as always, Mark -ksp_view: 0 TS dt 30. time 0. 
0 SNES Function norm 2.856641938332e+04 0 KSP Residual norm 2.856641938332e+04 1 KSP Residual norm 1.562096645358e+03 2 KSP Residual norm 3.008746074553e+02 3 KSP Residual norm 1.463990835793e+02 Linear solve converged due to CONVERGED_RTOL iterations 3 KSP Object: 16 MPI processes type: fgmres restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=100, initial guess is zero tolerances: relative=0.01, absolute=1e-06, divergence=10. right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 16 MPI processes type: gamg type is FULL, levels=5 cycles=v Using externally compute Galerkin coarse grid matrices GAMG specific options Threshold for dropping small values in graph on each level = 0. 0. 0. Threshold scaling factor for each level not specified = 1. AGG specific options Symmetric graph false Number of levels to square graph 1 Number smoothing steps 0 Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 16 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 16 MPI processes type: bjacobi number of blocks = 16 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: nd factor fill ratio given 5., needed 1.10526 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=25, cols=25, bs=5 package used to perform factorization: petsc total: nonzeros=525, allocated nonzeros=525 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=25, cols=25, bs=5 total: nonzeros=475, allocated nonzeros=475 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 16 MPI processes type: mpiaij rows=25, cols=25, bs=5 total: nonzeros=475, allocated nonzeros=475 total number of mallocs used during MatSetValues calls =0 using I-node (on process 0) routines: found 5 nodes, limit used is 5 Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 16 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=20, nonzero initial guess tolerances: relative=0.0001, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_1_) 16 MPI processes type: bjacobi number of blocks = 16 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (mg_levels_1_sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_1_sub_) 1 MPI processes type: ilu out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 matrix ordering: natural factor fill ratio given 1., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=75, cols=75, bs=5 package used to perform factorization: petsc total: nonzeros=1925, allocated nonzeros=1925 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 15 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=75, cols=75, bs=5 total: nonzeros=1925, allocated nonzeros=1925 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 15 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 16 MPI processes type: mpiaij rows=75, cols=75, bs=5 total: nonzeros=1925, allocated nonzeros=1925 total number of mallocs used during MatSetValues calls =0 using I-node (on process 0) routines: found 15 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 16 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=20, nonzero initial guess tolerances: relative=0.0001, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_2_) 16 MPI processes type: bjacobi number of blocks = 16 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (mg_levels_2_sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_2_sub_) 1 MPI processes type: ilu out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 matrix ordering: natural factor fill ratio given 1., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=35, cols=35, bs=5 package used to perform factorization: petsc total: nonzeros=675, allocated nonzeros=675 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 7 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=35, cols=35, bs=5 total: nonzeros=675, allocated nonzeros=675 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 7 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 16 MPI processes type: mpiaij rows=305, cols=305, bs=5 total: nonzeros=8675, allocated nonzeros=8675 total number of mallocs used during MatSetValues calls =0 using I-node (on process 0) routines: found 7 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 3 ------------------------------- KSP Object: (mg_levels_3_) 16 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=20, nonzero initial guess tolerances: relative=0.0001, absolute=1e-50, divergence=10000. 
left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_3_) 16 MPI processes type: bjacobi number of blocks = 16 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (mg_levels_3_sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_3_sub_) 1 MPI processes type: ilu out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 matrix ordering: natural factor fill ratio given 1., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=50, cols=50, bs=5 package used to perform factorization: petsc total: nonzeros=1050, allocated nonzeros=1050 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 10 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=50, cols=50, bs=5 total: nonzeros=1050, allocated nonzeros=1050 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 10 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 16 MPI processes type: mpiaij rows=1090, cols=1090, bs=5 total: nonzeros=32050, allocated nonzeros=32050 total number of mallocs used during MatSetValues calls =0 using nonscalable MatPtAP() implementation using I-node (on process 0) routines: found 10 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 4 ------------------------------- KSP Object: (mg_levels_4_) 16 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=20, nonzero initial guess tolerances: relative=0.0001, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_4_) 16 MPI processes type: bjacobi number of blocks = 16 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (mg_levels_4_sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_4_sub_) 1 MPI processes type: ilu out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 matrix ordering: natural factor fill ratio given 1., needed 1. 
Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=4850, cols=4850, bs=5 package used to perform factorization: petsc total: nonzeros=1117500, allocated nonzeros=1117500 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 970 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=4850, cols=4850, bs=5 total: nonzeros=1117500, allocated nonzeros=1117500 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 970 nodes, limit used is 5 linear system matrix followed by preconditioner matrix: Mat Object: 16 MPI processes type: mffd rows=76800, cols=76800 Matrix-free approximation: err=1.49012e-08 (relative error in function evaluation) Using wp compute h routine Does not compute normU Mat Object: 16 MPI processes type: mpiaij rows=76800, cols=76800, bs=5 total: nonzeros=18880000, allocated nonzeros=18880000 total number of mallocs used during MatSetValues calls =0 using I-node (on process 0) routines: found 970 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix followed by preconditioner matrix: Mat Object: 16 MPI processes type: mffd rows=76800, cols=76800 Matrix-free approximation: err=1.49012e-08 (relative error in function evaluation) Using wp compute h routine Does not compute normU Mat Object: 16 MPI processes type: mpiaij rows=76800, cols=76800, bs=5 total: nonzeros=18880000, allocated nonzeros=18880000 total number of mallocs used during MatSetValues calls =0 using I-node (on process 0) routines: found 970 nodes, limit used is 5 Line search: Using full step: fnorm 2.856641938332e+04 gnorm 3.868815397561e+03 1 SNES Function norm 3.868815397561e+03 -------------- next part -------------- An HTML attachment was scrubbed... URL: From mlohry at gmail.com Tue Mar 12 16:55:17 2019 From: mlohry at gmail.com (Mark Lohry) Date: Tue, 12 Mar 2019 17:55:17 -0400 Subject: [petsc-users] GAMG parallel convergence sensitivity Message-ID: Hi all, I've run into an unexpected issue with GAMG stagnating for a certain condition. I'm running a 3D high order DG discretization for compressible navier-stokes, using matrix-free gmres+amg, with the relevant petsc configuration: -pc_type gamg -ksp_type fgmres -pc_gamg_agg_nsmooths 0 -mg_levels_ksp_type gmres -mg_levels_pc_type bjacobi -mg_levels_ksp_max_it 20 -mg_levels_ksp_rtol 0.0001 -pc_mg_cycle_type v -pc_mg_type full So FGMRES on top, with AMG using ILU block jacobi + GMRES as a smoother. -ksp_view output pasted at the bottom here. This setup has been working fairly robustly. I'm testing two small mesh resolutions, with 1,536 cells and 6,144 cells each, where in the jacobian each cell is a 50x50 dense block, with 4 off-diagonal block neighbors each. With that, I'm testing 2 configurations of the same problem, one with mach 0.1 and the other with mach 0.01 (where the latter makes system much worse conditioned, a kind of stress test.) 
-------------- next part -------------- An HTML attachment was scrubbed... URL: From mvalera-w at sdsu.edu Tue Mar 12 17:48:45 2019 From: mvalera-w at sdsu.edu (Manuel Valera) Date: Tue, 12 Mar 2019 15:48:45 -0700 Subject: [petsc-users] PetscScatterCreate type mismatch after update. Message-ID: Hello, I just updated petsc from the repo to the latest master branch version, and a compilation problem popped up, it seems like the variable types are not being acknowledged properly, what i have in a minimum working example fashion is: #include > #include > #include > #include > #include > USE petscvec > USE petscdmda > USE petscdm > USE petscis > USE petscksp > IS :: ScalarIS > IS :: DummyIS > VecScatter :: LargerToSmaller,to0,from0 > VecScatter :: SmallerToLarger > PetscInt, ALLOCATABLE :: pScalarDA(:), pDummyDA(:) > PetscScalar :: rtol > Vec :: Vec1 > Vec :: Vec2 > ! Create index sets > allocate( pScalarDA(0:(gridx-1)*(gridy-1)*(gridz-1)-1) , > pDummyDA(0:(gridx-1)*(gridy-1)*(gridz-1)-1) ) > iter=0 > do k=0,gridz-2 > kplane = k*gridx*gridy > do j=0,gridy-2 > do i=0,gridx-2 > pScalarDA(iter) = kplane + j*(gridx) + i > iter = iter+1 > enddo > enddo > enddo > pDummyDA = (/ (ind, ind=0,((gridx-1)*(gridy-1)*(gridz-1))-1) /) > call > ISCreateGeneral(PETSC_COMM_WORLD,(gridx-1)*(gridy-1)*(gridz-1), & > > pScalarDA,PETSC_COPY_VALUES,ScalarIS,ierr) > call > ISCreateGeneral(PETSC_COMM_WORLD,(gridx-1)*(gridy-1)*(gridz-1), & > > pDummyDA,PETSC_COPY_VALUES,DummyIS,ierr) > deallocate(pScalarDA,pDummyDA, STAT=ierr) > !
Create VecScatter contexts: LargerToSmaller & SmallerToLarger > call DMDACreateNaturalVector(daScalars,Vec1,ierr) > call DMDACreateNaturalVector(daDummy,Vec2,ierr) > call > VecScatterCreate(Vec1,ScalarIS,Vec2,DummyIS,LargerToSmaller,ierr) > call > VecScatterCreate(Vec2,DummyIS,Vec1,ScalarIS,SmallerToLarger,ierr) > call VecDestroy(Vec1,ierr) > call VecDestroy(Vec2,ierr) And the error i get is the part i cannot really understand: matrixobjs.f90:99.34: > call > VecScatterCreate(Vec1,ScalarIS,Vec2,DummyIS,LargerToSmaller,ie > 1 > Error: Type mismatch in argument 'a' at (1); passed TYPE(tvec) to > INTEGER(4) > matrixobjs.f90:100.34: > call > VecScatterCreate(Vec2,DummyIS,Vec1,ScalarIS,SmallerToLarger,ie > 1 > Error: Type mismatch in argument 'a' at (1); passed TYPE(tvec) to > INTEGER(4) > make[1]: *** [matrixobjs.o] Error 1 > make[1]: Leaving directory `/usr/scratch/valera/ParGCCOM-Master/Src' > make: *** [gcmSeamount] Error 2 What i find hard to understand is why/where my code is finding an integer type? as you can see from the MWE header the variables types look correct, Any help is appreaciated, Thanks, -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Mar 12 17:57:57 2019 From: jed at jedbrown.org (Jed Brown) Date: Tue, 12 Mar 2019 16:57:57 -0600 Subject: [petsc-users] PetscScatterCreate type mismatch after update. In-Reply-To: References: Message-ID: <87ef7bya2y.fsf@jedbrown.org> Did you just update to 'master'? See VecScatter changes: https://www.mcs.anl.gov/petsc/documentation/changes/dev.html Manuel Valera via petsc-users writes: > Hello, > > I just updated petsc from the repo to the latest master branch version, and > a compilation problem popped up, it seems like the variable types are not > being acknowledged properly, what i have in a minimum working example > fashion is: > > #include >> #include >> #include >> #include >> #include >> USE petscvec >> USE petscdmda >> USE petscdm >> USE petscis >> USE petscksp >> IS :: ScalarIS >> IS :: DummyIS >> VecScatter :: LargerToSmaller,to0,from0 >> VecScatter :: SmallerToLarger >> PetscInt, ALLOCATABLE :: pScalarDA(:), pDummyDA(:) >> PetscScalar :: rtol >> Vec :: Vec1 >> Vec :: Vec2 >> ! Create index sets >> allocate( pScalarDA(0:(gridx-1)*(gridy-1)*(gridz-1)-1) , >> pDummyDA(0:(gridx-1)*(gridy-1)*(gridz-1)-1) ) >> iter=0 >> do k=0,gridz-2 >> kplane = k*gridx*gridy >> do j=0,gridy-2 >> do i=0,gridx-2 >> pScalarDA(iter) = kplane + j*(gridx) + i >> iter = iter+1 >> enddo >> enddo >> enddo >> pDummyDA = (/ (ind, ind=0,((gridx-1)*(gridy-1)*(gridz-1))-1) /) >> call >> ISCreateGeneral(PETSC_COMM_WORLD,(gridx-1)*(gridy-1)*(gridz-1), & >> >> pScalarDA,PETSC_COPY_VALUES,ScalarIS,ierr) >> call >> ISCreateGeneral(PETSC_COMM_WORLD,(gridx-1)*(gridy-1)*(gridz-1), & >> >> pDummyDA,PETSC_COPY_VALUES,DummyIS,ierr) >> deallocate(pScalarDA,pDummyDA, STAT=ierr) >> ! 
Create VecScatter contexts: LargerToSmaller & SmallerToLarger >> call DMDACreateNaturalVector(daScalars,Vec1,ierr) >> call DMDACreateNaturalVector(daDummy,Vec2,ierr) >> call >> VecScatterCreate(Vec1,ScalarIS,Vec2,DummyIS,LargerToSmaller,ierr) >> call >> VecScatterCreate(Vec2,DummyIS,Vec1,ScalarIS,SmallerToLarger,ierr) >> call VecDestroy(Vec1,ierr) >> call VecDestroy(Vec2,ierr) > > > And the error i get is the part i cannot really understand: > > matrixobjs.f90:99.34: >> call >> VecScatterCreate(Vec1,ScalarIS,Vec2,DummyIS,LargerToSmaller,ie >> 1 >> Error: Type mismatch in argument 'a' at (1); passed TYPE(tvec) to >> INTEGER(4) >> matrixobjs.f90:100.34: >> call >> VecScatterCreate(Vec2,DummyIS,Vec1,ScalarIS,SmallerToLarger,ie >> 1 >> Error: Type mismatch in argument 'a' at (1); passed TYPE(tvec) to >> INTEGER(4) >> make[1]: *** [matrixobjs.o] Error 1 >> make[1]: Leaving directory `/usr/scratch/valera/ParGCCOM-Master/Src' >> make: *** [gcmSeamount] Error 2 > > > What i find hard to understand is why/where my code is finding an integer > type? as you can see from the MWE header the variables types look correct, > > Any help is appreaciated, > > Thanks, From mvalera-w at sdsu.edu Tue Mar 12 20:02:31 2019 From: mvalera-w at sdsu.edu (Manuel Valera) Date: Tue, 12 Mar 2019 18:02:31 -0700 Subject: [petsc-users] PetscScatterCreate type mismatch after update. In-Reply-To: <87ef7bya2y.fsf@jedbrown.org> References: <87ef7bya2y.fsf@jedbrown.org> Message-ID: Hi, So i just solved that problem but now it looks my code broke somewhere else, i have a script in place to scatter/gather the information to root in order to write it to a file (i know, we need to make this parallel I/O but that's future work). Such script looks like this: > SUBROUTINE WriteToFile_grid() > > PetscErrorCode :: ierrp > PetscMPIInt :: rank > PetscInt :: iter > Vec :: > CenterX,CenterY,CenterZ,Nat1,Nat2,seqvec > PetscScalar, pointer :: tmp3d(:,:,:),tmp4d(:,:,:,:),arr(:) > VecScatter :: LargerToSmaller, scatterctx > INTEGER :: i,j,k, ierr > > call MPI_Comm_rank(PETSC_COMM_WORLD, rank, ierrp) > > !#####################################################################################! > ! Grid Cell Centers: x-component > ! > > !#####################################################################################! > ! Extract x-component > call DMDAVecGetArrayF90(daGrid,GridCenters,tmp4d,ierrp) > call DMCreateGlobalVector(daSingle,CenterX,ierrp) > call DMDAVecGetArrayF90(daSingle,CenterX,tmp3d,ierrp) > tmp3d(:,:,:) = tmp4d(0,:,:,:) > call DMDAVecRestoreArrayF90(daSingle,CenterX,tmp3d,ierrp) > call DMDAVecRestoreArrayF90(daGrid,GridCenters,tmp4d,ierrp) > ! Scatter to daWriteCenters > call DMDACreateNaturalVector(daSingle,Nat1,ierrp) > call > DMDAGlobalToNaturalBegin(daSingle,CenterX,INSERT_VALUES,Nat1,ierrp) > call > DMDAGlobalToNaturalEnd(daSingle,CenterX,INSERT_VALUES,Nat1,ierrp) > call VecDestroy(CenterX,ierrp) > call DMDACreateNaturalVector(daWriteCenters,Nat2,ierrp) > call > VecScatterCreate(Nat1,SingleIS,Nat2,WriteIS,LargerToSmaller,ierrp) > call > VecScatterBegin(LargerToSmaller,Nat1,Nat2,INSERT_VALUES,SCATTER_FORWARD,ierrp) > call > VecScatterEnd(LargerToSmaller,Nat1,Nat2,INSERT_VALUES,SCATTER_FORWARD,ierrp) > call VecScatterDestroy(LargerToSmaller,ierrp) > call VecDestroy(Nat1,ierrp) > ! 
Send to root > call VecScatterCreateToZero(Nat2,scatterctx,seqvec,ierrp) > call > VecScatterBegin(scatterctx,Nat2,seqvec,INSERT_VALUES,SCATTER_FORWARD,ierrp) > call > VecScatterEnd(scatterctx,Nat2,seqvec,INSERT_VALUES,SCATTER_FORWARD,ierrp) > call VecScatterDestroy(scatterctx,ierrp) > call VecDestroy(Nat2,ierrp) > ! Let root write to netCDF file > if (rank == 0) then > allocate(buffer(1:IMax-1,1:JMax-1,1:KMax-1),STAT=ierr) > call VecGetArrayReadF90(seqvec,arr,ierrp) > iter = 1 > do k=1,KMax-1 > do j=1,JMax-1 > do i=1,IMax-1 > buffer(i,j,k) = arr(iter) > iter = iter + 1 > enddo > enddo > enddo > call VecRestoreArrayReadF90(seqvec,arr,ierrp) > call > nc_check(nf90_put_var(ncid,xID,buffer,start=(/1,1,1/),count=(/IMax-1,JMax-1,KMax-1/)), > & > 'WriteNetCDF', context='put_var GridCenterX > in '//trim(output_filename)) > deallocate(buffer,STAT=ierr) > endif > call VecDestroy(seqvec,ierrp) And then the process is repeated for each variable to output. Notice the vector seqvec is being destroyed at the end. Using petsc v3.10.0 this script worked without problems. After updating to v3.10.4 it no longer works. Gives the following error: [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Null argument, when expecting valid pointer > [0]PETSC ERROR: Null Object: Parameter # 3 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.10.4, unknown > [0]PETSC ERROR: ./gcmSeamount on a petsc-debug named ocean by valera Tue > Mar 12 17:59:43 2019 > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=8 > --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 > --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 > --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 > --known-bits-per-byte=8 --known-memcmp-ok=1 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-int64_t=1 > --known-mpi-c-double-complex=1 --known-has-attribute-aligned=1 > PETSC_ARCH=petsc-debug --COPTFLAGS=-O2 --CXXOPTFLAGS=-O2 --FOPTFLAGS=-O2 > --with-cc=mpicc --with-cxx=mpic++ --with-fc=mpifort > --with-shared-libraries=1 --with-debugging=1 --download-hypre --download-ml > --with-batch --known-mpi-shared-libraries=1 --known-64-bit-blas-indices=0 > [0]PETSC ERROR: #1 VecScatterBegin() line 85 in > /usr/dataC/home/valera/petsc/src/vec/vscat/interface/vscatfce.c > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Null argument, when expecting valid pointer > [0]PETSC ERROR: Null Object: Parameter # 3 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.10.4, unknown > [0]PETSC ERROR: ./gcmSeamount on a petsc-debug named ocean by valera Tue > Mar 12 17:59:43 2019 > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=8 > --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 > --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 > --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 > --known-bits-per-byte=8 --known-memcmp-ok=1 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-int64_t=1 > --known-mpi-c-double-complex=1 --known-has-attribute-aligned=1 > PETSC_ARCH=petsc-debug --COPTFLAGS=-O2 --CXXOPTFLAGS=-O2 --FOPTFLAGS=-O2 > --with-cc=mpicc --with-cxx=mpic++ --with-fc=mpifort > --with-shared-libraries=1 --with-debugging=1 --download-hypre --download-ml > --with-batch --known-mpi-shared-libraries=1 --known-64-bit-blas-indices=0 > [0]PETSC ERROR: #2 VecScatterEnd() line 150 in > /usr/dataC/home/valera/petsc/src/vec/vscat/interface/vscatfce.c > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Null argument, when expecting valid pointer > [0]PETSC ERROR: Null Object: Parameter # 1 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.10.4, unknown > [0]PETSC ERROR: ./gcmSeamount on a petsc-debug named ocean by valera Tue > Mar 12 17:59:43 2019 > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=8 > --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 > --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 > --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 > --known-bits-per-byte=8 --known-memcmp-ok=1 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-int64_t=1 > --known-mpi-c-double-complex=1 --known-has-attribute-aligned=1 > PETSC_ARCH=petsc-debug --COPTFLAGS=-O2 --CXXOPTFLAGS=-O2 --FOPTFLAGS=-O2 > --with-cc=mpicc --with-cxx=mpic++ --with-fc=mpifort > --with-shared-libraries=1 --with-debugging=1 --download-hypre --download-ml > --with-batch --known-mpi-shared-libraries=1 --known-64-bit-blas-indices=0 > [0]PETSC ERROR: #3 VecGetArrayRead() line 1649 in > /usr/dataC/home/valera/petsc/src/vec/vec/interface/rvector.c > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. 
> [0]PETSC ERROR: [0] VecGetArrayRead line 1648 > /usr/dataC/home/valera/petsc/src/vec/vec/interface/rvector.c > [0]PETSC ERROR: [0] VecScatterEnd line 147 > /usr/dataC/home/valera/petsc/src/vec/vscat/interface/vscatfce.c > [0]PETSC ERROR: [0] VecScatterBegin line 82 > /usr/dataC/home/valera/petsc/src/vec/vscat/interface/vscatfce.c > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.10.4, unknown > [0]PETSC ERROR: ./gcmSeamount on a petsc-debug named ocean by valera Tue > Mar 12 17:59:43 2019 > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=8 > --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 > --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 > --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 > --known-bits-per-byte=8 --known-memcmp-ok=1 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-int64_t=1 > --known-mpi-c-double-complex=1 --known-has-attribute-aligned=1 > PETSC_ARCH=petsc-debug --COPTFLAGS=-O2 --CXXOPTFLAGS=-O2 --FOPTFLAGS=-O2 > --with-cc=mpicc --with-cxx=mpic++ --with-fc=mpifort > --with-shared-libraries=1 --with-debugging=1 --download-hypre --download-ml > --with-batch --known-mpi-shared-libraries=1 --known-64-bit-blas-indices=0 > [0]PETSC ERROR: #4 User provided function() line 0 in unknown file > -------------------------------------------------------------------------- > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD Now, interesting part is, if i comment the VecDestroy(seqvec,...) call, this error doesn't show up, at least for this part of the code. Can you help me understand this error and how to fix it? I know the error is after the !send to root flag of the code, Thanks, Manuel On Tue, Mar 12, 2019 at 3:58 PM Jed Brown wrote: > Did you just update to 'master'? See VecScatter changes: > > https://www.mcs.anl.gov/petsc/documentation/changes/dev.html > > Manuel Valera via petsc-users writes: > > > Hello, > > > > I just updated petsc from the repo to the latest master branch version, > and > > a compilation problem popped up, it seems like the variable types are not > > being acknowledged properly, what i have in a minimum working example > > fashion is: > > > > #include > >> #include > >> #include > >> #include > >> #include > >> USE petscvec > >> USE petscdmda > >> USE petscdm > >> USE petscis > >> USE petscksp > >> IS :: ScalarIS > >> IS :: DummyIS > >> VecScatter :: LargerToSmaller,to0,from0 > >> VecScatter :: SmallerToLarger > >> PetscInt, ALLOCATABLE :: pScalarDA(:), pDummyDA(:) > >> PetscScalar :: rtol > >> Vec :: Vec1 > >> Vec :: Vec2 > >> ! 
Create index sets > >> allocate( pScalarDA(0:(gridx-1)*(gridy-1)*(gridz-1)-1) , > >> pDummyDA(0:(gridx-1)*(gridy-1)*(gridz-1)-1) ) > >> iter=0 > >> do k=0,gridz-2 > >> kplane = k*gridx*gridy > >> do j=0,gridy-2 > >> do i=0,gridx-2 > >> pScalarDA(iter) = kplane + j*(gridx) + i > >> iter = iter+1 > >> enddo > >> enddo > >> enddo > >> pDummyDA = (/ (ind, > ind=0,((gridx-1)*(gridy-1)*(gridz-1))-1) /) > >> call > >> ISCreateGeneral(PETSC_COMM_WORLD,(gridx-1)*(gridy-1)*(gridz-1), & > >> > >> pScalarDA,PETSC_COPY_VALUES,ScalarIS,ierr) > >> call > >> ISCreateGeneral(PETSC_COMM_WORLD,(gridx-1)*(gridy-1)*(gridz-1), & > >> > >> pDummyDA,PETSC_COPY_VALUES,DummyIS,ierr) > >> deallocate(pScalarDA,pDummyDA, STAT=ierr) > >> ! Create VecScatter contexts: LargerToSmaller & > SmallerToLarger > >> call DMDACreateNaturalVector(daScalars,Vec1,ierr) > >> call DMDACreateNaturalVector(daDummy,Vec2,ierr) > >> call > >> VecScatterCreate(Vec1,ScalarIS,Vec2,DummyIS,LargerToSmaller,ierr) > >> call > >> VecScatterCreate(Vec2,DummyIS,Vec1,ScalarIS,SmallerToLarger,ierr) > >> call VecDestroy(Vec1,ierr) > >> call VecDestroy(Vec2,ierr) > > > > > > And the error i get is the part i cannot really understand: > > > > matrixobjs.f90:99.34: > >> call > >> VecScatterCreate(Vec1,ScalarIS,Vec2,DummyIS,LargerToSmaller,ie > >> 1 > >> Error: Type mismatch in argument 'a' at (1); passed TYPE(tvec) to > >> INTEGER(4) > >> matrixobjs.f90:100.34: > >> call > >> VecScatterCreate(Vec2,DummyIS,Vec1,ScalarIS,SmallerToLarger,ie > >> 1 > >> Error: Type mismatch in argument 'a' at (1); passed TYPE(tvec) to > >> INTEGER(4) > >> make[1]: *** [matrixobjs.o] Error 1 > >> make[1]: Leaving directory `/usr/scratch/valera/ParGCCOM-Master/Src' > >> make: *** [gcmSeamount] Error 2 > > > > > > What i find hard to understand is why/where my code is finding an integer > > type? as you can see from the MWE header the variables types look > correct, > > > > Any help is appreaciated, > > > > Thanks, > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jczhang at mcs.anl.gov Tue Mar 12 20:09:12 2019 From: jczhang at mcs.anl.gov (Zhang, Junchao) Date: Wed, 13 Mar 2019 01:09:12 +0000 Subject: [petsc-users] PetscScatterCreate type mismatch after update. In-Reply-To: <87ef7bya2y.fsf@jedbrown.org> References: <87ef7bya2y.fsf@jedbrown.org> Message-ID: Manuel, I was working on a branch to revert the VecScatterCreate to VecScatterCreateWithData change. The change broke PETSc API and I think we do not need it. I had planed to do a pull request after my another PR is merged. But since it already affects you, you can try this branch now, which is jczhang/fix-vecscattercreate-api Thanks. --Junchao Zhang On Tue, Mar 12, 2019 at 5:58 PM Jed Brown via petsc-users > wrote: Did you just update to 'master'? See VecScatter changes: https://www.mcs.anl.gov/petsc/documentation/changes/dev.html Manuel Valera via petsc-users > writes: > Hello, > > I just updated petsc from the repo to the latest master branch version, and > a compilation problem popped up, it seems like the variable types are not > being acknowledged properly, what i have in a minimum working example > fashion is: > > #include >> #include >> #include >> #include >> #include >> USE petscvec >> USE petscdmda >> USE petscdm >> USE petscis >> USE petscksp >> IS :: ScalarIS >> IS :: DummyIS >> VecScatter :: LargerToSmaller,to0,from0 >> VecScatter :: SmallerToLarger >> PetscInt, ALLOCATABLE :: pScalarDA(:), pDummyDA(:) >> PetscScalar :: rtol >> Vec :: Vec1 >> Vec :: Vec2 >> ! 
Create index sets >> allocate( pScalarDA(0:(gridx-1)*(gridy-1)*(gridz-1)-1) , >> pDummyDA(0:(gridx-1)*(gridy-1)*(gridz-1)-1) ) >> iter=0 >> do k=0,gridz-2 >> kplane = k*gridx*gridy >> do j=0,gridy-2 >> do i=0,gridx-2 >> pScalarDA(iter) = kplane + j*(gridx) + i >> iter = iter+1 >> enddo >> enddo >> enddo >> pDummyDA = (/ (ind, ind=0,((gridx-1)*(gridy-1)*(gridz-1))-1) /) >> call >> ISCreateGeneral(PETSC_COMM_WORLD,(gridx-1)*(gridy-1)*(gridz-1), & >> >> pScalarDA,PETSC_COPY_VALUES,ScalarIS,ierr) >> call >> ISCreateGeneral(PETSC_COMM_WORLD,(gridx-1)*(gridy-1)*(gridz-1), & >> >> pDummyDA,PETSC_COPY_VALUES,DummyIS,ierr) >> deallocate(pScalarDA,pDummyDA, STAT=ierr) >> ! Create VecScatter contexts: LargerToSmaller & SmallerToLarger >> call DMDACreateNaturalVector(daScalars,Vec1,ierr) >> call DMDACreateNaturalVector(daDummy,Vec2,ierr) >> call >> VecScatterCreate(Vec1,ScalarIS,Vec2,DummyIS,LargerToSmaller,ierr) >> call >> VecScatterCreate(Vec2,DummyIS,Vec1,ScalarIS,SmallerToLarger,ierr) >> call VecDestroy(Vec1,ierr) >> call VecDestroy(Vec2,ierr) > > > And the error i get is the part i cannot really understand: > > matrixobjs.f90:99.34: >> call >> VecScatterCreate(Vec1,ScalarIS,Vec2,DummyIS,LargerToSmaller,ie >> 1 >> Error: Type mismatch in argument 'a' at (1); passed TYPE(tvec) to >> INTEGER(4) >> matrixobjs.f90:100.34: >> call >> VecScatterCreate(Vec2,DummyIS,Vec1,ScalarIS,SmallerToLarger,ie >> 1 >> Error: Type mismatch in argument 'a' at (1); passed TYPE(tvec) to >> INTEGER(4) >> make[1]: *** [matrixobjs.o] Error 1 >> make[1]: Leaving directory `/usr/scratch/valera/ParGCCOM-Master/Src' >> make: *** [gcmSeamount] Error 2 > > > What i find hard to understand is why/where my code is finding an integer > type? as you can see from the MWE header the variables types look correct, > > Any help is appreaciated, > > Thanks, -------------- next part -------------- An HTML attachment was scrubbed... URL: From mvalera-w at sdsu.edu Tue Mar 12 20:19:28 2019 From: mvalera-w at sdsu.edu (Manuel Valera) Date: Tue, 12 Mar 2019 18:19:28 -0700 Subject: [petsc-users] PetscScatterCreate type mismatch after update. In-Reply-To: References: <87ef7bya2y.fsf@jedbrown.org> Message-ID: Hi Mr Zhang, thanks for your reply, I just checked your branch out, reconfigured and recompiled and i am still getting the same error from my last email (null argument, when expected a valid pointer), do you have any idea why this can be happening? Thanks so much, Manuel On Tue, Mar 12, 2019 at 6:09 PM Zhang, Junchao wrote: > Manuel, > I was working on a branch to revert the VecScatterCreate to > VecScatterCreateWithData change. The change broke PETSc API and I think we > do not need it. I had planed to do a pull request after my another PR is > merged. > But since it already affects you, you can try this branch now, which is > jczhang/fix-vecscattercreate-api > > Thanks. > --Junchao Zhang > > > On Tue, Mar 12, 2019 at 5:58 PM Jed Brown via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Did you just update to 'master'? 
See VecScatter changes: >> >> https://www.mcs.anl.gov/petsc/documentation/changes/dev.html >> >> Manuel Valera via petsc-users writes: >> >> > Hello, >> > >> > I just updated petsc from the repo to the latest master branch version, >> and >> > a compilation problem popped up, it seems like the variable types are >> not >> > being acknowledged properly, what i have in a minimum working example >> > fashion is: >> > >> > #include >> >> #include >> >> #include >> >> #include >> >> #include >> >> USE petscvec >> >> USE petscdmda >> >> USE petscdm >> >> USE petscis >> >> USE petscksp >> >> IS :: ScalarIS >> >> IS :: DummyIS >> >> VecScatter :: LargerToSmaller,to0,from0 >> >> VecScatter :: SmallerToLarger >> >> PetscInt, ALLOCATABLE :: pScalarDA(:), pDummyDA(:) >> >> PetscScalar :: rtol >> >> Vec :: Vec1 >> >> Vec :: Vec2 >> >> ! Create index sets >> >> allocate( pScalarDA(0:(gridx-1)*(gridy-1)*(gridz-1)-1) , >> >> pDummyDA(0:(gridx-1)*(gridy-1)*(gridz-1)-1) ) >> >> iter=0 >> >> do k=0,gridz-2 >> >> kplane = k*gridx*gridy >> >> do j=0,gridy-2 >> >> do i=0,gridx-2 >> >> pScalarDA(iter) = kplane + j*(gridx) + i >> >> iter = iter+1 >> >> enddo >> >> enddo >> >> enddo >> >> pDummyDA = (/ (ind, >> ind=0,((gridx-1)*(gridy-1)*(gridz-1))-1) /) >> >> call >> >> ISCreateGeneral(PETSC_COMM_WORLD,(gridx-1)*(gridy-1)*(gridz-1), & >> >> >> >> pScalarDA,PETSC_COPY_VALUES,ScalarIS,ierr) >> >> call >> >> ISCreateGeneral(PETSC_COMM_WORLD,(gridx-1)*(gridy-1)*(gridz-1), & >> >> >> >> pDummyDA,PETSC_COPY_VALUES,DummyIS,ierr) >> >> deallocate(pScalarDA,pDummyDA, STAT=ierr) >> >> ! Create VecScatter contexts: LargerToSmaller & >> SmallerToLarger >> >> call DMDACreateNaturalVector(daScalars,Vec1,ierr) >> >> call DMDACreateNaturalVector(daDummy,Vec2,ierr) >> >> call >> >> VecScatterCreate(Vec1,ScalarIS,Vec2,DummyIS,LargerToSmaller,ierr) >> >> call >> >> VecScatterCreate(Vec2,DummyIS,Vec1,ScalarIS,SmallerToLarger,ierr) >> >> call VecDestroy(Vec1,ierr) >> >> call VecDestroy(Vec2,ierr) >> > >> > >> > And the error i get is the part i cannot really understand: >> > >> > matrixobjs.f90:99.34: >> >> call >> >> VecScatterCreate(Vec1,ScalarIS,Vec2,DummyIS,LargerToSmaller,ie >> >> 1 >> >> Error: Type mismatch in argument 'a' at (1); passed TYPE(tvec) to >> >> INTEGER(4) >> >> matrixobjs.f90:100.34: >> >> call >> >> VecScatterCreate(Vec2,DummyIS,Vec1,ScalarIS,SmallerToLarger,ie >> >> 1 >> >> Error: Type mismatch in argument 'a' at (1); passed TYPE(tvec) to >> >> INTEGER(4) >> >> make[1]: *** [matrixobjs.o] Error 1 >> >> make[1]: Leaving directory `/usr/scratch/valera/ParGCCOM-Master/Src' >> >> make: *** [gcmSeamount] Error 2 >> > >> > >> > What i find hard to understand is why/where my code is finding an >> integer >> > type? as you can see from the MWE header the variables types look >> correct, >> > >> > Any help is appreaciated, >> > >> > Thanks, >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jczhang at mcs.anl.gov Tue Mar 12 20:42:52 2019 From: jczhang at mcs.anl.gov (Zhang, Junchao) Date: Wed, 13 Mar 2019 01:42:52 +0000 Subject: [petsc-users] PetscScatterCreate type mismatch after update. In-Reply-To: References: <87ef7bya2y.fsf@jedbrown.org> Message-ID: Maybe you should delete your PETSC_ARCH directory and recompile it? I tested my branch. 
It should not that easily fail :) --Junchao Zhang On Tue, Mar 12, 2019 at 8:20 PM Manuel Valera > wrote: Hi Mr Zhang, thanks for your reply, I just checked your branch out, reconfigured and recompiled and i am still getting the same error from my last email (null argument, when expected a valid pointer), do you have any idea why this can be happening? Thanks so much, Manuel On Tue, Mar 12, 2019 at 6:09 PM Zhang, Junchao > wrote: Manuel, I was working on a branch to revert the VecScatterCreate to VecScatterCreateWithData change. The change broke PETSc API and I think we do not need it. I had planed to do a pull request after my another PR is merged. But since it already affects you, you can try this branch now, which is jczhang/fix-vecscattercreate-api Thanks. --Junchao Zhang On Tue, Mar 12, 2019 at 5:58 PM Jed Brown via petsc-users > wrote: Did you just update to 'master'? See VecScatter changes: https://www.mcs.anl.gov/petsc/documentation/changes/dev.html Manuel Valera via petsc-users > writes: > Hello, > > I just updated petsc from the repo to the latest master branch version, and > a compilation problem popped up, it seems like the variable types are not > being acknowledged properly, what i have in a minimum working example > fashion is: > > #include >> #include >> #include >> #include >> #include >> USE petscvec >> USE petscdmda >> USE petscdm >> USE petscis >> USE petscksp >> IS :: ScalarIS >> IS :: DummyIS >> VecScatter :: LargerToSmaller,to0,from0 >> VecScatter :: SmallerToLarger >> PetscInt, ALLOCATABLE :: pScalarDA(:), pDummyDA(:) >> PetscScalar :: rtol >> Vec :: Vec1 >> Vec :: Vec2 >> ! Create index sets >> allocate( pScalarDA(0:(gridx-1)*(gridy-1)*(gridz-1)-1) , >> pDummyDA(0:(gridx-1)*(gridy-1)*(gridz-1)-1) ) >> iter=0 >> do k=0,gridz-2 >> kplane = k*gridx*gridy >> do j=0,gridy-2 >> do i=0,gridx-2 >> pScalarDA(iter) = kplane + j*(gridx) + i >> iter = iter+1 >> enddo >> enddo >> enddo >> pDummyDA = (/ (ind, ind=0,((gridx-1)*(gridy-1)*(gridz-1))-1) /) >> call >> ISCreateGeneral(PETSC_COMM_WORLD,(gridx-1)*(gridy-1)*(gridz-1), & >> >> pScalarDA,PETSC_COPY_VALUES,ScalarIS,ierr) >> call >> ISCreateGeneral(PETSC_COMM_WORLD,(gridx-1)*(gridy-1)*(gridz-1), & >> >> pDummyDA,PETSC_COPY_VALUES,DummyIS,ierr) >> deallocate(pScalarDA,pDummyDA, STAT=ierr) >> ! Create VecScatter contexts: LargerToSmaller & SmallerToLarger >> call DMDACreateNaturalVector(daScalars,Vec1,ierr) >> call DMDACreateNaturalVector(daDummy,Vec2,ierr) >> call >> VecScatterCreate(Vec1,ScalarIS,Vec2,DummyIS,LargerToSmaller,ierr) >> call >> VecScatterCreate(Vec2,DummyIS,Vec1,ScalarIS,SmallerToLarger,ierr) >> call VecDestroy(Vec1,ierr) >> call VecDestroy(Vec2,ierr) > > > And the error i get is the part i cannot really understand: > > matrixobjs.f90:99.34: >> call >> VecScatterCreate(Vec1,ScalarIS,Vec2,DummyIS,LargerToSmaller,ie >> 1 >> Error: Type mismatch in argument 'a' at (1); passed TYPE(tvec) to >> INTEGER(4) >> matrixobjs.f90:100.34: >> call >> VecScatterCreate(Vec2,DummyIS,Vec1,ScalarIS,SmallerToLarger,ie >> 1 >> Error: Type mismatch in argument 'a' at (1); passed TYPE(tvec) to >> INTEGER(4) >> make[1]: *** [matrixobjs.o] Error 1 >> make[1]: Leaving directory `/usr/scratch/valera/ParGCCOM-Master/Src' >> make: *** [gcmSeamount] Error 2 > > > What i find hard to understand is why/where my code is finding an integer > type? as you can see from the MWE header the variables types look correct, > > Any help is appreaciated, > > Thanks, -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mvalera-w at sdsu.edu Tue Mar 12 20:44:23 2019 From: mvalera-w at sdsu.edu (Manuel Valera) Date: Tue, 12 Mar 2019 18:44:23 -0700 Subject: [petsc-users] PetscScatterCreate type mismatch after update. In-Reply-To: References: <87ef7bya2y.fsf@jedbrown.org> Message-ID: Ok i'll try that and let you know, for the time being i reverted to 3.9 to finish a paper, will update after that :) On Tue, Mar 12, 2019 at 6:42 PM Zhang, Junchao wrote: > Maybe you should delete your PETSC_ARCH directory and recompile it? I > tested my branch. It should not that easily fail :) > > --Junchao Zhang > > > On Tue, Mar 12, 2019 at 8:20 PM Manuel Valera wrote: > >> Hi Mr Zhang, thanks for your reply, >> >> I just checked your branch out, reconfigured and recompiled and i am >> still getting the same error from my last email (null argument, when >> expected a valid pointer), do you have any idea why this can be happening? >> >> Thanks so much, >> >> Manuel >> >> On Tue, Mar 12, 2019 at 6:09 PM Zhang, Junchao >> wrote: >> >>> Manuel, >>> I was working on a branch to revert the VecScatterCreate to >>> VecScatterCreateWithData change. The change broke PETSc API and I think we >>> do not need it. I had planed to do a pull request after my another PR is >>> merged. >>> But since it already affects you, you can try this branch now, which is >>> jczhang/fix-vecscattercreate-api >>> >>> Thanks. >>> --Junchao Zhang >>> >>> >>> On Tue, Mar 12, 2019 at 5:58 PM Jed Brown via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> >>>> Did you just update to 'master'? See VecScatter changes: >>>> >>>> https://www.mcs.anl.gov/petsc/documentation/changes/dev.html >>>> >>>> Manuel Valera via petsc-users writes: >>>> >>>> > Hello, >>>> > >>>> > I just updated petsc from the repo to the latest master branch >>>> version, and >>>> > a compilation problem popped up, it seems like the variable types are >>>> not >>>> > being acknowledged properly, what i have in a minimum working example >>>> > fashion is: >>>> > >>>> > #include >>>> >> #include >>>> >> #include >>>> >> #include >>>> >> #include >>>> >> USE petscvec >>>> >> USE petscdmda >>>> >> USE petscdm >>>> >> USE petscis >>>> >> USE petscksp >>>> >> IS :: ScalarIS >>>> >> IS :: DummyIS >>>> >> VecScatter :: LargerToSmaller,to0,from0 >>>> >> VecScatter :: SmallerToLarger >>>> >> PetscInt, ALLOCATABLE :: pScalarDA(:), pDummyDA(:) >>>> >> PetscScalar :: rtol >>>> >> Vec :: Vec1 >>>> >> Vec :: Vec2 >>>> >> ! Create index sets >>>> >> allocate( pScalarDA(0:(gridx-1)*(gridy-1)*(gridz-1)-1) , >>>> >> pDummyDA(0:(gridx-1)*(gridy-1)*(gridz-1)-1) ) >>>> >> iter=0 >>>> >> do k=0,gridz-2 >>>> >> kplane = k*gridx*gridy >>>> >> do j=0,gridy-2 >>>> >> do i=0,gridx-2 >>>> >> pScalarDA(iter) = kplane + j*(gridx) + i >>>> >> iter = iter+1 >>>> >> enddo >>>> >> enddo >>>> >> enddo >>>> >> pDummyDA = (/ (ind, >>>> ind=0,((gridx-1)*(gridy-1)*(gridz-1))-1) /) >>>> >> call >>>> >> ISCreateGeneral(PETSC_COMM_WORLD,(gridx-1)*(gridy-1)*(gridz-1), & >>>> >> >>>> >> pScalarDA,PETSC_COPY_VALUES,ScalarIS,ierr) >>>> >> call >>>> >> ISCreateGeneral(PETSC_COMM_WORLD,(gridx-1)*(gridy-1)*(gridz-1), & >>>> >> >>>> >> pDummyDA,PETSC_COPY_VALUES,DummyIS,ierr) >>>> >> deallocate(pScalarDA,pDummyDA, STAT=ierr) >>>> >> ! 
Create VecScatter contexts: LargerToSmaller & >>>> SmallerToLarger >>>> >> call DMDACreateNaturalVector(daScalars,Vec1,ierr) >>>> >> call DMDACreateNaturalVector(daDummy,Vec2,ierr) >>>> >> call >>>> >> VecScatterCreate(Vec1,ScalarIS,Vec2,DummyIS,LargerToSmaller,ierr) >>>> >> call >>>> >> VecScatterCreate(Vec2,DummyIS,Vec1,ScalarIS,SmallerToLarger,ierr) >>>> >> call VecDestroy(Vec1,ierr) >>>> >> call VecDestroy(Vec2,ierr) >>>> > >>>> > >>>> > And the error i get is the part i cannot really understand: >>>> > >>>> > matrixobjs.f90:99.34: >>>> >> call >>>> >> VecScatterCreate(Vec1,ScalarIS,Vec2,DummyIS,LargerToSmaller,ie >>>> >> 1 >>>> >> Error: Type mismatch in argument 'a' at (1); passed TYPE(tvec) to >>>> >> INTEGER(4) >>>> >> matrixobjs.f90:100.34: >>>> >> call >>>> >> VecScatterCreate(Vec2,DummyIS,Vec1,ScalarIS,SmallerToLarger,ie >>>> >> 1 >>>> >> Error: Type mismatch in argument 'a' at (1); passed TYPE(tvec) to >>>> >> INTEGER(4) >>>> >> make[1]: *** [matrixobjs.o] Error 1 >>>> >> make[1]: Leaving directory `/usr/scratch/valera/ParGCCOM-Master/Src' >>>> >> make: *** [gcmSeamount] Error 2 >>>> > >>>> > >>>> > What i find hard to understand is why/where my code is finding an >>>> integer >>>> > type? as you can see from the MWE header the variables types look >>>> correct, >>>> > >>>> > Any help is appreaciated, >>>> > >>>> > Thanks, >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Mar 12 22:37:28 2019 From: jed at jedbrown.org (Jed Brown) Date: Tue, 12 Mar 2019 21:37:28 -0600 Subject: [petsc-users] PetscScatterCreate type mismatch after update. In-Reply-To: References: <87ef7bya2y.fsf@jedbrown.org> Message-ID: <875zsnxx53.fsf@jedbrown.org> Manuel Valera writes: > Ok i'll try that and let you know, for the time being i reverted to 3.9 to > finish a paper, will update after that :) 3.10 will also work. From jed at jedbrown.org Wed Mar 13 00:10:43 2019 From: jed at jedbrown.org (Jed Brown) Date: Tue, 12 Mar 2019 23:10:43 -0600 Subject: [petsc-users] Invite to Fluid Dynamics Software Infrastructure Workshop: April 12-13 at CU Boulder Message-ID: <871s3bfjfw.fsf@jedbrown.org> For PETSc users/developers interested in software and data infrastructure for fluid dynamics: We are excited to invite you to attend the second workshop to aid in the conceptualization of FDSI, a potential NSF-sponsored Institute dedicated to Fluid Dynamics Software Infrastructure. The workshop will be in Boulder, CO and will start at 1:00 PM on Friday April 12th and continue until 5:00 PM on April 13th. Registration and further info about FDSI is available in the links below. Registration (free; travel support available) https://www.colorado.edu/events/cfdsi/fdsi-full-community-workshop GitHub repo of May 2018 Kickoff Workshop: https://github.com/FDSI/Kickoff_Workshop Community Needs Assessment based on 2018 Kickoff Workshop: https://www.colorado.edu/events/cfdsi/2018-kick-work/Summary FDSI Discussion Forum: https://gitter.im/FDSI/community# We also ask for your help in growing the community by either forwarding this email to any of your colleagues with an interest in fluid dynamics software or by nominating them for personal invites to this and future FDSI workshops: https://www.colorado.edu/events/cfdsi/content/grow-fdsi We hope to see you in Boulder and/or in online virtual discussions of fluid dynamics software infrastructure. 
Thanks, FDSI Conceptualization Management Team From mbuerkle at web.de Wed Mar 13 02:37:25 2019 From: mbuerkle at web.de (Marius Buerkle) Date: Wed, 13 Mar 2019 08:37:25 +0100 Subject: [petsc-users] MatCompositeMerge + MatCreateRedundantMatrix In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Mar 13 05:35:48 2019 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 13 Mar 2019 06:35:48 -0400 Subject: [petsc-users] MatCompositeMerge + MatCreateRedundantMatrix In-Reply-To: References: Message-ID: On Wed, Mar 13, 2019 at 3:37 AM Marius Buerkle wrote: > Indeed, was very easy to add. Are you going to include the Fortran > interface for MPICreateSubMatricesMPI in future releases of PETSC ? > > Yes. You can either make a PR yourself at Bitbucket, or send me the diff and I will make a PR with your name. > Regarding my initial problem, thanks a lot. It works very well with > MPICreateSubMatricesMPI and the solution can be implemented in a few > lines. > > Glad its working. If you want that incorporated into the source, we can do another PR. Thanks, Matt > Thanks and Best, > > Marius > > > On Tue, Mar 12, 2019 at 4:50 AM Marius Buerkle wrote: > >> I tried to follow your suggestions but it seems there is >> no MPICreateSubMatricesMPI for Fortran. Is this correct? >> > > We just have to write the binding. Its almost identical to > MatCreateSubMatrices() in src/mat/interface/ftn-custom/zmatrixf.c > > Matt > >> >> On Wed, Feb 20, 2019 at 6:57 PM Marius Buerkle wrote: >> >>> ok, I think I understand now. I will give it a try and if there is some >>> trouble comeback to you. thanks. >>> >> >> Cool. >> >> Matt >> >> >>> >>> marius >>> >>> On Tue, Feb 19, 2019 at 8:42 PM Marius Buerkle wrote: >>> >>>> ok, so it seems there is no straight forward way to transfer data >>>> between PETSc matrices on different subcomms. Probably doing it by "hand" >>>> extracting the matricies on the subcomms create a MPI_INTERCOMM transfering >>>> the data to PETSC_COMM_WORLD and assembling them in a new PETSc matrix >>>> would be possible, right? >>>> >>> >>> That sounds too complicated. Why not just reverse >>> MPICreateSubMatricesMPI()? Meaning make it collective on the whole big >>> communicator, so that you can swap out all the subcommunicator for the >>> aggregation call, just like we do in that function. >>> Then its really just a matter of reversing the communication call. >>> >>> Matt >>> >>>> >>>> On Tue, Feb 19, 2019 at 7:12 PM Marius Buerkle wrote: >>>> >>>>> I see. This would work if the matrices are on different >>>>> subcommumicators ? Is it possible to add this functionality ? >>>>> >>>> >>>> Hmm, no. That is specialized to serial matrices. You need the inverse >>>> of MatCreateSubMatricesMPI(). >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> marius >>>>> >>>>> >>>>> You basically need the inverse of MatCreateSubmatrices(). I do not >>>>> think we have that right now, but it could probably be done without too >>>>> much trouble by looking at that code. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> On Tue, Feb 19, 2019 at 6:15 AM Marius Buerkle via petsc-users < >>>>> petsc-users at mcs.anl.gov> wrote: >>>>> >>>>>> Hi ! >>>>>> >>>>>> Is there some way to combine MatCompositeMerge >>>>>> with MatCreateRedundantMatrix? 
I basically want to create copies of a >>>>>> matrix from PETSC_COMM_WORLD to subcommunicators, do some work on each >>>>>> subcommunicator and than gather the results back to PETSC_COMM_WORLD, >>>>>> namely I want to sum the the invidual matrices from the subcommunicatos >>>>>> component wise and get the resulting matrix on PETSC_COMM_WORLD. Is this >>>>>> somehow possible without going through all the hassle of using MPI >>>>>> directly? >>>>>> >>>>>> marius >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.colera at upm.es Wed Mar 13 07:14:37 2019 From: m.colera at upm.es (Manuel Colera Rico) Date: Wed, 13 Mar 2019 13:14:37 +0100 Subject: [petsc-users] PCFieldSplit with MatNest In-Reply-To: References: Message-ID: <239f89b0-cf7d-4dde-2621-e185ead3ed17@upm.es> Hi, Junchao, I have installed the newest version of PETSc and it works fine. I just get the following memory leak warning: Direct leak of 28608 byte(s) in 12 object(s) allocated from: ??? #0 0x7f1ddd5caa38 in __interceptor_memalign ../../../../gcc-8.1.0/libsanitizer/asan/asan_malloc_linux.cc:111 ??? #1 0x7f1ddbef1213 in PetscMallocAlign (/opt/PETSc_library/petsc-3.10.4/mcr_20190313/lib/libpetsc.so.3.10+0x150213) Thank you, Manuel --- On 3/12/19 7:08 PM, Zhang, Junchao wrote: > Hi, Manuel, > ? I recently fixed a problem in VecRestoreArrayRead. Basically, I > added VecRestoreArrayRead_Nest. Could you try the master branch of > PETSc to see if it fixes your problem? > ? Thanks. > > --Junchao Zhang > > > On Mon, Mar 11, 2019 at 6:56 AM Manuel Colera Rico via petsc-users > > wrote: > > Hello, > > I need to solve a 2*2 block linear system. The matrices A_00, A_01, > A_10, A_11 are constructed separately via > MatCreateSeqAIJWithArrays and > MatCreateSeqSBAIJWithArrays. 
Then, I construct the full system matrix > with MatCreateNest, and use MatNestGetISs and PCFieldSplitSetIS to > set > up the PC, trying to follow the procedure described here: > https://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex70.c.html. > > However, when I run the code with Leak Sanitizer, I get the > following error: > > ================================================================= > ==54927==ERROR: AddressSanitizer: attempting free on address which > was > not malloc()-ed: 0x627000051ab8 in thread T0 > ???? #0 0x7fbd95c08f30 in __interceptor_free > ../../../../gcc-8.1.0/libsanitizer/asan/asan_malloc_linux.cc:66 > ???? #1 0x7fbd92b99dcd in PetscFreeAlign > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x146dcd) > ???? #2 0x7fbd92ce0178 in VecRestoreArray_Nest > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x28d178) > ???? #3 0x7fbd92cd627d in VecRestoreArrayRead > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x28327d) > ???? #4 0x7fbd92d1189e in VecScatterBegin_SSToSS > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x2be89e) > ???? #5 0x7fbd92d1a414 in VecScatterBegin > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x2c7414) > ???? #6 0x7fbd934a999c in PCApply_FieldSplit > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa5699c) > ???? #7 0x7fbd93369071 in PCApply > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x916071) > ???? #8 0x7fbd934efe77 in KSPInitialResidual > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa9ce77) > ???? #9 0x7fbd9350272c in KSPSolve_GMRES > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xaaf72c) > ???? #10 0x7fbd934e3c01 in KSPSolve > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa90c01) > > Disabling Leak Sanitizer also outputs an "invalid pointer" error. > > Did I forget something when writing the code? > > Thank you, > > Manuel > > --- > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Mar 13 08:28:38 2019 From: jed at jedbrown.org (Jed Brown) Date: Wed, 13 Mar 2019 07:28:38 -0600 Subject: [petsc-users] PCFieldSplit with MatNest In-Reply-To: <239f89b0-cf7d-4dde-2621-e185ead3ed17@upm.es> References: <239f89b0-cf7d-4dde-2621-e185ead3ed17@upm.es> Message-ID: <874l86ewe1.fsf@jedbrown.org> Is there any output if you run with -malloc_dump? Manuel Colera Rico via petsc-users writes: > Hi, Junchao, > > I have installed the newest version of PETSc and it works fine. I just > get the following memory leak warning: > > Direct leak of 28608 byte(s) in 12 object(s) allocated from: > ??? #0 0x7f1ddd5caa38 in __interceptor_memalign > ../../../../gcc-8.1.0/libsanitizer/asan/asan_malloc_linux.cc:111 > ??? #1 0x7f1ddbef1213 in PetscMallocAlign > (/opt/PETSc_library/petsc-3.10.4/mcr_20190313/lib/libpetsc.so.3.10+0x150213) > > Thank you, > > Manuel > > --- > > On 3/12/19 7:08 PM, Zhang, Junchao wrote: >> Hi, Manuel, >> ? I recently fixed a problem in VecRestoreArrayRead. Basically, I >> added VecRestoreArrayRead_Nest. Could you try the master branch of >> PETSc to see if it fixes your problem? >> ? Thanks. >> >> --Junchao Zhang >> >> >> On Mon, Mar 11, 2019 at 6:56 AM Manuel Colera Rico via petsc-users >> > wrote: >> >> Hello, >> >> I need to solve a 2*2 block linear system. 
The matrices A_00, A_01, >> A_10, A_11 are constructed separately via >> MatCreateSeqAIJWithArrays and >> MatCreateSeqSBAIJWithArrays. Then, I construct the full system matrix >> with MatCreateNest, and use MatNestGetISs and PCFieldSplitSetIS to >> set >> up the PC, trying to follow the procedure described here: >> https://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex70.c.html. >> >> However, when I run the code with Leak Sanitizer, I get the >> following error: >> >> ================================================================= >> ==54927==ERROR: AddressSanitizer: attempting free on address which >> was >> not malloc()-ed: 0x627000051ab8 in thread T0 >> ???? #0 0x7fbd95c08f30 in __interceptor_free >> ../../../../gcc-8.1.0/libsanitizer/asan/asan_malloc_linux.cc:66 >> ???? #1 0x7fbd92b99dcd in PetscFreeAlign >> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x146dcd) >> ???? #2 0x7fbd92ce0178 in VecRestoreArray_Nest >> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x28d178) >> ???? #3 0x7fbd92cd627d in VecRestoreArrayRead >> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x28327d) >> ???? #4 0x7fbd92d1189e in VecScatterBegin_SSToSS >> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x2be89e) >> ???? #5 0x7fbd92d1a414 in VecScatterBegin >> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x2c7414) >> ???? #6 0x7fbd934a999c in PCApply_FieldSplit >> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa5699c) >> ???? #7 0x7fbd93369071 in PCApply >> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x916071) >> ???? #8 0x7fbd934efe77 in KSPInitialResidual >> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa9ce77) >> ???? #9 0x7fbd9350272c in KSPSolve_GMRES >> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xaaf72c) >> ???? #10 0x7fbd934e3c01 in KSPSolve >> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa90c01) >> >> Disabling Leak Sanitizer also outputs an "invalid pointer" error. >> >> Did I forget something when writing the code? >> >> Thank you, >> >> Manuel >> >> --- >> From m.colera at upm.es Wed Mar 13 08:44:42 2019 From: m.colera at upm.es (Manuel Colera Rico) Date: Wed, 13 Mar 2019 14:44:42 +0100 Subject: [petsc-users] PCFieldSplit with MatNest In-Reply-To: <874l86ewe1.fsf@jedbrown.org> References: <239f89b0-cf7d-4dde-2621-e185ead3ed17@upm.es> <874l86ewe1.fsf@jedbrown.org> Message-ID: <7bb230b5-8ca5-aacd-9f12-26c149aaeab3@upm.es> Yes: [ 0]8416 bytes MatCreateSeqSBAIJWithArrays() line 2431 in /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c [ 0]8416 bytes MatCreateSeqSBAIJWithArrays() line 2431 in /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c [ 0]4544 bytes MatCreateSeqSBAIJWithArrays() line 2431 in /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c [ 0]4544 bytes MatCreateSeqSBAIJWithArrays() line 2431 in /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c I have checked that I have destroyed all the MatNest matrices and all the submatrices individually. Manuel --- On 3/13/19 2:28 PM, Jed Brown wrote: > Is there any output if you run with -malloc_dump? > > Manuel Colera Rico via petsc-users writes: > >> Hi, Junchao, >> >> I have installed the newest version of PETSc and it works fine. I just >> get the following memory leak warning: >> >> Direct leak of 28608 byte(s) in 12 object(s) allocated from: >> ??? 
#0 0x7f1ddd5caa38 in __interceptor_memalign >> ../../../../gcc-8.1.0/libsanitizer/asan/asan_malloc_linux.cc:111 >> ??? #1 0x7f1ddbef1213 in PetscMallocAlign >> (/opt/PETSc_library/petsc-3.10.4/mcr_20190313/lib/libpetsc.so.3.10+0x150213) >> >> Thank you, >> >> Manuel >> >> --- >> >> On 3/12/19 7:08 PM, Zhang, Junchao wrote: >>> Hi, Manuel, >>> ? I recently fixed a problem in VecRestoreArrayRead. Basically, I >>> added VecRestoreArrayRead_Nest. Could you try the master branch of >>> PETSc to see if it fixes your problem? >>> ? Thanks. >>> >>> --Junchao Zhang >>> >>> >>> On Mon, Mar 11, 2019 at 6:56 AM Manuel Colera Rico via petsc-users >>> > wrote: >>> >>> Hello, >>> >>> I need to solve a 2*2 block linear system. The matrices A_00, A_01, >>> A_10, A_11 are constructed separately via >>> MatCreateSeqAIJWithArrays and >>> MatCreateSeqSBAIJWithArrays. Then, I construct the full system matrix >>> with MatCreateNest, and use MatNestGetISs and PCFieldSplitSetIS to >>> set >>> up the PC, trying to follow the procedure described here: >>> https://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex70.c.html. >>> >>> However, when I run the code with Leak Sanitizer, I get the >>> following error: >>> >>> ================================================================= >>> ==54927==ERROR: AddressSanitizer: attempting free on address which >>> was >>> not malloc()-ed: 0x627000051ab8 in thread T0 >>> ???? #0 0x7fbd95c08f30 in __interceptor_free >>> ../../../../gcc-8.1.0/libsanitizer/asan/asan_malloc_linux.cc:66 >>> ???? #1 0x7fbd92b99dcd in PetscFreeAlign >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x146dcd) >>> ???? #2 0x7fbd92ce0178 in VecRestoreArray_Nest >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x28d178) >>> ???? #3 0x7fbd92cd627d in VecRestoreArrayRead >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x28327d) >>> ???? #4 0x7fbd92d1189e in VecScatterBegin_SSToSS >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x2be89e) >>> ???? #5 0x7fbd92d1a414 in VecScatterBegin >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x2c7414) >>> ???? #6 0x7fbd934a999c in PCApply_FieldSplit >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa5699c) >>> ???? #7 0x7fbd93369071 in PCApply >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x916071) >>> ???? #8 0x7fbd934efe77 in KSPInitialResidual >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa9ce77) >>> ???? #9 0x7fbd9350272c in KSPSolve_GMRES >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xaaf72c) >>> ???? #10 0x7fbd934e3c01 in KSPSolve >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa90c01) >>> >>> Disabling Leak Sanitizer also outputs an "invalid pointer" error. >>> >>> Did I forget something when writing the code? >>> >>> Thank you, >>> >>> Manuel >>> >>> --- >>> From m.colera at upm.es Wed Mar 13 08:49:01 2019 From: m.colera at upm.es (Manuel Colera Rico) Date: Wed, 13 Mar 2019 14:49:01 +0100 Subject: [petsc-users] PCFieldSplit with MatNest In-Reply-To: <7bb230b5-8ca5-aacd-9f12-26c149aaeab3@upm.es> References: <239f89b0-cf7d-4dde-2621-e185ead3ed17@upm.es> <874l86ewe1.fsf@jedbrown.org> <7bb230b5-8ca5-aacd-9f12-26c149aaeab3@upm.es> Message-ID: The warning that Leak Sanitizer gives me is not what I wrote two messages before (I apologize). 
It is: Direct leak of 25920 byte(s) in 4 object(s) allocated from: ??? #0 0x7fa97e35aa38 in __interceptor_memalign ../../../../gcc-8.1.0/libsanitizer/asan/asan_malloc_linux.cc:111 ??? #1 0x7fa97cc81213 in PetscMallocAlign (/opt/PETSc_library/petsc-3.10.4/mcr_20190313/lib/libpetsc.so.3.10+0x150213) which seems to be in accordance (at least in number of leaked bytes) to -malloc_dump's output. Manuel --- On 3/13/19 2:44 PM, Manuel Colera Rico wrote: > Yes: > > [ 0]8416 bytes MatCreateSeqSBAIJWithArrays() line 2431 in > /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c > [ 0]8416 bytes MatCreateSeqSBAIJWithArrays() line 2431 in > /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c > [ 0]4544 bytes MatCreateSeqSBAIJWithArrays() line 2431 in > /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c > [ 0]4544 bytes MatCreateSeqSBAIJWithArrays() line 2431 in > /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c > > I have checked that I have destroyed all the MatNest matrices and all > the submatrices individually. > > Manuel > > --- > > On 3/13/19 2:28 PM, Jed Brown wrote: >> Is there any output if you run with -malloc_dump? >> >> Manuel Colera Rico via petsc-users writes: >> >>> Hi, Junchao, >>> >>> I have installed the newest version of PETSc and it works fine. I just >>> get the following memory leak warning: >>> >>> Direct leak of 28608 byte(s) in 12 object(s) allocated from: >>> ? ??? #0 0x7f1ddd5caa38 in __interceptor_memalign >>> ../../../../gcc-8.1.0/libsanitizer/asan/asan_malloc_linux.cc:111 >>> ? ??? #1 0x7f1ddbef1213 in PetscMallocAlign >>> (/opt/PETSc_library/petsc-3.10.4/mcr_20190313/lib/libpetsc.so.3.10+0x150213) >>> >>> >>> Thank you, >>> >>> Manuel >>> >>> --- >>> >>> On 3/12/19 7:08 PM, Zhang, Junchao wrote: >>>> Hi, Manuel, >>>> ?? I recently fixed a problem in VecRestoreArrayRead. Basically, I >>>> added VecRestoreArrayRead_Nest. Could you try the master branch of >>>> PETSc to see if it fixes your problem? >>>> ?? Thanks. >>>> >>>> --Junchao Zhang >>>> >>>> >>>> On Mon, Mar 11, 2019 at 6:56 AM Manuel Colera Rico via petsc-users >>>> > wrote: >>>> >>>> ???? Hello, >>>> >>>> ???? I need to solve a 2*2 block linear system. The matrices A_00, >>>> A_01, >>>> ???? A_10, A_11 are constructed separately via >>>> ???? MatCreateSeqAIJWithArrays and >>>> ???? MatCreateSeqSBAIJWithArrays. Then, I construct the full system >>>> matrix >>>> ???? with MatCreateNest, and use MatNestGetISs and >>>> PCFieldSplitSetIS to >>>> ???? set >>>> ???? up the PC, trying to follow the procedure described here: >>>> https://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex70.c.html. >>>> >>>> ???? However, when I run the code with Leak Sanitizer, I get the >>>> ???? following error: >>>> >>>> ================================================================= >>>> ???? ==54927==ERROR: AddressSanitizer: attempting free on address >>>> which >>>> ???? was >>>> ???? not malloc()-ed: 0x627000051ab8 in thread T0 >>>> ???? ???? #0 0x7fbd95c08f30 in __interceptor_free >>>> ../../../../gcc-8.1.0/libsanitizer/asan/asan_malloc_linux.cc:66 >>>> ???? ???? #1 0x7fbd92b99dcd in PetscFreeAlign >>>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x146dcd) >>>> ???? ???? #2 0x7fbd92ce0178 in VecRestoreArray_Nest >>>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x28d178) >>>> ???? ???? #3 0x7fbd92cd627d in VecRestoreArrayRead >>>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x28327d) >>>> ???? 
???? #4 0x7fbd92d1189e in VecScatterBegin_SSToSS >>>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x2be89e) >>>> ???? ???? #5 0x7fbd92d1a414 in VecScatterBegin >>>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x2c7414) >>>> ???? ???? #6 0x7fbd934a999c in PCApply_FieldSplit >>>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa5699c) >>>> ???? ???? #7 0x7fbd93369071 in PCApply >>>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x916071) >>>> ???? ???? #8 0x7fbd934efe77 in KSPInitialResidual >>>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa9ce77) >>>> ???? ???? #9 0x7fbd9350272c in KSPSolve_GMRES >>>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xaaf72c) >>>> ???? ???? #10 0x7fbd934e3c01 in KSPSolve >>>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa90c01) >>>> >>>> ???? Disabling Leak Sanitizer also outputs an "invalid pointer" error. >>>> >>>> ???? Did I forget something when writing the code? >>>> >>>> ???? Thank you, >>>> >>>> ???? Manuel >>>> >>>> ???? --- >>>> From knepley at gmail.com Wed Mar 13 09:04:34 2019 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 13 Mar 2019 10:04:34 -0400 Subject: [petsc-users] PCFieldSplit with MatNest In-Reply-To: <7bb230b5-8ca5-aacd-9f12-26c149aaeab3@upm.es> References: <239f89b0-cf7d-4dde-2621-e185ead3ed17@upm.es> <874l86ewe1.fsf@jedbrown.org> <7bb230b5-8ca5-aacd-9f12-26c149aaeab3@upm.es> Message-ID: On Wed, Mar 13, 2019 at 9:44 AM Manuel Colera Rico via petsc-users < petsc-users at mcs.anl.gov> wrote: > Yes: > > [ 0]8416 bytes MatCreateSeqSBAIJWithArrays() line 2431 in > /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c > [ 0]8416 bytes MatCreateSeqSBAIJWithArrays() line 2431 in > /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c > [ 0]4544 bytes MatCreateSeqSBAIJWithArrays() line 2431 in > /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c > [ 0]4544 bytes MatCreateSeqSBAIJWithArrays() line 2431 in > /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c > Junchao, do imax and ilen get missed in the Destroy with the user provides arrays? https://bitbucket.org/petsc/petsc/src/06a3e802b3873ffbfd04b71a0821522327dd9b04/src/mat/impls/sbaij/seq/sbaij.c#lines-2431 Matt > I have checked that I have destroyed all the MatNest matrices and all > the submatrices individually. > > Manuel > > --- > > On 3/13/19 2:28 PM, Jed Brown wrote: > > Is there any output if you run with -malloc_dump? > > > > Manuel Colera Rico via petsc-users writes: > > > >> Hi, Junchao, > >> > >> I have installed the newest version of PETSc and it works fine. I just > >> get the following memory leak warning: > >> > >> Direct leak of 28608 byte(s) in 12 object(s) allocated from: > >> #0 0x7f1ddd5caa38 in __interceptor_memalign > >> ../../../../gcc-8.1.0/libsanitizer/asan/asan_malloc_linux.cc:111 > >> #1 0x7f1ddbef1213 in PetscMallocAlign > >> > (/opt/PETSc_library/petsc-3.10.4/mcr_20190313/lib/libpetsc.so.3.10+0x150213) > >> > >> Thank you, > >> > >> Manuel > >> > >> --- > >> > >> On 3/12/19 7:08 PM, Zhang, Junchao wrote: > >>> Hi, Manuel, > >>> I recently fixed a problem in VecRestoreArrayRead. Basically, I > >>> added VecRestoreArrayRead_Nest. Could you try the master branch of > >>> PETSc to see if it fixes your problem? > >>> Thanks. 
> >>> > >>> --Junchao Zhang > >>> > >>> > >>> On Mon, Mar 11, 2019 at 6:56 AM Manuel Colera Rico via petsc-users > >>> > wrote: > >>> > >>> Hello, > >>> > >>> I need to solve a 2*2 block linear system. The matrices A_00, > A_01, > >>> A_10, A_11 are constructed separately via > >>> MatCreateSeqAIJWithArrays and > >>> MatCreateSeqSBAIJWithArrays. Then, I construct the full system > matrix > >>> with MatCreateNest, and use MatNestGetISs and PCFieldSplitSetIS to > >>> set > >>> up the PC, trying to follow the procedure described here: > >>> > https://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex70.c.html > . > >>> > >>> However, when I run the code with Leak Sanitizer, I get the > >>> following error: > >>> > >>> ================================================================= > >>> ==54927==ERROR: AddressSanitizer: attempting free on address which > >>> was > >>> not malloc()-ed: 0x627000051ab8 in thread T0 > >>> #0 0x7fbd95c08f30 in __interceptor_free > >>> ../../../../gcc-8.1.0/libsanitizer/asan/asan_malloc_linux.cc:66 > >>> #1 0x7fbd92b99dcd in PetscFreeAlign > >>> > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x146dcd) > >>> #2 0x7fbd92ce0178 in VecRestoreArray_Nest > >>> > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x28d178) > >>> #3 0x7fbd92cd627d in VecRestoreArrayRead > >>> > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x28327d) > >>> #4 0x7fbd92d1189e in VecScatterBegin_SSToSS > >>> > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x2be89e) > >>> #5 0x7fbd92d1a414 in VecScatterBegin > >>> > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x2c7414) > >>> #6 0x7fbd934a999c in PCApply_FieldSplit > >>> > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa5699c) > >>> #7 0x7fbd93369071 in PCApply > >>> > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x916071) > >>> #8 0x7fbd934efe77 in KSPInitialResidual > >>> > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa9ce77) > >>> #9 0x7fbd9350272c in KSPSolve_GMRES > >>> > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xaaf72c) > >>> #10 0x7fbd934e3c01 in KSPSolve > >>> > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa90c01) > >>> > >>> Disabling Leak Sanitizer also outputs an "invalid pointer" error. > >>> > >>> Did I forget something when writing the code? > >>> > >>> Thank you, > >>> > >>> Manuel > >>> > >>> --- > >>> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jczhang at mcs.anl.gov Wed Mar 13 09:13:51 2019 From: jczhang at mcs.anl.gov (Zhang, Junchao) Date: Wed, 13 Mar 2019 14:13:51 +0000 Subject: [petsc-users] PCFieldSplit with MatNest In-Reply-To: References: <239f89b0-cf7d-4dde-2621-e185ead3ed17@upm.es> <874l86ewe1.fsf@jedbrown.org> <7bb230b5-8ca5-aacd-9f12-26c149aaeab3@upm.es> Message-ID: Manuel, Could you try to add this line sbaij->free_imax_ilen = PETSC_TRUE; after line 2431 in /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c PS: Matt, this bug looks unrelated to my VecRestoreArrayRead_Nest fix. 
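To see why that one line matters, here is a condensed sketch of the relevant part of MatCreateSeqSBAIJWithArrays(); the surrounding code is paraphrased from sbaij.c rather than quoted verbatim, so the exact allocation call and field names are best-effort approximations of the 3.10.4 source:

/* Condensed, paraphrased sketch of MatCreateSeqSBAIJWithArrays() in
 * src/mat/impls/sbaij/seq/sbaij.c (near line 2431); not a verbatim copy. */
  Mat_SeqSBAIJ *sbaij = (Mat_SeqSBAIJ*)(*mat)->data;

  /* imax/ilen are allocated here even though the caller supplied i/j/a ... */
  ierr = PetscMalloc2(m,&sbaij->imax,m,&sbaij->ilen);CHKERRQ(ierr);

  /* ... and the user-owned arrays are (correctly) marked as not-to-be-freed ... */
  sbaij->free_a  = PETSC_FALSE;
  sbaij->free_ij = PETSC_FALSE;

  /* ... but without the line below MatDestroy() never releases imax/ilen,
   * which is exactly the 4544/8416-byte leak reported by -malloc_dump above. */
  sbaij->free_imax_ilen = PETSC_TRUE;   /* the suggested one-line fix */

Rebuilding PETSc after that edit and re-running with -malloc_dump should make the MatCreateSeqSBAIJWithArrays() entries disappear.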
--Junchao Zhang On Wed, Mar 13, 2019 at 9:05 AM Matthew Knepley > wrote: On Wed, Mar 13, 2019 at 9:44 AM Manuel Colera Rico via petsc-users > wrote: Yes: [ 0]8416 bytes MatCreateSeqSBAIJWithArrays() line 2431 in /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c [ 0]8416 bytes MatCreateSeqSBAIJWithArrays() line 2431 in /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c [ 0]4544 bytes MatCreateSeqSBAIJWithArrays() line 2431 in /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c [ 0]4544 bytes MatCreateSeqSBAIJWithArrays() line 2431 in /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c Junchao, do imax and ilen get missed in the Destroy with the user provides arrays? https://bitbucket.org/petsc/petsc/src/06a3e802b3873ffbfd04b71a0821522327dd9b04/src/mat/impls/sbaij/seq/sbaij.c#lines-2431 Matt I have checked that I have destroyed all the MatNest matrices and all the submatrices individually. Manuel --- On 3/13/19 2:28 PM, Jed Brown wrote: > Is there any output if you run with -malloc_dump? > > Manuel Colera Rico via petsc-users > writes: > >> Hi, Junchao, >> >> I have installed the newest version of PETSc and it works fine. I just >> get the following memory leak warning: >> >> Direct leak of 28608 byte(s) in 12 object(s) allocated from: >> #0 0x7f1ddd5caa38 in __interceptor_memalign >> ../../../../gcc-8.1.0/libsanitizer/asan/asan_malloc_linux.cc:111 >> #1 0x7f1ddbef1213 in PetscMallocAlign >> (/opt/PETSc_library/petsc-3.10.4/mcr_20190313/lib/libpetsc.so.3.10+0x150213) >> >> Thank you, >> >> Manuel >> >> --- >> >> On 3/12/19 7:08 PM, Zhang, Junchao wrote: >>> Hi, Manuel, >>> I recently fixed a problem in VecRestoreArrayRead. Basically, I >>> added VecRestoreArrayRead_Nest. Could you try the master branch of >>> PETSc to see if it fixes your problem? >>> Thanks. >>> >>> --Junchao Zhang >>> >>> >>> On Mon, Mar 11, 2019 at 6:56 AM Manuel Colera Rico via petsc-users >>> >> wrote: >>> >>> Hello, >>> >>> I need to solve a 2*2 block linear system. The matrices A_00, A_01, >>> A_10, A_11 are constructed separately via >>> MatCreateSeqAIJWithArrays and >>> MatCreateSeqSBAIJWithArrays. Then, I construct the full system matrix >>> with MatCreateNest, and use MatNestGetISs and PCFieldSplitSetIS to >>> set >>> up the PC, trying to follow the procedure described here: >>> https://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex70.c.html. 
>>> >>> However, when I run the code with Leak Sanitizer, I get the >>> following error: >>> >>> ================================================================= >>> ==54927==ERROR: AddressSanitizer: attempting free on address which >>> was >>> not malloc()-ed: 0x627000051ab8 in thread T0 >>> #0 0x7fbd95c08f30 in __interceptor_free >>> ../../../../gcc-8.1.0/libsanitizer/asan/asan_malloc_linux.cc:66 >>> #1 0x7fbd92b99dcd in PetscFreeAlign >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x146dcd) >>> #2 0x7fbd92ce0178 in VecRestoreArray_Nest >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x28d178) >>> #3 0x7fbd92cd627d in VecRestoreArrayRead >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x28327d) >>> #4 0x7fbd92d1189e in VecScatterBegin_SSToSS >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x2be89e) >>> #5 0x7fbd92d1a414 in VecScatterBegin >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x2c7414) >>> #6 0x7fbd934a999c in PCApply_FieldSplit >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa5699c) >>> #7 0x7fbd93369071 in PCApply >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x916071) >>> #8 0x7fbd934efe77 in KSPInitialResidual >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa9ce77) >>> #9 0x7fbd9350272c in KSPSolve_GMRES >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xaaf72c) >>> #10 0x7fbd934e3c01 in KSPSolve >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa90c01) >>> >>> Disabling Leak Sanitizer also outputs an "invalid pointer" error. >>> >>> Did I forget something when writing the code? >>> >>> Thank you, >>> >>> Manuel >>> >>> --- >>> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From wencel at gmail.com Wed Mar 13 09:15:27 2019 From: wencel at gmail.com (Lawrence Mitchell) Date: Wed, 13 Mar 2019 14:15:27 +0000 Subject: [petsc-users] PCFieldSplit with MatNest In-Reply-To: References: <239f89b0-cf7d-4dde-2621-e185ead3ed17@upm.es> <874l86ewe1.fsf@jedbrown.org> <7bb230b5-8ca5-aacd-9f12-26c149aaeab3@upm.es> Message-ID: > On 13 Mar 2019, at 14:04, Matthew Knepley via petsc-users wrote: > > On Wed, Mar 13, 2019 at 9:44 AM Manuel Colera Rico via petsc-users wrote: > Yes: > > [ 0]8416 bytes MatCreateSeqSBAIJWithArrays() line 2431 in > /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c > [ 0]8416 bytes MatCreateSeqSBAIJWithArrays() line 2431 in > /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c > [ 0]4544 bytes MatCreateSeqSBAIJWithArrays() line 2431 in > /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c > [ 0]4544 bytes MatCreateSeqSBAIJWithArrays() line 2431 in > /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c > > Junchao, do imax and ilen get missed in the Destroy with the user provides arrays? > > https://bitbucket.org/petsc/petsc/src/06a3e802b3873ffbfd04b71a0821522327dd9b04/src/mat/impls/sbaij/seq/sbaij.c#lines-2431 Looks like it. 
One probably needs something like: diff --git a/src/mat/impls/sbaij/seq/sbaij.c b/src/mat/impls/sbaij/seq/sbaij.c index 2b98394140..c39fc696d8 100644 --- a/src/mat/impls/sbaij/seq/sbaij.c +++ b/src/mat/impls/sbaij/seq/sbaij.c @@ -2442,6 +2442,7 @@ PetscErrorCode MatCreateSeqSBAIJWithArrays(MPI_Comm comm,PetscInt bs,PetscInt m sbaij->nonew = -1; /*this indicates that inserting a new value in the matrix that generates a new nonzero is an error*/ sbaij->free_a = PETSC_FALSE; sbaij->free_ij = PETSC_FALSE; + sbaij->free_imax_ilen = PETSC_TRUE; for (ii=0; iiilen[ii] = sbaij->imax[ii] = i[ii+1] - i[ii]; Lawrence From m.colera at upm.es Wed Mar 13 11:04:28 2019 From: m.colera at upm.es (Manuel Colera Rico) Date: Wed, 13 Mar 2019 17:04:28 +0100 Subject: [petsc-users] PCFieldSplit with MatNest In-Reply-To: References: <239f89b0-cf7d-4dde-2621-e185ead3ed17@upm.es> <874l86ewe1.fsf@jedbrown.org> <7bb230b5-8ca5-aacd-9f12-26c149aaeab3@upm.es> Message-ID: After adding that line the problem gets fixed. Regards, Manuel --- On 3/13/19 3:13 PM, Zhang, Junchao wrote: > Manuel, > ? Could you try to add this line > ? ? ?sbaij->free_imax_ilen = PETSC_TRUE; > ?after line 2431 in > /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c > > ?PS: Matt, this bug looks unrelated to my VecRestoreArrayRead_Nest fix. > > --Junchao Zhang > > > On Wed, Mar 13, 2019 at 9:05 AM Matthew Knepley > wrote: > > On Wed, Mar 13, 2019 at 9:44 AM Manuel Colera Rico via petsc-users > > wrote: > > Yes: > > [ 0]8416 bytes MatCreateSeqSBAIJWithArrays() line 2431 in > /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c > [ 0]8416 bytes MatCreateSeqSBAIJWithArrays() line 2431 in > /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c > [ 0]4544 bytes MatCreateSeqSBAIJWithArrays() line 2431 in > /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c > [ 0]4544 bytes MatCreateSeqSBAIJWithArrays() line 2431 in > /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c > > > Junchao, do imax and ilen get missed in the Destroy with the user > provides arrays? > > https://bitbucket.org/petsc/petsc/src/06a3e802b3873ffbfd04b71a0821522327dd9b04/src/mat/impls/sbaij/seq/sbaij.c#lines-2431 > > ? ? Matt > > I have checked that I have destroyed all the MatNest matrices > and all > the submatrices individually. > > Manuel > > --- > > On 3/13/19 2:28 PM, Jed Brown wrote: > > Is there any output if you run with -malloc_dump? > > > > Manuel Colera Rico via petsc-users > writes: > > > >> Hi, Junchao, > >> > >> I have installed the newest version of PETSc and it works > fine. I just > >> get the following memory leak warning: > >> > >> Direct leak of 28608 byte(s) in 12 object(s) allocated from: > >>? ???? #0 0x7f1ddd5caa38 in __interceptor_memalign > >> > ../../../../gcc-8.1.0/libsanitizer/asan/asan_malloc_linux.cc:111 > >>? ???? #1 0x7f1ddbef1213 in PetscMallocAlign > >> > (/opt/PETSc_library/petsc-3.10.4/mcr_20190313/lib/libpetsc.so.3.10+0x150213) > >> > >> Thank you, > >> > >> Manuel > >> > >> --- > >> > >> On 3/12/19 7:08 PM, Zhang, Junchao wrote: > >>> Hi, Manuel, > >>>? ? I recently fixed a problem in VecRestoreArrayRead. > Basically, I > >>> added VecRestoreArrayRead_Nest. Could you try the master > branch of > >>> PETSc to see if it fixes your problem? > >>>? ? Thanks. > >>> > >>> --Junchao Zhang > >>> > >>> > >>> On Mon, Mar 11, 2019 at 6:56 AM Manuel Colera Rico via > petsc-users > >>> > >> wrote: > >>> > >>>? ? ? Hello, > >>> > >>>? ? ? I need to solve a 2*2 block linear system. 
The > matrices A_00, A_01, > >>>? ? ? A_10, A_11 are constructed separately via > >>>? ? ? MatCreateSeqAIJWithArrays and > >>>? ? ? MatCreateSeqSBAIJWithArrays. Then, I construct the > full system matrix > >>>? ? ? with MatCreateNest, and use MatNestGetISs and > PCFieldSplitSetIS to > >>>? ? ? set > >>>? ? ? up the PC, trying to follow the procedure described here: > >>> > https://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex70.c.html. > >>> > >>>? ? ? However, when I run the code with Leak Sanitizer, I > get the > >>>? ? ? following error: > >>> > >>> > ================================================================= > >>>? ? ? ==54927==ERROR: AddressSanitizer: attempting free on > address which > >>>? ? ? was > >>>? ? ? not malloc()-ed: 0x627000051ab8 in thread T0 > >>>? ? ? ???? #0 0x7fbd95c08f30 in __interceptor_free > >>> > ../../../../gcc-8.1.0/libsanitizer/asan/asan_malloc_linux.cc:66 > >>>? ? ? ???? #1 0x7fbd92b99dcd in PetscFreeAlign > >>> > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x146dcd) > >>>? ? ? ???? #2 0x7fbd92ce0178 in VecRestoreArray_Nest > >>> > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x28d178) > >>>? ? ? ???? #3 0x7fbd92cd627d in VecRestoreArrayRead > >>> > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x28327d) > >>>? ? ? ???? #4 0x7fbd92d1189e in VecScatterBegin_SSToSS > >>> > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x2be89e) > >>>? ? ? ???? #5 0x7fbd92d1a414 in VecScatterBegin > >>> > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x2c7414) > >>>? ? ? ???? #6 0x7fbd934a999c in PCApply_FieldSplit > >>> > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa5699c) > >>>? ? ? ???? #7 0x7fbd93369071 in PCApply > >>> > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x916071) > >>>? ? ? ???? #8 0x7fbd934efe77 in KSPInitialResidual > >>> > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa9ce77) > >>>? ? ? ???? #9 0x7fbd9350272c in KSPSolve_GMRES > >>> > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xaaf72c) > >>>? ? ? ???? #10 0x7fbd934e3c01 in KSPSolve > >>> > (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa90c01) > >>> > >>>? ? ? Disabling Leak Sanitizer also outputs an "invalid > pointer" error. > >>> > >>>? ? ? Did I forget something when writing the code? > >>> > >>>? ? ? Thank you, > >>> > >>>? ? ? Manuel > >>> > >>>? ? ? --- > >>> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to > which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Mar 13 11:10:00 2019 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 13 Mar 2019 12:10:00 -0400 Subject: [petsc-users] PCFieldSplit with MatNest In-Reply-To: References: <239f89b0-cf7d-4dde-2621-e185ead3ed17@upm.es> <874l86ewe1.fsf@jedbrown.org> <7bb230b5-8ca5-aacd-9f12-26c149aaeab3@upm.es> Message-ID: On Wed, Mar 13, 2019 at 12:04 PM Manuel Colera Rico wrote: > After adding that line the problem gets fixed. > > Junchao, I submitted a PR against maint for this as you. 
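For anyone finding this thread later, a minimal sketch of the MatNest plus PCFieldSplitSetIS wiring being discussed, along the lines of the ex70.c example linked above; the submatrices A00, A01, A10, A11 and the KSP are assumed to exist already, and error checking is omitted:

  Mat subs[4] = {A00, A01, A10, A11};
  Mat A;
  IS  rows[2];
  PC  pc;

  MatCreateNest(PETSC_COMM_SELF,2,NULL,2,NULL,subs,&A);
  MatNestGetISs(A,rows,NULL);              /* the index sets MatNest built for the two row blocks */
  KSPSetOperators(ksp,A,A);
  KSPGetPC(ksp,&pc);
  PCSetType(pc,PCFIELDSPLIT);
  PCFieldSplitSetIS(pc,"0",rows[0]);
  PCFieldSplitSetIS(pc,"1",rows[1]);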
Thanks, Matt > Regards, > > Manuel > > --- > On 3/13/19 3:13 PM, Zhang, Junchao wrote: > > Manuel, > Could you try to add this line > sbaij->free_imax_ilen = PETSC_TRUE; > after line 2431 in > /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c > > PS: Matt, this bug looks unrelated to my VecRestoreArrayRead_Nest fix. > > --Junchao Zhang > > > On Wed, Mar 13, 2019 at 9:05 AM Matthew Knepley wrote: > >> On Wed, Mar 13, 2019 at 9:44 AM Manuel Colera Rico via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> Yes: >>> >>> [ 0]8416 bytes MatCreateSeqSBAIJWithArrays() line 2431 in >>> /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c >>> [ 0]8416 bytes MatCreateSeqSBAIJWithArrays() line 2431 in >>> /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c >>> [ 0]4544 bytes MatCreateSeqSBAIJWithArrays() line 2431 in >>> /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c >>> [ 0]4544 bytes MatCreateSeqSBAIJWithArrays() line 2431 in >>> /opt/PETSc_library/petsc-3.10.4/src/mat/impls/sbaij/seq/sbaij.c >>> >> >> Junchao, do imax and ilen get missed in the Destroy with the user >> provides arrays? >> >> >> https://bitbucket.org/petsc/petsc/src/06a3e802b3873ffbfd04b71a0821522327dd9b04/src/mat/impls/sbaij/seq/sbaij.c#lines-2431 >> >> Matt >> >> >>> I have checked that I have destroyed all the MatNest matrices and all >>> the submatrices individually. >>> >>> Manuel >>> >>> --- >>> >>> On 3/13/19 2:28 PM, Jed Brown wrote: >>> > Is there any output if you run with -malloc_dump? >>> > >>> > Manuel Colera Rico via petsc-users writes: >>> > >>> >> Hi, Junchao, >>> >> >>> >> I have installed the newest version of PETSc and it works fine. I just >>> >> get the following memory leak warning: >>> >> >>> >> Direct leak of 28608 byte(s) in 12 object(s) allocated from: >>> >> #0 0x7f1ddd5caa38 in __interceptor_memalign >>> >> ../../../../gcc-8.1.0/libsanitizer/asan/asan_malloc_linux.cc:111 >>> >> #1 0x7f1ddbef1213 in PetscMallocAlign >>> >> >>> (/opt/PETSc_library/petsc-3.10.4/mcr_20190313/lib/libpetsc.so.3.10+0x150213) >>> >> >>> >> Thank you, >>> >> >>> >> Manuel >>> >> >>> >> --- >>> >> >>> >> On 3/12/19 7:08 PM, Zhang, Junchao wrote: >>> >>> Hi, Manuel, >>> >>> I recently fixed a problem in VecRestoreArrayRead. Basically, I >>> >>> added VecRestoreArrayRead_Nest. Could you try the master branch of >>> >>> PETSc to see if it fixes your problem? >>> >>> Thanks. >>> >>> >>> >>> --Junchao Zhang >>> >>> >>> >>> >>> >>> On Mon, Mar 11, 2019 at 6:56 AM Manuel Colera Rico via petsc-users >>> >>> > wrote: >>> >>> >>> >>> Hello, >>> >>> >>> >>> I need to solve a 2*2 block linear system. The matrices A_00, >>> A_01, >>> >>> A_10, A_11 are constructed separately via >>> >>> MatCreateSeqAIJWithArrays and >>> >>> MatCreateSeqSBAIJWithArrays. Then, I construct the full system >>> matrix >>> >>> with MatCreateNest, and use MatNestGetISs and PCFieldSplitSetIS >>> to >>> >>> set >>> >>> up the PC, trying to follow the procedure described here: >>> >>> >>> https://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex70.c.html >>> . 
>>> >>> >>> >>> However, when I run the code with Leak Sanitizer, I get the >>> >>> following error: >>> >>> >>> >>> >>> ================================================================= >>> >>> ==54927==ERROR: AddressSanitizer: attempting free on address >>> which >>> >>> was >>> >>> not malloc()-ed: 0x627000051ab8 in thread T0 >>> >>> #0 0x7fbd95c08f30 in __interceptor_free >>> >>> ../../../../gcc-8.1.0/libsanitizer/asan/asan_malloc_linux.cc:66 >>> >>> #1 0x7fbd92b99dcd in PetscFreeAlign >>> >>> >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x146dcd) >>> >>> #2 0x7fbd92ce0178 in VecRestoreArray_Nest >>> >>> >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x28d178) >>> >>> #3 0x7fbd92cd627d in VecRestoreArrayRead >>> >>> >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x28327d) >>> >>> #4 0x7fbd92d1189e in VecScatterBegin_SSToSS >>> >>> >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x2be89e) >>> >>> #5 0x7fbd92d1a414 in VecScatterBegin >>> >>> >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x2c7414) >>> >>> #6 0x7fbd934a999c in PCApply_FieldSplit >>> >>> >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa5699c) >>> >>> #7 0x7fbd93369071 in PCApply >>> >>> >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0x916071) >>> >>> #8 0x7fbd934efe77 in KSPInitialResidual >>> >>> >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa9ce77) >>> >>> #9 0x7fbd9350272c in KSPSolve_GMRES >>> >>> >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xaaf72c) >>> >>> #10 0x7fbd934e3c01 in KSPSolve >>> >>> >>> (/opt/PETSc_library/petsc/manuel_OpenBLAS_petsc/lib/libpetsc.so.3.8+0xa90c01) >>> >>> >>> >>> Disabling Leak Sanitizer also outputs an "invalid pointer" >>> error. >>> >>> >>> >>> Did I forget something when writing the code? >>> >>> >>> >>> Thank you, >>> >>> >>> >>> Manuel >>> >>> >>> >>> --- >>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bastian.loehrer at tu-dresden.de Wed Mar 13 11:11:59 2019 From: bastian.loehrer at tu-dresden.de (=?UTF-8?Q?Bastian_L=c3=b6hrer?=) Date: Wed, 13 Mar 2019 17:11:59 +0100 Subject: [petsc-users] Issue when passing DMDA array on to Paraview Catalyst Message-ID: <136c8d5d-bb94-c5c0-2380-8d4f3a84be00@tu-dresden.de> Dear PETSc users, I am having difficulties passing PETSc data on to Paraview Catalyst and it may be related to the way we handle the PETSs data in our Fortran code. We have DMDA objects, which we pass on to subroutines this way: > ? ... > ? call DMCreateLocalVector(da1dof, loc_p, ierr) > ? ... > ? call VecGetArray(loc_p, loc_p_v, loc_p_i, ierr) > ? call process( loc_p_v(loc_p_i+1) ) > ? ... > Inside the subroutine (process in this example) we treat the subroutine's argument as if it were an ordinary Fortran array: > ? subroutine process( p ) > > ??? use gridinfo ! provides gis, gie, ... etc. > > ??? 
implicit none
>
> #include "petsc_include.h"
>
>     PetscScalar, dimension(gis:gie,gjs:gje,gks:gke) :: p
>     PetscInt i,j,k
>
>     do k = gks, gke
>       do j = gjs, gje
>         do i = gis, gie
>
>             p(i,j,k) = ...
>
>         enddo
>       enddo
>     enddo
>
>   end subroutine process
>
I find this procedure a little quirky, but it has been working flawlessly for years.

However, I am now encountering difficulties when passing this variable/array p on to a Paraview Catalyst adaptor subroutine. Doing so I end up with very strange values there. When replacing p with an ordinary local Fortran array everything is fine.

Bastian

From knepley at gmail.com  Wed Mar 13 11:28:26 2019
From: knepley at gmail.com (Matthew Knepley)
Date: Wed, 13 Mar 2019 12:28:26 -0400
Subject: [petsc-users] Issue when passing DMDA array on to Paraview Catalyst
In-Reply-To: <136c8d5d-bb94-c5c0-2380-8d4f3a84be00@tu-dresden.de>
References: <136c8d5d-bb94-c5c0-2380-8d4f3a84be00@tu-dresden.de>
Message-ID:

On Wed, Mar 13, 2019 at 12:16 PM Bastian Löhrer via petsc-users < petsc-users at mcs.anl.gov> wrote:

> Dear PETSc users,
>
> I am having difficulties passing PETSc data on to Paraview Catalyst and
> it may be related to the way we handle the PETSs data in our Fortran code.
>
> We have DMDA objects, which we pass on to subroutines this way:
>
> > ...
> > call DMCreateLocalVector(da1dof, loc_p, ierr)
> > ...
> > call VecGetArray(loc_p, loc_p_v, loc_p_i, ierr)
> > call process( loc_p_v(loc_p_i+1) )
> > ...
>
> Inside the subroutine (process in this example) we treat the
> subroutine's argument as if it were an ordinary Fortran array:
>
> > subroutine process( p )
> >
> > use gridinfo ! provides gis, gie, ... etc.
> >
> > implicit none
> >
> > #include "petsc_include.h"
> >
> > PetscScalar, dimension(gis:gie,gjs:gje,gks:gke) :: p
> > PetscInt i,j,k
> >
> > do k = gks, gke
> > do j = gjs, gje
> > do i = gis, gie
> >
> > p(i,j,k) = ...
> >
> > enddo
> > enddo
> > enddo
> >
> > end subroutine process
>
> I find this procedure a little quirky, but it has been working
> flawlessly for years.
>
> However, I am now encountering difficulties when passing this
> variable/array p on to a Paraview Catalyst adaptor subroutine. Doing so
> I end up with very strange values there. When replacing p with an
> ordinary local Fortran array everything is fine.
>
I can't think of a reason it would not work. I would look at the pointer you get inside the Catalyst function using the debugger.

Note that you can also get an F90 array out if that is what Catalyst needs.

  Thanks,

    Matt

> Bastian
>
--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From maahi.buet at gmail.com  Wed Mar 13 18:34:57 2019
From: maahi.buet at gmail.com (Maahi Talukder)
Date: Wed, 13 Mar 2019 19:34:57 -0400
Subject: [petsc-users] Compiling Fortran Code
Message-ID:

Dear All,

I am trying to compile a Fortran code. The make is as it follows-

............................................................................................................................................................................................................................
# Makefile for egrid2d OBJS = main.o egrid2d.o FFLAGS = -I/home/maahi/petsc/include -I/home/maahi/petsc/arch-linux2-c-debug/include -Ofast -fdefault-real-8 # # link # include ${PETSC_DIR}/lib/petsc/conf/variables include ${PETSC_DIR}/lib/petsc/conf/rules egrid2d: $(OBJS) ${FLINKER} $(OBJS) -o egrid2d ${PETSC_LIB} # # compile # main.o: ${FLINKER} -c $(FFLAGS) main.f ${PETSC_LIB} # # Common and Parameter Dependencies # main.o: main.f par2d.f egrid2d.o: egrid2d.f par2d.f ..................................................................................................................................................................................................................................... But I get the following error- .............................................................................................................................................................................. /home/maahi/petsc/arch-linux2-c-debug/bin/mpif90 -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -g -I/home/maahi/petsc/include -I/home/maahi/petsc/arch-linux2-c-debug/include -Ofast -fdefault-real-8 -o egrid2d -Wl,-rpath,/home/maahi/petsc/arch-linux2-c-debug/lib -L/home/maahi/petsc/arch-linux2-c-debug/lib -Wl,-rpath,/home/maahi/petsc/arch-linux2-c-debug/lib -L/home/maahi/petsc/arch-linux2-c-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/7 -L/usr/lib/gcc/x86_64-redhat-linux/7 -lpetsc -lflapack -lfblas -lm -lpthread -lstdc++ -ldl -lmpifort -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lstdc++ -ldl /*usr/lib/gcc/x86_64-redhat-linux/7/../../../../lib64/crt1.o: In function `_start':* *(.text+0x20): undefined reference to `main'* collect2: error: ld returned 1 exit status make: *** [makefile:18: egrid2d] Error 1 ........................................................................................................................................ Any idea how to fix it ? Thanks, Maahi Talukder -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Mar 13 19:31:17 2019 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 13 Mar 2019 20:31:17 -0400 Subject: [petsc-users] Compiling Fortran Code In-Reply-To: References: Message-ID: On Wed, Mar 13, 2019 at 7:36 PM Maahi Talukder via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear All, > > I am trying to compile a Fortran code. The make is as it follows- > > > ............................................................................................................................................................................................................................ > # Makefile for egrid2d > > OBJS = main.o egrid2d.o > > FFLAGS = -I/home/maahi/petsc/include > -I/home/maahi/petsc/arch-linux2-c-debug/include -Ofast -fdefault-real-8 > > # > # link > # > include ${PETSC_DIR}/lib/petsc/conf/variables > include ${PETSC_DIR}/lib/petsc/conf/rules > > egrid2d: $(OBJS) > > ${FLINKER} $(OBJS) -o egrid2d ${PETSC_LIB} > Move this above your includes > > # > # compile > # > main.o: > ${FLINKER} -c $(FFLAGS) main.f ${PETSC_LIB} > You should not need this rule. Thanks, Matt > # > # Common and Parameter Dependencies > # > > main.o: main.f par2d.f > egrid2d.o: egrid2d.f par2d.f > > ..................................................................................................................................................................................................................................... 
> > But I get the following error- > > > .............................................................................................................................................................................. > /home/maahi/petsc/arch-linux2-c-debug/bin/mpif90 -Wall > -ffree-line-length-0 -Wno-unused-dummy-argument -g > -I/home/maahi/petsc/include -I/home/maahi/petsc/arch-linux2-c-debug/include > -Ofast -fdefault-real-8 -o egrid2d > -Wl,-rpath,/home/maahi/petsc/arch-linux2-c-debug/lib > -L/home/maahi/petsc/arch-linux2-c-debug/lib > -Wl,-rpath,/home/maahi/petsc/arch-linux2-c-debug/lib > -L/home/maahi/petsc/arch-linux2-c-debug/lib > -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/7 > -L/usr/lib/gcc/x86_64-redhat-linux/7 -lpetsc -lflapack -lfblas -lm > -lpthread -lstdc++ -ldl -lmpifort -lmpi -lgfortran -lm -lgfortran -lm > -lgcc_s -lquadmath -lstdc++ -ldl > /*usr/lib/gcc/x86_64-redhat-linux/7/../../../../lib64/crt1.o: In function > `_start':* > *(.text+0x20): undefined reference to `main'* > collect2: error: ld returned 1 exit status > make: *** [makefile:18: egrid2d] Error 1 > > ........................................................................................................................................ > > Any idea how to fix it ? > > Thanks, > Maahi Talukder > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Wed Mar 13 19:44:46 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Thu, 14 Mar 2019 00:44:46 +0000 Subject: [petsc-users] Compiling Fortran Code In-Reply-To: References: Message-ID: check petsc makefile format - for ex: src/tao/unconstrained/examples/tutorials/makefile Also rename your fortran sources that have petsc calls from .f to .F On Wed, 13 Mar 2019, Matthew Knepley via petsc-users wrote: > On Wed, Mar 13, 2019 at 7:36 PM Maahi Talukder via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > Dear All, > > > > I am trying to compile a Fortran code. The make is as it follows- > > > > > > ............................................................................................................................................................................................................................ > > # Makefile for egrid2d > > > > OBJS = main.o egrid2d.o > > > > FFLAGS = -I/home/maahi/petsc/include > > -I/home/maahi/petsc/arch-linux2-c-debug/include -Ofast -fdefault-real-8 > > > > # > > # link > > # > > include ${PETSC_DIR}/lib/petsc/conf/variables > > include ${PETSC_DIR}/lib/petsc/conf/rules > > > > egrid2d: $(OBJS) > > > > ${FLINKER} $(OBJS) -o egrid2d ${PETSC_LIB} > > > > Move this above your includes > The location is fine. Can you change OBJS to a different name - say OBJ [or something else] and see if that works. Satish > > > > > # > > # compile > > # > > main.o: > > ${FLINKER} -c $(FFLAGS) main.f ${PETSC_LIB} > > > > You should not need this rule. > > Thanks, > > Matt > > > > # > > # Common and Parameter Dependencies > > # > > > > main.o: main.f par2d.f > > egrid2d.o: egrid2d.f par2d.f > > > > ..................................................................................................................................................................................................................................... 
> > > > But I get the following error- > > > > > > .............................................................................................................................................................................. > > /home/maahi/petsc/arch-linux2-c-debug/bin/mpif90 -Wall > > -ffree-line-length-0 -Wno-unused-dummy-argument -g > > -I/home/maahi/petsc/include -I/home/maahi/petsc/arch-linux2-c-debug/include > > -Ofast -fdefault-real-8 -o egrid2d > > -Wl,-rpath,/home/maahi/petsc/arch-linux2-c-debug/lib > > -L/home/maahi/petsc/arch-linux2-c-debug/lib > > -Wl,-rpath,/home/maahi/petsc/arch-linux2-c-debug/lib > > -L/home/maahi/petsc/arch-linux2-c-debug/lib > > -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/7 > > -L/usr/lib/gcc/x86_64-redhat-linux/7 -lpetsc -lflapack -lfblas -lm > > -lpthread -lstdc++ -ldl -lmpifort -lmpi -lgfortran -lm -lgfortran -lm > > -lgcc_s -lquadmath -lstdc++ -ldl > > /*usr/lib/gcc/x86_64-redhat-linux/7/../../../../lib64/crt1.o: In function > > `_start':* > > *(.text+0x20): undefined reference to `main'* > > collect2: error: ld returned 1 exit status > > make: *** [makefile:18: egrid2d] Error 1 > > > > ........................................................................................................................................ > > > > Any idea how to fix it ? > > > > Thanks, > > Maahi Talukder > > > > > > > > > > From mfadams at lbl.gov Wed Mar 13 19:59:51 2019 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 13 Mar 2019 20:59:51 -0400 Subject: [petsc-users] GAMG parallel convergence sensitivity In-Reply-To: References: Message-ID: > > > > Any thoughts here? Is there anything obviously wrong with my setup? > Fast and robust solvers for NS require specialized methods that are not provided in PETSc and the methods tend to require tighter integration with the meshing and discretization than the algebraic interface supports. I see you are using 20 smoothing steps. That is very high. Generally you want to use the v-cycle more (ie, lower number of smoothing steps and more iterations). And, full MG is a bit tricky. I would not use it, but if it helps, fine. > Any way to reduce the dependence of the convergence iterations on the > parallelism? > This comes from the bjacobi smoother. Use jacobi and you will not have a parallelism problem and you have bjacobi in the limit of parallelism. > -- obviously I expect the iteration count to be higher in parallel, but I > didn't expect such catastrophic failure. > > You are beyond what AMG is designed for. If you press this problem it will break any solver and will break generic AMG relatively early. This makes it hard to give much advice. You really just need to test things and use what works best. There are special purpose methods that you can implement in PETSc but that is a topic for a significant project. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mlohry at gmail.com Wed Mar 13 20:27:30 2019 From: mlohry at gmail.com (Mark Lohry) Date: Wed, 13 Mar 2019 21:27:30 -0400 Subject: [petsc-users] GAMG parallel convergence sensitivity In-Reply-To: References: Message-ID: Thanks Mark. This makes it hard to give much advice. You really just need to test things > and use what works best. > Yeah, arriving at the current setup was the result of a lot of rather aimless testing and trial and error. I see you are using 20 smoothing steps. That is very high. Generally you > want to use the v-cycle more (ie, lower number of smoothing steps and more > iterations). 
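In option terms, that advice maps to something like the sketch below (hedged: these are just the generic GAMG/mg_levels knobs set programmatically, not options taken from any run script in this thread, and ksp is assumed to be the outer solver):

  PetscOptionsSetValue(NULL,"-pc_type","gamg");
  PetscOptionsSetValue(NULL,"-mg_levels_ksp_type","chebyshev");
  PetscOptionsSetValue(NULL,"-mg_levels_ksp_max_it","2");    /* a couple of smoothing steps per level instead of 20 */
  PetscOptionsSetValue(NULL,"-mg_levels_pc_type","jacobi");  /* point Jacobi smoothing, so the smoother does not depend on the partitioning */
  KSPSetFromOptions(ksp);

The same thing is more commonly done directly on the command line with the corresponding -pc_type gamg -mg_levels_* options.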
> This was partly from seeing a lot of cases that needed far too many outer gmres iterations / orthogonalization directions, and trying to coerce AMG into doing more work per cycle. You are beyond what AMG is designed for. If you press this problem it will > break any solver and will break generic AMG relatively early. For what it's worth, I'm regularly solving much larger problems (1M-100M unknowns, unsteady) with this discretization and AMG setup on 500+ cores with impressively great convergence, dramatically better than ILU/ASM. This just happens to be the first time I've experimented with this extremely low Mach number, which is known to have a whole host of issues and generally needs low-mach preconditioners, I was just a bit surprised by this specific failure mechanism. Thanks for the point on jacobi v bjacobi. On Wed, Mar 13, 2019 at 9:00 PM Mark Adams wrote: > >> >> Any thoughts here? Is there anything obviously wrong with my setup? >> > > Fast and robust solvers for NS require specialized methods that are not > provided in PETSc and the methods tend to require tighter integration with > the meshing and discretization than the algebraic interface supports. > > I see you are using 20 smoothing steps. That is very high. Generally you > want to use the v-cycle more (ie, lower number of smoothing steps and more > iterations). > > And, full MG is a bit tricky. I would not use it, but if it helps, fine. > > >> Any way to reduce the dependence of the convergence iterations on the >> parallelism? >> > > This comes from the bjacobi smoother. Use jacobi and you will not have a > parallelism problem and you have bjacobi in the limit of parallelism. > > >> -- obviously I expect the iteration count to be higher in parallel, but I >> didn't expect such catastrophic failure. >> >> > You are beyond what AMG is designed for. If you press this problem it will > break any solver and will break generic AMG relatively early. > > This makes it hard to give much advice. You really just need to test > things and use what works best. There are special purpose methods that you > can implement in PETSc but that is a topic for a significant project. > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From maahi.buet at gmail.com Wed Mar 13 21:05:02 2019 From: maahi.buet at gmail.com (Maahi Talukder) Date: Wed, 13 Mar 2019 22:05:02 -0400 Subject: [petsc-users] Compiling Fortran Code In-Reply-To: References: Message-ID: Hi, Thank you all for your suggestions. I made the changes as suggested. But now I get the following error- ................................................................................................................................. 
[maahi at CB272PP-THINK1 egrid2d]$ make egrid2d /home/maahi/petsc/arch-linux2-c-debug/bin/mpif90 -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -g -I/home/maahi/petsc/include -I/home/maahi/petsc/arch-linux2-c-debug/include -Ofast -fdefault-real-8 -c -I/home/maahi/petsc/include -I/home/maahi/petsc/arch-linux2-c-debug/include -Ofast -fdefault-real-8 main.F -Wl,-rpath,/home/maahi/petsc/arch-linux2-c-debug/lib -L/home/maahi/petsc/arch-linux2-c-debug/lib -Wl,-rpath,/home/maahi/petsc/arch-linux2-c-debug/lib -L/home/maahi/petsc/arch-linux2-c-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/7 -L/usr/lib/gcc/x86_64-redhat-linux/7 -lpetsc -lflapack -lfblas -lm -lpthread -lstdc++ -ldl -lmpifort -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lstdc++ -ldl main.F:6:1: use petscksp 1 *Error: Non-numeric character in statement label at (1)* main.F:6:1: use petscksp 1 Error: Unclassifiable statement at (1) make: *** [makefile:28: main.o] Error 1 ............................................................................. Any idea how to fix that? Thanks, Maahi Talukder On Wed, Mar 13, 2019 at 8:44 PM Balay, Satish wrote: > check petsc makefile format - for ex: > src/tao/unconstrained/examples/tutorials/makefile > > Also rename your fortran sources that have petsc calls from .f to .F > > > On Wed, 13 Mar 2019, Matthew Knepley via petsc-users wrote: > > > On Wed, Mar 13, 2019 at 7:36 PM Maahi Talukder via petsc-users < > > petsc-users at mcs.anl.gov> wrote: > > > > > Dear All, > > > > > > I am trying to compile a Fortran code. The make is as it follows- > > > > > > > > > > ............................................................................................................................................................................................................................ > > > # Makefile for egrid2d > > > > > > OBJS = main.o egrid2d.o > > > > > > FFLAGS = -I/home/maahi/petsc/include > > > -I/home/maahi/petsc/arch-linux2-c-debug/include -Ofast -fdefault-real-8 > > > > > > # > > > # link > > > # > > > include ${PETSC_DIR}/lib/petsc/conf/variables > > > include ${PETSC_DIR}/lib/petsc/conf/rules > > > > > > egrid2d: $(OBJS) > > > > > > ${FLINKER} $(OBJS) -o egrid2d ${PETSC_LIB} > > > > > > > Move this above your includes > > > The location is fine. Can you change OBJS to a different name - say OBJ > [or something else] and see if that works. > > Satish > > > > > > > > > # > > > # compile > > > # > > > main.o: > > > ${FLINKER} -c $(FFLAGS) main.f ${PETSC_LIB} > > > > > > > You should not need this rule. > > > > Thanks, > > > > Matt > > > > > > > # > > > # Common and Parameter Dependencies > > > # > > > > > > main.o: main.f par2d.f > > > egrid2d.o: egrid2d.f par2d.f > > > > > > > ..................................................................................................................................................................................................................................... > > > > > > But I get the following error- > > > > > > > > > > .............................................................................................................................................................................. 
> > > /home/maahi/petsc/arch-linux2-c-debug/bin/mpif90 -Wall > > > -ffree-line-length-0 -Wno-unused-dummy-argument -g > > > -I/home/maahi/petsc/include > -I/home/maahi/petsc/arch-linux2-c-debug/include > > > -Ofast -fdefault-real-8 -o egrid2d > > > -Wl,-rpath,/home/maahi/petsc/arch-linux2-c-debug/lib > > > -L/home/maahi/petsc/arch-linux2-c-debug/lib > > > -Wl,-rpath,/home/maahi/petsc/arch-linux2-c-debug/lib > > > -L/home/maahi/petsc/arch-linux2-c-debug/lib > > > -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/7 > > > -L/usr/lib/gcc/x86_64-redhat-linux/7 -lpetsc -lflapack -lfblas -lm > > > -lpthread -lstdc++ -ldl -lmpifort -lmpi -lgfortran -lm -lgfortran -lm > > > -lgcc_s -lquadmath -lstdc++ -ldl > > > /*usr/lib/gcc/x86_64-redhat-linux/7/../../../../lib64/crt1.o: In > function > > > `_start':* > > > *(.text+0x20): undefined reference to `main'* > > > collect2: error: ld returned 1 exit status > > > make: *** [makefile:18: egrid2d] Error 1 > > > > > > > ........................................................................................................................................ > > > > > > Any idea how to fix it ? > > > > > > Thanks, > > > Maahi Talukder > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Mar 13 21:10:13 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Thu, 14 Mar 2019 02:10:13 +0000 Subject: [petsc-users] Compiling Fortran Code In-Reply-To: References: Message-ID: <6025BCB4-D068-4F9C-986F-7ABA5E54AEC7@anl.gov> Put the use petscksp starting in column 7 of the file > On Mar 13, 2019, at 9:05 PM, Maahi Talukder via petsc-users wrote: > > Hi, > > Thank you all for your suggestions. I made the changes as suggested. But now I get the following error- > ................................................................................................................................. > [maahi at CB272PP-THINK1 egrid2d]$ make egrid2d > /home/maahi/petsc/arch-linux2-c-debug/bin/mpif90 -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -g -I/home/maahi/petsc/include -I/home/maahi/petsc/arch-linux2-c-debug/include -Ofast -fdefault-real-8 -c -I/home/maahi/petsc/include -I/home/maahi/petsc/arch-linux2-c-debug/include -Ofast -fdefault-real-8 main.F -Wl,-rpath,/home/maahi/petsc/arch-linux2-c-debug/lib -L/home/maahi/petsc/arch-linux2-c-debug/lib -Wl,-rpath,/home/maahi/petsc/arch-linux2-c-debug/lib -L/home/maahi/petsc/arch-linux2-c-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/7 -L/usr/lib/gcc/x86_64-redhat-linux/7 -lpetsc -lflapack -lfblas -lm -lpthread -lstdc++ -ldl -lmpifort -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lstdc++ -ldl > main.F:6:1: > > use petscksp > 1 > Error: Non-numeric character in statement label at (1) > main.F:6:1: > > use petscksp > 1 > Error: Unclassifiable statement at (1) > make: *** [makefile:28: main.o] Error 1 > > ............................................................................. > Any idea how to fix that? > > Thanks, > Maahi Talukder > > > On Wed, Mar 13, 2019 at 8:44 PM Balay, Satish wrote: > check petsc makefile format - for ex: src/tao/unconstrained/examples/tutorials/makefile > > Also rename your fortran sources that have petsc calls from .f to .F > > > On Wed, 13 Mar 2019, Matthew Knepley via petsc-users wrote: > > > On Wed, Mar 13, 2019 at 7:36 PM Maahi Talukder via petsc-users < > > petsc-users at mcs.anl.gov> wrote: > > > > > Dear All, > > > > > > I am trying to compile a Fortran code. 
The make is as it follows- > > > > > > > > > ............................................................................................................................................................................................................................ > > > # Makefile for egrid2d > > > > > > OBJS = main.o egrid2d.o > > > > > > FFLAGS = -I/home/maahi/petsc/include > > > -I/home/maahi/petsc/arch-linux2-c-debug/include -Ofast -fdefault-real-8 > > > > > > # > > > # link > > > # > > > include ${PETSC_DIR}/lib/petsc/conf/variables > > > include ${PETSC_DIR}/lib/petsc/conf/rules > > > > > > egrid2d: $(OBJS) > > > > > > ${FLINKER} $(OBJS) -o egrid2d ${PETSC_LIB} > > > > > > > Move this above your includes > > > The location is fine. Can you change OBJS to a different name - say OBJ [or something else] and see if that works. > > Satish > > > > > > > > > # > > > # compile > > > # > > > main.o: > > > ${FLINKER} -c $(FFLAGS) main.f ${PETSC_LIB} > > > > > > > You should not need this rule. > > > > Thanks, > > > > Matt > > > > > > > # > > > # Common and Parameter Dependencies > > > # > > > > > > main.o: main.f par2d.f > > > egrid2d.o: egrid2d.f par2d.f > > > > > > ..................................................................................................................................................................................................................................... > > > > > > But I get the following error- > > > > > > > > > .............................................................................................................................................................................. > > > /home/maahi/petsc/arch-linux2-c-debug/bin/mpif90 -Wall > > > -ffree-line-length-0 -Wno-unused-dummy-argument -g > > > -I/home/maahi/petsc/include -I/home/maahi/petsc/arch-linux2-c-debug/include > > > -Ofast -fdefault-real-8 -o egrid2d > > > -Wl,-rpath,/home/maahi/petsc/arch-linux2-c-debug/lib > > > -L/home/maahi/petsc/arch-linux2-c-debug/lib > > > -Wl,-rpath,/home/maahi/petsc/arch-linux2-c-debug/lib > > > -L/home/maahi/petsc/arch-linux2-c-debug/lib > > > -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/7 > > > -L/usr/lib/gcc/x86_64-redhat-linux/7 -lpetsc -lflapack -lfblas -lm > > > -lpthread -lstdc++ -ldl -lmpifort -lmpi -lgfortran -lm -lgfortran -lm > > > -lgcc_s -lquadmath -lstdc++ -ldl > > > /*usr/lib/gcc/x86_64-redhat-linux/7/../../../../lib64/crt1.o: In function > > > `_start':* > > > *(.text+0x20): undefined reference to `main'* > > > collect2: error: ld returned 1 exit status > > > make: *** [makefile:18: egrid2d] Error 1 > > > > > > ........................................................................................................................................ > > > > > > Any idea how to fix it ? > > > > > > Thanks, > > > Maahi Talukder > > > > > > > > > > > > > > > > > From maahi.buet at gmail.com Wed Mar 13 21:13:40 2019 From: maahi.buet at gmail.com (Maahi Talukder) Date: Wed, 13 Mar 2019 22:13:40 -0400 Subject: [petsc-users] Compiling Fortran Code In-Reply-To: <6025BCB4-D068-4F9C-986F-7ABA5E54AEC7@anl.gov> References: <6025BCB4-D068-4F9C-986F-7ABA5E54AEC7@anl.gov> Message-ID: Thank you so much. Now it works! Thanks again! On Wed, Mar 13, 2019 at 10:10 PM Smith, Barry F. wrote: > > Put the use petscksp starting in column 7 of the file > > > > > On Mar 13, 2019, at 9:05 PM, Maahi Talukder via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > > Hi, > > > > Thank you all for your suggestions. I made the changes as suggested. 
> But now I get the following error- > > > ................................................................................................................................. > > [maahi at CB272PP-THINK1 egrid2d]$ make egrid2d > > /home/maahi/petsc/arch-linux2-c-debug/bin/mpif90 -Wall > -ffree-line-length-0 -Wno-unused-dummy-argument -g > -I/home/maahi/petsc/include -I/home/maahi/petsc/arch-linux2-c-debug/include > -Ofast -fdefault-real-8 -c -I/home/maahi/petsc/include > -I/home/maahi/petsc/arch-linux2-c-debug/include -Ofast -fdefault-real-8 > main.F -Wl,-rpath,/home/maahi/petsc/arch-linux2-c-debug/lib > -L/home/maahi/petsc/arch-linux2-c-debug/lib > -Wl,-rpath,/home/maahi/petsc/arch-linux2-c-debug/lib > -L/home/maahi/petsc/arch-linux2-c-debug/lib > -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/7 > -L/usr/lib/gcc/x86_64-redhat-linux/7 -lpetsc -lflapack -lfblas -lm > -lpthread -lstdc++ -ldl -lmpifort -lmpi -lgfortran -lm -lgfortran -lm > -lgcc_s -lquadmath -lstdc++ -ldl > > main.F:6:1: > > > > use petscksp > > 1 > > Error: Non-numeric character in statement label at (1) > > main.F:6:1: > > > > use petscksp > > 1 > > Error: Unclassifiable statement at (1) > > make: *** [makefile:28: main.o] Error 1 > > > > > ............................................................................. > > Any idea how to fix that? > > > > Thanks, > > Maahi Talukder > > > > > > On Wed, Mar 13, 2019 at 8:44 PM Balay, Satish wrote: > > check petsc makefile format - for ex: > src/tao/unconstrained/examples/tutorials/makefile > > > > Also rename your fortran sources that have petsc calls from .f to .F > > > > > > On Wed, 13 Mar 2019, Matthew Knepley via petsc-users wrote: > > > > > On Wed, Mar 13, 2019 at 7:36 PM Maahi Talukder via petsc-users < > > > petsc-users at mcs.anl.gov> wrote: > > > > > > > Dear All, > > > > > > > > I am trying to compile a Fortran code. The make is as it follows- > > > > > > > > > > > > > ............................................................................................................................................................................................................................ > > > > # Makefile for egrid2d > > > > > > > > OBJS = main.o egrid2d.o > > > > > > > > FFLAGS = -I/home/maahi/petsc/include > > > > -I/home/maahi/petsc/arch-linux2-c-debug/include -Ofast > -fdefault-real-8 > > > > > > > > # > > > > # link > > > > # > > > > include ${PETSC_DIR}/lib/petsc/conf/variables > > > > include ${PETSC_DIR}/lib/petsc/conf/rules > > > > > > > > egrid2d: $(OBJS) > > > > > > > > ${FLINKER} $(OBJS) -o egrid2d ${PETSC_LIB} > > > > > > > > > > Move this above your includes > > > > > The location is fine. Can you change OBJS to a different name - say OBJ > [or something else] and see if that works. > > > > Satish > > > > > > > > > > > > > # > > > > # compile > > > > # > > > > main.o: > > > > ${FLINKER} -c $(FFLAGS) main.f ${PETSC_LIB} > > > > > > > > > > You should not need this rule. > > > > > > Thanks, > > > > > > Matt > > > > > > > > > > # > > > > # Common and Parameter Dependencies > > > > # > > > > > > > > main.o: main.f par2d.f > > > > egrid2d.o: egrid2d.f par2d.f > > > > > > > > > ..................................................................................................................................................................................................................................... 
> > > > > > > > But I get the following error- > > > > > > > > > > > > > .............................................................................................................................................................................. > > > > /home/maahi/petsc/arch-linux2-c-debug/bin/mpif90 -Wall > > > > -ffree-line-length-0 -Wno-unused-dummy-argument -g > > > > -I/home/maahi/petsc/include > -I/home/maahi/petsc/arch-linux2-c-debug/include > > > > -Ofast -fdefault-real-8 -o egrid2d > > > > -Wl,-rpath,/home/maahi/petsc/arch-linux2-c-debug/lib > > > > -L/home/maahi/petsc/arch-linux2-c-debug/lib > > > > -Wl,-rpath,/home/maahi/petsc/arch-linux2-c-debug/lib > > > > -L/home/maahi/petsc/arch-linux2-c-debug/lib > > > > -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/7 > > > > -L/usr/lib/gcc/x86_64-redhat-linux/7 -lpetsc -lflapack -lfblas -lm > > > > -lpthread -lstdc++ -ldl -lmpifort -lmpi -lgfortran -lm -lgfortran -lm > > > > -lgcc_s -lquadmath -lstdc++ -ldl > > > > /*usr/lib/gcc/x86_64-redhat-linux/7/../../../../lib64/crt1.o: In > function > > > > `_start':* > > > > *(.text+0x20): undefined reference to `main'* > > > > collect2: error: ld returned 1 exit status > > > > make: *** [makefile:18: egrid2d] Error 1 > > > > > > > > > ........................................................................................................................................ > > > > > > > > Any idea how to fix it ? > > > > > > > > Thanks, > > > > Maahi Talukder > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Mar 13 22:29:43 2019 From: jed at jedbrown.org (Jed Brown) Date: Wed, 13 Mar 2019 21:29:43 -0600 Subject: [petsc-users] GAMG parallel convergence sensitivity In-Reply-To: References: Message-ID: <874l86b0bc.fsf@jedbrown.org> Mark Lohry via petsc-users writes: > For what it's worth, I'm regularly solving much larger problems (1M-100M > unknowns, unsteady) with this discretization and AMG setup on 500+ cores > with impressively great convergence, dramatically better than ILU/ASM. This > just happens to be the first time I've experimented with this extremely low > Mach number, which is known to have a whole host of issues and generally > needs low-mach preconditioners, I was just a bit surprised by this specific > failure mechanism. A common technique for low-Mach preconditioning is to convert to primitive variables (much better conditioned for the solve) and use a Schur fieldsplit into the pressure space. For modest time step, you can use SIMPLE-like method ("selfp" in PCFieldSplit lingo) to approximate that Schur complement. You can also rediscretize to form that approximation. This paper has a bunch of examples of choices for the state variables and derivation of the continuous pressure preconditioner each case. (They present it as a classical semi-implicit method, but that would be the Schur complement preconditioner if using FieldSplit with a fully implicit or IMEX method.) https://doi.org/10.1137/090775889 From bsmith at mcs.anl.gov Thu Mar 14 00:05:06 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) 
Date: Thu, 14 Mar 2019 05:05:06 +0000 Subject: [petsc-users] Issue when passing DMDA array on to Paraview Catalyst In-Reply-To: References: <136c8d5d-bb94-c5c0-2380-8d4f3a84be00@tu-dresden.de> Message-ID: <5BC156E1-D295-411B-AD32-788D57F1C7F5@anl.gov> > On Mar 13, 2019, at 11:28 AM, Matthew Knepley via petsc-users wrote: > > On Wed, Mar 13, 2019 at 12:16 PM Bastian L?hrer via petsc-users wrote: > Dear PETSc users, > > I am having difficulties passing PETSc data on to Paraview Catalyst and > it may be related to the way we handle the PETSs data in our Fortran code. > > We have DMDA objects, which we pass on to subroutines this way: > > > ... > > call DMCreateLocalVector(da1dof, loc_p, ierr) > > ... > > call VecGetArray(loc_p, loc_p_v, loc_p_i, ierr) > > call process( loc_p_v(loc_p_i+1) ) > > ... > > > > Inside the subroutine (process in this example) we treat the > subroutine's argument as if it were an ordinary Fortran array: > > > subroutine process( p ) > > > > use gridinfo ! provides gis, gie, ... etc. > > > > implicit none > > > > #include "petsc_include.h" > > > > PetscScalar, dimension(gis:gie,gjs:gje,gks:gke) :: p > > PetscInt i,j,k > > > > do k = gks, gke > > do j = gjs, gje > > do i = gis, gie > > > > p(i,j,k) = ... > > > > enddo > > enddo > > enddo > > > > end subroutine process > > > I find this procedure a little quirky, but it has been working > flawlessly for years. > > However, I am now encountering difficulties when passing this > variable/array p on to a Paraview Catalyst adaptor subroutine. Doing so > I end up with very strange values there. When replacing p with an > ordinary local Fortran array everything is fine. > > I can't think of a reason it would not work. I would look at the pointer you get inside > the Catalyst function using the debugger. > > Note that you can also get an F90 array out if that is what Catalyst needs. VecGetArrayF90() or DMDAVecGetArrayF90() > > Thanks, > > Matt > > Bastian > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From mail2amneet at gmail.com Thu Mar 14 01:38:45 2019 From: mail2amneet at gmail.com (Amneet Bhalla) Date: Wed, 13 Mar 2019 23:38:45 -0700 Subject: [petsc-users] Cross-compilation cluster Message-ID: Hi Folks, I am on a cluster that has -L/lib dir with 32-bit libraries and -L/lib64 with 64-bit libraries. During compilation of some of libraries required for my code (such as SAMRAI and libMesh) both paths get picked -L/lib and -L/lib64. I am seeing some sporadic behavior in runtime when at some timesteps PETSc does not converge. The same code with the same number of processors run just fine on my workstation that has just 64-bit version of libraries. 
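To help localize the sporadic failures, a minimal sketch (hypothetical, not what the code currently does; ksp, b and x are the solver and vectors in question) of making any non-convergence explicit at every solve:

  PetscErrorCode     ierr;
  KSPConvergedReason reason;

  ierr = KSPSetErrorIfNotConverged(ksp,PETSC_TRUE);CHKERRQ(ierr);  /* turn a silent failure into a hard error */
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
  ierr = KSPGetConvergedReason(ksp,&reason);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD,"KSP converged reason: %s\n",KSPConvergedReasons[reason]);CHKERRQ(ierr);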
Even during the final linking stage of the executable, the linker gives warnings like ld: skipping incompatible //lib/libm.so when searching for -lm ld: skipping incompatible /lib/libm.so when searching for -lm ld: skipping incompatible /lib/libm.so when searching for -lm ld: skipping incompatible //lib/libpthread.so when searching for -lpthread ld: skipping incompatible /lib/libpthread.so when searching for -lpthread ld: skipping incompatible /lib/libpthread.so when searching for -lpthread ld: skipping incompatible //lib/libdl.so when searching for -ldl ld: skipping incompatible //lib/libc.so when searching for -lc ld: skipping incompatible /lib/libc.so when searching for -lc ld: skipping incompatible /lib/libc.so when searching for -lc but the executable runs. This is during config of SAMRAI when it picks both -L/lib and -L/lib64: checking whether we are using the GNU Fortran 77 compiler... no checking whether ifort accepts -g... yes checking how to get verbose linking output from ifort... -v checking for Fortran 77 libraries of ifort... -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/ -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/ -L/lib/../lib64 -L/lib/../lib64/ -L/usr/lib/../lib64 -L/usr/lib/../lib64/ -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4/ -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/ -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64/ -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64/ -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../ -L/lib64 -L/lib/ -L/usr/lib64 -L/usr/lib -lifport -lifcoremt -limf -lsvml -lm -lipgo -lirc -lpthread -lgcc -lgcc_s -lirc_s -ldl libMesh is also picking that path libmesh_optional_LIBS............ 
: -lhdf5 -lhdf5_cpp -lz -L/home/asbhalla/softwares/PETSc-BitBucket/PETSc/linux-opt/lib -Wl,-rpath,/home/asbhalla/softwares/PETSc-BitBucket/PETSc/linux-opt/lib -Wl,-rpath,/opt/intel/mkl/lib/intel64 -L/opt/intel/mkl/lib/intel64 -Wl,-rpath,/opt/mellanox/hcoll/lib -L/opt/mellanox/hcoll/lib -Wl,-rpath,/opt/mellanox/mxm/lib -L/opt/mellanox/mxm/lib -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin -lpetsc -lHYPRE -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lifport -lifcoremt_pic -limf -lsvml -lm -lipgo -lirc -lpthread -lgcc_s -lirc_s -lstdc++ -ldl -L/lib -Wl,-rpath,/lib -Wl,-rpath,/usr/local/mpi/intel/openmpi-4.0.0/lib64 -L/usr/local/mpi/intel/openmpi-4.0.0/lib64 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.5 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5 Perhaps PETSc also picks up both versions (and there is a way to query it from PETSc?), but I can't confirm this. Is there a way to instruct make to select only -L/lib64? I want to rule out that 32-bit dynamic library is not a culprit for the random non-convergence of PETSc solvers and the eventual crash of the simulations. I have tried both gcc-7.3.0 and intel-18 compilers -- but the same thing is happening. -- --Amneet -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Mar 14 06:21:00 2019 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 14 Mar 2019 07:21:00 -0400 Subject: [petsc-users] Cross-compilation cluster In-Reply-To: References: Message-ID: In order to see why each flag was included, we need to see configure.log. Thanks, Matt On Thu, Mar 14, 2019 at 2:40 AM Amneet Bhalla via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi Folks, > > I am on a cluster that has -L/lib dir with 32-bit libraries and -L/lib64 > with 64-bit libraries. During compilation of some of libraries required for > my code (such as SAMRAI and libMesh) both paths > get picked -L/lib and -L/lib64. > > I am seeing some sporadic behavior in runtime when at some timesteps PETSc > does not converge. The same code with the same number of processors run > just fine on my workstation that has just 64-bit version of libraries. 
> > Even during the final linking stage of the executable, the linker gives > warnings like > > ld: skipping incompatible //lib/libm.so when searching for -lm > > ld: skipping incompatible /lib/libm.so when searching for -lm > > ld: skipping incompatible /lib/libm.so when searching for -lm > > ld: skipping incompatible //lib/libpthread.so when searching for -lpthread > > ld: skipping incompatible /lib/libpthread.so when searching for -lpthread > > ld: skipping incompatible /lib/libpthread.so when searching for -lpthread > > ld: skipping incompatible //lib/libdl.so when searching for -ldl > > ld: skipping incompatible //lib/libc.so when searching for -lc > > ld: skipping incompatible /lib/libc.so when searching for -lc > > ld: skipping incompatible /lib/libc.so when searching for -lc > but the executable runs. > > > This is during config of SAMRAI when it picks both -L/lib and -L/lib64: > > checking whether we are using the GNU Fortran 77 compiler... no > > checking whether ifort accepts -g... yes > > checking how to get verbose linking output from ifort... -v > > checking for Fortran 77 libraries of ifort... > -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 > -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 > -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 > -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 > -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin > -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/ > -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64 > -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/ -L/lib/../lib64 > -L/lib/../lib64/ -L/usr/lib/../lib64 -L/usr/lib/../lib64/ > -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4/ > -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/ > -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64/ > -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64/ > -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../ -L/lib64 -L/lib/ > -L/usr/lib64 -L/usr/lib -lifport -lifcoremt -limf -lsvml -lm -lipgo -lirc > -lpthread -lgcc -lgcc_s -lirc_s -ldl > > libMesh is also picking that path > > libmesh_optional_LIBS............ 
: -lhdf5 -lhdf5_cpp -lz > -L/home/asbhalla/softwares/PETSc-BitBucket/PETSc/linux-opt/lib > -Wl,-rpath,/home/asbhalla/softwares/PETSc-BitBucket/PETSc/linux-opt/lib > -Wl,-rpath,/opt/intel/mkl/lib/intel64 -L/opt/intel/mkl/lib/intel64 > -Wl,-rpath,/opt/mellanox/hcoll/lib -L/opt/mellanox/hcoll/lib > -Wl,-rpath,/opt/mellanox/mxm/lib -L/opt/mellanox/mxm/lib > -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 > -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 > -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 > -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 > -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 > -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 > -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 > -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 > -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin > -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin > -lpetsc -lHYPRE -lmkl_intel_lp64 -lmkl_sequential -lmkl_core > -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lifport > -lifcoremt_pic -limf -lsvml -lm -lipgo -lirc -lpthread -lgcc_s -lirc_s > -lstdc++ -ldl -L/lib -Wl,-rpath,/lib > -Wl,-rpath,/usr/local/mpi/intel/openmpi-4.0.0/lib64 > -L/usr/local/mpi/intel/openmpi-4.0.0/lib64 > -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.5 > -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5 > > Perhaps PETSc also picks up both versions (and there is a way to query it > from PETSc?), but I can't confirm this. Is there a way to instruct make to > select only -L/lib64? I want to rule out that 32-bit dynamic library is not > a culprit for the random non-convergence of PETSc solvers and the eventual > crash of the simulations. I have tried both gcc-7.3.0 and intel-18 > compilers -- but the same thing is happening. > > > -- > --Amneet > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Mar 14 07:36:07 2019 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 14 Mar 2019 08:36:07 -0400 Subject: [petsc-users] Cross-compilation cluster In-Reply-To: References: Message-ID: On Thu, Mar 14, 2019 at 8:28 AM Amneet Bhalla wrote: > Matt -- SAMRAI, PETSc, and libMesh configure logs are attached with this > email. Also including some other log files in case they are useful. > Okay, PETSc is not sticking in a /usr/lib (or /usr/lib64). However, I can see that you mpif90 (and perhaps other of the wrappers) are reporting both /usr/lib64 AND /usr/lib flags, and I am guessing that is where the other configures are picking it up. Thanks, Matt > Thanks, > --Amneet > > On Thu, Mar 14, 2019 at 4:21 AM Matthew Knepley wrote: > >> In order to see why each flag was included, we need to see configure.log. >> >> Thanks, >> >> Matt >> >> On Thu, Mar 14, 2019 at 2:40 AM Amneet Bhalla via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> Hi Folks, >>> >>> I am on a cluster that has -L/lib dir with 32-bit libraries and -L/lib64 >>> with 64-bit libraries. 
During compilation of some of libraries required for >>> my code (such as SAMRAI and libMesh) both paths >>> get picked -L/lib and -L/lib64. >>> >>> I am seeing some sporadic behavior in runtime when at some timesteps >>> PETSc does not converge. The same code with the same number of processors >>> run just fine on my workstation that has just 64-bit version of libraries. >>> >>> Even during the final linking stage of the executable, the linker gives >>> warnings like >>> >>> ld: skipping incompatible //lib/libm.so when searching for -lm >>> >>> ld: skipping incompatible /lib/libm.so when searching for -lm >>> >>> ld: skipping incompatible /lib/libm.so when searching for -lm >>> >>> ld: skipping incompatible //lib/libpthread.so when searching for >>> -lpthread >>> >>> ld: skipping incompatible /lib/libpthread.so when searching for -lpthread >>> >>> ld: skipping incompatible /lib/libpthread.so when searching for -lpthread >>> >>> ld: skipping incompatible //lib/libdl.so when searching for -ldl >>> >>> ld: skipping incompatible //lib/libc.so when searching for -lc >>> >>> ld: skipping incompatible /lib/libc.so when searching for -lc >>> >>> ld: skipping incompatible /lib/libc.so when searching for -lc >>> but the executable runs. >>> >>> >>> This is during config of SAMRAI when it picks both -L/lib and -L/lib64: >>> >>> checking whether we are using the GNU Fortran 77 compiler... no >>> >>> checking whether ifort accepts -g... yes >>> >>> checking how to get verbose linking output from ifort... -v >>> >>> checking for Fortran 77 libraries of ifort... >>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/ >>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64 >>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/ -L/lib/../lib64 >>> -L/lib/../lib64/ -L/usr/lib/../lib64 -L/usr/lib/../lib64/ >>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4/ >>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/ >>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64/ >>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64/ >>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../ -L/lib64 -L/lib/ >>> -L/usr/lib64 -L/usr/lib -lifport -lifcoremt -limf -lsvml -lm -lipgo -lirc >>> -lpthread -lgcc -lgcc_s -lirc_s -ldl >>> >>> libMesh is also picking that path >>> >>> libmesh_optional_LIBS............ 
: -lhdf5 -lhdf5_cpp -lz >>> -L/home/asbhalla/softwares/PETSc-BitBucket/PETSc/linux-opt/lib >>> -Wl,-rpath,/home/asbhalla/softwares/PETSc-BitBucket/PETSc/linux-opt/lib >>> -Wl,-rpath,/opt/intel/mkl/lib/intel64 -L/opt/intel/mkl/lib/intel64 >>> -Wl,-rpath,/opt/mellanox/hcoll/lib -L/opt/mellanox/hcoll/lib >>> -Wl,-rpath,/opt/mellanox/mxm/lib -L/opt/mellanox/mxm/lib >>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >>> -lpetsc -lHYPRE -lmkl_intel_lp64 -lmkl_sequential -lmkl_core >>> -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lifport >>> -lifcoremt_pic -limf -lsvml -lm -lipgo -lirc -lpthread -lgcc_s -lirc_s >>> -lstdc++ -ldl -L/lib -Wl,-rpath,/lib >>> -Wl,-rpath,/usr/local/mpi/intel/openmpi-4.0.0/lib64 >>> -L/usr/local/mpi/intel/openmpi-4.0.0/lib64 >>> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.5 >>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5 >>> >>> Perhaps PETSc also picks up both versions (and there is a way to query >>> it from PETSc?), but I can't confirm this. Is there a way to instruct make >>> to select only -L/lib64? I want to rule out that 32-bit dynamic library is >>> not a culprit for the random non-convergence of PETSc solvers and the >>> eventual crash of the simulations. I have tried both gcc-7.3.0 and intel-18 >>> compilers -- but the same thing is happening. >>> >>> >>> -- >>> --Amneet >>> >>> >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > > > -- > --Amneet > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mail2amneet at gmail.com Thu Mar 14 07:46:33 2019 From: mail2amneet at gmail.com (Amneet Bhalla) Date: Thu, 14 Mar 2019 05:46:33 -0700 Subject: [petsc-users] Cross-compilation cluster In-Reply-To: References: Message-ID: Ah, Ok. Do serial compilers look OK to you? Can lib-32 and lib-64 (say -lm) operate simulataneously during runtime, or this is my imagination? On Thu, Mar 14, 2019 at 5:36 AM Matthew Knepley wrote: > On Thu, Mar 14, 2019 at 8:28 AM Amneet Bhalla > wrote: > >> Matt -- SAMRAI, PETSc, and libMesh configure logs are attached with this >> email. Also including some other log files in case they are useful. >> > > Okay, PETSc is not sticking in a /usr/lib (or /usr/lib64). 
However, I can > see that you mpif90 (and perhaps other of the wrappers) are > reporting both /usr/lib64 AND /usr/lib flags, and I am guessing that is > where the other configures are picking it up. > > Thanks, > > Matt > > >> Thanks, >> --Amneet >> >> On Thu, Mar 14, 2019 at 4:21 AM Matthew Knepley >> wrote: >> >>> In order to see why each flag was included, we need to see configure.log. >>> >>> Thanks, >>> >>> Matt >>> >>> On Thu, Mar 14, 2019 at 2:40 AM Amneet Bhalla via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> >>>> Hi Folks, >>>> >>>> I am on a cluster that has -L/lib dir with 32-bit libraries and >>>> -L/lib64 with 64-bit libraries. During compilation of some of >>>> libraries required for my code (such as SAMRAI and libMesh) both paths >>>> get picked -L/lib and -L/lib64. >>>> >>>> I am seeing some sporadic behavior in runtime when at some timesteps >>>> PETSc does not converge. The same code with the same number of processors >>>> run just fine on my workstation that has just 64-bit version of libraries. >>>> >>>> Even during the final linking stage of the executable, the linker gives >>>> warnings like >>>> >>>> ld: skipping incompatible //lib/libm.so when searching for -lm >>>> >>>> ld: skipping incompatible /lib/libm.so when searching for -lm >>>> >>>> ld: skipping incompatible /lib/libm.so when searching for -lm >>>> >>>> ld: skipping incompatible //lib/libpthread.so when searching for >>>> -lpthread >>>> >>>> ld: skipping incompatible /lib/libpthread.so when searching for >>>> -lpthread >>>> >>>> ld: skipping incompatible /lib/libpthread.so when searching for >>>> -lpthread >>>> >>>> ld: skipping incompatible //lib/libdl.so when searching for -ldl >>>> >>>> ld: skipping incompatible //lib/libc.so when searching for -lc >>>> >>>> ld: skipping incompatible /lib/libc.so when searching for -lc >>>> >>>> ld: skipping incompatible /lib/libc.so when searching for -lc >>>> but the executable runs. >>>> >>>> >>>> This is during config of SAMRAI when it picks both -L/lib and -L/lib64: >>>> >>>> checking whether we are using the GNU Fortran 77 compiler... no >>>> >>>> checking whether ifort accepts -g... yes >>>> >>>> checking how to get verbose linking output from ifort... -v >>>> >>>> checking for Fortran 77 libraries of ifort... >>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/ >>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64 >>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/ -L/lib/../lib64 >>>> -L/lib/../lib64/ -L/usr/lib/../lib64 -L/usr/lib/../lib64/ >>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4/ >>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/ >>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64/ >>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64/ >>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../ -L/lib64 -L/lib/ >>>> -L/usr/lib64 -L/usr/lib -lifport -lifcoremt -limf -lsvml -lm -lipgo -lirc >>>> -lpthread -lgcc -lgcc_s -lirc_s -ldl >>>> >>>> libMesh is also picking that path >>>> >>>> libmesh_optional_LIBS............ 
: -lhdf5 -lhdf5_cpp -lz >>>> -L/home/asbhalla/softwares/PETSc-BitBucket/PETSc/linux-opt/lib >>>> -Wl,-rpath,/home/asbhalla/softwares/PETSc-BitBucket/PETSc/linux-opt/lib >>>> -Wl,-rpath,/opt/intel/mkl/lib/intel64 -L/opt/intel/mkl/lib/intel64 >>>> -Wl,-rpath,/opt/mellanox/hcoll/lib -L/opt/mellanox/hcoll/lib >>>> -Wl,-rpath,/opt/mellanox/mxm/lib -L/opt/mellanox/mxm/lib >>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >>>> -lpetsc -lHYPRE -lmkl_intel_lp64 -lmkl_sequential -lmkl_core >>>> -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lifport >>>> -lifcoremt_pic -limf -lsvml -lm -lipgo -lirc -lpthread -lgcc_s -lirc_s >>>> -lstdc++ -ldl -L/lib -Wl,-rpath,/lib >>>> -Wl,-rpath,/usr/local/mpi/intel/openmpi-4.0.0/lib64 >>>> -L/usr/local/mpi/intel/openmpi-4.0.0/lib64 >>>> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.5 >>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5 >>>> >>>> Perhaps PETSc also picks up both versions (and there is a way to query >>>> it from PETSc?), but I can't confirm this. Is there a way to instruct make >>>> to select only -L/lib64? I want to rule out that 32-bit dynamic library is >>>> not a culprit for the random non-convergence of PETSc solvers and the >>>> eventual crash of the simulations. I have tried both gcc-7.3.0 and intel-18 >>>> compilers -- but the same thing is happening. >>>> >>>> >>>> -- >>>> --Amneet >>>> >>>> >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> >> >> -- >> --Amneet >> >> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- --Amneet -------------- next part -------------- An HTML attachment was scrubbed... URL: From mail2amneet at gmail.com Thu Mar 14 07:47:57 2019 From: mail2amneet at gmail.com (Amneet Bhalla) Date: Thu, 14 Mar 2019 05:47:57 -0700 Subject: [petsc-users] Cross-compilation cluster In-Reply-To: References: Message-ID: Silo and HDF5 are configured with serial version of the compiler. On Thu, Mar 14, 2019 at 5:46 AM Amneet Bhalla wrote: > Ah, Ok. Do serial compilers look OK to you? > > Can lib-32 and lib-64 (say -lm) operate simulataneously during runtime, or > this is my imagination? > > > > On Thu, Mar 14, 2019 at 5:36 AM Matthew Knepley wrote: > >> On Thu, Mar 14, 2019 at 8:28 AM Amneet Bhalla >> wrote: >> >>> Matt -- SAMRAI, PETSc, and libMesh configure logs are attached with this >>> email. 
Also including some other log files in case they are useful. >>> >> >> Okay, PETSc is not sticking in a /usr/lib (or /usr/lib64). However, I can >> see that you mpif90 (and perhaps other of the wrappers) are >> reporting both /usr/lib64 AND /usr/lib flags, and I am guessing that is >> where the other configures are picking it up. >> >> Thanks, >> >> Matt >> >> >>> Thanks, >>> --Amneet >>> >>> On Thu, Mar 14, 2019 at 4:21 AM Matthew Knepley >>> wrote: >>> >>>> In order to see why each flag was included, we need to see >>>> configure.log. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> On Thu, Mar 14, 2019 at 2:40 AM Amneet Bhalla via petsc-users < >>>> petsc-users at mcs.anl.gov> wrote: >>>> >>>>> Hi Folks, >>>>> >>>>> I am on a cluster that has -L/lib dir with 32-bit libraries and >>>>> -L/lib64 with 64-bit libraries. During compilation of some of >>>>> libraries required for my code (such as SAMRAI and libMesh) both paths >>>>> get picked -L/lib and -L/lib64. >>>>> >>>>> I am seeing some sporadic behavior in runtime when at some timesteps >>>>> PETSc does not converge. The same code with the same number of processors >>>>> run just fine on my workstation that has just 64-bit version of libraries. >>>>> >>>>> Even during the final linking stage of the executable, the linker >>>>> gives warnings like >>>>> >>>>> ld: skipping incompatible //lib/libm.so when searching for -lm >>>>> >>>>> ld: skipping incompatible /lib/libm.so when searching for -lm >>>>> >>>>> ld: skipping incompatible /lib/libm.so when searching for -lm >>>>> >>>>> ld: skipping incompatible //lib/libpthread.so when searching for >>>>> -lpthread >>>>> >>>>> ld: skipping incompatible /lib/libpthread.so when searching for >>>>> -lpthread >>>>> >>>>> ld: skipping incompatible /lib/libpthread.so when searching for >>>>> -lpthread >>>>> >>>>> ld: skipping incompatible //lib/libdl.so when searching for -ldl >>>>> >>>>> ld: skipping incompatible //lib/libc.so when searching for -lc >>>>> >>>>> ld: skipping incompatible /lib/libc.so when searching for -lc >>>>> >>>>> ld: skipping incompatible /lib/libc.so when searching for -lc >>>>> but the executable runs. >>>>> >>>>> >>>>> This is during config of SAMRAI when it picks both -L/lib and -L/lib64: >>>>> >>>>> checking whether we are using the GNU Fortran 77 compiler... no >>>>> >>>>> checking whether ifort accepts -g... yes >>>>> >>>>> checking how to get verbose linking output from ifort... -v >>>>> >>>>> checking for Fortran 77 libraries of ifort... 
>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/ >>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64 >>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/ -L/lib/../lib64 >>>>> -L/lib/../lib64/ -L/usr/lib/../lib64 -L/usr/lib/../lib64/ >>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4/ >>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/ >>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64/ >>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64/ >>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../ -L/lib64 -L/lib/ >>>>> -L/usr/lib64 -L/usr/lib -lifport -lifcoremt -limf -lsvml -lm -lipgo -lirc >>>>> -lpthread -lgcc -lgcc_s -lirc_s -ldl >>>>> >>>>> libMesh is also picking that path >>>>> >>>>> libmesh_optional_LIBS............ : -lhdf5 -lhdf5_cpp -lz >>>>> -L/home/asbhalla/softwares/PETSc-BitBucket/PETSc/linux-opt/lib >>>>> -Wl,-rpath,/home/asbhalla/softwares/PETSc-BitBucket/PETSc/linux-opt/lib >>>>> -Wl,-rpath,/opt/intel/mkl/lib/intel64 -L/opt/intel/mkl/lib/intel64 >>>>> -Wl,-rpath,/opt/mellanox/hcoll/lib -L/opt/mellanox/hcoll/lib >>>>> -Wl,-rpath,/opt/mellanox/mxm/lib -L/opt/mellanox/mxm/lib >>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >>>>> -lpetsc -lHYPRE -lmkl_intel_lp64 -lmkl_sequential -lmkl_core >>>>> -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lifport >>>>> -lifcoremt_pic -limf -lsvml -lm -lipgo -lirc -lpthread -lgcc_s -lirc_s >>>>> -lstdc++ -ldl -L/lib -Wl,-rpath,/lib >>>>> -Wl,-rpath,/usr/local/mpi/intel/openmpi-4.0.0/lib64 >>>>> -L/usr/local/mpi/intel/openmpi-4.0.0/lib64 >>>>> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.5 >>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5 >>>>> >>>>> Perhaps PETSc also picks up both versions (and there is a way to query >>>>> it from PETSc?), but I can't confirm this. Is there a way to instruct make >>>>> to select only -L/lib64? I want to rule out that 32-bit dynamic library is >>>>> not a culprit for the random non-convergence of PETSc solvers and the >>>>> eventual crash of the simulations. I have tried both gcc-7.3.0 and intel-18 >>>>> compilers -- but the same thing is happening. 
>>>>> >>>>> >>>>> -- >>>>> --Amneet >>>>> >>>>> >>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >>> >>> -- >>> --Amneet >>> >>> >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- > --Amneet > > > > -- --Amneet -------------- next part -------------- An HTML attachment was scrubbed... URL: From mail2amneet at gmail.com Thu Mar 14 07:28:15 2019 From: mail2amneet at gmail.com (Amneet Bhalla) Date: Thu, 14 Mar 2019 05:28:15 -0700 Subject: [petsc-users] Cross-compilation cluster In-Reply-To: References: Message-ID: Matt -- SAMRAI, PETSc, and libMesh configure logs are attached with this email. Also including some other log files in case they are useful. Thanks, --Amneet On Thu, Mar 14, 2019 at 4:21 AM Matthew Knepley wrote: > In order to see why each flag was included, we need to see configure.log. > > Thanks, > > Matt > > On Thu, Mar 14, 2019 at 2:40 AM Amneet Bhalla via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Hi Folks, >> >> I am on a cluster that has -L/lib dir with 32-bit libraries and -L/lib64 >> with 64-bit libraries. During compilation of some of libraries required for >> my code (such as SAMRAI and libMesh) both paths >> get picked -L/lib and -L/lib64. >> >> I am seeing some sporadic behavior in runtime when at some timesteps >> PETSc does not converge. The same code with the same number of processors >> run just fine on my workstation that has just 64-bit version of libraries. >> >> Even during the final linking stage of the executable, the linker gives >> warnings like >> >> ld: skipping incompatible //lib/libm.so when searching for -lm >> >> ld: skipping incompatible /lib/libm.so when searching for -lm >> >> ld: skipping incompatible /lib/libm.so when searching for -lm >> >> ld: skipping incompatible //lib/libpthread.so when searching for -lpthread >> >> ld: skipping incompatible /lib/libpthread.so when searching for -lpthread >> >> ld: skipping incompatible /lib/libpthread.so when searching for -lpthread >> >> ld: skipping incompatible //lib/libdl.so when searching for -ldl >> >> ld: skipping incompatible //lib/libc.so when searching for -lc >> >> ld: skipping incompatible /lib/libc.so when searching for -lc >> >> ld: skipping incompatible /lib/libc.so when searching for -lc >> but the executable runs. >> >> >> This is during config of SAMRAI when it picks both -L/lib and -L/lib64: >> >> checking whether we are using the GNU Fortran 77 compiler... no >> >> checking whether ifort accepts -g... yes >> >> checking how to get verbose linking output from ifort... -v >> >> checking for Fortran 77 libraries of ifort... 
>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/ >> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64 >> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/ -L/lib/../lib64 >> -L/lib/../lib64/ -L/usr/lib/../lib64 -L/usr/lib/../lib64/ >> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4/ >> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/ >> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64/ >> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64/ >> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../ -L/lib64 -L/lib/ >> -L/usr/lib64 -L/usr/lib -lifport -lifcoremt -limf -lsvml -lm -lipgo -lirc >> -lpthread -lgcc -lgcc_s -lirc_s -ldl >> >> libMesh is also picking that path >> >> libmesh_optional_LIBS............ : -lhdf5 -lhdf5_cpp -lz >> -L/home/asbhalla/softwares/PETSc-BitBucket/PETSc/linux-opt/lib >> -Wl,-rpath,/home/asbhalla/softwares/PETSc-BitBucket/PETSc/linux-opt/lib >> -Wl,-rpath,/opt/intel/mkl/lib/intel64 -L/opt/intel/mkl/lib/intel64 >> -Wl,-rpath,/opt/mellanox/hcoll/lib -L/opt/mellanox/hcoll/lib >> -Wl,-rpath,/opt/mellanox/mxm/lib -L/opt/mellanox/mxm/lib >> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >> -lpetsc -lHYPRE -lmkl_intel_lp64 -lmkl_sequential -lmkl_core >> -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lifport >> -lifcoremt_pic -limf -lsvml -lm -lipgo -lirc -lpthread -lgcc_s -lirc_s >> -lstdc++ -ldl -L/lib -Wl,-rpath,/lib >> -Wl,-rpath,/usr/local/mpi/intel/openmpi-4.0.0/lib64 >> -L/usr/local/mpi/intel/openmpi-4.0.0/lib64 >> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.5 >> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5 >> >> Perhaps PETSc also picks up both versions (and there is a way to query it >> from PETSc?), but I can't confirm this. Is there a way to instruct make to >> select only -L/lib64? I want to rule out that 32-bit dynamic library is not >> a culprit for the random non-convergence of PETSc solvers and the eventual >> crash of the simulations. I have tried both gcc-7.3.0 and intel-18 >> compilers -- but the same thing is happening. >> >> >> -- >> --Amneet >> >> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- --Amneet -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: libMesh_config.log Type: application/octet-stream Size: 248696 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: SAMRAI_config.log Type: application/octet-stream Size: 202853 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: PETSc_config.log Type: application/octet-stream Size: 10074128 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: HDF5_config.log Type: application/octet-stream Size: 178092 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Silo_config.log Type: application/octet-stream Size: 83751 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: IBAMR_config.log Type: application/octet-stream Size: 398301 bytes Desc: not available URL:
From balay at mcs.anl.gov Thu Mar 14 08:17:06 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Thu, 14 Mar 2019 13:17:06 +0000 Subject: [petsc-users] Cross-compilation cluster In-Reply-To: References: Message-ID: As these warnings indicate - the 64bit compiler ignores 32bit libraries [so they don't get used]. i.e. mixing 32bit and 64bit libraries is not the cause of your problems on this machine. Satish On Wed, 13 Mar 2019, Amneet Bhalla via petsc-users wrote: > Hi Folks, > > I am on a cluster that has -L/lib dir with 32-bit libraries and -L/lib64 > with 64-bit libraries. During compilation of some of libraries required for > my code (such as SAMRAI and libMesh) both paths > get picked -L/lib and -L/lib64. > > I am seeing some sporadic behavior in runtime when at some timesteps PETSc > does not converge. The same code with the same number of processors run > just fine on my workstation that has just 64-bit version of libraries. > > Even during the final linking stage of the executable, the linker gives > warnings like > > ld: skipping incompatible //lib/libm.so when searching for -lm > > ld: skipping incompatible /lib/libm.so when searching for -lm > > ld: skipping incompatible /lib/libm.so when searching for -lm > > ld: skipping incompatible //lib/libpthread.so when searching for -lpthread > > ld: skipping incompatible /lib/libpthread.so when searching for -lpthread > > ld: skipping incompatible /lib/libpthread.so when searching for -lpthread > > ld: skipping incompatible //lib/libdl.so when searching for -ldl > > ld: skipping incompatible //lib/libc.so when searching for -lc > > ld: skipping incompatible /lib/libc.so when searching for -lc > > ld: skipping incompatible /lib/libc.so when searching for -lc > but the executable runs. > > > This is during config of SAMRAI when it picks both -L/lib and -L/lib64: > > checking whether we are using the GNU Fortran 77 compiler... no > > checking whether ifort accepts -g... yes > > checking how to get verbose linking output from ifort... -v > > checking for Fortran 77 libraries of ifort...
> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 > -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 > -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 > -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 > -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin > -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/ > -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64 > -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/ -L/lib/../lib64 > -L/lib/../lib64/ -L/usr/lib/../lib64 -L/usr/lib/../lib64/ > -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4/ > -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/ > -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64/ > -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64/ > -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../ -L/lib64 -L/lib/ > -L/usr/lib64 -L/usr/lib -lifport -lifcoremt -limf -lsvml -lm -lipgo -lirc > -lpthread -lgcc -lgcc_s -lirc_s -ldl > > libMesh is also picking that path > > libmesh_optional_LIBS............ : -lhdf5 -lhdf5_cpp -lz > -L/home/asbhalla/softwares/PETSc-BitBucket/PETSc/linux-opt/lib > -Wl,-rpath,/home/asbhalla/softwares/PETSc-BitBucket/PETSc/linux-opt/lib > -Wl,-rpath,/opt/intel/mkl/lib/intel64 -L/opt/intel/mkl/lib/intel64 > -Wl,-rpath,/opt/mellanox/hcoll/lib -L/opt/mellanox/hcoll/lib > -Wl,-rpath,/opt/mellanox/mxm/lib -L/opt/mellanox/mxm/lib > -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 > -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 > -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 > -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 > -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 > -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 > -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 > -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 > -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin > -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin > -lpetsc -lHYPRE -lmkl_intel_lp64 -lmkl_sequential -lmkl_core > -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lifport > -lifcoremt_pic -limf -lsvml -lm -lipgo -lirc -lpthread -lgcc_s -lirc_s > -lstdc++ -ldl -L/lib -Wl,-rpath,/lib > -Wl,-rpath,/usr/local/mpi/intel/openmpi-4.0.0/lib64 > -L/usr/local/mpi/intel/openmpi-4.0.0/lib64 > -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.5 > -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5 > > Perhaps PETSc also picks up both versions (and there is a way to query it > from PETSc?), but I can't confirm this. Is there a way to instruct make to > select only -L/lib64? I want to rule out that 32-bit dynamic library is not > a culprit for the random non-convergence of PETSc solvers and the eventual > crash of the simulations. I have tried both gcc-7.3.0 and intel-18 > compilers -- but the same thing is happening. > > > From mlohry at gmail.com Thu Mar 14 08:24:52 2019 From: mlohry at gmail.com (Mark Lohry) Date: Thu, 14 Mar 2019 09:24:52 -0400 Subject: [petsc-users] GAMG parallel convergence sensitivity In-Reply-To: <874l86b0bc.fsf@jedbrown.org> References: <874l86b0bc.fsf@jedbrown.org> Message-ID: Thanks Jed, that's a good read. 
I'm not too familiar with the Schur approach but I might give it a go in the future. A primitive variable formulation is in the works though, at least for pseudotime-derivative preconditioning. It seems to me with these semi-implicit methods the CFL limit is still so close to the explicit limit (that paper stops at 30), I don't really see the purpose unless you're running purely incompressible? That's just my ignorance speaking though. I'm currently running fully implicit for everything, with CFLs around 1e3 - 1e5 or so. On Wed, Mar 13, 2019 at 11:30 PM Jed Brown wrote: > Mark Lohry via petsc-users writes: > > > For what it's worth, I'm regularly solving much larger problems (1M-100M > > unknowns, unsteady) with this discretization and AMG setup on 500+ cores > > with impressively great convergence, dramatically better than ILU/ASM. > This > > just happens to be the first time I've experimented with this extremely > low > > Mach number, which is known to have a whole host of issues and generally > > needs low-mach preconditioners, I was just a bit surprised by this > specific > > failure mechanism. > > A common technique for low-Mach preconditioning is to convert to > primitive variables (much better conditioned for the solve) and use a > Schur fieldsplit into the pressure space. For modest time step, you can > use SIMPLE-like method ("selfp" in PCFieldSplit lingo) to approximate > that Schur complement. You can also rediscretize to form that > approximation. This paper has a bunch of examples of choices for the > state variables and derivation of the continuous pressure preconditioner > each case. (They present it as a classical semi-implicit method, but > that would be the Schur complement preconditioner if using FieldSplit > with a fully implicit or IMEX method.) > > https://doi.org/10.1137/090775889 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu Mar 14 08:45:25 2019 From: jed at jedbrown.org (Jed Brown) Date: Thu, 14 Mar 2019 07:45:25 -0600 Subject: [petsc-users] GAMG parallel convergence sensitivity In-Reply-To: References: <874l86b0bc.fsf@jedbrown.org> Message-ID: <87h8c58t8q.fsf@jedbrown.org> Mark Lohry writes: > It seems to me with these semi-implicit methods the CFL limit is still so > close to the explicit limit (that paper stops at 30), I don't really see > the purpose unless you're running purely incompressible? That's just my > ignorance speaking though. I'm currently running fully implicit for > everything, with CFLs around 1e3 - 1e5 or so. It depends what you're trying to resolve. Sounds like maybe you're stepping toward steady state. The paper is wishing to resolve vortex and baroclinic dynamics while stepping over acoustics and barotropic waves. From knepley at gmail.com Thu Mar 14 08:44:40 2019 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 14 Mar 2019 09:44:40 -0400 Subject: [petsc-users] Cross-compilation cluster In-Reply-To: References: Message-ID: It is very dangerous to use different compilers. I would make sure that all the compilers are the MPI compilers. Thanks, Matt On Thu, Mar 14, 2019 at 8:46 AM Amneet Bhalla wrote: > Ah, Ok. Do serial compilers look OK to you? > > Can lib-32 and lib-64 (say -lm) operate simulataneously during runtime, or > this is my imagination? > > > > On Thu, Mar 14, 2019 at 5:36 AM Matthew Knepley wrote: > >> On Thu, Mar 14, 2019 at 8:28 AM Amneet Bhalla >> wrote: >> >>> Matt -- SAMRAI, PETSc, and libMesh configure logs are attached with this >>> email. 
Also including some other log files in case they are useful. >>> >> >> Okay, PETSc is not sticking in a /usr/lib (or /usr/lib64). However, I can >> see that you mpif90 (and perhaps other of the wrappers) are >> reporting both /usr/lib64 AND /usr/lib flags, and I am guessing that is >> where the other configures are picking it up. >> >> Thanks, >> >> Matt >> >> >>> Thanks, >>> --Amneet >>> >>> On Thu, Mar 14, 2019 at 4:21 AM Matthew Knepley >>> wrote: >>> >>>> In order to see why each flag was included, we need to see >>>> configure.log. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> On Thu, Mar 14, 2019 at 2:40 AM Amneet Bhalla via petsc-users < >>>> petsc-users at mcs.anl.gov> wrote: >>>> >>>>> Hi Folks, >>>>> >>>>> I am on a cluster that has -L/lib dir with 32-bit libraries and >>>>> -L/lib64 with 64-bit libraries. During compilation of some of >>>>> libraries required for my code (such as SAMRAI and libMesh) both paths >>>>> get picked -L/lib and -L/lib64. >>>>> >>>>> I am seeing some sporadic behavior in runtime when at some timesteps >>>>> PETSc does not converge. The same code with the same number of processors >>>>> run just fine on my workstation that has just 64-bit version of libraries. >>>>> >>>>> Even during the final linking stage of the executable, the linker >>>>> gives warnings like >>>>> >>>>> ld: skipping incompatible //lib/libm.so when searching for -lm >>>>> >>>>> ld: skipping incompatible /lib/libm.so when searching for -lm >>>>> >>>>> ld: skipping incompatible /lib/libm.so when searching for -lm >>>>> >>>>> ld: skipping incompatible //lib/libpthread.so when searching for >>>>> -lpthread >>>>> >>>>> ld: skipping incompatible /lib/libpthread.so when searching for >>>>> -lpthread >>>>> >>>>> ld: skipping incompatible /lib/libpthread.so when searching for >>>>> -lpthread >>>>> >>>>> ld: skipping incompatible //lib/libdl.so when searching for -ldl >>>>> >>>>> ld: skipping incompatible //lib/libc.so when searching for -lc >>>>> >>>>> ld: skipping incompatible /lib/libc.so when searching for -lc >>>>> >>>>> ld: skipping incompatible /lib/libc.so when searching for -lc >>>>> but the executable runs. >>>>> >>>>> >>>>> This is during config of SAMRAI when it picks both -L/lib and -L/lib64: >>>>> >>>>> checking whether we are using the GNU Fortran 77 compiler... no >>>>> >>>>> checking whether ifort accepts -g... yes >>>>> >>>>> checking how to get verbose linking output from ifort... -v >>>>> >>>>> checking for Fortran 77 libraries of ifort... 
>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/ >>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64 >>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/ -L/lib/../lib64 >>>>> -L/lib/../lib64/ -L/usr/lib/../lib64 -L/usr/lib/../lib64/ >>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4/ >>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/ >>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64/ >>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64/ >>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../ -L/lib64 -L/lib/ >>>>> -L/usr/lib64 -L/usr/lib -lifport -lifcoremt -limf -lsvml -lm -lipgo -lirc >>>>> -lpthread -lgcc -lgcc_s -lirc_s -ldl >>>>> >>>>> libMesh is also picking that path >>>>> >>>>> libmesh_optional_LIBS............ : -lhdf5 -lhdf5_cpp -lz >>>>> -L/home/asbhalla/softwares/PETSc-BitBucket/PETSc/linux-opt/lib >>>>> -Wl,-rpath,/home/asbhalla/softwares/PETSc-BitBucket/PETSc/linux-opt/lib >>>>> -Wl,-rpath,/opt/intel/mkl/lib/intel64 -L/opt/intel/mkl/lib/intel64 >>>>> -Wl,-rpath,/opt/mellanox/hcoll/lib -L/opt/mellanox/hcoll/lib >>>>> -Wl,-rpath,/opt/mellanox/mxm/lib -L/opt/mellanox/mxm/lib >>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >>>>> -lpetsc -lHYPRE -lmkl_intel_lp64 -lmkl_sequential -lmkl_core >>>>> -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lifport >>>>> -lifcoremt_pic -limf -lsvml -lm -lipgo -lirc -lpthread -lgcc_s -lirc_s >>>>> -lstdc++ -ldl -L/lib -Wl,-rpath,/lib >>>>> -Wl,-rpath,/usr/local/mpi/intel/openmpi-4.0.0/lib64 >>>>> -L/usr/local/mpi/intel/openmpi-4.0.0/lib64 >>>>> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.5 >>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5 >>>>> >>>>> Perhaps PETSc also picks up both versions (and there is a way to query >>>>> it from PETSc?), but I can't confirm this. Is there a way to instruct make >>>>> to select only -L/lib64? I want to rule out that 32-bit dynamic library is >>>>> not a culprit for the random non-convergence of PETSc solvers and the >>>>> eventual crash of the simulations. I have tried both gcc-7.3.0 and intel-18 >>>>> compilers -- but the same thing is happening. 
>>>>> >>>>> >>>>> -- >>>>> --Amneet >>>>> >>>>> >>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >>> >>> -- >>> --Amneet >>> >>> >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- > --Amneet > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL:
From mlohry at gmail.com Thu Mar 14 08:56:46 2019 From: mlohry at gmail.com (Mark Lohry) Date: Thu, 14 Mar 2019 09:56:46 -0400 Subject: [petsc-users] GAMG parallel convergence sensitivity In-Reply-To: <87h8c58t8q.fsf@jedbrown.org> References: <874l86b0bc.fsf@jedbrown.org> <87h8c58t8q.fsf@jedbrown.org> Message-ID: > > It depends what you're trying to resolve. Sounds like maybe you're > stepping toward steady state. The paper is wishing to resolve vortex > and baroclinic dynamics while stepping over acoustics and barotropic > waves. Yeah, I'm usually working towards steady state or with fairly large time steps, but with the highly stretched meshes typical in aerodynamics applications, even with pretty accurate time resolution, fully implicit is advantageous. Doubly so with the nasty explicit stability limits in DG -- I've been playing with DNS setups (Taylor-Green vortex) and getting considerable efficiency advantages using ILU+gmres with a CFL ~100 with no noticeable resolution losses compared to explicit. On Thu, Mar 14, 2019 at 9:45 AM Jed Brown wrote: > Mark Lohry writes: > > > It seems to me with these semi-implicit methods the CFL limit is still so > > close to the explicit limit (that paper stops at 30), I don't really see > > the purpose unless you're running purely incompressible? That's just my > > ignorance speaking though. I'm currently running fully implicit for > > everything, with CFLs around 1e3 - 1e5 or so. > > It depends what you're trying to resolve. Sounds like maybe you're > stepping toward steady state. The paper is wishing to resolve vortex > and baroclinic dynamics while stepping over acoustics and barotropic > waves. > -------------- next part -------------- An HTML attachment was scrubbed... URL:
From mail2amneet at gmail.com Thu Mar 14 09:05:16 2019 From: mail2amneet at gmail.com (Amneet Bhalla) Date: Thu, 14 Mar 2019 07:05:16 -0700 Subject: [petsc-users] Cross-compilation cluster In-Reply-To: References: Message-ID: I need the serial build of Silo and HDF5 libraries. I am sure that the MPI wrappers are coming from the serial compilers. i.e., if I do gcc --version and mpicc --version, I see the same GCC version. On Thu, Mar 14, 2019 at 6:44 AM Matthew Knepley wrote: > It is very dangerous to use different compilers. I would make sure that > all the compilers are the MPI compilers. > > Thanks, > > Matt > > On Thu, Mar 14, 2019 at 8:46 AM Amneet Bhalla > wrote: > >> Ah, Ok. Do serial compilers look OK to you? >> >> Can lib-32 and lib-64 (say -lm) operate simulataneously during runtime, >> or this is my imagination?
>> >> >> >> On Thu, Mar 14, 2019 at 5:36 AM Matthew Knepley >> wrote: >> >>> On Thu, Mar 14, 2019 at 8:28 AM Amneet Bhalla >>> wrote: >>> >>>> Matt -- SAMRAI, PETSc, and libMesh configure logs are attached with >>>> this email. Also including some other log files in case they are useful. >>>> >>> >>> Okay, PETSc is not sticking in a /usr/lib (or /usr/lib64). However, I >>> can see that you mpif90 (and perhaps other of the wrappers) are >>> reporting both /usr/lib64 AND /usr/lib flags, and I am guessing that is >>> where the other configures are picking it up. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks, >>>> --Amneet >>>> >>>> On Thu, Mar 14, 2019 at 4:21 AM Matthew Knepley >>>> wrote: >>>> >>>>> In order to see why each flag was included, we need to see >>>>> configure.log. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> On Thu, Mar 14, 2019 at 2:40 AM Amneet Bhalla via petsc-users < >>>>> petsc-users at mcs.anl.gov> wrote: >>>>> >>>>>> Hi Folks, >>>>>> >>>>>> I am on a cluster that has -L/lib dir with 32-bit libraries and >>>>>> -L/lib64 with 64-bit libraries. During compilation of some of >>>>>> libraries required for my code (such as SAMRAI and libMesh) both paths >>>>>> get picked -L/lib and -L/lib64. >>>>>> >>>>>> I am seeing some sporadic behavior in runtime when at some timesteps >>>>>> PETSc does not converge. The same code with the same number of processors >>>>>> run just fine on my workstation that has just 64-bit version of libraries. >>>>>> >>>>>> Even during the final linking stage of the executable, the linker >>>>>> gives warnings like >>>>>> >>>>>> ld: skipping incompatible //lib/libm.so when searching for -lm >>>>>> >>>>>> ld: skipping incompatible /lib/libm.so when searching for -lm >>>>>> >>>>>> ld: skipping incompatible /lib/libm.so when searching for -lm >>>>>> >>>>>> ld: skipping incompatible //lib/libpthread.so when searching for >>>>>> -lpthread >>>>>> >>>>>> ld: skipping incompatible /lib/libpthread.so when searching for >>>>>> -lpthread >>>>>> >>>>>> ld: skipping incompatible /lib/libpthread.so when searching for >>>>>> -lpthread >>>>>> >>>>>> ld: skipping incompatible //lib/libdl.so when searching for -ldl >>>>>> >>>>>> ld: skipping incompatible //lib/libc.so when searching for -lc >>>>>> >>>>>> ld: skipping incompatible /lib/libc.so when searching for -lc >>>>>> >>>>>> ld: skipping incompatible /lib/libc.so when searching for -lc >>>>>> but the executable runs. >>>>>> >>>>>> >>>>>> This is during config of SAMRAI when it picks both -L/lib and >>>>>> -L/lib64: >>>>>> >>>>>> checking whether we are using the GNU Fortran 77 compiler... no >>>>>> >>>>>> checking whether ifort accepts -g... yes >>>>>> >>>>>> checking how to get verbose linking output from ifort... -v >>>>>> >>>>>> checking for Fortran 77 libraries of ifort... 
>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >>>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/ >>>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64 >>>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/ -L/lib/../lib64 >>>>>> -L/lib/../lib64/ -L/usr/lib/../lib64 -L/usr/lib/../lib64/ >>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4/ >>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/ >>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64/ >>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64/ >>>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../ -L/lib64 -L/lib/ >>>>>> -L/usr/lib64 -L/usr/lib -lifport -lifcoremt -limf -lsvml -lm -lipgo -lirc >>>>>> -lpthread -lgcc -lgcc_s -lirc_s -ldl >>>>>> >>>>>> libMesh is also picking that path >>>>>> >>>>>> libmesh_optional_LIBS............ : -lhdf5 -lhdf5_cpp -lz >>>>>> -L/home/asbhalla/softwares/PETSc-BitBucket/PETSc/linux-opt/lib >>>>>> -Wl,-rpath,/home/asbhalla/softwares/PETSc-BitBucket/PETSc/linux-opt/lib >>>>>> -Wl,-rpath,/opt/intel/mkl/lib/intel64 -L/opt/intel/mkl/lib/intel64 >>>>>> -Wl,-rpath,/opt/mellanox/hcoll/lib -L/opt/mellanox/hcoll/lib >>>>>> -Wl,-rpath,/opt/mellanox/mxm/lib -L/opt/mellanox/mxm/lib >>>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >>>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >>>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >>>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >>>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >>>>>> -lpetsc -lHYPRE -lmkl_intel_lp64 -lmkl_sequential -lmkl_core >>>>>> -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lifport >>>>>> -lifcoremt_pic -limf -lsvml -lm -lipgo -lirc -lpthread -lgcc_s -lirc_s >>>>>> -lstdc++ -ldl -L/lib -Wl,-rpath,/lib >>>>>> -Wl,-rpath,/usr/local/mpi/intel/openmpi-4.0.0/lib64 >>>>>> -L/usr/local/mpi/intel/openmpi-4.0.0/lib64 >>>>>> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.5 >>>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5 >>>>>> >>>>>> Perhaps PETSc also picks up both versions (and there is a way to >>>>>> query it from PETSc?), but I can't confirm this. Is there a way to instruct >>>>>> make to select only -L/lib64? I want to rule out that 32-bit dynamic >>>>>> library is not a culprit for the random non-convergence of PETSc solvers >>>>>> and the eventual crash of the simulations. I have tried both gcc-7.3.0 and >>>>>> intel-18 compilers -- but the same thing is happening. 
>>>>>> >>>>>> >>>>>> -- >>>>>> --Amneet >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>>> >>>> -- >>>> --Amneet >>>> >>>> >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> -- >> --Amneet >> >> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- --Amneet -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Mar 14 11:34:00 2019 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 14 Mar 2019 12:34:00 -0400 Subject: [petsc-users] Cross-compilation cluster In-Reply-To: References: Message-ID: On Thu, Mar 14, 2019 at 10:05 AM Amneet Bhalla wrote: > I need the serial built of Silo and HDF5 libraries. I am sure that the MPI > wrappers are coming from the serial compilers. > > i.e, If I do > > gcc ?version and mpicc ?version, I see the same GCC version. > That is in a perfect world. Unfortunately, flags for compilers can make incompatible libraries. Its no problem to give the MPI compilers to Silo and HDF5, even if you do not want to link the libraries. Matt > On Thu, Mar 14, 2019 at 6:44 AM Matthew Knepley wrote: > >> It is very dangerous to use different compilers. I would make sure that >> all the compilers are the MPI compilers. >> >> Thanks, >> >> Matt >> >> On Thu, Mar 14, 2019 at 8:46 AM Amneet Bhalla >> wrote: >> >>> Ah, Ok. Do serial compilers look OK to you? >>> >>> Can lib-32 and lib-64 (say -lm) operate simulataneously during runtime, >>> or this is my imagination? >>> >>> >>> >>> On Thu, Mar 14, 2019 at 5:36 AM Matthew Knepley >>> wrote: >>> >>>> On Thu, Mar 14, 2019 at 8:28 AM Amneet Bhalla >>>> wrote: >>>> >>>>> Matt -- SAMRAI, PETSc, and libMesh configure logs are attached with >>>>> this email. Also including some other log files in case they are useful. >>>>> >>>> >>>> Okay, PETSc is not sticking in a /usr/lib (or /usr/lib64). However, I >>>> can see that you mpif90 (and perhaps other of the wrappers) are >>>> reporting both /usr/lib64 AND /usr/lib flags, and I am guessing that is >>>> where the other configures are picking it up. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thanks, >>>>> --Amneet >>>>> >>>>> On Thu, Mar 14, 2019 at 4:21 AM Matthew Knepley >>>>> wrote: >>>>> >>>>>> In order to see why each flag was included, we need to see >>>>>> configure.log. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> On Thu, Mar 14, 2019 at 2:40 AM Amneet Bhalla via petsc-users < >>>>>> petsc-users at mcs.anl.gov> wrote: >>>>>> >>>>>>> Hi Folks, >>>>>>> >>>>>>> I am on a cluster that has -L/lib dir with 32-bit libraries and >>>>>>> -L/lib64 with 64-bit libraries. During compilation of some of >>>>>>> libraries required for my code (such as SAMRAI and libMesh) both paths >>>>>>> get picked -L/lib and -L/lib64. >>>>>>> >>>>>>> I am seeing some sporadic behavior in runtime when at some timesteps >>>>>>> PETSc does not converge. 
The same code with the same number of processors >>>>>>> run just fine on my workstation that has just 64-bit version of libraries. >>>>>>> >>>>>>> Even during the final linking stage of the executable, the linker >>>>>>> gives warnings like >>>>>>> >>>>>>> ld: skipping incompatible //lib/libm.so when searching for -lm >>>>>>> >>>>>>> ld: skipping incompatible /lib/libm.so when searching for -lm >>>>>>> >>>>>>> ld: skipping incompatible /lib/libm.so when searching for -lm >>>>>>> >>>>>>> ld: skipping incompatible //lib/libpthread.so when searching for >>>>>>> -lpthread >>>>>>> >>>>>>> ld: skipping incompatible /lib/libpthread.so when searching for >>>>>>> -lpthread >>>>>>> >>>>>>> ld: skipping incompatible /lib/libpthread.so when searching for >>>>>>> -lpthread >>>>>>> >>>>>>> ld: skipping incompatible //lib/libdl.so when searching for -ldl >>>>>>> >>>>>>> ld: skipping incompatible //lib/libc.so when searching for -lc >>>>>>> >>>>>>> ld: skipping incompatible /lib/libc.so when searching for -lc >>>>>>> >>>>>>> ld: skipping incompatible /lib/libc.so when searching for -lc >>>>>>> but the executable runs. >>>>>>> >>>>>>> >>>>>>> This is during config of SAMRAI when it picks both -L/lib and >>>>>>> -L/lib64: >>>>>>> >>>>>>> checking whether we are using the GNU Fortran 77 compiler... no >>>>>>> >>>>>>> checking whether ifort accepts -g... yes >>>>>>> >>>>>>> checking how to get verbose linking output from ifort... -v >>>>>>> >>>>>>> checking for Fortran 77 libraries of ifort... >>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >>>>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/ >>>>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64 >>>>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/ -L/lib/../lib64 >>>>>>> -L/lib/../lib64/ -L/usr/lib/../lib64 -L/usr/lib/../lib64/ >>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4/ >>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/ >>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64/ >>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64/ >>>>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../ -L/lib64 -L/lib/ >>>>>>> -L/usr/lib64 -L/usr/lib -lifport -lifcoremt -limf -lsvml -lm -lipgo -lirc >>>>>>> -lpthread -lgcc -lgcc_s -lirc_s -ldl >>>>>>> >>>>>>> libMesh is also picking that path >>>>>>> >>>>>>> libmesh_optional_LIBS............ 
: -lhdf5 -lhdf5_cpp -lz >>>>>>> -L/home/asbhalla/softwares/PETSc-BitBucket/PETSc/linux-opt/lib >>>>>>> -Wl,-rpath,/home/asbhalla/softwares/PETSc-BitBucket/PETSc/linux-opt/lib >>>>>>> -Wl,-rpath,/opt/intel/mkl/lib/intel64 -L/opt/intel/mkl/lib/intel64 >>>>>>> -Wl,-rpath,/opt/mellanox/hcoll/lib -L/opt/mellanox/hcoll/lib >>>>>>> -Wl,-rpath,/opt/mellanox/mxm/lib -L/opt/mellanox/mxm/lib >>>>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >>>>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >>>>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >>>>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >>>>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >>>>>>> -lpetsc -lHYPRE -lmkl_intel_lp64 -lmkl_sequential -lmkl_core >>>>>>> -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lifport >>>>>>> -lifcoremt_pic -limf -lsvml -lm -lipgo -lirc -lpthread -lgcc_s -lirc_s >>>>>>> -lstdc++ -ldl -L/lib -Wl,-rpath,/lib >>>>>>> -Wl,-rpath,/usr/local/mpi/intel/openmpi-4.0.0/lib64 >>>>>>> -L/usr/local/mpi/intel/openmpi-4.0.0/lib64 >>>>>>> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.5 >>>>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5 >>>>>>> >>>>>>> Perhaps PETSc also picks up both versions (and there is a way to >>>>>>> query it from PETSc?), but I can't confirm this. Is there a way to instruct >>>>>>> make to select only -L/lib64? I want to rule out that 32-bit dynamic >>>>>>> library is not a culprit for the random non-convergence of PETSc solvers >>>>>>> and the eventual crash of the simulations. I have tried both gcc-7.3.0 and >>>>>>> intel-18 compilers -- but the same thing is happening. >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> --Amneet >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> --Amneet >>>>> >>>>> >>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> -- >>> --Amneet >>> >>> >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- > --Amneet > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mail2amneet at gmail.com Thu Mar 14 12:04:05 2019 From: mail2amneet at gmail.com (Amneet Bhalla) Date: Thu, 14 Mar 2019 10:04:05 -0700 Subject: [petsc-users] Cross-compilation cluster In-Reply-To: References: Message-ID: Hmmn, ok. I will try it out. On Thu, Mar 14, 2019 at 9:34 AM Matthew Knepley wrote: > On Thu, Mar 14, 2019 at 10:05 AM Amneet Bhalla > wrote: > >> I need the serial built of Silo and HDF5 libraries. I am sure that the >> MPI wrappers are coming from the serial compilers. >> >> i.e, If I do >> >> gcc ?version and mpicc ?version, I see the same GCC version. >> > > That is in a perfect world. Unfortunately, flags for compilers can make > incompatible libraries. Its no problem > to give the MPI compilers to Silo and HDF5, even if you do not want to > link the libraries. > > Matt > > >> On Thu, Mar 14, 2019 at 6:44 AM Matthew Knepley >> wrote: >> >>> It is very dangerous to use different compilers. I would make sure that >>> all the compilers are the MPI compilers. >>> >>> Thanks, >>> >>> Matt >>> >>> On Thu, Mar 14, 2019 at 8:46 AM Amneet Bhalla >>> wrote: >>> >>>> Ah, Ok. Do serial compilers look OK to you? >>>> >>>> Can lib-32 and lib-64 (say -lm) operate simulataneously during runtime, >>>> or this is my imagination? >>>> >>>> >>>> >>>> On Thu, Mar 14, 2019 at 5:36 AM Matthew Knepley >>>> wrote: >>>> >>>>> On Thu, Mar 14, 2019 at 8:28 AM Amneet Bhalla >>>>> wrote: >>>>> >>>>>> Matt -- SAMRAI, PETSc, and libMesh configure logs are attached with >>>>>> this email. Also including some other log files in case they are useful. >>>>>> >>>>> >>>>> Okay, PETSc is not sticking in a /usr/lib (or /usr/lib64). However, I >>>>> can see that you mpif90 (and perhaps other of the wrappers) are >>>>> reporting both /usr/lib64 AND /usr/lib flags, and I am guessing that >>>>> is where the other configures are picking it up. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Thanks, >>>>>> --Amneet >>>>>> >>>>>> On Thu, Mar 14, 2019 at 4:21 AM Matthew Knepley >>>>>> wrote: >>>>>> >>>>>>> In order to see why each flag was included, we need to see >>>>>>> configure.log. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> On Thu, Mar 14, 2019 at 2:40 AM Amneet Bhalla via petsc-users < >>>>>>> petsc-users at mcs.anl.gov> wrote: >>>>>>> >>>>>>>> Hi Folks, >>>>>>>> >>>>>>>> I am on a cluster that has -L/lib dir with 32-bit libraries and >>>>>>>> -L/lib64 with 64-bit libraries. During compilation of some of >>>>>>>> libraries required for my code (such as SAMRAI and libMesh) both paths >>>>>>>> get picked -L/lib and -L/lib64. >>>>>>>> >>>>>>>> I am seeing some sporadic behavior in runtime when at some >>>>>>>> timesteps PETSc does not converge. The same code with the same number of >>>>>>>> processors run just fine on my workstation that has just 64-bit version of >>>>>>>> libraries. 
>>>>>>>> >>>>>>>> Even during the final linking stage of the executable, the linker >>>>>>>> gives warnings like >>>>>>>> >>>>>>>> ld: skipping incompatible //lib/libm.so when searching for -lm >>>>>>>> >>>>>>>> ld: skipping incompatible /lib/libm.so when searching for -lm >>>>>>>> >>>>>>>> ld: skipping incompatible /lib/libm.so when searching for -lm >>>>>>>> >>>>>>>> ld: skipping incompatible //lib/libpthread.so when searching for >>>>>>>> -lpthread >>>>>>>> >>>>>>>> ld: skipping incompatible /lib/libpthread.so when searching for >>>>>>>> -lpthread >>>>>>>> >>>>>>>> ld: skipping incompatible /lib/libpthread.so when searching for >>>>>>>> -lpthread >>>>>>>> >>>>>>>> ld: skipping incompatible //lib/libdl.so when searching for -ldl >>>>>>>> >>>>>>>> ld: skipping incompatible //lib/libc.so when searching for -lc >>>>>>>> >>>>>>>> ld: skipping incompatible /lib/libc.so when searching for -lc >>>>>>>> >>>>>>>> ld: skipping incompatible /lib/libc.so when searching for -lc >>>>>>>> but the executable runs. >>>>>>>> >>>>>>>> >>>>>>>> This is during config of SAMRAI when it picks both -L/lib and >>>>>>>> -L/lib64: >>>>>>>> >>>>>>>> checking whether we are using the GNU Fortran 77 compiler... no >>>>>>>> >>>>>>>> checking whether ifort accepts -g... yes >>>>>>>> >>>>>>>> checking how to get verbose linking output from ifort... -v >>>>>>>> >>>>>>>> checking for Fortran 77 libraries of ifort... >>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >>>>>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/ >>>>>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64 >>>>>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/ -L/lib/../lib64 >>>>>>>> -L/lib/../lib64/ -L/usr/lib/../lib64 -L/usr/lib/../lib64/ >>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4/ >>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/ >>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64/ >>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64/ >>>>>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../ -L/lib64 -L/lib/ >>>>>>>> -L/usr/lib64 -L/usr/lib -lifport -lifcoremt -limf -lsvml -lm -lipgo -lirc >>>>>>>> -lpthread -lgcc -lgcc_s -lirc_s -ldl >>>>>>>> >>>>>>>> libMesh is also picking that path >>>>>>>> >>>>>>>> libmesh_optional_LIBS............ 
: -lhdf5 -lhdf5_cpp -lz >>>>>>>> -L/home/asbhalla/softwares/PETSc-BitBucket/PETSc/linux-opt/lib >>>>>>>> -Wl,-rpath,/home/asbhalla/softwares/PETSc-BitBucket/PETSc/linux-opt/lib >>>>>>>> -Wl,-rpath,/opt/intel/mkl/lib/intel64 -L/opt/intel/mkl/lib/intel64 >>>>>>>> -Wl,-rpath,/opt/mellanox/hcoll/lib -L/opt/mellanox/hcoll/lib >>>>>>>> -Wl,-rpath,/opt/mellanox/mxm/lib -L/opt/mellanox/mxm/lib >>>>>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >>>>>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >>>>>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >>>>>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >>>>>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >>>>>>>> -lpetsc -lHYPRE -lmkl_intel_lp64 -lmkl_sequential -lmkl_core >>>>>>>> -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lifport >>>>>>>> -lifcoremt_pic -limf -lsvml -lm -lipgo -lirc -lpthread -lgcc_s -lirc_s >>>>>>>> -lstdc++ -ldl -L/lib -Wl,-rpath,/lib >>>>>>>> -Wl,-rpath,/usr/local/mpi/intel/openmpi-4.0.0/lib64 >>>>>>>> -L/usr/local/mpi/intel/openmpi-4.0.0/lib64 >>>>>>>> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.5 >>>>>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5 >>>>>>>> >>>>>>>> Perhaps PETSc also picks up both versions (and there is a way to >>>>>>>> query it from PETSc?), but I can't confirm this. Is there a way to instruct >>>>>>>> make to select only -L/lib64? I want to rule out that 32-bit dynamic >>>>>>>> library is not a culprit for the random non-convergence of PETSc solvers >>>>>>>> and the eventual crash of the simulations. I have tried both gcc-7.3.0 and >>>>>>>> intel-18 compilers -- but the same thing is happening. >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> --Amneet >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> --Amneet >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> -- >>>> --Amneet >>>> >>>> >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> -- >> --Amneet >> >> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- --Amneet -------------- next part -------------- An HTML attachment was scrubbed... URL: From yyang85 at stanford.edu Fri Mar 15 16:29:58 2019 From: yyang85 at stanford.edu (Yuyun Yang) Date: Fri, 15 Mar 2019 21:29:58 +0000 Subject: [petsc-users] Using PETSc with GPU Message-ID: Hello team, Our group is thinking of using GPUs for the linear solves in our code, which is written in PETSc. I was reading the 2013 book chapter on implementation of PETSc using GPUs but wonder if there is any more updated reference that I can check out? I also saw one example CUDA code online (using thrust), but would like to check with you if there is a more complete documentation of how the GPU implementation is done? Thanks very much! Best regards, Yuyun -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Mar 15 16:54:02 2019 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 15 Mar 2019 17:54:02 -0400 Subject: Re: [petsc-users] Using PETSc with GPU In-Reply-To: References: Message-ID: On Fri, Mar 15, 2019 at 5:30 PM Yuyun Yang via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello team, > > > > Our group is thinking of using GPUs for the linear solves in our code, > which is written in PETSc. I was reading the 2013 book chapter on > implementation of PETSc using GPUs but wonder if there is any more updated > reference that I can check out? I also saw one example CUDA code online (using > thrust), but would like to check with you if there is a more complete > documentation of how the GPU implementation is done? > Have you seen this page? https://www.mcs.anl.gov/petsc/features/gpus.html Also, before using GPUs, I would take some time to understand what you think the possible benefit can be. For example, there is almost no benefit if you use BLAS1, and you would have a huge maintenance burden with a different toolchain. This is also largely true for SpMV, since the bandwidth difference between CPUs and GPUs is now not much. So you really should have some kind of flop intensive (BLAS3-like) work in there somewhere or it's hard to see your motivation. Thanks, Matt > > > Thanks very much! > > > > Best regards, > > Yuyun > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yyang85 at stanford.edu Fri Mar 15 19:33:00 2019 From: yyang85 at stanford.edu (Yuyun Yang) Date: Sat, 16 Mar 2019 00:33:00 +0000 Subject: Re: [petsc-users] Using PETSc with GPU In-Reply-To: References: , Message-ID: Thanks Matt, I've seen that page, but there isn't that much documentation, and there is only one CUDA example, so I wanted to check if there may be more references or examples somewhere else. We have very large linear systems that need to be solved every time step, and which involve matrix-matrix multiplications, so we thought GPU could have some benefits, but we are unsure how difficult it is to migrate parts of the code to GPU with PETSc. From that webpage it seems like we only need to specify the Vec / Mat option on the command line and maybe change a few functions to have CUDA? The CUDA example however also involves using thrust and programming a kernel function, so I want to make sure I know how this works before trying to implement. 
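For concreteness, a minimal sketch of the kind of Vec/Mat type switch being described (this is only a guess at the usage, not the code in question; it assumes a --with-cuda PETSc build from the 3.10/3.11 era, where the GPU classes are VECCUDA and MATAIJCUSPARSE, with -vec_type cuda and -mat_type aijcusparse as the run-time equivalents):

#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat            A;
  Vec            x, b;
  KSP            ksp;
  PC             pc;
  PetscInt       i, n = 100;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;

  /* Matrix stored as AIJCUSPARSE so MatMult runs on the GPU;
     MatSetFromOptions() plus -mat_type aijcusparse selects the same thing at run time. */
  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);CHKERRQ(ierr);
  ierr = MatSetType(A, MATAIJCUSPARSE);CHKERRQ(ierr);
  ierr = MatSetUp(A);CHKERRQ(ierr);
  for (i = 0; i < n; i++) {                  /* 1D Laplacian as a stand-in problem */
    ierr = MatSetValue(A, i, i, 2.0, INSERT_VALUES);CHKERRQ(ierr);
    if (i > 0)     {ierr = MatSetValue(A, i, i - 1, -1.0, INSERT_VALUES);CHKERRQ(ierr);}
    if (i < n - 1) {ierr = MatSetValue(A, i, i + 1, -1.0, INSERT_VALUES);CHKERRQ(ierr);}
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  /* Vectors of type VECCUDA keep their data on the device during the solve;
     VecSetFromOptions() plus -vec_type cuda is the run-time equivalent. */
  ierr = VecCreate(PETSC_COMM_WORLD, &x);CHKERRQ(ierr);
  ierr = VecSetSizes(x, PETSC_DECIDE, n);CHKERRQ(ierr);
  ierr = VecSetType(x, VECCUDA);CHKERRQ(ierr);
  ierr = VecDuplicate(x, &b);CHKERRQ(ierr);
  ierr = VecSet(b, 1.0);CHKERRQ(ierr);

  /* Default Krylov solver with a GPU-friendly Jacobi preconditioner;
     command-line options can still override these choices. */
  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCJACOBI);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);

  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

If the types are left to MatSetFromOptions()/VecSetFromOptions() instead of being hard-coded, the same executable can be compared on the CPU and GPU back ends just by changing the command-line options.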
Thanks a lot, Yuyun Get Outlook for iOS ________________________________ From: Matthew Knepley Sent: Friday, March 15, 2019 2:54:02 PM To: Yuyun Yang Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Using PETSc with GPU On Fri, Mar 15, 2019 at 5:30 PM Yuyun Yang via petsc-users > wrote: Hello team, Our group is thinking of using GPUs for the linear solves in our code, which is written in PETSc. I was reading the 2013 book chapter on implementation of PETSc using GPUs but wonder if there is any more updated reference that I check out? I also saw one example cuda code online (using thrust), but would like to check with you if there is a more complete documentation of how the GPU implementation is done? Have you seen this page? https://www.mcs.anl.gov/petsc/features/gpus.html Also, before using GPUs, I would take some time to understand what you think the possible benefit can be. For example, there is almost no benefit is you use BLAS1, and you would have a huge maintenance burden with a different toolchain. This is also largely true for SpMV, since the bandwidth difference between CPUs and GPUs is now not much. So you really should have some kind of flop intensive (BLAS3-like) work in there somewhere or its hard to see your motivation. Thanks, Matt Thanks very much! Best regards, Yuyun -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Mar 15 19:43:23 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Sat, 16 Mar 2019 00:43:23 +0000 Subject: [petsc-users] Using PETSc with GPU In-Reply-To: References: Message-ID: > On Mar 15, 2019, at 7:33 PM, Yuyun Yang via petsc-users wrote: > > Thanks Matt, I've seen that page, but there isn't that much documentation, and there is only one CUDA example, so I wanted to check if there may be more references or examples somewhere else. We have very large linear systems that need to be solved every time step, and which involves matrix-matrix multiplications, where do these matrix-matrix multiplications appear? Are you providing a "matrix-free" based operator for your linear system where you apply matrix-vector operations via a subroutine call? Or are you explicitly forming sparse matrices and using them to define the operator? > so we thought GPU could have some benefits, but we are unsure how difficult it is to migrate parts of the code to GPU with PETSc. From that webpage it seems like we only need to specify the Vec / Mat option on the command line and maybe change a few functions to have CUDA? The CUDA example however also involves using thrust and programming a kernel function, so I want to make sure I know how this works before trying to implement. How much, if any, CUDA/GPU code you have to write depends on what you want to have done on the GPU. If you provide a sparse matrix and only want the system solve to take place on the GPU then you don't need to write any CUDA/GPU code, you just use the "CUDA" vector and matrix class. If you are doing "matrix-free" solves and you provide the routine that performs the matrix-vector product then you need to write/optimize that routine for CUDA/GPU. 
Barry > > Thanks a lot, > Yuyun > > Get Outlook for iOS > From: Matthew Knepley > Sent: Friday, March 15, 2019 2:54:02 PM > To: Yuyun Yang > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Using PETSc with GPU > > On Fri, Mar 15, 2019 at 5:30 PM Yuyun Yang via petsc-users wrote: > Hello team, > > > > Our group is thinking of using GPUs for the linear solves in our code, which is written in PETSc. I was reading the 2013 book chapter on implementation of PETSc using GPUs but wonder if there is any more updated reference that I check out? I also saw one example cuda code online (using thrust), but would like to check with you if there is a more complete documentation of how the GPU implementation is done? > > > Have you seen this page? https://www.mcs.anl.gov/petsc/features/gpus.html > > Also, before using GPUs, I would take some time to understand what you think the possible benefit can be. > For example, there is almost no benefit is you use BLAS1, and you would have a huge maintenance burden > with a different toolchain. This is also largely true for SpMV, since the bandwidth difference between CPUs > and GPUs is now not much. So you really should have some kind of flop intensive (BLAS3-like) work in there > somewhere or its hard to see your motivation. > > Thanks, > > Matt > > > Thanks very much! > > > > Best regards, > > Yuyun > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From yyang85 at stanford.edu Fri Mar 15 20:07:58 2019 From: yyang85 at stanford.edu (Yuyun Yang) Date: Sat, 16 Mar 2019 01:07:58 +0000 Subject: [petsc-users] Using PETSc with GPU In-Reply-To: References: , Message-ID: Currently we are forming the sparse matrices explicitly, but I think the goal is to move towards matrix-free methods and use a stencil, which I suppose is good to use GPUs for and more efficient. On the other hand, I've also read about matrix-free operations in the manual just on the CPUs. Would there be any benefit then to switching to GPU (looks like matrix-free in PETSc is rather straightforward to use, whereas writing the kernel function for GPU stencil would require quite a lot of work)? Thanks! Yuyun Get Outlook for iOS ________________________________ From: Smith, Barry F. Sent: Friday, March 15, 2019 5:43:23 PM To: Yuyun Yang Cc: Matthew Knepley; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Using PETSc with GPU > On Mar 15, 2019, at 7:33 PM, Yuyun Yang via petsc-users wrote: > > Thanks Matt, I've seen that page, but there isn't that much documentation, and there is only one CUDA example, so I wanted to check if there may be more references or examples somewhere else. We have very large linear systems that need to be solved every time step, and which involves matrix-matrix multiplications, where do these matrix-matrix multiplications appear? Are you providing a "matrix-free" based operator for your linear system where you apply matrix-vector operations via a subroutine call? Or are you explicitly forming sparse matrices and using them to define the operator? > so we thought GPU could have some benefits, but we are unsure how difficult it is to migrate parts of the code to GPU with PETSc. From that webpage it seems like we only need to specify the Vec / Mat option on the command line and maybe change a few functions to have CUDA? 
The CUDA example however also involves using thrust and programming a kernel function, so I want to make sure I know how this works before trying to implement. How much, if any, CUDA/GPU code you have to write depends on what you want to have done on the GPU. If you provide a sparse matrix and only want the system solve to take place on the GPU then you don't need to write any CUDA/GPU code, you just use the "CUDA" vector and matrix class. If you are doing "matrix-free" solves and you provide the routine that performs the matrix-vector product then you need to write/optimize that routine for CUDA/GPU. Barry > > Thanks a lot, > Yuyun > > Get Outlook for iOS > From: Matthew Knepley > Sent: Friday, March 15, 2019 2:54:02 PM > To: Yuyun Yang > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Using PETSc with GPU > > On Fri, Mar 15, 2019 at 5:30 PM Yuyun Yang via petsc-users wrote: > Hello team, > > > > Our group is thinking of using GPUs for the linear solves in our code, which is written in PETSc. I was reading the 2013 book chapter on implementation of PETSc using GPUs but wonder if there is any more updated reference that I check out? I also saw one example cuda code online (using thrust), but would like to check with you if there is a more complete documentation of how the GPU implementation is done? > > > Have you seen this page? https://www.mcs.anl.gov/petsc/features/gpus.html > > Also, before using GPUs, I would take some time to understand what you think the possible benefit can be. > For example, there is almost no benefit is you use BLAS1, and you would have a huge maintenance burden > with a different toolchain. This is also largely true for SpMV, since the bandwidth difference between CPUs > and GPUs is now not much. So you really should have some kind of flop intensive (BLAS3-like) work in there > somewhere or its hard to see your motivation. > > Thanks, > > Matt > > > Thanks very much! > > > > Best regards, > > Yuyun > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Fri Mar 15 21:06:29 2019 From: jed at jedbrown.org (Jed Brown) Date: Fri, 15 Mar 2019 20:06:29 -0600 Subject: [petsc-users] Using PETSc with GPU In-Reply-To: References: Message-ID: <87mulv8tei.fsf@jedbrown.org> Yuyun Yang via petsc-users writes: > Currently we are forming the sparse matrices explicitly, but I think the goal is to move towards matrix-free methods and use a stencil, which I suppose is good to use GPUs for and more efficient. On the other hand, I've also read about matrix-free operations in the manual just on the CPUs. Would there be any benefit then to switching to GPU (looks like matrix-free in PETSc is rather straightforward to use, whereas writing the kernel function for GPU stencil would require quite a lot of work)? It all depends what kind of computation happens in there and how well you can implement it for the GPU. It's important to have a clear idea of what you expect to achieve. For example, if you write an excellent GPU implementation of your SNES residual/matrix-free Jacobian, it might be 2-3x faster than a good CPU implementation on hardware of similar cost ($ or Watt). 
But you still need preconditioning, which is usually at least half the work, and perhaps a preconditioner runs the same speed on GPU and CPU (CPU version often converges a bit faster; preconditioning operations are often less amenable to GPUs). So after all that effort, and now with code that is likely harder to maintain, you go from 4 seconds per solve to 3 seconds per solve on hardware of the same cost. Is that worth it? Maybe, but you probably want that to be in the critical path for your research and/or customers. From yyang85 at stanford.edu Fri Mar 15 21:09:22 2019 From: yyang85 at stanford.edu (Yuyun Yang) Date: Sat, 16 Mar 2019 02:09:22 +0000 Subject: [petsc-users] Using PETSc with GPU In-Reply-To: <87mulv8tei.fsf@jedbrown.org> References: , <87mulv8tei.fsf@jedbrown.org> Message-ID: Good point, thank you so much for the advice! I'll take that into consideration. Best regards, Yuyun Get Outlook for iOS ________________________________ From: Jed Brown Sent: Friday, March 15, 2019 7:06:29 PM To: Yuyun Yang; Smith, Barry F. Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Using PETSc with GPU Yuyun Yang via petsc-users writes: > Currently we are forming the sparse matrices explicitly, but I think the goal is to move towards matrix-free methods and use a stencil, which I suppose is good to use GPUs for and more efficient. On the other hand, I've also read about matrix-free operations in the manual just on the CPUs. Would there be any benefit then to switching to GPU (looks like matrix-free in PETSc is rather straightforward to use, whereas writing the kernel function for GPU stencil would require quite a lot of work)? It all depends what kind of computation happens in there and how well you can implement it for the GPU. It's important to have a clear idea of what you expect to achieve. For example, if you write an excellent GPU implementation of your SNES residual/matrix-free Jacobian, it might be 2-3x faster than a good CPU implementation on hardware of similar cost ($ or Watt). But you still need preconditioning, which is usually at least half the work, and perhaps a preconditioner runs the same speed on GPU and CPU (CPU version often converges a bit faster; preconditioning operations are often less amenable to GPUs). So after all that effort, and now with code that is likely harder to maintain, you go from 4 seconds per solve to 3 seconds per solve on hardware of the same cost. Is that worth it? Maybe, but you probably want that to be in the critical path for your research and/or customers. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ys453 at cam.ac.uk Sat Mar 16 18:50:54 2019 From: ys453 at cam.ac.uk (Y. Shidi) Date: Sat, 16 Mar 2019 23:50:54 +0000 Subject: [petsc-users] PCFieldSplit gives different results for direct and iterative solver Message-ID: <0a59ff504728c756b0965f201a97ab7c@cam.ac.uk> Hello, I am trying to solve the incompressible n-s equations by PCFieldSplit. The large matrix and vectors are formed by MatCreateNest() and VecCreateNest(). 
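For reference, a rough sketch of how such a nested operator and right-hand side are typically assembled and wired into PCFieldSplit (the sub-matrices A00, A01, A10 and sub-vectors bu, bp below are placeholders for already assembled velocity/pressure blocks, not the ones from the actual code):

#include <petscksp.h>

/* Build a 2x2 saddle-point operator as MATNEST, the matching VECNEST right-hand
   side, and hand the two blocks to PCFieldSplit under the split names "0" and "1". */
PetscErrorCode BuildNestedSystem(Mat A00, Mat A01, Mat A10, Vec bu, Vec bp,
                                 KSP ksp, Mat *A, Vec *b)
{
  Mat            blocks[4];
  Vec            rhs[2];
  IS             rows[2];
  PC             pc;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  blocks[0] = A00; blocks[1] = A01;   /* [ A00  A01 ] */
  blocks[2] = A10; blocks[3] = NULL;  /* [ A10   0  ]  NULL marks the zero block */
  ierr = MatCreateNest(PETSC_COMM_WORLD, 2, NULL, 2, NULL, blocks, A);CHKERRQ(ierr);

  rhs[0] = bu; rhs[1] = bp;
  ierr = VecCreateNest(PETSC_COMM_WORLD, 2, NULL, rhs, b);CHKERRQ(ierr);

  ierr = KSPSetOperators(ksp, *A, *A);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCFIELDSPLIT);CHKERRQ(ierr);
  /* With a MATNEST operator PCFieldSplit can discover the blocks on its own;
     passing the row index sets explicitly just names the two splits "0" and "1". */
  ierr = MatNestGetISs(*A, rows, NULL);CHKERRQ(ierr);
  ierr = PCFieldSplitSetIS(pc, "0", rows[0]);CHKERRQ(ierr);
  ierr = PCFieldSplitSetIS(pc, "1", rows[1]);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

With the splits named "0" and "1", the -fieldsplit_0_* and -fieldsplit_1_* options shown below then act on the corresponding blocks.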
The system is solved directly by the following command: -ksp_type fgmres \ -pc_type fieldsplit \ -pc_fieldsplit_type schur \ -pc_fieldsplit_schur_fact_type full \ -ksp_converged_reason \ -ksp_monitor_true_residual \ -fieldsplit_0_ksp_type preonly \ -fieldsplit_0_pc_type cholesky \ -fieldsplit_0_pc_factor_mat_solver_package mumps \ -mat_mumps_icntl_28 2 \ -mat_mumps_icntl_29 2 \ -fieldsplit_1_ksp_type preonly \ -fieldsplit_1_pc_type jacobi \ Output: 0 KSP unpreconditioned resid norm 1.214252932161e+04 true resid norm 1.214252932161e+04 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 1.642782495109e-02 true resid norm 1.642782495109e-02 ||r(i)||/||b|| 1.352916226594e-06 Linear solve converged due to CONVERGED_RTOL iterations 1 The system is solved iteratively by the following command: -ksp_type fgmres \ -pc_type fieldsplit \ -pc_fieldsplit_type schur \ -pc_fieldsplit_schur_factorization_type diag \ -ksp_converged_reason \ -ksp_monitor_true_residual \ -fieldsplit_0_ksp_type preonly \ -fieldsplit_0_pc_type gamg \ -fieldsplit_1_ksp_type minres \ -fieldsplit_1_pc_type none \ Output: 0 KSP unpreconditioned resid norm 1.214252932161e+04 true resid norm 1.214252932161e+04 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 2.184037364915e+02 true resid norm 2.184037364915e+02 ||r(i)||/||b|| 1.798667565109e-02 2 KSP unpreconditioned resid norm 2.120097409539e+02 true resid norm 2.120097409635e+02 ||r(i)||/||b|| 1.746009709742e-02 3 KSP unpreconditioned resid norm 4.364091658268e+01 true resid norm 4.364091658575e+01 ||r(i)||/||b|| 3.594054865332e-03 4 KSP unpreconditioned resid norm 2.632671796885e+00 true resid norm 2.632671797020e+00 ||r(i)||/||b|| 2.168141189773e-04 5 KSP unpreconditioned resid norm 2.209213998004e+00 true resid norm 2.209213980361e+00 ||r(i)||/||b|| 1.819401808180e-04 6 KSP unpreconditioned resid norm 4.683775185840e-01 true resid norm 4.683775085753e-01 ||r(i)||/||b|| 3.857330677735e-05 7 KSP unpreconditioned resid norm 3.042503284736e-02 true resid norm 3.042503349258e-02 ||r(i)||/||b|| 2.505658638883e-06 Both methods give answers, but they are different so I am wondering if it is possible that you can help me figure out which part I am doing wrong. Thank you for your time. Kind Regards, Shidi From bsmith at mcs.anl.gov Sat Mar 16 19:05:42 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Sun, 17 Mar 2019 00:05:42 +0000 Subject: [petsc-users] PCFieldSplit gives different results for direct and iterative solver In-Reply-To: <0a59ff504728c756b0965f201a97ab7c@cam.ac.uk> References: <0a59ff504728c756b0965f201a97ab7c@cam.ac.uk> Message-ID: > On Mar 16, 2019, at 6:50 PM, Y. Shidi via petsc-users wrote: > > Hello, > > I am trying to solve the incompressible n-s equations by > PCFieldSplit. > > The large matrix and vectors are formed by MatCreateNest() > and VecCreateNest(). 
> The system is solved directly by the following command: > -ksp_type fgmres \ > -pc_type fieldsplit \ > -pc_fieldsplit_type schur \ > -pc_fieldsplit_schur_fact_type full \ > -ksp_converged_reason \ > -ksp_monitor_true_residual \ > -fieldsplit_0_ksp_type preonly \ > -fieldsplit_0_pc_type cholesky \ > -fieldsplit_0_pc_factor_mat_solver_package mumps \ > -mat_mumps_icntl_28 2 \ > -mat_mumps_icntl_29 2 \ > -fieldsplit_1_ksp_type preonly \ > -fieldsplit_1_pc_type jacobi \ > Output: > 0 KSP unpreconditioned resid norm 1.214252932161e+04 true resid norm 1.214252932161e+04 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 1.642782495109e-02 true resid norm 1.642782495109e-02 ||r(i)||/||b|| 1.352916226594e-06 > Linear solve converged due to CONVERGED_RTOL iterations 1 > > The system is solved iteratively by the following command: > -ksp_type fgmres \ > -pc_type fieldsplit \ > -pc_fieldsplit_type schur \ > -pc_fieldsplit_schur_factorization_type diag \ > -ksp_converged_reason \ > -ksp_monitor_true_residual \ > -fieldsplit_0_ksp_type preonly \ > -fieldsplit_0_pc_type gamg \ > -fieldsplit_1_ksp_type minres \ > -fieldsplit_1_pc_type none \ > Output: > 0 KSP unpreconditioned resid norm 1.214252932161e+04 true resid norm 1.214252932161e+04 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 2.184037364915e+02 true resid norm 2.184037364915e+02 ||r(i)||/||b|| 1.798667565109e-02 > 2 KSP unpreconditioned resid norm 2.120097409539e+02 true resid norm 2.120097409635e+02 ||r(i)||/||b|| 1.746009709742e-02 > 3 KSP unpreconditioned resid norm 4.364091658268e+01 true resid norm 4.364091658575e+01 ||r(i)||/||b|| 3.594054865332e-03 > 4 KSP unpreconditioned resid norm 2.632671796885e+00 true resid norm 2.632671797020e+00 ||r(i)||/||b|| 2.168141189773e-04 > 5 KSP unpreconditioned resid norm 2.209213998004e+00 true resid norm 2.209213980361e+00 ||r(i)||/||b|| 1.819401808180e-04 > 6 KSP unpreconditioned resid norm 4.683775185840e-01 true resid norm 4.683775085753e-01 ||r(i)||/||b|| 3.857330677735e-05 > 7 KSP unpreconditioned resid norm 3.042503284736e-02 true resid norm 3.042503349258e-02 ||r(i)||/||b|| 2.505658638883e-06 > > > Both methods give answers, but they are different What do you mean the answers are different? Do you mean the solution x from KSPSolve() is different? How are you calculating their difference and how different are they? Since the solutions are only approximate (the true residual norms are around 1.642782495109e-02 and 3.042503349258e-02 for the two different solvers), there will only be a certain number of identical digits in the two solutions (which depends on the condition number of the original matrix). You can run both solvers with -ksp_rtol 1.e-12 and then (assuming everything is working correctly) the two solutions will be much closer to each other. Barry > so I am wondering > if it is possible that you can help me figure out which part I am > doing wrong. > > Thank you for your time. > > Kind Regards, > Shidi From mail2amneet at gmail.com Sun Mar 17 22:23:39 2019 From: mail2amneet at gmail.com (Amneet Bhalla) Date: Sun, 17 Mar 2019 20:23:39 -0700 Subject: [petsc-users] Cross-compilation cluster In-Reply-To: References: Message-ID: Apparently the problem is with openmpi 4.0 (or perhaps latest PETSc 3.10.4 not being compatible with it). I switched to openmpi 1.10.7 and PETSc 3.10, which is what I was using on my workstation, and everything works fine. ==== Matt, I tried your suggestion. 
Apparently, Silo when compiled with mpi wrappers (libsilo.a) does not produce link flags, as required by AC_CHECK_LINKFLAGS(silo). When I compile it with serial compiler it does not cause that check failure. I noticed that configure help of openmpi has some options to add some extra link flags. Perhaps it is supposed to circumvent situations like above. On Thu, Mar 14, 2019 at 10:04 AM Amneet Bhalla wrote: > Hmmn, ok. I will try it out. > > On Thu, Mar 14, 2019 at 9:34 AM Matthew Knepley wrote: > >> On Thu, Mar 14, 2019 at 10:05 AM Amneet Bhalla >> wrote: >> >>> I need the serial built of Silo and HDF5 libraries. I am sure that the >>> MPI wrappers are coming from the serial compilers. >>> >>> i.e, If I do >>> >>> gcc ?version and mpicc ?version, I see the same GCC version. >>> >> >> That is in a perfect world. Unfortunately, flags for compilers can make >> incompatible libraries. Its no problem >> to give the MPI compilers to Silo and HDF5, even if you do not want to >> link the libraries. >> >> Matt >> >> >>> On Thu, Mar 14, 2019 at 6:44 AM Matthew Knepley >>> wrote: >>> >>>> It is very dangerous to use different compilers. I would make sure that >>>> all the compilers are the MPI compilers. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> On Thu, Mar 14, 2019 at 8:46 AM Amneet Bhalla >>>> wrote: >>>> >>>>> Ah, Ok. Do serial compilers look OK to you? >>>>> >>>>> Can lib-32 and lib-64 (say -lm) operate simulataneously during >>>>> runtime, or this is my imagination? >>>>> >>>>> >>>>> >>>>> On Thu, Mar 14, 2019 at 5:36 AM Matthew Knepley >>>>> wrote: >>>>> >>>>>> On Thu, Mar 14, 2019 at 8:28 AM Amneet Bhalla >>>>>> wrote: >>>>>> >>>>>>> Matt -- SAMRAI, PETSc, and libMesh configure logs are attached with >>>>>>> this email. Also including some other log files in case they are useful. >>>>>>> >>>>>> >>>>>> Okay, PETSc is not sticking in a /usr/lib (or /usr/lib64). However, I >>>>>> can see that you mpif90 (and perhaps other of the wrappers) are >>>>>> reporting both /usr/lib64 AND /usr/lib flags, and I am guessing that >>>>>> is where the other configures are picking it up. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Thanks, >>>>>>> --Amneet >>>>>>> >>>>>>> On Thu, Mar 14, 2019 at 4:21 AM Matthew Knepley >>>>>>> wrote: >>>>>>> >>>>>>>> In order to see why each flag was included, we need to see >>>>>>>> configure.log. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> On Thu, Mar 14, 2019 at 2:40 AM Amneet Bhalla via petsc-users < >>>>>>>> petsc-users at mcs.anl.gov> wrote: >>>>>>>> >>>>>>>>> Hi Folks, >>>>>>>>> >>>>>>>>> I am on a cluster that has -L/lib dir with 32-bit libraries and >>>>>>>>> -L/lib64 with 64-bit libraries. During compilation of some of >>>>>>>>> libraries required for my code (such as SAMRAI and libMesh) both paths >>>>>>>>> get picked -L/lib and -L/lib64. >>>>>>>>> >>>>>>>>> I am seeing some sporadic behavior in runtime when at some >>>>>>>>> timesteps PETSc does not converge. The same code with the same number of >>>>>>>>> processors run just fine on my workstation that has just 64-bit version of >>>>>>>>> libraries. 
>>>>>>>>> >>>>>>>>> Even during the final linking stage of the executable, the linker >>>>>>>>> gives warnings like >>>>>>>>> >>>>>>>>> ld: skipping incompatible //lib/libm.so when searching for -lm >>>>>>>>> >>>>>>>>> ld: skipping incompatible /lib/libm.so when searching for -lm >>>>>>>>> >>>>>>>>> ld: skipping incompatible /lib/libm.so when searching for -lm >>>>>>>>> >>>>>>>>> ld: skipping incompatible //lib/libpthread.so when searching for >>>>>>>>> -lpthread >>>>>>>>> >>>>>>>>> ld: skipping incompatible /lib/libpthread.so when searching for >>>>>>>>> -lpthread >>>>>>>>> >>>>>>>>> ld: skipping incompatible /lib/libpthread.so when searching for >>>>>>>>> -lpthread >>>>>>>>> >>>>>>>>> ld: skipping incompatible //lib/libdl.so when searching for -ldl >>>>>>>>> >>>>>>>>> ld: skipping incompatible //lib/libc.so when searching for -lc >>>>>>>>> >>>>>>>>> ld: skipping incompatible /lib/libc.so when searching for -lc >>>>>>>>> >>>>>>>>> ld: skipping incompatible /lib/libc.so when searching for -lc >>>>>>>>> but the executable runs. >>>>>>>>> >>>>>>>>> >>>>>>>>> This is during config of SAMRAI when it picks both -L/lib and >>>>>>>>> -L/lib64: >>>>>>>>> >>>>>>>>> checking whether we are using the GNU Fortran 77 compiler... no >>>>>>>>> >>>>>>>>> checking whether ifort accepts -g... yes >>>>>>>>> >>>>>>>>> checking how to get verbose linking output from ifort... -v >>>>>>>>> >>>>>>>>> checking for Fortran 77 libraries of ifort... >>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >>>>>>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/ >>>>>>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64 >>>>>>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/ -L/lib/../lib64 >>>>>>>>> -L/lib/../lib64/ -L/usr/lib/../lib64 -L/usr/lib/../lib64/ >>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4/ >>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/ >>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64/ >>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64/ >>>>>>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../ -L/lib64 -L/lib/ >>>>>>>>> -L/usr/lib64 -L/usr/lib -lifport -lifcoremt -limf -lsvml -lm -lipgo -lirc >>>>>>>>> -lpthread -lgcc -lgcc_s -lirc_s -ldl >>>>>>>>> >>>>>>>>> libMesh is also picking that path >>>>>>>>> >>>>>>>>> libmesh_optional_LIBS............ 
: -lhdf5 -lhdf5_cpp -lz >>>>>>>>> -L/home/asbhalla/softwares/PETSc-BitBucket/PETSc/linux-opt/lib >>>>>>>>> -Wl,-rpath,/home/asbhalla/softwares/PETSc-BitBucket/PETSc/linux-opt/lib >>>>>>>>> -Wl,-rpath,/opt/intel/mkl/lib/intel64 -L/opt/intel/mkl/lib/intel64 >>>>>>>>> -Wl,-rpath,/opt/mellanox/hcoll/lib -L/opt/mellanox/hcoll/lib >>>>>>>>> -Wl,-rpath,/opt/mellanox/mxm/lib -L/opt/mellanox/mxm/lib >>>>>>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >>>>>>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >>>>>>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >>>>>>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >>>>>>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >>>>>>>>> -lpetsc -lHYPRE -lmkl_intel_lp64 -lmkl_sequential -lmkl_core >>>>>>>>> -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lifport >>>>>>>>> -lifcoremt_pic -limf -lsvml -lm -lipgo -lirc -lpthread -lgcc_s -lirc_s >>>>>>>>> -lstdc++ -ldl -L/lib -Wl,-rpath,/lib >>>>>>>>> -Wl,-rpath,/usr/local/mpi/intel/openmpi-4.0.0/lib64 >>>>>>>>> -L/usr/local/mpi/intel/openmpi-4.0.0/lib64 >>>>>>>>> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.5 >>>>>>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5 >>>>>>>>> >>>>>>>>> Perhaps PETSc also picks up both versions (and there is a way to >>>>>>>>> query it from PETSc?), but I can't confirm this. Is there a way to instruct >>>>>>>>> make to select only -L/lib64? I want to rule out that 32-bit dynamic >>>>>>>>> library is not a culprit for the random non-convergence of PETSc solvers >>>>>>>>> and the eventual crash of the simulations. I have tried both gcc-7.3.0 and >>>>>>>>> intel-18 compilers -- but the same thing is happening. >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> --Amneet >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> --Amneet >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> -- >>>>> --Amneet >>>>> >>>>> >>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> -- >>> --Amneet >>> >>> >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- > --Amneet > > > > -- --Amneet -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Mar 17 22:41:14 2019 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 17 Mar 2019 23:41:14 -0400 Subject: [petsc-users] Cross-compilation cluster In-Reply-To: References: Message-ID: On Sun, Mar 17, 2019 at 11:23 PM Amneet Bhalla wrote: > Apparently the problem is with openmpi 4.0 (or perhaps latest PETSc 3.10.4 > not being compatible with it). I switched to openmpi 1.10.7 and PETSc 3.10, > which is what I was using on my workstation, and everything works fine. > > ==== > > Matt, I tried your suggestion. Apparently, Silo when compiled with mpi > wrappers (libsilo.a) does not produce link flags, as required by > AC_CHECK_LINKFLAGS(silo). When I compile it with serial compiler it does > not cause that check failure. > > I noticed that configure help of openmpi has some options to add some > extra link flags. Perhaps it is supposed to circumvent situations like > above. > Long ago I made Silo work with MPI compilers, but we dropped support since no one was using it. Is Silo still supported? Thanks, Matt > On Thu, Mar 14, 2019 at 10:04 AM Amneet Bhalla > wrote: > >> Hmmn, ok. I will try it out. >> >> On Thu, Mar 14, 2019 at 9:34 AM Matthew Knepley >> wrote: >> >>> On Thu, Mar 14, 2019 at 10:05 AM Amneet Bhalla >>> wrote: >>> >>>> I need the serial built of Silo and HDF5 libraries. I am sure that the >>>> MPI wrappers are coming from the serial compilers. >>>> >>>> i.e, If I do >>>> >>>> gcc ?version and mpicc ?version, I see the same GCC version. >>>> >>> >>> That is in a perfect world. Unfortunately, flags for compilers can make >>> incompatible libraries. Its no problem >>> to give the MPI compilers to Silo and HDF5, even if you do not want to >>> link the libraries. >>> >>> Matt >>> >>> >>>> On Thu, Mar 14, 2019 at 6:44 AM Matthew Knepley >>>> wrote: >>>> >>>>> It is very dangerous to use different compilers. I would make sure >>>>> that all the compilers are the MPI compilers. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> On Thu, Mar 14, 2019 at 8:46 AM Amneet Bhalla >>>>> wrote: >>>>> >>>>>> Ah, Ok. Do serial compilers look OK to you? >>>>>> >>>>>> Can lib-32 and lib-64 (say -lm) operate simulataneously during >>>>>> runtime, or this is my imagination? >>>>>> >>>>>> >>>>>> >>>>>> On Thu, Mar 14, 2019 at 5:36 AM Matthew Knepley >>>>>> wrote: >>>>>> >>>>>>> On Thu, Mar 14, 2019 at 8:28 AM Amneet Bhalla >>>>>>> wrote: >>>>>>> >>>>>>>> Matt -- SAMRAI, PETSc, and libMesh configure logs are attached with >>>>>>>> this email. Also including some other log files in case they are useful. >>>>>>>> >>>>>>> >>>>>>> Okay, PETSc is not sticking in a /usr/lib (or /usr/lib64). However, >>>>>>> I can see that you mpif90 (and perhaps other of the wrappers) are >>>>>>> reporting both /usr/lib64 AND /usr/lib flags, and I am guessing that >>>>>>> is where the other configures are picking it up. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> Thanks, >>>>>>>> --Amneet >>>>>>>> >>>>>>>> On Thu, Mar 14, 2019 at 4:21 AM Matthew Knepley >>>>>>>> wrote: >>>>>>>> >>>>>>>>> In order to see why each flag was included, we need to see >>>>>>>>> configure.log. 
>>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>>> On Thu, Mar 14, 2019 at 2:40 AM Amneet Bhalla via petsc-users < >>>>>>>>> petsc-users at mcs.anl.gov> wrote: >>>>>>>>> >>>>>>>>>> Hi Folks, >>>>>>>>>> >>>>>>>>>> I am on a cluster that has -L/lib dir with 32-bit libraries and >>>>>>>>>> -L/lib64 with 64-bit libraries. During compilation of some of >>>>>>>>>> libraries required for my code (such as SAMRAI and libMesh) both paths >>>>>>>>>> get picked -L/lib and -L/lib64. >>>>>>>>>> >>>>>>>>>> I am seeing some sporadic behavior in runtime when at some >>>>>>>>>> timesteps PETSc does not converge. The same code with the same number of >>>>>>>>>> processors run just fine on my workstation that has just 64-bit version of >>>>>>>>>> libraries. >>>>>>>>>> >>>>>>>>>> Even during the final linking stage of the executable, the linker >>>>>>>>>> gives warnings like >>>>>>>>>> >>>>>>>>>> ld: skipping incompatible //lib/libm.so when searching for -lm >>>>>>>>>> >>>>>>>>>> ld: skipping incompatible /lib/libm.so when searching for -lm >>>>>>>>>> >>>>>>>>>> ld: skipping incompatible /lib/libm.so when searching for -lm >>>>>>>>>> >>>>>>>>>> ld: skipping incompatible //lib/libpthread.so when searching for >>>>>>>>>> -lpthread >>>>>>>>>> >>>>>>>>>> ld: skipping incompatible /lib/libpthread.so when searching for >>>>>>>>>> -lpthread >>>>>>>>>> >>>>>>>>>> ld: skipping incompatible /lib/libpthread.so when searching for >>>>>>>>>> -lpthread >>>>>>>>>> >>>>>>>>>> ld: skipping incompatible //lib/libdl.so when searching for -ldl >>>>>>>>>> >>>>>>>>>> ld: skipping incompatible //lib/libc.so when searching for -lc >>>>>>>>>> >>>>>>>>>> ld: skipping incompatible /lib/libc.so when searching for -lc >>>>>>>>>> >>>>>>>>>> ld: skipping incompatible /lib/libc.so when searching for -lc >>>>>>>>>> but the executable runs. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> This is during config of SAMRAI when it picks both -L/lib and >>>>>>>>>> -L/lib64: >>>>>>>>>> >>>>>>>>>> checking whether we are using the GNU Fortran 77 compiler... no >>>>>>>>>> >>>>>>>>>> checking whether ifort accepts -g... yes >>>>>>>>>> >>>>>>>>>> checking how to get verbose linking output from ifort... -v >>>>>>>>>> >>>>>>>>>> checking for Fortran 77 libraries of ifort... 
>>>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >>>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >>>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >>>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >>>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >>>>>>>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/ >>>>>>>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64 >>>>>>>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/ -L/lib/../lib64 >>>>>>>>>> -L/lib/../lib64/ -L/usr/lib/../lib64 -L/usr/lib/../lib64/ >>>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4/ >>>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/ >>>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64/ >>>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64/ >>>>>>>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../ -L/lib64 -L/lib/ >>>>>>>>>> -L/usr/lib64 -L/usr/lib -lifport -lifcoremt -limf -lsvml -lm -lipgo -lirc >>>>>>>>>> -lpthread -lgcc -lgcc_s -lirc_s -ldl >>>>>>>>>> >>>>>>>>>> libMesh is also picking that path >>>>>>>>>> >>>>>>>>>> libmesh_optional_LIBS............ : -lhdf5 -lhdf5_cpp -lz >>>>>>>>>> -L/home/asbhalla/softwares/PETSc-BitBucket/PETSc/linux-opt/lib >>>>>>>>>> -Wl,-rpath,/home/asbhalla/softwares/PETSc-BitBucket/PETSc/linux-opt/lib >>>>>>>>>> -Wl,-rpath,/opt/intel/mkl/lib/intel64 -L/opt/intel/mkl/lib/intel64 >>>>>>>>>> -Wl,-rpath,/opt/mellanox/hcoll/lib -L/opt/mellanox/hcoll/lib >>>>>>>>>> -Wl,-rpath,/opt/mellanox/mxm/lib -L/opt/mellanox/mxm/lib >>>>>>>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >>>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.4 >>>>>>>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >>>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 >>>>>>>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >>>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64 >>>>>>>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >>>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64 >>>>>>>>>> -Wl,-rpath,/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >>>>>>>>>> -L/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin >>>>>>>>>> -lpetsc -lHYPRE -lmkl_intel_lp64 -lmkl_sequential -lmkl_core >>>>>>>>>> -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lifport >>>>>>>>>> -lifcoremt_pic -limf -lsvml -lm -lipgo -lirc -lpthread -lgcc_s -lirc_s >>>>>>>>>> -lstdc++ -ldl -L/lib -Wl,-rpath,/lib >>>>>>>>>> -Wl,-rpath,/usr/local/mpi/intel/openmpi-4.0.0/lib64 >>>>>>>>>> -L/usr/local/mpi/intel/openmpi-4.0.0/lib64 >>>>>>>>>> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.5 >>>>>>>>>> -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5 >>>>>>>>>> >>>>>>>>>> Perhaps PETSc also picks up both versions (and there is a way to >>>>>>>>>> query it from PETSc?), but I can't confirm this. Is there a way to instruct >>>>>>>>>> make to select only -L/lib64? 
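A minimal sketch of how one might check what the linker and loader actually resolve, assuming a standard GNU/Linux toolchain; the paths and the executable name below are placeholders rather than values taken from this cluster:

    # report whether each candidate library is a 32-bit or 64-bit ELF object
    file /lib/libm.so /lib64/libm.so

    # show which shared libraries the finished executable loads at run time
    ldd ./your_executable

    # with Open MPI, ask the wrapper which link flags and search paths it injects
    mpif90 --showme:link

For autoconf-based packages such as SAMRAI or libMesh, the usual way to steer the search order toward the 64-bit directories is to pass LDFLAGS=-L/usr/lib64 on the configure line, although the exact variable each configure script honors can differ.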
I want to rule out that 32-bit dynamic >>>>>>>>>> library is not a culprit for the random non-convergence of PETSc solvers >>>>>>>>>> and the eventual crash of the simulations. I have tried both gcc-7.3.0 and >>>>>>>>>> intel-18 compilers -- but the same thing is happening. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> --Amneet >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>> experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> --Amneet >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>> -- >>>>>> --Amneet >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> -- >>>> --Amneet >>>> >>>> >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> -- >> --Amneet >> >> >> >> -- > --Amneet > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From srossi at email.unc.edu Mon Mar 18 14:14:55 2019 From: srossi at email.unc.edu (Rossi, Simone) Date: Mon, 18 Mar 2019 19:14:55 +0000 Subject: [petsc-users] BJACOBI with FIELDSPLIT Message-ID: Dear all, I'm debugging my application in which I'm trying to use the FIELDSPLIT preconditioner for solving a 2x2 block matrix. Currently I'm testing the preconditioner on a decoupled system where I solve two identical and independent Poisson problems. Using the default fieldsplit type (multiplicative), I'm expecting the method to be equivalent to a Block Jacobi solver. Setting -ksp_rtol 1e-6 while using gmres/hypre on each subblock with -fieldsplit_0_ksp_rtol 1e-12 -fieldsplit_1_ksp_rtol 1e-12 I'm expecting to converge in 1 iteration with a single solve for each block. Asking to output the iteration count for the subblocks with -ksp_converged_reason -fieldsplit_0_ksp_converged_reason -fieldsplit_1_ksp_converged_reason revealed that the outer solver converges in 1 iteration, but each block is solved for 3 times. 
This is the output I get: Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 7 0 KSP preconditioned resid norm 9.334948012657e+01 true resid norm 1.280164130222e+02 ||r(i)||/||b|| 1.000000000000e+00 Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 7 Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 7 1 KSP preconditioned resid norm 1.518151977611e-11 true resid norm 8.123270435936e-12 ||r(i)||/||b|| 6.345491366429e-14 Linear solve converged due to CONVERGED_RTOL iterations 1 Are the subblocks actually solved for multiple times at every outer iteration? Thanks for the help, Simone -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Mar 18 14:27:13 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Mon, 18 Mar 2019 19:27:13 +0000 Subject: [petsc-users] BJACOBI with FIELDSPLIT In-Reply-To: References: Message-ID: <3AFF5052-A97B-4820-B9F4-D19775112519@anl.gov> Simone, This is indeed surprising, given the block structure of the matrix and the exact block solves we'd expect the solver to converge after the application of the preconditioner. Please send the output of -ksp_view Barry Also if you are willing to share your test code we can try running it to determine why it doesn't converge immediately. > On Mar 18, 2019, at 2:14 PM, Rossi, Simone via petsc-users wrote: > > Dear all, > I'm debugging my application in which I'm trying to use the FIELDSPLIT preconditioner for solving a 2x2 block matrix. > > Currently I'm testing the preconditioner on a decoupled system where I solve two identical and independent Poisson problems. Using the default fieldsplit type (multiplicative), I'm expecting the method to be equivalent to a Block Jacobi solver. > Setting > -ksp_rtol 1e-6 > while using gmres/hypre on each subblock with > -fieldsplit_0_ksp_rtol 1e-12 > -fieldsplit_1_ksp_rtol 1e-12 > I'm expecting to converge in 1 iteration with a single solve for each block. > > Asking to output the iteration count for the subblocks with > -ksp_converged_reason > -fieldsplit_0_ksp_converged_reason > -fieldsplit_1_ksp_converged_reason > revealed that the outer solver converges in 1 iteration, but each block is solved for 3 times. > This is the output I get: > > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 7 > 0 KSP preconditioned resid norm 9.334948012657e+01 true resid norm 1.280164130222e+02 ||r(i)||/||b|| 1.000000000000e+00 > > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 7 > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 7 > 1 KSP preconditioned resid norm 1.518151977611e-11 true resid norm 8.123270435936e-12 ||r(i)||/||b|| 6.345491366429e-14 > > Linear solve converged due to CONVERGED_RTOL iterations 1 > > > Are the subblocks actually solved for multiple times at every outer iteration? 
> > Thanks for the help, > > Simone From srossi at email.unc.edu Mon Mar 18 14:33:35 2019 From: srossi at email.unc.edu (Rossi, Simone) Date: Mon, 18 Mar 2019 19:33:35 +0000 Subject: [petsc-users] BJACOBI with FIELDSPLIT In-Reply-To: <3AFF5052-A97B-4820-B9F4-D19775112519@anl.gov> References: , <3AFF5052-A97B-4820-B9F4-D19775112519@anl.gov> Message-ID: Thanks Barry. Let me know if you can spot anything out of the ksp_view KSP Object: 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=5000, nonzero initial guess tolerances: relative=0.001, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: fieldsplit FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 2 Solver info for each split is in the following KSP objects: Split number 0 Fields 0 KSP Object: (fieldsplit_0_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_0_) 1 MPI processes type: hypre HYPRE BoomerAMG preconditioning Cycle type V Maximum number of levels 25 Maximum number of iterations PER hypre call 1 Convergence tolerance PER hypre call 0. Threshold for strong coupling 0.25 Interpolation truncation factor 0. Interpolation: max elements per row 0 Number of levels of aggressive coarsening 0 Number of paths for aggressive coarsening 1 Maximum row sums 0.9 Sweeps down 1 Sweeps up 1 Sweeps on coarse 1 Relax down symmetric-SOR/Jacobi Relax up symmetric-SOR/Jacobi Relax on coarse Gaussian-elimination Relax weight (all) 1. Outer relax weight (all) 1. Using CF-relaxation Not using more complex smoothers. Measure type local Coarsen type Falgout Interpolation type classical linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: seqaij rows=35937, cols=35937 total: nonzeros=912673, allocated nonzeros=912673 total number of mallocs used during MatSetValues calls =0 not using I-node routines Split number 1 Fields 1 KSP Object: (fieldsplit_1_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_1_) 1 MPI processes type: hypre HYPRE BoomerAMG preconditioning Cycle type V Maximum number of levels 25 Maximum number of iterations PER hypre call 1 Convergence tolerance PER hypre call 0. Threshold for strong coupling 0.25 Interpolation truncation factor 0. Interpolation: max elements per row 0 Number of levels of aggressive coarsening 0 Number of paths for aggressive coarsening 1 Maximum row sums 0.9 Sweeps down 1 Sweeps up 1 Sweeps on coarse 1 Relax down symmetric-SOR/Jacobi Relax up symmetric-SOR/Jacobi Relax on coarse Gaussian-elimination Relax weight (all) 1. Outer relax weight (all) 1. Using CF-relaxation Not using more complex smoothers. 
Measure type local Coarsen type Falgout Interpolation type classical linear system matrix = precond matrix: Mat Object: (fieldsplit_1_) 1 MPI processes type: seqaij rows=35937, cols=35937 total: nonzeros=912673, allocated nonzeros=912673 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: () 1 MPI processes type: seqaij rows=71874, cols=71874 total: nonzeros=3650692, allocated nonzeros=3650692 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 35937 nodes, limit used is 5 ________________________________ From: Smith, Barry F. Sent: Monday, March 18, 2019 3:27:13 PM To: Rossi, Simone Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] BJACOBI with FIELDSPLIT Simone, This is indeed surprising, given the block structure of the matrix and the exact block solves we'd expect the solver to converge after the application of the preconditioner. Please send the output of -ksp_view Barry Also if you are willing to share your test code we can try running it to determine why it doesn't converge immediately. > On Mar 18, 2019, at 2:14 PM, Rossi, Simone via petsc-users wrote: > > Dear all, > I'm debugging my application in which I'm trying to use the FIELDSPLIT preconditioner for solving a 2x2 block matrix. > > Currently I'm testing the preconditioner on a decoupled system where I solve two identical and independent Poisson problems. Using the default fieldsplit type (multiplicative), I'm expecting the method to be equivalent to a Block Jacobi solver. > Setting > -ksp_rtol 1e-6 > while using gmres/hypre on each subblock with > -fieldsplit_0_ksp_rtol 1e-12 > -fieldsplit_1_ksp_rtol 1e-12 > I'm expecting to converge in 1 iteration with a single solve for each block. > > Asking to output the iteration count for the subblocks with > -ksp_converged_reason > -fieldsplit_0_ksp_converged_reason > -fieldsplit_1_ksp_converged_reason > revealed that the outer solver converges in 1 iteration, but each block is solved for 3 times. > This is the output I get: > > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 7 > 0 KSP preconditioned resid norm 9.334948012657e+01 true resid norm 1.280164130222e+02 ||r(i)||/||b|| 1.000000000000e+00 > > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 7 > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 7 > 1 KSP preconditioned resid norm 1.518151977611e-11 true resid norm 8.123270435936e-12 ||r(i)||/||b|| 6.345491366429e-14 > > Linear solve converged due to CONVERGED_RTOL iterations 1 > > > Are the subblocks actually solved for multiple times at every outer iteration? > > Thanks for the help, > > Simone -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Mon Mar 18 14:38:34 2019 From: jychang48 at gmail.com (Justin Chang) Date: Mon, 18 Mar 2019 13:38:34 -0600 Subject: [petsc-users] BJACOBI with FIELDSPLIT In-Reply-To: References: <3AFF5052-A97B-4820-B9F4-D19775112519@anl.gov> Message-ID: Use -ksp_type fgmres if your inner ksp solvers are gmres. Maybe that will help? On Mon, Mar 18, 2019 at 1:33 PM Rossi, Simone via petsc-users < petsc-users at mcs.anl.gov> wrote: > Thanks Barry. 
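For reference, the solver configuration shown in the -ksp_view above can be driven entirely from the options database. A minimal sketch of the corresponding option set, using the tolerances quoted at the start of the thread (the view above was produced with a looser outer tolerance); how the two splits are defined in the application code is not shown, so -pc_fieldsplit_block_size 2 is an assumption based on the "blocksize = 2" line in the view:

    -ksp_type gmres -ksp_rtol 1e-6 -ksp_converged_reason
    -pc_type fieldsplit -pc_fieldsplit_type multiplicative -pc_fieldsplit_block_size 2
    -fieldsplit_0_ksp_type gmres -fieldsplit_0_pc_type hypre
    -fieldsplit_0_ksp_rtol 1e-12 -fieldsplit_0_ksp_converged_reason
    -fieldsplit_1_ksp_type gmres -fieldsplit_1_pc_type hypre
    -fieldsplit_1_ksp_rtol 1e-12 -fieldsplit_1_ksp_converged_reason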
> > Let me know if you can spot anything out of the ksp_view > > > KSP Object: 1 MPI processes > > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > > maximum iterations=5000, nonzero initial guess > > tolerances: relative=0.001, absolute=1e-50, divergence=10000. > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > PC Object: 1 MPI processes > > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, > blocksize = 2 > > Solver info for each split is in the following KSP objects: > > Split number 0 Fields 0 > > KSP Object: (fieldsplit_0_) 1 MPI processes > > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > PC Object: (fieldsplit_0_) 1 MPI processes > > type: hypre > > HYPRE BoomerAMG preconditioning > > Cycle type V > > Maximum number of levels 25 > > Maximum number of iterations PER hypre call 1 > > Convergence tolerance PER hypre call 0. > > Threshold for strong coupling 0.25 > > Interpolation truncation factor 0. > > Interpolation: max elements per row 0 > > Number of levels of aggressive coarsening 0 > > Number of paths for aggressive coarsening 1 > > Maximum row sums 0.9 > > Sweeps down 1 > > Sweeps up 1 > > Sweeps on coarse 1 > > Relax down symmetric-SOR/Jacobi > > Relax up symmetric-SOR/Jacobi > > Relax on coarse Gaussian-elimination > > Relax weight (all) 1. > > Outer relax weight (all) 1. > > Using CF-relaxation > > Not using more complex smoothers. > > Measure type local > > Coarsen type Falgout > > Interpolation type classical > > linear system matrix = precond matrix: > > Mat Object: (fieldsplit_0_) 1 MPI processes > > type: seqaij > > rows=35937, cols=35937 > > total: nonzeros=912673, allocated nonzeros=912673 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > Split number 1 Fields 1 > > KSP Object: (fieldsplit_1_) 1 MPI processes > > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > PC Object: (fieldsplit_1_) 1 MPI processes > > type: hypre > > HYPRE BoomerAMG preconditioning > > Cycle type V > > Maximum number of levels 25 > > Maximum number of iterations PER hypre call 1 > > Convergence tolerance PER hypre call 0. > > Threshold for strong coupling 0.25 > > Interpolation truncation factor 0. > > Interpolation: max elements per row 0 > > Number of levels of aggressive coarsening 0 > > Number of paths for aggressive coarsening 1 > > Maximum row sums 0.9 > > Sweeps down 1 > > Sweeps up 1 > > Sweeps on coarse 1 > > Relax down symmetric-SOR/Jacobi > > Relax up symmetric-SOR/Jacobi > > Relax on coarse Gaussian-elimination > > Relax weight (all) 1. > > Outer relax weight (all) 1. > > Using CF-relaxation > > Not using more complex smoothers. 
> > Measure type local > > Coarsen type Falgout > > Interpolation type classical > > linear system matrix = precond matrix: > > Mat Object: (fieldsplit_1_) 1 MPI processes > > type: seqaij > > rows=35937, cols=35937 > > total: nonzeros=912673, allocated nonzeros=912673 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Mat Object: () 1 MPI processes > > type: seqaij > > rows=71874, cols=71874 > > total: nonzeros=3650692, allocated nonzeros=3650692 > > total number of mallocs used during MatSetValues calls =0 > > using I-node routines: found 35937 nodes, limit used is 5 > > > ------------------------------ > *From:* Smith, Barry F. > *Sent:* Monday, March 18, 2019 3:27:13 PM > *To:* Rossi, Simone > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] BJACOBI with FIELDSPLIT > > > Simone, > > This is indeed surprising, given the block structure of the matrix and > the exact block solves we'd expect the solver to converge after the > application of the preconditioner. Please send the output of -ksp_view > > Barry > > Also if you are willing to share your test code we can try running it to > determine why it doesn't converge immediately. > > > > On Mar 18, 2019, at 2:14 PM, Rossi, Simone via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > > Dear all, > > I'm debugging my application in which I'm trying to use the FIELDSPLIT > preconditioner for solving a 2x2 block matrix. > > > > Currently I'm testing the preconditioner on a decoupled system where I > solve two identical and independent Poisson problems. Using the default > fieldsplit type (multiplicative), I'm expecting the method to be equivalent > to a Block Jacobi solver. > > Setting > > -ksp_rtol 1e-6 > > while using gmres/hypre on each subblock with > > -fieldsplit_0_ksp_rtol 1e-12 > > -fieldsplit_1_ksp_rtol 1e-12 > > I'm expecting to converge in 1 iteration with a single solve for each > block. > > > > Asking to output the iteration count for the subblocks with > > -ksp_converged_reason > > -fieldsplit_0_ksp_converged_reason > > -fieldsplit_1_ksp_converged_reason > > revealed that the outer solver converges in 1 iteration, but each block > is solved for 3 times. > > This is the output I get: > > > > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 > > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 7 > > 0 KSP preconditioned resid norm 9.334948012657e+01 true resid norm > 1.280164130222e+02 ||r(i)||/||b|| 1.000000000000e+00 > > > > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 > > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 7 > > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 > > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 7 > > 1 KSP preconditioned resid norm 1.518151977611e-11 true resid norm > 8.123270435936e-12 ||r(i)||/||b|| 6.345491366429e-14 > > > > Linear solve converged due to CONVERGED_RTOL iterations 1 > > > > > > Are the subblocks actually solved for multiple times at every outer > iteration? > > > > Thanks for the help, > > > > Simone > > -------------- next part -------------- An HTML attachment was scrubbed... 
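Justin's suggestion changes only the outer Krylov method. FGMRES is the flexible variant of GMRES: it tolerates a preconditioner that changes from one application to the next, which is the situation whenever the preconditioner itself contains inner GMRES solves driven to a tolerance. A minimal sketch of the change, keeping the block solvers as before:

    -ksp_type fgmres
    -fieldsplit_0_ksp_type gmres -fieldsplit_0_pc_type hypre -fieldsplit_0_ksp_rtol 1e-12
    -fieldsplit_1_ksp_type gmres -fieldsplit_1_pc_type hypre -fieldsplit_1_ksp_rtol 1e-12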
URL: From srossi at email.unc.edu Mon Mar 18 14:43:04 2019 From: srossi at email.unc.edu (Rossi, Simone) Date: Mon, 18 Mar 2019 19:43:04 +0000 Subject: [petsc-users] BJACOBI with FIELDSPLIT In-Reply-To: References: <3AFF5052-A97B-4820-B9F4-D19775112519@anl.gov> , Message-ID: Thanks, using fgmres it does work as expected. I thought gmres would do the same since I'm solving the subblocks "exactly". Simone ________________________________ From: Justin Chang Sent: Monday, March 18, 2019 3:38:34 PM To: Rossi, Simone Cc: Smith, Barry F.; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] BJACOBI with FIELDSPLIT Use -ksp_type fgmres if your inner ksp solvers are gmres. Maybe that will help? On Mon, Mar 18, 2019 at 1:33 PM Rossi, Simone via petsc-users > wrote: Thanks Barry. Let me know if you can spot anything out of the ksp_view KSP Object: 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=5000, nonzero initial guess tolerances: relative=0.001, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: fieldsplit FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 2 Solver info for each split is in the following KSP objects: Split number 0 Fields 0 KSP Object: (fieldsplit_0_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_0_) 1 MPI processes type: hypre HYPRE BoomerAMG preconditioning Cycle type V Maximum number of levels 25 Maximum number of iterations PER hypre call 1 Convergence tolerance PER hypre call 0. Threshold for strong coupling 0.25 Interpolation truncation factor 0. Interpolation: max elements per row 0 Number of levels of aggressive coarsening 0 Number of paths for aggressive coarsening 1 Maximum row sums 0.9 Sweeps down 1 Sweeps up 1 Sweeps on coarse 1 Relax down symmetric-SOR/Jacobi Relax up symmetric-SOR/Jacobi Relax on coarse Gaussian-elimination Relax weight (all) 1. Outer relax weight (all) 1. Using CF-relaxation Not using more complex smoothers. Measure type local Coarsen type Falgout Interpolation type classical linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: seqaij rows=35937, cols=35937 total: nonzeros=912673, allocated nonzeros=912673 total number of mallocs used during MatSetValues calls =0 not using I-node routines Split number 1 Fields 1 KSP Object: (fieldsplit_1_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_1_) 1 MPI processes type: hypre HYPRE BoomerAMG preconditioning Cycle type V Maximum number of levels 25 Maximum number of iterations PER hypre call 1 Convergence tolerance PER hypre call 0. Threshold for strong coupling 0.25 Interpolation truncation factor 0. 
Interpolation: max elements per row 0 Number of levels of aggressive coarsening 0 Number of paths for aggressive coarsening 1 Maximum row sums 0.9 Sweeps down 1 Sweeps up 1 Sweeps on coarse 1 Relax down symmetric-SOR/Jacobi Relax up symmetric-SOR/Jacobi Relax on coarse Gaussian-elimination Relax weight (all) 1. Outer relax weight (all) 1. Using CF-relaxation Not using more complex smoothers. Measure type local Coarsen type Falgout Interpolation type classical linear system matrix = precond matrix: Mat Object: (fieldsplit_1_) 1 MPI processes type: seqaij rows=35937, cols=35937 total: nonzeros=912673, allocated nonzeros=912673 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: () 1 MPI processes type: seqaij rows=71874, cols=71874 total: nonzeros=3650692, allocated nonzeros=3650692 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 35937 nodes, limit used is 5 ________________________________ From: Smith, Barry F. > Sent: Monday, March 18, 2019 3:27:13 PM To: Rossi, Simone Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] BJACOBI with FIELDSPLIT Simone, This is indeed surprising, given the block structure of the matrix and the exact block solves we'd expect the solver to converge after the application of the preconditioner. Please send the output of -ksp_view Barry Also if you are willing to share your test code we can try running it to determine why it doesn't converge immediately. > On Mar 18, 2019, at 2:14 PM, Rossi, Simone via petsc-users > wrote: > > Dear all, > I'm debugging my application in which I'm trying to use the FIELDSPLIT preconditioner for solving a 2x2 block matrix. > > Currently I'm testing the preconditioner on a decoupled system where I solve two identical and independent Poisson problems. Using the default fieldsplit type (multiplicative), I'm expecting the method to be equivalent to a Block Jacobi solver. > Setting > -ksp_rtol 1e-6 > while using gmres/hypre on each subblock with > -fieldsplit_0_ksp_rtol 1e-12 > -fieldsplit_1_ksp_rtol 1e-12 > I'm expecting to converge in 1 iteration with a single solve for each block. > > Asking to output the iteration count for the subblocks with > -ksp_converged_reason > -fieldsplit_0_ksp_converged_reason > -fieldsplit_1_ksp_converged_reason > revealed that the outer solver converges in 1 iteration, but each block is solved for 3 times. > This is the output I get: > > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 7 > 0 KSP preconditioned resid norm 9.334948012657e+01 true resid norm 1.280164130222e+02 ||r(i)||/||b|| 1.000000000000e+00 > > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 7 > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 7 > 1 KSP preconditioned resid norm 1.518151977611e-11 true resid norm 8.123270435936e-12 ||r(i)||/||b|| 6.345491366429e-14 > > Linear solve converged due to CONVERGED_RTOL iterations 1 > > > Are the subblocks actually solved for multiple times at every outer iteration? > > Thanks for the help, > > Simone -------------- next part -------------- An HTML attachment was scrubbed... 
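For completeness, the same kind of solver can also be assembled in code rather than purely from the command line. A minimal sketch in C, assuming a matrix with two interleaved fields (block size 2); the function name is illustrative, and this is not necessarily how the application discussed above sets things up:

    #include <petscksp.h>

    /* Outer FGMRES wrapping a multiplicative fieldsplit preconditioner.
       Sub-solver choices (hypre, inner tolerances) are still read from the
       options database, e.g. -fieldsplit_0_pc_type hypre. */
    PetscErrorCode SolveTwoFieldSystem(Mat A, Vec b, Vec x)
    {
      KSP            ksp;
      PC             pc;
      PetscErrorCode ierr;

      PetscFunctionBeginUser;
      ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
      ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
      ierr = KSPSetType(ksp, KSPFGMRES);CHKERRQ(ierr);
      ierr = KSPSetTolerances(ksp, 1e-6, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT);CHKERRQ(ierr);
      ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
      ierr = PCSetType(pc, PCFIELDSPLIT);CHKERRQ(ierr);
      ierr = PCFieldSplitSetType(pc, PC_COMPOSITE_MULTIPLICATIVE);CHKERRQ(ierr);
      ierr = PCFieldSplitSetBlockSize(pc, 2);CHKERRQ(ierr);
      ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); /* honors -fieldsplit_* options */
      ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
      ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }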
URL: From srossi at email.unc.edu Mon Mar 18 14:56:12 2019 From: srossi at email.unc.edu (Rossi, Simone) Date: Mon, 18 Mar 2019 19:56:12 +0000 Subject: [petsc-users] BJACOBI with FIELDSPLIT In-Reply-To: References: <3AFF5052-A97B-4820-B9F4-D19775112519@anl.gov> , , Message-ID: To follow up on that: when would you want to use gmres instead of fgmres in the outer ksp? Thanks again for the help, Simone ________________________________ From: Rossi, Simone Sent: Monday, March 18, 2019 3:43:04 PM To: Justin Chang Cc: Smith, Barry F.; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] BJACOBI with FIELDSPLIT Thanks, using fgmres it does work as expected. I thought gmres would do the same since I'm solving the subblocks "exactly". Simone ________________________________ From: Justin Chang Sent: Monday, March 18, 2019 3:38:34 PM To: Rossi, Simone Cc: Smith, Barry F.; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] BJACOBI with FIELDSPLIT Use -ksp_type fgmres if your inner ksp solvers are gmres. Maybe that will help? On Mon, Mar 18, 2019 at 1:33 PM Rossi, Simone via petsc-users > wrote: Thanks Barry. Let me know if you can spot anything out of the ksp_view KSP Object: 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=5000, nonzero initial guess tolerances: relative=0.001, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: fieldsplit FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 2 Solver info for each split is in the following KSP objects: Split number 0 Fields 0 KSP Object: (fieldsplit_0_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_0_) 1 MPI processes type: hypre HYPRE BoomerAMG preconditioning Cycle type V Maximum number of levels 25 Maximum number of iterations PER hypre call 1 Convergence tolerance PER hypre call 0. Threshold for strong coupling 0.25 Interpolation truncation factor 0. Interpolation: max elements per row 0 Number of levels of aggressive coarsening 0 Number of paths for aggressive coarsening 1 Maximum row sums 0.9 Sweeps down 1 Sweeps up 1 Sweeps on coarse 1 Relax down symmetric-SOR/Jacobi Relax up symmetric-SOR/Jacobi Relax on coarse Gaussian-elimination Relax weight (all) 1. Outer relax weight (all) 1. Using CF-relaxation Not using more complex smoothers. Measure type local Coarsen type Falgout Interpolation type classical linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: seqaij rows=35937, cols=35937 total: nonzeros=912673, allocated nonzeros=912673 total number of mallocs used during MatSetValues calls =0 not using I-node routines Split number 1 Fields 1 KSP Object: (fieldsplit_1_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. 
left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_1_) 1 MPI processes type: hypre HYPRE BoomerAMG preconditioning Cycle type V Maximum number of levels 25 Maximum number of iterations PER hypre call 1 Convergence tolerance PER hypre call 0. Threshold for strong coupling 0.25 Interpolation truncation factor 0. Interpolation: max elements per row 0 Number of levels of aggressive coarsening 0 Number of paths for aggressive coarsening 1 Maximum row sums 0.9 Sweeps down 1 Sweeps up 1 Sweeps on coarse 1 Relax down symmetric-SOR/Jacobi Relax up symmetric-SOR/Jacobi Relax on coarse Gaussian-elimination Relax weight (all) 1. Outer relax weight (all) 1. Using CF-relaxation Not using more complex smoothers. Measure type local Coarsen type Falgout Interpolation type classical linear system matrix = precond matrix: Mat Object: (fieldsplit_1_) 1 MPI processes type: seqaij rows=35937, cols=35937 total: nonzeros=912673, allocated nonzeros=912673 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: () 1 MPI processes type: seqaij rows=71874, cols=71874 total: nonzeros=3650692, allocated nonzeros=3650692 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 35937 nodes, limit used is 5 ________________________________ From: Smith, Barry F. > Sent: Monday, March 18, 2019 3:27:13 PM To: Rossi, Simone Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] BJACOBI with FIELDSPLIT Simone, This is indeed surprising, given the block structure of the matrix and the exact block solves we'd expect the solver to converge after the application of the preconditioner. Please send the output of -ksp_view Barry Also if you are willing to share your test code we can try running it to determine why it doesn't converge immediately. > On Mar 18, 2019, at 2:14 PM, Rossi, Simone via petsc-users > wrote: > > Dear all, > I'm debugging my application in which I'm trying to use the FIELDSPLIT preconditioner for solving a 2x2 block matrix. > > Currently I'm testing the preconditioner on a decoupled system where I solve two identical and independent Poisson problems. Using the default fieldsplit type (multiplicative), I'm expecting the method to be equivalent to a Block Jacobi solver. > Setting > -ksp_rtol 1e-6 > while using gmres/hypre on each subblock with > -fieldsplit_0_ksp_rtol 1e-12 > -fieldsplit_1_ksp_rtol 1e-12 > I'm expecting to converge in 1 iteration with a single solve for each block. > > Asking to output the iteration count for the subblocks with > -ksp_converged_reason > -fieldsplit_0_ksp_converged_reason > -fieldsplit_1_ksp_converged_reason > revealed that the outer solver converges in 1 iteration, but each block is solved for 3 times. 
> This is the output I get: > > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 7 > 0 KSP preconditioned resid norm 9.334948012657e+01 true resid norm 1.280164130222e+02 ||r(i)||/||b|| 1.000000000000e+00 > > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 7 > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 7 > 1 KSP preconditioned resid norm 1.518151977611e-11 true resid norm 8.123270435936e-12 ||r(i)||/||b|| 6.345491366429e-14 > > Linear solve converged due to CONVERGED_RTOL iterations 1 > > > Are the subblocks actually solved for multiple times at every outer iteration? > > Thanks for the help, > > Simone -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Mar 18 14:57:37 2019 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 18 Mar 2019 15:57:37 -0400 Subject: [petsc-users] BJACOBI with FIELDSPLIT In-Reply-To: References: Message-ID: On Mon, Mar 18, 2019 at 3:18 PM Rossi, Simone via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear all, > > I'm debugging my application in which I'm trying to use the FIELDSPLIT > preconditioner for solving a 2x2 block matrix. > > > Currently I'm testing the preconditioner on a decoupled system where I > solve two identical and independent Poisson problems. Using the default > fieldsplit type (multiplicative), I'm expecting the method to be equivalent > to a Block Jacobi solver. > > Setting > > -ksp_rtol 1e-6 > > while using gmres/hypre on each subblock with > > -fieldsplit_0_ksp_rtol 1e-12 > > -fieldsplit_1_ksp_rtol 1e-12 > > I'm expecting to converge in 1 iteration with a single solve for each > block. > > > Asking to output the iteration count for the subblocks with > > -ksp_converged_reason > -fieldsplit_0_ksp_converged_reason > > -fieldsplit_1_ksp_converged_reason > > revealed that the outer solver converges in 1 iteration, but each block is > solved for 3 times. > > This is the output I get: > > > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 7 > This first application of the PC is to evaluate the initial preconditioned residual. > 0 KSP preconditioned resid norm 9.334948012657e+01 true resid norm > 1.280164130222e+02 ||r(i)||/||b|| 1.000000000000e+00 > > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 7 > This next one is for applying M^{-1} A in the Krylov iteration. > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 7 > I think this one might be from building the true residual. Matt > 1 KSP preconditioned resid norm 1.518151977611e-11 true resid norm > 8.123270435936e-12 ||r(i)||/||b|| 6.345491366429e-14 > > Linear solve converged due to CONVERGED_RTOL iterations 1 > > > > Are the subblocks actually solved for multiple times at every outer > iteration? > > > Thanks for the help, > > > Simone > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Mar 18 14:58:39 2019 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 18 Mar 2019 15:58:39 -0400 Subject: [petsc-users] BJACOBI with FIELDSPLIT In-Reply-To: References: <3AFF5052-A97B-4820-B9F4-D19775112519@anl.gov> Message-ID: On Mon, Mar 18, 2019 at 3:56 PM Rossi, Simone via petsc-users < petsc-users at mcs.anl.gov> wrote: > To follow up on that: when would you want to use gmres instead of fgmres > in the outer ksp? > The difference here is just that FGMRES is right-preconditioned by default, so you do not get the extra application. I think if you use the regular monitor, -ksp_monitor, you will not see 2 applications. Matt > Thanks again for the help, > > Simone > ------------------------------ > *From:* Rossi, Simone > *Sent:* Monday, March 18, 2019 3:43:04 PM > *To:* Justin Chang > *Cc:* Smith, Barry F.; petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] BJACOBI with FIELDSPLIT > > > Thanks, using fgmres it does work as expected. > > I thought gmres would do the same since I'm solving the subblocks > "exactly". > > > Simone > > > ------------------------------ > *From:* Justin Chang > *Sent:* Monday, March 18, 2019 3:38:34 PM > *To:* Rossi, Simone > *Cc:* Smith, Barry F.; petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] BJACOBI with FIELDSPLIT > > Use -ksp_type fgmres if your inner ksp solvers are gmres. Maybe that will > help? > > On Mon, Mar 18, 2019 at 1:33 PM Rossi, Simone via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Thanks Barry. > > Let me know if you can spot anything out of the ksp_view > > > KSP Object: 1 MPI processes > > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > > maximum iterations=5000, nonzero initial guess > > tolerances: relative=0.001, absolute=1e-50, divergence=10000. > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > PC Object: 1 MPI processes > > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, > blocksize = 2 > > Solver info for each split is in the following KSP objects: > > Split number 0 Fields 0 > > KSP Object: (fieldsplit_0_) 1 MPI processes > > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > PC Object: (fieldsplit_0_) 1 MPI processes > > type: hypre > > HYPRE BoomerAMG preconditioning > > Cycle type V > > Maximum number of levels 25 > > Maximum number of iterations PER hypre call 1 > > Convergence tolerance PER hypre call 0. > > Threshold for strong coupling 0.25 > > Interpolation truncation factor 0. > > Interpolation: max elements per row 0 > > Number of levels of aggressive coarsening 0 > > Number of paths for aggressive coarsening 1 > > Maximum row sums 0.9 > > Sweeps down 1 > > Sweeps up 1 > > Sweeps on coarse 1 > > Relax down symmetric-SOR/Jacobi > > Relax up symmetric-SOR/Jacobi > > Relax on coarse Gaussian-elimination > > Relax weight (all) 1. > > Outer relax weight (all) 1. > > Using CF-relaxation > > Not using more complex smoothers. 
> > Measure type local > > Coarsen type Falgout > > Interpolation type classical > > linear system matrix = precond matrix: > > Mat Object: (fieldsplit_0_) 1 MPI processes > > type: seqaij > > rows=35937, cols=35937 > > total: nonzeros=912673, allocated nonzeros=912673 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > Split number 1 Fields 1 > > KSP Object: (fieldsplit_1_) 1 MPI processes > > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > PC Object: (fieldsplit_1_) 1 MPI processes > > type: hypre > > HYPRE BoomerAMG preconditioning > > Cycle type V > > Maximum number of levels 25 > > Maximum number of iterations PER hypre call 1 > > Convergence tolerance PER hypre call 0. > > Threshold for strong coupling 0.25 > > Interpolation truncation factor 0. > > Interpolation: max elements per row 0 > > Number of levels of aggressive coarsening 0 > > Number of paths for aggressive coarsening 1 > > Maximum row sums 0.9 > > Sweeps down 1 > > Sweeps up 1 > > Sweeps on coarse 1 > > Relax down symmetric-SOR/Jacobi > > Relax up symmetric-SOR/Jacobi > > Relax on coarse Gaussian-elimination > > Relax weight (all) 1. > > Outer relax weight (all) 1. > > Using CF-relaxation > > Not using more complex smoothers. > > Measure type local > > Coarsen type Falgout > > Interpolation type classical > > linear system matrix = precond matrix: > > Mat Object: (fieldsplit_1_) 1 MPI processes > > type: seqaij > > rows=35937, cols=35937 > > total: nonzeros=912673, allocated nonzeros=912673 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Mat Object: () 1 MPI processes > > type: seqaij > > rows=71874, cols=71874 > > total: nonzeros=3650692, allocated nonzeros=3650692 > > total number of mallocs used during MatSetValues calls =0 > > using I-node routines: found 35937 nodes, limit used is 5 > > > ------------------------------ > *From:* Smith, Barry F. > *Sent:* Monday, March 18, 2019 3:27:13 PM > *To:* Rossi, Simone > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] BJACOBI with FIELDSPLIT > > > Simone, > > This is indeed surprising, given the block structure of the matrix and > the exact block solves we'd expect the solver to converge after the > application of the preconditioner. Please send the output of -ksp_view > > Barry > > Also if you are willing to share your test code we can try running it to > determine why it doesn't converge immediately. > > > > On Mar 18, 2019, at 2:14 PM, Rossi, Simone via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > > Dear all, > > I'm debugging my application in which I'm trying to use the FIELDSPLIT > preconditioner for solving a 2x2 block matrix. > > > > Currently I'm testing the preconditioner on a decoupled system where I > solve two identical and independent Poisson problems. Using the default > fieldsplit type (multiplicative), I'm expecting the method to be equivalent > to a Block Jacobi solver. > > Setting > > -ksp_rtol 1e-6 > > while using gmres/hypre on each subblock with > > -fieldsplit_0_ksp_rtol 1e-12 > > -fieldsplit_1_ksp_rtol 1e-12 > > I'm expecting to converge in 1 iteration with a single solve for each > block. 
> > > > Asking to output the iteration count for the subblocks with > > -ksp_converged_reason > > -fieldsplit_0_ksp_converged_reason > > -fieldsplit_1_ksp_converged_reason > > revealed that the outer solver converges in 1 iteration, but each block > is solved for 3 times. > > This is the output I get: > > > > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 > > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 7 > > 0 KSP preconditioned resid norm 9.334948012657e+01 true resid norm > 1.280164130222e+02 ||r(i)||/||b|| 1.000000000000e+00 > > > > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 > > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 7 > > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 > > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 7 > > 1 KSP preconditioned resid norm 1.518151977611e-11 true resid norm > 8.123270435936e-12 ||r(i)||/||b|| 6.345491366429e-14 > > > > Linear solve converged due to CONVERGED_RTOL iterations 1 > > > > > > Are the subblocks actually solved for multiple times at every outer > iteration? > > > > Thanks for the help, > > > > Simone > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From colin.cotter at imperial.ac.uk Mon Mar 18 15:14:48 2019 From: colin.cotter at imperial.ac.uk (Cotter, Colin J) Date: Mon, 18 Mar 2019 20:14:48 +0000 Subject: [petsc-users] Confusing Schur preconditioner behaviour Message-ID: Dear petsc-users, I'm solving a 2x2 block system, for which I can construct the Schur complement analytically (through compatible FEM stuff), which I can pass as the preconditioning matrix. When using gmres on the outer iteration, and preonly+lu on the inner iterations with a Schur complement preconditioner, I see convergence in 1 iteration as expected. However, when I set gmres+lu on the inner iteration for S, I see several iterations. This seems strange to me, as the first result seems to confirm that I have an exact Schur complement, but the second result implies not. What could be going on here? I've appended output to the bottom of this message, first the preonly+lu and then for gmres+lu. all the best --Colin Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Residual norms for firedrake_0_ solve. 0 KSP preconditioned resid norm 4.985448866758e+00 true resid norm 1.086016610848e-03 ||r(i)||/||b|| 1.000000000000e+00 Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 1.245615753306e-13 true resid norm 2.082000915439e-14 ||r(i)||/||b|| 1.917098591903e-11 KSP Object: (firedrake_0_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-07, absolute=1e-50, divergence=10000. 
left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (firedrake_0_) 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from A11 Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=6144, cols=6144 package used to perform factorization: petsc total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: seqaij rows=6144, cols=6144 total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 5.09173 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=15360, cols=15360 package used to perform factorization: petsc total: nonzeros=1360836, allocated nonzeros=1360836 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 linear system matrix followed by preconditioner matrix: Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: schurcomplement rows=15360, cols=15360 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: seqaij rows=15360, cols=15360 total: nonzeros=267264, allocated nonzeros=267264 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 A10 Mat Object: 1 MPI processes type: seqaij rows=15360, cols=6144 total: nonzeros=73728, allocated nonzeros=73728 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 KSP of A00 KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 1. 
Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=6144, cols=6144 package used to perform factorization: petsc total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: seqaij rows=6144, cols=6144 total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: seqaij rows=6144, cols=15360 total: nonzeros=73728, allocated nonzeros=73728 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: seqaij rows=15360, cols=15360 total: nonzeros=267264, allocated nonzeros=267264 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 linear system matrix followed by preconditioner matrix: Mat Object: (firedrake_0_) 1 MPI processes type: nest rows=21504, cols=21504 Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : type=seqaij, rows=15360, cols=15360 (0,1) : type=seqaij, rows=15360, cols=6144 (1,0) : type=seqaij, rows=6144, cols=15360 (1,1) : type=seqaij, rows=6144, cols=6144 Mat Object: (firedrake_0_) 1 MPI processes type: nest rows=21504, cols=21504 Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : prefix="firedrake_0_fieldsplit_1_", type=seqaij, rows=15360, cols=15360 (0,1) : type=seqaij, rows=15360, cols=6144 (1,0) : type=seqaij, rows=6144, cols=15360 (1,1) : prefix="firedrake_0_fieldsplit_0_", type=seqaij, rows=6144, cols=6144 ===== Residual norms for firedrake_0_fieldsplit_1_ solve. 0 KSP preconditioned resid norm 8.819238435108e-02 true resid norm 1.797571993221e-01 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.025167319984e-02 true resid norm 3.340583874349e-02 ||r(i)||/||b|| 1.858386694356e-01 2 KSP preconditioned resid norm 1.235104644359e-03 true resid norm 8.148396804822e-03 ||r(i)||/||b|| 4.533001646417e-02 3 KSP preconditioned resid norm 1.624748553125e-04 true resid norm 1.612221957927e-03 ||r(i)||/||b|| 8.968886720573e-03 4 KSP preconditioned resid norm 2.233373761266e-05 true resid norm 3.292437172839e-04 ||r(i)||/||b|| 1.831602397710e-03 5 KSP preconditioned resid norm 1.895393184017e-06 true resid norm 4.091207337005e-05 ||r(i)||/||b|| 2.275962994770e-04 6 KSP preconditioned resid norm 1.699212495729e-07 true resid norm 3.851173419652e-06 ||r(i)||/||b|| 2.142430697728e-05 Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 6 KSP Object: (firedrake_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (firedrake_0_) 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from A11 Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
left preconditioning using NONE norm type for convergence test PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=6144, cols=6144 package used to perform factorization: petsc total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: seqaij rows=6144, cols=6144 total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 5.09173 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=15360, cols=15360 package used to perform factorization: petsc total: nonzeros=1360836, allocated nonzeros=1360836 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 linear system matrix followed by preconditioner matrix: Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: schurcomplement rows=15360, cols=15360 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: seqaij rows=15360, cols=15360 total: nonzeros=267264, allocated nonzeros=267264 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 A10 Mat Object: 1 MPI processes type: seqaij rows=15360, cols=6144 total: nonzeros=73728, allocated nonzeros=73728 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 KSP of A00 KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 1. 
Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=6144, cols=6144 package used to perform factorization: petsc total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: seqaij rows=6144, cols=6144 total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: seqaij rows=6144, cols=15360 total: nonzeros=73728, allocated nonzeros=73728 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: seqaij rows=15360, cols=15360 total: nonzeros=267264, allocated nonzeros=267264 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 linear system matrix followed by preconditioner matrix: Mat Object: (firedrake_0_) 1 MPI processes type: nest rows=21504, cols=21504 Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : type=seqaij, rows=15360, cols=15360 (0,1) : type=seqaij, rows=15360, cols=6144 (1,0) : type=seqaij, rows=6144, cols=15360 (1,1) : type=seqaij, rows=6144, cols=6144 Mat Object: (firedrake_0_) 1 MPI processes type: nest rows=21504, cols=21504 Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : prefix="firedrake_0_fieldsplit_1_", type=seqaij, rows=15360, cols=15360 (0,1) : type=seqaij, rows=15360, cols=6144 (1,0) : type=seqaij, rows=6144, cols=15360 (1,1) : prefix="firedrake_0_fieldsplit_0_", type=seqaij, rows=6144, cols=6144 -------------- next part -------------- An HTML attachment was scrubbed... URL: From colin.cotter at imperial.ac.uk Mon Mar 18 15:16:35 2019 From: colin.cotter at imperial.ac.uk (Cotter, Colin J) Date: Mon, 18 Mar 2019 20:16:35 +0000 Subject: [petsc-users] Confusing Schur preconditioner behaviour In-Reply-To: References: Message-ID: Sorry, just to clarify, in the second case I see several *inner* iterations, even though I'm using LU on a supposedly exact Schur complement as the preconditioner for the Schur system. ________________________________ From: petsc-users on behalf of Cotter, Colin J via petsc-users Sent: 18 March 2019 20:14:48 To: petsc-users at mcs.anl.gov Subject: [petsc-users] Confusing Schur preconditioner behaviour Dear petsc-users, I'm solving a 2x2 block system, for which I can construct the Schur complement analytically (through compatible FEM stuff), which I can pass as the preconditioning matrix. When using gmres on the outer iteration, and preonly+lu on the inner iterations with a Schur complement preconditioner, I see convergence in 1 iteration as expected. However, when I set gmres+lu on the inner iteration for S, I see several iterations. This seems strange to me, as the first result seems to confirm that I have an exact Schur complement, but the second result implies not. What could be going on here? I've appended output to the bottom of this message, first the preonly+lu and then for gmres+lu. all the best --Colin Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Residual norms for firedrake_0_ solve. 
0 KSP preconditioned resid norm 4.985448866758e+00 true resid norm 1.086016610848e-03 ||r(i)||/||b|| 1.000000000000e+00 Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 1.245615753306e-13 true resid norm 2.082000915439e-14 ||r(i)||/||b|| 1.917098591903e-11 KSP Object: (firedrake_0_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-07, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (firedrake_0_) 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from A11 Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=6144, cols=6144 package used to perform factorization: petsc total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: seqaij rows=6144, cols=6144 total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
left preconditioning using NONE norm type for convergence test PC Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 5.09173 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=15360, cols=15360 package used to perform factorization: petsc total: nonzeros=1360836, allocated nonzeros=1360836 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 linear system matrix followed by preconditioner matrix: Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: schurcomplement rows=15360, cols=15360 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: seqaij rows=15360, cols=15360 total: nonzeros=267264, allocated nonzeros=267264 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 A10 Mat Object: 1 MPI processes type: seqaij rows=15360, cols=6144 total: nonzeros=73728, allocated nonzeros=73728 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 KSP of A00 KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=6144, cols=6144 package used to perform factorization: petsc total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: seqaij rows=6144, cols=6144 total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: seqaij rows=6144, cols=15360 total: nonzeros=73728, allocated nonzeros=73728 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: seqaij rows=15360, cols=15360 total: nonzeros=267264, allocated nonzeros=267264 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 linear system matrix followed by preconditioner matrix: Mat Object: (firedrake_0_) 1 MPI processes type: nest rows=21504, cols=21504 Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : type=seqaij, rows=15360, cols=15360 (0,1) : type=seqaij, rows=15360, cols=6144 (1,0) : type=seqaij, rows=6144, cols=15360 (1,1) : type=seqaij, rows=6144, cols=6144 Mat Object: (firedrake_0_) 1 MPI processes type: nest rows=21504, cols=21504 Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : prefix="firedrake_0_fieldsplit_1_", type=seqaij, rows=15360, cols=15360 (0,1) : type=seqaij, rows=15360, cols=6144 (1,0) : type=seqaij, rows=6144, cols=15360 (1,1) : prefix="firedrake_0_fieldsplit_0_", type=seqaij, rows=6144, cols=6144 ===== Residual norms for firedrake_0_fieldsplit_1_ 
solve. 0 KSP preconditioned resid norm 8.819238435108e-02 true resid norm 1.797571993221e-01 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.025167319984e-02 true resid norm 3.340583874349e-02 ||r(i)||/||b|| 1.858386694356e-01 2 KSP preconditioned resid norm 1.235104644359e-03 true resid norm 8.148396804822e-03 ||r(i)||/||b|| 4.533001646417e-02 3 KSP preconditioned resid norm 1.624748553125e-04 true resid norm 1.612221957927e-03 ||r(i)||/||b|| 8.968886720573e-03 4 KSP preconditioned resid norm 2.233373761266e-05 true resid norm 3.292437172839e-04 ||r(i)||/||b|| 1.831602397710e-03 5 KSP preconditioned resid norm 1.895393184017e-06 true resid norm 4.091207337005e-05 ||r(i)||/||b|| 2.275962994770e-04 6 KSP preconditioned resid norm 1.699212495729e-07 true resid norm 3.851173419652e-06 ||r(i)||/||b|| 2.142430697728e-05 Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 6 KSP Object: (firedrake_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (firedrake_0_) 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from A11 Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=6144, cols=6144 package used to perform factorization: petsc total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: seqaij rows=6144, cols=6144 total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 5.09173 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=15360, cols=15360 package used to perform factorization: petsc total: nonzeros=1360836, allocated nonzeros=1360836 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 linear system matrix followed by preconditioner matrix: Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: schurcomplement rows=15360, cols=15360 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: seqaij rows=15360, cols=15360 total: nonzeros=267264, allocated nonzeros=267264 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 A10 Mat Object: 1 MPI processes type: seqaij rows=15360, cols=6144 total: nonzeros=73728, allocated nonzeros=73728 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 KSP of A00 KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=6144, cols=6144 package used to perform factorization: petsc total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: seqaij rows=6144, cols=6144 total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: seqaij rows=6144, cols=15360 total: nonzeros=73728, allocated nonzeros=73728 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: seqaij rows=15360, cols=15360 total: nonzeros=267264, allocated nonzeros=267264 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 linear system matrix followed by preconditioner matrix: Mat Object: (firedrake_0_) 1 MPI processes type: nest rows=21504, cols=21504 Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : type=seqaij, rows=15360, cols=15360 (0,1) : type=seqaij, rows=15360, cols=6144 (1,0) : type=seqaij, rows=6144, cols=15360 (1,1) : type=seqaij, rows=6144, cols=6144 Mat Object: (firedrake_0_) 1 MPI processes type: nest rows=21504, cols=21504 Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : prefix="firedrake_0_fieldsplit_1_", type=seqaij, rows=15360, cols=15360 (0,1) : type=seqaij, rows=15360, cols=6144 (1,0) : type=seqaij, rows=6144, cols=15360 (1,1) : prefix="firedrake_0_fieldsplit_0_", type=seqaij, rows=6144, cols=6144 -------------- next part -------------- 
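For reference, the configuration under discussion can be spelled out as a Firedrake-style solver_parameters dictionary. The following is a minimal sketch, not code taken from this thread: the option names are inferred from the prefixes visible in the ksp_view output above (the outer "firedrake_0_" prefix is attached by Firedrake itself and is therefore omitted), the dictionary name is illustrative, and how the analytic Schur complement form is supplied as the preconditioning operator is assumed rather than shown.

# Sketch of the Schur-complement fieldsplit options discussed above.
# Assumptions: Firedrake adds the solver prefix and builds the field ISes from
# the mixed space; the preconditioning matrix whose (1,1) block is the analytic
# Schur complement is attached to the variational problem separately.
schur_params = {
    "ksp_type": "gmres",                          # outer Krylov method
    "pc_type": "fieldsplit",
    "pc_fieldsplit_type": "schur",
    "pc_fieldsplit_schur_fact_type": "full",      # "factorization FULL" in the ksp_view
    "pc_fieldsplit_schur_precondition": "a11",    # precondition S with the (1,1) block of
                                                  # the preconditioning matrix
    "fieldsplit_0_ksp_type": "preonly",           # exact solve on the A00 block
    "fieldsplit_0_pc_type": "lu",
    "fieldsplit_1_ksp_type": "preonly",           # first case; set to "gmres" for the second
    "fieldsplit_1_pc_type": "lu",
}

With "fieldsplit_1_ksp_type": "preonly" the Schur step is a single LU solve with the supplied form, which corresponds to the first log above; changing it to "gmres" makes the inner solver iterate on the schurcomplement operator S = A11 - A10 inv(A00) A01 shown in the ksp_view, preconditioned by that same LU factorization, which is the configuration that produces the inner iterations being asked about.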
An HTML attachment was scrubbed... URL: From srossi at email.unc.edu Mon Mar 18 15:17:27 2019 From: srossi at email.unc.edu (Rossi, Simone) Date: Mon, 18 Mar 2019 20:17:27 +0000 Subject: [petsc-users] BJACOBI with FIELDSPLIT In-Reply-To: References: <3AFF5052-A97B-4820-B9F4-D19775112519@anl.gov> , Message-ID: Got it. I could use -pc_pc_side to get the same with gmres: now it makes sense. Thanks Simone ________________________________ From: Matthew Knepley Sent: Monday, March 18, 2019 3:58:39 PM To: Rossi, Simone Cc: Justin Chang; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] BJACOBI with FIELDSPLIT On Mon, Mar 18, 2019 at 3:56 PM Rossi, Simone via petsc-users > wrote: To follow up on that: when would you want to use gmres instead of fgmres in the outer ksp? The difference here is just that FGMRES is right-preconditioned by default, so you do not get the extra application. I think if you use the regular monitor, -ksp_monitor, you will not see 2 applications. Matt Thanks again for the help, Simone ________________________________ From: Rossi, Simone Sent: Monday, March 18, 2019 3:43:04 PM To: Justin Chang Cc: Smith, Barry F.; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] BJACOBI with FIELDSPLIT Thanks, using fgmres it does work as expected. I thought gmres would do the same since I'm solving the subblocks "exactly". Simone ________________________________ From: Justin Chang > Sent: Monday, March 18, 2019 3:38:34 PM To: Rossi, Simone Cc: Smith, Barry F.; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] BJACOBI with FIELDSPLIT Use -ksp_type fgmres if your inner ksp solvers are gmres. Maybe that will help? On Mon, Mar 18, 2019 at 1:33 PM Rossi, Simone via petsc-users > wrote: Thanks Barry. Let me know if you can spot anything out of the ksp_view KSP Object: 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=5000, nonzero initial guess tolerances: relative=0.001, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: fieldsplit FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 2 Solver info for each split is in the following KSP objects: Split number 0 Fields 0 KSP Object: (fieldsplit_0_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_0_) 1 MPI processes type: hypre HYPRE BoomerAMG preconditioning Cycle type V Maximum number of levels 25 Maximum number of iterations PER hypre call 1 Convergence tolerance PER hypre call 0. Threshold for strong coupling 0.25 Interpolation truncation factor 0. Interpolation: max elements per row 0 Number of levels of aggressive coarsening 0 Number of paths for aggressive coarsening 1 Maximum row sums 0.9 Sweeps down 1 Sweeps up 1 Sweeps on coarse 1 Relax down symmetric-SOR/Jacobi Relax up symmetric-SOR/Jacobi Relax on coarse Gaussian-elimination Relax weight (all) 1. Outer relax weight (all) 1. Using CF-relaxation Not using more complex smoothers. 
Measure type local Coarsen type Falgout Interpolation type classical linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: seqaij rows=35937, cols=35937 total: nonzeros=912673, allocated nonzeros=912673 total number of mallocs used during MatSetValues calls =0 not using I-node routines Split number 1 Fields 1 KSP Object: (fieldsplit_1_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_1_) 1 MPI processes type: hypre HYPRE BoomerAMG preconditioning Cycle type V Maximum number of levels 25 Maximum number of iterations PER hypre call 1 Convergence tolerance PER hypre call 0. Threshold for strong coupling 0.25 Interpolation truncation factor 0. Interpolation: max elements per row 0 Number of levels of aggressive coarsening 0 Number of paths for aggressive coarsening 1 Maximum row sums 0.9 Sweeps down 1 Sweeps up 1 Sweeps on coarse 1 Relax down symmetric-SOR/Jacobi Relax up symmetric-SOR/Jacobi Relax on coarse Gaussian-elimination Relax weight (all) 1. Outer relax weight (all) 1. Using CF-relaxation Not using more complex smoothers. Measure type local Coarsen type Falgout Interpolation type classical linear system matrix = precond matrix: Mat Object: (fieldsplit_1_) 1 MPI processes type: seqaij rows=35937, cols=35937 total: nonzeros=912673, allocated nonzeros=912673 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: () 1 MPI processes type: seqaij rows=71874, cols=71874 total: nonzeros=3650692, allocated nonzeros=3650692 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 35937 nodes, limit used is 5 ________________________________ From: Smith, Barry F. > Sent: Monday, March 18, 2019 3:27:13 PM To: Rossi, Simone Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] BJACOBI with FIELDSPLIT Simone, This is indeed surprising, given the block structure of the matrix and the exact block solves we'd expect the solver to converge after the application of the preconditioner. Please send the output of -ksp_view Barry Also if you are willing to share your test code we can try running it to determine why it doesn't converge immediately. > On Mar 18, 2019, at 2:14 PM, Rossi, Simone via petsc-users > wrote: > > Dear all, > I'm debugging my application in which I'm trying to use the FIELDSPLIT preconditioner for solving a 2x2 block matrix. > > Currently I'm testing the preconditioner on a decoupled system where I solve two identical and independent Poisson problems. Using the default fieldsplit type (multiplicative), I'm expecting the method to be equivalent to a Block Jacobi solver. > Setting > -ksp_rtol 1e-6 > while using gmres/hypre on each subblock with > -fieldsplit_0_ksp_rtol 1e-12 > -fieldsplit_1_ksp_rtol 1e-12 > I'm expecting to converge in 1 iteration with a single solve for each block. > > Asking to output the iteration count for the subblocks with > -ksp_converged_reason > -fieldsplit_0_ksp_converged_reason > -fieldsplit_1_ksp_converged_reason > revealed that the outer solver converges in 1 iteration, but each block is solved for 3 times. 
> This is the output I get: > > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 7 > 0 KSP preconditioned resid norm 9.334948012657e+01 true resid norm 1.280164130222e+02 ||r(i)||/||b|| 1.000000000000e+00 > > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 7 > Linear fieldsplit_0_ solve converged due to CONVERGED_RTOL iterations 7 > Linear fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 7 > 1 KSP preconditioned resid norm 1.518151977611e-11 true resid norm 8.123270435936e-12 ||r(i)||/||b|| 6.345491366429e-14 > > Linear solve converged due to CONVERGED_RTOL iterations 1 > > > Are the subblocks actually solved for multiple times at every outer iteration? > > Thanks for the help, > > Simone -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Mon Mar 18 16:01:17 2019 From: jychang48 at gmail.com (Justin Chang) Date: Mon, 18 Mar 2019 15:01:17 -0600 Subject: [petsc-users] Confusing Schur preconditioner behaviour In-Reply-To: References: Message-ID: Colin, 1) What equations are you solving? 2) In your second case, you set the outer ksp to preonly, thus we are unable to see the ksp_monitor for the (firedrake_0_) solver. Set it to gmres and see if you have a similar output to your first case: 0 KSP preconditioned resid norm 4.985448866758e+00 true resid norm 1.086016610848e-03 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.245615753306e-13 true resid norm 2.082000915439e-14 ||r(i)||/||b|| 1.917098591903e-11 Because according to the first ksp_view output, after one lu sweep for the (firedrake_0_fieldsplit_1_) solver. That is, going from: 0 KSP preconditioned resid norm 8.819238435108e-02 true resid norm 1.797571993221e-01 ||r(i)||/||b|| 1.000000000000e+00 to 1 KSP preconditioned resid norm 1.025167319984e-02 true resid norm 3.340583874349e-02 ||r(i)||/||b|| 1.858386694356e-01 appeared to give you an exact schur complement. Justin On Mon, Mar 18, 2019 at 2:18 PM Cotter, Colin J via petsc-users < petsc-users at mcs.anl.gov> wrote: > Sorry, just to clarify, in the second case I see several *inner* > iterations, even though I'm using LU on a supposedly exact Schur complement > as the preconditioner for the Schur system. > ------------------------------ > *From:* petsc-users on behalf of > Cotter, Colin J via petsc-users > *Sent:* 18 March 2019 20:14:48 > *To:* petsc-users at mcs.anl.gov > *Subject:* [petsc-users] Confusing Schur preconditioner behaviour > > > Dear petsc-users, > > I'm solving a 2x2 block system, for which I can construct the Schur > complement analytically (through compatible FEM stuff), > > which I can pass as the preconditioning matrix. > > > When using gmres on the outer iteration, and preonly+lu on the inner > iterations with a Schur complement preconditioner, > > I see convergence in 1 iteration as expected. However, when I set gmres+lu > on the inner iteration for S, I see several iterations. > > > This seems strange to me, as the first result seems to confirm that I have > an exact Schur complement, but the second result > > implies not. > > What could be going on here? 
> > > I've appended output to the bottom of this message, first the preonly+lu > and then for gmres+lu. > > > all the best > > --Colin > > > Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_ITS > iterations 1 > Residual norms for firedrake_0_ solve. > 0 KSP preconditioned resid norm 4.985448866758e+00 true resid norm > 1.086016610848e-03 ||r(i)||/||b|| 1.000000000000e+00 > Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_ITS > iterations 1 > 1 KSP preconditioned resid norm 1.245615753306e-13 true resid norm > 2.082000915439e-14 ||r(i)||/||b|| 1.917098591903e-11 > KSP Object: (firedrake_0_) 1 MPI processes > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-07, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: (firedrake_0_) 1 MPI processes > type: fieldsplit > FieldSplit with Schur preconditioner, factorization FULL > Preconditioner for the Schur complement formed from A11 > Split info: > Split number 0 Defined by IS > Split number 1 Defined by IS > KSP solver for A00 block > KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 1. > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=6144, cols=6144 > package used to perform factorization: petsc > total: nonzeros=18432, allocated nonzeros=18432 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2048 nodes, limit used is 5 > linear system matrix = precond matrix: > Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: seqaij > rows=6144, cols=6144 > total: nonzeros=18432, allocated nonzeros=18432 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2048 nodes, limit used is 5 > KSP solver for S = A11 - A10 inv(A00) A01 > KSP Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> left preconditioning > using NONE norm type for convergence test > PC Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 5.09173 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=15360, cols=15360 > package used to perform factorization: petsc > total: nonzeros=1360836, allocated nonzeros=1360836 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 5120 nodes, limit used is 5 > linear system matrix followed by preconditioner matrix: > Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: schurcomplement > rows=15360, cols=15360 > Schur complement A11 - A10 inv(A00) A01 > A11 > Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: seqaij > rows=15360, cols=15360 > total: nonzeros=267264, allocated nonzeros=267264 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 5120 nodes, limit used is 5 > A10 > Mat Object: 1 MPI processes > type: seqaij > rows=15360, cols=6144 > total: nonzeros=73728, allocated nonzeros=73728 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 5120 nodes, limit used is 5 > KSP of A00 > KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 1. 
> Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=6144, cols=6144 > package used to perform factorization: petsc > total: nonzeros=18432, allocated nonzeros=18432 > total number of mallocs used during MatSetValues > calls =0 > using I-node routines: found 2048 nodes, limit > used is 5 > linear system matrix = precond matrix: > Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: seqaij > rows=6144, cols=6144 > total: nonzeros=18432, allocated nonzeros=18432 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2048 nodes, limit used is > 5 > A01 > Mat Object: 1 MPI processes > type: seqaij > rows=6144, cols=15360 > total: nonzeros=73728, allocated nonzeros=73728 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2048 nodes, limit used is 5 > Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: seqaij > rows=15360, cols=15360 > total: nonzeros=267264, allocated nonzeros=267264 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 5120 nodes, limit used is 5 > linear system matrix followed by preconditioner matrix: > Mat Object: (firedrake_0_) 1 MPI processes > type: nest > rows=21504, cols=21504 > Matrix object: > type=nest, rows=2, cols=2 > MatNest structure: > (0,0) : type=seqaij, rows=15360, cols=15360 > (0,1) : type=seqaij, rows=15360, cols=6144 > (1,0) : type=seqaij, rows=6144, cols=15360 > (1,1) : type=seqaij, rows=6144, cols=6144 > Mat Object: (firedrake_0_) 1 MPI processes > type: nest > rows=21504, cols=21504 > Matrix object: > type=nest, rows=2, cols=2 > MatNest structure: > (0,0) : prefix="firedrake_0_fieldsplit_1_", type=seqaij, > rows=15360, cols=15360 > (0,1) : type=seqaij, rows=15360, cols=6144 > (1,0) : type=seqaij, rows=6144, cols=15360 > (1,1) : prefix="firedrake_0_fieldsplit_0_", type=seqaij, > rows=6144, cols=6144 > > ===== > > > Residual norms for firedrake_0_fieldsplit_1_ solve. > 0 KSP preconditioned resid norm 8.819238435108e-02 true resid norm > 1.797571993221e-01 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 1.025167319984e-02 true resid norm > 3.340583874349e-02 ||r(i)||/||b|| 1.858386694356e-01 > 2 KSP preconditioned resid norm 1.235104644359e-03 true resid norm > 8.148396804822e-03 ||r(i)||/||b|| 4.533001646417e-02 > 3 KSP preconditioned resid norm 1.624748553125e-04 true resid norm > 1.612221957927e-03 ||r(i)||/||b|| 8.968886720573e-03 > 4 KSP preconditioned resid norm 2.233373761266e-05 true resid norm > 3.292437172839e-04 ||r(i)||/||b|| 1.831602397710e-03 > 5 KSP preconditioned resid norm 1.895393184017e-06 true resid norm > 4.091207337005e-05 ||r(i)||/||b|| 2.275962994770e-04 > 6 KSP preconditioned resid norm 1.699212495729e-07 true resid norm > 3.851173419652e-06 ||r(i)||/||b|| 2.142430697728e-05 > Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_RTOL > iterations 6 > KSP Object: (firedrake_0_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> left preconditioning > using NONE norm type for convergence test > PC Object: (firedrake_0_) 1 MPI processes > type: fieldsplit > FieldSplit with Schur preconditioner, factorization FULL > Preconditioner for the Schur complement formed from A11 > Split info: > Split number 0 Defined by IS > Split number 1 Defined by IS > KSP solver for A00 block > KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 1. > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=6144, cols=6144 > package used to perform factorization: petsc > total: nonzeros=18432, allocated nonzeros=18432 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2048 nodes, limit used is 5 > linear system matrix = precond matrix: > Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: seqaij > rows=6144, cols=6144 > total: nonzeros=18432, allocated nonzeros=18432 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2048 nodes, limit used is 5 > KSP solver for S = A11 - A10 inv(A00) A01 > KSP Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 5.09173 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=15360, cols=15360 > package used to perform factorization: petsc > total: nonzeros=1360836, allocated nonzeros=1360836 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 5120 nodes, limit used is 5 > linear system matrix followed by preconditioner matrix: > Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: schurcomplement > rows=15360, cols=15360 > Schur complement A11 - A10 inv(A00) A01 > A11 > Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: seqaij > rows=15360, cols=15360 > total: nonzeros=267264, allocated nonzeros=267264 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 5120 nodes, limit used is 5 > A10 > Mat Object: 1 MPI processes > type: seqaij > rows=15360, cols=6144 > total: nonzeros=73728, allocated nonzeros=73728 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 5120 nodes, limit used is 5 > KSP of A00 > KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. 
> left preconditioning > using NONE norm type for convergence test > PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 1. > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=6144, cols=6144 > package used to perform factorization: petsc > total: nonzeros=18432, allocated nonzeros=18432 > total number of mallocs used during MatSetValues > calls =0 > using I-node routines: found 2048 nodes, limit > used is 5 > linear system matrix = precond matrix: > Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: seqaij > rows=6144, cols=6144 > total: nonzeros=18432, allocated nonzeros=18432 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2048 nodes, limit used is > 5 > A01 > Mat Object: 1 MPI processes > type: seqaij > rows=6144, cols=15360 > total: nonzeros=73728, allocated nonzeros=73728 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2048 nodes, limit used is 5 > Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: seqaij > rows=15360, cols=15360 > total: nonzeros=267264, allocated nonzeros=267264 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 5120 nodes, limit used is 5 > linear system matrix followed by preconditioner matrix: > Mat Object: (firedrake_0_) 1 MPI processes > type: nest > rows=21504, cols=21504 > Matrix object: > type=nest, rows=2, cols=2 > MatNest structure: > (0,0) : type=seqaij, rows=15360, cols=15360 > (0,1) : type=seqaij, rows=15360, cols=6144 > (1,0) : type=seqaij, rows=6144, cols=15360 > (1,1) : type=seqaij, rows=6144, cols=6144 > Mat Object: (firedrake_0_) 1 MPI processes > type: nest > rows=21504, cols=21504 > Matrix object: > type=nest, rows=2, cols=2 > MatNest structure: > (0,0) : prefix="firedrake_0_fieldsplit_1_", type=seqaij, > rows=15360, cols=15360 > (0,1) : type=seqaij, rows=15360, cols=6144 > (1,0) : type=seqaij, rows=6144, cols=15360 > (1,1) : prefix="firedrake_0_fieldsplit_0_", type=seqaij, > rows=6144, cols=6144 > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From colin.cotter at imperial.ac.uk Mon Mar 18 16:33:35 2019 From: colin.cotter at imperial.ac.uk (Cotter, Colin J) Date: Mon, 18 Mar 2019 21:33:35 +0000 Subject: [petsc-users] Confusing Schur preconditioner behaviour In-Reply-To: References: , Message-ID: Hi Justin, 1) Here is some UFL for my mixed system (u in BDM2, h in DG1) a = ( inner(ai*u,v) - dt*inner(f*perp(u),v) + dt*inner(g*h,div(v)) +inner(ai*h,phi) - dt*inner(H*div(u),phi) )*dx Which then has the exact Schur complement (when using an H(div)-L2 element pair aP = (inner(ai*u,v) + dt**2*inner(g*H*div(u)/ai,div(v)) - dt*inner(f*perp(u),v))*dx 2) I put FGMRES on the outer iteration and appended the result to the bottom of this message. all the best --Colin Residual norms for firedrake_0_ solve. 0 KSP unpreconditioned resid norm 1.086016610848e-03 true resid norm 1.086016610848e-03 ||r(i)||/||b|| 1.000000000000e+00 Residual norms for firedrake_0_fieldsplit_1_ solve. 
0 KSP preconditioned resid norm 8.120721494511e+01 true resid norm 1.655197512879e+02 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 9.439702024300e+00 true resid norm 3.075997034466e+01 ||r(i)||/||b|| 1.858386694356e-01 2 KSP preconditioned resid norm 1.137279699060e+00 true resid norm 7.503013051026e+00 ||r(i)||/||b|| 4.533001646417e-02 3 KSP preconditioned resid norm 1.496062340941e-01 true resid norm 1.484527899318e+00 ||r(i)||/||b|| 8.968886720568e-03 4 KSP preconditioned resid norm 2.056482137526e-02 true resid norm 3.031663733277e-01 ||r(i)||/||b|| 1.831602397713e-03 5 KSP preconditioned resid norm 1.745270896489e-03 true resid norm 3.767168288302e-02 ||r(i)||/||b|| 2.275962994742e-04 6 KSP preconditioned resid norm 1.564628458491e-04 true resid norm 3.546145962849e-03 ||r(i)||/||b|| 2.142430698002e-05 Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 6 1 KSP unpreconditioned resid norm 1.085836886832e-03 true resid norm 1.085836886832e-03 ||r(i)||/||b|| 9.998345108044e-01 Residual norms for firedrake_0_fieldsplit_1_ solve. 0 KSP preconditioned resid norm 3.877459187203e-01 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 4.065088730543e-02 true resid norm 1.756656283999e-01 ||r(i)||/||b|| 1.756656283999e-01 2 KSP preconditioned resid norm 5.708009085053e-03 true resid norm 4.422956647739e-02 ||r(i)||/||b|| 4.422956647739e-02 3 KSP preconditioned resid norm 7.309033373261e-04 true resid norm 8.666099848662e-03 ||r(i)||/||b|| 8.666099848662e-03 4 KSP preconditioned resid norm 1.133694903547e-04 true resid norm 1.609496315342e-03 ||r(i)||/||b|| 1.609496315342e-03 5 KSP preconditioned resid norm 1.227169910863e-05 true resid norm 2.710100172099e-04 ||r(i)||/||b|| 2.710100172099e-04 6 KSP preconditioned resid norm 1.047216213052e-06 true resid norm 2.489323919135e-05 ||r(i)||/||b|| 2.489323919135e-05 Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 6 2 KSP unpreconditioned resid norm 1.075476649437e-03 true resid norm 1.075476649437e-03 ||r(i)||/||b|| 9.902948432781e-01 Residual norms for firedrake_0_fieldsplit_1_ solve. 0 KSP preconditioned resid norm 4.856780223694e-01 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 4.745469162937e-02 true resid norm 3.302334019978e-01 ||r(i)||/||b|| 3.302334019978e-01 2 KSP preconditioned resid norm 7.434637319370e-03 true resid norm 7.716456937097e-02 ||r(i)||/||b|| 7.716456937097e-02 3 KSP preconditioned resid norm 9.477403463953e-04 true resid norm 1.245956505675e-02 ||r(i)||/||b|| 1.245956505675e-02 4 KSP preconditioned resid norm 1.197149803247e-04 true resid norm 2.445415954040e-03 ||r(i)||/||b|| 2.445415954041e-03 5 KSP preconditioned resid norm 8.629298003468e-06 true resid norm 2.172099238673e-04 ||r(i)||/||b|| 2.172099238673e-04 6 KSP preconditioned resid norm 8.102365661753e-07 true resid norm 2.101591443240e-05 ||r(i)||/||b|| 2.101591443241e-05 Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 6 3 KSP unpreconditioned resid norm 8.228196318802e-04 true resid norm 8.228196318802e-04 ||r(i)||/||b|| 7.576492142582e-01 Residual norms for firedrake_0_fieldsplit_1_ solve. 
0 KSP preconditioned resid norm 5.432442358474e-01 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 5.878289354883e-02 true resid norm 3.137642427673e-01 ||r(i)||/||b|| 3.137642427673e-01 2 KSP preconditioned resid norm 5.788229173110e-03 true resid norm 8.897854046325e-02 ||r(i)||/||b|| 8.897854046324e-02 3 KSP preconditioned resid norm 8.498991030513e-04 true resid norm 1.495053243269e-02 ||r(i)||/||b|| 1.495053243269e-02 4 KSP preconditioned resid norm 1.033515356385e-04 true resid norm 1.845694418294e-03 ||r(i)||/||b|| 1.845694418294e-03 5 KSP preconditioned resid norm 7.847588222485e-06 true resid norm 2.466173373852e-04 ||r(i)||/||b|| 2.466173373852e-04 6 KSP preconditioned resid norm 7.146079509981e-07 true resid norm 1.719476126742e-05 ||r(i)||/||b|| 1.719476126742e-05 Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 6 4 KSP unpreconditioned resid norm 1.657400277959e-04 true resid norm 1.657400277959e-04 ||r(i)||/||b|| 1.526127926041e-01 Residual norms for firedrake_0_fieldsplit_1_ solve. 0 KSP preconditioned resid norm 5.120302137337e-01 true resid norm 9.999999999998e-01 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 6.454992757385e-02 true resid norm 2.912694339903e-01 ||r(i)||/||b|| 2.912694339903e-01 2 KSP preconditioned resid norm 6.671724869182e-03 true resid norm 6.254491231334e-02 ||r(i)||/||b|| 6.254491231335e-02 3 KSP preconditioned resid norm 8.212101901860e-04 true resid norm 1.773238703254e-02 ||r(i)||/||b|| 1.773238703254e-02 4 KSP preconditioned resid norm 1.093679881645e-04 true resid norm 2.931666247026e-03 ||r(i)||/||b|| 2.931666247027e-03 5 KSP preconditioned resid norm 1.186578053622e-05 true resid norm 3.303748399822e-04 ||r(i)||/||b|| 3.303748399822e-04 6 KSP preconditioned resid norm 6.503381106855e-07 true resid norm 1.912788140490e-05 ||r(i)||/||b|| 1.912788140490e-05 Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 6 5 KSP unpreconditioned resid norm 2.448137330609e-05 true resid norm 2.448137330609e-05 ||r(i)||/||b|| 2.254235622324e-02 Residual norms for firedrake_0_fieldsplit_1_ solve. 0 KSP preconditioned resid norm 5.179583235178e-01 true resid norm 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 5.254142246569e-02 true resid norm 2.516327547106e-01 ||r(i)||/||b|| 2.516327547105e-01 2 KSP preconditioned resid norm 7.363444586475e-03 true resid norm 5.234766639265e-02 ||r(i)||/||b|| 5.234766639263e-02 3 KSP preconditioned resid norm 9.704643808280e-04 true resid norm 1.267956066012e-02 ||r(i)||/||b|| 1.267956066011e-02 4 KSP preconditioned resid norm 1.086598853731e-04 true resid norm 3.391601116038e-03 ||r(i)||/||b|| 3.391601116037e-03 5 KSP preconditioned resid norm 7.629924161284e-06 true resid norm 3.006619653634e-04 ||r(i)||/||b|| 3.006619653632e-04 6 KSP preconditioned resid norm 7.239392986410e-07 true resid norm 2.150547391218e-05 ||r(i)||/||b|| 2.150547391217e-05 Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 6 6 KSP unpreconditioned resid norm 2.402006637155e-06 true resid norm 2.402006637153e-06 ||r(i)||/||b|| 2.211758653745e-03 Residual norms for firedrake_0_fieldsplit_1_ solve. 
0 KSP preconditioned resid norm 3.416765078712e-01 true resid norm 9.999999999993e-01 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 3.841617294439e-02 true resid norm 1.983060322744e-01 ||r(i)||/||b|| 1.983060322745e-01 2 KSP preconditioned resid norm 5.192499214944e-03 true resid norm 3.903498722784e-02 ||r(i)||/||b|| 3.903498722787e-02 3 KSP preconditioned resid norm 7.152864973143e-04 true resid norm 7.896855714048e-03 ||r(i)||/||b|| 7.896855714054e-03 4 KSP preconditioned resid norm 9.150050794784e-05 true resid norm 1.709682316519e-03 ||r(i)||/||b|| 1.709682316520e-03 5 KSP preconditioned resid norm 1.369470900870e-05 true resid norm 4.755514972773e-04 ||r(i)||/||b|| 4.755514972776e-04 6 KSP preconditioned resid norm 8.553609674167e-07 true resid norm 4.128479317005e-05 ||r(i)||/||b|| 4.128479317008e-05 Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 6 7 KSP unpreconditioned resid norm 1.986730253405e-07 true resid norm 1.986730253416e-07 ||r(i)||/||b|| 1.829373725568e-04 Residual norms for firedrake_0_fieldsplit_1_ solve. 0 KSP preconditioned resid norm 3.737152121707e-01 true resid norm 1.000000000001e+00 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 2.327226787051e-02 true resid norm 1.728532258160e-01 ||r(i)||/||b|| 1.728532258158e-01 2 KSP preconditioned resid norm 2.663283601003e-03 true resid norm 2.117224456449e-02 ||r(i)||/||b|| 2.117224456446e-02 3 KSP preconditioned resid norm 4.000029435799e-04 true resid norm 4.527340194475e-03 ||r(i)||/||b|| 4.527340194470e-03 4 KSP preconditioned resid norm 4.886053047259e-05 true resid norm 7.959488416139e-04 ||r(i)||/||b|| 7.959488416128e-04 5 KSP preconditioned resid norm 7.135914581342e-06 true resid norm 1.973225005950e-04 ||r(i)||/||b|| 1.973225005947e-04 6 KSP preconditioned resid norm 1.059888416230e-06 true resid norm 4.026894864745e-05 ||r(i)||/||b|| 4.026894864740e-05 Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 6 8 KSP unpreconditioned resid norm 1.847329119073e-08 true resid norm 1.847329119335e-08 ||r(i)||/||b|| 1.701013686976e-05 Residual norms for firedrake_0_fieldsplit_1_ solve. 0 KSP preconditioned resid norm 3.105703514818e-01 true resid norm 9.999999999993e-01 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 2.274848341406e-02 true resid norm 1.655926612050e-01 ||r(i)||/||b|| 1.655926612051e-01 2 KSP preconditioned resid norm 1.896323766042e-03 true resid norm 2.108620526506e-02 ||r(i)||/||b|| 2.108620526507e-02 3 KSP preconditioned resid norm 2.098171871374e-04 true resid norm 2.562423770591e-03 ||r(i)||/||b|| 2.562423770592e-03 4 KSP preconditioned resid norm 2.802038862500e-05 true resid norm 4.753351438216e-04 ||r(i)||/||b|| 4.753351438219e-04 5 KSP preconditioned resid norm 2.875826461982e-06 true resid norm 8.618109518277e-05 ||r(i)||/||b|| 8.618109518283e-05 Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 5 9 KSP unpreconditioned resid norm 1.093052879829e-09 true resid norm 1.093052879480e-09 ||r(i)||/||b|| 1.006478969623e-06 Residual norms for firedrake_0_fieldsplit_1_ solve. 
0 KSP preconditioned resid norm 2.400304155508e-01 true resid norm 9.999999999373e-01 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.375935214529e-02 true resid norm 1.268731042794e-01 ||r(i)||/||b|| 1.268731042874e-01 2 KSP preconditioned resid norm 1.359927799420e-03 true resid norm 1.743482323300e-02 ||r(i)||/||b|| 1.743482323409e-02 3 KSP preconditioned resid norm 1.329143848901e-04 true resid norm 2.112118748978e-03 ||r(i)||/||b|| 2.112118749111e-03 4 KSP preconditioned resid norm 2.299763657223e-05 true resid norm 3.661832963830e-04 ||r(i)||/||b|| 3.661832964060e-04 5 KSP preconditioned resid norm 3.042002639898e-06 true resid norm 8.825775436656e-05 ||r(i)||/||b|| 8.825775437210e-05 6 KSP preconditioned resid norm 2.755403722945e-07 true resid norm 1.154450754603e-05 ||r(i)||/||b|| 1.154450754675e-05 Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 6 10 KSP unpreconditioned resid norm 6.865249627765e-11 true resid norm 6.865249509492e-11 ||r(i)||/||b|| 6.321495860116e-08 KSP Object: (firedrake_0_) 1 MPI processes type: fgmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-07, absolute=1e-50, divergence=10000. right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (firedrake_0_) 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from A11 Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=6144, cols=6144 package used to perform factorization: petsc total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: seqaij rows=6144, cols=6144 total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 5.09173 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=15360, cols=15360 package used to perform factorization: petsc total: nonzeros=1360836, allocated nonzeros=1360836 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 linear system matrix followed by preconditioner matrix: Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: schurcomplement rows=15360, cols=15360 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: seqaij rows=15360, cols=15360 total: nonzeros=267264, allocated nonzeros=267264 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 A10 Mat Object: 1 MPI processes type: seqaij rows=15360, cols=6144 total: nonzeros=73728, allocated nonzeros=73728 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 KSP of A00 KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=6144, cols=6144 package used to perform factorization: petsc total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: seqaij rows=6144, cols=6144 total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: seqaij rows=6144, cols=15360 total: nonzeros=73728, allocated nonzeros=73728 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: seqaij rows=15360, cols=15360 total: nonzeros=267264, allocated nonzeros=267264 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 linear system matrix followed by preconditioner matrix: Mat Object: (firedrake_0_) 1 MPI processes type: seqaij rows=21504, cols=21504 total: nonzeros=433152, allocated nonzeros=433152 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 7168 nodes, limit used is 5 Mat Object: (firedrake_0_) 1 MPI processes type: seqaij rows=21504, cols=21504 total: nonzeros=433152, allocated nonzeros=433152 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 7168 nodes, limit used is 5 ________________________________ From: Justin Chang Sent: 18 March 2019 21:01:17 To: Cotter, Colin J Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Confusing Schur preconditioner behaviour Colin, 1) What equations are you solving? 
2) In your second case, you set the outer ksp to preonly, thus we are unable to see the ksp_monitor for the (firedrake_0_) solver. Set it to gmres and see if you have a similar output to your first case: 0 KSP preconditioned resid norm 4.985448866758e+00 true resid norm 1.086016610848e-03 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.245615753306e-13 true resid norm 2.082000915439e-14 ||r(i)||/||b|| 1.917098591903e-11 Because according to the first ksp_view output, after one lu sweep for the (firedrake_0_fieldsplit_1_) solver. That is, going from: 0 KSP preconditioned resid norm 8.819238435108e-02 true resid norm 1.797571993221e-01 ||r(i)||/||b|| 1.000000000000e+00 to 1 KSP preconditioned resid norm 1.025167319984e-02 true resid norm 3.340583874349e-02 ||r(i)||/||b|| 1.858386694356e-01 appeared to give you an exact schur complement. Justin On Mon, Mar 18, 2019 at 2:18 PM Cotter, Colin J via petsc-users > wrote: Sorry, just to clarify, in the second case I see several *inner* iterations, even though I'm using LU on a supposedly exact Schur complement as the preconditioner for the Schur system. ________________________________ From: petsc-users > on behalf of Cotter, Colin J via petsc-users > Sent: 18 March 2019 20:14:48 To: petsc-users at mcs.anl.gov Subject: [petsc-users] Confusing Schur preconditioner behaviour Dear petsc-users, I'm solving a 2x2 block system, for which I can construct the Schur complement analytically (through compatible FEM stuff), which I can pass as the preconditioning matrix. When using gmres on the outer iteration, and preonly+lu on the inner iterations with a Schur complement preconditioner, I see convergence in 1 iteration as expected. However, when I set gmres+lu on the inner iteration for S, I see several iterations. This seems strange to me, as the first result seems to confirm that I have an exact Schur complement, but the second result implies not. What could be going on here? I've appended output to the bottom of this message, first the preonly+lu and then for gmres+lu. all the best --Colin Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 Residual norms for firedrake_0_ solve. 0 KSP preconditioned resid norm 4.985448866758e+00 true resid norm 1.086016610848e-03 ||r(i)||/||b|| 1.000000000000e+00 Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_ITS iterations 1 1 KSP preconditioned resid norm 1.245615753306e-13 true resid norm 2.082000915439e-14 ||r(i)||/||b|| 1.917098591903e-11 KSP Object: (firedrake_0_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-07, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (firedrake_0_) 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from A11 Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
left preconditioning using NONE norm type for convergence test PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=6144, cols=6144 package used to perform factorization: petsc total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: seqaij rows=6144, cols=6144 total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 5.09173 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=15360, cols=15360 package used to perform factorization: petsc total: nonzeros=1360836, allocated nonzeros=1360836 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 linear system matrix followed by preconditioner matrix: Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: schurcomplement rows=15360, cols=15360 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: seqaij rows=15360, cols=15360 total: nonzeros=267264, allocated nonzeros=267264 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 A10 Mat Object: 1 MPI processes type: seqaij rows=15360, cols=6144 total: nonzeros=73728, allocated nonzeros=73728 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 KSP of A00 KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 1. 
Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=6144, cols=6144 package used to perform factorization: petsc total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: seqaij rows=6144, cols=6144 total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: seqaij rows=6144, cols=15360 total: nonzeros=73728, allocated nonzeros=73728 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: seqaij rows=15360, cols=15360 total: nonzeros=267264, allocated nonzeros=267264 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 linear system matrix followed by preconditioner matrix: Mat Object: (firedrake_0_) 1 MPI processes type: nest rows=21504, cols=21504 Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : type=seqaij, rows=15360, cols=15360 (0,1) : type=seqaij, rows=15360, cols=6144 (1,0) : type=seqaij, rows=6144, cols=15360 (1,1) : type=seqaij, rows=6144, cols=6144 Mat Object: (firedrake_0_) 1 MPI processes type: nest rows=21504, cols=21504 Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : prefix="firedrake_0_fieldsplit_1_", type=seqaij, rows=15360, cols=15360 (0,1) : type=seqaij, rows=15360, cols=6144 (1,0) : type=seqaij, rows=6144, cols=15360 (1,1) : prefix="firedrake_0_fieldsplit_0_", type=seqaij, rows=6144, cols=6144 ===== Residual norms for firedrake_0_fieldsplit_1_ solve. 0 KSP preconditioned resid norm 8.819238435108e-02 true resid norm 1.797571993221e-01 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.025167319984e-02 true resid norm 3.340583874349e-02 ||r(i)||/||b|| 1.858386694356e-01 2 KSP preconditioned resid norm 1.235104644359e-03 true resid norm 8.148396804822e-03 ||r(i)||/||b|| 4.533001646417e-02 3 KSP preconditioned resid norm 1.624748553125e-04 true resid norm 1.612221957927e-03 ||r(i)||/||b|| 8.968886720573e-03 4 KSP preconditioned resid norm 2.233373761266e-05 true resid norm 3.292437172839e-04 ||r(i)||/||b|| 1.831602397710e-03 5 KSP preconditioned resid norm 1.895393184017e-06 true resid norm 4.091207337005e-05 ||r(i)||/||b|| 2.275962994770e-04 6 KSP preconditioned resid norm 1.699212495729e-07 true resid norm 3.851173419652e-06 ||r(i)||/||b|| 2.142430697728e-05 Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_RTOL iterations 6 KSP Object: (firedrake_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (firedrake_0_) 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from A11 Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
left preconditioning using NONE norm type for convergence test PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=6144, cols=6144 package used to perform factorization: petsc total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: seqaij rows=6144, cols=6144 total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 5.09173 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=15360, cols=15360 package used to perform factorization: petsc total: nonzeros=1360836, allocated nonzeros=1360836 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 linear system matrix followed by preconditioner matrix: Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: schurcomplement rows=15360, cols=15360 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: seqaij rows=15360, cols=15360 total: nonzeros=267264, allocated nonzeros=267264 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 A10 Mat Object: 1 MPI processes type: seqaij rows=15360, cols=6144 total: nonzeros=73728, allocated nonzeros=73728 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 KSP of A00 KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 1. 
Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=6144, cols=6144 package used to perform factorization: petsc total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes type: seqaij rows=6144, cols=6144 total: nonzeros=18432, allocated nonzeros=18432 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: seqaij rows=6144, cols=15360 total: nonzeros=73728, allocated nonzeros=73728 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 2048 nodes, limit used is 5 Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes type: seqaij rows=15360, cols=15360 total: nonzeros=267264, allocated nonzeros=267264 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 5120 nodes, limit used is 5 linear system matrix followed by preconditioner matrix: Mat Object: (firedrake_0_) 1 MPI processes type: nest rows=21504, cols=21504 Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : type=seqaij, rows=15360, cols=15360 (0,1) : type=seqaij, rows=15360, cols=6144 (1,0) : type=seqaij, rows=6144, cols=15360 (1,1) : type=seqaij, rows=6144, cols=6144 Mat Object: (firedrake_0_) 1 MPI processes type: nest rows=21504, cols=21504 Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : prefix="firedrake_0_fieldsplit_1_", type=seqaij, rows=15360, cols=15360 (0,1) : type=seqaij, rows=15360, cols=6144 (1,0) : type=seqaij, rows=6144, cols=15360 (1,1) : prefix="firedrake_0_fieldsplit_0_", type=seqaij, rows=6144, cols=6144 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Mon Mar 18 16:44:22 2019 From: jychang48 at gmail.com (Justin Chang) Date: Mon, 18 Mar 2019 15:44:22 -0600 Subject: [petsc-users] Confusing Schur preconditioner behaviour In-Reply-To: References: Message-ID: Hi Colin, Perhaps I may be talking out of my butt at this point, but it looks like this term in your aP: dt**2*inner(g*H*div(u)/ai,div(v)) is secretly a problem. When I see guys like this in my schur complement approximation, I found that scaling this term by some penalty/weighting value helps especially when dt is really small. Not exactly sure what the mathematical explanation is, but I found that this heuristical approach improves my schur complement approximations. Justin On Mon, Mar 18, 2019 at 3:33 PM Cotter, Colin J wrote: > Hi Justin, > > 1) Here is some UFL for my mixed system (u in BDM2, h in DG1) > > a = ( > > > inner(ai*u,v) - dt*inner(f*perp(u),v) + > > > dt*inner(g*h,div(v)) > > > +inner(ai*h,phi) - dt*inner(H*div(u),phi) > > > )*dx > > > Which then has the exact Schur complement (when using an H(div)-L2 element > pair > > > aP = (inner(ai*u,v) + dt**2*inner(g*H*div(u)/ai,div(v)) > > > - dt*inner(f*perp(u),v))*dx > > > 2) I put FGMRES on the outer iteration and appended the result to the > bottom of this message. > > > all the best > > --Colin > > > Residual norms for firedrake_0_ solve. > > 0 KSP unpreconditioned resid norm 1.086016610848e-03 true resid norm > 1.086016610848e-03 ||r(i)||/||b|| 1.000000000000e+00 > > Residual norms for firedrake_0_fieldsplit_1_ solve. 
> > 0 KSP preconditioned resid norm 8.120721494511e+01 true resid norm > 1.655197512879e+02 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP preconditioned resid norm 9.439702024300e+00 true resid norm > 3.075997034466e+01 ||r(i)||/||b|| 1.858386694356e-01 > > 2 KSP preconditioned resid norm 1.137279699060e+00 true resid norm > 7.503013051026e+00 ||r(i)||/||b|| 4.533001646417e-02 > > 3 KSP preconditioned resid norm 1.496062340941e-01 true resid norm > 1.484527899318e+00 ||r(i)||/||b|| 8.968886720568e-03 > > 4 KSP preconditioned resid norm 2.056482137526e-02 true resid norm > 3.031663733277e-01 ||r(i)||/||b|| 1.831602397713e-03 > > 5 KSP preconditioned resid norm 1.745270896489e-03 true resid norm > 3.767168288302e-02 ||r(i)||/||b|| 2.275962994742e-04 > > 6 KSP preconditioned resid norm 1.564628458491e-04 true resid norm > 3.546145962849e-03 ||r(i)||/||b|| 2.142430698002e-05 > > Linear firedrake_0_fieldsplit_1_ solve converged due to > CONVERGED_RTOL iterations 6 > > 1 KSP unpreconditioned resid norm 1.085836886832e-03 true resid norm > 1.085836886832e-03 ||r(i)||/||b|| 9.998345108044e-01 > > Residual norms for firedrake_0_fieldsplit_1_ solve. > > 0 KSP preconditioned resid norm 3.877459187203e-01 true resid norm > 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP preconditioned resid norm 4.065088730543e-02 true resid norm > 1.756656283999e-01 ||r(i)||/||b|| 1.756656283999e-01 > > 2 KSP preconditioned resid norm 5.708009085053e-03 true resid norm > 4.422956647739e-02 ||r(i)||/||b|| 4.422956647739e-02 > > 3 KSP preconditioned resid norm 7.309033373261e-04 true resid norm > 8.666099848662e-03 ||r(i)||/||b|| 8.666099848662e-03 > > 4 KSP preconditioned resid norm 1.133694903547e-04 true resid norm > 1.609496315342e-03 ||r(i)||/||b|| 1.609496315342e-03 > > 5 KSP preconditioned resid norm 1.227169910863e-05 true resid norm > 2.710100172099e-04 ||r(i)||/||b|| 2.710100172099e-04 > > 6 KSP preconditioned resid norm 1.047216213052e-06 true resid norm > 2.489323919135e-05 ||r(i)||/||b|| 2.489323919135e-05 > > Linear firedrake_0_fieldsplit_1_ solve converged due to > CONVERGED_RTOL iterations 6 > > 2 KSP unpreconditioned resid norm 1.075476649437e-03 true resid norm > 1.075476649437e-03 ||r(i)||/||b|| 9.902948432781e-01 > > Residual norms for firedrake_0_fieldsplit_1_ solve. > > 0 KSP preconditioned resid norm 4.856780223694e-01 true resid norm > 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP preconditioned resid norm 4.745469162937e-02 true resid norm > 3.302334019978e-01 ||r(i)||/||b|| 3.302334019978e-01 > > 2 KSP preconditioned resid norm 7.434637319370e-03 true resid norm > 7.716456937097e-02 ||r(i)||/||b|| 7.716456937097e-02 > > 3 KSP preconditioned resid norm 9.477403463953e-04 true resid norm > 1.245956505675e-02 ||r(i)||/||b|| 1.245956505675e-02 > > 4 KSP preconditioned resid norm 1.197149803247e-04 true resid norm > 2.445415954040e-03 ||r(i)||/||b|| 2.445415954041e-03 > > 5 KSP preconditioned resid norm 8.629298003468e-06 true resid norm > 2.172099238673e-04 ||r(i)||/||b|| 2.172099238673e-04 > > 6 KSP preconditioned resid norm 8.102365661753e-07 true resid norm > 2.101591443240e-05 ||r(i)||/||b|| 2.101591443241e-05 > > Linear firedrake_0_fieldsplit_1_ solve converged due to > CONVERGED_RTOL iterations 6 > > 3 KSP unpreconditioned resid norm 8.228196318802e-04 true resid norm > 8.228196318802e-04 ||r(i)||/||b|| 7.576492142582e-01 > > Residual norms for firedrake_0_fieldsplit_1_ solve. 
> > 0 KSP preconditioned resid norm 5.432442358474e-01 true resid norm > 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP preconditioned resid norm 5.878289354883e-02 true resid norm > 3.137642427673e-01 ||r(i)||/||b|| 3.137642427673e-01 > > 2 KSP preconditioned resid norm 5.788229173110e-03 true resid norm > 8.897854046325e-02 ||r(i)||/||b|| 8.897854046324e-02 > > 3 KSP preconditioned resid norm 8.498991030513e-04 true resid norm > 1.495053243269e-02 ||r(i)||/||b|| 1.495053243269e-02 > > 4 KSP preconditioned resid norm 1.033515356385e-04 true resid norm > 1.845694418294e-03 ||r(i)||/||b|| 1.845694418294e-03 > > 5 KSP preconditioned resid norm 7.847588222485e-06 true resid norm > 2.466173373852e-04 ||r(i)||/||b|| 2.466173373852e-04 > > 6 KSP preconditioned resid norm 7.146079509981e-07 true resid norm > 1.719476126742e-05 ||r(i)||/||b|| 1.719476126742e-05 > > Linear firedrake_0_fieldsplit_1_ solve converged due to > CONVERGED_RTOL iterations 6 > > 4 KSP unpreconditioned resid norm 1.657400277959e-04 true resid norm > 1.657400277959e-04 ||r(i)||/||b|| 1.526127926041e-01 > > Residual norms for firedrake_0_fieldsplit_1_ solve. > > 0 KSP preconditioned resid norm 5.120302137337e-01 true resid norm > 9.999999999998e-01 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP preconditioned resid norm 6.454992757385e-02 true resid norm > 2.912694339903e-01 ||r(i)||/||b|| 2.912694339903e-01 > > 2 KSP preconditioned resid norm 6.671724869182e-03 true resid norm > 6.254491231334e-02 ||r(i)||/||b|| 6.254491231335e-02 > > 3 KSP preconditioned resid norm 8.212101901860e-04 true resid norm > 1.773238703254e-02 ||r(i)||/||b|| 1.773238703254e-02 > > 4 KSP preconditioned resid norm 1.093679881645e-04 true resid norm > 2.931666247026e-03 ||r(i)||/||b|| 2.931666247027e-03 > > 5 KSP preconditioned resid norm 1.186578053622e-05 true resid norm > 3.303748399822e-04 ||r(i)||/||b|| 3.303748399822e-04 > > 6 KSP preconditioned resid norm 6.503381106855e-07 true resid norm > 1.912788140490e-05 ||r(i)||/||b|| 1.912788140490e-05 > > Linear firedrake_0_fieldsplit_1_ solve converged due to > CONVERGED_RTOL iterations 6 > > 5 KSP unpreconditioned resid norm 2.448137330609e-05 true resid norm > 2.448137330609e-05 ||r(i)||/||b|| 2.254235622324e-02 > > Residual norms for firedrake_0_fieldsplit_1_ solve. > > 0 KSP preconditioned resid norm 5.179583235178e-01 true resid norm > 1.000000000000e+00 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP preconditioned resid norm 5.254142246569e-02 true resid norm > 2.516327547106e-01 ||r(i)||/||b|| 2.516327547105e-01 > > 2 KSP preconditioned resid norm 7.363444586475e-03 true resid norm > 5.234766639265e-02 ||r(i)||/||b|| 5.234766639263e-02 > > 3 KSP preconditioned resid norm 9.704643808280e-04 true resid norm > 1.267956066012e-02 ||r(i)||/||b|| 1.267956066011e-02 > > 4 KSP preconditioned resid norm 1.086598853731e-04 true resid norm > 3.391601116038e-03 ||r(i)||/||b|| 3.391601116037e-03 > > 5 KSP preconditioned resid norm 7.629924161284e-06 true resid norm > 3.006619653634e-04 ||r(i)||/||b|| 3.006619653632e-04 > > 6 KSP preconditioned resid norm 7.239392986410e-07 true resid norm > 2.150547391218e-05 ||r(i)||/||b|| 2.150547391217e-05 > > Linear firedrake_0_fieldsplit_1_ solve converged due to > CONVERGED_RTOL iterations 6 > > 6 KSP unpreconditioned resid norm 2.402006637155e-06 true resid norm > 2.402006637153e-06 ||r(i)||/||b|| 2.211758653745e-03 > > Residual norms for firedrake_0_fieldsplit_1_ solve. 
> > 0 KSP preconditioned resid norm 3.416765078712e-01 true resid norm > 9.999999999993e-01 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP preconditioned resid norm 3.841617294439e-02 true resid norm > 1.983060322744e-01 ||r(i)||/||b|| 1.983060322745e-01 > > 2 KSP preconditioned resid norm 5.192499214944e-03 true resid norm > 3.903498722784e-02 ||r(i)||/||b|| 3.903498722787e-02 > > 3 KSP preconditioned resid norm 7.152864973143e-04 true resid norm > 7.896855714048e-03 ||r(i)||/||b|| 7.896855714054e-03 > > 4 KSP preconditioned resid norm 9.150050794784e-05 true resid norm > 1.709682316519e-03 ||r(i)||/||b|| 1.709682316520e-03 > > 5 KSP preconditioned resid norm 1.369470900870e-05 true resid norm > 4.755514972773e-04 ||r(i)||/||b|| 4.755514972776e-04 > > 6 KSP preconditioned resid norm 8.553609674167e-07 true resid norm > 4.128479317005e-05 ||r(i)||/||b|| 4.128479317008e-05 > > Linear firedrake_0_fieldsplit_1_ solve converged due to > CONVERGED_RTOL iterations 6 > > 7 KSP unpreconditioned resid norm 1.986730253405e-07 true resid norm > 1.986730253416e-07 ||r(i)||/||b|| 1.829373725568e-04 > > Residual norms for firedrake_0_fieldsplit_1_ solve. > > 0 KSP preconditioned resid norm 3.737152121707e-01 true resid norm > 1.000000000001e+00 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP preconditioned resid norm 2.327226787051e-02 true resid norm > 1.728532258160e-01 ||r(i)||/||b|| 1.728532258158e-01 > > 2 KSP preconditioned resid norm 2.663283601003e-03 true resid norm > 2.117224456449e-02 ||r(i)||/||b|| 2.117224456446e-02 > > 3 KSP preconditioned resid norm 4.000029435799e-04 true resid norm > 4.527340194475e-03 ||r(i)||/||b|| 4.527340194470e-03 > > 4 KSP preconditioned resid norm 4.886053047259e-05 true resid norm > 7.959488416139e-04 ||r(i)||/||b|| 7.959488416128e-04 > > 5 KSP preconditioned resid norm 7.135914581342e-06 true resid norm > 1.973225005950e-04 ||r(i)||/||b|| 1.973225005947e-04 > > 6 KSP preconditioned resid norm 1.059888416230e-06 true resid norm > 4.026894864745e-05 ||r(i)||/||b|| 4.026894864740e-05 > > Linear firedrake_0_fieldsplit_1_ solve converged due to > CONVERGED_RTOL iterations 6 > > 8 KSP unpreconditioned resid norm 1.847329119073e-08 true resid norm > 1.847329119335e-08 ||r(i)||/||b|| 1.701013686976e-05 > > Residual norms for firedrake_0_fieldsplit_1_ solve. > > 0 KSP preconditioned resid norm 3.105703514818e-01 true resid norm > 9.999999999993e-01 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP preconditioned resid norm 2.274848341406e-02 true resid norm > 1.655926612050e-01 ||r(i)||/||b|| 1.655926612051e-01 > > 2 KSP preconditioned resid norm 1.896323766042e-03 true resid norm > 2.108620526506e-02 ||r(i)||/||b|| 2.108620526507e-02 > > 3 KSP preconditioned resid norm 2.098171871374e-04 true resid norm > 2.562423770591e-03 ||r(i)||/||b|| 2.562423770592e-03 > > 4 KSP preconditioned resid norm 2.802038862500e-05 true resid norm > 4.753351438216e-04 ||r(i)||/||b|| 4.753351438219e-04 > > 5 KSP preconditioned resid norm 2.875826461982e-06 true resid norm > 8.618109518277e-05 ||r(i)||/||b|| 8.618109518283e-05 > > Linear firedrake_0_fieldsplit_1_ solve converged due to > CONVERGED_RTOL iterations 5 > > 9 KSP unpreconditioned resid norm 1.093052879829e-09 true resid norm > 1.093052879480e-09 ||r(i)||/||b|| 1.006478969623e-06 > > Residual norms for firedrake_0_fieldsplit_1_ solve. 
> > 0 KSP preconditioned resid norm 2.400304155508e-01 true resid norm > 9.999999999373e-01 ||r(i)||/||b|| 1.000000000000e+00 > > 1 KSP preconditioned resid norm 1.375935214529e-02 true resid norm > 1.268731042794e-01 ||r(i)||/||b|| 1.268731042874e-01 > > 2 KSP preconditioned resid norm 1.359927799420e-03 true resid norm > 1.743482323300e-02 ||r(i)||/||b|| 1.743482323409e-02 > > 3 KSP preconditioned resid norm 1.329143848901e-04 true resid norm > 2.112118748978e-03 ||r(i)||/||b|| 2.112118749111e-03 > > 4 KSP preconditioned resid norm 2.299763657223e-05 true resid norm > 3.661832963830e-04 ||r(i)||/||b|| 3.661832964060e-04 > > 5 KSP preconditioned resid norm 3.042002639898e-06 true resid norm > 8.825775436656e-05 ||r(i)||/||b|| 8.825775437210e-05 > > 6 KSP preconditioned resid norm 2.755403722945e-07 true resid norm > 1.154450754603e-05 ||r(i)||/||b|| 1.154450754675e-05 > > Linear firedrake_0_fieldsplit_1_ solve converged due to > CONVERGED_RTOL iterations 6 > > 10 KSP unpreconditioned resid norm 6.865249627765e-11 true resid norm > 6.865249509492e-11 ||r(i)||/||b|| 6.321495860116e-08 > > KSP Object: (firedrake_0_) 1 MPI processes > > type: fgmres > > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-07, absolute=1e-50, divergence=10000. > > right preconditioning > > using UNPRECONDITIONED norm type for convergence test > > PC Object: (firedrake_0_) 1 MPI processes > > type: fieldsplit > > FieldSplit with Schur preconditioner, factorization FULL > > Preconditioner for the Schur complement formed from A11 > > Split info: > > Split number 0 Defined by IS > > Split number 1 Defined by IS > > KSP solver for A00 block > > KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > > type: lu > > out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5., needed 1. > > Factored matrix follows: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=6144, cols=6144 > > package used to perform factorization: petsc > > total: nonzeros=18432, allocated nonzeros=18432 > > total number of mallocs used during MatSetValues calls =0 > > using I-node routines: found 2048 nodes, limit used is 5 > > linear system matrix = precond matrix: > > Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > > type: seqaij > > rows=6144, cols=6144 > > total: nonzeros=18432, allocated nonzeros=18432 > > total number of mallocs used during MatSetValues calls =0 > > using I-node routines: found 2048 nodes, limit used is 5 > > KSP solver for S = A11 - A10 inv(A00) A01 > > KSP Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using PRECONDITIONED norm type for convergence test > > PC Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > > type: lu > > out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5., needed 5.09173 > > Factored matrix follows: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=15360, cols=15360 > > package used to perform factorization: petsc > > total: nonzeros=1360836, allocated nonzeros=1360836 > > total number of mallocs used during MatSetValues calls =0 > > using I-node routines: found 5120 nodes, limit used is 5 > > linear system matrix followed by preconditioner matrix: > > Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > > type: schurcomplement > > rows=15360, cols=15360 > > Schur complement A11 - A10 inv(A00) A01 > > A11 > > Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > > type: seqaij > > rows=15360, cols=15360 > > total: nonzeros=267264, allocated nonzeros=267264 > > total number of mallocs used during MatSetValues calls =0 > > using I-node routines: found 5120 nodes, limit used is 5 > > A10 > > Mat Object: 1 MPI processes > > type: seqaij > > rows=15360, cols=6144 > > total: nonzeros=73728, allocated nonzeros=73728 > > total number of mallocs used during MatSetValues calls =0 > > using I-node routines: found 5120 nodes, limit used is 5 > > KSP of A00 > > KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > > type: lu > > out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 5., needed 1. 
> > Factored matrix follows: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=6144, cols=6144 > > package used to perform factorization: petsc > > total: nonzeros=18432, allocated nonzeros=18432 > > total number of mallocs used during MatSetValues > calls =0 > > using I-node routines: found 2048 nodes, limit > used is 5 > > linear system matrix = precond matrix: > > Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > > type: seqaij > > rows=6144, cols=6144 > > total: nonzeros=18432, allocated nonzeros=18432 > > total number of mallocs used during MatSetValues calls > =0 > > using I-node routines: found 2048 nodes, limit used > is 5 > > A01 > > Mat Object: 1 MPI processes > > type: seqaij > > rows=6144, cols=15360 > > total: nonzeros=73728, allocated nonzeros=73728 > > total number of mallocs used during MatSetValues calls =0 > > using I-node routines: found 2048 nodes, limit used is 5 > > Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > > type: seqaij > > rows=15360, cols=15360 > > total: nonzeros=267264, allocated nonzeros=267264 > > total number of mallocs used during MatSetValues calls =0 > > using I-node routines: found 5120 nodes, limit used is 5 > > linear system matrix followed by preconditioner matrix: > > Mat Object: (firedrake_0_) 1 MPI processes > > type: seqaij > > rows=21504, cols=21504 > > total: nonzeros=433152, allocated nonzeros=433152 > > total number of mallocs used during MatSetValues calls =0 > > using I-node routines: found 7168 nodes, limit used is 5 > > Mat Object: (firedrake_0_) 1 MPI processes > > type: seqaij > > rows=21504, cols=21504 > > total: nonzeros=433152, allocated nonzeros=433152 > > total number of mallocs used during MatSetValues calls =0 > > using I-node routines: found 7168 nodes, limit used is 5 > ------------------------------ > *From:* Justin Chang > *Sent:* 18 March 2019 21:01:17 > *To:* Cotter, Colin J > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Confusing Schur preconditioner behaviour > > Colin, > > 1) What equations are you solving? > > 2) In your second case, you set the outer ksp to preonly, thus we are > unable to see the ksp_monitor for the (firedrake_0_) solver. Set it to > gmres and see if you have a similar output to your first case: > > 0 KSP preconditioned resid norm 4.985448866758e+00 true resid norm > 1.086016610848e-03 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 1.245615753306e-13 true resid norm > 2.082000915439e-14 ||r(i)||/||b|| 1.917098591903e-11 > > Because according to the first ksp_view output, after one lu sweep for the > (firedrake_0_fieldsplit_1_) solver. That is, going from: > > 0 KSP preconditioned resid norm 8.819238435108e-02 true resid norm > 1.797571993221e-01 ||r(i)||/||b|| 1.000000000000e+00 > > to > > 1 KSP preconditioned resid norm 1.025167319984e-02 true resid norm > 3.340583874349e-02 ||r(i)||/||b|| 1.858386694356e-01 > > appeared to give you an exact schur complement. > > Justin > > On Mon, Mar 18, 2019 at 2:18 PM Cotter, Colin J via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Sorry, just to clarify, in the second case I see several *inner* > iterations, even though I'm using LU on a supposedly exact Schur complement > as the preconditioner for the Schur system. 
> ------------------------------ > *From:* petsc-users on behalf of > Cotter, Colin J via petsc-users > *Sent:* 18 March 2019 20:14:48 > *To:* petsc-users at mcs.anl.gov > *Subject:* [petsc-users] Confusing Schur preconditioner behaviour > > > Dear petsc-users, > > I'm solving a 2x2 block system, for which I can construct the Schur > complement analytically (through compatible FEM stuff), > > which I can pass as the preconditioning matrix. > > > When using gmres on the outer iteration, and preonly+lu on the inner > iterations with a Schur complement preconditioner, > > I see convergence in 1 iteration as expected. However, when I set gmres+lu > on the inner iteration for S, I see several iterations. > > > This seems strange to me, as the first result seems to confirm that I have > an exact Schur complement, but the second result > > implies not. > > What could be going on here? > > > I've appended output to the bottom of this message, first the preonly+lu > and then for gmres+lu. > > > all the best > > --Colin > > > Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_ITS > iterations 1 > Residual norms for firedrake_0_ solve. > 0 KSP preconditioned resid norm 4.985448866758e+00 true resid norm > 1.086016610848e-03 ||r(i)||/||b|| 1.000000000000e+00 > Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_ITS > iterations 1 > 1 KSP preconditioned resid norm 1.245615753306e-13 true resid norm > 2.082000915439e-14 ||r(i)||/||b|| 1.917098591903e-11 > KSP Object: (firedrake_0_) 1 MPI processes > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-07, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: (firedrake_0_) 1 MPI processes > type: fieldsplit > FieldSplit with Schur preconditioner, factorization FULL > Preconditioner for the Schur complement formed from A11 > Split info: > Split number 0 Defined by IS > Split number 1 Defined by IS > KSP solver for A00 block > KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 1. > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=6144, cols=6144 > package used to perform factorization: petsc > total: nonzeros=18432, allocated nonzeros=18432 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2048 nodes, limit used is 5 > linear system matrix = precond matrix: > Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: seqaij > rows=6144, cols=6144 > total: nonzeros=18432, allocated nonzeros=18432 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2048 nodes, limit used is 5 > KSP solver for S = A11 - A10 inv(A00) A01 > KSP Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> left preconditioning > using NONE norm type for convergence test > PC Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 5.09173 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=15360, cols=15360 > package used to perform factorization: petsc > total: nonzeros=1360836, allocated nonzeros=1360836 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 5120 nodes, limit used is 5 > linear system matrix followed by preconditioner matrix: > Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: schurcomplement > rows=15360, cols=15360 > Schur complement A11 - A10 inv(A00) A01 > A11 > Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: seqaij > rows=15360, cols=15360 > total: nonzeros=267264, allocated nonzeros=267264 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 5120 nodes, limit used is 5 > A10 > Mat Object: 1 MPI processes > type: seqaij > rows=15360, cols=6144 > total: nonzeros=73728, allocated nonzeros=73728 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 5120 nodes, limit used is 5 > KSP of A00 > KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 1. 
> Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=6144, cols=6144 > package used to perform factorization: petsc > total: nonzeros=18432, allocated nonzeros=18432 > total number of mallocs used during MatSetValues > calls =0 > using I-node routines: found 2048 nodes, limit > used is 5 > linear system matrix = precond matrix: > Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: seqaij > rows=6144, cols=6144 > total: nonzeros=18432, allocated nonzeros=18432 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2048 nodes, limit used is > 5 > A01 > Mat Object: 1 MPI processes > type: seqaij > rows=6144, cols=15360 > total: nonzeros=73728, allocated nonzeros=73728 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2048 nodes, limit used is 5 > Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: seqaij > rows=15360, cols=15360 > total: nonzeros=267264, allocated nonzeros=267264 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 5120 nodes, limit used is 5 > linear system matrix followed by preconditioner matrix: > Mat Object: (firedrake_0_) 1 MPI processes > type: nest > rows=21504, cols=21504 > Matrix object: > type=nest, rows=2, cols=2 > MatNest structure: > (0,0) : type=seqaij, rows=15360, cols=15360 > (0,1) : type=seqaij, rows=15360, cols=6144 > (1,0) : type=seqaij, rows=6144, cols=15360 > (1,1) : type=seqaij, rows=6144, cols=6144 > Mat Object: (firedrake_0_) 1 MPI processes > type: nest > rows=21504, cols=21504 > Matrix object: > type=nest, rows=2, cols=2 > MatNest structure: > (0,0) : prefix="firedrake_0_fieldsplit_1_", type=seqaij, > rows=15360, cols=15360 > (0,1) : type=seqaij, rows=15360, cols=6144 > (1,0) : type=seqaij, rows=6144, cols=15360 > (1,1) : prefix="firedrake_0_fieldsplit_0_", type=seqaij, > rows=6144, cols=6144 > > ===== > > > Residual norms for firedrake_0_fieldsplit_1_ solve. > 0 KSP preconditioned resid norm 8.819238435108e-02 true resid norm > 1.797571993221e-01 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 1.025167319984e-02 true resid norm > 3.340583874349e-02 ||r(i)||/||b|| 1.858386694356e-01 > 2 KSP preconditioned resid norm 1.235104644359e-03 true resid norm > 8.148396804822e-03 ||r(i)||/||b|| 4.533001646417e-02 > 3 KSP preconditioned resid norm 1.624748553125e-04 true resid norm > 1.612221957927e-03 ||r(i)||/||b|| 8.968886720573e-03 > 4 KSP preconditioned resid norm 2.233373761266e-05 true resid norm > 3.292437172839e-04 ||r(i)||/||b|| 1.831602397710e-03 > 5 KSP preconditioned resid norm 1.895393184017e-06 true resid norm > 4.091207337005e-05 ||r(i)||/||b|| 2.275962994770e-04 > 6 KSP preconditioned resid norm 1.699212495729e-07 true resid norm > 3.851173419652e-06 ||r(i)||/||b|| 2.142430697728e-05 > Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_RTOL > iterations 6 > KSP Object: (firedrake_0_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> left preconditioning > using NONE norm type for convergence test > PC Object: (firedrake_0_) 1 MPI processes > type: fieldsplit > FieldSplit with Schur preconditioner, factorization FULL > Preconditioner for the Schur complement formed from A11 > Split info: > Split number 0 Defined by IS > Split number 1 Defined by IS > KSP solver for A00 block > KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 1. > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=6144, cols=6144 > package used to perform factorization: petsc > total: nonzeros=18432, allocated nonzeros=18432 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2048 nodes, limit used is 5 > linear system matrix = precond matrix: > Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: seqaij > rows=6144, cols=6144 > total: nonzeros=18432, allocated nonzeros=18432 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2048 nodes, limit used is 5 > KSP solver for S = A11 - A10 inv(A00) A01 > KSP Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 5.09173 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=15360, cols=15360 > package used to perform factorization: petsc > total: nonzeros=1360836, allocated nonzeros=1360836 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 5120 nodes, limit used is 5 > linear system matrix followed by preconditioner matrix: > Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: schurcomplement > rows=15360, cols=15360 > Schur complement A11 - A10 inv(A00) A01 > A11 > Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: seqaij > rows=15360, cols=15360 > total: nonzeros=267264, allocated nonzeros=267264 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 5120 nodes, limit used is 5 > A10 > Mat Object: 1 MPI processes > type: seqaij > rows=15360, cols=6144 > total: nonzeros=73728, allocated nonzeros=73728 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 5120 nodes, limit used is 5 > KSP of A00 > KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. 
> left preconditioning > using NONE norm type for convergence test > PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 1. > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=6144, cols=6144 > package used to perform factorization: petsc > total: nonzeros=18432, allocated nonzeros=18432 > total number of mallocs used during MatSetValues > calls =0 > using I-node routines: found 2048 nodes, limit > used is 5 > linear system matrix = precond matrix: > Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: seqaij > rows=6144, cols=6144 > total: nonzeros=18432, allocated nonzeros=18432 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2048 nodes, limit used is > 5 > A01 > Mat Object: 1 MPI processes > type: seqaij > rows=6144, cols=15360 > total: nonzeros=73728, allocated nonzeros=73728 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2048 nodes, limit used is 5 > Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: seqaij > rows=15360, cols=15360 > total: nonzeros=267264, allocated nonzeros=267264 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 5120 nodes, limit used is 5 > linear system matrix followed by preconditioner matrix: > Mat Object: (firedrake_0_) 1 MPI processes > type: nest > rows=21504, cols=21504 > Matrix object: > type=nest, rows=2, cols=2 > MatNest structure: > (0,0) : type=seqaij, rows=15360, cols=15360 > (0,1) : type=seqaij, rows=15360, cols=6144 > (1,0) : type=seqaij, rows=6144, cols=15360 > (1,1) : type=seqaij, rows=6144, cols=6144 > Mat Object: (firedrake_0_) 1 MPI processes > type: nest > rows=21504, cols=21504 > Matrix object: > type=nest, rows=2, cols=2 > MatNest structure: > (0,0) : prefix="firedrake_0_fieldsplit_1_", type=seqaij, > rows=15360, cols=15360 > (0,1) : type=seqaij, rows=15360, cols=6144 > (1,0) : type=seqaij, rows=6144, cols=15360 > (1,1) : prefix="firedrake_0_fieldsplit_0_", type=seqaij, > rows=6144, cols=6144 > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Mon Mar 18 17:53:33 2019 From: dave.mayhem23 at gmail.com (Dave May) Date: Mon, 18 Mar 2019 23:53:33 +0100 Subject: [petsc-users] Confusing Schur preconditioner behaviour In-Reply-To: References: Message-ID: On Mon, 18 Mar 2019 at 20:14, Cotter, Colin J via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear petsc-users, > > I'm solving a 2x2 block system, for which I can construct the Schur > complement analytically (through compatible FEM stuff), > > which I can pass as the preconditioning matrix. > In the KSP view your sent I see lines "Preconditioner for the Schur complement formed from A11" which indicates your custom Schur complement preconditioner is not being used. It doesn't exactly explain what you observe but the solver appears to not be configured as you described / expected. Thanks Dave > When using gmres on the outer iteration, and preonly+lu on the inner > iterations with a Schur complement preconditioner, > > I see convergence in 1 iteration as expected. However, when I set gmres+lu > on the inner iteration for S, I see several iterations. > > > This seems strange to me, as the first result seems to confirm that I have > an exact Schur complement, but the second result > > implies not. 
> > What could be going on here? > > > I've appended output to the bottom of this message, first the preonly+lu > and then for gmres+lu. > > > all the best > > --Colin > > > Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_ITS > iterations 1 > Residual norms for firedrake_0_ solve. > 0 KSP preconditioned resid norm 4.985448866758e+00 true resid norm > 1.086016610848e-03 ||r(i)||/||b|| 1.000000000000e+00 > Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_ITS > iterations 1 > 1 KSP preconditioned resid norm 1.245615753306e-13 true resid norm > 2.082000915439e-14 ||r(i)||/||b|| 1.917098591903e-11 > KSP Object: (firedrake_0_) 1 MPI processes > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-07, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: (firedrake_0_) 1 MPI processes > type: fieldsplit > FieldSplit with Schur preconditioner, factorization FULL > Preconditioner for the Schur complement formed from A11 > Split info: > Split number 0 Defined by IS > Split number 1 Defined by IS > KSP solver for A00 block > KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 1. > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=6144, cols=6144 > package used to perform factorization: petsc > total: nonzeros=18432, allocated nonzeros=18432 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2048 nodes, limit used is 5 > linear system matrix = precond matrix: > Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: seqaij > rows=6144, cols=6144 > total: nonzeros=18432, allocated nonzeros=18432 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2048 nodes, limit used is 5 > KSP solver for S = A11 - A10 inv(A00) A01 > KSP Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> left preconditioning > using NONE norm type for convergence test > PC Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 5.09173 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=15360, cols=15360 > package used to perform factorization: petsc > total: nonzeros=1360836, allocated nonzeros=1360836 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 5120 nodes, limit used is 5 > linear system matrix followed by preconditioner matrix: > Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: schurcomplement > rows=15360, cols=15360 > Schur complement A11 - A10 inv(A00) A01 > A11 > Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: seqaij > rows=15360, cols=15360 > total: nonzeros=267264, allocated nonzeros=267264 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 5120 nodes, limit used is 5 > A10 > Mat Object: 1 MPI processes > type: seqaij > rows=15360, cols=6144 > total: nonzeros=73728, allocated nonzeros=73728 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 5120 nodes, limit used is 5 > KSP of A00 > KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 1. 
> Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=6144, cols=6144 > package used to perform factorization: petsc > total: nonzeros=18432, allocated nonzeros=18432 > total number of mallocs used during MatSetValues > calls =0 > using I-node routines: found 2048 nodes, limit > used is 5 > linear system matrix = precond matrix: > Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: seqaij > rows=6144, cols=6144 > total: nonzeros=18432, allocated nonzeros=18432 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2048 nodes, limit used is > 5 > A01 > Mat Object: 1 MPI processes > type: seqaij > rows=6144, cols=15360 > total: nonzeros=73728, allocated nonzeros=73728 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2048 nodes, limit used is 5 > Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: seqaij > rows=15360, cols=15360 > total: nonzeros=267264, allocated nonzeros=267264 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 5120 nodes, limit used is 5 > linear system matrix followed by preconditioner matrix: > Mat Object: (firedrake_0_) 1 MPI processes > type: nest > rows=21504, cols=21504 > Matrix object: > type=nest, rows=2, cols=2 > MatNest structure: > (0,0) : type=seqaij, rows=15360, cols=15360 > (0,1) : type=seqaij, rows=15360, cols=6144 > (1,0) : type=seqaij, rows=6144, cols=15360 > (1,1) : type=seqaij, rows=6144, cols=6144 > Mat Object: (firedrake_0_) 1 MPI processes > type: nest > rows=21504, cols=21504 > Matrix object: > type=nest, rows=2, cols=2 > MatNest structure: > (0,0) : prefix="firedrake_0_fieldsplit_1_", type=seqaij, > rows=15360, cols=15360 > (0,1) : type=seqaij, rows=15360, cols=6144 > (1,0) : type=seqaij, rows=6144, cols=15360 > (1,1) : prefix="firedrake_0_fieldsplit_0_", type=seqaij, > rows=6144, cols=6144 > > ===== > > > Residual norms for firedrake_0_fieldsplit_1_ solve. > 0 KSP preconditioned resid norm 8.819238435108e-02 true resid norm > 1.797571993221e-01 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 1.025167319984e-02 true resid norm > 3.340583874349e-02 ||r(i)||/||b|| 1.858386694356e-01 > 2 KSP preconditioned resid norm 1.235104644359e-03 true resid norm > 8.148396804822e-03 ||r(i)||/||b|| 4.533001646417e-02 > 3 KSP preconditioned resid norm 1.624748553125e-04 true resid norm > 1.612221957927e-03 ||r(i)||/||b|| 8.968886720573e-03 > 4 KSP preconditioned resid norm 2.233373761266e-05 true resid norm > 3.292437172839e-04 ||r(i)||/||b|| 1.831602397710e-03 > 5 KSP preconditioned resid norm 1.895393184017e-06 true resid norm > 4.091207337005e-05 ||r(i)||/||b|| 2.275962994770e-04 > 6 KSP preconditioned resid norm 1.699212495729e-07 true resid norm > 3.851173419652e-06 ||r(i)||/||b|| 2.142430697728e-05 > Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_RTOL > iterations 6 > KSP Object: (firedrake_0_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> left preconditioning > using NONE norm type for convergence test > PC Object: (firedrake_0_) 1 MPI processes > type: fieldsplit > FieldSplit with Schur preconditioner, factorization FULL > Preconditioner for the Schur complement formed from A11 > Split info: > Split number 0 Defined by IS > Split number 1 Defined by IS > KSP solver for A00 block > KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 1. > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=6144, cols=6144 > package used to perform factorization: petsc > total: nonzeros=18432, allocated nonzeros=18432 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2048 nodes, limit used is 5 > linear system matrix = precond matrix: > Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: seqaij > rows=6144, cols=6144 > total: nonzeros=18432, allocated nonzeros=18432 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2048 nodes, limit used is 5 > KSP solver for S = A11 - A10 inv(A00) A01 > KSP Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 5.09173 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=15360, cols=15360 > package used to perform factorization: petsc > total: nonzeros=1360836, allocated nonzeros=1360836 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 5120 nodes, limit used is 5 > linear system matrix followed by preconditioner matrix: > Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: schurcomplement > rows=15360, cols=15360 > Schur complement A11 - A10 inv(A00) A01 > A11 > Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: seqaij > rows=15360, cols=15360 > total: nonzeros=267264, allocated nonzeros=267264 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 5120 nodes, limit used is 5 > A10 > Mat Object: 1 MPI processes > type: seqaij > rows=15360, cols=6144 > total: nonzeros=73728, allocated nonzeros=73728 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 5120 nodes, limit used is 5 > KSP of A00 > KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. 
> left preconditioning > using NONE norm type for convergence test > PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 1. > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=6144, cols=6144 > package used to perform factorization: petsc > total: nonzeros=18432, allocated nonzeros=18432 > total number of mallocs used during MatSetValues > calls =0 > using I-node routines: found 2048 nodes, limit > used is 5 > linear system matrix = precond matrix: > Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes > type: seqaij > rows=6144, cols=6144 > total: nonzeros=18432, allocated nonzeros=18432 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2048 nodes, limit used is > 5 > A01 > Mat Object: 1 MPI processes > type: seqaij > rows=6144, cols=15360 > total: nonzeros=73728, allocated nonzeros=73728 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 2048 nodes, limit used is 5 > Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes > type: seqaij > rows=15360, cols=15360 > total: nonzeros=267264, allocated nonzeros=267264 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 5120 nodes, limit used is 5 > linear system matrix followed by preconditioner matrix: > Mat Object: (firedrake_0_) 1 MPI processes > type: nest > rows=21504, cols=21504 > Matrix object: > type=nest, rows=2, cols=2 > MatNest structure: > (0,0) : type=seqaij, rows=15360, cols=15360 > (0,1) : type=seqaij, rows=15360, cols=6144 > (1,0) : type=seqaij, rows=6144, cols=15360 > (1,1) : type=seqaij, rows=6144, cols=6144 > Mat Object: (firedrake_0_) 1 MPI processes > type: nest > rows=21504, cols=21504 > Matrix object: > type=nest, rows=2, cols=2 > MatNest structure: > (0,0) : prefix="firedrake_0_fieldsplit_1_", type=seqaij, > rows=15360, cols=15360 > (0,1) : type=seqaij, rows=15360, cols=6144 > (1,0) : type=seqaij, rows=6144, cols=15360 > (1,1) : prefix="firedrake_0_fieldsplit_0_", type=seqaij, > rows=6144, cols=6144 > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Mon Mar 18 18:04:23 2019 From: dave.mayhem23 at gmail.com (Dave May) Date: Tue, 19 Mar 2019 00:04:23 +0100 Subject: [petsc-users] Confusing Schur preconditioner behaviour In-Reply-To: References: Message-ID: On Mon, 18 Mar 2019 at 22:53, Dave May wrote: > > > On Mon, 18 Mar 2019 at 20:14, Cotter, Colin J via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Dear petsc-users, >> >> I'm solving a 2x2 block system, for which I can construct the Schur >> complement analytically (through compatible FEM stuff), >> >> which I can pass as the preconditioning matrix. >> > > In the KSP view your sent I see lines > "Preconditioner for the Schur complement formed from A11" > which indicates your custom Schur complement preconditioner is not being > used. > > It doesn't exactly explain what you observe but the solver appears to not > be configured as you described / expected. > Oh - are your defining the exact Schur operator within the preconditioning matrix? If you are doing that, then you need to tell fieldsplit to use the Amat to define the splits otherwise it will define the Schur compliment as S = B22 - B21 inv(B11) B12 preconditiones with B22, where as what you want is S = A22 - A21 inv(A11) A12 preconditioned with B22. 
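(For concreteness, a minimal sketch of the setup being described -- the matrix and option handling below are assumptions for illustration, not taken from Colin's actual Firedrake code:)

    #include <petscksp.h>
    /* Sketch: Amat holds the true 2x2 block operator, Bmat the
       preconditioning matrix whose (2,2) block is the analytically
       built Schur complement.  Assembly of both is omitted here.   */
    Mat Amat, Bmat;
    KSP ksp;
    PC  pc;
    KSPCreate(PETSC_COMM_WORLD,&ksp);
    KSPSetOperators(ksp,Amat,Bmat);      /* Amat != Bmat            */
    KSPGetPC(ksp,&pc);
    /* By default PCFIELDSPLIT extracts its blocks (and hence builds
       S) from Bmat; these calls ask it to take the blocks from Amat
       instead.  The equivalent command-line options are
       -pc_fieldsplit_diag_use_amat and -pc_fieldsplit_off_diag_use_amat. */
    PCFieldSplitSetDiagUseAmat(pc,PETSC_TRUE);
    PCFieldSplitSetOffDiagUseAmat(pc,PETSC_TRUE);
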
If your operators are set up this way and you didn't indicate to use Amat to define S this would definitely explain why preonly works but iterating on Schur does not. > > Thanks > Dave > > > >> When using gmres on the outer iteration, and preonly+lu on the inner >> iterations with a Schur complement preconditioner, >> >> I see convergence in 1 iteration as expected. However, when I set >> gmres+lu on the inner iteration for S, I see several iterations. >> >> >> This seems strange to me, as the first result seems to confirm that I >> have an exact Schur complement, but the second result >> >> implies not. >> >> What could be going on here? >> >> >> I've appended output to the bottom of this message, first the preonly+lu >> and then for gmres+lu. >> >> >> all the best >> >> --Colin >> >> >> Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_ITS >> iterations 1 >> Residual norms for firedrake_0_ solve. >> 0 KSP preconditioned resid norm 4.985448866758e+00 true resid norm >> 1.086016610848e-03 ||r(i)||/||b|| 1.000000000000e+00 >> Linear firedrake_0_fieldsplit_1_ solve converged due to CONVERGED_ITS >> iterations 1 >> 1 KSP preconditioned resid norm 1.245615753306e-13 true resid norm >> 2.082000915439e-14 ||r(i)||/||b|| 1.917098591903e-11 >> KSP Object: (firedrake_0_) 1 MPI processes >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-07, absolute=1e-50, divergence=10000. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: (firedrake_0_) 1 MPI processes >> type: fieldsplit >> FieldSplit with Schur preconditioner, factorization FULL >> Preconditioner for the Schur complement formed from A11 >> Split info: >> Split number 0 Defined by IS >> Split number 1 Defined by IS >> KSP solver for A00 block >> KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes >> type: lu >> out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: nd >> factor fill ratio given 5., needed 1. >> Factored matrix follows: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=6144, cols=6144 >> package used to perform factorization: petsc >> total: nonzeros=18432, allocated nonzeros=18432 >> total number of mallocs used during MatSetValues calls =0 >> using I-node routines: found 2048 nodes, limit used is 5 >> linear system matrix = precond matrix: >> Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes >> type: seqaij >> rows=6144, cols=6144 >> total: nonzeros=18432, allocated nonzeros=18432 >> total number of mallocs used during MatSetValues calls =0 >> using I-node routines: found 2048 nodes, limit used is 5 >> KSP solver for S = A11 - A10 inv(A00) A01 >> KSP Object: (firedrake_0_fieldsplit_1_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
>> left preconditioning >> using NONE norm type for convergence test >> PC Object: (firedrake_0_fieldsplit_1_) 1 MPI processes >> type: lu >> out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: nd >> factor fill ratio given 5., needed 5.09173 >> Factored matrix follows: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=15360, cols=15360 >> package used to perform factorization: petsc >> total: nonzeros=1360836, allocated nonzeros=1360836 >> total number of mallocs used during MatSetValues calls =0 >> using I-node routines: found 5120 nodes, limit used is 5 >> linear system matrix followed by preconditioner matrix: >> Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes >> type: schurcomplement >> rows=15360, cols=15360 >> Schur complement A11 - A10 inv(A00) A01 >> A11 >> Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes >> type: seqaij >> rows=15360, cols=15360 >> total: nonzeros=267264, allocated nonzeros=267264 >> total number of mallocs used during MatSetValues calls =0 >> using I-node routines: found 5120 nodes, limit used is 5 >> A10 >> Mat Object: 1 MPI processes >> type: seqaij >> rows=15360, cols=6144 >> total: nonzeros=73728, allocated nonzeros=73728 >> total number of mallocs used during MatSetValues calls =0 >> using I-node routines: found 5120 nodes, limit used is 5 >> KSP of A00 >> KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes >> type: lu >> out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: nd >> factor fill ratio given 5., needed 1. 
>> Factored matrix follows: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=6144, cols=6144 >> package used to perform factorization: petsc >> total: nonzeros=18432, allocated nonzeros=18432 >> total number of mallocs used during MatSetValues >> calls =0 >> using I-node routines: found 2048 nodes, limit >> used is 5 >> linear system matrix = precond matrix: >> Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes >> type: seqaij >> rows=6144, cols=6144 >> total: nonzeros=18432, allocated nonzeros=18432 >> total number of mallocs used during MatSetValues calls >> =0 >> using I-node routines: found 2048 nodes, limit used >> is 5 >> A01 >> Mat Object: 1 MPI processes >> type: seqaij >> rows=6144, cols=15360 >> total: nonzeros=73728, allocated nonzeros=73728 >> total number of mallocs used during MatSetValues calls =0 >> using I-node routines: found 2048 nodes, limit used is 5 >> Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes >> type: seqaij >> rows=15360, cols=15360 >> total: nonzeros=267264, allocated nonzeros=267264 >> total number of mallocs used during MatSetValues calls =0 >> using I-node routines: found 5120 nodes, limit used is 5 >> linear system matrix followed by preconditioner matrix: >> Mat Object: (firedrake_0_) 1 MPI processes >> type: nest >> rows=21504, cols=21504 >> Matrix object: >> type=nest, rows=2, cols=2 >> MatNest structure: >> (0,0) : type=seqaij, rows=15360, cols=15360 >> (0,1) : type=seqaij, rows=15360, cols=6144 >> (1,0) : type=seqaij, rows=6144, cols=15360 >> (1,1) : type=seqaij, rows=6144, cols=6144 >> Mat Object: (firedrake_0_) 1 MPI processes >> type: nest >> rows=21504, cols=21504 >> Matrix object: >> type=nest, rows=2, cols=2 >> MatNest structure: >> (0,0) : prefix="firedrake_0_fieldsplit_1_", type=seqaij, >> rows=15360, cols=15360 >> (0,1) : type=seqaij, rows=15360, cols=6144 >> (1,0) : type=seqaij, rows=6144, cols=15360 >> (1,1) : prefix="firedrake_0_fieldsplit_0_", type=seqaij, >> rows=6144, cols=6144 >> >> ===== >> >> >> Residual norms for firedrake_0_fieldsplit_1_ solve. >> 0 KSP preconditioned resid norm 8.819238435108e-02 true resid norm >> 1.797571993221e-01 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP preconditioned resid norm 1.025167319984e-02 true resid norm >> 3.340583874349e-02 ||r(i)||/||b|| 1.858386694356e-01 >> 2 KSP preconditioned resid norm 1.235104644359e-03 true resid norm >> 8.148396804822e-03 ||r(i)||/||b|| 4.533001646417e-02 >> 3 KSP preconditioned resid norm 1.624748553125e-04 true resid norm >> 1.612221957927e-03 ||r(i)||/||b|| 8.968886720573e-03 >> 4 KSP preconditioned resid norm 2.233373761266e-05 true resid norm >> 3.292437172839e-04 ||r(i)||/||b|| 1.831602397710e-03 >> 5 KSP preconditioned resid norm 1.895393184017e-06 true resid norm >> 4.091207337005e-05 ||r(i)||/||b|| 2.275962994770e-04 >> 6 KSP preconditioned resid norm 1.699212495729e-07 true resid norm >> 3.851173419652e-06 ||r(i)||/||b|| 2.142430697728e-05 >> Linear firedrake_0_fieldsplit_1_ solve converged due to >> CONVERGED_RTOL iterations 6 >> KSP Object: (firedrake_0_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
>> left preconditioning >> using NONE norm type for convergence test >> PC Object: (firedrake_0_) 1 MPI processes >> type: fieldsplit >> FieldSplit with Schur preconditioner, factorization FULL >> Preconditioner for the Schur complement formed from A11 >> Split info: >> Split number 0 Defined by IS >> Split number 1 Defined by IS >> KSP solver for A00 block >> KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes >> type: lu >> out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: nd >> factor fill ratio given 5., needed 1. >> Factored matrix follows: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=6144, cols=6144 >> package used to perform factorization: petsc >> total: nonzeros=18432, allocated nonzeros=18432 >> total number of mallocs used during MatSetValues calls =0 >> using I-node routines: found 2048 nodes, limit used is 5 >> linear system matrix = precond matrix: >> Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes >> type: seqaij >> rows=6144, cols=6144 >> total: nonzeros=18432, allocated nonzeros=18432 >> total number of mallocs used during MatSetValues calls =0 >> using I-node routines: found 2048 nodes, limit used is 5 >> KSP solver for S = A11 - A10 inv(A00) A01 >> KSP Object: (firedrake_0_fieldsplit_1_) 1 MPI processes >> type: gmres >> restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> happy breakdown tolerance 1e-30 >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: (firedrake_0_fieldsplit_1_) 1 MPI processes >> type: lu >> out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: nd >> factor fill ratio given 5., needed 5.09173 >> Factored matrix follows: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=15360, cols=15360 >> package used to perform factorization: petsc >> total: nonzeros=1360836, allocated nonzeros=1360836 >> total number of mallocs used during MatSetValues calls =0 >> using I-node routines: found 5120 nodes, limit used is 5 >> linear system matrix followed by preconditioner matrix: >> Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes >> type: schurcomplement >> rows=15360, cols=15360 >> Schur complement A11 - A10 inv(A00) A01 >> A11 >> Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes >> type: seqaij >> rows=15360, cols=15360 >> total: nonzeros=267264, allocated nonzeros=267264 >> total number of mallocs used during MatSetValues calls =0 >> using I-node routines: found 5120 nodes, limit used is 5 >> A10 >> Mat Object: 1 MPI processes >> type: seqaij >> rows=15360, cols=6144 >> total: nonzeros=73728, allocated nonzeros=73728 >> total number of mallocs used during MatSetValues calls =0 >> using I-node routines: found 5120 nodes, limit used is 5 >> KSP of A00 >> KSP Object: (firedrake_0_fieldsplit_0_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. 
>> left preconditioning >> using NONE norm type for convergence test >> PC Object: (firedrake_0_fieldsplit_0_) 1 MPI processes >> type: lu >> out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: nd >> factor fill ratio given 5., needed 1. >> Factored matrix follows: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=6144, cols=6144 >> package used to perform factorization: petsc >> total: nonzeros=18432, allocated nonzeros=18432 >> total number of mallocs used during MatSetValues >> calls =0 >> using I-node routines: found 2048 nodes, limit >> used is 5 >> linear system matrix = precond matrix: >> Mat Object: (firedrake_0_fieldsplit_0_) 1 MPI processes >> type: seqaij >> rows=6144, cols=6144 >> total: nonzeros=18432, allocated nonzeros=18432 >> total number of mallocs used during MatSetValues calls >> =0 >> using I-node routines: found 2048 nodes, limit used >> is 5 >> A01 >> Mat Object: 1 MPI processes >> type: seqaij >> rows=6144, cols=15360 >> total: nonzeros=73728, allocated nonzeros=73728 >> total number of mallocs used during MatSetValues calls =0 >> using I-node routines: found 2048 nodes, limit used is 5 >> Mat Object: (firedrake_0_fieldsplit_1_) 1 MPI processes >> type: seqaij >> rows=15360, cols=15360 >> total: nonzeros=267264, allocated nonzeros=267264 >> total number of mallocs used during MatSetValues calls =0 >> using I-node routines: found 5120 nodes, limit used is 5 >> linear system matrix followed by preconditioner matrix: >> Mat Object: (firedrake_0_) 1 MPI processes >> type: nest >> rows=21504, cols=21504 >> Matrix object: >> type=nest, rows=2, cols=2 >> MatNest structure: >> (0,0) : type=seqaij, rows=15360, cols=15360 >> (0,1) : type=seqaij, rows=15360, cols=6144 >> (1,0) : type=seqaij, rows=6144, cols=15360 >> (1,1) : type=seqaij, rows=6144, cols=6144 >> Mat Object: (firedrake_0_) 1 MPI processes >> type: nest >> rows=21504, cols=21504 >> Matrix object: >> type=nest, rows=2, cols=2 >> MatNest structure: >> (0,0) : prefix="firedrake_0_fieldsplit_1_", type=seqaij, >> rows=15360, cols=15360 >> (0,1) : type=seqaij, rows=15360, cols=6144 >> (1,0) : type=seqaij, rows=6144, cols=15360 >> (1,1) : prefix="firedrake_0_fieldsplit_0_", type=seqaij, >> rows=6144, cols=6144 >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From mbuerkle at web.de Mon Mar 18 21:30:14 2019 From: mbuerkle at web.de (Marius Buerkle) Date: Tue, 19 Mar 2019 03:30:14 +0100 Subject: [petsc-users] MatCompositeMerge + MatCreateRedundantMatrix In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Mar 18 21:47:08 2019 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 18 Mar 2019 22:47:08 -0400 Subject: [petsc-users] MatCompositeMerge + MatCreateRedundantMatrix In-Reply-To: References: Message-ID: On Mon, Mar 18, 2019 at 10:30 PM Marius Buerkle wrote: > I have another question > regarding MatCreateRedundantMatrix and MPICreateSubMatricesMPI. The former > works for MPIAIJ and MPIDENSE and the later only for MPIAIJ. Would it be > possible to use MatCreateRedundantMatrix with a factored matrix > We usually do not have direct access to the data for factored matrices since it lives in the factorization package. > and MPICreateSubMatricesMPI with dense and/or elemental matrices ? > This would not be hard, it would just take time. The dense case is easy. I think Elemental already has such an operation, but we would have to find it and call it. 
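(As a rough sketch of the MatCreateRedundantMatrix path being discussed -- the names and the subcommunicator handling below are assumptions; check the man page of your PETSc version for the exact contract:)

    #include <petscmat.h>
    Mat      A;        /* assembled MPIAIJ (or MPIDENSE) matrix on PETSC_COMM_WORLD */
    Mat      Ared;     /* redundant copy living on a subcommunicator                */
    PetscInt nsub = 4; /* number of redundant copies                                */
    /* MPI_COMM_NULL asks PETSc to split the communicator itself                    */
    MatCreateRedundantMatrix(A,nsub,MPI_COMM_NULL,MAT_INITIAL_MATRIX,&Ared);
    /* ... factor/solve with Ared independently on each subcommunicator ... */
    MatDestroy(&Ared);
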
Thanks, Matt > Indeed, was very easy to add. Are you going to include the Fortran > interface for MPICreateSubMatricesMPI in future releases of PETSC ? > > Regarding my initial problem, thanks a lot. It works very well with > MPICreateSubMatricesMPI and the solution can be implemented in a few > lines. > > Thanks and Best, > > Marius > > > On Tue, Mar 12, 2019 at 4:50 AM Marius Buerkle wrote: > >> I tried to follow your suggestions but it seems there is >> no MPICreateSubMatricesMPI for Fortran. Is this correct? >> > > We just have to write the binding. Its almost identical to > MatCreateSubMatrices() in src/mat/interface/ftn-custom/zmatrixf.c > > Matt > >> >> On Wed, Feb 20, 2019 at 6:57 PM Marius Buerkle wrote: >> >>> ok, I think I understand now. I will give it a try and if there is some >>> trouble comeback to you. thanks. >>> >> >> Cool. >> >> Matt >> >> >>> >>> marius >>> >>> On Tue, Feb 19, 2019 at 8:42 PM Marius Buerkle wrote: >>> >>>> ok, so it seems there is no straight forward way to transfer data >>>> between PETSc matrices on different subcomms. Probably doing it by "hand" >>>> extracting the matricies on the subcomms create a MPI_INTERCOMM transfering >>>> the data to PETSC_COMM_WORLD and assembling them in a new PETSc matrix >>>> would be possible, right? >>>> >>> >>> That sounds too complicated. Why not just reverse >>> MPICreateSubMatricesMPI()? Meaning make it collective on the whole big >>> communicator, so that you can swap out all the subcommunicator for the >>> aggregation call, just like we do in that function. >>> Then its really just a matter of reversing the communication call. >>> >>> Matt >>> >>>> >>>> On Tue, Feb 19, 2019 at 7:12 PM Marius Buerkle wrote: >>>> >>>>> I see. This would work if the matrices are on different >>>>> subcommumicators ? Is it possible to add this functionality ? >>>>> >>>> >>>> Hmm, no. That is specialized to serial matrices. You need the inverse >>>> of MatCreateSubMatricesMPI(). >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> marius >>>>> >>>>> >>>>> You basically need the inverse of MatCreateSubmatrices(). I do not >>>>> think we have that right now, but it could probably be done without too >>>>> much trouble by looking at that code. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> On Tue, Feb 19, 2019 at 6:15 AM Marius Buerkle via petsc-users < >>>>> petsc-users at mcs.anl.gov> wrote: >>>>> >>>>>> Hi ! >>>>>> >>>>>> Is there some way to combine MatCompositeMerge >>>>>> with MatCreateRedundantMatrix? I basically want to create copies of a >>>>>> matrix from PETSC_COMM_WORLD to subcommunicators, do some work on each >>>>>> subcommunicator and than gather the results back to PETSC_COMM_WORLD, >>>>>> namely I want to sum the the invidual matrices from the subcommunicatos >>>>>> component wise and get the resulting matrix on PETSC_COMM_WORLD. Is this >>>>>> somehow possible without going through all the hassle of using MPI >>>>>> directly? >>>>>> >>>>>> marius >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. 
>>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From colin.cotter at imperial.ac.uk Tue Mar 19 04:33:41 2019 From: colin.cotter at imperial.ac.uk (Cotter, Colin J) Date: Tue, 19 Mar 2019 09:33:41 +0000 Subject: [petsc-users] Confusing Schur preconditioner behaviour In-Reply-To: References: , Message-ID: Hi Dave, >If you are doing that, then you need to tell fieldsplit to use the Amat to define the splits otherwise it will define the Schur compliment as >S = B22 - B21 inv(B11) B12 >preconditiones with B22, where as what you want is >S = A22 - A21 inv(A11) A12 >preconditioned with B22. >If your operators are set up this way and you didn't indicate to use Amat to define S this would definitely explain why preonly works but iterating on Schur does not. Yes, thanks - this solves it! I need pc_use_amat. all the best --Colin -------------- next part -------------- An HTML attachment was scrubbed... URL: From eda.oktay at metu.edu.tr Tue Mar 19 05:31:29 2019 From: eda.oktay at metu.edu.tr (Eda Oktay) Date: Tue, 19 Mar 2019 13:31:29 +0300 Subject: [petsc-users] SLEPc Build Error Message-ID: Hello, I am trying to install PETSc with following configure options: ./configure --download-openmpi --download-openblas --download-slepc --download-cmake --download-metis --download-parmetis Compilation is done but after the following command, I got an error: make PETSC_DIR=/home/slurm_local/e200781/petsc-3.10.4 PETSC_ARCH=arch-linux2-c-debug all *** Building slepc *** **************************ERROR************************************* Error building slepc. Check arch-linux2-c-debug/lib/petsc/conf/slepc.log ******************************************************************** /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/petsc/conf/petscrules:46: recipe for target 'slepcbuild' failed make[1]: *** [slepcbuild] Error 1 make[1]: Leaving directory '/home/slurm_local/e200781/petsc-3.10.4' **************************ERROR************************************* Error during compile, check arch-linux2-c-debug/lib/petsc/conf/make.log Send it and arch-linux2-c-debug/lib/petsc/conf/configure.log to petsc-maint at mcs.anl.gov ******************************************************************** makefile:30: recipe for target 'all' failed make: *** [all] Error 1 How can I fix the problem? Thank you, Eda -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Tue Mar 19 05:36:06 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 19 Mar 2019 06:36:06 -0400 Subject: [petsc-users] SLEPc Build Error In-Reply-To: References: Message-ID: On Tue, Mar 19, 2019 at 6:31 AM Eda Oktay via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello, > > I am trying to install PETSc with following configure options: > > ./configure --download-openmpi --download-openblas --download-slepc > --download-cmake --download-metis --download-parmetis > > Compilation is done but after the following command, I got an error: > > make PETSC_DIR=/home/slurm_local/e200781/petsc-3.10.4 > PETSC_ARCH=arch-linux2-c-debug all > > *** Building slepc *** > **************************ERROR************************************* > Error building slepc. Check arch-linux2-c-debug/lib/petsc/conf/slepc.log > We need slepc.log Thanks, Matt > ******************************************************************** > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/petsc/conf/petscrules:46: > recipe for target 'slepcbuild' failed > make[1]: *** [slepcbuild] Error 1 > make[1]: Leaving directory '/home/slurm_local/e200781/petsc-3.10.4' > **************************ERROR************************************* > Error during compile, check arch-linux2-c-debug/lib/petsc/conf/make.log > Send it and arch-linux2-c-debug/lib/petsc/conf/configure.log to > petsc-maint at mcs.anl.gov > ******************************************************************** > makefile:30: recipe for target 'all' failed > make: *** [all] Error 1 > > How can I fix the problem? > > Thank you, > > Eda > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From eda.oktay at metu.edu.tr Tue Mar 19 05:41:04 2019 From: eda.oktay at metu.edu.tr (Eda Oktay) Date: Tue, 19 Mar 2019 13:41:04 +0300 Subject: [petsc-users] SLEPc Build Error In-Reply-To: References: Message-ID: This is slepc.log: Checking environment... done Checking PETSc installation... ERROR: Unable to link with PETSc ERROR: See "arch-linux2-c-debug/lib/slepc/conf/configure.log" file for details Matthew Knepley , 19 Mar 2019 Sal, 13:36 tarihinde ?unu yazd?: > On Tue, Mar 19, 2019 at 6:31 AM Eda Oktay via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Hello, >> >> I am trying to install PETSc with following configure options: >> >> ./configure --download-openmpi --download-openblas --download-slepc >> --download-cmake --download-metis --download-parmetis >> >> Compilation is done but after the following command, I got an error: >> >> make PETSC_DIR=/home/slurm_local/e200781/petsc-3.10.4 >> PETSC_ARCH=arch-linux2-c-debug all >> >> *** Building slepc *** >> **************************ERROR************************************* >> Error building slepc. 
Check arch-linux2-c-debug/lib/petsc/conf/slepc.log >> > > We need slepc.log > > Thanks, > > Matt > > >> ******************************************************************** >> /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/petsc/conf/petscrules:46: >> recipe for target 'slepcbuild' failed >> make[1]: *** [slepcbuild] Error 1 >> make[1]: Leaving directory '/home/slurm_local/e200781/petsc-3.10.4' >> **************************ERROR************************************* >> Error during compile, check arch-linux2-c-debug/lib/petsc/conf/make.log >> Send it and arch-linux2-c-debug/lib/petsc/conf/configure.log to >> petsc-maint at mcs.anl.gov >> ******************************************************************** >> makefile:30: recipe for target 'all' failed >> make: *** [all] Error 1 >> >> How can I fix the problem? >> >> Thank you, >> >> Eda >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Tue Mar 19 05:46:09 2019 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 19 Mar 2019 11:46:09 +0100 Subject: [petsc-users] SLEPc Build Error In-Reply-To: References: Message-ID: And what is in $SLEPC_DIR/arch-linux2-c-debug/lib/slepc/conf/configure.log ? Jose > El 19 mar 2019, a las 11:41, Eda Oktay via petsc-users escribi?: > > This is slepc.log: > > Checking environment... done > Checking PETSc installation... > ERROR: Unable to link with PETSc > ERROR: See "arch-linux2-c-debug/lib/slepc/conf/configure.log" file for details > > > > Matthew Knepley , 19 Mar 2019 Sal, 13:36 tarihinde ?unu yazd?: > On Tue, Mar 19, 2019 at 6:31 AM Eda Oktay via petsc-users wrote: > Hello, > > I am trying to install PETSc with following configure options: > > ./configure --download-openmpi --download-openblas --download-slepc --download-cmake --download-metis --download-parmetis > > Compilation is done but after the following command, I got an error: > > make PETSC_DIR=/home/slurm_local/e200781/petsc-3.10.4 PETSC_ARCH=arch-linux2-c-debug all > > *** Building slepc *** > **************************ERROR************************************* > Error building slepc. Check arch-linux2-c-debug/lib/petsc/conf/slepc.log > > We need slepc.log > > Thanks, > > Matt > > ******************************************************************** > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/petsc/conf/petscrules:46: recipe for target 'slepcbuild' failed > make[1]: *** [slepcbuild] Error 1 > make[1]: Leaving directory '/home/slurm_local/e200781/petsc-3.10.4' > **************************ERROR************************************* > Error during compile, check arch-linux2-c-debug/lib/petsc/conf/make.log > Send it and arch-linux2-c-debug/lib/petsc/conf/configure.log to petsc-maint at mcs.anl.gov > ******************************************************************** > makefile:30: recipe for target 'all' failed > make: *** [all] Error 1 > > How can I fix the problem? > > Thank you, > > Eda > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From jroman at dsic.upv.es Tue Mar 19 05:49:39 2019 From: jroman at dsic.upv.es (Jose E. 
Roman) Date: Tue, 19 Mar 2019 11:49:39 +0100 Subject: [petsc-users] SLEPc Build Error In-Reply-To: References: Message-ID: Correction: the correct path is $PETSC_DIR/$PETSC_ARCH/externalpackages/git.slepc/$PETSC_ARCH/lib/slepc/conf/configure.log > El 19 mar 2019, a las 11:46, Jose E. Roman via petsc-users escribi?: > > And what is in $SLEPC_DIR/arch-linux2-c-debug/lib/slepc/conf/configure.log ? > Jose > >> El 19 mar 2019, a las 11:41, Eda Oktay via petsc-users escribi?: >> >> This is slepc.log: >> >> Checking environment... done >> Checking PETSc installation... >> ERROR: Unable to link with PETSc >> ERROR: See "arch-linux2-c-debug/lib/slepc/conf/configure.log" file for details >> >> >> >> Matthew Knepley , 19 Mar 2019 Sal, 13:36 tarihinde ?unu yazd?: >> On Tue, Mar 19, 2019 at 6:31 AM Eda Oktay via petsc-users wrote: >> Hello, >> >> I am trying to install PETSc with following configure options: >> >> ./configure --download-openmpi --download-openblas --download-slepc --download-cmake --download-metis --download-parmetis >> >> Compilation is done but after the following command, I got an error: >> >> make PETSC_DIR=/home/slurm_local/e200781/petsc-3.10.4 PETSC_ARCH=arch-linux2-c-debug all >> >> *** Building slepc *** >> **************************ERROR************************************* >> Error building slepc. Check arch-linux2-c-debug/lib/petsc/conf/slepc.log >> >> We need slepc.log >> >> Thanks, >> >> Matt >> >> ******************************************************************** >> /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/petsc/conf/petscrules:46: recipe for target 'slepcbuild' failed >> make[1]: *** [slepcbuild] Error 1 >> make[1]: Leaving directory '/home/slurm_local/e200781/petsc-3.10.4' >> **************************ERROR************************************* >> Error during compile, check arch-linux2-c-debug/lib/petsc/conf/make.log >> Send it and arch-linux2-c-debug/lib/petsc/conf/configure.log to petsc-maint at mcs.anl.gov >> ******************************************************************** >> makefile:30: recipe for target 'all' failed >> make: *** [all] Error 1 >> >> How can I fix the problem? >> >> Thank you, >> >> Eda >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ > From eda.oktay at metu.edu.tr Tue Mar 19 05:58:44 2019 From: eda.oktay at metu.edu.tr (Eda Oktay) Date: Tue, 19 Mar 2019 13:58:44 +0300 Subject: [petsc-users] SLEPc Build Error In-Reply-To: References: Message-ID: ================================================================================ Starting Configure Run at Tue Mar 19 11:53:05 2019 Configure Options: --prefix=/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug Working directory: /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/externalpackages/git.slepc Python version: 2.7.9 (default, Sep 25 2018, 20:42:16) [GCC 4.9.2] make: /usr/bin/make PETSc source directory: /home/slurm_local/e200781/petsc-3.10.4 PETSc install directory: /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug PETSc version: 3.10.4 PETSc architecture: arch-linux2-c-debug SLEPc source directory: /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/externalpackages/git.slepc SLEPc install directory: /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug SLEPc version: 3.10.1 ================================================================================ Checking PETSc installation... #include "petscsnes.h" int main() { Vec v; Mat m; KSP k; PetscInitializeNoArguments(); VecCreate(PETSC_COMM_WORLD,&v); MatCreate(PETSC_COMM_WORLD,&m); KSPCreate(PETSC_COMM_WORLD,&k); return 0; } make[2]: Entering directory '/tmp/slepc-2F1MtJ' /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/bin/mpicc -o checklink.o -c -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -g3 -I/home/slurm_local/e200781/petsc-3.10.4/include -I/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/include `pwd`/checklink.c /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/bin/mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -g3 -o checklink checklink.o -Wl,-rpath,/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib -L/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib -Wl,-rpath,/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib -L/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.9 -L/usr/lib/gcc/x86_64-linux-gnu/4.9 -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -lpetsc -lopenblas -lparmetis -lmetis -lm -lX11 -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lstdc++ -ldl /usr/bin/ld: warning: libmpi.so.0, needed by /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libopenblas.so, may conflict with libmpi.so.40 /usr/bin/ld: warning: libmpi.so.0, needed by /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libopenblas.so, may conflict with libmpi.so.40 /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libpetsc.so: undefined reference to `MatPartitioningParmetisSetRepartition' /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libpetsc.so: undefined reference to `MatPartitioningCreate_Parmetis' collect2: error: ld returned 1 exit status makefile:2: recipe for target 'checklink' failed make[2]: *** [checklink] Error 1 make[2]: Leaving directory '/tmp/slepc-2F1MtJ' ERROR: Unable to link with PETSc Jose E. 
Roman , 19 Mar 2019 Sal, 13:49 tarihinde ?unu yazd?: > Correction: the correct path is > $PETSC_DIR/$PETSC_ARCH/externalpackages/git.slepc/$PETSC_ARCH/lib/slepc/conf/configure.log > > > > > El 19 mar 2019, a las 11:46, Jose E. Roman via petsc-users < > petsc-users at mcs.anl.gov> escribi?: > > > > And what is in > $SLEPC_DIR/arch-linux2-c-debug/lib/slepc/conf/configure.log ? > > Jose > > > >> El 19 mar 2019, a las 11:41, Eda Oktay via petsc-users < > petsc-users at mcs.anl.gov> escribi?: > >> > >> This is slepc.log: > >> > >> Checking environment... done > >> Checking PETSc installation... > >> ERROR: Unable to link with PETSc > >> ERROR: See "arch-linux2-c-debug/lib/slepc/conf/configure.log" file for > details > >> > >> > >> > >> Matthew Knepley , 19 Mar 2019 Sal, 13:36 tarihinde > ?unu yazd?: > >> On Tue, Mar 19, 2019 at 6:31 AM Eda Oktay via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Hello, > >> > >> I am trying to install PETSc with following configure options: > >> > >> ./configure --download-openmpi --download-openblas --download-slepc > --download-cmake --download-metis --download-parmetis > >> > >> Compilation is done but after the following command, I got an error: > >> > >> make PETSC_DIR=/home/slurm_local/e200781/petsc-3.10.4 > PETSC_ARCH=arch-linux2-c-debug all > >> > >> *** Building slepc *** > >> **************************ERROR************************************* > >> Error building slepc. Check arch-linux2-c-debug/lib/petsc/conf/slepc.log > >> > >> We need slepc.log > >> > >> Thanks, > >> > >> Matt > >> > >> ******************************************************************** > >> > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/petsc/conf/petscrules:46: > recipe for target 'slepcbuild' failed > >> make[1]: *** [slepcbuild] Error 1 > >> make[1]: Leaving directory '/home/slurm_local/e200781/petsc-3.10.4' > >> **************************ERROR************************************* > >> Error during compile, check arch-linux2-c-debug/lib/petsc/conf/make.log > >> Send it and arch-linux2-c-debug/lib/petsc/conf/configure.log to > petsc-maint at mcs.anl.gov > >> ******************************************************************** > >> makefile:30: recipe for target 'all' failed > >> make: *** [all] Error 1 > >> > >> How can I fix the problem? > >> > >> Thank you, > >> > >> Eda > >> > >> > >> -- > >> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > >> -- Norbert Wiener > >> > >> https://www.cse.buffalo.edu/~knepley/ > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ys453 at cam.ac.uk Tue Mar 19 05:59:19 2019 From: ys453 at cam.ac.uk (Y. Shidi) Date: Tue, 19 Mar 2019 10:59:19 +0000 Subject: [petsc-users] PCFieldSplit gives different results for direct and iterative solver In-Reply-To: References: <0a59ff504728c756b0965f201a97ab7c@cam.ac.uk> Message-ID: Hello Barry, Thank you for your reply. I reduced the tolerances and get desired solution. I am solving a multiphase incompressible n-s problems and currently we are using augmented lagrangina technique with uzawa iteration. Because the problems are getting larger, we are also looking for some other methods for solving the linear system. 
I follow pcfieldsplit tutorial from: https://www.mcs.anl.gov/petsc/documentation/tutorials/MSITutorial.pdf However, it takes about 10s to finish one iteration and overall it requires like 150s to complete one time step with 100k unknowns, which is a long time compared to our current solver 10s for one time step. I tried the following options: 1). -ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_factorization_type lower -fieldsplit_velocity_ksp_type preonly -fieldsplit_velocity_pc_type gamg -fieldsplit_pressure_ksp_type minres -fieldsplit_pressure_pc_type none 2). -ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_factorization_type diag -fieldsplit_velocity_ksp_type preonly -fieldsplit_velocity_pc_type gamg -fieldsplit_pressure_ksp_type minres -fieldsplit_pressure_pc_type none 3). -ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_factorization_type full -fieldsplit_velocity_ksp_type preonly -fieldsplit_velocity_pc_type lu -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_pressure_pc_type jacobi So I am wondering if there is any other options that can help improve the pcfieldsplit performance. Kind Regards, Shidi On 2019-03-17 00:05, Smith, Barry F. wrote: >> On Mar 16, 2019, at 6:50 PM, Y. Shidi via petsc-users >> wrote: >> >> Hello, >> >> I am trying to solve the incompressible n-s equations by >> PCFieldSplit. >> >> The large matrix and vectors are formed by MatCreateNest() >> and VecCreateNest(). >> The system is solved directly by the following command: >> -ksp_type fgmres \ >> -pc_type fieldsplit \ >> -pc_fieldsplit_type schur \ >> -pc_fieldsplit_schur_fact_type full \ >> -ksp_converged_reason \ >> -ksp_monitor_true_residual \ >> -fieldsplit_0_ksp_type preonly \ >> -fieldsplit_0_pc_type cholesky \ >> -fieldsplit_0_pc_factor_mat_solver_package mumps \ >> -mat_mumps_icntl_28 2 \ >> -mat_mumps_icntl_29 2 \ >> -fieldsplit_1_ksp_type preonly \ >> -fieldsplit_1_pc_type jacobi \ >> Output: >> 0 KSP unpreconditioned resid norm 1.214252932161e+04 true resid norm >> 1.214252932161e+04 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP unpreconditioned resid norm 1.642782495109e-02 true resid norm >> 1.642782495109e-02 ||r(i)||/||b|| 1.352916226594e-06 >> Linear solve converged due to CONVERGED_RTOL iterations 1 >> >> The system is solved iteratively by the following command: >> -ksp_type fgmres \ >> -pc_type fieldsplit \ >> -pc_fieldsplit_type schur \ >> -pc_fieldsplit_schur_factorization_type diag \ >> -ksp_converged_reason \ >> -ksp_monitor_true_residual \ >> -fieldsplit_0_ksp_type preonly \ >> -fieldsplit_0_pc_type gamg \ >> -fieldsplit_1_ksp_type minres \ >> -fieldsplit_1_pc_type none \ >> Output: >> 0 KSP unpreconditioned resid norm 1.214252932161e+04 true resid norm >> 1.214252932161e+04 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP unpreconditioned resid norm 2.184037364915e+02 true resid norm >> 2.184037364915e+02 ||r(i)||/||b|| 1.798667565109e-02 >> 2 KSP unpreconditioned resid norm 2.120097409539e+02 true resid norm >> 2.120097409635e+02 ||r(i)||/||b|| 1.746009709742e-02 >> 3 KSP unpreconditioned resid norm 4.364091658268e+01 true resid norm >> 4.364091658575e+01 ||r(i)||/||b|| 3.594054865332e-03 >> 4 KSP unpreconditioned resid norm 2.632671796885e+00 true resid norm >> 2.632671797020e+00 ||r(i)||/||b|| 2.168141189773e-04 >> 5 KSP unpreconditioned resid norm 2.209213998004e+00 true resid norm >> 2.209213980361e+00 ||r(i)||/||b|| 1.819401808180e-04 >> 6 KSP unpreconditioned resid norm 
4.683775185840e-01 true resid norm >> 4.683775085753e-01 ||r(i)||/||b|| 3.857330677735e-05 >> 7 KSP unpreconditioned resid norm 3.042503284736e-02 true resid norm >> 3.042503349258e-02 ||r(i)||/||b|| 2.505658638883e-06 >> >> >> Both methods give answers, but they are different > > What do you mean the answers are different? Do you mean the > solution x from KSPSolve() is different? How are you calculating their > difference and how different are they? > > Since the solutions are only approximate; true residual norm is > around 1.642782495109e-02 and 3.042503349258e-02 for the two > different solvers there will only be a certain number of identical > digits in the two solutions (which depends on the condition number of > the original matrix). You can run both solvers with -ksp_rtol 1.e-12 > and then (assuming everything is working correctly) the two solutions > will be much closer to each other. > > Barry > >> so I am wondering >> if it is possible that you can help me figure out which part I am >> doing wrong. >> >> Thank you for your time. >> >> Kind Regards, >> Shidi From jroman at dsic.upv.es Tue Mar 19 06:08:23 2019 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 19 Mar 2019 12:08:23 +0100 Subject: [petsc-users] SLEPc Build Error In-Reply-To: References: Message-ID: There seems to be a link problem with PETSc. Suggest re-configuring without the option --download-slepc Then, after building PETSc, try 'make check' to make sure that PETSc is built correctly. Then install SLEPc afterwards. Jose > El 19 mar 2019, a las 11:58, Eda Oktay escribi?: > > ================================================================================ > Starting Configure Run at Tue Mar 19 11:53:05 2019 > Configure Options: --prefix=/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug > Working directory: /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/externalpackages/git.slepc > Python version: > 2.7.9 (default, Sep 25 2018, 20:42:16) > [GCC 4.9.2] > make: /usr/bin/make > PETSc source directory: /home/slurm_local/e200781/petsc-3.10.4 > PETSc install directory: /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug > PETSc version: 3.10.4 > PETSc architecture: arch-linux2-c-debug > SLEPc source directory: /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/externalpackages/git.slepc > SLEPc install directory: /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug > SLEPc version: 3.10.1 > ================================================================================ > Checking PETSc installation... 
> #include "petscsnes.h" > int main() { > Vec v; Mat m; KSP k; > PetscInitializeNoArguments(); > VecCreate(PETSC_COMM_WORLD,&v); > MatCreate(PETSC_COMM_WORLD,&m); > KSPCreate(PETSC_COMM_WORLD,&k); > return 0; > } > make[2]: Entering directory '/tmp/slepc-2F1MtJ' > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/bin/mpicc -o checklink.o -c -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -g3 -I/home/slurm_local/e200781/petsc-3.10.4/include -I/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/include `pwd`/checklink.c > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/bin/mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -g3 -o checklink checklink.o -Wl,-rpath,/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib -L/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib -Wl,-rpath,/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib -L/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.9 -L/usr/lib/gcc/x86_64-linux-gnu/4.9 -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -lpetsc -lopenblas -lparmetis -lmetis -lm -lX11 -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lstdc++ -ldl > /usr/bin/ld: warning: libmpi.so.0, needed by /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libopenblas.so, may conflict with libmpi.so.40 > /usr/bin/ld: warning: libmpi.so.0, needed by /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libopenblas.so, may conflict with libmpi.so.40 > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libpetsc.so: undefined reference to `MatPartitioningParmetisSetRepartition' > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libpetsc.so: undefined reference to `MatPartitioningCreate_Parmetis' > collect2: error: ld returned 1 exit status > makefile:2: recipe for target 'checklink' failed > make[2]: *** [checklink] Error 1 > make[2]: Leaving directory '/tmp/slepc-2F1MtJ' > > ERROR: Unable to link with PETSc > > > Jose E. Roman , 19 Mar 2019 Sal, 13:49 tarihinde ?unu yazd?: > Correction: the correct path is $PETSC_DIR/$PETSC_ARCH/externalpackages/git.slepc/$PETSC_ARCH/lib/slepc/conf/configure.log > > > > > El 19 mar 2019, a las 11:46, Jose E. Roman via petsc-users escribi?: > > > > And what is in $SLEPC_DIR/arch-linux2-c-debug/lib/slepc/conf/configure.log ? > > Jose > > > >> El 19 mar 2019, a las 11:41, Eda Oktay via petsc-users escribi?: > >> > >> This is slepc.log: > >> > >> Checking environment... done > >> Checking PETSc installation... 
> >> ERROR: Unable to link with PETSc > >> ERROR: See "arch-linux2-c-debug/lib/slepc/conf/configure.log" file for details > >> > >> > >> > >> Matthew Knepley , 19 Mar 2019 Sal, 13:36 tarihinde ?unu yazd?: > >> On Tue, Mar 19, 2019 at 6:31 AM Eda Oktay via petsc-users wrote: > >> Hello, > >> > >> I am trying to install PETSc with following configure options: > >> > >> ./configure --download-openmpi --download-openblas --download-slepc --download-cmake --download-metis --download-parmetis > >> > >> Compilation is done but after the following command, I got an error: > >> > >> make PETSC_DIR=/home/slurm_local/e200781/petsc-3.10.4 PETSC_ARCH=arch-linux2-c-debug all > >> > >> *** Building slepc *** > >> **************************ERROR************************************* > >> Error building slepc. Check arch-linux2-c-debug/lib/petsc/conf/slepc.log > >> > >> We need slepc.log > >> > >> Thanks, > >> > >> Matt > >> > >> ******************************************************************** > >> /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/petsc/conf/petscrules:46: recipe for target 'slepcbuild' failed > >> make[1]: *** [slepcbuild] Error 1 > >> make[1]: Leaving directory '/home/slurm_local/e200781/petsc-3.10.4' > >> **************************ERROR************************************* > >> Error during compile, check arch-linux2-c-debug/lib/petsc/conf/make.log > >> Send it and arch-linux2-c-debug/lib/petsc/conf/configure.log to petsc-maint at mcs.anl.gov > >> ******************************************************************** > >> makefile:30: recipe for target 'all' failed > >> make: *** [all] Error 1 > >> > >> How can I fix the problem? > >> > >> Thank you, > >> > >> Eda > >> > >> > >> -- > >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > >> -- Norbert Wiener > >> > >> https://www.cse.buffalo.edu/~knepley/ > > > From knepley at gmail.com Tue Mar 19 06:17:43 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 19 Mar 2019 07:17:43 -0400 Subject: [petsc-users] PCFieldSplit gives different results for direct and iterative solver In-Reply-To: References: <0a59ff504728c756b0965f201a97ab7c@cam.ac.uk> Message-ID: On Tue, Mar 19, 2019 at 6:59 AM Y. Shidi via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello Barry, > > Thank you for your reply. > > I reduced the tolerances and get desired solution. > > I am solving a multiphase incompressible n-s problems and currently > we are using augmented lagrangina technique with uzawa iteration. > Because the problems are getting larger, we are also looking for some > other methods for solving the linear system. > I follow pcfieldsplit tutorial from: > https://www.mcs.anl.gov/petsc/documentation/tutorials/MSITutorial.pdf > > However, it takes about 10s to finish one iteration and overall > it requires like 150s to complete one time step with 100k unknowns, > which is a long time compared to our current solver 10s for one > time step. > The first thing to do is look at how many Schur complement iterations you are doing: -fieldsplit_pressure_ksp_monitor_true_residual Perhaps you need a better preconditioner for it. Thanks, Matt > I tried the following options: > 1). 
> -ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur > -pc_fieldsplit_schur_factorization_type lower > -fieldsplit_velocity_ksp_type preonly -fieldsplit_velocity_pc_type gamg > -fieldsplit_pressure_ksp_type minres -fieldsplit_pressure_pc_type none > 2). > -ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur > -pc_fieldsplit_schur_factorization_type diag > -fieldsplit_velocity_ksp_type preonly -fieldsplit_velocity_pc_type gamg > -fieldsplit_pressure_ksp_type minres -fieldsplit_pressure_pc_type none > 3). > -ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur > -pc_fieldsplit_schur_factorization_type full > -fieldsplit_velocity_ksp_type preonly -fieldsplit_velocity_pc_type lu > -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_pressure_pc_type jacobi > > So I am wondering if there is any other options that can help improve > the > pcfieldsplit performance. > > Kind Regards, > Shidi > > > On 2019-03-17 00:05, Smith, Barry F. wrote: > >> On Mar 16, 2019, at 6:50 PM, Y. Shidi via petsc-users > >> wrote: > >> > >> Hello, > >> > >> I am trying to solve the incompressible n-s equations by > >> PCFieldSplit. > >> > >> The large matrix and vectors are formed by MatCreateNest() > >> and VecCreateNest(). > >> The system is solved directly by the following command: > >> -ksp_type fgmres \ > >> -pc_type fieldsplit \ > >> -pc_fieldsplit_type schur \ > >> -pc_fieldsplit_schur_fact_type full \ > >> -ksp_converged_reason \ > >> -ksp_monitor_true_residual \ > >> -fieldsplit_0_ksp_type preonly \ > >> -fieldsplit_0_pc_type cholesky \ > >> -fieldsplit_0_pc_factor_mat_solver_package mumps \ > >> -mat_mumps_icntl_28 2 \ > >> -mat_mumps_icntl_29 2 \ > >> -fieldsplit_1_ksp_type preonly \ > >> -fieldsplit_1_pc_type jacobi \ > >> Output: > >> 0 KSP unpreconditioned resid norm 1.214252932161e+04 true resid norm > >> 1.214252932161e+04 ||r(i)||/||b|| 1.000000000000e+00 > >> 1 KSP unpreconditioned resid norm 1.642782495109e-02 true resid norm > >> 1.642782495109e-02 ||r(i)||/||b|| 1.352916226594e-06 > >> Linear solve converged due to CONVERGED_RTOL iterations 1 > >> > >> The system is solved iteratively by the following command: > >> -ksp_type fgmres \ > >> -pc_type fieldsplit \ > >> -pc_fieldsplit_type schur \ > >> -pc_fieldsplit_schur_factorization_type diag \ > >> -ksp_converged_reason \ > >> -ksp_monitor_true_residual \ > >> -fieldsplit_0_ksp_type preonly \ > >> -fieldsplit_0_pc_type gamg \ > >> -fieldsplit_1_ksp_type minres \ > >> -fieldsplit_1_pc_type none \ > >> Output: > >> 0 KSP unpreconditioned resid norm 1.214252932161e+04 true resid norm > >> 1.214252932161e+04 ||r(i)||/||b|| 1.000000000000e+00 > >> 1 KSP unpreconditioned resid norm 2.184037364915e+02 true resid norm > >> 2.184037364915e+02 ||r(i)||/||b|| 1.798667565109e-02 > >> 2 KSP unpreconditioned resid norm 2.120097409539e+02 true resid norm > >> 2.120097409635e+02 ||r(i)||/||b|| 1.746009709742e-02 > >> 3 KSP unpreconditioned resid norm 4.364091658268e+01 true resid norm > >> 4.364091658575e+01 ||r(i)||/||b|| 3.594054865332e-03 > >> 4 KSP unpreconditioned resid norm 2.632671796885e+00 true resid norm > >> 2.632671797020e+00 ||r(i)||/||b|| 2.168141189773e-04 > >> 5 KSP unpreconditioned resid norm 2.209213998004e+00 true resid norm > >> 2.209213980361e+00 ||r(i)||/||b|| 1.819401808180e-04 > >> 6 KSP unpreconditioned resid norm 4.683775185840e-01 true resid norm > >> 4.683775085753e-01 ||r(i)||/||b|| 3.857330677735e-05 > >> 7 KSP unpreconditioned resid norm 3.042503284736e-02 true resid norm > >> 3.042503349258e-02 
||r(i)||/||b|| 2.505658638883e-06
> >>
> >> Both methods give answers, but they are different
> >
> > What do you mean the answers are different? Do you mean the
> > solution x from KSPSolve() is different? How are you calculating their
> > difference and how different are they?
> >
> > Since the solutions are only approximate (the true residual norms are
> > around 1.642782495109e-02 and 3.042503349258e-02 for the two
> > different solvers), there will only be a certain number of identical
> > digits in the two solutions (which depends on the condition number of
> > the original matrix). You can run both solvers with -ksp_rtol 1.e-12
> > and then (assuming everything is working correctly) the two solutions
> > will be much closer to each other.
> >
> > Barry
> >
> >> so I am wondering
> >> if it is possible that you can help me figure out which part I am
> >> doing wrong.
> >>
> >> Thank you for your time.
> >>
> >> Kind Regards,
> >> Shidi
>
-- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From dave.mayhem23 at gmail.com Tue Mar 19 06:25:11 2019
From: dave.mayhem23 at gmail.com (Dave May)
Date: Tue, 19 Mar 2019 11:25:11 +0000
Subject: [petsc-users] Confusing Schur preconditioner behaviour
In-Reply-To: References:
Message-ID:

Hi Colin,

On Tue, 19 Mar 2019 at 09:33, Cotter, Colin J wrote:

> Hi Dave,
>
> >If you are doing that, then you need to tell fieldsplit to use the Amat
> to define the splits, otherwise it will define the Schur complement as
> >S = B22 - B21 inv(B11) B12
> >preconditioned with B22, whereas what you want is
> >S = A22 - A21 inv(A11) A12
> >preconditioned with B22.
>
> >If your operators are set up this way and you didn't indicate to use Amat
> to define S, this would definitely explain why preonly works but iterating
> on Schur does not.
>
> Yes, thanks - this solves it! I need -pc_use_amat.
>

Okay great. But doesn't that option eradicate your custom Schur complement object which you inserted into the Bmat in the (2,2) slot? I thought you would use the option -pc_fieldsplit_diag_use_amat

In general for fieldsplit (Schur) I found that the best way to manage user-defined Schur complement preconditioners is via PCFieldSplitSetSchurPre().

https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetSchurPre.html#PCFieldSplitSetSchurPre

Also, for solver debugging purposes with fieldsplit and MatNest, I find it incredibly useful to attach textual names to all the matrices going into FieldSplit. You can use PetscObjectSetName() with each of your sub-matrices in the Amat and the Bmat, and any Schur complement operators. The textual names will be displayed in KSP view. In that way you have a better chance of understanding which operators are being used where. (Note that this trick is less useful when the Amat and Bmat are AIJ matrices.)

A short sketch of how this can be wired up follows right after this message, and below that is an example KSPView associated with a 2x2 block system where I've attached the names Auu, Aup, Apu, App, and S* to the Amat sub-matrices and the Schur complement preconditioner.
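The following is only a minimal, untested C sketch of that combination: naming the Nest sub-matrices so they show up in -ksp_view, and handing a user-assembled Schur complement preconditioning matrix to fieldsplit. The variables Auu, Aup, Apu, App, Sp and the function name are illustrative placeholders, not code taken from this thread.

#include <petscksp.h>

/* Sketch only: Auu, Aup, Apu, App are the four blocks of a 2x2 MatNest and
   Sp is a user-assembled matrix used to precondition the Schur complement.
   All of these names are illustrative. */
static PetscErrorCode SetupNamedSchurFieldSplit(MPI_Comm comm,Mat Auu,Mat Aup,Mat Apu,Mat App,Mat Sp,Vec b,Vec x)
{
  Mat            subs[4],Amat;
  KSP            ksp;
  PC             pc;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  /* The textual names are what -ksp_view prints next to each operator */
  ierr = PetscObjectSetName((PetscObject)Auu,"Auu");CHKERRQ(ierr);
  ierr = PetscObjectSetName((PetscObject)Aup,"Aup");CHKERRQ(ierr);
  ierr = PetscObjectSetName((PetscObject)Apu,"Apu");CHKERRQ(ierr);
  ierr = PetscObjectSetName((PetscObject)App,"App");CHKERRQ(ierr);
  ierr = PetscObjectSetName((PetscObject)Sp,"S*");CHKERRQ(ierr);

  subs[0] = Auu; subs[1] = Aup; subs[2] = Apu; subs[3] = App;
  ierr = MatCreateNest(comm,2,NULL,2,NULL,subs,&Amat);CHKERRQ(ierr);

  ierr = KSPCreate(comm,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,Amat,Amat);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCFIELDSPLIT);CHKERRQ(ierr);
  ierr = PCFieldSplitSetType(pc,PC_COMPOSITE_SCHUR);CHKERRQ(ierr);
  /* Pass the user-defined Schur complement preconditioning matrix directly,
     instead of hiding it in the (2,2) slot of the Bmat */
  ierr = PCFieldSplitSetSchurPre(pc,PC_FIELDSPLIT_SCHUR_PRE_USER,Sp);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = MatDestroy(&Amat);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

With a MatNest operator, fieldsplit can usually derive the two splits from the nested blocks; otherwise they would be supplied explicitly with PCFieldSplitSetIS().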
PC Object:(dcy_) 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses (lumped, if requested) A00's diagonal's inverse Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (dcy_fieldsplit_u_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (dcy_fieldsplit_u_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 0., needed 0. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=85728, cols=85728 package used to perform factorization: umfpack total: nonzeros=0, allocated nonzeros=0 total number of mallocs used during MatSetValues calls =0 not using I-node routines UMFPACK run parameters: Control[UMFPACK_PRL]: 1. Control[UMFPACK_STRATEGY]: 0. Control[UMFPACK_DENSE_COL]: 0.2 Control[UMFPACK_DENSE_ROW]: 0.2 Control[UMFPACK_AMD_DENSE]: 10. Control[UMFPACK_BLOCK_SIZE]: 32. Control[UMFPACK_FIXQ]: 0. Control[UMFPACK_AGGRESSIVE]: 1. Control[UMFPACK_PIVOT_TOLERANCE]: 0.1 Control[UMFPACK_SYM_PIVOT_TOLERANCE]: 0.001 Control[UMFPACK_SCALE]: 1. Control[UMFPACK_ALLOC_INIT]: 0.7 Control[UMFPACK_DROPTOL]: 0. Control[UMFPACK_IRSTEP]: 0. Control[UMFPACK_ORDERING]: AMD (not using the PETSc ordering) linear system matrix = precond matrix: Mat Object: Auu (dcy_fieldsplit_u_) 1 MPI processes type: seqaij rows=85728, cols=85728 total: nonzeros=1028736, allocated nonzeros=1028736 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 21432 nodes, limit used is 5 KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (dcy_fieldsplit_p_) 1 MPI processes type: fgmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=300, initial guess is zero tolerances: relative=0.01, absolute=1e-50, divergence=10000. right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (dcy_fieldsplit_p_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 0., needed 0. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=32542, cols=32542 package used to perform factorization: umfpack total: nonzeros=0, allocated nonzeros=0 total number of mallocs used during MatSetValues calls =0 not using I-node routines UMFPACK run parameters: Control[UMFPACK_PRL]: 1. Control[UMFPACK_STRATEGY]: 0. Control[UMFPACK_DENSE_COL]: 0.2 Control[UMFPACK_DENSE_ROW]: 0.2 Control[UMFPACK_AMD_DENSE]: 10. Control[UMFPACK_BLOCK_SIZE]: 32. Control[UMFPACK_FIXQ]: 0. Control[UMFPACK_AGGRESSIVE]: 1. Control[UMFPACK_PIVOT_TOLERANCE]: 0.1 Control[UMFPACK_SYM_PIVOT_TOLERANCE]: 0.001 Control[UMFPACK_SCALE]: 1. Control[UMFPACK_ALLOC_INIT]: 0.7 Control[UMFPACK_DROPTOL]: 0. Control[UMFPACK_IRSTEP]: 0. 
Control[UMFPACK_ORDERING]: AMD (not using the PETSc ordering) linear system matrix followed by preconditioner matrix: Mat Object: (dcy_fieldsplit_p_) 1 MPI processes type: schurcomplement rows=32542, cols=32542 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: App (dcy_fieldsplit_p_) 1 MPI processes type: seqaij rows=32542, cols=32542 total: nonzeros=548482, allocated nonzeros=548482 total number of mallocs used during MatSetValues calls =0 not using I-node routines A10 Mat Object: Apu 1 MPI processes type: seqaij rows=32542, cols=85728 total: nonzeros=857280, allocated nonzeros=921990 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (dcy_fieldsplit_u_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (dcy_fieldsplit_u_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 0., needed 0. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=85728, cols=85728 package used to perform factorization: umfpack total: nonzeros=0, allocated nonzeros=0 total number of mallocs used during MatSetValues calls =0 not using I-node routines UMFPACK run parameters: Control[UMFPACK_PRL]: 1. Control[UMFPACK_STRATEGY]: 0. Control[UMFPACK_DENSE_COL]: 0.2 Control[UMFPACK_DENSE_ROW]: 0.2 Control[UMFPACK_AMD_DENSE]: 10. Control[UMFPACK_BLOCK_SIZE]: 32. Control[UMFPACK_FIXQ]: 0. Control[UMFPACK_AGGRESSIVE]: 1. Control[UMFPACK_PIVOT_TOLERANCE]: 0.1 Control[UMFPACK_SYM_PIVOT_TOLERANCE]: 0.001 Control[UMFPACK_SCALE]: 1. Control[UMFPACK_ALLOC_INIT]: 0.7 Control[UMFPACK_DROPTOL]: 0. Control[UMFPACK_IRSTEP]: 0. Control[UMFPACK_ORDERING]: AMD (not using the PETSc ordering) linear system matrix = precond matrix: Mat Object: Auu (dcy_fieldsplit_u_) 1 MPI processes type: seqaij rows=85728, cols=85728 total: nonzeros=1028736, allocated nonzeros=1028736 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 21432 nodes, limit used is 5 A01 Mat Object: Aup 1 MPI processes type: seqaij rows=85728, cols=32542 total: nonzeros=857280, allocated nonzeros=942634 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 21432 nodes, limit used is 5 Mat Object: S* 1 MPI processes type: seqaij rows=32542, cols=32542 total: nonzeros=548482, allocated nonzeros=548482 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: nest rows=118270, cols=118270 has attached null space Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : name="Auu", prefix="dcy_fieldsplit_u_", type=seqaij, rows=85728, cols=85728 (0,1) : name="Aup", type=seqaij, rows=85728, cols=32542 (1,0) : name="Apu", type=seqaij, rows=32542, cols=85728 (1,1) : name="App", prefix="dcy_fieldsplit_p_", type=seqaij, rows=32542, cols=32542 > > all the best > --Colin > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ys453 at cam.ac.uk Tue Mar 19 06:34:49 2019 From: ys453 at cam.ac.uk (Y. 
Shidi) Date: Tue, 19 Mar 2019 11:34:49 +0000 Subject: [petsc-users] PCFieldSplit gives different results for direct and iterative solver In-Reply-To: References: <0a59ff504728c756b0965f201a97ab7c@cam.ac.uk> Message-ID: <11fe8314758751372d990fcfcf8545bf@cam.ac.uk> > Perhaps you need a better preconditioner for it. Hello Matt, Thank you for your help. Yes, I think I need a better preconditioner; it requires about 600 iterations for Schur complement. Is there any tutorials for this? Or shall I just try different combinations and find the most suitable one? Kind Regards, Shidi On 2019-03-19 11:17, Matthew Knepley wrote: > On Tue, Mar 19, 2019 at 6:59 AM Y. Shidi via petsc-users > wrote: > >> Hello Barry, >> >> Thank you for your reply. >> >> I reduced the tolerances and get desired solution. >> >> I am solving a multiphase incompressible n-s problems and currently >> we are using augmented lagrangina technique with uzawa iteration. >> Because the problems are getting larger, we are also looking for >> some >> other methods for solving the linear system. >> I follow pcfieldsplit tutorial from: >> > https://www.mcs.anl.gov/petsc/documentation/tutorials/MSITutorial.pdf >> [1] >> >> However, it takes about 10s to finish one iteration and overall >> it requires like 150s to complete one time step with 100k unknowns, >> which is a long time compared to our current solver 10s for one >> time step. > > The first thing to do is look at how many Schur complement iterations > you are doing: > > -fieldsplit_pressure_ksp_monitor_true_residual > > Perhaps you need a better preconditioner for it. > > Thanks, > > Matt > >> I tried the following options: >> 1). >> -ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur >> -pc_fieldsplit_schur_factorization_type lower >> -fieldsplit_velocity_ksp_type preonly >> -fieldsplit_velocity_pc_type gamg >> -fieldsplit_pressure_ksp_type minres >> -fieldsplit_pressure_pc_type none >> 2). >> -ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur >> -pc_fieldsplit_schur_factorization_type diag >> -fieldsplit_velocity_ksp_type preonly >> -fieldsplit_velocity_pc_type gamg >> -fieldsplit_pressure_ksp_type minres >> -fieldsplit_pressure_pc_type none >> 3). >> -ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur >> -pc_fieldsplit_schur_factorization_type full >> -fieldsplit_velocity_ksp_type preonly >> -fieldsplit_velocity_pc_type lu >> -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_pressure_pc_type >> jacobi >> >> So I am wondering if there is any other options that can help >> improve >> the >> pcfieldsplit performance. >> >> Kind Regards, >> Shidi >> >> On 2019-03-17 00:05, Smith, Barry F. wrote: >>>> On Mar 16, 2019, at 6:50 PM, Y. Shidi via petsc-users >>>> wrote: >>>> >>>> Hello, >>>> >>>> I am trying to solve the incompressible n-s equations by >>>> PCFieldSplit. >>>> >>>> The large matrix and vectors are formed by MatCreateNest() >>>> and VecCreateNest(). 
>>>> The system is solved directly by the following command: >>>> -ksp_type fgmres >>>> -pc_type fieldsplit >>>> -pc_fieldsplit_type schur >>>> -pc_fieldsplit_schur_fact_type full >>>> -ksp_converged_reason >>>> -ksp_monitor_true_residual >>>> -fieldsplit_0_ksp_type preonly >>>> -fieldsplit_0_pc_type cholesky >>>> -fieldsplit_0_pc_factor_mat_solver_package mumps >>>> -mat_mumps_icntl_28 2 >>>> -mat_mumps_icntl_29 2 >>>> -fieldsplit_1_ksp_type preonly >>>> -fieldsplit_1_pc_type jacobi >>>> Output: >>>> 0 KSP unpreconditioned resid norm 1.214252932161e+04 true resid >> norm >>>> 1.214252932161e+04 ||r(i)||/||b|| 1.000000000000e+00 >>>> 1 KSP unpreconditioned resid norm 1.642782495109e-02 true resid >> norm >>>> 1.642782495109e-02 ||r(i)||/||b|| 1.352916226594e-06 >>>> Linear solve converged due to CONVERGED_RTOL iterations 1 >>>> >>>> The system is solved iteratively by the following command: >>>> -ksp_type fgmres >>>> -pc_type fieldsplit >>>> -pc_fieldsplit_type schur >>>> -pc_fieldsplit_schur_factorization_type diag >>>> -ksp_converged_reason >>>> -ksp_monitor_true_residual >>>> -fieldsplit_0_ksp_type preonly >>>> -fieldsplit_0_pc_type gamg >>>> -fieldsplit_1_ksp_type minres >>>> -fieldsplit_1_pc_type none >>>> Output: >>>> 0 KSP unpreconditioned resid norm 1.214252932161e+04 true resid >> norm >>>> 1.214252932161e+04 ||r(i)||/||b|| 1.000000000000e+00 >>>> 1 KSP unpreconditioned resid norm 2.184037364915e+02 true resid >> norm >>>> 2.184037364915e+02 ||r(i)||/||b|| 1.798667565109e-02 >>>> 2 KSP unpreconditioned resid norm 2.120097409539e+02 true resid >> norm >>>> 2.120097409635e+02 ||r(i)||/||b|| 1.746009709742e-02 >>>> 3 KSP unpreconditioned resid norm 4.364091658268e+01 true resid >> norm >>>> 4.364091658575e+01 ||r(i)||/||b|| 3.594054865332e-03 >>>> 4 KSP unpreconditioned resid norm 2.632671796885e+00 true resid >> norm >>>> 2.632671797020e+00 ||r(i)||/||b|| 2.168141189773e-04 >>>> 5 KSP unpreconditioned resid norm 2.209213998004e+00 true resid >> norm >>>> 2.209213980361e+00 ||r(i)||/||b|| 1.819401808180e-04 >>>> 6 KSP unpreconditioned resid norm 4.683775185840e-01 true resid >> norm >>>> 4.683775085753e-01 ||r(i)||/||b|| 3.857330677735e-05 >>>> 7 KSP unpreconditioned resid norm 3.042503284736e-02 true resid >> norm >>>> 3.042503349258e-02 ||r(i)||/||b|| 2.505658638883e-06 >>>> >>>> >>>> Both methods give answers, but they are different >>> >>> What do you mean the answers are different? Do you mean the >>> solution x from KSPSolve() is different? How are you calculating >> their >>> difference and how different are they? >>> >>> Since the solutions are only approximate; true residual norm >> is >>> around 1.642782495109e-02 and 3.042503349258e-02 for the two >>> different solvers there will only be a certain number of >> identical >>> digits in the two solutions (which depends on the condition >> number of >>> the original matrix). You can run both solvers with -ksp_rtol >> 1.e-12 >>> and then (assuming everything is working correctly) the two >> solutions >>> will be much closer to each other. >>> >>> Barry >>> >>>> so I am wondering >>>> if it is possible that you can help me figure out which part I >> am >>>> doing wrong. >>>> >>>> Thank you for your time. >>>> >>>> Kind Regards, >>>> Shidi > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. 
> -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ [2] > > > Links: > ------ > [1] > https://www.mcs.anl.gov/petsc/documentation/tutorials/MSITutorial.pdf > [2] http://www.cse.buffalo.edu/~knepley/ From eda.oktay at metu.edu.tr Tue Mar 19 07:02:46 2019 From: eda.oktay at metu.edu.tr (Eda Oktay) Date: Tue, 19 Mar 2019 15:02:46 +0300 Subject: [petsc-users] SLEPc Build Error In-Reply-To: References: Message-ID: Without slepc, I configured petsc succesfully. Then I installed slepc with following steps: export PETSC_ARCH=arch-linux2-c-debug export PETSC_DIR=/home/slurm_local/e200781/petsc-3.10.4 export SLEPC_DIR=/home/slurm_local/e200781/slepc-3.10.2 cd slepc-3.10.2 ./configure However, I still get the error: Checking environment... done Checking PETSc installation... ERROR: Unable to link with PETSc ERROR: See "arch-linux2-c-debug/lib/slepc/conf/configure.log" file for details Where the configure.log is: ================================================================================ Starting Configure Run at Tue Mar 19 14:57:21 2019 Configure Options: Working directory: /home/slurm_local/e200781/slepc-3.10.2 Python version: 2.7.9 (default, Sep 25 2018, 20:42:16) [GCC 4.9.2] make: /usr/bin/make PETSc source directory: /home/slurm_local/e200781/petsc-3.10.4 PETSc install directory: /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug PETSc version: 3.10.4 PETSc architecture: arch-linux2-c-debug SLEPc source directory: /home/slurm_local/e200781/slepc-3.10.2 SLEPc version: 3.10.2 ================================================================================ Checking PETSc installation... #include "petscsnes.h" int main() { Vec v; Mat m; KSP k; PetscInitializeNoArguments(); VecCreate(PETSC_COMM_WORLD,&v); MatCreate(PETSC_COMM_WORLD,&m); KSPCreate(PETSC_COMM_WORLD,&k); return 0; } /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/bin/mpicc -o checklink.o -c -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -g3 -I/home/slurm_local/e200781/petsc-3.10.4/include -I/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/include `pwd`/checklink.c /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/bin/mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -g3 -o checklink checklink.o -Wl,-rpath,/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib -L/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib -Wl,-rpath,/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib -L/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.9 -L/usr/lib/gcc/x86_64-linux-gnu/4.9 -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -lpetsc -lopenblas -lparmetis -lmetis -lm -lX11 -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lstdc++ -ldl /usr/bin/ld: warning: libmpi.so.0, needed by /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libopenblas.so, may conflict with libmpi.so.40 /usr/bin/ld: warning: libmpi.so.0, needed by /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libopenblas.so, may conflict with libmpi.so.40 /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libpetsc.so: undefined reference to `MatPartitioningParmetisSetRepartition' 
/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libpetsc.so: undefined reference to `MatPartitioningCreate_Parmetis' collect2: error: ld returned 1 exit status makefile:2: recipe for target 'checklink' failed make: *** [checklink] Error 1 ERROR: Unable to link with PETSc Jose E. Roman , 19 Mar 2019 Sal, 14:08 tarihinde ?unu yazd?: > There seems to be a link problem with PETSc. > Suggest re-configuring without the option --download-slepc > Then, after building PETSc, try 'make check' to make sure that PETSc is > built correctly. Then install SLEPc afterwards. > Jose > > > > El 19 mar 2019, a las 11:58, Eda Oktay escribi?: > > > > > ================================================================================ > > Starting Configure Run at Tue Mar 19 11:53:05 2019 > > Configure Options: > --prefix=/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug > > Working directory: > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/externalpackages/git.slepc > > Python version: > > 2.7.9 (default, Sep 25 2018, 20:42:16) > > [GCC 4.9.2] > > make: /usr/bin/make > > PETSc source directory: /home/slurm_local/e200781/petsc-3.10.4 > > PETSc install directory: > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug > > PETSc version: 3.10.4 > > PETSc architecture: arch-linux2-c-debug > > SLEPc source directory: > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/externalpackages/git.slepc > > SLEPc install directory: > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug > > SLEPc version: 3.10.1 > > > ================================================================================ > > Checking PETSc installation... > > #include "petscsnes.h" > > int main() { > > Vec v; Mat m; KSP k; > > PetscInitializeNoArguments(); > > VecCreate(PETSC_COMM_WORLD,&v); > > MatCreate(PETSC_COMM_WORLD,&m); > > KSPCreate(PETSC_COMM_WORLD,&k); > > return 0; > > } > > make[2]: Entering directory '/tmp/slepc-2F1MtJ' > > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/bin/mpicc -o > checklink.o -c -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing > -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -g3 > -I/home/slurm_local/e200781/petsc-3.10.4/include > -I/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/include > `pwd`/checklink.c > > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/bin/mpicc > -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas > -fstack-protector -fvisibility=hidden -g3 -o checklink checklink.o > -Wl,-rpath,/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib > -L/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib > -Wl,-rpath,/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib > -L/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib > -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.9 > -L/usr/lib/gcc/x86_64-linux-gnu/4.9 -Wl,-rpath,/usr/lib/x86_64-linux-gnu > -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu > -L/lib/x86_64-linux-gnu -lpetsc -lopenblas -lparmetis -lmetis -lm -lX11 > -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi > -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lstdc++ -ldl > > /usr/bin/ld: warning: libmpi.so.0, needed by > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libopenblas.so, > may conflict with libmpi.so.40 > > /usr/bin/ld: warning: libmpi.so.0, needed by > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libopenblas.so, > may conflict with libmpi.so.40 > > > 
/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libpetsc.so: > undefined reference to `MatPartitioningParmetisSetRepartition' > > > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libpetsc.so: > undefined reference to `MatPartitioningCreate_Parmetis' > > collect2: error: ld returned 1 exit status > > makefile:2: recipe for target 'checklink' failed > > make[2]: *** [checklink] Error 1 > > make[2]: Leaving directory '/tmp/slepc-2F1MtJ' > > > > ERROR: Unable to link with PETSc > > > > > > Jose E. Roman , 19 Mar 2019 Sal, 13:49 tarihinde > ?unu yazd?: > > Correction: the correct path is > $PETSC_DIR/$PETSC_ARCH/externalpackages/git.slepc/$PETSC_ARCH/lib/slepc/conf/configure.log > > > > > > > > > El 19 mar 2019, a las 11:46, Jose E. Roman via petsc-users < > petsc-users at mcs.anl.gov> escribi?: > > > > > > And what is in > $SLEPC_DIR/arch-linux2-c-debug/lib/slepc/conf/configure.log ? > > > Jose > > > > > >> El 19 mar 2019, a las 11:41, Eda Oktay via petsc-users < > petsc-users at mcs.anl.gov> escribi?: > > >> > > >> This is slepc.log: > > >> > > >> Checking environment... done > > >> Checking PETSc installation... > > >> ERROR: Unable to link with PETSc > > >> ERROR: See "arch-linux2-c-debug/lib/slepc/conf/configure.log" file > for details > > >> > > >> > > >> > > >> Matthew Knepley , 19 Mar 2019 Sal, 13:36 > tarihinde ?unu yazd?: > > >> On Tue, Mar 19, 2019 at 6:31 AM Eda Oktay via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > >> Hello, > > >> > > >> I am trying to install PETSc with following configure options: > > >> > > >> ./configure --download-openmpi --download-openblas --download-slepc > --download-cmake --download-metis --download-parmetis > > >> > > >> Compilation is done but after the following command, I got an error: > > >> > > >> make PETSC_DIR=/home/slurm_local/e200781/petsc-3.10.4 > PETSC_ARCH=arch-linux2-c-debug all > > >> > > >> *** Building slepc *** > > >> **************************ERROR************************************* > > >> Error building slepc. Check > arch-linux2-c-debug/lib/petsc/conf/slepc.log > > >> > > >> We need slepc.log > > >> > > >> Thanks, > > >> > > >> Matt > > >> > > >> ******************************************************************** > > >> > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/petsc/conf/petscrules:46: > recipe for target 'slepcbuild' failed > > >> make[1]: *** [slepcbuild] Error 1 > > >> make[1]: Leaving directory '/home/slurm_local/e200781/petsc-3.10.4' > > >> **************************ERROR************************************* > > >> Error during compile, check > arch-linux2-c-debug/lib/petsc/conf/make.log > > >> Send it and arch-linux2-c-debug/lib/petsc/conf/configure.log to > petsc-maint at mcs.anl.gov > > >> ******************************************************************** > > >> makefile:30: recipe for target 'all' failed > > >> make: *** [all] Error 1 > > >> > > >> How can I fix the problem? > > >> > > >> Thank you, > > >> > > >> Eda > > >> > > >> > > >> -- > > >> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > >> -- Norbert Wiener > > >> > > >> https://www.cse.buffalo.edu/~knepley/ > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Tue Mar 19 07:02:46 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 19 Mar 2019 08:02:46 -0400 Subject: [petsc-users] PCFieldSplit gives different results for direct and iterative solver In-Reply-To: <11fe8314758751372d990fcfcf8545bf@cam.ac.uk> References: <0a59ff504728c756b0965f201a97ab7c@cam.ac.uk> <11fe8314758751372d990fcfcf8545bf@cam.ac.uk> Message-ID: On Tue, Mar 19, 2019 at 7:34 AM Y. Shidi wrote: > > Perhaps you need a better preconditioner for it. > > Hello Matt, > Thank you for your help. > Yes, I think I need a better preconditioner; > it requires about 600 iterations for Schur complement. > Is there any tutorials for this? Or shall I just > try different combinations and find the most suitable > one? > For incompressible Stokes with a constant viscosity, the Schur complement is spectrally equivalent to the weighted mass matrix 1/nu M Using that for the preconditioning matrix is the first step. It is more complicated for Navier-Stokes, but starting with a mass matrix can help, especially for slow flows. For very new approaches to NS, see something like https://arxiv.org/abs/1810.03315 Thanks, Matt > Kind Regards, > Shidi > > On 2019-03-19 11:17, Matthew Knepley wrote: > > On Tue, Mar 19, 2019 at 6:59 AM Y. Shidi via petsc-users > > wrote: > > > >> Hello Barry, > >> > >> Thank you for your reply. > >> > >> I reduced the tolerances and get desired solution. > >> > >> I am solving a multiphase incompressible n-s problems and currently > >> we are using augmented lagrangina technique with uzawa iteration. > >> Because the problems are getting larger, we are also looking for > >> some > >> other methods for solving the linear system. > >> I follow pcfieldsplit tutorial from: > >> > > https://www.mcs.anl.gov/petsc/documentation/tutorials/MSITutorial.pdf > >> [1] > >> > >> However, it takes about 10s to finish one iteration and overall > >> it requires like 150s to complete one time step with 100k unknowns, > >> which is a long time compared to our current solver 10s for one > >> time step. > > > > The first thing to do is look at how many Schur complement iterations > > you are doing: > > > > -fieldsplit_pressure_ksp_monitor_true_residual > > > > Perhaps you need a better preconditioner for it. > > > > Thanks, > > > > Matt > > > >> I tried the following options: > >> 1). > >> -ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur > >> -pc_fieldsplit_schur_factorization_type lower > >> -fieldsplit_velocity_ksp_type preonly > >> -fieldsplit_velocity_pc_type gamg > >> -fieldsplit_pressure_ksp_type minres > >> -fieldsplit_pressure_pc_type none > >> 2). > >> -ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur > >> -pc_fieldsplit_schur_factorization_type diag > >> -fieldsplit_velocity_ksp_type preonly > >> -fieldsplit_velocity_pc_type gamg > >> -fieldsplit_pressure_ksp_type minres > >> -fieldsplit_pressure_pc_type none > >> 3). > >> -ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur > >> -pc_fieldsplit_schur_factorization_type full > >> -fieldsplit_velocity_ksp_type preonly > >> -fieldsplit_velocity_pc_type lu > >> -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_pressure_pc_type > >> jacobi > >> > >> So I am wondering if there is any other options that can help > >> improve > >> the > >> pcfieldsplit performance. > >> > >> Kind Regards, > >> Shidi > >> > >> On 2019-03-17 00:05, Smith, Barry F. wrote: > >>>> On Mar 16, 2019, at 6:50 PM, Y. 
Shidi via petsc-users > >>>> wrote: > >>>> > >>>> Hello, > >>>> > >>>> I am trying to solve the incompressible n-s equations by > >>>> PCFieldSplit. > >>>> > >>>> The large matrix and vectors are formed by MatCreateNest() > >>>> and VecCreateNest(). > >>>> The system is solved directly by the following command: > >>>> -ksp_type fgmres > >>>> -pc_type fieldsplit > >>>> -pc_fieldsplit_type schur > >>>> -pc_fieldsplit_schur_fact_type full > >>>> -ksp_converged_reason > >>>> -ksp_monitor_true_residual > >>>> -fieldsplit_0_ksp_type preonly > >>>> -fieldsplit_0_pc_type cholesky > >>>> -fieldsplit_0_pc_factor_mat_solver_package mumps > >>>> -mat_mumps_icntl_28 2 > >>>> -mat_mumps_icntl_29 2 > >>>> -fieldsplit_1_ksp_type preonly > >>>> -fieldsplit_1_pc_type jacobi > >>>> Output: > >>>> 0 KSP unpreconditioned resid norm 1.214252932161e+04 true resid > >> norm > >>>> 1.214252932161e+04 ||r(i)||/||b|| 1.000000000000e+00 > >>>> 1 KSP unpreconditioned resid norm 1.642782495109e-02 true resid > >> norm > >>>> 1.642782495109e-02 ||r(i)||/||b|| 1.352916226594e-06 > >>>> Linear solve converged due to CONVERGED_RTOL iterations 1 > >>>> > >>>> The system is solved iteratively by the following command: > >>>> -ksp_type fgmres > >>>> -pc_type fieldsplit > >>>> -pc_fieldsplit_type schur > >>>> -pc_fieldsplit_schur_factorization_type diag > >>>> -ksp_converged_reason > >>>> -ksp_monitor_true_residual > >>>> -fieldsplit_0_ksp_type preonly > >>>> -fieldsplit_0_pc_type gamg > >>>> -fieldsplit_1_ksp_type minres > >>>> -fieldsplit_1_pc_type none > >>>> Output: > >>>> 0 KSP unpreconditioned resid norm 1.214252932161e+04 true resid > >> norm > >>>> 1.214252932161e+04 ||r(i)||/||b|| 1.000000000000e+00 > >>>> 1 KSP unpreconditioned resid norm 2.184037364915e+02 true resid > >> norm > >>>> 2.184037364915e+02 ||r(i)||/||b|| 1.798667565109e-02 > >>>> 2 KSP unpreconditioned resid norm 2.120097409539e+02 true resid > >> norm > >>>> 2.120097409635e+02 ||r(i)||/||b|| 1.746009709742e-02 > >>>> 3 KSP unpreconditioned resid norm 4.364091658268e+01 true resid > >> norm > >>>> 4.364091658575e+01 ||r(i)||/||b|| 3.594054865332e-03 > >>>> 4 KSP unpreconditioned resid norm 2.632671796885e+00 true resid > >> norm > >>>> 2.632671797020e+00 ||r(i)||/||b|| 2.168141189773e-04 > >>>> 5 KSP unpreconditioned resid norm 2.209213998004e+00 true resid > >> norm > >>>> 2.209213980361e+00 ||r(i)||/||b|| 1.819401808180e-04 > >>>> 6 KSP unpreconditioned resid norm 4.683775185840e-01 true resid > >> norm > >>>> 4.683775085753e-01 ||r(i)||/||b|| 3.857330677735e-05 > >>>> 7 KSP unpreconditioned resid norm 3.042503284736e-02 true resid > >> norm > >>>> 3.042503349258e-02 ||r(i)||/||b|| 2.505658638883e-06 > >>>> > >>>> > >>>> Both methods give answers, but they are different > >>> > >>> What do you mean the answers are different? Do you mean the > >>> solution x from KSPSolve() is different? How are you calculating > >> their > >>> difference and how different are they? > >>> > >>> Since the solutions are only approximate; true residual norm > >> is > >>> around 1.642782495109e-02 and 3.042503349258e-02 for the two > >>> different solvers there will only be a certain number of > >> identical > >>> digits in the two solutions (which depends on the condition > >> number of > >>> the original matrix). You can run both solvers with -ksp_rtol > >> 1.e-12 > >>> and then (assuming everything is working correctly) the two > >> solutions > >>> will be much closer to each other. 
> >>> > >>> Barry > >>> > >>>> so I am wondering > >>>> if it is possible that you can help me figure out which part I > >> am > >>>> doing wrong. > >>>> > >>>> Thank you for your time. > >>>> > >>>> Kind Regards, > >>>> Shidi > > > > -- > > > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which > > their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ [2] > > > > > > Links: > > ------ > > [1] > > https://www.mcs.anl.gov/petsc/documentation/tutorials/MSITutorial.pdf > > [2] http://www.cse.buffalo.edu/~knepley/ > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Tue Mar 19 07:15:34 2019 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 19 Mar 2019 13:15:34 +0100 Subject: [petsc-users] SLEPc Build Error In-Reply-To: References: Message-ID: <7624A7BD-0E79-400F-92EF-08186E98E8A8@dsic.upv.es> What is the output of 'make check' in $PETSC_DIR ? > El 19 mar 2019, a las 13:02, Eda Oktay escribi?: > > Without slepc, I configured petsc succesfully. Then I installed slepc with following steps: > > export PETSC_ARCH=arch-linux2-c-debug > export PETSC_DIR=/home/slurm_local/e200781/petsc-3.10.4 > export SLEPC_DIR=/home/slurm_local/e200781/slepc-3.10.2 > cd slepc-3.10.2 > ./configure > > However, I still get the error: > > Checking environment... done > Checking PETSc installation... > ERROR: Unable to link with PETSc > ERROR: See "arch-linux2-c-debug/lib/slepc/conf/configure.log" file for details > > Where the configure.log is: > > ================================================================================ > Starting Configure Run at Tue Mar 19 14:57:21 2019 > Configure Options: > Working directory: /home/slurm_local/e200781/slepc-3.10.2 > Python version: > 2.7.9 (default, Sep 25 2018, 20:42:16) > [GCC 4.9.2] > make: /usr/bin/make > PETSc source directory: /home/slurm_local/e200781/petsc-3.10.4 > PETSc install directory: /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug > PETSc version: 3.10.4 > PETSc architecture: arch-linux2-c-debug > SLEPc source directory: /home/slurm_local/e200781/slepc-3.10.2 > SLEPc version: 3.10.2 > ================================================================================ > Checking PETSc installation... 
> #include "petscsnes.h" > int main() { > Vec v; Mat m; KSP k; > PetscInitializeNoArguments(); > VecCreate(PETSC_COMM_WORLD,&v); > MatCreate(PETSC_COMM_WORLD,&m); > KSPCreate(PETSC_COMM_WORLD,&k); > return 0; > } > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/bin/mpicc -o checklink.o -c -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -g3 -I/home/slurm_local/e200781/petsc-3.10.4/include -I/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/include `pwd`/checklink.c > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/bin/mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -g3 -o checklink checklink.o -Wl,-rpath,/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib -L/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib -Wl,-rpath,/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib -L/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.9 -L/usr/lib/gcc/x86_64-linux-gnu/4.9 -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -lpetsc -lopenblas -lparmetis -lmetis -lm -lX11 -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lstdc++ -ldl > /usr/bin/ld: warning: libmpi.so.0, needed by /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libopenblas.so, may conflict with libmpi.so.40 > /usr/bin/ld: warning: libmpi.so.0, needed by /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libopenblas.so, may conflict with libmpi.so.40 > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libpetsc.so: undefined reference to `MatPartitioningParmetisSetRepartition' > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libpetsc.so: undefined reference to `MatPartitioningCreate_Parmetis' > collect2: error: ld returned 1 exit status > makefile:2: recipe for target 'checklink' failed > make: *** [checklink] Error 1 > > ERROR: Unable to link with PETSc > > Jose E. Roman , 19 Mar 2019 Sal, 14:08 tarihinde ?unu yazd?: > There seems to be a link problem with PETSc. > Suggest re-configuring without the option --download-slepc > Then, after building PETSc, try 'make check' to make sure that PETSc is built correctly. Then install SLEPc afterwards. 
> Jose > > > > El 19 mar 2019, a las 11:58, Eda Oktay escribi?: > > > > ================================================================================ > > Starting Configure Run at Tue Mar 19 11:53:05 2019 > > Configure Options: --prefix=/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug > > Working directory: /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/externalpackages/git.slepc > > Python version: > > 2.7.9 (default, Sep 25 2018, 20:42:16) > > [GCC 4.9.2] > > make: /usr/bin/make > > PETSc source directory: /home/slurm_local/e200781/petsc-3.10.4 > > PETSc install directory: /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug > > PETSc version: 3.10.4 > > PETSc architecture: arch-linux2-c-debug > > SLEPc source directory: /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/externalpackages/git.slepc > > SLEPc install directory: /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug > > SLEPc version: 3.10.1 > > ================================================================================ > > Checking PETSc installation... > > #include "petscsnes.h" > > int main() { > > Vec v; Mat m; KSP k; > > PetscInitializeNoArguments(); > > VecCreate(PETSC_COMM_WORLD,&v); > > MatCreate(PETSC_COMM_WORLD,&m); > > KSPCreate(PETSC_COMM_WORLD,&k); > > return 0; > > } > > make[2]: Entering directory '/tmp/slepc-2F1MtJ' > > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/bin/mpicc -o checklink.o -c -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -g3 -I/home/slurm_local/e200781/petsc-3.10.4/include -I/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/include `pwd`/checklink.c > > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/bin/mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -g3 -o checklink checklink.o -Wl,-rpath,/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib -L/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib -Wl,-rpath,/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib -L/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.9 -L/usr/lib/gcc/x86_64-linux-gnu/4.9 -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -lpetsc -lopenblas -lparmetis -lmetis -lm -lX11 -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lstdc++ -ldl > > /usr/bin/ld: warning: libmpi.so.0, needed by /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libopenblas.so, may conflict with libmpi.so.40 > > /usr/bin/ld: warning: libmpi.so.0, needed by /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libopenblas.so, may conflict with libmpi.so.40 > > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libpetsc.so: undefined reference to `MatPartitioningParmetisSetRepartition' > > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libpetsc.so: undefined reference to `MatPartitioningCreate_Parmetis' > > collect2: error: ld returned 1 exit status > > makefile:2: recipe for target 'checklink' failed > > make[2]: *** [checklink] Error 1 > > make[2]: Leaving directory '/tmp/slepc-2F1MtJ' > > > > ERROR: Unable to link with PETSc > > > > > > Jose E. 
Roman , 19 Mar 2019 Sal, 13:49 tarihinde ?unu yazd?: > > Correction: the correct path is $PETSC_DIR/$PETSC_ARCH/externalpackages/git.slepc/$PETSC_ARCH/lib/slepc/conf/configure.log > > > > > > > > > El 19 mar 2019, a las 11:46, Jose E. Roman via petsc-users escribi?: > > > > > > And what is in $SLEPC_DIR/arch-linux2-c-debug/lib/slepc/conf/configure.log ? > > > Jose > > > > > >> El 19 mar 2019, a las 11:41, Eda Oktay via petsc-users escribi?: > > >> > > >> This is slepc.log: > > >> > > >> Checking environment... done > > >> Checking PETSc installation... > > >> ERROR: Unable to link with PETSc > > >> ERROR: See "arch-linux2-c-debug/lib/slepc/conf/configure.log" file for details > > >> > > >> > > >> > > >> Matthew Knepley , 19 Mar 2019 Sal, 13:36 tarihinde ?unu yazd?: > > >> On Tue, Mar 19, 2019 at 6:31 AM Eda Oktay via petsc-users wrote: > > >> Hello, > > >> > > >> I am trying to install PETSc with following configure options: > > >> > > >> ./configure --download-openmpi --download-openblas --download-slepc --download-cmake --download-metis --download-parmetis > > >> > > >> Compilation is done but after the following command, I got an error: > > >> > > >> make PETSC_DIR=/home/slurm_local/e200781/petsc-3.10.4 PETSC_ARCH=arch-linux2-c-debug all > > >> > > >> *** Building slepc *** > > >> **************************ERROR************************************* > > >> Error building slepc. Check arch-linux2-c-debug/lib/petsc/conf/slepc.log > > >> > > >> We need slepc.log > > >> > > >> Thanks, > > >> > > >> Matt > > >> > > >> ******************************************************************** > > >> /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/petsc/conf/petscrules:46: recipe for target 'slepcbuild' failed > > >> make[1]: *** [slepcbuild] Error 1 > > >> make[1]: Leaving directory '/home/slurm_local/e200781/petsc-3.10.4' > > >> **************************ERROR************************************* > > >> Error during compile, check arch-linux2-c-debug/lib/petsc/conf/make.log > > >> Send it and arch-linux2-c-debug/lib/petsc/conf/configure.log to petsc-maint at mcs.anl.gov > > >> ******************************************************************** > > >> makefile:30: recipe for target 'all' failed > > >> make: *** [all] Error 1 > > >> > > >> How can I fix the problem? > > >> > > >> Thank you, > > >> > > >> Eda > > >> > > >> > > >> -- > > >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > >> -- Norbert Wiener > > >> > > >> https://www.cse.buffalo.edu/~knepley/ > > > > > > From finnkochinski at keemail.me Tue Mar 19 07:46:13 2019 From: finnkochinski at keemail.me (finnkochinski at keemail.me) Date: Tue, 19 Mar 2019 13:46:13 +0100 (CET) Subject: [petsc-users] [petsc4py] DMPlexCreateFromDAG and other missing functions In-Reply-To: References: <> Message-ID: Thanks Matt, I tried your suggestions to implement DMPlexCreateFromDAG in petsc4py: 1. In petsc4py/src/PETSc/petscdmplex.pxi I uncommented: int DMMPlexCreateFromDAG(PetscDM,PetscInt,const_PetscInt[],const_PetscInt[], const_PetscInt[],const_PetscInt[],const_PetscScalar[]) 2. 
In petsc4py/src/PETSc/DMPlex.pyx I added:

def createFromDAG(self, depth, numPoints, coneSize, cones, coneOrientations, vertexCoords):
    cdef PetscInt cdepth = asInt(depth)
    cdef const PetscInt *cnumPoints = NULL
    cdef const PetscInt *cconeSize = NULL
    cdef const PetscInt *ccones = NULL
    cdef const PetscInt *cconeOrientations = NULL
    cdef const PetscScalar *cvertexCoords = NULL
    cnumPoints = <const PetscInt *> PyArray_DATA(numPoints)
    cconeSize = <const PetscInt *> PyArray_DATA(coneSize)
    ccones = <const PetscInt *> PyArray_DATA(cones)
    cconeOrientations = <const PetscInt *> PyArray_DATA(coneOrientations)
    cvertexCoords = <const PetscScalar *> PyArray_DATA(vertexCoords)
    CHKERR( DMPlexCreateFromDAG(self.dm, cdepth, cnumPoints, cconeSize, ccones, cconeOrientations, cvertexCoords) )
    return self

I am testing this function using this snippet (following https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexCreateFromDAG.html ):

import petsc4py
import numpy as np
import sys
petsc4py.init(sys.argv)
from petsc4py import PETSc

dm=PETSc.DMPlex().create()
dm.setType(PETSc.DM.Type.PLEX)

numPoints=np.array([4,2])
coneSize=np.array([3,3,0,0,0,0])
cones=np.array([2,3,4, 3,5,4])
coneOrientations=np.array([0,0,0, 0,0,0])
vertexCoords=np.array([-1,0, 0,-1, 0,1, 1,0])
depth=1
dm.createFromDAG(depth,numPoints,coneSize,cones,coneOrientations,vertexCoords)

It fails with this output:

Traceback (most recent call last):
  File "test.py", line 16, in <module>
    dm.createFromDAG(depth,numPoints,coneSize,cones,coneOrientations,vertexCoords)
  File "PETSc/DMPlex.pyx", line 64, in petsc4py.PETSc.DMPlex.createFromDAG
petsc4py.PETSc.Error: error code 63
[0] DMPlexCreateFromDAG() line 1669 in /build/petsc-vurd6G/petsc-3.7.7+dfsg1/src/dm/impls/plex/plexcreate.c
[0] DMPlexSetCone() line 1066 in /build/petsc-vurd6G/petsc-3.7.7+dfsg1/src/dm/impls/plex/plex.c
[0] Argument out of range
[0] Cone point 4 is not in the valid range [0, 4)

Can someone spot the problem in my Python wrapping attempts?
I assume my test.py snippet should be fine. At least, the equivalent C snippet runs without problems:

#include <petscdmplex.h>
int main(int argc,char **argv){
  PetscInitialize(&argc, &argv, NULL, NULL);
  DM dm;
  int dim=2;
  DMPlexCreate(PETSC_COMM_WORLD,&dm);
  DMSetType(dm, DMPLEX);
  DMSetDimension(dm,dim);
  int depth=1;
  int numPoints[]={4,2};
  int coneSize[]={3,3,0,0,0,0};
  int cones[]={2, 3, 4,  3, 5, 4};
  int coneOrientations[]={0,0,0, 0,0,0};
  double vertexCoords[]={-1,0, 0,-1, 0,1, 1,0};
  DMPlexCreateFromDAG(dm, depth, numPoints, coneSize, cones, coneOrientations,vertexCoords);
  PetscFinalize();
}

regards
Chris

Mar 8, 2019, 6:38 PM by knepley at gmail.com:

> On Fri, Mar 8, 2019 at 11:02 AM Chris Finn via petsc-users <petsc-users at mcs.anl.gov> wrote:
>
>> Dear petsc4py experts,
>> I'd like to ask why several PETSc functions are not wrapped in petsc4py. I'd need to use DMPlexCreateFromDAG from python. Could you explain with this function as an example why there is no python wrapper available? Do I have to expect severe difficulties when I try this myself - impossible data structures, memory management or something else?
>>
>
> Lisandro is the expert, but I will try answering. The main problem is just time. There is no documentation for contributing, but what I do is copy a function that is pretty much like the one I want. So I think DMPlexCreateFromCellList() is wrapped, and it looks almost the same.
>
> Thanks,
>
> Matt
>
>> Then, if it was just lack of time that prevented these functions from being available in petsc4py but if it could be done easily:
>> Is the wrapping process of petsc4py documented somewhere? Or do I have to browse the sources to get an understanding? Do you use swig, clif, boost.python or something else?
>>
>> Is it possible to write another (small) python extension for the missing functions independent from petsc4py that allows me to pass PETSc structures back and forth between the two? Or is it necessary to have /one/ complete wrapper, because interoperability is not possible otherwise?
>>
>> regards
>> Chris
>>
>> --
>> Securely sent with Tutanota. Get your own encrypted, ad-free mailbox:
>> https://tutanota.com
>>
>
>
> --
> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From finnkochinski at keemail.me Tue Mar 19 07:54:11 2019
From: finnkochinski at keemail.me (finnkochinski at keemail.me)
Date: Tue, 19 Mar 2019 13:54:11 +0100 (CET)
Subject: [petsc-users] [petsc4py] DMPlexCreateFromDAG and other missing functions
In-Reply-To: References: <>
Message-ID:

I am sorry for the bad formatting of my createFromDAG function. I attach this function, and I hope it will be easier to read.

Mar 19, 2019, 12:46 PM by finnkochinski at keemail.me:

> Thanks Matt,
> I tried your suggestions to implement DMPlexCreateFromDAG in petsc4py:
>
> 1. In petsc4py/src/PETSc/petscdmplex.pxi I uncommented:
>
> int DMMPlexCreateFromDAG(PetscDM,PetscInt,const_PetscInt[],const_PetscInt[], const_PetscInt[],const_PetscInt[],const_PetscScalar[])
>
> 2. In petsc4py/src/PETSc/DMPlex.pyx I added:
>
> def createFromDAG(self, depth, numPoints, coneSize, cones, coneOrientations, vertexCoords):
>     cdef PetscInt cdepth = asInt(depth)
>     cdef const PetscInt *cnumPoints = NULL
>     cdef const PetscInt *cconeSize = NULL
>     cdef const PetscInt *ccones = NULL
>     cdef const PetscInt *cconeOrientations = NULL
>     cdef const PetscScalar *cvertexCoords = NULL
>     cnumPoints = <const PetscInt *> PyArray_DATA(numPoints)
>     cconeSize = <const PetscInt *> PyArray_DATA(coneSize)
>     ccones = <const PetscInt *> PyArray_DATA(cones)
>     cconeOrientations = <const PetscInt *> PyArray_DATA(coneOrientations)
>     cvertexCoords = <const PetscScalar *> PyArray_DATA(vertexCoords)
>     CHKERR( DMPlexCreateFromDAG(self.dm, cdepth, cnumPoints, cconeSize, ccones, cconeOrientations, cvertexCoords) )
>     return self
>
> I am testing this function using this snippet (following https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexCreateFromDAG.html ):
>
> import petsc4py
> import numpy as np
> import sys
> petsc4py.init(sys.argv)
> from petsc4py import PETSc
>
> dm=PETSc.DMPlex().create()
> dm.setType(PETSc.DM.Type.PLEX)
>
> numPoints=np.array([4,2])
> coneSize=np.array([3,3,0,0,0,0])
> cones=np.array([2,3,4, 3,5,4])
> coneOrientations=np.array([0,0,0, 0,0,0])
> vertexCoords=np.array([-1,0, 0,-1, 0,1, 1,0])
> depth=1
> dm.createFromDAG(depth,numPoints,coneSize,cones,coneOrientations,vertexCoords)
>
> It fails with this output:
>
> Traceback (most recent call last):
>   File "test.py", line 16, in <module>
>     dm.createFromDAG(depth,numPoints,coneSize,cones,coneOrientations,vertexCoords)
>   File "PETSc/DMPlex.pyx", line 64, in petsc4py.PETSc.DMPlex.createFromDAG
> petsc4py.PETSc.Error: error code 63
> [0] DMPlexCreateFromDAG() line 1669 in /build/petsc-vurd6G/petsc-3.7.7+dfsg1/src/dm/impls/plex/plexcreate.c
> [0] DMPlexSetCone() line 1066 in /build/petsc-vurd6G/petsc-3.7.7+dfsg1/src/dm/impls/plex/plex.c
> [0] Argument out of range
> [0] Cone point 4 is not in the valid range [0, 4)
>
> Can someone spot the problem in my Python wrapping attempts?
> I assume my test.py snippet should be fine. At least, the equivalent C snippet runs without problems:
>
> #include <petscdmplex.h>
> int main(int argc,char **argv){
>   PetscInitialize(&argc, &argv, NULL, NULL);
>   DM dm;
>   int dim=2;
>   DMPlexCreate(PETSC_COMM_WORLD,&dm);
>   DMSetType(dm, DMPLEX);
>   DMSetDimension(dm,dim);
>   int depth=1;
>   int numPoints[]={4,2};
>   int coneSize[]={3,3,0,0,0,0};
>   int cones[]={2, 3, 4,  3, 5, 4};
>   int coneOrientations[]={0,0,0, 0,0,0};
>   double vertexCoords[]={-1,0, 0,-1, 0,1, 1,0};
>   DMPlexCreateFromDAG(dm, depth, numPoints, coneSize, cones, coneOrientations,vertexCoords);
>   PetscFinalize();
> }
>
> regards
> Chris
>
> Mar 8, 2019, 6:38 PM by knepley at gmail.com:
>
>> On Fri, Mar 8, 2019 at 11:02 AM Chris Finn via petsc-users <petsc-users at mcs.anl.gov> wrote:
>>
>>> Dear petsc4py experts,
>>> I'd like to ask why several PETSc functions are not wrapped in petsc4py. I'd need to use DMPlexCreateFromDAG from python. Could you explain with this function as an example why there is no python wrapper available? Do I have to expect severe difficulties when I try this myself - impossible data structures, memory management or something else?
>>>
>>
>> Lisandro is the expert, but I will try answering. The main problem is just time. There is no documentation for contributing, but what I do is copy a function that is pretty much like the one I want. So I think DMPlexCreateFromCellList() is wrapped, and it looks almost the same.
>>
>> Thanks,
>>
>> Matt
>>
>>> Then, if it was just lack of time that prevented these functions from being available in petsc4py but if it could be done easily:
>>> Is the wrapping process of petsc4py documented somewhere? Or do I have to browse the sources to get an understanding? Do you use swig, clif, boost.python or something else?
>>>
>>> Is it possible to write another (small) python extension for the missing functions independent from petsc4py that allows me to pass PETSc structures back and forth between the two? Or is it necessary to have /one/ complete wrapper, because interoperability is not possible otherwise?
>>>
>>> regards
>>> Chris
>>>
>>> --
>>> Securely sent with Tutanota. Get your own encrypted, ad-free mailbox:
>>> https://tutanota.com
>>>
>>
>>
>> --
>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dmplexnew.pyx Type: application/octet-stream Size: 883 bytes Desc: not available URL: From mfadams at lbl.gov Tue Mar 19 08:10:42 2019 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 19 Mar 2019 09:10:42 -0400 Subject: [petsc-users] PCFieldSplit gives different results for direct and iterative solver In-Reply-To: References: <0a59ff504728c756b0965f201a97ab7c@cam.ac.uk> Message-ID: > > > > -fieldsplit_velocity_ksp_type preonly -fieldsplit_velocity_pc_type gamg > -fieldsplit_pressure_ksp_type minres -fieldsplit_pressure_pc_type none > You should use cg for the ksp_type with gamg if you are symmetric and gmres if not (you can try cg even if it is mildly asymmetric). minres is for indefinite symmetric, but you probably are SPD and should use cg. -------------- next part -------------- An HTML attachment was scrubbed... URL: From yyang85 at stanford.edu Tue Mar 19 10:58:46 2019 From: yyang85 at stanford.edu (Yuyun Yang) Date: Tue, 19 Mar 2019 15:58:46 +0000 Subject: [petsc-users] Saving Vecs/Mats in HDF5 and visualizing in Matlab Message-ID: Hello team, Currently we're using the PETSc binary file format to save Vecs and Mats and visualize them in Matlab. It looks like HDF5 works more efficiently with large data sets (faster I/O), and we're wondering if PETSc Vecs/Mats saved in HDF5 viewer can be visualized in Matlab as well? Thanks for your help, Yuyun -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Mar 19 12:09:48 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 19 Mar 2019 13:09:48 -0400 Subject: [petsc-users] Saving Vecs/Mats in HDF5 and visualizing in Matlab In-Reply-To: References: Message-ID: On Tue, Mar 19, 2019 at 11:58 AM Yuyun Yang via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello team, > > > > Currently we?re using the PETSc binary file format to save Vecs and Mats > and visualize them in Matlab. It looks like HDF5 works more efficiently > with large data sets (faster I/O), and we?re wondering if PETSc Vecs/Mats > saved in HDF5 viewer can be visualized in Matlab as well? > We do not have code for that. I am using Paraview to look at HDF5 since everything I do is on 2D and 3D meshes. Note that HDF5 is not likely to have faster I/O than the PETSc binary. Thanks, Matt > > > Thanks for your help, > > Yuyun > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yyang85 at stanford.edu Tue Mar 19 12:17:48 2019 From: yyang85 at stanford.edu (Yuyun Yang) Date: Tue, 19 Mar 2019 17:17:48 +0000 Subject: [petsc-users] Saving Vecs/Mats in HDF5 and visualizing in Matlab In-Reply-To: References: Message-ID: Got it, thanks! From: Matthew Knepley Sent: Tuesday, March 19, 2019 10:10 AM To: Yuyun Yang Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Saving Vecs/Mats in HDF5 and visualizing in Matlab On Tue, Mar 19, 2019 at 11:58 AM Yuyun Yang via petsc-users > wrote: Hello team, Currently we?re using the PETSc binary file format to save Vecs and Mats and visualize them in Matlab. It looks like HDF5 works more efficiently with large data sets (faster I/O), and we?re wondering if PETSc Vecs/Mats saved in HDF5 viewer can be visualized in Matlab as well? We do not have code for that. 
I am using Paraview to look at HDF5 since everything I do is on 2D and 3D meshes. Note that HDF5 is not likely to have faster I/O than the PETSc binary. Thanks, Matt Thanks for your help, Yuyun -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From zakaryah at gmail.com Tue Mar 19 13:54:02 2019 From: zakaryah at gmail.com (zakaryah) Date: Tue, 19 Mar 2019 14:54:02 -0400 Subject: [petsc-users] Saving Vecs/Mats in HDF5 and visualizing in Matlab In-Reply-To: References: Message-ID: Hi Yuyun, I'm not sure exactly what you want to do but I use Matlab to work with and visualize HDF5 files from PETSc all the time. Matlab has h5info and h5read routines, then I visualize with my own routines. Is there something specific you need from Matlab? On Tue, Mar 19, 2019 at 1:18 PM Yuyun Yang via petsc-users < petsc-users at mcs.anl.gov> wrote: > Got it, thanks! > > > > *From:* Matthew Knepley > *Sent:* Tuesday, March 19, 2019 10:10 AM > *To:* Yuyun Yang > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Saving Vecs/Mats in HDF5 and visualizing in > Matlab > > > > On Tue, Mar 19, 2019 at 11:58 AM Yuyun Yang via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hello team, > > > > Currently we?re using the PETSc binary file format to save Vecs and Mats > and visualize them in Matlab. It looks like HDF5 works more efficiently > with large data sets (faster I/O), and we?re wondering if PETSc Vecs/Mats > saved in HDF5 viewer can be visualized in Matlab as well? > > > > We do not have code for that. I am using Paraview to look at HDF5 since > everything I do is on 2D and 3D meshes. Note that HDF5 is not likely to > have faster I/O than the PETSc binary. > > > > Thanks, > > > > Matt > > > > > > Thanks for your help, > > Yuyun > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yyang85 at stanford.edu Tue Mar 19 14:04:45 2019 From: yyang85 at stanford.edu (Yuyun Yang) Date: Tue, 19 Mar 2019 19:04:45 +0000 Subject: [petsc-users] Saving Vecs/Mats in HDF5 and visualizing in Matlab In-Reply-To: References: , Message-ID: It's simply for visualization purposes. I wasn't sure if HDF5 would perform better than binary, and what specific functions are needed to load the PETSc vectors/matrices, so wanted to ask for some advice here. Since Matt mentioned it's not likely to be much faster than binary, I guess there is no need to make the change? So running h5read will load the vector from the hdf5 file directly to a Matlab vector? And similarly so for matrices? Thanks, Yuyun Get Outlook for iOS ________________________________ From: petsc-users on behalf of zakaryah via petsc-users Sent: Tuesday, March 19, 2019 11:54:02 AM To: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Saving Vecs/Mats in HDF5 and visualizing in Matlab Hi Yuyun, I'm not sure exactly what you want to do but I use Matlab to work with and visualize HDF5 files from PETSc all the time. Matlab has h5info and h5read routines, then I visualize with my own routines. Is there something specific you need from Matlab? 
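For what it's worth, a minimal sketch of the PETSc writing side is below. This is only an illustration and assumes a PETSc build configured with HDF5 support (e.g. --download-hdf5); the file name "sol.h5" and the object name "x" are just examples.

#include <petscvec.h>
#include <petscviewerhdf5.h>

int main(int argc, char **argv)
{
  Vec            x;
  PetscViewer    viewer;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  ierr = VecCreateMPI(PETSC_COMM_WORLD, PETSC_DECIDE, 100, &x);CHKERRQ(ierr);
  ierr = VecSet(x, 1.0);CHKERRQ(ierr);
  /* the object name becomes the HDF5 dataset name, i.e. "/x" in the file */
  ierr = PetscObjectSetName((PetscObject)x, "x");CHKERRQ(ierr);
  ierr = PetscViewerHDF5Open(PETSC_COMM_WORLD, "sol.h5", FILE_MODE_WRITE, &viewer);CHKERRQ(ierr);
  ierr = VecView(x, viewer);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

On the Matlab side, h5info('sol.h5') then lists the available datasets and h5read('sol.h5','/x') should return the values of the vector that was named "x" before VecView.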
On Tue, Mar 19, 2019 at 1:18 PM Yuyun Yang via petsc-users > wrote: Got it, thanks! From: Matthew Knepley > Sent: Tuesday, March 19, 2019 10:10 AM To: Yuyun Yang > Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Saving Vecs/Mats in HDF5 and visualizing in Matlab On Tue, Mar 19, 2019 at 11:58 AM Yuyun Yang via petsc-users > wrote: Hello team, Currently we?re using the PETSc binary file format to save Vecs and Mats and visualize them in Matlab. It looks like HDF5 works more efficiently with large data sets (faster I/O), and we?re wondering if PETSc Vecs/Mats saved in HDF5 viewer can be visualized in Matlab as well? We do not have code for that. I am using Paraview to look at HDF5 since everything I do is on 2D and 3D meshes. Note that HDF5 is not likely to have faster I/O than the PETSc binary. Thanks, Matt Thanks for your help, Yuyun -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From zakaryah at gmail.com Tue Mar 19 14:22:07 2019 From: zakaryah at gmail.com (zakaryah) Date: Tue, 19 Mar 2019 15:22:07 -0400 Subject: [petsc-users] Saving Vecs/Mats in HDF5 and visualizing in Matlab In-Reply-To: References: Message-ID: Regarding differences in speed, I defer to Matt - I know nothing about that. Matlab's h5read, passed the name of the vector, will read that vector from the HDF5 file and return a Matlab vector, yes. I am not sure about sparse matrices - I know HDF5 supports indexing, but I don't think PETSc supports MatView with sparse matrices and HDF5. It doesn't make much sense to me to do that anyway - just use binary. Dense matrices might work. On Tue, Mar 19, 2019 at 3:04 PM Yuyun Yang wrote: > It's simply for visualization purposes. I wasn't sure if HDF5 would > perform better than binary, and what specific functions are needed to load > the PETSc vectors/matrices, so wanted to ask for some advice here. Since > Matt mentioned it's not likely to be much faster than binary, I guess there > is no need to make the change? > > So running h5read will load the vector from the hdf5 file directly to a > Matlab vector? And similarly so for matrices? > > Thanks, > Yuyun > > Get Outlook for iOS > ------------------------------ > *From:* petsc-users on behalf of > zakaryah via petsc-users > *Sent:* Tuesday, March 19, 2019 11:54:02 AM > *To:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Saving Vecs/Mats in HDF5 and visualizing in > Matlab > > Hi Yuyun, > > I'm not sure exactly what you want to do but I use Matlab to work with and > visualize HDF5 files from PETSc all the time. Matlab has h5info and h5read > routines, then I visualize with my own routines. Is there something > specific you need from Matlab? > > On Tue, Mar 19, 2019 at 1:18 PM Yuyun Yang via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Got it, thanks! >> >> >> >> *From:* Matthew Knepley >> *Sent:* Tuesday, March 19, 2019 10:10 AM >> *To:* Yuyun Yang >> *Cc:* petsc-users at mcs.anl.gov >> *Subject:* Re: [petsc-users] Saving Vecs/Mats in HDF5 and visualizing in >> Matlab >> >> >> >> On Tue, Mar 19, 2019 at 11:58 AM Yuyun Yang via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >> Hello team, >> >> >> >> Currently we?re using the PETSc binary file format to save Vecs and Mats >> and visualize them in Matlab. 
It looks like HDF5 works more efficiently >> with large data sets (faster I/O), and we?re wondering if PETSc Vecs/Mats >> saved in HDF5 viewer can be visualized in Matlab as well? >> >> >> >> We do not have code for that. I am using Paraview to look at HDF5 since >> everything I do is on 2D and 3D meshes. Note that HDF5 is not likely to >> have faster I/O than the PETSc binary. >> >> >> >> Thanks, >> >> >> >> Matt >> >> >> >> >> >> Thanks for your help, >> >> Yuyun >> >> >> >> >> -- >> >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Mar 19 15:23:19 2019 From: jed at jedbrown.org (Jed Brown) Date: Tue, 19 Mar 2019 14:23:19 -0600 Subject: [petsc-users] Saving Vecs/Mats in HDF5 and visualizing in Matlab In-Reply-To: References: Message-ID: <87k1guvcjs.fsf@jedbrown.org> Yuyun Yang via petsc-users writes: > It's simply for visualization purposes. I wasn't sure if HDF5 would perform better than binary, and what specific functions are needed to load the PETSc vectors/matrices, so wanted to ask for some advice here. Since Matt mentioned it's not likely to be much faster than binary, I guess there is no need to make the change? There is no speed benefit. The advantage of HDF5 is that it supports better metadata, including the data types and sizes. The PETSc data format is quick, dirty, and fast. From mbuerkle at web.de Tue Mar 19 20:07:39 2019 From: mbuerkle at web.de (Marius Buerkle) Date: Wed, 20 Mar 2019 02:07:39 +0100 Subject: [petsc-users] MatCompositeMerge + MatCreateRedundantMatrix In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From yyang85 at stanford.edu Tue Mar 19 20:09:31 2019 From: yyang85 at stanford.edu (Yuyun Yang) Date: Wed, 20 Mar 2019 01:09:31 +0000 Subject: [petsc-users] Saving Vecs/Mats in HDF5 and visualizing in Matlab In-Reply-To: <87k1guvcjs.fsf@jedbrown.org> References: <87k1guvcjs.fsf@jedbrown.org> Message-ID: Sounds good, thanks for the advice! -----Original Message----- From: Jed Brown Sent: Tuesday, March 19, 2019 1:23 PM To: Yuyun Yang ; zakaryah ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Saving Vecs/Mats in HDF5 and visualizing in Matlab Yuyun Yang via petsc-users writes: > It's simply for visualization purposes. I wasn't sure if HDF5 would perform better than binary, and what specific functions are needed to load the PETSc vectors/matrices, so wanted to ask for some advice here. Since Matt mentioned it's not likely to be much faster than binary, I guess there is no need to make the change? There is no speed benefit. The advantage of HDF5 is that it supports better metadata, including the data types and sizes. The PETSc data format is quick, dirty, and fast. From knepley at gmail.com Tue Mar 19 20:11:03 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 19 Mar 2019 21:11:03 -0400 Subject: [petsc-users] MatCompositeMerge + MatCreateRedundantMatrix In-Reply-To: References: Message-ID: On Tue, Mar 19, 2019 at 9:07 PM Marius Buerkle wrote: > If it is not too complicated it would be nice if MPICreateSubMatricesMPI > would be avaible for other matrices than MPIAIJ. > Can you give us an idea why you would want such as thing? This usually helps people work on things. 
A good place to record such reasons is in an Issue: https://bitbucket.org/petsc/petsc/issues?status=new&status=open Thanks, Matt > > > On Mon, Mar 18, 2019 at 10:30 PM Marius Buerkle wrote: > >> I have another question >> regarding MatCreateRedundantMatrix and MPICreateSubMatricesMPI. The former >> works for MPIAIJ and MPIDENSE and the later only for MPIAIJ. Would it be >> possible to use MatCreateRedundantMatrix with a factored matrix >> > > We usually do not have direct access to the data for factored matrices > since it lives in the factorization package. > > >> and MPICreateSubMatricesMPI with dense and/or elemental matrices ? >> > > This would not be hard, it would just take time. The dense case is easy. I > think Elemental already has such an operation, but we would have to find it > and call it. > > Thanks, > > Matt > >> Indeed, was very easy to add. Are you going to include the Fortran >> interface for MPICreateSubMatricesMPI in future releases of PETSC ? >> >> Regarding my initial problem, thanks a lot. It works very well with >> MPICreateSubMatricesMPI and the solution can be implemented in a few >> lines. >> >> Thanks and Best, >> >> Marius >> >> >> On Tue, Mar 12, 2019 at 4:50 AM Marius Buerkle wrote: >> >>> I tried to follow your suggestions but it seems there is >>> no MPICreateSubMatricesMPI for Fortran. Is this correct? >>> >> >> We just have to write the binding. Its almost identical to >> MatCreateSubMatrices() in src/mat/interface/ftn-custom/zmatrixf.c >> >> Matt >> >>> >>> On Wed, Feb 20, 2019 at 6:57 PM Marius Buerkle wrote: >>> >>>> ok, I think I understand now. I will give it a try and if there is some >>>> trouble comeback to you. thanks. >>>> >>> >>> Cool. >>> >>> Matt >>> >>> >>>> >>>> marius >>>> >>>> On Tue, Feb 19, 2019 at 8:42 PM Marius Buerkle wrote: >>>> >>>>> ok, so it seems there is no straight forward way to transfer data >>>>> between PETSc matrices on different subcomms. Probably doing it by "hand" >>>>> extracting the matricies on the subcomms create a MPI_INTERCOMM transfering >>>>> the data to PETSC_COMM_WORLD and assembling them in a new PETSc matrix >>>>> would be possible, right? >>>>> >>>> >>>> That sounds too complicated. Why not just reverse >>>> MPICreateSubMatricesMPI()? Meaning make it collective on the whole big >>>> communicator, so that you can swap out all the subcommunicator for the >>>> aggregation call, just like we do in that function. >>>> Then its really just a matter of reversing the communication call. >>>> >>>> Matt >>>> >>>>> >>>>> On Tue, Feb 19, 2019 at 7:12 PM Marius Buerkle >>>>> wrote: >>>>> >>>>>> I see. This would work if the matrices are on different >>>>>> subcommumicators ? Is it possible to add this functionality ? >>>>>> >>>>> >>>>> Hmm, no. That is specialized to serial matrices. You need the inverse >>>>> of MatCreateSubMatricesMPI(). >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> marius >>>>>> >>>>>> >>>>>> You basically need the inverse of MatCreateSubmatrices(). I do not >>>>>> think we have that right now, but it could probably be done without too >>>>>> much trouble by looking at that code. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> On Tue, Feb 19, 2019 at 6:15 AM Marius Buerkle via petsc-users < >>>>>> petsc-users at mcs.anl.gov> wrote: >>>>>> >>>>>>> Hi ! >>>>>>> >>>>>>> Is there some way to combine MatCompositeMerge >>>>>>> with MatCreateRedundantMatrix? 
I basically want to create copies of a >>>>>>> matrix from PETSC_COMM_WORLD to subcommunicators, do some work on each >>>>>>> subcommunicator and than gather the results back to PETSC_COMM_WORLD, >>>>>>> namely I want to sum the the invidual matrices from the subcommunicatos >>>>>>> component wise and get the resulting matrix on PETSC_COMM_WORLD. Is this >>>>>>> somehow possible without going through all the hassle of using MPI >>>>>>> directly? >>>>>>> >>>>>>> marius >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From eda.oktay at metu.edu.tr Wed Mar 20 02:16:49 2019 From: eda.oktay at metu.edu.tr (Eda Oktay) Date: Wed, 20 Mar 2019 10:16:49 +0300 Subject: [petsc-users] SLEPc Build Error In-Reply-To: <7624A7BD-0E79-400F-92EF-08186E98E8A8@dsic.upv.es> References: <7624A7BD-0E79-400F-92EF-08186E98E8A8@dsic.upv.es> Message-ID: Dear Professor Roman, I decided to install petsc-3.10.3 with slepc since I did it once succesfully. Thanks for your help. Eda Jose E. Roman , 19 Mar 2019 Sal, 15:15 tarihinde ?unu yazd?: > What is the output of 'make check' in $PETSC_DIR ? > > > El 19 mar 2019, a las 13:02, Eda Oktay escribi?: > > > > Without slepc, I configured petsc succesfully. Then I installed slepc > with following steps: > > > > export PETSC_ARCH=arch-linux2-c-debug > > export PETSC_DIR=/home/slurm_local/e200781/petsc-3.10.4 > > export SLEPC_DIR=/home/slurm_local/e200781/slepc-3.10.2 > > cd slepc-3.10.2 > > ./configure > > > > However, I still get the error: > > > > Checking environment... done > > Checking PETSc installation... 
> > ERROR: Unable to link with PETSc > > ERROR: See "arch-linux2-c-debug/lib/slepc/conf/configure.log" file for > details > > > > Where the configure.log is: > > > > > ================================================================================ > > Starting Configure Run at Tue Mar 19 14:57:21 2019 > > Configure Options: > > Working directory: /home/slurm_local/e200781/slepc-3.10.2 > > Python version: > > 2.7.9 (default, Sep 25 2018, 20:42:16) > > [GCC 4.9.2] > > make: /usr/bin/make > > PETSc source directory: /home/slurm_local/e200781/petsc-3.10.4 > > PETSc install directory: > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug > > PETSc version: 3.10.4 > > PETSc architecture: arch-linux2-c-debug > > SLEPc source directory: /home/slurm_local/e200781/slepc-3.10.2 > > SLEPc version: 3.10.2 > > > ================================================================================ > > Checking PETSc installation... > > #include "petscsnes.h" > > int main() { > > Vec v; Mat m; KSP k; > > PetscInitializeNoArguments(); > > VecCreate(PETSC_COMM_WORLD,&v); > > MatCreate(PETSC_COMM_WORLD,&m); > > KSPCreate(PETSC_COMM_WORLD,&k); > > return 0; > > } > > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/bin/mpicc -o > checklink.o -c -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing > -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -g3 > -I/home/slurm_local/e200781/petsc-3.10.4/include > -I/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/include > `pwd`/checklink.c > > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/bin/mpicc > -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas > -fstack-protector -fvisibility=hidden -g3 -o checklink checklink.o > -Wl,-rpath,/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib > -L/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib > -Wl,-rpath,/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib > -L/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib > -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.9 > -L/usr/lib/gcc/x86_64-linux-gnu/4.9 -Wl,-rpath,/usr/lib/x86_64-linux-gnu > -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu > -L/lib/x86_64-linux-gnu -lpetsc -lopenblas -lparmetis -lmetis -lm -lX11 > -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi > -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lstdc++ -ldl > > /usr/bin/ld: warning: libmpi.so.0, needed by > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libopenblas.so, > may conflict with libmpi.so.40 > > /usr/bin/ld: warning: libmpi.so.0, needed by > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libopenblas.so, > may conflict with libmpi.so.40 > > > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libpetsc.so: > undefined reference to `MatPartitioningParmetisSetRepartition' > > > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libpetsc.so: > undefined reference to `MatPartitioningCreate_Parmetis' > > collect2: error: ld returned 1 exit status > > makefile:2: recipe for target 'checklink' failed > > make: *** [checklink] Error 1 > > > > ERROR: Unable to link with PETSc > > > > Jose E. Roman , 19 Mar 2019 Sal, 14:08 tarihinde > ?unu yazd?: > > There seems to be a link problem with PETSc. > > Suggest re-configuring without the option --download-slepc > > Then, after building PETSc, try 'make check' to make sure that PETSc is > built correctly. Then install SLEPc afterwards. 
> > Jose > > > > > > > El 19 mar 2019, a las 11:58, Eda Oktay > escribi?: > > > > > > > ================================================================================ > > > Starting Configure Run at Tue Mar 19 11:53:05 2019 > > > Configure Options: > --prefix=/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug > > > Working directory: > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/externalpackages/git.slepc > > > Python version: > > > 2.7.9 (default, Sep 25 2018, 20:42:16) > > > [GCC 4.9.2] > > > make: /usr/bin/make > > > PETSc source directory: /home/slurm_local/e200781/petsc-3.10.4 > > > PETSc install directory: > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug > > > PETSc version: 3.10.4 > > > PETSc architecture: arch-linux2-c-debug > > > SLEPc source directory: > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/externalpackages/git.slepc > > > SLEPc install directory: > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug > > > SLEPc version: 3.10.1 > > > > ================================================================================ > > > Checking PETSc installation... > > > #include "petscsnes.h" > > > int main() { > > > Vec v; Mat m; KSP k; > > > PetscInitializeNoArguments(); > > > VecCreate(PETSC_COMM_WORLD,&v); > > > MatCreate(PETSC_COMM_WORLD,&m); > > > KSPCreate(PETSC_COMM_WORLD,&k); > > > return 0; > > > } > > > make[2]: Entering directory '/tmp/slepc-2F1MtJ' > > > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/bin/mpicc > -o checklink.o -c -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing > -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -g3 > -I/home/slurm_local/e200781/petsc-3.10.4/include > -I/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/include > `pwd`/checklink.c > > > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/bin/mpicc > -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas > -fstack-protector -fvisibility=hidden -g3 -o checklink checklink.o > -Wl,-rpath,/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib > -L/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib > -Wl,-rpath,/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib > -L/home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib > -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.9 > -L/usr/lib/gcc/x86_64-linux-gnu/4.9 -Wl,-rpath,/usr/lib/x86_64-linux-gnu > -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu > -L/lib/x86_64-linux-gnu -lpetsc -lopenblas -lparmetis -lmetis -lm -lX11 > -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi > -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lstdc++ -ldl > > > /usr/bin/ld: warning: libmpi.so.0, needed by > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libopenblas.so, > may conflict with libmpi.so.40 > > > /usr/bin/ld: warning: libmpi.so.0, needed by > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libopenblas.so, > may conflict with libmpi.so.40 > > > > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libpetsc.so: > undefined reference to `MatPartitioningParmetisSetRepartition' > > > > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/libpetsc.so: > undefined reference to `MatPartitioningCreate_Parmetis' > > > collect2: error: ld returned 1 exit status > > > makefile:2: recipe for target 'checklink' failed > > > make[2]: *** [checklink] Error 1 > > > make[2]: Leaving directory '/tmp/slepc-2F1MtJ' > > > > > > ERROR: Unable to 
link with PETSc > > > > > > > > > Jose E. Roman , 19 Mar 2019 Sal, 13:49 tarihinde > ?unu yazd?: > > > Correction: the correct path is > $PETSC_DIR/$PETSC_ARCH/externalpackages/git.slepc/$PETSC_ARCH/lib/slepc/conf/configure.log > > > > > > > > > > > > > El 19 mar 2019, a las 11:46, Jose E. Roman via petsc-users < > petsc-users at mcs.anl.gov> escribi?: > > > > > > > > And what is in > $SLEPC_DIR/arch-linux2-c-debug/lib/slepc/conf/configure.log ? > > > > Jose > > > > > > > >> El 19 mar 2019, a las 11:41, Eda Oktay via petsc-users < > petsc-users at mcs.anl.gov> escribi?: > > > >> > > > >> This is slepc.log: > > > >> > > > >> Checking environment... done > > > >> Checking PETSc installation... > > > >> ERROR: Unable to link with PETSc > > > >> ERROR: See "arch-linux2-c-debug/lib/slepc/conf/configure.log" file > for details > > > >> > > > >> > > > >> > > > >> Matthew Knepley , 19 Mar 2019 Sal, 13:36 > tarihinde ?unu yazd?: > > > >> On Tue, Mar 19, 2019 at 6:31 AM Eda Oktay via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > >> Hello, > > > >> > > > >> I am trying to install PETSc with following configure options: > > > >> > > > >> ./configure --download-openmpi --download-openblas --download-slepc > --download-cmake --download-metis --download-parmetis > > > >> > > > >> Compilation is done but after the following command, I got an error: > > > >> > > > >> make PETSC_DIR=/home/slurm_local/e200781/petsc-3.10.4 > PETSC_ARCH=arch-linux2-c-debug all > > > >> > > > >> *** Building slepc *** > > > >> **************************ERROR************************************* > > > >> Error building slepc. Check > arch-linux2-c-debug/lib/petsc/conf/slepc.log > > > >> > > > >> We need slepc.log > > > >> > > > >> Thanks, > > > >> > > > >> Matt > > > >> > > > >> ******************************************************************** > > > >> > /home/slurm_local/e200781/petsc-3.10.4/arch-linux2-c-debug/lib/petsc/conf/petscrules:46: > recipe for target 'slepcbuild' failed > > > >> make[1]: *** [slepcbuild] Error 1 > > > >> make[1]: Leaving directory '/home/slurm_local/e200781/petsc-3.10.4' > > > >> **************************ERROR************************************* > > > >> Error during compile, check > arch-linux2-c-debug/lib/petsc/conf/make.log > > > >> Send it and arch-linux2-c-debug/lib/petsc/conf/configure.log to > petsc-maint at mcs.anl.gov > > > >> ******************************************************************** > > > >> makefile:30: recipe for target 'all' failed > > > >> make: *** [all] Error 1 > > > >> > > > >> How can I fix the problem? > > > >> > > > >> Thank you, > > > >> > > > >> Eda > > > >> > > > >> > > > >> -- > > > >> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > > >> -- Norbert Wiener > > > >> > > > >> https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eda.oktay at metu.edu.tr Wed Mar 20 02:25:52 2019 From: eda.oktay at metu.edu.tr (Eda Oktay) Date: Wed, 20 Mar 2019 10:25:52 +0300 Subject: [petsc-users] petscmat.h: No such file or directory Message-ID: Hello, I am trying to compile a parallel program DENEME_TEMIZ_ENYENI_FINAL.c in PETSc. 
I wrote the following makefile but it says that there is nothing to do with the program: export CLINKER = gcc DENEME_TEMIZ_ENYENI_FINAL: DENEME_TEMIZ_ENYENI_FINAL.o chkopts -${CLINKER} -o DENEME_TEMIZ_ENYENI_FINAL DENEME_TEMIZ_ENYENI_FINAL.o ${SLEPC_SYS_LIB} ${RM} DENEME_TEMIZ_ENYENI_FINAL.o include ${SLEPC_DIR}/lib/slepc/conf/slepc_common Then I tried to compile it via mpicc DENEME_TEMIZ_ENYENI_FINAL.c but I got the error "petscmat.h: No such file or directory". I wrote the following lines before compiling the program: export PETSC_DIR=/home/slurm_local/e200781/petsc-3.10.3 export SLEPC_DIR=/home/slurm_local/e200781/petsc-3.10.3/arch-linux2-c-debug export PETSC_ARCH=arch-linux2-c-debug Why I cannot compile the program? Thanks, Eda -------------- next part -------------- An HTML attachment was scrubbed... URL: From eda.oktay at metu.edu.tr Wed Mar 20 02:52:21 2019 From: eda.oktay at metu.edu.tr (Eda Oktay) Date: Wed, 20 Mar 2019 10:52:21 +0300 Subject: [petsc-users] MatGetRow_MPIAIJ error Message-ID: Hello, I wrote a code computing element wise absolute value of a matrix. When I run the code sequentially, it works. However, when I try to use it in parallel with the same matrix, I get the following error: [1]PETSC ERROR: Argument out of range [1]PETSC ERROR: Only local rows The absolute value code is: ierr = MatGetSize(A,&n,NULL);CHKERRQ(ierr); ierr = MatDuplicate(A,MAT_COPY_VALUES,AbsA);CHKERRQ(ierr); for (i=0; i From jroman at dsic.upv.es Wed Mar 20 04:39:48 2019 From: jroman at dsic.upv.es (Jose E. Roman) Date: Wed, 20 Mar 2019 10:39:48 +0100 Subject: [petsc-users] petscmat.h: No such file or directory In-Reply-To: References: Message-ID: <75CF7A6A-340F-48A1-A07F-BF9E2C8B45FA@dsic.upv.es> You must compile your program with 'make DENEME_TEMIZ_ENYENI_FINAL' or just 'make', not with 'mpicc DENEME_TEMIZ_ENYENI_FINAL.c' > El 20 mar 2019, a las 8:25, Eda Oktay via petsc-users escribi?: > > Hello, > > I am trying to compile a parallel program DENEME_TEMIZ_ENYENI_FINAL.c in PETSc. I wrote the following makefile but it says that there is nothing to do with the program: > > export CLINKER = gcc > > DENEME_TEMIZ_ENYENI_FINAL: DENEME_TEMIZ_ENYENI_FINAL.o chkopts > -${CLINKER} -o DENEME_TEMIZ_ENYENI_FINAL DENEME_TEMIZ_ENYENI_FINAL.o ${SLEPC_SYS_LIB} > ${RM} DENEME_TEMIZ_ENYENI_FINAL.o > include ${SLEPC_DIR}/lib/slepc/conf/slepc_common > > Then I tried to compile it via mpicc DENEME_TEMIZ_ENYENI_FINAL.c > but I got the error "petscmat.h: No such file or directory". > > I wrote the following lines before compiling the program: > export PETSC_DIR=/home/slurm_local/e200781/petsc-3.10.3 > export SLEPC_DIR=/home/slurm_local/e200781/petsc-3.10.3/arch-linux2-c-debug > export PETSC_ARCH=arch-linux2-c-debug > > Why I cannot compile the program? > > Thanks, > > Eda > From jroman at dsic.upv.es Wed Mar 20 04:45:13 2019 From: jroman at dsic.upv.es (Jose E. Roman) Date: Wed, 20 Mar 2019 10:45:13 +0100 Subject: [petsc-users] MatGetRow_MPIAIJ error In-Reply-To: References: Message-ID: <24BB00B8-B4DE-44F0-8A4D-CC7F46BB7BCE@dsic.upv.es> Mat objects in PETSc are parallel, meaning that the data structure is distributed. You should use MatGetOwnershipRange() so that each process accesses its local rows only. https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatGetOwnershipRange.html This is very basic usage. You should read the manual and the online documentation including examples, and try hard to solve these issues before asking to the list. 
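As an illustration, a sketch of that row loop restricted to the locally owned rows could look like the helper below. The function name MatAbsLocalRows is made up for this example; it assumes A is a square MPIAIJ matrix and that *AbsA was already created with MatDuplicate(A,MAT_COPY_VALUES,AbsA), as in the posted code.

#include <petscmat.h>

/* Hypothetical helper: overwrite the entries of *AbsA (same nonzero pattern as A)
   with the entrywise absolute values of A, touching only the locally owned rows. */
static PetscErrorCode MatAbsLocalRows(Mat A, Mat *AbsA)
{
  PetscErrorCode    ierr;
  PetscInt          rstart, rend, i, j, nc;
  const PetscInt    *aj;
  const PetscScalar *aa;
  PetscScalar       *absaa;

  PetscFunctionBeginUser;
  ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);  /* local rows, not 0..n */
  for (i = rstart; i < rend; i++) {
    ierr = MatGetRow(A, i, &nc, &aj, &aa);CHKERRQ(ierr);
    ierr = PetscMalloc1(nc, &absaa);CHKERRQ(ierr);
    for (j = 0; j < nc; j++) absaa[j] = PetscAbsScalar(aa[j]);
    ierr = MatSetValues(*AbsA, 1, &i, nc, aj, absaa, INSERT_VALUES);CHKERRQ(ierr);
    ierr = MatRestoreRow(A, i, &nc, &aj, &aa);CHKERRQ(ierr);
    ierr = PetscFree(absaa);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(*AbsA, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(*AbsA, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}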
Jose > El 20 mar 2019, a las 8:52, Eda Oktay via petsc-users escribió: > > Hello, > > I wrote a code computing element wise absolute value of a matrix. When I run the code sequentially, it works. However, when I try to use it in parallel with the same matrix, I get the following error: > > [1]PETSC ERROR: Argument out of range > [1]PETSC ERROR: Only local rows > > The absolute value code is: > > ierr = MatGetSize(A,&n,NULL);CHKERRQ(ierr); > ierr = MatDuplicate(A,MAT_COPY_VALUES,AbsA);CHKERRQ(ierr); > > for (i=0; i<n; i++) { > ierr = MatGetRow(A,i,&nc,&aj,&aa);CHKERRQ(ierr); > > PetscMalloc1(nc,&absaa); > for (j=0; j<nc; j++) { > absaa[j] = fabs(aa[j]); > } > > ierr = MatSetValues(*AbsA,1,&i,nc,aj,absaa,INSERT_VALUES);CHKERRQ(ierr); > ierr = MatRestoreRow(A,i,&nc,&aj,&aa);CHKERRQ(ierr); > } > > I didn't understand how I fix this problem since I am new to PETSc. > > Thanks > > Eda From eda.oktay at metu.edu.tr Wed Mar 20 05:15:23 2019 From: eda.oktay at metu.edu.tr (Eda Oktay) Date: Wed, 20 Mar 2019 13:15:23 +0300 Subject: Re: [petsc-users] petscmat.h: No such file or directory In-Reply-To: References: Message-ID: Before using mpicc, I just tried to compile with make DENEME_ENYENI-FINAL.c but it says there is nothing to do. On Wed, Mar 20, 2019, 12:39 PM Jose E.
Roman wrote: > You must compile your program with 'make DENEME_TEMIZ_ENYENI_FINAL' or just 'make', not with 'mpicc DENEME_TEMIZ_ENYENI_FINAL.c' > > > El 20 mar 2019, a las 8:25, Eda Oktay via petsc-users escribi?: > > > > Hello, > > > > I am trying to compile a parallel program DENEME_TEMIZ_ENYENI_FINAL.c in PETSc. I wrote the following makefile but it says that there is nothing to do with the program: > > > > export CLINKER = gcc > > > > DENEME_TEMIZ_ENYENI_FINAL: DENEME_TEMIZ_ENYENI_FINAL.o chkopts > > -${CLINKER} -o DENEME_TEMIZ_ENYENI_FINAL DENEME_TEMIZ_ENYENI_FINAL.o ${SLEPC_SYS_LIB} > > ${RM} DENEME_TEMIZ_ENYENI_FINAL.o > > include ${SLEPC_DIR}/lib/slepc/conf/slepc_common > > > > Then I tried to compile it via mpicc DENEME_TEMIZ_ENYENI_FINAL.c > > but I got the error "petscmat.h: No such file or directory". > > > > I wrote the following lines before compiling the program: > > export PETSC_DIR=/home/slurm_local/e200781/petsc-3.10.3 > > export SLEPC_DIR=/home/slurm_local/e200781/petsc-3.10.3/arch-linux2-c-debug > > export PETSC_ARCH=arch-linux2-c-debug > > > > Why I cannot compile the program? > > > > Thanks, > > > > Eda > > > From yjwu16 at gmail.com Wed Mar 20 05:52:44 2019 From: yjwu16 at gmail.com (Yingjie Wu) Date: Wed, 20 Mar 2019 18:52:44 +0800 Subject: [petsc-users] Problems about GMRES restart and Scaling Message-ID: Dear PETSc developers: Hi, Recently, I used PETSc to solve a non-linear PDEs for thermodynamic problems. In the process of solving, I found the following two phenomena, hoping to get some help and suggestions. 1. Because my problem involves a lot of physical parameters, it needs to call a series of functions, and can not analytically construct Jacobian matrix, so I use - snes_mf_operator to solve it, and give an approximate Jacobian matrix as a preconditioner. Because of the large dimension of the problem and the magnitude difference of the physical variables involved, it is found that the linear step residuals will increase at each restart (default 30th linear step) . This problem can be solved by setting a large number of restart steps. I would like to ask the reasons for this phenomenon? What knowledge or articles should I learn if I want to find out this problem? 2. In my problem model, there are many physical fields (variables are realized by finite difference method), and the magnitude of variables varies greatly. Is there any Scaling interface or function in Petsc? Thanks, Yingjie -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Mar 20 07:00:06 2019 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 20 Mar 2019 08:00:06 -0400 Subject: [petsc-users] Problems about GMRES restart and Scaling In-Reply-To: References: Message-ID: On Wed, Mar 20, 2019 at 6:53 AM Yingjie Wu via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear PETSc developers: > Hi, > Recently, I used PETSc to solve a non-linear PDEs for thermodynamic > problems. In the process of solving, I found the following two phenomena, > hoping to get some help and suggestions. > > 1. Because my problem involves a lot of physical parameters, it needs to > call a series of functions, and can not analytically construct Jacobian > matrix, so I use - snes_mf_operator to solve it, and give an approximate > Jacobian matrix as a preconditioner. 
Because of the large dimension of the > problem and the magnitude difference of the physical variables involved, it > is found that the linear step residuals will increase at each restart > (default 30th linear step) . This problem can be solved by setting a large > number of restart steps. I would like to ask the reasons for this > phenomenon? What knowledge or articles should I learn if I want to find out > this problem? > Make sure you non-dimensionalize the problem first, so that any scale differences are real and not the result of units. > 2. In my problem model, there are many physical fields (variables are > realized by finite difference method), and the magnitude of variables > varies greatly. Is there any Scaling interface or function in Petsc? > That is what Jacobi does. Thanks, Matt > Thanks, > Yingjie > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yjwu16 at gmail.com Wed Mar 20 07:29:15 2019 From: yjwu16 at gmail.com (Yingjie Wu) Date: Wed, 20 Mar 2019 20:29:15 +0800 Subject: [petsc-users] Problems about GMRES restart and Scaling In-Reply-To: References: Message-ID: Thank you very much for your reply. I think my statement may not be very clear. I want to know why the linear residual increases at gmres restart. I think I should have no problem with the residual evaluation function, because after setting a large gmres restart, the results are also in line with expectations. Thanks, Yingjie Matthew Knepley ?2019?3?20??? ??8:00??? > On Wed, Mar 20, 2019 at 6:53 AM Yingjie Wu via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Dear PETSc developers: >> Hi, >> Recently, I used PETSc to solve a non-linear PDEs for thermodynamic >> problems. In the process of solving, I found the following two phenomena, >> hoping to get some help and suggestions. >> >> 1. Because my problem involves a lot of physical parameters, it needs to >> call a series of functions, and can not analytically construct Jacobian >> matrix, so I use - snes_mf_operator to solve it, and give an approximate >> Jacobian matrix as a preconditioner. Because of the large dimension of the >> problem and the magnitude difference of the physical variables involved, it >> is found that the linear step residuals will increase at each restart >> (default 30th linear step) . This problem can be solved by setting a large >> number of restart steps. I would like to ask the reasons for this >> phenomenon? What knowledge or articles should I learn if I want to find out >> this problem? >> > > Make sure you non-dimensionalize the problem first, so that any scale > differences are real and not the result of units. > > >> 2. In my problem model, there are many physical fields (variables are >> realized by finite difference method), and the magnitude of variables >> varies greatly. Is there any Scaling interface or function in Petsc? >> > > That is what Jacobi does. > > Thanks, > > Matt > > >> Thanks, >> Yingjie >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From myriam.peyrounette at idris.fr Wed Mar 20 07:48:38 2019 From: myriam.peyrounette at idris.fr (Myriam Peyrounette) Date: Wed, 20 Mar 2019 13:48:38 +0100 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: References: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> <7b104336-a067-e679-23ec-2a89e0ba9bc4@idris.fr> Message-ID: <8925b24f-62dd-1e45-5658-968491e51205@idris.fr> Hi all, I used git bisect to determine when the memory need increased. I found that the first "bad" commit is ? aa690a28a7284adb519c28cb44eae20a2c131c85. Barry was right, this commit seems to be about an evolution of MatPtAPSymbolic_MPIAIJ_MPIAIJ. You mentioned the option "-matptap_via scalable" but I can't find any information about it. Can you tell me more? Thanks Myriam Le 03/11/19 ? 14:40, Mark Adams a ?crit?: > Is there a difference in memory usage on your tiny problem? I assume no. > > I don't see anything that could come from GAMG other than the RAP > stuff that you have discussed already. > > On Mon, Mar 11, 2019 at 9:32 AM Myriam Peyrounette > > wrote: > > The code I am using here is the example 42 of PETSc > (https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html). > Indeed it solves the Stokes equation. I thought it was a good idea > to use an example you might know (and didn't find any that uses > GAMG functions). I just changed the PCMG setup so that the memory > problem appears. And it appears when adding PCGAMG. > > I don't care about the performance or even the result rightness > here, but only about the difference in memory use between 3.6 and > 3.10. Do you think finding a more adapted script would help? > > I used the threshold of 0.1 only once, at the beginning, to test > its influence. I used the default threshold (of 0, I guess) for > all the other runs. > > Myriam > > > Le 03/11/19 ? 13:52, Mark Adams a ?crit?: >> In looking at this larger scale run ... >> >> * Your eigen estimates are much lower than your tiny test >> problem.? But this is Stokes apparently and it should not work >> anyway. Maybe you have a small time step that adds a lot of mass >> that brings the eigen estimates down. And your min eigenvalue >> (not used) is positive. I would expect negative for Stokes ... >> >> * You seem to be setting a threshold value of 0.1 -- that is very >> high >> >> * v3.6 says "using nonzero initial guess" but this is not in >> v3.10. Maybe we just stopped printing that. >> >> * There were some changes to coasening parameters in going from >> v3.6 but it does not look like your problem was effected. (The >> coarsening algo is non-deterministic by default and you can see >> small difference on different runs) >> >> * We may have also added a "noisy" RHS for eigen estimates by >> default from v3.6. >> >> * And for non-symetric problems you can try -pc_gamg_agg_nsmooths >> 0, but again GAMG is not built for Stokes anyway. >> >> >> On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette >> > > wrote: >> >> I used PCView to display the size of the linear system in >> each level of the MG. You'll find the outputs attached to >> this mail (zip file) for both the default threshold value and >> a value of 0.1, and for both 3.6 and 3.10 PETSc versions. >> >> For convenience, I summarized the information in a graph, >> also attached (png file). >> >> As you can see, there are slight differences between the two >> versions but none is critical, in my opinion. Do you see >> anything suspicious in the outputs? >> >> + I can't find the default threshold value. 
Do you know where >> I can find it? >> >> Thanks for the follow-up >> >> Myriam >> >> >> Le 03/05/19 ? 14:06, Matthew Knepley a ?crit?: >>> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette >>> >> > wrote: >>> >>> Hi Matt, >>> >>> I plotted the memory scalings using different threshold >>> values. The two scalings are slightly translated (from >>> -22 to -88 mB) but this gain is neglectable. The >>> 3.6-scaling keeps being robust while the 3.10-scaling >>> deteriorates. >>> >>> Do you have any other suggestion? >>> >>> Mark, what is the option she can give to output all the GAMG >>> data? >>> >>> Also, run using -ksp_view. GAMG will report all the sizes of >>> its grids, so it should be easy to see >>> if the coarse grid sizes are increasing, and also what the >>> effect of the threshold value is. >>> >>> ? Thanks, >>> >>> ? ? ?Matt? >>> >>> Thanks >>> >>> Myriam >>> >>> Le 03/02/19 ? 02:27, Matthew Knepley a ?crit?: >>>> On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via >>>> petsc-users >>> > wrote: >>>> >>>> Hi, >>>> >>>> I used to run my code with PETSc 3.6. Since I >>>> upgraded the PETSc version >>>> to 3.10, this code has a bad memory scaling. >>>> >>>> To report this issue, I took the PETSc script >>>> ex42.c and slightly >>>> modified it so that the KSP and PC configurations >>>> are the same as in my >>>> code. In particular, I use a "personnalised" >>>> multi-grid method. The >>>> modifications are indicated by the keyword >>>> "TopBridge" in the attached >>>> scripts. >>>> >>>> To plot the memory (weak) scaling, I ran four >>>> calculations for each >>>> script with increasing problem sizes and >>>> computations cores: >>>> >>>> 1. 100,000 elts on 4 cores >>>> 2. 1 million elts on 40 cores >>>> 3. 10 millions elts on 400 cores >>>> 4. 100 millions elts on 4,000 cores >>>> >>>> The resulting graph is also attached. The scaling >>>> using PETSc 3.10 >>>> clearly deteriorates for large cases, while the one >>>> using PETSc 3.6 is >>>> robust. >>>> >>>> After a few tests, I found that the scaling is >>>> mostly sensitive to the >>>> use of the AMG method for the coarse grid (line 1780 in >>>> main_ex42_petsc36.cc). In particular, the >>>> performance strongly >>>> deteriorates when commenting lines 1777 to 1790 (in >>>> main_ex42_petsc36.cc). >>>> >>>> Do you have any idea of what changed between >>>> version 3.6 and version >>>> 3.10 that may imply such degradation? >>>> >>>> >>>> I believe the default values for PCGAMG changed between >>>> versions. It sounds like the coarsening rate >>>> is not great enough, so that these grids are too large. >>>> This can be set using: >>>> >>>> ??https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html >>>> >>>> There is some explanation of this effect on that page. >>>> Let us know if setting this does not correct the situation. >>>> >>>> ? Thanks, >>>> >>>> ? ? ?Matt >>>> ? >>>> >>>> Let me know if you need further information. >>>> >>>> Best, >>>> >>>> Myriam Peyrounette >>>> >>>> >>>> -- >>>> Myriam Peyrounette >>>> CNRS/IDRIS - HLST >>>> -- >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they >>>> begin their experiments is infinitely more interesting >>>> than any results to which their experiments lead. 
>>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>> >>> -- >>> Myriam Peyrounette >>> CNRS/IDRIS - HLST >>> -- >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin >>> their experiments is infinitely more interesting than any >>> results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >> >> -- >> Myriam Peyrounette >> CNRS/IDRIS - HLST >> -- >> > > -- > Myriam Peyrounette > CNRS/IDRIS - HLST > -- > -- Myriam Peyrounette CNRS/IDRIS - HLST -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2975 bytes Desc: Signature cryptographique S/MIME URL: From mfadams at lbl.gov Wed Mar 20 09:44:11 2019 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 20 Mar 2019 10:44:11 -0400 Subject: [petsc-users] Problems about GMRES restart and Scaling In-Reply-To: References: Message-ID: On Wed, Mar 20, 2019 at 8:30 AM Yingjie Wu via petsc-users < petsc-users at mcs.anl.gov> wrote: > Thank you very much for your reply. > I think my statement may not be very clear. I want to know why the linear > residual increases at gmres restart. > GMRES combines the functions in the Krylov subspace that minimize the residual, by design (and this is provable). With restart you are not doing pure GMRES and the proof is gone. So if you have restart of 50 and do 51 iterations then the residual can not be lower than if you did 51 iterations of GMRES. I don't think you can prove anything about what the residual will do after a restart other than it will not go down more than it would without restart. > I think I should have no problem with the residual evaluation function, > because after setting a large gmres restart, the results are also in line > with expectations. > Thanks, > Yingjie > > Matthew Knepley ?2019?3?20??? ??8:00??? > >> On Wed, Mar 20, 2019 at 6:53 AM Yingjie Wu via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> Dear PETSc developers: >>> Hi, >>> Recently, I used PETSc to solve a non-linear PDEs for thermodynamic >>> problems. In the process of solving, I found the following two phenomena, >>> hoping to get some help and suggestions. >>> >>> 1. Because my problem involves a lot of physical parameters, it needs to >>> call a series of functions, and can not analytically construct Jacobian >>> matrix, so I use - snes_mf_operator to solve it, and give an approximate >>> Jacobian matrix as a preconditioner. Because of the large dimension of the >>> problem and the magnitude difference of the physical variables involved, it >>> is found that the linear step residuals will increase at each restart >>> (default 30th linear step) . This problem can be solved by setting a large >>> number of restart steps. I would like to ask the reasons for this >>> phenomenon? What knowledge or articles should I learn if I want to find out >>> this problem? >>> >> >> Make sure you non-dimensionalize the problem first, so that any scale >> differences are real and not the result of units. >> >> >>> 2. In my problem model, there are many physical fields (variables are >>> realized by finite difference method), and the magnitude of variables >>> varies greatly. Is there any Scaling interface or function in Petsc? >>> >> >> That is what Jacobi does. 
>> >> Thanks, >> >> Matt >> >> >>> Thanks, >>> Yingjie >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From myriam.peyrounette at idris.fr Wed Mar 20 11:05:50 2019 From: myriam.peyrounette at idris.fr (Myriam Peyrounette) Date: Wed, 20 Mar 2019 17:05:50 +0100 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: <8925b24f-62dd-1e45-5658-968491e51205@idris.fr> References: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> <7b104336-a067-e679-23ec-2a89e0ba9bc4@idris.fr> <8925b24f-62dd-1e45-5658-968491e51205@idris.fr> Message-ID: <00d471e0-ed2a-51fa-3031-a6b63c3a96e1@idris.fr> More precisely: something happens when upgrading the functions MatPtAPNumeric_MPIAIJ_MPIAIJ and/or MatPtAPSymbolic_MPIAIJ_MPIAIJ. Unfortunately, there are a lot of differences between the old and new versions of these functions. I keep investigating but if you have any idea, please let me know. Best, Myriam Le 03/20/19 ? 13:48, Myriam Peyrounette a ?crit?: > > Hi all, > > I used git bisect to determine when the memory need increased. I found > that the first "bad" commit is ? aa690a28a7284adb519c28cb44eae20a2c131c85. > > Barry was right, this commit seems to be about an evolution of > MatPtAPSymbolic_MPIAIJ_MPIAIJ. You mentioned the option "-matptap_via > scalable" but I can't find any information about it. Can you tell me more? > > Thanks > > Myriam > > > Le 03/11/19 ? 14:40, Mark Adams a ?crit?: >> Is there a difference in memory usage on your tiny problem? I assume no. >> >> I don't see anything that could come from GAMG other than the RAP >> stuff that you have discussed already. >> >> On Mon, Mar 11, 2019 at 9:32 AM Myriam Peyrounette >> > wrote: >> >> The code I am using here is the example 42 of PETSc >> (https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html). >> Indeed it solves the Stokes equation. I thought it was a good >> idea to use an example you might know (and didn't find any that >> uses GAMG functions). I just changed the PCMG setup so that the >> memory problem appears. And it appears when adding PCGAMG. >> >> I don't care about the performance or even the result rightness >> here, but only about the difference in memory use between 3.6 and >> 3.10. Do you think finding a more adapted script would help? >> >> I used the threshold of 0.1 only once, at the beginning, to test >> its influence. I used the default threshold (of 0, I guess) for >> all the other runs. >> >> Myriam >> >> >> Le 03/11/19 ? 13:52, Mark Adams a ?crit?: >>> In looking at this larger scale run ... >>> >>> * Your eigen estimates are much lower than your tiny test >>> problem.? But this is Stokes apparently and it should not work >>> anyway. Maybe you have a small time step that adds a lot of mass >>> that brings the eigen estimates down. And your min eigenvalue >>> (not used) is positive. I would expect negative for Stokes ... >>> >>> * You seem to be setting a threshold value of 0.1 -- that is >>> very high >>> >>> * v3.6 says "using nonzero initial guess" but this is not in >>> v3.10. Maybe we just stopped printing that. >>> >>> * There were some changes to coasening parameters in going from >>> v3.6 but it does not look like your problem was effected. 
(The >>> coarsening algo is non-deterministic by default and you can see >>> small difference on different runs) >>> >>> * We may have also added a "noisy" RHS for eigen estimates by >>> default from v3.6. >>> >>> * And for non-symetric problems you can try >>> -pc_gamg_agg_nsmooths 0, but again GAMG is not built for Stokes >>> anyway. >>> >>> >>> On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette >>> >> > wrote: >>> >>> I used PCView to display the size of the linear system in >>> each level of the MG. You'll find the outputs attached to >>> this mail (zip file) for both the default threshold value >>> and a value of 0.1, and for both 3.6 and 3.10 PETSc versions. >>> >>> For convenience, I summarized the information in a graph, >>> also attached (png file). >>> >>> As you can see, there are slight differences between the two >>> versions but none is critical, in my opinion. Do you see >>> anything suspicious in the outputs? >>> >>> + I can't find the default threshold value. Do you know >>> where I can find it? >>> >>> Thanks for the follow-up >>> >>> Myriam >>> >>> >>> Le 03/05/19 ? 14:06, Matthew Knepley a ?crit?: >>>> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette >>>> >>> > wrote: >>>> >>>> Hi Matt, >>>> >>>> I plotted the memory scalings using different threshold >>>> values. The two scalings are slightly translated (from >>>> -22 to -88 mB) but this gain is neglectable. The >>>> 3.6-scaling keeps being robust while the 3.10-scaling >>>> deteriorates. >>>> >>>> Do you have any other suggestion? >>>> >>>> Mark, what is the option she can give to output all the >>>> GAMG data? >>>> >>>> Also, run using -ksp_view. GAMG will report all the sizes >>>> of its grids, so it should be easy to see >>>> if the coarse grid sizes are increasing, and also what the >>>> effect of the threshold value is. >>>> >>>> ? Thanks, >>>> >>>> ? ? ?Matt? >>>> >>>> Thanks >>>> >>>> Myriam >>>> >>>> Le 03/02/19 ? 02:27, Matthew Knepley a ?crit?: >>>>> On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via >>>>> petsc-users >>>> > wrote: >>>>> >>>>> Hi, >>>>> >>>>> I used to run my code with PETSc 3.6. Since I >>>>> upgraded the PETSc version >>>>> to 3.10, this code has a bad memory scaling. >>>>> >>>>> To report this issue, I took the PETSc script >>>>> ex42.c and slightly >>>>> modified it so that the KSP and PC configurations >>>>> are the same as in my >>>>> code. In particular, I use a "personnalised" >>>>> multi-grid method. The >>>>> modifications are indicated by the keyword >>>>> "TopBridge" in the attached >>>>> scripts. >>>>> >>>>> To plot the memory (weak) scaling, I ran four >>>>> calculations for each >>>>> script with increasing problem sizes and >>>>> computations cores: >>>>> >>>>> 1. 100,000 elts on 4 cores >>>>> 2. 1 million elts on 40 cores >>>>> 3. 10 millions elts on 400 cores >>>>> 4. 100 millions elts on 4,000 cores >>>>> >>>>> The resulting graph is also attached. The scaling >>>>> using PETSc 3.10 >>>>> clearly deteriorates for large cases, while the >>>>> one using PETSc 3.6 is >>>>> robust. >>>>> >>>>> After a few tests, I found that the scaling is >>>>> mostly sensitive to the >>>>> use of the AMG method for the coarse grid (line >>>>> 1780 in >>>>> main_ex42_petsc36.cc). In particular, the >>>>> performance strongly >>>>> deteriorates when commenting lines 1777 to 1790 >>>>> (in main_ex42_petsc36.cc). >>>>> >>>>> Do you have any idea of what changed between >>>>> version 3.6 and version >>>>> 3.10 that may imply such degradation? 
>>>>> >>>>> >>>>> I believe the default values for PCGAMG changed >>>>> between versions. It sounds like the coarsening rate >>>>> is not great enough, so that these grids are too >>>>> large. This can be set using: >>>>> >>>>> ??https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html >>>>> >>>>> There is some explanation of this effect on that page. >>>>> Let us know if setting this does not correct the >>>>> situation. >>>>> >>>>> ? Thanks, >>>>> >>>>> ? ? ?Matt >>>>> ? >>>>> >>>>> Let me know if you need further information. >>>>> >>>>> Best, >>>>> >>>>> Myriam Peyrounette >>>>> >>>>> >>>>> -- >>>>> Myriam Peyrounette >>>>> CNRS/IDRIS - HLST >>>>> -- >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they >>>>> begin their experiments is infinitely more interesting >>>>> than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>> >>>> -- >>>> Myriam Peyrounette >>>> CNRS/IDRIS - HLST >>>> -- >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin >>>> their experiments is infinitely more interesting than any >>>> results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>> >>> -- >>> Myriam Peyrounette >>> CNRS/IDRIS - HLST >>> -- >>> >> >> -- >> Myriam Peyrounette >> CNRS/IDRIS - HLST >> -- >> > > -- > Myriam Peyrounette > CNRS/IDRIS - HLST > -- -- Myriam Peyrounette CNRS/IDRIS - HLST -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2975 bytes Desc: Signature cryptographique S/MIME URL: From fdkong.jd at gmail.com Wed Mar 20 11:38:58 2019 From: fdkong.jd at gmail.com (Fande Kong) Date: Wed, 20 Mar 2019 10:38:58 -0600 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: <00d471e0-ed2a-51fa-3031-a6b63c3a96e1@idris.fr> References: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> <7b104336-a067-e679-23ec-2a89e0ba9bc4@idris.fr> <8925b24f-62dd-1e45-5658-968491e51205@idris.fr> <00d471e0-ed2a-51fa-3031-a6b63c3a96e1@idris.fr> Message-ID: Hi Myriam, There are three algorithms in PETSc to do PtAP ( const char *algTypes[3] = {"scalable","nonscalable","hypre"};), and can be specified using the petsc options: -matptap_via xxxx. (1) -matptap_via hypre: This call the hypre package to do the PtAP trough an all-at-once triple product. In our experiences, it is the most memory efficient, but could be slow. (2) -matptap_via scalable: This involves a row-wise algorithm plus an outer product. This will use more memory than hypre, but way faster. This used to have a bug that could take all your memory, and I have a fix at https://bitbucket.org/petsc/petsc/pull-requests/1452/mpiptap-enable-large-scale-simulations/diff. When using this option, we may want to have extra options such as -inner_offdiag_matmatmult_via scalable -inner_diag_matmatmult_via scalable to select inner scalable algorithms. (3) -matptap_via nonscalable: Suppose to be even faster, but use more memory. It does dense matrix operations. Thanks, Fande Kong On Wed, Mar 20, 2019 at 10:06 AM Myriam Peyrounette via petsc-users < petsc-users at mcs.anl.gov> wrote: > More precisely: something happens when upgrading the functions > MatPtAPNumeric_MPIAIJ_MPIAIJ and/or MatPtAPSymbolic_MPIAIJ_MPIAIJ. 
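(For concreteness, a run using the options Fande describes above might look like: mpiexec -n 4000 ./main_ex42 -matptap_via scalable -inner_diag_matmatmult_via scalable -inner_offdiag_matmatmult_via scalable ; the executable name and core count here are only placeholders.)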
> > Unfortunately, there are a lot of differences between the old and new > versions of these functions. I keep investigating but if you have any idea, > please let me know. > > Best, > > Myriam > > Le 03/20/19 ? 13:48, Myriam Peyrounette a ?crit : > > Hi all, > > I used git bisect to determine when the memory need increased. I found > that the first "bad" commit is aa690a28a7284adb519c28cb44eae20a2c131c85. > > Barry was right, this commit seems to be about an evolution of MatPtAPSymbolic_MPIAIJ_MPIAIJ. > You mentioned the option "-matptap_via scalable" but I can't find any > information about it. Can you tell me more? > > Thanks > > Myriam > > > Le 03/11/19 ? 14:40, Mark Adams a ?crit : > > Is there a difference in memory usage on your tiny problem? I assume no. > > I don't see anything that could come from GAMG other than the RAP stuff > that you have discussed already. > > On Mon, Mar 11, 2019 at 9:32 AM Myriam Peyrounette < > myriam.peyrounette at idris.fr> wrote: > >> The code I am using here is the example 42 of PETSc ( >> https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html). >> Indeed it solves the Stokes equation. I thought it was a good idea to use >> an example you might know (and didn't find any that uses GAMG functions). I >> just changed the PCMG setup so that the memory problem appears. And it >> appears when adding PCGAMG. >> >> I don't care about the performance or even the result rightness here, but >> only about the difference in memory use between 3.6 and 3.10. Do you think >> finding a more adapted script would help? >> >> I used the threshold of 0.1 only once, at the beginning, to test its >> influence. I used the default threshold (of 0, I guess) for all the other >> runs. >> >> Myriam >> >> Le 03/11/19 ? 13:52, Mark Adams a ?crit : >> >> In looking at this larger scale run ... >> >> * Your eigen estimates are much lower than your tiny test problem. But >> this is Stokes apparently and it should not work anyway. Maybe you have a >> small time step that adds a lot of mass that brings the eigen estimates >> down. And your min eigenvalue (not used) is positive. I would expect >> negative for Stokes ... >> >> * You seem to be setting a threshold value of 0.1 -- that is very high >> >> * v3.6 says "using nonzero initial guess" but this is not in v3.10. Maybe >> we just stopped printing that. >> >> * There were some changes to coasening parameters in going from v3.6 but >> it does not look like your problem was effected. (The coarsening algo is >> non-deterministic by default and you can see small difference on different >> runs) >> >> * We may have also added a "noisy" RHS for eigen estimates by default >> from v3.6. >> >> * And for non-symetric problems you can try -pc_gamg_agg_nsmooths 0, but >> again GAMG is not built for Stokes anyway. >> >> >> On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette < >> myriam.peyrounette at idris.fr> wrote: >> >>> I used PCView to display the size of the linear system in each level of >>> the MG. You'll find the outputs attached to this mail (zip file) for both >>> the default threshold value and a value of 0.1, and for both 3.6 and 3.10 >>> PETSc versions. >>> >>> For convenience, I summarized the information in a graph, also attached >>> (png file). >>> >>> As you can see, there are slight differences between the two versions >>> but none is critical, in my opinion. Do you see anything suspicious in the >>> outputs? >>> >>> + I can't find the default threshold value. Do you know where I can find >>> it? 
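(As an aside, one way to see the default value is to ask PETSc itself: run the code with -help and grep the output for pc_gamg_threshold. -help lists each registered option together with its current value, provided the run actually creates a PCGAMG object.)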
>>> >>> Thanks for the follow-up >>> >>> Myriam >>> >>> Le 03/05/19 ? 14:06, Matthew Knepley a ?crit : >>> >>> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette < >>> myriam.peyrounette at idris.fr> wrote: >>> >>>> Hi Matt, >>>> >>>> I plotted the memory scalings using different threshold values. The two >>>> scalings are slightly translated (from -22 to -88 mB) but this gain is >>>> neglectable. The 3.6-scaling keeps being robust while the 3.10-scaling >>>> deteriorates. >>>> >>>> Do you have any other suggestion? >>>> >>> Mark, what is the option she can give to output all the GAMG data? >>> >>> Also, run using -ksp_view. GAMG will report all the sizes of its grids, >>> so it should be easy to see >>> if the coarse grid sizes are increasing, and also what the effect of the >>> threshold value is. >>> >>> Thanks, >>> >>> Matt >>> >>>> Thanks >>>> Myriam >>>> >>>> Le 03/02/19 ? 02:27, Matthew Knepley a ?crit : >>>> >>>> On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via petsc-users < >>>> petsc-users at mcs.anl.gov> wrote: >>>> >>>>> Hi, >>>>> >>>>> I used to run my code with PETSc 3.6. Since I upgraded the PETSc >>>>> version >>>>> to 3.10, this code has a bad memory scaling. >>>>> >>>>> To report this issue, I took the PETSc script ex42.c and slightly >>>>> modified it so that the KSP and PC configurations are the same as in my >>>>> code. In particular, I use a "personnalised" multi-grid method. The >>>>> modifications are indicated by the keyword "TopBridge" in the attached >>>>> scripts. >>>>> >>>>> To plot the memory (weak) scaling, I ran four calculations for each >>>>> script with increasing problem sizes and computations cores: >>>>> >>>>> 1. 100,000 elts on 4 cores >>>>> 2. 1 million elts on 40 cores >>>>> 3. 10 millions elts on 400 cores >>>>> 4. 100 millions elts on 4,000 cores >>>>> >>>>> The resulting graph is also attached. The scaling using PETSc 3.10 >>>>> clearly deteriorates for large cases, while the one using PETSc 3.6 is >>>>> robust. >>>>> >>>>> After a few tests, I found that the scaling is mostly sensitive to the >>>>> use of the AMG method for the coarse grid (line 1780 in >>>>> main_ex42_petsc36.cc). In particular, the performance strongly >>>>> deteriorates when commenting lines 1777 to 1790 (in >>>>> main_ex42_petsc36.cc). >>>>> >>>>> Do you have any idea of what changed between version 3.6 and version >>>>> 3.10 that may imply such degradation? >>>>> >>>> >>>> I believe the default values for PCGAMG changed between versions. It >>>> sounds like the coarsening rate >>>> is not great enough, so that these grids are too large. This can be set >>>> using: >>>> >>>> >>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html >>>> >>>> There is some explanation of this effect on that page. Let us know if >>>> setting this does not correct the situation. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Let me know if you need further information. >>>>> >>>>> Best, >>>>> >>>>> Myriam Peyrounette >>>>> >>>>> >>>>> -- >>>>> Myriam Peyrounette >>>>> CNRS/IDRIS - HLST >>>>> -- >>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. 
>>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>>> >>>> -- >>>> Myriam Peyrounette >>>> CNRS/IDRIS - HLST >>>> -- >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >>> >>> -- >>> Myriam Peyrounette >>> CNRS/IDRIS - HLST >>> -- >>> >>> >> -- >> Myriam Peyrounette >> CNRS/IDRIS - HLST >> -- >> >> > -- > Myriam Peyrounette > CNRS/IDRIS - HLST > -- > > > -- > Myriam Peyrounette > CNRS/IDRIS - HLST > -- > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Mar 20 12:18:32 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Wed, 20 Mar 2019 17:18:32 +0000 Subject: [petsc-users] Problems about GMRES restart and Scaling In-Reply-To: References: Message-ID: > On Mar 20, 2019, at 5:52 AM, Yingjie Wu via petsc-users wrote: > > Dear PETSc developers: > Hi, > Recently, I used PETSc to solve a non-linear PDEs for thermodynamic problems. In the process of solving, I found the following two phenomena, hoping to get some help and suggestions. > > 1. Because my problem involves a lot of physical parameters, it needs to call a series of functions, and can not analytically construct Jacobian matrix, so I use - snes_mf_operator to solve it, and give an approximate Jacobian matrix as a preconditioner. Because of the large dimension of the problem and the magnitude difference of the physical variables involved, it is found that the linear step residuals will increase at each restart (default 30th linear step) . This problem can be solved by setting a large number of restart steps. I would like to ask the reasons for this phenomenon? What knowledge or articles should I learn if I want to find out this problem? I've seen this behavior. I think in your case it is likely the -snes_mf_operator is not really producing an "accurate enough" Jacobian-Vector product (and the "solution" being generated by GMRES may be garbage). Run with -ksp_monitor_true_residual If your residual function has if () statements in it or other very sharp changes (discontinuities) then it may not even have a true Jacobian at the locations it is being evaluated at. In the sense that the "Jacobian" you are applying via finite differences is not a linear operator and hence GMRES will fail on it. What are you using for a preconditioner? And roughly how many KSP iterations are being used. Barry > > > 2. In my problem model, there are many physical fields (variables are realized by finite difference method), and the magnitude of variables varies greatly. Is there any Scaling interface or function in Petsc? > > Thanks, > Yingjie From mvalera-w at sdsu.edu Wed Mar 20 14:57:10 2019 From: mvalera-w at sdsu.edu (Manuel Valera) Date: Wed, 20 Mar 2019 12:57:10 -0700 Subject: [petsc-users] MPI Communication times Message-ID: Hello, I am working on timing my model, which we made MPI scalable using petsc DMDAs, i want to know more about the output log and how to calculate a total communication times for my runs, so far i see we have "MPI Messages" and "MPI Messages Lengths" in the log, along VecScatterEnd and VecScatterBegin reports. My question is, how do i interpret these number to get a rough estimate on how much overhead we have just from MPI communications times in my model runs? 
Thanks, -------------- next part -------------- An HTML attachment was scrubbed... URL: From jczhang at mcs.anl.gov Wed Mar 20 16:02:04 2019 From: jczhang at mcs.anl.gov (Zhang, Junchao) Date: Wed, 20 Mar 2019 21:02:04 +0000 Subject: [petsc-users] MPI Communication times In-Reply-To: References: Message-ID: See the "Mess AvgLen Reduct" number in each log stage. Mess is the total number of messages sent in an event over all processes. AvgLen is average message len. Reduct is the number of global reduction. Each event like VecScatterBegin/End has a maximal execution time over all processes, and a max/min ratio. %T is sum(execution time of the event on each process)/sum(execution time of the stage on each process). %T indicates how expensive the event is. It is a number you should pay attention to. If your code is imbalanced (i.e., with a big max/min ratio), then the performance number is skewed and becomes misleading because some processes are just waiting for others. Then, besides -log_view, you can add -log_sync, which adds an extra MPI_Barrier for each event to let them start at the same time. With that, it is easier to interpret the number. src/vec/vscat/examples/ex4.c is a tiny example for VecScatter logging. --Junchao Zhang On Wed, Mar 20, 2019 at 2:58 PM Manuel Valera via petsc-users > wrote: Hello, I am working on timing my model, which we made MPI scalable using petsc DMDAs, i want to know more about the output log and how to calculate a total communication times for my runs, so far i see we have "MPI Messages" and "MPI Messages Lengths" in the log, along VecScatterEnd and VecScatterBegin reports. My question is, how do i interpret these number to get a rough estimate on how much overhead we have just from MPI communications times in my model runs? Thanks, -------------- next part -------------- An HTML attachment was scrubbed... URL: From mvalera-w at sdsu.edu Wed Mar 20 16:17:45 2019 From: mvalera-w at sdsu.edu (Manuel Valera) Date: Wed, 20 Mar 2019 14:17:45 -0700 Subject: [petsc-users] MPI Communication times In-Reply-To: References: Message-ID: Thanks for your answer, so for example i have a log for 200 cores across 10 nodes that reads: ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ---------------------------------------------------------------------------------------------------------------------- VecScatterBegin 3014 1.0 4.5550e+01 2.6 0.00e+00 0.0 *4.2e+06 1.1e+06 2.8e+01 4* 0 63 56 0 4 0 63 56 0 0 VecScatterEnd 2976 1.0 1.2143e+02 1.7 0.00e+00 0.0 *0.0e+00 0.0e+00 0.0e+00 14* 0 0 0 0 14 0 0 0 0 0 While for 20 nodes at one node i have: VecScatterBegin 2596 1.0 2.9142e+01 2.1 0.00e+00 0.0 *1.2e+05 4.0e+06 3.0e+01 2* 0 81 61 0 2 0 81 61 0 0 VecScatterEnd 2558 1.0 8.0344e+01 7.9 0.00e+00 0.0 *0.0e+00 0.0e+00 0.0e+00 3* 0 0 0 0 3 0 0 0 0 0 Where do i see the max/min ratio in here? and why End step is all 0.0e00 in both but still grows from 3% to 14% of total time? It seems i would need to run again with the -log_sync option, is this correct? Different question, can't i estimate the total communication time if i had a typical communication time per MPI message times the number of MPI messages reported in the log? or it doesn't work like that? Thanks. On Wed, Mar 20, 2019 at 2:02 PM Zhang, Junchao wrote: > See the "Mess AvgLen Reduct" number in each log stage. 
Mess is the > total number of messages sent in an event over all processes. AvgLen is > average message len. Reduct is the number of global reduction. > Each event like VecScatterBegin/End has a maximal execution time over all > processes, and a max/min ratio. %T is sum(execution time of the event on > each process)/sum(execution time of the stage on each process). %T > indicates how expensive the event is. It is a number you should pay > attention to. > If your code is imbalanced (i.e., with a big max/min ratio), then the > performance number is skewed and becomes misleading because some processes > are just waiting for others. Then, besides -log_view, you can add > -log_sync, which adds an extra MPI_Barrier for each event to let them start > at the same time. With that, it is easier to interpret the number. > src/vec/vscat/examples/ex4.c is a tiny example for VecScatter logging. > > --Junchao Zhang > > > On Wed, Mar 20, 2019 at 2:58 PM Manuel Valera via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Hello, >> >> I am working on timing my model, which we made MPI scalable using petsc >> DMDAs, i want to know more about the output log and how to calculate a >> total communication times for my runs, so far i see we have "MPI Messages" >> and "MPI Messages Lengths" in the log, along VecScatterEnd and >> VecScatterBegin reports. >> >> My question is, how do i interpret these number to get a rough estimate >> on how much overhead we have just from MPI communications times in my model >> runs? >> >> Thanks, >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From jczhang at mcs.anl.gov Wed Mar 20 16:42:59 2019 From: jczhang at mcs.anl.gov (Zhang, Junchao) Date: Wed, 20 Mar 2019 21:42:59 +0000 Subject: [petsc-users] MPI Communication times In-Reply-To: References: Message-ID: On Wed, Mar 20, 2019 at 4:18 PM Manuel Valera > wrote: Thanks for your answer, so for example i have a log for 200 cores across 10 nodes that reads: ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ---------------------------------------------------------------------------------------------------------------------- VecScatterBegin 3014 1.0 4.5550e+01 2.6 0.00e+00 0.0 4.2e+06 1.1e+06 2.8e+01 4 0 63 56 0 4 0 63 56 0 0 VecScatterEnd 2976 1.0 1.2143e+02 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 14 0 0 0 0 14 0 0 0 0 0 While for 20 nodes at one node i have: What does that mean? VecScatterBegin 2596 1.0 2.9142e+01 2.1 0.00e+00 0.0 1.2e+05 4.0e+06 3.0e+01 2 0 81 61 0 2 0 81 61 0 0 VecScatterEnd 2558 1.0 8.0344e+01 7.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 Where do i see the max/min ratio in here? and why End step is all 0.0e00 in both but still grows from 3% to 14% of total time? It seems i would need to run again with the -log_sync option, is this correct? e.g., 2.1, 7.9. MPI send/recv are in VecScatterBegin(). VecScatterEnd() only does MPI_Wait. That is why it has zero messages. Yes, run with -log_sync and see what happens. Different question, can't i estimate the total communication time if i had a typical communication time per MPI message times the number of MPI messages reported in the log? or it doesn't work like that? Probably not work because you have multiple processes doing send/recv at the same time. They might saturate the bandwidth. 
Petsc also does computation/communication overlapping. Thanks. On Wed, Mar 20, 2019 at 2:02 PM Zhang, Junchao > wrote: See the "Mess AvgLen Reduct" number in each log stage. Mess is the total number of messages sent in an event over all processes. AvgLen is average message len. Reduct is the number of global reduction. Each event like VecScatterBegin/End has a maximal execution time over all processes, and a max/min ratio. %T is sum(execution time of the event on each process)/sum(execution time of the stage on each process). %T indicates how expensive the event is. It is a number you should pay attention to. If your code is imbalanced (i.e., with a big max/min ratio), then the performance number is skewed and becomes misleading because some processes are just waiting for others. Then, besides -log_view, you can add -log_sync, which adds an extra MPI_Barrier for each event to let them start at the same time. With that, it is easier to interpret the number. src/vec/vscat/examples/ex4.c is a tiny example for VecScatter logging. --Junchao Zhang On Wed, Mar 20, 2019 at 2:58 PM Manuel Valera via petsc-users > wrote: Hello, I am working on timing my model, which we made MPI scalable using petsc DMDAs, i want to know more about the output log and how to calculate a total communication times for my runs, so far i see we have "MPI Messages" and "MPI Messages Lengths" in the log, along VecScatterEnd and VecScatterBegin reports. My question is, how do i interpret these number to get a rough estimate on how much overhead we have just from MPI communications times in my model runs? Thanks, -------------- next part -------------- An HTML attachment was scrubbed... URL: From mvalera-w at sdsu.edu Wed Mar 20 17:07:45 2019 From: mvalera-w at sdsu.edu (Manuel Valera) Date: Wed, 20 Mar 2019 15:07:45 -0700 Subject: [petsc-users] MPI Communication times In-Reply-To: References: Message-ID: Sorry i meant 20 cores at one node. Ok i will retry with -log_sync and come back. Thanks for your help. On Wed, Mar 20, 2019 at 2:43 PM Zhang, Junchao wrote: > > > On Wed, Mar 20, 2019 at 4:18 PM Manuel Valera wrote: > >> Thanks for your answer, so for example i have a log for 200 cores across >> 10 nodes that reads: >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> Event Count Time (sec) Flop >> --- Global --- --- Stage --- Total >> Max Ratio Max Ratio Max Ratio >> Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> >> ---------------------------------------------------------------------------------------------------------------------- >> VecScatterBegin 3014 1.0 4.5550e+01 2.6 0.00e+00 0.0 *4.2e+06 >> 1.1e+06 2.8e+01 4* 0 63 56 0 4 0 63 56 0 0 >> VecScatterEnd 2976 1.0 1.2143e+02 1.7 0.00e+00 0.0 *0.0e+00 >> 0.0e+00 0.0e+00 14* 0 0 0 0 14 0 0 0 0 0 >> >> While for 20 nodes at one node i have: >> > What does that mean? > >> VecScatterBegin 2596 1.0 2.9142e+01 2.1 0.00e+00 0.0 *1.2e+05 >> 4.0e+06 3.0e+01 2* 0 81 61 0 2 0 81 61 0 0 >> VecScatterEnd 2558 1.0 8.0344e+01 7.9 0.00e+00 0.0 *0.0e+00 >> 0.0e+00 0.0e+00 3* 0 0 0 0 3 0 0 0 0 0 >> >> Where do i see the max/min ratio in here? and why End step is all 0.0e00 >> in both but still grows from 3% to 14% of total time? It seems i would need >> to run again with the -log_sync option, is this correct? >> >> e.g., 2.1, 7.9. MPI send/recv are in VecScatterBegin(). VecScatterEnd() > only does MPI_Wait. That is why it has zero messages. Yes, run with > -log_sync and see what happens. 
> > >> Different question, can't i estimate the total communication time if i >> had a typical communication time per MPI message times the number of MPI >> messages reported in the log? or it doesn't work like that? >> >> Probably not work because you have multiple processes doing send/recv at > the same time. They might saturate the bandwidth. Petsc also does > computation/communication overlapping. > > >> Thanks. >> >> >> >> >> >> On Wed, Mar 20, 2019 at 2:02 PM Zhang, Junchao >> wrote: >> >>> See the "Mess AvgLen Reduct" number in each log stage. Mess is the >>> total number of messages sent in an event over all processes. AvgLen is >>> average message len. Reduct is the number of global reduction. >>> Each event like VecScatterBegin/End has a maximal execution time over >>> all processes, and a max/min ratio. %T is sum(execution time of the event >>> on each process)/sum(execution time of the stage on each process). %T >>> indicates how expensive the event is. It is a number you should pay >>> attention to. >>> If your code is imbalanced (i.e., with a big max/min ratio), then the >>> performance number is skewed and becomes misleading because some processes >>> are just waiting for others. Then, besides -log_view, you can add >>> -log_sync, which adds an extra MPI_Barrier for each event to let them start >>> at the same time. With that, it is easier to interpret the number. >>> src/vec/vscat/examples/ex4.c is a tiny example for VecScatter logging. >>> >>> --Junchao Zhang >>> >>> >>> On Wed, Mar 20, 2019 at 2:58 PM Manuel Valera via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> >>>> Hello, >>>> >>>> I am working on timing my model, which we made MPI scalable using petsc >>>> DMDAs, i want to know more about the output log and how to calculate a >>>> total communication times for my runs, so far i see we have "MPI Messages" >>>> and "MPI Messages Lengths" in the log, along VecScatterEnd and >>>> VecScatterBegin reports. >>>> >>>> My question is, how do i interpret these number to get a rough estimate >>>> on how much overhead we have just from MPI communications times in my model >>>> runs? >>>> >>>> Thanks, >>>> >>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From jczhang at mcs.anl.gov Wed Mar 20 17:24:53 2019 From: jczhang at mcs.anl.gov (Zhang, Junchao) Date: Wed, 20 Mar 2019 22:24:53 +0000 Subject: [petsc-users] MPI Communication times In-Reply-To: References: Message-ID: Forgot to mention long VecScatter time might also due to local memory copies. If the communication pattern has large local to local (self to self) scatter, which often happens thanks to locality, then the memory copy time is counted in VecScatter. You can analyze your code's communication pattern to see if it is the case. 
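For what it is worth, a minimal sketch of isolating the halo-exchange cost in the log is to wrap the ghost updates in their own logging stage (names here are illustrative, assuming a DMDA-based code like yours):

    PetscLogStage halo;
    ierr = PetscLogStageRegister("Halo exchange",&halo);CHKERRQ(ierr);
    ierr = PetscLogStagePush(halo);CHKERRQ(ierr);
    ierr = DMGlobalToLocalBegin(da,gvec,INSERT_VALUES,lvec);CHKERRQ(ierr);  /* VecScatterBegin under the hood */
    ierr = DMGlobalToLocalEnd(da,gvec,INSERT_VALUES,lvec);CHKERRQ(ierr);    /* VecScatterEnd under the hood */
    ierr = PetscLogStagePop();CHKERRQ(ierr);

-log_view then reports that stage separately, which gives a rough upper bound on the communication cost (it still includes the local pack/copy work mentioned above).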
--Junchao Zhang On Wed, Mar 20, 2019 at 4:44 PM Zhang, Junchao via petsc-users > wrote: On Wed, Mar 20, 2019 at 4:18 PM Manuel Valera > wrote: Thanks for your answer, so for example i have a log for 200 cores across 10 nodes that reads: ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ---------------------------------------------------------------------------------------------------------------------- VecScatterBegin 3014 1.0 4.5550e+01 2.6 0.00e+00 0.0 4.2e+06 1.1e+06 2.8e+01 4 0 63 56 0 4 0 63 56 0 0 VecScatterEnd 2976 1.0 1.2143e+02 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 14 0 0 0 0 14 0 0 0 0 0 While for 20 nodes at one node i have: What does that mean? VecScatterBegin 2596 1.0 2.9142e+01 2.1 0.00e+00 0.0 1.2e+05 4.0e+06 3.0e+01 2 0 81 61 0 2 0 81 61 0 0 VecScatterEnd 2558 1.0 8.0344e+01 7.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 Where do i see the max/min ratio in here? and why End step is all 0.0e00 in both but still grows from 3% to 14% of total time? It seems i would need to run again with the -log_sync option, is this correct? e.g., 2.1, 7.9. MPI send/recv are in VecScatterBegin(). VecScatterEnd() only does MPI_Wait. That is why it has zero messages. Yes, run with -log_sync and see what happens. Different question, can't i estimate the total communication time if i had a typical communication time per MPI message times the number of MPI messages reported in the log? or it doesn't work like that? Probably not work because you have multiple processes doing send/recv at the same time. They might saturate the bandwidth. Petsc also does computation/communication overlapping. Thanks. On Wed, Mar 20, 2019 at 2:02 PM Zhang, Junchao > wrote: See the "Mess AvgLen Reduct" number in each log stage. Mess is the total number of messages sent in an event over all processes. AvgLen is average message len. Reduct is the number of global reduction. Each event like VecScatterBegin/End has a maximal execution time over all processes, and a max/min ratio. %T is sum(execution time of the event on each process)/sum(execution time of the stage on each process). %T indicates how expensive the event is. It is a number you should pay attention to. If your code is imbalanced (i.e., with a big max/min ratio), then the performance number is skewed and becomes misleading because some processes are just waiting for others. Then, besides -log_view, you can add -log_sync, which adds an extra MPI_Barrier for each event to let them start at the same time. With that, it is easier to interpret the number. src/vec/vscat/examples/ex4.c is a tiny example for VecScatter logging. --Junchao Zhang On Wed, Mar 20, 2019 at 2:58 PM Manuel Valera via petsc-users > wrote: Hello, I am working on timing my model, which we made MPI scalable using petsc DMDAs, i want to know more about the output log and how to calculate a total communication times for my runs, so far i see we have "MPI Messages" and "MPI Messages Lengths" in the log, along VecScatterEnd and VecScatterBegin reports. My question is, how do i interpret these number to get a rough estimate on how much overhead we have just from MPI communications times in my model runs? Thanks, -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From friedmud at gmail.com Wed Mar 20 17:42:06 2019 From: friedmud at gmail.com (Derek Gaston) Date: Wed, 20 Mar 2019 16:42:06 -0600 Subject: [petsc-users] Valgrind Issue With Ghosted Vectors Message-ID: Trying to track down some memory corruption I'm seeing on larger scale runs (3.5B+ unknowns). Was able to run Valgrind on it... and I'm seeing quite a lot of uninitialized value errors coming from ghost updating. Here are some of the traces: ==87695== Conditional jump or move depends on uninitialised value(s) ==87695== at 0x73236D3: PetscMallocAlign (mal.c:28) ==87695== by 0x7323C70: PetscMallocA (mal.c:390) ==87695== by 0x739048E: VecScatterMemcpyPlanCreate_Index (vscat.c:284) ==87695== by 0x73A5D97: VecScatterMemcpyPlanCreate_PtoP (vpscat_mpi1.c:312) ==64730== by 0x7393E8A: VecScatterSetUp_vectype_private (vscat.c:857) ==64730== by 0x7395E5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) ==64730== by 0x73DDD39: VecScatterSetUp (vscatfce.c:212) ==64730== by 0x73DCD73: VecScatterCreateWithData (vscreate.c:333) ==64730== by 0x7444232: VecCreateGhostWithArray (pbvec.c:685) ==64730== by 0x744490D: VecCreateGhost (pbvec.c:741) ==133582== Conditional jump or move depends on uninitialised value(s) ==133582== at 0x4030384: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:1034) ==133582== by 0x739E4F9: PetscMemcpy (petscsys.h:1649) ==133582== by 0x739E4F9: VecScatterMemcpyPlanExecute_Pack (vecscatterimpl.h:150) ==133582== by 0x739E4F9: VecScatterBeginMPI1_1 (vpscat_mpi1.h:69) ==133582== by 0x73DD964: VecScatterBegin (vscatfce.c:110) ==133582== by 0x744E195: VecGhostUpdateBegin (commonmpvec.c:225) This is from a Git checkout of PETSc... the hash I branched from is: 0e667e8fea4aa from December 23rd (updating would be really hard at this point as I've completed 90% of my dissertation with this version... and changing PETSc now would be pretty painful!). Any ideas? Is it possible it's in my code? Is it possible that there are later PETSc commits that already fix this? Thanks for any help, Derek -------------- next part -------------- An HTML attachment was scrubbed... URL: From jczhang at mcs.anl.gov Wed Mar 20 18:33:18 2019 From: jczhang at mcs.anl.gov (Zhang, Junchao) Date: Wed, 20 Mar 2019 23:33:18 +0000 Subject: [petsc-users] Valgrind Issue With Ghosted Vectors In-Reply-To: References: Message-ID: Did you see the warning with small scale runs? Is it possible to provide a test code? You mentioned "changing PETSc now would be pretty painful". Is it because it will affect your performance (but not your code)? If yes, could you try PETSc master and run you code with or without -vecscatter_type sf. I want to isolate the problem and see if it is due to possible bugs in VecScatter. If the above suggestion is not feasible, I will disable VecScatterMemcpy. It is an optimization I added. Sorry I did not have an option to turn off it because I thought it was always useful:) I will provide you a patch later to disable it. With that you can run again to isolate possible bugs in VecScatterMemcpy. Thanks. --Junchao Zhang On Wed, Mar 20, 2019 at 5:40 PM Derek Gaston via petsc-users > wrote: Trying to track down some memory corruption I'm seeing on larger scale runs (3.5B+ unknowns). Was able to run Valgrind on it... and I'm seeing quite a lot of uninitialized value errors coming from ghost updating. 
Here are some of the traces: ==87695== Conditional jump or move depends on uninitialised value(s) ==87695== at 0x73236D3: PetscMallocAlign (mal.c:28) ==87695== by 0x7323C70: PetscMallocA (mal.c:390) ==87695== by 0x739048E: VecScatterMemcpyPlanCreate_Index (vscat.c:284) ==87695== by 0x73A5D97: VecScatterMemcpyPlanCreate_PtoP (vpscat_mpi1.c:312) ==64730== by 0x7393E8A: VecScatterSetUp_vectype_private (vscat.c:857) ==64730== by 0x7395E5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) ==64730== by 0x73DDD39: VecScatterSetUp (vscatfce.c:212) ==64730== by 0x73DCD73: VecScatterCreateWithData (vscreate.c:333) ==64730== by 0x7444232: VecCreateGhostWithArray (pbvec.c:685) ==64730== by 0x744490D: VecCreateGhost (pbvec.c:741) ==133582== Conditional jump or move depends on uninitialised value(s) ==133582== at 0x4030384: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:1034) ==133582== by 0x739E4F9: PetscMemcpy (petscsys.h:1649) ==133582== by 0x739E4F9: VecScatterMemcpyPlanExecute_Pack (vecscatterimpl.h:150) ==133582== by 0x739E4F9: VecScatterBeginMPI1_1 (vpscat_mpi1.h:69) ==133582== by 0x73DD964: VecScatterBegin (vscatfce.c:110) ==133582== by 0x744E195: VecGhostUpdateBegin (commonmpvec.c:225) This is from a Git checkout of PETSc... the hash I branched from is: 0e667e8fea4aa from December 23rd (updating would be really hard at this point as I've completed 90% of my dissertation with this version... and changing PETSc now would be pretty painful!). Any ideas? Is it possible it's in my code? Is it possible that there are later PETSc commits that already fix this? Thanks for any help, Derek -------------- next part -------------- An HTML attachment was scrubbed... URL: From jczhang at mcs.anl.gov Wed Mar 20 22:21:15 2019 From: jczhang at mcs.anl.gov (Zhang, Junchao) Date: Thu, 21 Mar 2019 03:21:15 +0000 Subject: [petsc-users] Valgrind Issue With Ghosted Vectors In-Reply-To: References: Message-ID: Hi, Derek, Try to apply this tiny (but dirty) patch on your version of PETSc to disable the VecScatterMemcpyPlan optimization to see if it helps. Thanks. --Junchao Zhang On Wed, Mar 20, 2019 at 6:33 PM Junchao Zhang > wrote: Did you see the warning with small scale runs? Is it possible to provide a test code? You mentioned "changing PETSc now would be pretty painful". Is it because it will affect your performance (but not your code)? If yes, could you try PETSc master and run you code with or without -vecscatter_type sf. I want to isolate the problem and see if it is due to possible bugs in VecScatter. If the above suggestion is not feasible, I will disable VecScatterMemcpy. It is an optimization I added. Sorry I did not have an option to turn off it because I thought it was always useful:) I will provide you a patch later to disable it. With that you can run again to isolate possible bugs in VecScatterMemcpy. Thanks. --Junchao Zhang On Wed, Mar 20, 2019 at 5:40 PM Derek Gaston via petsc-users > wrote: Trying to track down some memory corruption I'm seeing on larger scale runs (3.5B+ unknowns). Was able to run Valgrind on it... and I'm seeing quite a lot of uninitialized value errors coming from ghost updating. 
Here are some of the traces: ==87695== Conditional jump or move depends on uninitialised value(s) ==87695== at 0x73236D3: PetscMallocAlign (mal.c:28) ==87695== by 0x7323C70: PetscMallocA (mal.c:390) ==87695== by 0x739048E: VecScatterMemcpyPlanCreate_Index (vscat.c:284) ==87695== by 0x73A5D97: VecScatterMemcpyPlanCreate_PtoP (vpscat_mpi1.c:312) ==64730== by 0x7393E8A: VecScatterSetUp_vectype_private (vscat.c:857) ==64730== by 0x7395E5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) ==64730== by 0x73DDD39: VecScatterSetUp (vscatfce.c:212) ==64730== by 0x73DCD73: VecScatterCreateWithData (vscreate.c:333) ==64730== by 0x7444232: VecCreateGhostWithArray (pbvec.c:685) ==64730== by 0x744490D: VecCreateGhost (pbvec.c:741) ==133582== Conditional jump or move depends on uninitialised value(s) ==133582== at 0x4030384: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:1034) ==133582== by 0x739E4F9: PetscMemcpy (petscsys.h:1649) ==133582== by 0x739E4F9: VecScatterMemcpyPlanExecute_Pack (vecscatterimpl.h:150) ==133582== by 0x739E4F9: VecScatterBeginMPI1_1 (vpscat_mpi1.h:69) ==133582== by 0x73DD964: VecScatterBegin (vscatfce.c:110) ==133582== by 0x744E195: VecGhostUpdateBegin (commonmpvec.c:225) This is from a Git checkout of PETSc... the hash I branched from is: 0e667e8fea4aa from December 23rd (updating would be really hard at this point as I've completed 90% of my dissertation with this version... and changing PETSc now would be pretty painful!). Any ideas? Is it possible it's in my code? Is it possible that there are later PETSc commits that already fix this? Thanks for any help, Derek -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: vscat.patch Type: application/octet-stream Size: 1379 bytes Desc: vscat.patch URL: From stefano.zampini at gmail.com Thu Mar 21 03:00:55 2019 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Thu, 21 Mar 2019 09:00:55 +0100 Subject: [petsc-users] Valgrind Issue With Ghosted Vectors In-Reply-To: References: Message-ID: Derek I have fixed the optimized plan few weeks ago https://bitbucket.org/petsc/petsc/commits/c3caad8634d376283f7053f3b388606b45b3122c Maybe this will fix your problem too? Stefano Il Gio 21 Mar 2019, 04:21 Zhang, Junchao via petsc-users < petsc-users at mcs.anl.gov> ha scritto: > Hi, Derek, > Try to apply this tiny (but dirty) patch on your version of PETSc to > disable the VecScatterMemcpyPlan optimization to see if it helps. > Thanks. > --Junchao Zhang > > On Wed, Mar 20, 2019 at 6:33 PM Junchao Zhang wrote: > >> Did you see the warning with small scale runs? Is it possible to provide >> a test code? >> You mentioned "changing PETSc now would be pretty painful". Is it because >> it will affect your performance (but not your code)? If yes, could you try >> PETSc master and run you code with or without -vecscatter_type sf. I want >> to isolate the problem and see if it is due to possible bugs in VecScatter. >> If the above suggestion is not feasible, I will disable VecScatterMemcpy. >> It is an optimization I added. Sorry I did not have an option to turn off >> it because I thought it was always useful:) I will provide you a patch >> later to disable it. With that you can run again to isolate possible bugs >> in VecScatterMemcpy. >> Thanks. 
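(For reference, applying a patch such as the attached vscat.patch from the top of the PETSc source tree is typically just "git apply vscat.patch", or "patch -p1 < vscat.patch", followed by rebuilding the library.)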
>> --Junchao Zhang >> >> >> On Wed, Mar 20, 2019 at 5:40 PM Derek Gaston via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> Trying to track down some memory corruption I'm seeing on larger scale >>> runs (3.5B+ unknowns). Was able to run Valgrind on it... and I'm seeing >>> quite a lot of uninitialized value errors coming from ghost updating. Here >>> are some of the traces: >>> >>> ==87695== Conditional jump or move depends on uninitialised value(s) >>> ==87695== at 0x73236D3: PetscMallocAlign (mal.c:28) >>> ==87695== by 0x7323C70: PetscMallocA (mal.c:390) >>> ==87695== by 0x739048E: VecScatterMemcpyPlanCreate_Index (vscat.c:284) >>> ==87695== by 0x73A5D97: VecScatterMemcpyPlanCreate_PtoP >>> (vpscat_mpi1.c:312) >>> ==64730== by 0x7393E8A: VecScatterSetUp_vectype_private (vscat.c:857) >>> ==64730== by 0x7395E5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) >>> ==64730== by 0x73DDD39: VecScatterSetUp (vscatfce.c:212) >>> ==64730== by 0x73DCD73: VecScatterCreateWithData (vscreate.c:333) >>> ==64730== by 0x7444232: VecCreateGhostWithArray (pbvec.c:685) >>> ==64730== by 0x744490D: VecCreateGhost (pbvec.c:741) >>> >>> ==133582== Conditional jump or move depends on uninitialised value(s) >>> ==133582== at 0x4030384: memcpy@@GLIBC_2.14 >>> (vg_replace_strmem.c:1034) >>> ==133582== by 0x739E4F9: PetscMemcpy (petscsys.h:1649) >>> ==133582== by 0x739E4F9: VecScatterMemcpyPlanExecute_Pack >>> (vecscatterimpl.h:150) >>> ==133582== by 0x739E4F9: VecScatterBeginMPI1_1 (vpscat_mpi1.h:69) >>> ==133582== by 0x73DD964: VecScatterBegin (vscatfce.c:110) >>> ==133582== by 0x744E195: VecGhostUpdateBegin (commonmpvec.c:225) >>> >>> This is from a Git checkout of PETSc... the hash I branched from is: >>> 0e667e8fea4aa from December 23rd (updating would be really hard at this >>> point as I've completed 90% of my dissertation with this version... and >>> changing PETSc now would be pretty painful!). >>> >>> Any ideas? Is it possible it's in my code? Is it possible that there >>> are later PETSc commits that already fix this? >>> >>> Thanks for any help, >>> Derek >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Thu Mar 21 04:11:47 2019 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Thu, 21 Mar 2019 10:11:47 +0100 Subject: [petsc-users] Valgrind Issue With Ghosted Vectors In-Reply-To: References: Message-ID: Derek, can you run with --track-origins=yes? There are few possibilities for the uninitialized warning (candidates for the uninitialized errors are the arrays starts, indices, and the variables bs or n) in the below code, and this valgrind option will help. PetscErrorCode VecScatterMemcpyPlanCreate_Index(PetscInt n,const PetscInt *starts,const PetscInt *indices,PetscInt bs,VecScatterMemcpyPlan *plan) { PetscErrorCode ierr; PetscInt i,j,k,my_copies,n_copies=0,step; PetscBool strided,has_strided; PetscFunctionBegin; ierr = PetscMemzero(plan,sizeof(VecScatterMemcpyPlan));CHKERRQ(ierr); plan->n = n; ierr = PetscMalloc2(n,&plan->optimized,n+1 ,&plan->copy_offsets);CHKERRQ(ierr); /* check if each remote part of the scatter is made of copies, and count total_copies */ for (i=0; i= 256) { /* worth using memcpy? 
*/ plan->optimized[i] = PETSC_TRUE; n_copies += my_copies; } else { plan->optimized[i] = PETSC_FALSE; } } /* do malloc with the recently known n_copies */ -> THIS IS THE VAGRIND WARNING ierr = PetscMalloc2(n_copies,&plan->copy_starts,n_copies,&plan->copy_lengths);CHKERRQ(ierr); Il giorno gio 21 mar 2019 alle ore 09:00 Stefano Zampini < stefano.zampini at gmail.com> ha scritto: > Derek > > I have fixed the optimized plan few weeks ago > > > https://bitbucket.org/petsc/petsc/commits/c3caad8634d376283f7053f3b388606b45b3122c > > Maybe this will fix your problem too? > > Stefano > > > Il Gio 21 Mar 2019, 04:21 Zhang, Junchao via petsc-users < > petsc-users at mcs.anl.gov> ha scritto: > >> Hi, Derek, >> Try to apply this tiny (but dirty) patch on your version of PETSc to >> disable the VecScatterMemcpyPlan optimization to see if it helps. >> Thanks. >> --Junchao Zhang >> >> On Wed, Mar 20, 2019 at 6:33 PM Junchao Zhang >> wrote: >> >>> Did you see the warning with small scale runs? Is it possible to >>> provide a test code? >>> You mentioned "changing PETSc now would be pretty painful". Is it >>> because it will affect your performance (but not your code)? If yes, could >>> you try PETSc master and run you code with or without -vecscatter_type sf. >>> I want to isolate the problem and see if it is due to possible bugs in >>> VecScatter. >>> If the above suggestion is not feasible, I will disable >>> VecScatterMemcpy. It is an optimization I added. Sorry I did not have an >>> option to turn off it because I thought it was always useful:) I will >>> provide you a patch later to disable it. With that you can run again to >>> isolate possible bugs in VecScatterMemcpy. >>> Thanks. >>> --Junchao Zhang >>> >>> >>> On Wed, Mar 20, 2019 at 5:40 PM Derek Gaston via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> >>>> Trying to track down some memory corruption I'm seeing on larger scale >>>> runs (3.5B+ unknowns). Was able to run Valgrind on it... and I'm seeing >>>> quite a lot of uninitialized value errors coming from ghost updating. Here >>>> are some of the traces: >>>> >>>> ==87695== Conditional jump or move depends on uninitialised value(s) >>>> ==87695== at 0x73236D3: PetscMallocAlign (mal.c:28) >>>> ==87695== by 0x7323C70: PetscMallocA (mal.c:390) >>>> ==87695== by 0x739048E: VecScatterMemcpyPlanCreate_Index >>>> (vscat.c:284) >>>> ==87695== by 0x73A5D97: VecScatterMemcpyPlanCreate_PtoP >>>> (vpscat_mpi1.c:312) >>>> ==64730== by 0x7393E8A: VecScatterSetUp_vectype_private (vscat.c:857) >>>> ==64730== by 0x7395E5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) >>>> ==64730== by 0x73DDD39: VecScatterSetUp (vscatfce.c:212) >>>> ==64730== by 0x73DCD73: VecScatterCreateWithData (vscreate.c:333) >>>> ==64730== by 0x7444232: VecCreateGhostWithArray (pbvec.c:685) >>>> ==64730== by 0x744490D: VecCreateGhost (pbvec.c:741) >>>> >>>> ==133582== Conditional jump or move depends on uninitialised value(s) >>>> ==133582== at 0x4030384: memcpy@@GLIBC_2.14 >>>> (vg_replace_strmem.c:1034) >>>> ==133582== by 0x739E4F9: PetscMemcpy (petscsys.h:1649) >>>> ==133582== by 0x739E4F9: VecScatterMemcpyPlanExecute_Pack >>>> (vecscatterimpl.h:150) >>>> ==133582== by 0x739E4F9: VecScatterBeginMPI1_1 (vpscat_mpi1.h:69) >>>> ==133582== by 0x73DD964: VecScatterBegin (vscatfce.c:110) >>>> ==133582== by 0x744E195: VecGhostUpdateBegin (commonmpvec.c:225) >>>> >>>> This is from a Git checkout of PETSc... 
the hash I branched from is: >>>> 0e667e8fea4aa from December 23rd (updating would be really hard at this >>>> point as I've completed 90% of my dissertation with this version... and >>>> changing PETSc now would be pretty painful!). >>>> >>>> Any ideas? Is it possible it's in my code? Is it possible that there >>>> are later PETSc commits that already fix this? >>>> >>>> Thanks for any help, >>>> Derek >>>> >>>> -- Stefano -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu Mar 21 07:43:36 2019 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 21 Mar 2019 08:43:36 -0400 Subject: [petsc-users] Problems about GMRES restart and Scaling In-Reply-To: References: Message-ID: On Wed, Mar 20, 2019 at 1:18 PM Smith, Barry F. via petsc-users < petsc-users at mcs.anl.gov> wrote: > > > > On Mar 20, 2019, at 5:52 AM, Yingjie Wu via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > > Dear PETSc developers: > > Hi, > > Recently, I used PETSc to solve a non-linear PDEs for thermodynamic > problems. In the process of solving, I found the following two phenomena, > hoping to get some help and suggestions. > > > > 1. Because my problem involves a lot of physical parameters, it needs to > call a series of functions, and can not analytically construct Jacobian > matrix, so I use - snes_mf_operator to solve it, and give an approximate > Jacobian matrix as a preconditioner. Because of the large dimension of the > problem and the magnitude difference of the physical variables involved, it > is found that the linear step residuals will increase at each restart > (default 30th linear step) . This problem can be solved by setting a large > number of restart steps. I would like to ask the reasons for this > phenomenon? What knowledge or articles should I learn if I want to find out > this problem? > > I've seen this behavior. I think in your case it is likely the > -snes_mf_operator is not really producing an "accurate enough" > Jacobian-Vector product (and the "solution" being generated by GMRES may be > garbage). Run with -ksp_monitor_true_residual > I have found that GMRES is very sensitive. I tried optimizing a smoother by sending single precision data in MPI and this tanked GMRES. CG was fine with this. > > If your residual function has if () statements in it or other very > sharp changes (discontinuities) then it may not even have a true Jacobian > at the locations it is being evaluated at. In the sense that the > "Jacobian" you are applying via finite differences is not a linear operator > and hence GMRES will fail on it. > > What are you using for a preconditioner? And roughly how many KSP > iterations are being used. > > Barry > > > > > > > 2. In my problem model, there are many physical fields (variables are > realized by finite difference method), and the magnitude of variables > varies greatly. Is there any Scaling interface or function in Petsc? > > > > Thanks, > > Yingjie > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yjwu16 at gmail.com Thu Mar 21 10:01:35 2019 From: yjwu16 at gmail.com (Yingjie Wu) Date: Thu, 21 Mar 2019 23:01:35 +0800 Subject: [petsc-users] Problems about GMRES restart and Scaling In-Reply-To: References: Message-ID: Thanks for all the reply. The model I simulated is a thermal model that contains multiple physical fields(eg. temperature, pressure, velocity). In PDEs, these variables are preceded by some physical parameters, which in turn are functions of these variables(eg. 
density is a function of pressure and temperature). Due to the complexity of these physical parameter functions, we cannot explicitly construct Jacobian matrices for this problem. So I use -snes_mf_operator. My preconditioner is to treat these physical parameters as constants. At the beginning of each nonlinear step (SNES), the Jacobian matrix is updated with the output of the previous nonlinear step (the physical parameters are updated). After setting a large KSP restart, it converges in about 60 KSP iterations (ksp_rtol = 1.e-5). I have a feeling that my initial values are too large, and that this causes the phenomenon. snes/ex19 is actually a lot like my example: set up with -da_grid_x 200 -da_grid_y 200 -snes_mf, it also shows a residual rise at KSP step 1290. But not all examples produce this phenomenon.
Thanks,
Yingjie

Smith, Barry F. wrote on 2019/3/21 at 1:18 AM: > > > > On Mar 20, 2019, at 5:52 AM, Yingjie Wu via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > > Dear PETSc developers: > > Hi, > > Recently, I used PETSc to solve a non-linear PDEs for thermodynamic > problems. In the process of solving, I found the following two phenomena, > hoping to get some help and suggestions. > > > > 1. Because my problem involves a lot of physical parameters, it needs to > call a series of functions, and can not analytically construct Jacobian > matrix, so I use - snes_mf_operator to solve it, and give an approximate > Jacobian matrix as a preconditioner. Because of the large dimension of the > problem and the magnitude difference of the physical variables involved, it > is found that the linear step residuals will increase at each restart > (default 30th linear step) . This problem can be solved by setting a large > number of restart steps. I would like to ask the reasons for this > phenomenon? What knowledge or articles should I learn if I want to find out > this problem? > > I've seen this behavior. I think in your case it is likely the > -snes_mf_operator is not really producing an "accurate enough" > Jacobian-Vector product (and the "solution" being generated by GMRES may be > garbage). Run with -ksp_monitor_true_residual > > If your residual function has if () statements in it or other very > sharp changes (discontinuities) then it may not even have a true Jacobian > at the locations it is being evaluated at. In the sense that the > "Jacobian" you are applying via finite differences is not a linear operator > and hence GMRES will fail on it. > > What are you using for a preconditioner? And roughly how many KSP > iterations are being used. > > Barry > > > > > > > 2. In my problem model, there are many physical fields (variables are > realized by finite difference method), and the magnitude of variables > varies greatly. Is there any Scaling interface or function in Petsc? > > > > Thanks, > > Yingjie > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jczhang at mcs.anl.gov Thu Mar 21 10:51:31 2019 From: jczhang at mcs.anl.gov (Zhang, Junchao) Date: Thu, 21 Mar 2019 15:51:31 +0000 Subject: [petsc-users] Valgrind Issue With Ghosted Vectors In-Reply-To: References: Message-ID: Thanks to Stefano for fixing this bug. His fix is easy to apply (two-line change) and therefore should be tried first. --Junchao Zhang On Thu, Mar 21, 2019 at 3:02 AM Stefano Zampini > wrote: Derek I have fixed the optimized plan few weeks ago https://bitbucket.org/petsc/petsc/commits/c3caad8634d376283f7053f3b388606b45b3122c Maybe this will fix your problem too?
Stefano Il Gio 21 Mar 2019, 04:21 Zhang, Junchao via petsc-users > ha scritto: Hi, Derek, Try to apply this tiny (but dirty) patch on your version of PETSc to disable the VecScatterMemcpyPlan optimization to see if it helps. Thanks. --Junchao Zhang On Wed, Mar 20, 2019 at 6:33 PM Junchao Zhang > wrote: Did you see the warning with small scale runs? Is it possible to provide a test code? You mentioned "changing PETSc now would be pretty painful". Is it because it will affect your performance (but not your code)? If yes, could you try PETSc master and run you code with or without -vecscatter_type sf. I want to isolate the problem and see if it is due to possible bugs in VecScatter. If the above suggestion is not feasible, I will disable VecScatterMemcpy. It is an optimization I added. Sorry I did not have an option to turn off it because I thought it was always useful:) I will provide you a patch later to disable it. With that you can run again to isolate possible bugs in VecScatterMemcpy. Thanks. --Junchao Zhang On Wed, Mar 20, 2019 at 5:40 PM Derek Gaston via petsc-users > wrote: Trying to track down some memory corruption I'm seeing on larger scale runs (3.5B+ unknowns). Was able to run Valgrind on it... and I'm seeing quite a lot of uninitialized value errors coming from ghost updating. Here are some of the traces: ==87695== Conditional jump or move depends on uninitialised value(s) ==87695== at 0x73236D3: PetscMallocAlign (mal.c:28) ==87695== by 0x7323C70: PetscMallocA (mal.c:390) ==87695== by 0x739048E: VecScatterMemcpyPlanCreate_Index (vscat.c:284) ==87695== by 0x73A5D97: VecScatterMemcpyPlanCreate_PtoP (vpscat_mpi1.c:312) ==64730== by 0x7393E8A: VecScatterSetUp_vectype_private (vscat.c:857) ==64730== by 0x7395E5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) ==64730== by 0x73DDD39: VecScatterSetUp (vscatfce.c:212) ==64730== by 0x73DCD73: VecScatterCreateWithData (vscreate.c:333) ==64730== by 0x7444232: VecCreateGhostWithArray (pbvec.c:685) ==64730== by 0x744490D: VecCreateGhost (pbvec.c:741) ==133582== Conditional jump or move depends on uninitialised value(s) ==133582== at 0x4030384: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:1034) ==133582== by 0x739E4F9: PetscMemcpy (petscsys.h:1649) ==133582== by 0x739E4F9: VecScatterMemcpyPlanExecute_Pack (vecscatterimpl.h:150) ==133582== by 0x739E4F9: VecScatterBeginMPI1_1 (vpscat_mpi1.h:69) ==133582== by 0x73DD964: VecScatterBegin (vscatfce.c:110) ==133582== by 0x744E195: VecGhostUpdateBegin (commonmpvec.c:225) This is from a Git checkout of PETSc... the hash I branched from is: 0e667e8fea4aa from December 23rd (updating would be really hard at this point as I've completed 90% of my dissertation with this version... and changing PETSc now would be pretty painful!). Any ideas? Is it possible it's in my code? Is it possible that there are later PETSc commits that already fix this? Thanks for any help, Derek -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Mar 21 11:15:46 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Thu, 21 Mar 2019 16:15:46 +0000 Subject: [petsc-users] Valgrind Issue With Ghosted Vectors In-Reply-To: References: Message-ID: Does maint also need this fix? Satish On Thu, 21 Mar 2019, Stefano Zampini via petsc-users wrote: > Derek > > I have fixed the optimized plan few weeks ago > > https://bitbucket.org/petsc/petsc/commits/c3caad8634d376283f7053f3b388606b45b3122c > > Maybe this will fix your problem too? 
> > Stefano > > > Il Gio 21 Mar 2019, 04:21 Zhang, Junchao via petsc-users < > petsc-users at mcs.anl.gov> ha scritto: > > > Hi, Derek, > > Try to apply this tiny (but dirty) patch on your version of PETSc to > > disable the VecScatterMemcpyPlan optimization to see if it helps. > > Thanks. > > --Junchao Zhang > > > > On Wed, Mar 20, 2019 at 6:33 PM Junchao Zhang wrote: > > > >> Did you see the warning with small scale runs? Is it possible to provide > >> a test code? > >> You mentioned "changing PETSc now would be pretty painful". Is it because > >> it will affect your performance (but not your code)? If yes, could you try > >> PETSc master and run you code with or without -vecscatter_type sf. I want > >> to isolate the problem and see if it is due to possible bugs in VecScatter. > >> If the above suggestion is not feasible, I will disable VecScatterMemcpy. > >> It is an optimization I added. Sorry I did not have an option to turn off > >> it because I thought it was always useful:) I will provide you a patch > >> later to disable it. With that you can run again to isolate possible bugs > >> in VecScatterMemcpy. > >> Thanks. > >> --Junchao Zhang > >> > >> > >> On Wed, Mar 20, 2019 at 5:40 PM Derek Gaston via petsc-users < > >> petsc-users at mcs.anl.gov> wrote: > >> > >>> Trying to track down some memory corruption I'm seeing on larger scale > >>> runs (3.5B+ unknowns). Was able to run Valgrind on it... and I'm seeing > >>> quite a lot of uninitialized value errors coming from ghost updating. Here > >>> are some of the traces: > >>> > >>> ==87695== Conditional jump or move depends on uninitialised value(s) > >>> ==87695== at 0x73236D3: PetscMallocAlign (mal.c:28) > >>> ==87695== by 0x7323C70: PetscMallocA (mal.c:390) > >>> ==87695== by 0x739048E: VecScatterMemcpyPlanCreate_Index (vscat.c:284) > >>> ==87695== by 0x73A5D97: VecScatterMemcpyPlanCreate_PtoP > >>> (vpscat_mpi1.c:312) > >>> ==64730== by 0x7393E8A: VecScatterSetUp_vectype_private (vscat.c:857) > >>> ==64730== by 0x7395E5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) > >>> ==64730== by 0x73DDD39: VecScatterSetUp (vscatfce.c:212) > >>> ==64730== by 0x73DCD73: VecScatterCreateWithData (vscreate.c:333) > >>> ==64730== by 0x7444232: VecCreateGhostWithArray (pbvec.c:685) > >>> ==64730== by 0x744490D: VecCreateGhost (pbvec.c:741) > >>> > >>> ==133582== Conditional jump or move depends on uninitialised value(s) > >>> ==133582== at 0x4030384: memcpy@@GLIBC_2.14 > >>> (vg_replace_strmem.c:1034) > >>> ==133582== by 0x739E4F9: PetscMemcpy (petscsys.h:1649) > >>> ==133582== by 0x739E4F9: VecScatterMemcpyPlanExecute_Pack > >>> (vecscatterimpl.h:150) > >>> ==133582== by 0x739E4F9: VecScatterBeginMPI1_1 (vpscat_mpi1.h:69) > >>> ==133582== by 0x73DD964: VecScatterBegin (vscatfce.c:110) > >>> ==133582== by 0x744E195: VecGhostUpdateBegin (commonmpvec.c:225) > >>> > >>> This is from a Git checkout of PETSc... the hash I branched from is: > >>> 0e667e8fea4aa from December 23rd (updating would be really hard at this > >>> point as I've completed 90% of my dissertation with this version... and > >>> changing PETSc now would be pretty painful!). > >>> > >>> Any ideas? Is it possible it's in my code? Is it possible that there > >>> are later PETSc commits that already fix this? 
> >>> > >>> Thanks for any help, > >>> Derek > >>> > >>> > From jczhang at mcs.anl.gov Thu Mar 21 11:21:11 2019 From: jczhang at mcs.anl.gov (Zhang, Junchao) Date: Thu, 21 Mar 2019 16:21:11 +0000 Subject: [petsc-users] Valgrind Issue With Ghosted Vectors In-Reply-To: <9f95dd707d674d418bc871c0048cb8c5@BN7PR09MB2835.namprd09.prod.outlook.com> References: <9f95dd707d674d418bc871c0048cb8c5@BN7PR09MB2835.namprd09.prod.outlook.com> Message-ID: Yes, it does. It is a bug. --Junchao Zhang On Thu, Mar 21, 2019 at 11:16 AM Balay, Satish > wrote: Does maint also need this fix? Satish On Thu, 21 Mar 2019, Stefano Zampini via petsc-users wrote: > Derek > > I have fixed the optimized plan few weeks ago > > https://bitbucket.org/petsc/petsc/commits/c3caad8634d376283f7053f3b388606b45b3122c > > Maybe this will fix your problem too? > > Stefano > > > Il Gio 21 Mar 2019, 04:21 Zhang, Junchao via petsc-users < > petsc-users at mcs.anl.gov> ha scritto: > > > Hi, Derek, > > Try to apply this tiny (but dirty) patch on your version of PETSc to > > disable the VecScatterMemcpyPlan optimization to see if it helps. > > Thanks. > > --Junchao Zhang > > > > On Wed, Mar 20, 2019 at 6:33 PM Junchao Zhang > wrote: > > > >> Did you see the warning with small scale runs? Is it possible to provide > >> a test code? > >> You mentioned "changing PETSc now would be pretty painful". Is it because > >> it will affect your performance (but not your code)? If yes, could you try > >> PETSc master and run you code with or without -vecscatter_type sf. I want > >> to isolate the problem and see if it is due to possible bugs in VecScatter. > >> If the above suggestion is not feasible, I will disable VecScatterMemcpy. > >> It is an optimization I added. Sorry I did not have an option to turn off > >> it because I thought it was always useful:) I will provide you a patch > >> later to disable it. With that you can run again to isolate possible bugs > >> in VecScatterMemcpy. > >> Thanks. > >> --Junchao Zhang > >> > >> > >> On Wed, Mar 20, 2019 at 5:40 PM Derek Gaston via petsc-users < > >> petsc-users at mcs.anl.gov> wrote: > >> > >>> Trying to track down some memory corruption I'm seeing on larger scale > >>> runs (3.5B+ unknowns). Was able to run Valgrind on it... and I'm seeing > >>> quite a lot of uninitialized value errors coming from ghost updating. 
Here > >>> are some of the traces: > >>> > >>> ==87695== Conditional jump or move depends on uninitialised value(s) > >>> ==87695== at 0x73236D3: PetscMallocAlign (mal.c:28) > >>> ==87695== by 0x7323C70: PetscMallocA (mal.c:390) > >>> ==87695== by 0x739048E: VecScatterMemcpyPlanCreate_Index (vscat.c:284) > >>> ==87695== by 0x73A5D97: VecScatterMemcpyPlanCreate_PtoP > >>> (vpscat_mpi1.c:312) > >>> ==64730== by 0x7393E8A: VecScatterSetUp_vectype_private (vscat.c:857) > >>> ==64730== by 0x7395E5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) > >>> ==64730== by 0x73DDD39: VecScatterSetUp (vscatfce.c:212) > >>> ==64730== by 0x73DCD73: VecScatterCreateWithData (vscreate.c:333) > >>> ==64730== by 0x7444232: VecCreateGhostWithArray (pbvec.c:685) > >>> ==64730== by 0x744490D: VecCreateGhost (pbvec.c:741) > >>> > >>> ==133582== Conditional jump or move depends on uninitialised value(s) > >>> ==133582== at 0x4030384: memcpy@@GLIBC_2.14 > >>> (vg_replace_strmem.c:1034) > >>> ==133582== by 0x739E4F9: PetscMemcpy (petscsys.h:1649) > >>> ==133582== by 0x739E4F9: VecScatterMemcpyPlanExecute_Pack > >>> (vecscatterimpl.h:150) > >>> ==133582== by 0x739E4F9: VecScatterBeginMPI1_1 (vpscat_mpi1.h:69) > >>> ==133582== by 0x73DD964: VecScatterBegin (vscatfce.c:110) > >>> ==133582== by 0x744E195: VecGhostUpdateBegin (commonmpvec.c:225) > >>> > >>> This is from a Git checkout of PETSc... the hash I branched from is: > >>> 0e667e8fea4aa from December 23rd (updating would be really hard at this > >>> point as I've completed 90% of my dissertation with this version... and > >>> changing PETSc now would be pretty painful!). > >>> > >>> Any ideas? Is it possible it's in my code? Is it possible that there > >>> are later PETSc commits that already fix this? > >>> > >>> Thanks for any help, > >>> Derek > >>> > >>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Mar 21 11:28:12 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Thu, 21 Mar 2019 16:28:12 +0000 Subject: [petsc-users] Valgrind Issue With Ghosted Vectors In-Reply-To: References: <9f95dd707d674d418bc871c0048cb8c5@BN7PR09MB2835.namprd09.prod.outlook.com> Message-ID: Ok - cherrypicked and pushed to maint. Satish On Thu, 21 Mar 2019, Zhang, Junchao via petsc-users wrote: > Yes, it does. It is a bug. > --Junchao Zhang > > > On Thu, Mar 21, 2019 at 11:16 AM Balay, Satish > wrote: > Does maint also need this fix? > > Satish > > On Thu, 21 Mar 2019, Stefano Zampini via petsc-users wrote: > > > Derek > > > > I have fixed the optimized plan few weeks ago > > > > https://bitbucket.org/petsc/petsc/commits/c3caad8634d376283f7053f3b388606b45b3122c > > > > Maybe this will fix your problem too? > > > > Stefano > > > > > > Il Gio 21 Mar 2019, 04:21 Zhang, Junchao via petsc-users < > > petsc-users at mcs.anl.gov> ha scritto: > > > > > Hi, Derek, > > > Try to apply this tiny (but dirty) patch on your version of PETSc to > > > disable the VecScatterMemcpyPlan optimization to see if it helps. > > > Thanks. > > > --Junchao Zhang > > > > > > On Wed, Mar 20, 2019 at 6:33 PM Junchao Zhang > wrote: > > > > > >> Did you see the warning with small scale runs? Is it possible to provide > > >> a test code? > > >> You mentioned "changing PETSc now would be pretty painful". Is it because > > >> it will affect your performance (but not your code)? If yes, could you try > > >> PETSc master and run you code with or without -vecscatter_type sf. I want > > >> to isolate the problem and see if it is due to possible bugs in VecScatter. 
> > >> If the above suggestion is not feasible, I will disable VecScatterMemcpy. > > >> It is an optimization I added. Sorry I did not have an option to turn off > > >> it because I thought it was always useful:) I will provide you a patch > > >> later to disable it. With that you can run again to isolate possible bugs > > >> in VecScatterMemcpy. > > >> Thanks. > > >> --Junchao Zhang > > >> > > >> > > >> On Wed, Mar 20, 2019 at 5:40 PM Derek Gaston via petsc-users < > > >> petsc-users at mcs.anl.gov> wrote: > > >> > > >>> Trying to track down some memory corruption I'm seeing on larger scale > > >>> runs (3.5B+ unknowns). Was able to run Valgrind on it... and I'm seeing > > >>> quite a lot of uninitialized value errors coming from ghost updating. Here > > >>> are some of the traces: > > >>> > > >>> ==87695== Conditional jump or move depends on uninitialised value(s) > > >>> ==87695== at 0x73236D3: PetscMallocAlign (mal.c:28) > > >>> ==87695== by 0x7323C70: PetscMallocA (mal.c:390) > > >>> ==87695== by 0x739048E: VecScatterMemcpyPlanCreate_Index (vscat.c:284) > > >>> ==87695== by 0x73A5D97: VecScatterMemcpyPlanCreate_PtoP > > >>> (vpscat_mpi1.c:312) > > >>> ==64730== by 0x7393E8A: VecScatterSetUp_vectype_private (vscat.c:857) > > >>> ==64730== by 0x7395E5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) > > >>> ==64730== by 0x73DDD39: VecScatterSetUp (vscatfce.c:212) > > >>> ==64730== by 0x73DCD73: VecScatterCreateWithData (vscreate.c:333) > > >>> ==64730== by 0x7444232: VecCreateGhostWithArray (pbvec.c:685) > > >>> ==64730== by 0x744490D: VecCreateGhost (pbvec.c:741) > > >>> > > >>> ==133582== Conditional jump or move depends on uninitialised value(s) > > >>> ==133582== at 0x4030384: memcpy@@GLIBC_2.14 > > >>> (vg_replace_strmem.c:1034) > > >>> ==133582== by 0x739E4F9: PetscMemcpy (petscsys.h:1649) > > >>> ==133582== by 0x739E4F9: VecScatterMemcpyPlanExecute_Pack > > >>> (vecscatterimpl.h:150) > > >>> ==133582== by 0x739E4F9: VecScatterBeginMPI1_1 (vpscat_mpi1.h:69) > > >>> ==133582== by 0x73DD964: VecScatterBegin (vscatfce.c:110) > > >>> ==133582== by 0x744E195: VecGhostUpdateBegin (commonmpvec.c:225) > > >>> > > >>> This is from a Git checkout of PETSc... the hash I branched from is: > > >>> 0e667e8fea4aa from December 23rd (updating would be really hard at this > > >>> point as I've completed 90% of my dissertation with this version... and > > >>> changing PETSc now would be pretty painful!). > > >>> > > >>> Any ideas? Is it possible it's in my code? Is it possible that there > > >>> are later PETSc commits that already fix this? > > >>> > > >>> Thanks for any help, > > >>> Derek > > >>> > > >>> > > > > From stefano.zampini at gmail.com Thu Mar 21 11:38:34 2019 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Thu, 21 Mar 2019 17:38:34 +0100 Subject: [petsc-users] Valgrind Issue With Ghosted Vectors In-Reply-To: References: Message-ID: Il giorno mer 20 mar 2019 alle ore 23:40 Derek Gaston via petsc-users < petsc-users at mcs.anl.gov> ha scritto: > Trying to track down some memory corruption I'm seeing on larger scale > runs (3.5B+ unknowns). > Uhm.... are you using 32bit indices? is it possible there's integer overflow somewhere? > Was able to run Valgrind on it... and I'm seeing quite a lot of > uninitialized value errors coming from ghost updating. 
Here are some of > the traces: > > ==87695== Conditional jump or move depends on uninitialised value(s) > ==87695== at 0x73236D3: PetscMallocAlign (mal.c:28) > ==87695== by 0x7323C70: PetscMallocA (mal.c:390) > ==87695== by 0x739048E: VecScatterMemcpyPlanCreate_Index (vscat.c:284) > ==87695== by 0x73A5D97: VecScatterMemcpyPlanCreate_PtoP > (vpscat_mpi1.c:312) > ==64730== by 0x7393E8A: VecScatterSetUp_vectype_private (vscat.c:857) > ==64730== by 0x7395E5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) > ==64730== by 0x73DDD39: VecScatterSetUp (vscatfce.c:212) > ==64730== by 0x73DCD73: VecScatterCreateWithData (vscreate.c:333) > ==64730== by 0x7444232: VecCreateGhostWithArray (pbvec.c:685) > ==64730== by 0x744490D: VecCreateGhost (pbvec.c:741) > > ==133582== Conditional jump or move depends on uninitialised value(s) > ==133582== at 0x4030384: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:1034) > ==133582== by 0x739E4F9: PetscMemcpy (petscsys.h:1649) > ==133582== by 0x739E4F9: VecScatterMemcpyPlanExecute_Pack > (vecscatterimpl.h:150) > ==133582== by 0x739E4F9: VecScatterBeginMPI1_1 (vpscat_mpi1.h:69) > ==133582== by 0x73DD964: VecScatterBegin (vscatfce.c:110) > ==133582== by 0x744E195: VecGhostUpdateBegin (commonmpvec.c:225) > > This is from a Git checkout of PETSc... the hash I branched from is: > 0e667e8fea4aa from December 23rd (updating would be really hard at this > point as I've completed 90% of my dissertation with this version... and > changing PETSc now would be pretty painful!). > > Any ideas? Is it possible it's in my code? Is it possible that there are > later PETSc commits that already fix this? > > Thanks for any help, > Derek > > -- Stefano -------------- next part -------------- An HTML attachment was scrubbed... URL: From friedmud at gmail.com Thu Mar 21 13:59:35 2019 From: friedmud at gmail.com (Derek Gaston) Date: Thu, 21 Mar 2019 12:59:35 -0600 Subject: [petsc-users] Valgrind Issue With Ghosted Vectors In-Reply-To: References: Message-ID: It sounds like you already tracked this down... 
but for completeness here is what track-origins gives: ==262923== Conditional jump or move depends on uninitialised value(s) ==262923== at 0x73C6548: VecScatterMemcpyPlanCreate_Index (vscat.c:294) ==262923== by 0x73DBD97: VecScatterMemcpyPlanCreate_PtoP (vpscat_mpi1.c:312) ==262923== by 0x73DE6AE: VecScatterCreateCommon_PtoS_MPI1 (vpscat_mpi1.c:2328) ==262923== by 0x73DFFEA: VecScatterCreateLocal_PtoS_MPI1 (vpscat_mpi1.c:2202) ==262923== by 0x73C7A51: VecScatterCreate_PtoS (vscat.c:608) ==262923== by 0x73C9E8A: VecScatterSetUp_vectype_private (vscat.c:857) ==262923== by 0x73CBE5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) ==262923== by 0x7413D39: VecScatterSetUp (vscatfce.c:212) ==262923== by 0x7412D73: VecScatterCreateWithData (vscreate.c:333) ==262923== by 0x747A232: VecCreateGhostWithArray (pbvec.c:685) ==262923== by 0x747A90D: VecCreateGhost (pbvec.c:741) ==262923== by 0x5C7FFD6: libMesh::PetscVector::init(unsigned long, unsigned long, std::vector > const&, bool, libMesh::ParallelType) (petsc_vector.h:752) ==262923== Uninitialised value was created by a heap allocation ==262923== at 0x402DDC6: memalign (vg_replace_malloc.c:899) ==262923== by 0x7359702: PetscMallocAlign (mal.c:41) ==262923== by 0x7359C70: PetscMallocA (mal.c:390) ==262923== by 0x73DECF0: VecScatterCreateLocal_PtoS_MPI1 (vpscat_mpi1.c:2061) ==262923== by 0x73C7A51: VecScatterCreate_PtoS (vscat.c:608) ==262923== by 0x73C9E8A: VecScatterSetUp_vectype_private (vscat.c:857) ==262923== by 0x73CBE5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) ==262923== by 0x7413D39: VecScatterSetUp (vscatfce.c:212) ==262923== by 0x7412D73: VecScatterCreateWithData (vscreate.c:333) ==262923== by 0x747A232: VecCreateGhostWithArray (pbvec.c:685) ==262923== by 0x747A90D: VecCreateGhost (pbvec.c:741) ==262923== by 0x5C7FFD6: libMesh::PetscVector::init(unsigned long, unsigned long, std::vector > const&, bool, libMesh::ParallelType) (petsc_vector.h:752) BTW: This turned out not to be my actual problem. My actual problem was just some stupidity on my part... just a simple input parameter issue to my code (should have had better error checking!). But: It sounds like my digging may have uncovered something real here... so it wasn't completely useless :-) Thanks for your help everyone! Derek On Thu, Mar 21, 2019 at 10:38 AM Stefano Zampini wrote: > > > Il giorno mer 20 mar 2019 alle ore 23:40 Derek Gaston via petsc-users < > petsc-users at mcs.anl.gov> ha scritto: > >> Trying to track down some memory corruption I'm seeing on larger scale >> runs (3.5B+ unknowns). >> > > Uhm.... are you using 32bit indices? is it possible there's integer > overflow somewhere? > > > >> Was able to run Valgrind on it... and I'm seeing quite a lot of >> uninitialized value errors coming from ghost updating. 
Here are some of >> the traces: >> >> ==87695== Conditional jump or move depends on uninitialised value(s) >> ==87695== at 0x73236D3: PetscMallocAlign (mal.c:28) >> ==87695== by 0x7323C70: PetscMallocA (mal.c:390) >> ==87695== by 0x739048E: VecScatterMemcpyPlanCreate_Index (vscat.c:284) >> ==87695== by 0x73A5D97: VecScatterMemcpyPlanCreate_PtoP >> (vpscat_mpi1.c:312) >> ==64730== by 0x7393E8A: VecScatterSetUp_vectype_private (vscat.c:857) >> ==64730== by 0x7395E5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) >> ==64730== by 0x73DDD39: VecScatterSetUp (vscatfce.c:212) >> ==64730== by 0x73DCD73: VecScatterCreateWithData (vscreate.c:333) >> ==64730== by 0x7444232: VecCreateGhostWithArray (pbvec.c:685) >> ==64730== by 0x744490D: VecCreateGhost (pbvec.c:741) >> >> ==133582== Conditional jump or move depends on uninitialised value(s) >> ==133582== at 0x4030384: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:1034) >> ==133582== by 0x739E4F9: PetscMemcpy (petscsys.h:1649) >> ==133582== by 0x739E4F9: VecScatterMemcpyPlanExecute_Pack >> (vecscatterimpl.h:150) >> ==133582== by 0x739E4F9: VecScatterBeginMPI1_1 (vpscat_mpi1.h:69) >> ==133582== by 0x73DD964: VecScatterBegin (vscatfce.c:110) >> ==133582== by 0x744E195: VecGhostUpdateBegin (commonmpvec.c:225) >> >> This is from a Git checkout of PETSc... the hash I branched from is: >> 0e667e8fea4aa from December 23rd (updating would be really hard at this >> point as I've completed 90% of my dissertation with this version... and >> changing PETSc now would be pretty painful!). >> >> Any ideas? Is it possible it's in my code? Is it possible that there >> are later PETSc commits that already fix this? >> >> Thanks for any help, >> Derek >> >> > > -- > Stefano > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jczhang at mcs.anl.gov Thu Mar 21 16:01:44 2019 From: jczhang at mcs.anl.gov (Zhang, Junchao) Date: Thu, 21 Mar 2019 21:01:44 +0000 Subject: [petsc-users] Valgrind Issue With Ghosted Vectors In-Reply-To: References: Message-ID: On Thu, Mar 21, 2019 at 1:57 PM Derek Gaston via petsc-users > wrote: It sounds like you already tracked this down... but for completeness here is what track-origins gives: ==262923== Conditional jump or move depends on uninitialised value(s) ==262923== at 0x73C6548: VecScatterMemcpyPlanCreate_Index (vscat.c:294) ==262923== by 0x73DBD97: VecScatterMemcpyPlanCreate_PtoP (vpscat_mpi1.c:312) ==262923== by 0x73DE6AE: VecScatterCreateCommon_PtoS_MPI1 (vpscat_mpi1.c:2328) ==262923== by 0x73DFFEA: VecScatterCreateLocal_PtoS_MPI1 (vpscat_mpi1.c:2202) ==262923== by 0x73C7A51: VecScatterCreate_PtoS (vscat.c:608) ==262923== by 0x73C9E8A: VecScatterSetUp_vectype_private (vscat.c:857) ==262923== by 0x73CBE5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) ==262923== by 0x7413D39: VecScatterSetUp (vscatfce.c:212) ==262923== by 0x7412D73: VecScatterCreateWithData (vscreate.c:333) ==262923== by 0x747A232: VecCreateGhostWithArray (pbvec.c:685) ==262923== by 0x747A90D: VecCreateGhost (pbvec.c:741) ==262923== by 0x5C7FFD6: libMesh::PetscVector::init(unsigned long, unsigned long, std::vector > const&, bool, libMesh::ParallelType) (petsc_vector.h:752) ==262923== Uninitialised value was created by a heap allocation I checked the code but could not figure out what was wrong. Perhaps you should use 64-bit integers and see whether the warning still exists. Please remember to incorporate Stefano's bug fix. 
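For reference, 64-bit PetscInt is a configure-time switch, so it needs its own build. A rough sketch, keeping whatever other options you normally pass (the arch name arch-64idx is just an example):

  ./configure PETSC_ARCH=arch-64idx --with-64-bit-indices=1 [other configure options]
  make PETSC_ARCH=arch-64idx all

With 3.5B+ unknowns the global indices already exceed the 32-bit signed integer range, so this is worth trying independently of the valgrind warning. The rest of the track-origins report, for completeness: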
==262923== at 0x402DDC6: memalign (vg_replace_malloc.c:899) ==262923== by 0x7359702: PetscMallocAlign (mal.c:41) ==262923== by 0x7359C70: PetscMallocA (mal.c:390) ==262923== by 0x73DECF0: VecScatterCreateLocal_PtoS_MPI1 (vpscat_mpi1.c:2061) ==262923== by 0x73C7A51: VecScatterCreate_PtoS (vscat.c:608) ==262923== by 0x73C9E8A: VecScatterSetUp_vectype_private (vscat.c:857) ==262923== by 0x73CBE5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) ==262923== by 0x7413D39: VecScatterSetUp (vscatfce.c:212) ==262923== by 0x7412D73: VecScatterCreateWithData (vscreate.c:333) ==262923== by 0x747A232: VecCreateGhostWithArray (pbvec.c:685) ==262923== by 0x747A90D: VecCreateGhost (pbvec.c:741) ==262923== by 0x5C7FFD6: libMesh::PetscVector::init(unsigned long, unsigned long, std::vector > const&, bool, libMesh::ParallelType) (petsc_vector.h:752) BTW: This turned out not to be my actual problem. My actual problem was just some stupidity on my part... just a simple input parameter issue to my code (should have had better error checking!). But: It sounds like my digging may have uncovered something real here... so it wasn't completely useless :-) Thanks for your help everyone! Derek On Thu, Mar 21, 2019 at 10:38 AM Stefano Zampini > wrote: Il giorno mer 20 mar 2019 alle ore 23:40 Derek Gaston via petsc-users > ha scritto: Trying to track down some memory corruption I'm seeing on larger scale runs (3.5B+ unknowns). Uhm.... are you using 32bit indices? is it possible there's integer overflow somewhere? Was able to run Valgrind on it... and I'm seeing quite a lot of uninitialized value errors coming from ghost updating. Here are some of the traces: ==87695== Conditional jump or move depends on uninitialised value(s) ==87695== at 0x73236D3: PetscMallocAlign (mal.c:28) ==87695== by 0x7323C70: PetscMallocA (mal.c:390) ==87695== by 0x739048E: VecScatterMemcpyPlanCreate_Index (vscat.c:284) ==87695== by 0x73A5D97: VecScatterMemcpyPlanCreate_PtoP (vpscat_mpi1.c:312) ==64730== by 0x7393E8A: VecScatterSetUp_vectype_private (vscat.c:857) ==64730== by 0x7395E5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) ==64730== by 0x73DDD39: VecScatterSetUp (vscatfce.c:212) ==64730== by 0x73DCD73: VecScatterCreateWithData (vscreate.c:333) ==64730== by 0x7444232: VecCreateGhostWithArray (pbvec.c:685) ==64730== by 0x744490D: VecCreateGhost (pbvec.c:741) ==133582== Conditional jump or move depends on uninitialised value(s) ==133582== at 0x4030384: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:1034) ==133582== by 0x739E4F9: PetscMemcpy (petscsys.h:1649) ==133582== by 0x739E4F9: VecScatterMemcpyPlanExecute_Pack (vecscatterimpl.h:150) ==133582== by 0x739E4F9: VecScatterBeginMPI1_1 (vpscat_mpi1.h:69) ==133582== by 0x73DD964: VecScatterBegin (vscatfce.c:110) ==133582== by 0x744E195: VecGhostUpdateBegin (commonmpvec.c:225) This is from a Git checkout of PETSc... the hash I branched from is: 0e667e8fea4aa from December 23rd (updating would be really hard at this point as I've completed 90% of my dissertation with this version... and changing PETSc now would be pretty painful!). Any ideas? Is it possible it's in my code? Is it possible that there are later PETSc commits that already fix this? Thanks for any help, Derek -- Stefano -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jychang48 at gmail.com Thu Mar 21 16:47:30 2019 From: jychang48 at gmail.com (Justin Chang) Date: Thu, 21 Mar 2019 15:47:30 -0600 Subject: [petsc-users] DMCreateSubDM() not available in petsc4py Message-ID: Hi all, I'm writing a petsc4py routine to manually create nested fieldsplits using index sets, and it looks like whenever I move onto the next level of splits I need to rescale the IS's. >From the PCFieldSplitSetDefault() routine, it looks like DMCreateSubDM() does exactly this here . However, I see no such Python wrapper for this function. While I could write my own, I kind of have my hands tied behind my back by using the petsc4py that's attached to FEniCS - I wonder if there's a general strategy or workaround for writing your own recursive fieldsplitsetIS. Thanks, Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu Mar 21 16:52:28 2019 From: jed at jedbrown.org (Jed Brown) Date: Thu, 21 Mar 2019 15:52:28 -0600 Subject: [petsc-users] DMCreateSubDM() not available in petsc4py In-Reply-To: References: Message-ID: <874l7vrj37.fsf@jedbrown.org> Justin Chang via petsc-users writes: > Hi all, > > I'm writing a petsc4py routine to manually create nested fieldsplits using > index sets, and it looks like whenever I move onto the next level of splits > I need to rescale the IS's. > > From the PCFieldSplitSetDefault() routine, it looks like DMCreateSubDM() > does exactly this here > . > However, I see no such Python wrapper for this function. While I could > write my own, I kind of have my hands tied behind my back by using the > petsc4py that's attached to FEniCS What do you mean "attached to FEniCS"? > - I wonder if there's a general strategy or workaround for writing > your own recursive fieldsplitsetIS. > > Thanks, > Justin From maahi.buet at gmail.com Thu Mar 21 21:09:13 2019 From: maahi.buet at gmail.com (Maahi Talukder) Date: Thu, 21 Mar 2019 22:09:13 -0400 Subject: [petsc-users] About Configuring PETSc Message-ID: Dear All, Currently, I am running PETSc with debugging option. And it says that if I run ./configure --with-debugging=no, the performance would be faster. My question is: what would I do if I want to go back to debugging mode, and If I configure it now with no debugging option, would it make any changes to my current setting? Regards, Maahi Talukder -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Mar 21 21:14:19 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Fri, 22 Mar 2019 02:14:19 +0000 Subject: [petsc-users] About Configuring PETSc In-Reply-To: References: Message-ID: PETSc uses the concept of PETSC_ARCH to enable multiple in-place builds. So you can have a debug build with PETSC_ARCH=arch-debug, and a optimized build with PETSC_ARCH=arch-opt etc. And if you are using a petsc formatted makefile with your code - you can switch between these builds by just switching PETSC_ARCH. Satish On Thu, 21 Mar 2019, Maahi Talukder via petsc-users wrote: > Dear All, > > Currently, I am running PETSc with debugging option. And it says that if I > run ./configure --with-debugging=no, the performance would be faster. My > question is: what would I do if I want to go back to debugging mode, and If > I configure it now with no debugging option, would it make any changes to > my current setting? 
> > Regards, > Maahi Talukder > From maahi.buet at gmail.com Thu Mar 21 21:30:24 2019 From: maahi.buet at gmail.com (Maahi Talukder) Date: Thu, 21 Mar 2019 22:30:24 -0400 Subject: [petsc-users] About Configuring PETSc In-Reply-To: References: Message-ID: Thank you for your reply. So do I need to set the value of PETSC_ARCH as needed in .bashrc as I did in case of PETSC_DIR ? And by PETSC_ARCH=arch-opt, do you mean the non-debugging mode? And I am using the following makefile with my code- CFLAGS = FFLAGS =-I/home/maahi/petsc/include -I/home/maahi/petsc/arch-linux2-c-debug/include -cpp -mcmodel=large CPPFLAGS = FPPFLAGS = include ${PETSC_DIR}/lib/petsc/conf/variables include ${PETSC_DIR}/lib/petsc/conf/rules wholetest1: wholetest1.o -${FLINKER} -o wholetest1 wholetest1.o ${PETSC_LIB} ${RM} wholetest1.o So where do I add that PETSC_ARCH? Thanks, Maahi Talukder On Thu, Mar 21, 2019 at 10:14 PM Balay, Satish wrote: > PETSc uses the concept of PETSC_ARCH to enable multiple in-place > builds. > > So you can have a debug build with PETSC_ARCH=arch-debug, and a > optimized build with PETSC_ARCH=arch-opt etc. > > And if you are using a petsc formatted makefile with your code - you > can switch between these builds by just switching PETSC_ARCH. > > Satish > > On Thu, 21 Mar 2019, Maahi Talukder via petsc-users wrote: > > > Dear All, > > > > Currently, I am running PETSc with debugging option. And it says that if > I > > run ./configure --with-debugging=no, the performance would be faster. My > > question is: what would I do if I want to go back to debugging mode, and > If > > I configure it now with no debugging option, would it make any changes to > > my current setting? > > > > Regards, > > Maahi Talukder > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Mar 21 21:43:07 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Fri, 22 Mar 2019 02:43:07 +0000 Subject: [petsc-users] About Configuring PETSc In-Reply-To: References: Message-ID: On Thu, 21 Mar 2019, Maahi Talukder via petsc-users wrote: > Thank you for your reply. > > So do I need to set the value of PETSC_ARCH as needed in .bashrc as I did > in case of PETSC_DIR ? You can specify PETSC_ARCH as an option to make. You can have a default value set in .bashrc - and change to a different value on command line. For ex: in .bashrc export PETSC_ARCH=arch-debug Now if you want to build with debug libraries: make wholetest1 Now If you want to build with optimized libraries: make PETSC_ARCH=arch-opt wholetest1 > And by PETSC_ARCH=arch-opt, do you mean the > non-debugging mode? Yes. You can use whatever name you think is appropriate here. ./configure PETSC_ARCH=a-name-i-can-easily-associate-with-this-build [other configure options.] > > And I am using the following makefile with my code- > > CFLAGS = > FFLAGS =-I/home/maahi/petsc/include > -I/home/maahi/petsc/arch-linux2-c-debug/include -cpp -mcmodel=large Hm - you shouldn't be needing these options here. You should switch your source files from .f to .F and .f90 to .F90 - and remove the above FFLAGS Satish > CPPFLAGS = > FPPFLAGS = > > > include ${PETSC_DIR}/lib/petsc/conf/variables > include ${PETSC_DIR}/lib/petsc/conf/rules > > wholetest1: wholetest1.o > -${FLINKER} -o wholetest1 wholetest1.o ${PETSC_LIB} > ${RM} wholetest1.o > > So where do I add that PETSC_ARCH? > > Thanks, > Maahi Talukder > > On Thu, Mar 21, 2019 at 10:14 PM Balay, Satish wrote: > > > PETSc uses the concept of PETSC_ARCH to enable multiple in-place > > builds. 
> > > > So you can have a debug build with PETSC_ARCH=arch-debug, and a > > optimized build with PETSC_ARCH=arch-opt etc. > > > > And if you are using a petsc formatted makefile with your code - you > > can switch between these builds by just switching PETSC_ARCH. > > > > Satish > > > > On Thu, 21 Mar 2019, Maahi Talukder via petsc-users wrote: > > > > > Dear All, > > > > > > Currently, I am running PETSc with debugging option. And it says that if > > I > > > run ./configure --with-debugging=no, the performance would be faster. My > > > question is: what would I do if I want to go back to debugging mode, and > > If > > > I configure it now with no debugging option, would it make any changes to > > > my current setting? > > > > > > Regards, > > > Maahi Talukder > > > > > > > > From maahi.buet at gmail.com Thu Mar 21 22:55:32 2019 From: maahi.buet at gmail.com (Maahi Talukder) Date: Thu, 21 Mar 2019 23:55:32 -0400 Subject: [petsc-users] About Configuring PETSc In-Reply-To: References: Message-ID: Thank you so much for your reply. That clear things up! On Thu, Mar 21, 2019 at 10:43 PM Balay, Satish wrote: > On Thu, 21 Mar 2019, Maahi Talukder via petsc-users wrote: > > > Thank you for your reply. > > > > So do I need to set the value of PETSC_ARCH as needed in .bashrc as I > did > > in case of PETSC_DIR ? > > You can specify PETSC_ARCH as an option to make. You can have a default > value set in .bashrc - and change to a different value on command line. > > For ex: in .bashrc > > export PETSC_ARCH=arch-debug > > Now if you want to build with debug libraries: > > make wholetest1 > > Now If you want to build with optimized libraries: > > make PETSC_ARCH=arch-opt wholetest1 > > > > And by PETSC_ARCH=arch-opt, do you mean the > > non-debugging mode? > > Yes. You can use whatever name you think is appropriate here. > > ./configure PETSC_ARCH=a-name-i-can-easily-associate-with-this-build > [other configure options.] > > > > > And I am using the following makefile with my code- > > > > CFLAGS = > > FFLAGS =-I/home/maahi/petsc/include > > -I/home/maahi/petsc/arch-linux2-c-debug/include -cpp -mcmodel=large > > Hm - you shouldn't be needing these options here. You should switch your > source files from .f to .F and .f90 to .F90 - and remove the above FFLAGS > > Satish > > > CPPFLAGS = > > FPPFLAGS = > > > > > > include ${PETSC_DIR}/lib/petsc/conf/variables > > include ${PETSC_DIR}/lib/petsc/conf/rules > > > > wholetest1: wholetest1.o > > -${FLINKER} -o wholetest1 wholetest1.o ${PETSC_LIB} > > ${RM} wholetest1.o > > > > So where do I add that PETSC_ARCH? > > > > Thanks, > > Maahi Talukder > > > > On Thu, Mar 21, 2019 at 10:14 PM Balay, Satish > wrote: > > > > > PETSc uses the concept of PETSC_ARCH to enable multiple in-place > > > builds. > > > > > > So you can have a debug build with PETSC_ARCH=arch-debug, and a > > > optimized build with PETSC_ARCH=arch-opt etc. > > > > > > And if you are using a petsc formatted makefile with your code - you > > > can switch between these builds by just switching PETSC_ARCH. > > > > > > Satish > > > > > > On Thu, 21 Mar 2019, Maahi Talukder via petsc-users wrote: > > > > > > > Dear All, > > > > > > > > Currently, I am running PETSc with debugging option. And it says > that if > > > I > > > > run ./configure --with-debugging=no, the performance would be > faster. My > > > > question is: what would I do if I want to go back to debugging mode, > and > > > If > > > > I configure it now with no debugging option, would it make any > changes to > > > > my current setting? 
> > > > > > > > Regards, > > > > Maahi Talukder > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From swarnava89 at gmail.com Fri Mar 22 01:19:28 2019 From: swarnava89 at gmail.com (Swarnava Ghosh) Date: Thu, 21 Mar 2019 23:19:28 -0700 Subject: [petsc-users] Consistent domain decomposition between DMDA and DMPLEX Message-ID: Dear PETSc users and developers, I am new to DMPLEX and had a query regarding setting up a consistent domain decomposition of two meshes in PETSc. I have a structured finite difference grid, managed through DMDA. I have another unstructured finite element mesh managed through DMPLEX. Now all the nodes in the unstructured finite element mesh also belong to the set of nodes in the structured finite difference mesh (but not necessarily vice-versa), and the number of nodes in DMPLEX mesh is less than the number of nodes in DMDA mesh. How can I guarantee a consistent domain decomposition of the two meshes? By consistent, I mean that if a process has a set of nodes P from DMDA, and the same process has the set of nodes Q from DMPLEX, then Q is a subset of P. I look forward to your response. Sincerely, Swarnava -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Fri Mar 22 05:29:17 2019 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Fri, 22 Mar 2019 11:29:17 +0100 Subject: [petsc-users] Valgrind Issue With Ghosted Vectors In-Reply-To: References: Message-ID: <9AAE0599-D4B2-4FCF-ADFE-011417C5CAF6@gmail.com> > On Mar 21, 2019, at 7:59 PM, Derek Gaston wrote: > > It sounds like you already tracked this down... but for completeness here is what track-origins gives: > > ==262923== Conditional jump or move depends on uninitialised value(s) > ==262923== at 0x73C6548: VecScatterMemcpyPlanCreate_Index (vscat.c:294) > ==262923== by 0x73DBD97: VecScatterMemcpyPlanCreate_PtoP (vpscat_mpi1.c:312) > ==262923== by 0x73DE6AE: VecScatterCreateCommon_PtoS_MPI1 (vpscat_mpi1.c:2328) > ==262923== by 0x73DFFEA: VecScatterCreateLocal_PtoS_MPI1 (vpscat_mpi1.c:2202) > ==262923== by 0x73C7A51: VecScatterCreate_PtoS (vscat.c:608) > ==262923== by 0x73C9E8A: VecScatterSetUp_vectype_private (vscat.c:857) > ==262923== by 0x73CBE5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) > ==262923== by 0x7413D39: VecScatterSetUp (vscatfce.c:212) > ==262923== by 0x7412D73: VecScatterCreateWithData (vscreate.c:333) > ==262923== by 0x747A232: VecCreateGhostWithArray (pbvec.c:685) > ==262923== by 0x747A90D: VecCreateGhost (pbvec.c:741) > ==262923== by 0x5C7FFD6: libMesh::PetscVector::init(unsigned long, unsigned long, std::vector > const&, bool, libMesh::ParallelType) (petsc_vector.h:752) > ==262923== Uninitialised value was created by a heap allocation > ==262923== at 0x402DDC6: memalign (vg_replace_malloc.c:899) > ==262923== by 0x7359702: PetscMallocAlign (mal.c:41) > ==262923== by 0x7359C70: PetscMallocA (mal.c:390) > ==262923== by 0x73DECF0: VecScatterCreateLocal_PtoS_MPI1 (vpscat_mpi1.c:2061) > ==262923== by 0x73C7A51: VecScatterCreate_PtoS (vscat.c:608) > ==262923== by 0x73C9E8A: VecScatterSetUp_vectype_private (vscat.c:857) > ==262923== by 0x73CBE5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) > ==262923== by 0x7413D39: VecScatterSetUp (vscatfce.c:212) > ==262923== by 0x7412D73: VecScatterCreateWithData (vscreate.c:333) > ==262923== by 0x747A232: VecCreateGhostWithArray (pbvec.c:685) > ==262923== by 0x747A90D: VecCreateGhost (pbvec.c:741) > ==262923== by 0x5C7FFD6: 
libMesh::PetscVector::init(unsigned long, unsigned long, std::vector > const&, bool, libMesh::ParallelType) (petsc_vector.h:752) > > > BTW: This turned out not to be my actual problem. My actual problem was just some stupidity on my part... just a simple input parameter issue to my code (should have had better error checking!). > > But: It sounds like my digging may have uncovered something real here... so it wasn't completely useless :-) Derek, I don?t understand. Is your problem fixed or not? Would be nice to understand what was the ?some stupidity on your part?, and if it was still leading to valid PETSc code or just to a broken setup. In the first case, then we should investigate the valgrind error you reported. In the second case, this is not a PETSc issue. > > Thanks for your help everyone! > > Derek > > > > On Thu, Mar 21, 2019 at 10:38 AM Stefano Zampini > wrote: > > > Il giorno mer 20 mar 2019 alle ore 23:40 Derek Gaston via petsc-users > ha scritto: > Trying to track down some memory corruption I'm seeing on larger scale runs (3.5B+ unknowns). > > Uhm.... are you using 32bit indices? is it possible there's integer overflow somewhere? > > > Was able to run Valgrind on it... and I'm seeing quite a lot of uninitialized value errors coming from ghost updating. Here are some of the traces: > > ==87695== Conditional jump or move depends on uninitialised value(s) > ==87695== at 0x73236D3: PetscMallocAlign (mal.c:28) > ==87695== by 0x7323C70: PetscMallocA (mal.c:390) > ==87695== by 0x739048E: VecScatterMemcpyPlanCreate_Index (vscat.c:284) > ==87695== by 0x73A5D97: VecScatterMemcpyPlanCreate_PtoP (vpscat_mpi1.c:312) > ==64730== by 0x7393E8A: VecScatterSetUp_vectype_private (vscat.c:857) > ==64730== by 0x7395E5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) > ==64730== by 0x73DDD39: VecScatterSetUp (vscatfce.c:212) > ==64730== by 0x73DCD73: VecScatterCreateWithData (vscreate.c:333) > ==64730== by 0x7444232: VecCreateGhostWithArray (pbvec.c:685) > ==64730== by 0x744490D: VecCreateGhost (pbvec.c:741) > > ==133582== Conditional jump or move depends on uninitialised value(s) > ==133582== at 0x4030384: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:1034) > ==133582== by 0x739E4F9: PetscMemcpy (petscsys.h:1649) > ==133582== by 0x739E4F9: VecScatterMemcpyPlanExecute_Pack (vecscatterimpl.h:150) > ==133582== by 0x739E4F9: VecScatterBeginMPI1_1 (vpscat_mpi1.h:69) > ==133582== by 0x73DD964: VecScatterBegin (vscatfce.c:110) > ==133582== by 0x744E195: VecGhostUpdateBegin (commonmpvec.c:225) > > This is from a Git checkout of PETSc... the hash I branched from is: 0e667e8fea4aa from December 23rd (updating would be really hard at this point as I've completed 90% of my dissertation with this version... and changing PETSc now would be pretty painful!). > > Any ideas? Is it possible it's in my code? Is it possible that there are later PETSc commits that already fix this? > > Thanks for any help, > Derek > > > > -- > Stefano -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Mar 22 07:22:54 2019 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 22 Mar 2019 08:22:54 -0400 Subject: [petsc-users] Consistent domain decomposition between DMDA and DMPLEX In-Reply-To: References: Message-ID: Matt is the authority here but I think you are going to start with: https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/PetscPartitionerRegister.html Then you would clone a "setType" method, which will for one set the partition function. 
You can cache data like the regular grid metadata in here and then your partitioner function will pick the processor that each local vertex belongs to. Mark On Fri, Mar 22, 2019 at 2:20 AM Swarnava Ghosh via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear PETSc users and developers, > > I am new to DMPLEX and had a query regarding setting up a consistent > domain decomposition of two meshes in PETSc. > I have a structured finite difference grid, managed through DMDA. I have > another unstructured finite element mesh managed through DMPLEX. Now all > the nodes in the unstructured finite element mesh also belong to the set of > nodes in the structured finite difference mesh (but not necessarily > vice-versa), and the number of nodes in DMPLEX mesh is less than the number > of nodes in DMDA mesh. How can I guarantee a consistent domain > decomposition of the two meshes? By consistent, I mean that if a process > has a set of nodes P from DMDA, and the same process has the set of nodes Q > from DMPLEX, then Q is a subset of P. > > I look forward to your response. > > Sincerely, > Swarnava > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Mar 22 07:33:15 2019 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 22 Mar 2019 08:33:15 -0400 Subject: [petsc-users] Consistent domain decomposition between DMDA and DMPLEX In-Reply-To: References: Message-ID: Sorry, you don't clone a setType method, you clone a create method such as: src/dm/impls/plex/plexpartition.c:PETSC_EXTERN PetscErrorCode PetscPartitionerCreate_Simple(PetscPartitioner part) code like: PetscPartitionerSetType (PetscPartitioner , "my_part"); will call the function that you registered with PetscPartitionerRegister. On Fri, Mar 22, 2019 at 8:22 AM Mark Adams wrote: > Matt is the authority here but I think you are going to start with: > > > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/PetscPartitionerRegister.html > > Then you would clone a "setType" method, which will for one set the > partition function. You can cache data like the regular grid metadata in > here and then your partitioner function will pick the processor that each > local vertex belongs to. > > Mark > > On Fri, Mar 22, 2019 at 2:20 AM Swarnava Ghosh via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Dear PETSc users and developers, >> >> I am new to DMPLEX and had a query regarding setting up a consistent >> domain decomposition of two meshes in PETSc. >> I have a structured finite difference grid, managed through DMDA. I have >> another unstructured finite element mesh managed through DMPLEX. Now all >> the nodes in the unstructured finite element mesh also belong to the set of >> nodes in the structured finite difference mesh (but not necessarily >> vice-versa), and the number of nodes in DMPLEX mesh is less than the number >> of nodes in DMDA mesh. How can I guarantee a consistent domain >> decomposition of the two meshes? By consistent, I mean that if a process >> has a set of nodes P from DMDA, and the same process has the set of nodes Q >> from DMPLEX, then Q is a subset of P. >> >> I look forward to your response. >> >> Sincerely, >> Swarnava >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From maahi.buet at gmail.com Fri Mar 22 14:07:07 2019 From: maahi.buet at gmail.com (Maahi Talukder) Date: Fri, 22 Mar 2019 15:07:07 -0400 Subject: [petsc-users] About Configuring PETSc In-Reply-To: References: Message-ID: Hi, I tried to run the command 'make PETSC_ARCH=arch-opt wholetest1' but it shows me the following error- .................................................................................................................................................. [maahi at CB272PP-THINK1 Desktop]$ make PETSC_ARCH=arch-opt wholetest1 /home/maahi/petsc/lib/petsc/conf/rules:960: */home/maahi/petsc/arch-opt/lib/petsc/conf/petscrules: No such file or directory* make: *** No rule to make target '/home/maahi/petsc/arch-opt/lib/petsc/conf/petscrules'. Stop. ........................................................................................................................................................ My .bashrc is the following - ..................................................................... PATH=$PATH:$HOME/.local/bin:$HOME/bin:/usr/lib64/openmpi/bin export PATH export PETSC_DIR=$HOME/petsc export PETSC_ARCH=arch-linux2-c-debug #export PETSC_ARCH=arch-debug .......................................................................................... Could you please tell me what went wrong? Regards, Maahi Talukder On Thu, Mar 21, 2019 at 11:55 PM Maahi Talukder wrote: > Thank you so much for your reply. That clear things up! > > On Thu, Mar 21, 2019 at 10:43 PM Balay, Satish wrote: > >> On Thu, 21 Mar 2019, Maahi Talukder via petsc-users wrote: >> >> > Thank you for your reply. >> > >> > So do I need to set the value of PETSC_ARCH as needed in .bashrc as I >> did >> > in case of PETSC_DIR ? >> >> You can specify PETSC_ARCH as an option to make. You can have a default >> value set in .bashrc - and change to a different value on command line. >> >> For ex: in .bashrc >> >> export PETSC_ARCH=arch-debug >> >> Now if you want to build with debug libraries: >> >> make wholetest1 >> >> Now If you want to build with optimized libraries: >> >> make PETSC_ARCH=arch-opt wholetest1 >> >> >> > And by PETSC_ARCH=arch-opt, do you mean the >> > non-debugging mode? >> >> Yes. You can use whatever name you think is appropriate here. >> >> ./configure PETSC_ARCH=a-name-i-can-easily-associate-with-this-build >> [other configure options.] >> >> > >> > And I am using the following makefile with my code- >> > >> > CFLAGS = >> > FFLAGS =-I/home/maahi/petsc/include >> > -I/home/maahi/petsc/arch-linux2-c-debug/include -cpp -mcmodel=large >> >> Hm - you shouldn't be needing these options here. You should switch your >> source files from .f to .F and .f90 to .F90 - and remove the above FFLAGS >> >> Satish >> >> > CPPFLAGS = >> > FPPFLAGS = >> > >> > >> > include ${PETSC_DIR}/lib/petsc/conf/variables >> > include ${PETSC_DIR}/lib/petsc/conf/rules >> > >> > wholetest1: wholetest1.o >> > -${FLINKER} -o wholetest1 wholetest1.o ${PETSC_LIB} >> > ${RM} wholetest1.o >> > >> > So where do I add that PETSC_ARCH? >> > >> > Thanks, >> > Maahi Talukder >> > >> > On Thu, Mar 21, 2019 at 10:14 PM Balay, Satish >> wrote: >> > >> > > PETSc uses the concept of PETSC_ARCH to enable multiple in-place >> > > builds. >> > > >> > > So you can have a debug build with PETSC_ARCH=arch-debug, and a >> > > optimized build with PETSC_ARCH=arch-opt etc. >> > > >> > > And if you are using a petsc formatted makefile with your code - you >> > > can switch between these builds by just switching PETSC_ARCH. 
>> > > >> > > Satish >> > > >> > > On Thu, 21 Mar 2019, Maahi Talukder via petsc-users wrote: >> > > >> > > > Dear All, >> > > > >> > > > Currently, I am running PETSc with debugging option. And it says >> that if >> > > I >> > > > run ./configure --with-debugging=no, the performance would be >> faster. My >> > > > question is: what would I do if I want to go back to debugging >> mode, and >> > > If >> > > > I configure it now with no debugging option, would it make any >> changes to >> > > > my current setting? >> > > > >> > > > Regards, >> > > > Maahi Talukder >> > > > >> > > >> > > >> > >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Fri Mar 22 14:11:58 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Fri, 22 Mar 2019 19:11:58 +0000 Subject: [petsc-users] About Configuring PETSc In-Reply-To: References: Message-ID: Did you build PETSc with PETSC_ARCH=arch-opt? Btw - I used PETSC_ARCH=arch-debug to illustrate - but you already have a build with PETSC_ARCH=arch-linux2-c-debug - so you should stick to that. Satish On Fri, 22 Mar 2019, Maahi Talukder via petsc-users wrote: > Hi, > > I tried to run the command 'make PETSC_ARCH=arch-opt wholetest1' but it > shows me the following error- > > .................................................................................................................................................. > [maahi at CB272PP-THINK1 Desktop]$ make PETSC_ARCH=arch-opt wholetest1 > /home/maahi/petsc/lib/petsc/conf/rules:960: > */home/maahi/petsc/arch-opt/lib/petsc/conf/petscrules: > No such file or directory* > make: *** No rule to make target > '/home/maahi/petsc/arch-opt/lib/petsc/conf/petscrules'. Stop. > ........................................................................................................................................................ > > My .bashrc is the following - > > ..................................................................... > > PATH=$PATH:$HOME/.local/bin:$HOME/bin:/usr/lib64/openmpi/bin > > export PATH > > export PETSC_DIR=$HOME/petsc > export PETSC_ARCH=arch-linux2-c-debug > #export PETSC_ARCH=arch-debug > > .......................................................................................... > > Could you please tell me what went wrong? > > Regards, > Maahi Talukder > > > On Thu, Mar 21, 2019 at 11:55 PM Maahi Talukder > wrote: > > > Thank you so much for your reply. That clear things up! > > > > On Thu, Mar 21, 2019 at 10:43 PM Balay, Satish wrote: > > > >> On Thu, 21 Mar 2019, Maahi Talukder via petsc-users wrote: > >> > >> > Thank you for your reply. > >> > > >> > So do I need to set the value of PETSC_ARCH as needed in .bashrc as I > >> did > >> > in case of PETSC_DIR ? > >> > >> You can specify PETSC_ARCH as an option to make. You can have a default > >> value set in .bashrc - and change to a different value on command line. > >> > >> For ex: in .bashrc > >> > >> export PETSC_ARCH=arch-debug > >> > >> Now if you want to build with debug libraries: > >> > >> make wholetest1 > >> > >> Now If you want to build with optimized libraries: > >> > >> make PETSC_ARCH=arch-opt wholetest1 > >> > >> > >> > And by PETSC_ARCH=arch-opt, do you mean the > >> > non-debugging mode? > >> > >> Yes. You can use whatever name you think is appropriate here. > >> > >> ./configure PETSC_ARCH=a-name-i-can-easily-associate-with-this-build > >> [other configure options.] 
> >> > >> > > >> > And I am using the following makefile with my code- > >> > > >> > CFLAGS = > >> > FFLAGS =-I/home/maahi/petsc/include > >> > -I/home/maahi/petsc/arch-linux2-c-debug/include -cpp -mcmodel=large > >> > >> Hm - you shouldn't be needing these options here. You should switch your > >> source files from .f to .F and .f90 to .F90 - and remove the above FFLAGS > >> > >> Satish > >> > >> > CPPFLAGS = > >> > FPPFLAGS = > >> > > >> > > >> > include ${PETSC_DIR}/lib/petsc/conf/variables > >> > include ${PETSC_DIR}/lib/petsc/conf/rules > >> > > >> > wholetest1: wholetest1.o > >> > -${FLINKER} -o wholetest1 wholetest1.o ${PETSC_LIB} > >> > ${RM} wholetest1.o > >> > > >> > So where do I add that PETSC_ARCH? > >> > > >> > Thanks, > >> > Maahi Talukder > >> > > >> > On Thu, Mar 21, 2019 at 10:14 PM Balay, Satish > >> wrote: > >> > > >> > > PETSc uses the concept of PETSC_ARCH to enable multiple in-place > >> > > builds. > >> > > > >> > > So you can have a debug build with PETSC_ARCH=arch-debug, and a > >> > > optimized build with PETSC_ARCH=arch-opt etc. > >> > > > >> > > And if you are using a petsc formatted makefile with your code - you > >> > > can switch between these builds by just switching PETSC_ARCH. > >> > > > >> > > Satish > >> > > > >> > > On Thu, 21 Mar 2019, Maahi Talukder via petsc-users wrote: > >> > > > >> > > > Dear All, > >> > > > > >> > > > Currently, I am running PETSc with debugging option. And it says > >> that if > >> > > I > >> > > > run ./configure --with-debugging=no, the performance would be > >> faster. My > >> > > > question is: what would I do if I want to go back to debugging > >> mode, and > >> > > If > >> > > > I configure it now with no debugging option, would it make any > >> changes to > >> > > > my current setting? > >> > > > > >> > > > Regards, > >> > > > Maahi Talukder > >> > > > > >> > > > >> > > > >> > > >> > >> > From bsmith at mcs.anl.gov Fri Mar 22 14:12:55 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Fri, 22 Mar 2019 19:12:55 +0000 Subject: [petsc-users] About Configuring PETSc In-Reply-To: References: Message-ID: <15B453AC-38F0-4893-B4A5-4D0FA04DFE21@anl.gov> You do not have an installation of PETSc at the directory location /home/maahi/petsc/arch-opt/ Perhaps you did not run ./configure and make with the PETSC_ARCH value set to arch-opt or perhaps the ./configure failed. Barry > On Mar 22, 2019, at 2:07 PM, Maahi Talukder via petsc-users wrote: > > Hi, > > I tried to run the command 'make PETSC_ARCH=arch-opt wholetest1' but it shows me the following error- > > .................................................................................................................................................. > [maahi at CB272PP-THINK1 Desktop]$ make PETSC_ARCH=arch-opt wholetest1 > /home/maahi/petsc/lib/petsc/conf/rules:960: /home/maahi/petsc/arch-opt/lib/petsc/conf/petscrules: No such file or directory > make: *** No rule to make target '/home/maahi/petsc/arch-opt/lib/petsc/conf/petscrules'. Stop. > ........................................................................................................................................................ > > My .bashrc is the following - > > ..................................................................... > > PATH=$PATH:$HOME/.local/bin:$HOME/bin:/usr/lib64/openmpi/bin > > export PATH > > export PETSC_DIR=$HOME/petsc > export PETSC_ARCH=arch-linux2-c-debug > #export PETSC_ARCH=arch-debug > > .......................................................................................... 
> > Could you please tell me what went wrong? > > Regards, > Maahi Talukder > > > On Thu, Mar 21, 2019 at 11:55 PM Maahi Talukder wrote: > Thank you so much for your reply. That clear things up! > > On Thu, Mar 21, 2019 at 10:43 PM Balay, Satish wrote: > On Thu, 21 Mar 2019, Maahi Talukder via petsc-users wrote: > > > Thank you for your reply. > > > > So do I need to set the value of PETSC_ARCH as needed in .bashrc as I did > > in case of PETSC_DIR ? > > You can specify PETSC_ARCH as an option to make. You can have a default value set in .bashrc - and change to a different value on command line. > > For ex: in .bashrc > > export PETSC_ARCH=arch-debug > > Now if you want to build with debug libraries: > > make wholetest1 > > Now If you want to build with optimized libraries: > > make PETSC_ARCH=arch-opt wholetest1 > > > > And by PETSC_ARCH=arch-opt, do you mean the > > non-debugging mode? > > Yes. You can use whatever name you think is appropriate here. > > ./configure PETSC_ARCH=a-name-i-can-easily-associate-with-this-build [other configure options.] > > > > > And I am using the following makefile with my code- > > > > CFLAGS = > > FFLAGS =-I/home/maahi/petsc/include > > -I/home/maahi/petsc/arch-linux2-c-debug/include -cpp -mcmodel=large > > Hm - you shouldn't be needing these options here. You should switch your source files from .f to .F and .f90 to .F90 - and remove the above FFLAGS > > Satish > > > CPPFLAGS = > > FPPFLAGS = > > > > > > include ${PETSC_DIR}/lib/petsc/conf/variables > > include ${PETSC_DIR}/lib/petsc/conf/rules > > > > wholetest1: wholetest1.o > > -${FLINKER} -o wholetest1 wholetest1.o ${PETSC_LIB} > > ${RM} wholetest1.o > > > > So where do I add that PETSC_ARCH? > > > > Thanks, > > Maahi Talukder > > > > On Thu, Mar 21, 2019 at 10:14 PM Balay, Satish wrote: > > > > > PETSc uses the concept of PETSC_ARCH to enable multiple in-place > > > builds. > > > > > > So you can have a debug build with PETSC_ARCH=arch-debug, and a > > > optimized build with PETSC_ARCH=arch-opt etc. > > > > > > And if you are using a petsc formatted makefile with your code - you > > > can switch between these builds by just switching PETSC_ARCH. > > > > > > Satish > > > > > > On Thu, 21 Mar 2019, Maahi Talukder via petsc-users wrote: > > > > > > > Dear All, > > > > > > > > Currently, I am running PETSc with debugging option. And it says that if > > > I > > > > run ./configure --with-debugging=no, the performance would be faster. My > > > > question is: what would I do if I want to go back to debugging mode, and > > > If > > > > I configure it now with no debugging option, would it make any changes to > > > > my current setting? 
> > > > > > > > Regards, > > > > Maahi Talukder > > > > > > > > > > > > > From mvalera-w at sdsu.edu Fri Mar 22 16:08:26 2019 From: mvalera-w at sdsu.edu (Manuel Valera) Date: Fri, 22 Mar 2019 14:08:26 -0700 Subject: [petsc-users] MPI Communication times In-Reply-To: References: Message-ID: Hello, I repeated the timings with the -log_sync option and now i get for 200 processors / 20 nodes: ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ VecScatterBarrie 3014 1.0 5.6771e+01 3.9 0.00e+00 0.0 *0.0e+00 0.0e+00 0.0e+00 5 * 0 0 0 0 5 0 0 0 0 0 VecScatterBegin 3014 1.0 3.1684e+01 2.0 0.00e+00 0.0 *4.2e+06 1.1e+06 2.8e+01 4* 0 63 56 0 4 0 63 56 0 0 VecScatterEnd 2976 1.0 1.1383e+02 1.8 0.00e+00 0.0 *0.0e+00 0.0e+00 0.0e+00 14* 0 0 0 0 14 0 0 0 0 0 With 100 processors / 10 nodes: VecScatterBarrie 3010 1.0 7.4430e+01 5.0 0.00e+00 0.0 *0.0e+00 0.0e+00 0.0e+00 7* 0 0 0 0 7 0 0 0 0 0 VecScatterBegin 3010 1.0 3.8504e+01 2.4 0.00e+00 0.0 *1.6e+06 2.0e+06 2.8e+01 4 * 0 71 66 0 4 0 71 66 0 0 VecScatterEnd 2972 1.0 8.5158e+01 1.2 0.00e+00 0.0 *0.0e+00 0.0e+00 0.0e+00 9 * 0 0 0 0 9 0 0 0 0 0 And with 20 processors / 1 node: VecScatterBarrie 2596 1.0 4.0614e+01 7.3 0.00e+00 0.0* 0.0e+00 0.0e+00 0.0e+00 4 * 0 0 0 0 4 0 0 0 0 0 VecScatterBegin 2596 1.0 1.4970e+01 1.3 0.00e+00 0.0 *1.2e+05 4.0e+06 3.0e+01 1 * 0 81 61 0 1 0 81 61 0 0 VecScatterEnd 2558 1.0 1.4903e+01 1.3 0.00e+00 0.0* 0.0e+00 0.0e+00 0.0e+00 1 * 0 0 0 0 1 0 0 0 0 0 Can you help me interpret this? what i see is the End portion taking more relative time and Begin staying the same beyond one node, also Barrier and Begin counts are the same every time, but how do i estimate communication times from here? Thanks, On Wed, Mar 20, 2019 at 3:24 PM Zhang, Junchao wrote: > Forgot to mention long VecScatter time might also due to local memory > copies. If the communication pattern has large local to local (self to > self) scatter, which often happens thanks to locality, then the memory > copy time is counted in VecScatter. You can analyze your code's > communication pattern to see if it is the case. > > --Junchao Zhang > > > On Wed, Mar 20, 2019 at 4:44 PM Zhang, Junchao via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> >> >> On Wed, Mar 20, 2019 at 4:18 PM Manuel Valera wrote: >> >>> Thanks for your answer, so for example i have a log for 200 cores across >>> 10 nodes that reads: >>> >>> >>> ------------------------------------------------------------------------------------------------------------------------ >>> Event Count Time (sec) Flop >>> --- Global --- --- Stage --- Total >>> Max Ratio Max Ratio Max Ratio >>> Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >>> >>> ---------------------------------------------------------------------------------------------------------------------- >>> VecScatterBegin 3014 1.0 4.5550e+01 2.6 0.00e+00 0.0 *4.2e+06 >>> 1.1e+06 2.8e+01 4* 0 63 56 0 4 0 63 56 0 0 >>> VecScatterEnd 2976 1.0 1.2143e+02 1.7 0.00e+00 0.0 *0.0e+00 >>> 0.0e+00 0.0e+00 14* 0 0 0 0 14 0 0 0 0 0 >>> >>> While for 20 nodes at one node i have: >>> >> What does that mean? 
>> >>> VecScatterBegin 2596 1.0 2.9142e+01 2.1 0.00e+00 0.0 *1.2e+05 >>> 4.0e+06 3.0e+01 2* 0 81 61 0 2 0 81 61 0 0 >>> VecScatterEnd 2558 1.0 8.0344e+01 7.9 0.00e+00 0.0 *0.0e+00 >>> 0.0e+00 0.0e+00 3* 0 0 0 0 3 0 0 0 0 0 >>> >>> Where do i see the max/min ratio in here? and why End step is all 0.0e00 >>> in both but still grows from 3% to 14% of total time? It seems i would need >>> to run again with the -log_sync option, is this correct? >>> >>> e.g., 2.1, 7.9. MPI send/recv are in VecScatterBegin(). VecScatterEnd() >> only does MPI_Wait. That is why it has zero messages. Yes, run with >> -log_sync and see what happens. >> >> >>> Different question, can't i estimate the total communication time if i >>> had a typical communication time per MPI message times the number of MPI >>> messages reported in the log? or it doesn't work like that? >>> >>> Probably not work because you have multiple processes doing send/recv at >> the same time. They might saturate the bandwidth. Petsc also does >> computation/communication overlapping. >> >> >>> Thanks. >>> >>> >>> >>> >>> >>> On Wed, Mar 20, 2019 at 2:02 PM Zhang, Junchao >>> wrote: >>> >>>> See the "Mess AvgLen Reduct" number in each log stage. Mess is the >>>> total number of messages sent in an event over all processes. AvgLen is >>>> average message len. Reduct is the number of global reduction. >>>> Each event like VecScatterBegin/End has a maximal execution time over >>>> all processes, and a max/min ratio. %T is sum(execution time of the event >>>> on each process)/sum(execution time of the stage on each process). %T >>>> indicates how expensive the event is. It is a number you should pay >>>> attention to. >>>> If your code is imbalanced (i.e., with a big max/min ratio), then the >>>> performance number is skewed and becomes misleading because some processes >>>> are just waiting for others. Then, besides -log_view, you can add >>>> -log_sync, which adds an extra MPI_Barrier for each event to let them start >>>> at the same time. With that, it is easier to interpret the number. >>>> src/vec/vscat/examples/ex4.c is a tiny example for VecScatter logging. >>>> >>>> --Junchao Zhang >>>> >>>> >>>> On Wed, Mar 20, 2019 at 2:58 PM Manuel Valera via petsc-users < >>>> petsc-users at mcs.anl.gov> wrote: >>>> >>>>> Hello, >>>>> >>>>> I am working on timing my model, which we made MPI scalable using >>>>> petsc DMDAs, i want to know more about the output log and how to calculate >>>>> a total communication times for my runs, so far i see we have "MPI >>>>> Messages" and "MPI Messages Lengths" in the log, along VecScatterEnd and >>>>> VecScatterBegin reports. >>>>> >>>>> My question is, how do i interpret these number to get a rough >>>>> estimate on how much overhead we have just from MPI communications times in >>>>> my model runs? >>>>> >>>>> Thanks, >>>>> >>>>> >>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From shash.sharma at mail.utoronto.ca Fri Mar 22 16:16:15 2019 From: shash.sharma at mail.utoronto.ca (Shashwat Sharma) Date: Fri, 22 Mar 2019 21:16:15 +0000 Subject: [petsc-users] Check if a matrix has been created Message-ID: Hello, I'd like to be able to check if a Mat object has been created and allocated; if it hasn't been created yet, I want to create and allocate it, otherwise I want to reuse the existing allocation. The context is that I have a Mat variable declared (but not defined) in a header file, so that it is available to the entire class. 
It is created (MatCreate) and allocated at a later point in the implementation, and may need to be reused with different values several times. But I don't want to allocate it unless it's needed, because it would be quite large. Also, I want my destructor to only call MatDestroy on the matrix if it has been created (otherwise it gives seg faults). How can I achieve this? Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Mar 22 16:46:21 2019 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 22 Mar 2019 17:46:21 -0400 Subject: [petsc-users] Check if a matrix has been created In-Reply-To: References: Message-ID: On Fri, Mar 22, 2019 at 5:17 PM Shashwat Sharma via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello, > > I'd like to be able to check if a Mat object has been created and > allocated; if it hasn't been created yet, I want to create and allocate it, > otherwise I want to reuse the existing allocation. > Have your constructor initialize it to PETSC_NULL_MAT. And the test with if (m_mat==PETSC_NULL_MAT) MatCreate .... > The context is that I have a Mat variable declared (but not defined) in a > header file, so that it is available to the entire class. It is created > (MatCreate) and allocated at a later point in the implementation, and may > need to be reused with different values several times. But I don't want to > allocate it unless it's needed, because it would be quite large. > > Also, I want my destructor to only call MatDestroy on the matrix if it has > been created (otherwise it gives seg faults). > Same, but I think you can call MatDestroy with a NULL matrix. > > How can I achieve this? > > Thanks. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jczhang at mcs.anl.gov Fri Mar 22 16:48:16 2019 From: jczhang at mcs.anl.gov (Zhang, Junchao) Date: Fri, 22 Mar 2019 21:48:16 +0000 Subject: [petsc-users] MPI Communication times In-Reply-To: References: Message-ID: Did you change problem size with different runs? 
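[Editors' note on the "Check if a matrix has been created" thread above: a minimal C sketch of the lazy-allocation pattern Mark describes. The struct and function names (LazyMatCtx, etc.) are illustrative assumptions, not from the thread. In C the empty handle is plain NULL (PETSC_NULL_MAT is the Fortran spelling), and MatDestroy() on a NULL handle is a no-op that leaves the handle NULL, so the destructor needs no extra guard.]

#include <petscmat.h>

typedef struct {
  Mat      A;   /* large matrix, allocated lazily */
  PetscInt n;   /* global size */
} LazyMatCtx;

/* "constructor": start from a NULL handle so it can be tested later */
static PetscErrorCode LazyMatCtxInit(LazyMatCtx *ctx, PetscInt n)
{
  ctx->A = NULL;
  ctx->n = n;
  return 0;
}

/* create and set up only on first use; afterwards reuse the existing allocation */
static PetscErrorCode LazyMatCtxEnsure(LazyMatCtx *ctx)
{
  PetscErrorCode ierr;
  if (!ctx->A) {                                   /* the test Mark suggests */
    ierr = MatCreate(PETSC_COMM_WORLD,&ctx->A);CHKERRQ(ierr);
    ierr = MatSetSizes(ctx->A,PETSC_DECIDE,PETSC_DECIDE,ctx->n,ctx->n);CHKERRQ(ierr);
    ierr = MatSetFromOptions(ctx->A);CHKERRQ(ierr);
    ierr = MatSetUp(ctx->A);CHKERRQ(ierr);
  } else {
    ierr = MatZeroEntries(ctx->A);CHKERRQ(ierr);   /* refill the same allocation */
  }
  return 0;
}

/* "destructor": safe whether or not the matrix was ever created */
static PetscErrorCode LazyMatCtxFinalize(LazyMatCtx *ctx)
{
  PetscErrorCode ierr;
  ierr = MatDestroy(&ctx->A);CHKERRQ(ierr);        /* no-op if ctx->A is still NULL */
  return 0;
}

Each assembly path would call LazyMatCtxEnsure() before MatSetValues(), so the large allocation only happens the first time the matrix is actually needed.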
On Fri, Mar 22, 2019 at 4:09 PM Manuel Valera > wrote: Hello, I repeated the timings with the -log_sync option and now i get for 200 processors / 20 nodes: ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ VecScatterBarrie 3014 1.0 5.6771e+01 3.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 5 0 0 0 0 5 0 0 0 0 0 VecScatterBegin 3014 1.0 3.1684e+01 2.0 0.00e+00 0.0 4.2e+06 1.1e+06 2.8e+01 4 0 63 56 0 4 0 63 56 0 0 VecScatterEnd 2976 1.0 1.1383e+02 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 14 0 0 0 0 14 0 0 0 0 0 With 100 processors / 10 nodes: VecScatterBarrie 3010 1.0 7.4430e+01 5.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 7 0 0 0 0 7 0 0 0 0 0 VecScatterBegin 3010 1.0 3.8504e+01 2.4 0.00e+00 0.0 1.6e+06 2.0e+06 2.8e+01 4 0 71 66 0 4 0 71 66 0 0 VecScatterEnd 2972 1.0 8.5158e+01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 9 0 0 0 0 9 0 0 0 0 0 And with 20 processors / 1 node: VecScatterBarrie 2596 1.0 4.0614e+01 7.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 0 VecScatterBegin 2596 1.0 1.4970e+01 1.3 0.00e+00 0.0 1.2e+05 4.0e+06 3.0e+01 1 0 81 61 0 1 0 81 61 0 0 VecScatterEnd 2558 1.0 1.4903e+01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 Can you help me interpret this? what i see is the End portion taking more relative time and Begin staying the same beyond one node, also Barrier and Begin counts are the same every time, but how do i estimate communication times from here? Thanks, On Wed, Mar 20, 2019 at 3:24 PM Zhang, Junchao > wrote: Forgot to mention long VecScatter time might also due to local memory copies. If the communication pattern has large local to local (self to self) scatter, which often happens thanks to locality, then the memory copy time is counted in VecScatter. You can analyze your code's communication pattern to see if it is the case. --Junchao Zhang On Wed, Mar 20, 2019 at 4:44 PM Zhang, Junchao via petsc-users > wrote: On Wed, Mar 20, 2019 at 4:18 PM Manuel Valera > wrote: Thanks for your answer, so for example i have a log for 200 cores across 10 nodes that reads: ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ---------------------------------------------------------------------------------------------------------------------- VecScatterBegin 3014 1.0 4.5550e+01 2.6 0.00e+00 0.0 4.2e+06 1.1e+06 2.8e+01 4 0 63 56 0 4 0 63 56 0 0 VecScatterEnd 2976 1.0 1.2143e+02 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 14 0 0 0 0 14 0 0 0 0 0 While for 20 nodes at one node i have: What does that mean? VecScatterBegin 2596 1.0 2.9142e+01 2.1 0.00e+00 0.0 1.2e+05 4.0e+06 3.0e+01 2 0 81 61 0 2 0 81 61 0 0 VecScatterEnd 2558 1.0 8.0344e+01 7.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 Where do i see the max/min ratio in here? and why End step is all 0.0e00 in both but still grows from 3% to 14% of total time? It seems i would need to run again with the -log_sync option, is this correct? e.g., 2.1, 7.9. MPI send/recv are in VecScatterBegin(). VecScatterEnd() only does MPI_Wait. 
That is why it has zero messages. Yes, run with -log_sync and see what happens. Different question, can't i estimate the total communication time if i had a typical communication time per MPI message times the number of MPI messages reported in the log? or it doesn't work like that? Probably not work because you have multiple processes doing send/recv at the same time. They might saturate the bandwidth. Petsc also does computation/communication overlapping. Thanks. On Wed, Mar 20, 2019 at 2:02 PM Zhang, Junchao > wrote: See the "Mess AvgLen Reduct" number in each log stage. Mess is the total number of messages sent in an event over all processes. AvgLen is average message len. Reduct is the number of global reduction. Each event like VecScatterBegin/End has a maximal execution time over all processes, and a max/min ratio. %T is sum(execution time of the event on each process)/sum(execution time of the stage on each process). %T indicates how expensive the event is. It is a number you should pay attention to. If your code is imbalanced (i.e., with a big max/min ratio), then the performance number is skewed and becomes misleading because some processes are just waiting for others. Then, besides -log_view, you can add -log_sync, which adds an extra MPI_Barrier for each event to let them start at the same time. With that, it is easier to interpret the number. src/vec/vscat/examples/ex4.c is a tiny example for VecScatter logging. --Junchao Zhang On Wed, Mar 20, 2019 at 2:58 PM Manuel Valera via petsc-users > wrote: Hello, I am working on timing my model, which we made MPI scalable using petsc DMDAs, i want to know more about the output log and how to calculate a total communication times for my runs, so far i see we have "MPI Messages" and "MPI Messages Lengths" in the log, along VecScatterEnd and VecScatterBegin reports. My question is, how do i interpret these number to get a rough estimate on how much overhead we have just from MPI communications times in my model runs? Thanks, -------------- next part -------------- An HTML attachment was scrubbed... URL: From mvalera-w at sdsu.edu Fri Mar 22 16:54:02 2019 From: mvalera-w at sdsu.edu (Manuel Valera) Date: Fri, 22 Mar 2019 14:54:02 -0700 Subject: [petsc-users] MPI Communication times In-Reply-To: References: Message-ID: No, is the same problem running with different number of processors, i have data from 1 to 20 processors in increments of 20 processors/1 node, and additionally for 1 processor. On Fri, Mar 22, 2019 at 2:48 PM Zhang, Junchao wrote: > Did you change problem size with different runs? 
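[Editors' note, a concrete reading of the max/min ratio column in the 20-processor table quoted above, using only the numbers already shown there: VecScatterEnd reports a max time of 8.0344e+01 s with a ratio of 7.9, so the slowest rank spent about 80 s in VecScatterEnd while the fastest spent roughly 80.3/7.9 ≈ 10 s. A ratio that far from 1 means most of the reported time on the slow ranks is waiting for other ranks (load imbalance) rather than data transfer, which is exactly what the extra -log_sync barrier is meant to separate out.]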
> > On Fri, Mar 22, 2019 at 4:09 PM Manuel Valera wrote: > >> Hello, >> >> I repeated the timings with the -log_sync option and now i get for 200 >> processors / 20 nodes: >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> Event Count Time (sec) Flop >> --- Global --- --- Stage --- Total >> Max Ratio Max Ratio Max Ratio >> Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> >> ------------------------------------------------------------------------------------------------------------------------ >> >> VecScatterBarrie 3014 1.0 5.6771e+01 3.9 0.00e+00 0.0 *0.0e+00 >> 0.0e+00 0.0e+00 5 * 0 0 0 0 5 0 0 0 0 0 >> VecScatterBegin 3014 1.0 3.1684e+01 2.0 0.00e+00 0.0 *4.2e+06 1.1e+06 >> 2.8e+01 4* 0 63 56 0 4 0 63 56 0 0 >> VecScatterEnd 2976 1.0 1.1383e+02 1.8 0.00e+00 0.0 *0.0e+00 >> 0.0e+00 0.0e+00 14* 0 0 0 0 14 0 0 0 0 0 >> >> With 100 processors / 10 nodes: >> >> VecScatterBarrie 3010 1.0 7.4430e+01 5.0 0.00e+00 0.0 *0.0e+00 >> 0.0e+00 0.0e+00 7* 0 0 0 0 7 0 0 0 0 0 >> VecScatterBegin 3010 1.0 3.8504e+01 2.4 0.00e+00 0.0 *1.6e+06 2.0e+06 >> 2.8e+01 4 * 0 71 66 0 4 0 71 66 0 0 >> VecScatterEnd 2972 1.0 8.5158e+01 1.2 0.00e+00 0.0 *0.0e+00 >> 0.0e+00 0.0e+00 9 * 0 0 0 0 9 0 0 0 0 0 >> >> And with 20 processors / 1 node: >> >> VecScatterBarrie 2596 1.0 4.0614e+01 7.3 0.00e+00 0.0* 0.0e+00 >> 0.0e+00 0.0e+00 4 * 0 0 0 0 4 0 0 0 0 0 >> VecScatterBegin 2596 1.0 1.4970e+01 1.3 0.00e+00 0.0 *1.2e+05 >> 4.0e+06 3.0e+01 1 * 0 81 61 0 1 0 81 61 0 0 >> VecScatterEnd 2558 1.0 1.4903e+01 1.3 0.00e+00 0.0* 0.0e+00 >> 0.0e+00 0.0e+00 1 * 0 0 0 0 1 0 0 0 0 0 >> >> Can you help me interpret this? what i see is the End portion taking more >> relative time and Begin staying the same beyond one node, also Barrier and >> Begin counts are the same every time, but how do i estimate communication >> times from here? >> >> Thanks, >> >> >> On Wed, Mar 20, 2019 at 3:24 PM Zhang, Junchao >> wrote: >> >>> Forgot to mention long VecScatter time might also due to local memory >>> copies. If the communication pattern has large local to local (self to >>> self) scatter, which often happens thanks to locality, then the memory >>> copy time is counted in VecScatter. You can analyze your code's >>> communication pattern to see if it is the case. >>> >>> --Junchao Zhang >>> >>> >>> On Wed, Mar 20, 2019 at 4:44 PM Zhang, Junchao via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> >>>> >>>> >>>> On Wed, Mar 20, 2019 at 4:18 PM Manuel Valera >>>> wrote: >>>> >>>>> Thanks for your answer, so for example i have a log for 200 cores >>>>> across 10 nodes that reads: >>>>> >>>>> >>>>> ------------------------------------------------------------------------------------------------------------------------ >>>>> Event Count Time (sec) Flop >>>>> --- Global --- --- Stage --- Total >>>>> Max Ratio Max Ratio Max Ratio >>>>> Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >>>>> >>>>> ---------------------------------------------------------------------------------------------------------------------- >>>>> VecScatterBegin 3014 1.0 4.5550e+01 2.6 0.00e+00 0.0 *4.2e+06 >>>>> 1.1e+06 2.8e+01 4* 0 63 56 0 4 0 63 56 0 0 >>>>> VecScatterEnd 2976 1.0 1.2143e+02 1.7 0.00e+00 0.0 *0.0e+00 >>>>> 0.0e+00 0.0e+00 14* 0 0 0 0 14 0 0 0 0 0 >>>>> >>>>> While for 20 nodes at one node i have: >>>>> >>>> What does that mean? 
>>>> >>>>> VecScatterBegin 2596 1.0 2.9142e+01 2.1 0.00e+00 0.0 *1.2e+05 >>>>> 4.0e+06 3.0e+01 2* 0 81 61 0 2 0 81 61 0 0 >>>>> VecScatterEnd 2558 1.0 8.0344e+01 7.9 0.00e+00 0.0 *0.0e+00 >>>>> 0.0e+00 0.0e+00 3* 0 0 0 0 3 0 0 0 0 0 >>>>> >>>>> Where do i see the max/min ratio in here? and why End step is all >>>>> 0.0e00 in both but still grows from 3% to 14% of total time? It seems i >>>>> would need to run again with the -log_sync option, is this correct? >>>>> >>>>> e.g., 2.1, 7.9. MPI send/recv are in VecScatterBegin(). >>>> VecScatterEnd() only does MPI_Wait. That is why it has zero messages. Yes, >>>> run with -log_sync and see what happens. >>>> >>>> >>>>> Different question, can't i estimate the total communication time if i >>>>> had a typical communication time per MPI message times the number of MPI >>>>> messages reported in the log? or it doesn't work like that? >>>>> >>>>> Probably not work because you have multiple processes doing send/recv >>>> at the same time. They might saturate the bandwidth. Petsc also does >>>> computation/communication overlapping. >>>> >>>> >>>>> Thanks. >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> On Wed, Mar 20, 2019 at 2:02 PM Zhang, Junchao >>>>> wrote: >>>>> >>>>>> See the "Mess AvgLen Reduct" number in each log stage. Mess is >>>>>> the total number of messages sent in an event over all processes. AvgLen >>>>>> is average message len. Reduct is the number of global reduction. >>>>>> Each event like VecScatterBegin/End has a maximal execution time over >>>>>> all processes, and a max/min ratio. %T is sum(execution time of the event >>>>>> on each process)/sum(execution time of the stage on each process). %T >>>>>> indicates how expensive the event is. It is a number you should pay >>>>>> attention to. >>>>>> If your code is imbalanced (i.e., with a big max/min ratio), then the >>>>>> performance number is skewed and becomes misleading because some processes >>>>>> are just waiting for others. Then, besides -log_view, you can add >>>>>> -log_sync, which adds an extra MPI_Barrier for each event to let them start >>>>>> at the same time. With that, it is easier to interpret the number. >>>>>> src/vec/vscat/examples/ex4.c is a tiny example for VecScatter >>>>>> logging. >>>>>> >>>>>> --Junchao Zhang >>>>>> >>>>>> >>>>>> On Wed, Mar 20, 2019 at 2:58 PM Manuel Valera via petsc-users < >>>>>> petsc-users at mcs.anl.gov> wrote: >>>>>> >>>>>>> Hello, >>>>>>> >>>>>>> I am working on timing my model, which we made MPI scalable using >>>>>>> petsc DMDAs, i want to know more about the output log and how to calculate >>>>>>> a total communication times for my runs, so far i see we have "MPI >>>>>>> Messages" and "MPI Messages Lengths" in the log, along VecScatterEnd and >>>>>>> VecScatterBegin reports. >>>>>>> >>>>>>> My question is, how do i interpret these number to get a rough >>>>>>> estimate on how much overhead we have just from MPI communications times in >>>>>>> my model runs? >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> >>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From jczhang at mcs.anl.gov Fri Mar 22 17:39:59 2019 From: jczhang at mcs.anl.gov (Zhang, Junchao) Date: Fri, 22 Mar 2019 22:39:59 +0000 Subject: [petsc-users] MPI Communication times In-Reply-To: References: Message-ID: On Fri, Mar 22, 2019 at 4:55 PM Manuel Valera > wrote: No, is the same problem running with different number of processors, i have data from 1 to 20 processors in increments of 20 processors/1 node, and additionally for 1 processor. 
That means you used strong scaling. If we combine VecScatterBegin/End, from 20 cores, to 100, 200 cores, it took 2%, 13%, 18% of the execution time respectively. It looks very unscalable. I do not know why. VecScatterBegin took the same time with 100 and 200 cores. My explanation is VecScatterBegin just packs data and then calls non-blocking MPI_Isend. However, VecScatterEnd has to wait for data to come. Could you tell us more about your problem, for example, is it 2D or 3D, what is the communication pattern, how many neighbors each rank has. Also attach the whole log files for -log_view so that we can know the problem better. Thanks. On Fri, Mar 22, 2019 at 2:48 PM Zhang, Junchao > wrote: Did you change problem size with different runs? On Fri, Mar 22, 2019 at 4:09 PM Manuel Valera > wrote: Hello, I repeated the timings with the -log_sync option and now i get for 200 processors / 20 nodes: ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ VecScatterBarrie 3014 1.0 5.6771e+01 3.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 5 0 0 0 0 5 0 0 0 0 0 VecScatterBegin 3014 1.0 3.1684e+01 2.0 0.00e+00 0.0 4.2e+06 1.1e+06 2.8e+01 4 0 63 56 0 4 0 63 56 0 0 VecScatterEnd 2976 1.0 1.1383e+02 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 14 0 0 0 0 14 0 0 0 0 0 With 100 processors / 10 nodes: VecScatterBarrie 3010 1.0 7.4430e+01 5.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 7 0 0 0 0 7 0 0 0 0 0 VecScatterBegin 3010 1.0 3.8504e+01 2.4 0.00e+00 0.0 1.6e+06 2.0e+06 2.8e+01 4 0 71 66 0 4 0 71 66 0 0 VecScatterEnd 2972 1.0 8.5158e+01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 9 0 0 0 0 9 0 0 0 0 0 And with 20 processors / 1 node: VecScatterBarrie 2596 1.0 4.0614e+01 7.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 0 VecScatterBegin 2596 1.0 1.4970e+01 1.3 0.00e+00 0.0 1.2e+05 4.0e+06 3.0e+01 1 0 81 61 0 1 0 81 61 0 0 VecScatterEnd 2558 1.0 1.4903e+01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 Can you help me interpret this? what i see is the End portion taking more relative time and Begin staying the same beyond one node, also Barrier and Begin counts are the same every time, but how do i estimate communication times from here? Thanks, On Wed, Mar 20, 2019 at 3:24 PM Zhang, Junchao > wrote: Forgot to mention long VecScatter time might also due to local memory copies. If the communication pattern has large local to local (self to self) scatter, which often happens thanks to locality, then the memory copy time is counted in VecScatter. You can analyze your code's communication pattern to see if it is the case. 
--Junchao Zhang On Wed, Mar 20, 2019 at 4:44 PM Zhang, Junchao via petsc-users > wrote: On Wed, Mar 20, 2019 at 4:18 PM Manuel Valera > wrote: Thanks for your answer, so for example i have a log for 200 cores across 10 nodes that reads: ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ---------------------------------------------------------------------------------------------------------------------- VecScatterBegin 3014 1.0 4.5550e+01 2.6 0.00e+00 0.0 4.2e+06 1.1e+06 2.8e+01 4 0 63 56 0 4 0 63 56 0 0 VecScatterEnd 2976 1.0 1.2143e+02 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 14 0 0 0 0 14 0 0 0 0 0 While for 20 nodes at one node i have: What does that mean? VecScatterBegin 2596 1.0 2.9142e+01 2.1 0.00e+00 0.0 1.2e+05 4.0e+06 3.0e+01 2 0 81 61 0 2 0 81 61 0 0 VecScatterEnd 2558 1.0 8.0344e+01 7.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 Where do i see the max/min ratio in here? and why End step is all 0.0e00 in both but still grows from 3% to 14% of total time? It seems i would need to run again with the -log_sync option, is this correct? e.g., 2.1, 7.9. MPI send/recv are in VecScatterBegin(). VecScatterEnd() only does MPI_Wait. That is why it has zero messages. Yes, run with -log_sync and see what happens. Different question, can't i estimate the total communication time if i had a typical communication time per MPI message times the number of MPI messages reported in the log? or it doesn't work like that? Probably not work because you have multiple processes doing send/recv at the same time. They might saturate the bandwidth. Petsc also does computation/communication overlapping. Thanks. On Wed, Mar 20, 2019 at 2:02 PM Zhang, Junchao > wrote: See the "Mess AvgLen Reduct" number in each log stage. Mess is the total number of messages sent in an event over all processes. AvgLen is average message len. Reduct is the number of global reduction. Each event like VecScatterBegin/End has a maximal execution time over all processes, and a max/min ratio. %T is sum(execution time of the event on each process)/sum(execution time of the stage on each process). %T indicates how expensive the event is. It is a number you should pay attention to. If your code is imbalanced (i.e., with a big max/min ratio), then the performance number is skewed and becomes misleading because some processes are just waiting for others. Then, besides -log_view, you can add -log_sync, which adds an extra MPI_Barrier for each event to let them start at the same time. With that, it is easier to interpret the number. src/vec/vscat/examples/ex4.c is a tiny example for VecScatter logging. --Junchao Zhang On Wed, Mar 20, 2019 at 2:58 PM Manuel Valera via petsc-users > wrote: Hello, I am working on timing my model, which we made MPI scalable using petsc DMDAs, i want to know more about the output log and how to calculate a total communication times for my runs, so far i see we have "MPI Messages" and "MPI Messages Lengths" in the log, along VecScatterEnd and VecScatterBegin reports. My question is, how do i interpret these number to get a rough estimate on how much overhead we have just from MPI communications times in my model runs? Thanks, -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Fri Mar 22 17:54:45 2019 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 22 Mar 2019 12:54:45 -1000 Subject: [petsc-users] Consistent domain decomposition between DMDA and DMPLEX In-Reply-To: References: Message-ID: On Thu, Mar 21, 2019 at 8:20 PM Swarnava Ghosh via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear PETSc users and developers, > > I am new to DMPLEX and had a query regarding setting up a consistent > domain decomposition of two meshes in PETSc. > I have a structured finite difference grid, managed through DMDA. I have > another unstructured finite element mesh managed through DMPLEX. Now all > the nodes in the unstructured finite element mesh also belong to the set of > nodes in the structured finite difference mesh (but not necessarily > vice-versa), and the number of nodes in DMPLEX mesh is less than the number > of nodes in DMDA mesh. How can I guarantee a consistent domain > decomposition of the two meshes? By consistent, I mean that if a process > has a set of nodes P from DMDA, and the same process has the set of nodes Q > from DMPLEX, then Q is a subset of P. > Okay, this is not hard. DMPlexDistribute() basically distributes according to a cell partition. You can use PetscPartitionerShell() to stick in whatever cell partition you want. You can see me doing this here: https://bitbucket.org/petsc/petsc/src/e2aefa968a094f48dc384fffc7d599a60aeeb591/src/dm/impls/plex/examples/tests/ex1.c#lines-261 Will that work for you? Thanks, Matt > I look forward to your response. > > Sincerely, > Swarnava > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Mar 22 18:08:56 2019 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 22 Mar 2019 19:08:56 -0400 Subject: [petsc-users] Consistent domain decomposition between DMDA and DMPLEX In-Reply-To: References: Message-ID: Matt, I think they want a vertex partitioning. They may have elements on the unstructured mesh that intersect with any number of processor domains on the structured mesh. But the unstructured mesh vertices are in the structured mesh set of vertices. They want the partition of the unstructured mesh vertices (ie, matrices) to be slaved to the partitioning of the structured mesh. Do I have that right Swarnava? Mark On Fri, Mar 22, 2019 at 6:56 PM Matthew Knepley via petsc-users < petsc-users at mcs.anl.gov> wrote: > On Thu, Mar 21, 2019 at 8:20 PM Swarnava Ghosh via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Dear PETSc users and developers, >> >> I am new to DMPLEX and had a query regarding setting up a consistent >> domain decomposition of two meshes in PETSc. >> I have a structured finite difference grid, managed through DMDA. I have >> another unstructured finite element mesh managed through DMPLEX. Now all >> the nodes in the unstructured finite element mesh also belong to the set of >> nodes in the structured finite difference mesh (but not necessarily >> vice-versa), and the number of nodes in DMPLEX mesh is less than the number >> of nodes in DMDA mesh. How can I guarantee a consistent domain >> decomposition of the two meshes? By consistent, I mean that if a process >> has a set of nodes P from DMDA, and the same process has the set of nodes Q >> from DMPLEX, then Q is a subset of P. 
>> > > Okay, this is not hard. DMPlexDistribute() basically distributes according > to a cell partition. You can use PetscPartitionerShell() to stick in > whatever cell partition you want. You can see me doing this here: > > > https://bitbucket.org/petsc/petsc/src/e2aefa968a094f48dc384fffc7d599a60aeeb591/src/dm/impls/plex/examples/tests/ex1.c#lines-261 > > Will that work for you? > > Thanks, > > Matt > > >> I look forward to your response. >> >> Sincerely, >> Swarnava >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From swarnava89 at gmail.com Fri Mar 22 18:40:19 2019 From: swarnava89 at gmail.com (Swarnava Ghosh) Date: Fri, 22 Mar 2019 16:40:19 -0700 Subject: [petsc-users] Consistent domain decomposition between DMDA and DMPLEX In-Reply-To: References: Message-ID: Hi Mark and Matt, Thank you for your responses. "They may have elements on the unstructured mesh that intersect with any number of processor domains on the structured mesh. But the unstructured mesh vertices are in the structured mesh set of vertices" Yes, that is correct. We would want a vertex partitioning. Sincerely, Swarnava On Fri, Mar 22, 2019 at 4:08 PM Mark Adams wrote: > Matt, > I think they want a vertex partitioning. They may have elements on the > unstructured mesh that intersect with any number of processor domains on > the structured mesh. But the unstructured mesh vertices are in the > structured mesh set of vertices. They want the partition of the > unstructured mesh vertices (ie, matrices) to be slaved to the partitioning > of the structured mesh. > Do I have that right Swarnava? > Mark > > On Fri, Mar 22, 2019 at 6:56 PM Matthew Knepley via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> On Thu, Mar 21, 2019 at 8:20 PM Swarnava Ghosh via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> Dear PETSc users and developers, >>> >>> I am new to DMPLEX and had a query regarding setting up a consistent >>> domain decomposition of two meshes in PETSc. >>> I have a structured finite difference grid, managed through DMDA. I have >>> another unstructured finite element mesh managed through DMPLEX. Now all >>> the nodes in the unstructured finite element mesh also belong to the set of >>> nodes in the structured finite difference mesh (but not necessarily >>> vice-versa), and the number of nodes in DMPLEX mesh is less than the number >>> of nodes in DMDA mesh. How can I guarantee a consistent domain >>> decomposition of the two meshes? By consistent, I mean that if a process >>> has a set of nodes P from DMDA, and the same process has the set of nodes Q >>> from DMPLEX, then Q is a subset of P. >>> >> >> Okay, this is not hard. DMPlexDistribute() basically distributes >> according to a cell partition. You can use PetscPartitionerShell() to stick >> in whatever cell partition you want. You can see me doing this here: >> >> >> https://bitbucket.org/petsc/petsc/src/e2aefa968a094f48dc384fffc7d599a60aeeb591/src/dm/impls/plex/examples/tests/ex1.c#lines-261 >> >> Will that work for you? >> >> Thanks, >> >> Matt >> >> >>> I look forward to your response. 
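[Editors' note: following the ex1.c pattern Matt links to, a rough C sketch of forcing a user-chosen cell partition through PetscPartitionerShell(). It is only a sketch under stated assumptions: cellRank[] (the target rank of every cell of the not-yet-distributed mesh, typically all held on rank 0) must be computed by the caller so that each cell lands on the rank owning its vertices in the DMDA decomposition; the function name and argument list are illustrative.]

#include <petscdmplex.h>

/* Distribute a DMPLEX with a caller-chosen cell partition (sketch).
   cellRank[c] = target rank of local cell c before distribution; ranks that
   own no cells of the serial mesh can pass nCells = 0. */
static PetscErrorCode DistributeWithShellPartition(DM dm, PetscMPIInt commSize,
                                                   PetscInt nCells, const PetscInt cellRank[],
                                                   DM *dmDist)
{
  PetscPartitioner part;
  PetscInt        *sizes, *points, c, p, off = 0;
  PetscErrorCode   ierr;

  ierr = PetscCalloc2(commSize,&sizes,nCells,&points);CHKERRQ(ierr);
  for (c = 0; c < nCells; ++c) sizes[cellRank[c]]++;       /* #cells going to each rank */
  for (p = 0; p < commSize; ++p)                            /* cell numbers grouped by rank */
    for (c = 0; c < nCells; ++c)
      if (cellRank[c] == p) points[off++] = c;

  ierr = DMPlexGetPartitioner(dm,&part);CHKERRQ(ierr);
  ierr = PetscPartitionerSetType(part,PETSCPARTITIONERSHELL);CHKERRQ(ierr);
  ierr = PetscPartitionerShellSetPartition(part,commSize,sizes,points);CHKERRQ(ierr);
  ierr = DMPlexDistribute(dm,0,NULL,dmDist);CHKERRQ(ierr);  /* overlap 0, no SF returned */

  ierr = PetscFree2(sizes,points);CHKERRQ(ierr);
  return 0;
}

The hard part is building cellRank[]: one way is to look up, for each cell, which DMDA subdomain contains its vertices (for example via DMDAGetOwnershipRanges()) and send the cell there, which is what gives the vertex-nesting property Swarnava asks for.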
>>> >>> Sincerely, >>> Swarnava >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mvalera-w at sdsu.edu Fri Mar 22 18:52:24 2019 From: mvalera-w at sdsu.edu (Manuel Valera) Date: Fri, 22 Mar 2019 16:52:24 -0700 Subject: [petsc-users] MPI Communication times In-Reply-To: References: Message-ID: This is a 3D fluid dynamics code, it uses arakawa C type grids and curvilinear coordinates in nonhydrostatic navier stokes, we also add realistic stratification (Temperature / Density) and subgrid scale for turbulence. What we are solving here is just a seamount with a velocity forcing from one side and is just 5 pressure solvers or iterations. PETSc is used via the DMDAs to set up the grids and arrays and do (almost) every calculation in a distributed manner, the pressure solver is implicit and carried out with the KSP module. I/O is still serial. I am attaching the run outputs with the format 60mNP.txt with NP the number of processors used. These are large files you can read with tail -n 140 [filename] for the -log_view part Thanks for your help, On Fri, Mar 22, 2019 at 3:40 PM Zhang, Junchao wrote: > > On Fri, Mar 22, 2019 at 4:55 PM Manuel Valera wrote: > >> No, is the same problem running with different number of processors, i >> have data from 1 to 20 processors in increments of 20 processors/1 node, >> and additionally for 1 processor. >> > > That means you used strong scaling. If we combine VecScatterBegin/End, > from 20 cores, to 100, 200 cores, it took 2%, 13%, 18% of the execution > time respectively. It looks very unscalable. I do not know why. > VecScatterBegin took the same time with 100 and 200 cores. My explanation > is VecScatterBegin just packs data and then calls non-blocking MPI_Isend. > However, VecScatterEnd has to wait for data to come. > Could you tell us more about your problem, for example, is it 2D or 3D, > what is the communication pattern, how many neighbors each rank has. Also > attach the whole log files for -log_view so that we can know the problem > better. > Thanks. > >> >> On Fri, Mar 22, 2019 at 2:48 PM Zhang, Junchao >> wrote: >> >>> Did you change problem size with different runs? 
>>> >>> On Fri, Mar 22, 2019 at 4:09 PM Manuel Valera >>> wrote: >>> >>>> Hello, >>>> >>>> I repeated the timings with the -log_sync option and now i get for 200 >>>> processors / 20 nodes: >>>> >>>> >>>> ------------------------------------------------------------------------------------------------------------------------ >>>> Event Count Time (sec) Flop >>>> --- Global --- --- Stage --- Total >>>> Max Ratio Max Ratio Max Ratio >>>> Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >>>> >>>> ------------------------------------------------------------------------------------------------------------------------ >>>> >>>> VecScatterBarrie 3014 1.0 5.6771e+01 3.9 0.00e+00 0.0 *0.0e+00 >>>> 0.0e+00 0.0e+00 5 * 0 0 0 0 5 0 0 0 0 0 >>>> VecScatterBegin 3014 1.0 3.1684e+01 2.0 0.00e+00 0.0 *4.2e+06 >>>> 1.1e+06 2.8e+01 4* 0 63 56 0 4 0 63 56 0 0 >>>> VecScatterEnd 2976 1.0 1.1383e+02 1.8 0.00e+00 0.0 *0.0e+00 >>>> 0.0e+00 0.0e+00 14* 0 0 0 0 14 0 0 0 0 0 >>>> >>>> With 100 processors / 10 nodes: >>>> >>>> VecScatterBarrie 3010 1.0 7.4430e+01 5.0 0.00e+00 0.0 *0.0e+00 >>>> 0.0e+00 0.0e+00 7* 0 0 0 0 7 0 0 0 0 0 >>>> VecScatterBegin 3010 1.0 3.8504e+01 2.4 0.00e+00 0.0 *1.6e+06 >>>> 2.0e+06 2.8e+01 4 * 0 71 66 0 4 0 71 66 0 0 >>>> VecScatterEnd 2972 1.0 8.5158e+01 1.2 0.00e+00 0.0 *0.0e+00 >>>> 0.0e+00 0.0e+00 9 * 0 0 0 0 9 0 0 0 0 0 >>>> >>>> And with 20 processors / 1 node: >>>> >>>> VecScatterBarrie 2596 1.0 4.0614e+01 7.3 0.00e+00 0.0* 0.0e+00 >>>> 0.0e+00 0.0e+00 4 * 0 0 0 0 4 0 0 0 0 0 >>>> VecScatterBegin 2596 1.0 1.4970e+01 1.3 0.00e+00 0.0 *1.2e+05 >>>> 4.0e+06 3.0e+01 1 * 0 81 61 0 1 0 81 61 0 0 >>>> VecScatterEnd 2558 1.0 1.4903e+01 1.3 0.00e+00 0.0* 0.0e+00 >>>> 0.0e+00 0.0e+00 1 * 0 0 0 0 1 0 0 0 0 0 >>>> >>>> Can you help me interpret this? what i see is the End portion taking >>>> more relative time and Begin staying the same beyond one node, also Barrier >>>> and Begin counts are the same every time, but how do i estimate >>>> communication times from here? >>>> >>>> Thanks, >>>> >>>> >>>> On Wed, Mar 20, 2019 at 3:24 PM Zhang, Junchao >>>> wrote: >>>> >>>>> Forgot to mention long VecScatter time might also due to local memory >>>>> copies. If the communication pattern has large local to local (self to >>>>> self) scatter, which often happens thanks to locality, then the memory >>>>> copy time is counted in VecScatter. You can analyze your code's >>>>> communication pattern to see if it is the case. 
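[Editors' note: in a DMDA-based code like this one, the VecScatterBegin/End entries in these logs are dominated by ghost (halo) updates of the distributed arrays. A minimal sketch of that pattern follows; the grid sizes, stencil width and function name are placeholders, not taken from the model. It also shows where interior-only work can overlap with the communication, which is the overlap mentioned above.]

#include <petscdmda.h>

/* Minimal sketch of the halo exchange that shows up as VecScatterBegin/End in -log_view. */
static PetscErrorCode HaloExchangeExample(void)
{
  DM             da;
  Vec            g, l;   /* global (owned) and local (owned + ghost) vectors */
  PetscErrorCode ierr;

  ierr = DMDACreate3d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE,
                      DMDA_STENCIL_STAR, 64, 64, 64, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE,
                      1 /*dof*/, 1 /*stencil width*/, NULL, NULL, NULL, &da);CHKERRQ(ierr);
  ierr = DMSetUp(da);CHKERRQ(ierr);
  ierr = DMCreateGlobalVector(da,&g);CHKERRQ(ierr);
  ierr = DMCreateLocalVector(da,&l);CHKERRQ(ierr);

  /* Begin posts the non-blocking sends/receives and does the local copies ... */
  ierr = DMGlobalToLocalBegin(da,g,INSERT_VALUES,l);CHKERRQ(ierr);
  /* ... interior-only computation can overlap with communication here ... */
  /* End waits for the ghost data; this wait is what grows in VecScatterEnd. */
  ierr = DMGlobalToLocalEnd(da,g,INSERT_VALUES,l);CHKERRQ(ierr);

  ierr = VecDestroy(&l);CHKERRQ(ierr);
  ierr = VecDestroy(&g);CHKERRQ(ierr);
  ierr = DMDestroy(&da);CHKERRQ(ierr);
  return 0;
}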
>>>>> >>>>> --Junchao Zhang >>>>> >>>>> >>>>> On Wed, Mar 20, 2019 at 4:44 PM Zhang, Junchao via petsc-users < >>>>> petsc-users at mcs.anl.gov> wrote: >>>>> >>>>>> >>>>>> >>>>>> On Wed, Mar 20, 2019 at 4:18 PM Manuel Valera >>>>>> wrote: >>>>>> >>>>>>> Thanks for your answer, so for example i have a log for 200 cores >>>>>>> across 10 nodes that reads: >>>>>>> >>>>>>> >>>>>>> ------------------------------------------------------------------------------------------------------------------------ >>>>>>> Event Count Time (sec) Flop >>>>>>> --- Global --- --- Stage --- Total >>>>>>> Max Ratio Max Ratio Max >>>>>>> Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >>>>>>> >>>>>>> ---------------------------------------------------------------------------------------------------------------------- >>>>>>> VecScatterBegin 3014 1.0 4.5550e+01 2.6 0.00e+00 0.0 *4.2e+06 >>>>>>> 1.1e+06 2.8e+01 4* 0 63 56 0 4 0 63 56 0 0 >>>>>>> VecScatterEnd 2976 1.0 1.2143e+02 1.7 0.00e+00 0.0 *0.0e+00 >>>>>>> 0.0e+00 0.0e+00 14* 0 0 0 0 14 0 0 0 0 0 >>>>>>> >>>>>>> While for 20 nodes at one node i have: >>>>>>> >>>>>> What does that mean? >>>>>> >>>>>>> VecScatterBegin 2596 1.0 2.9142e+01 2.1 0.00e+00 0.0 *1.2e+05 >>>>>>> 4.0e+06 3.0e+01 2* 0 81 61 0 2 0 81 61 0 0 >>>>>>> VecScatterEnd 2558 1.0 8.0344e+01 7.9 0.00e+00 0.0 *0.0e+00 >>>>>>> 0.0e+00 0.0e+00 3* 0 0 0 0 3 0 0 0 0 0 >>>>>>> >>>>>>> Where do i see the max/min ratio in here? and why End step is all >>>>>>> 0.0e00 in both but still grows from 3% to 14% of total time? It seems i >>>>>>> would need to run again with the -log_sync option, is this correct? >>>>>>> >>>>>>> e.g., 2.1, 7.9. MPI send/recv are in VecScatterBegin(). >>>>>> VecScatterEnd() only does MPI_Wait. That is why it has zero messages. Yes, >>>>>> run with -log_sync and see what happens. >>>>>> >>>>>> >>>>>>> Different question, can't i estimate the total communication time if >>>>>>> i had a typical communication time per MPI message times the number of MPI >>>>>>> messages reported in the log? or it doesn't work like that? >>>>>>> >>>>>>> Probably not work because you have multiple processes doing >>>>>> send/recv at the same time. They might saturate the bandwidth. Petsc also >>>>>> does computation/communication overlapping. >>>>>> >>>>>> >>>>>>> Thanks. >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Wed, Mar 20, 2019 at 2:02 PM Zhang, Junchao >>>>>>> wrote: >>>>>>> >>>>>>>> See the "Mess AvgLen Reduct" number in each log stage. Mess is >>>>>>>> the total number of messages sent in an event over all processes. AvgLen >>>>>>>> is average message len. Reduct is the number of global reduction. >>>>>>>> Each event like VecScatterBegin/End has a maximal execution time >>>>>>>> over all processes, and a max/min ratio. %T is sum(execution time of the >>>>>>>> event on each process)/sum(execution time of the stage on each process). %T >>>>>>>> indicates how expensive the event is. It is a number you should pay >>>>>>>> attention to. >>>>>>>> If your code is imbalanced (i.e., with a big max/min ratio), then >>>>>>>> the performance number is skewed and becomes misleading because some >>>>>>>> processes are just waiting for others. Then, besides -log_view, you can add >>>>>>>> -log_sync, which adds an extra MPI_Barrier for each event to let them start >>>>>>>> at the same time. With that, it is easier to interpret the number. >>>>>>>> src/vec/vscat/examples/ex4.c is a tiny example for VecScatter >>>>>>>> logging. 
>>>>>>>> >>>>>>>> --Junchao Zhang >>>>>>>> >>>>>>>> >>>>>>>> On Wed, Mar 20, 2019 at 2:58 PM Manuel Valera via petsc-users < >>>>>>>> petsc-users at mcs.anl.gov> wrote: >>>>>>>> >>>>>>>>> Hello, >>>>>>>>> >>>>>>>>> I am working on timing my model, which we made MPI scalable using >>>>>>>>> petsc DMDAs, i want to know more about the output log and how to calculate >>>>>>>>> a total communication times for my runs, so far i see we have "MPI >>>>>>>>> Messages" and "MPI Messages Lengths" in the log, along VecScatterEnd and >>>>>>>>> VecScatterBegin reports. >>>>>>>>> >>>>>>>>> My question is, how do i interpret these number to get a rough >>>>>>>>> estimate on how much overhead we have just from MPI communications times in >>>>>>>>> my model runs? >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> >>>>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: timing60m-032219.zip Type: application/zip Size: 142854 bytes Desc: not available URL: From jczhang at mcs.anl.gov Sat Mar 23 12:19:49 2019 From: jczhang at mcs.anl.gov (Zhang, Junchao) Date: Sat, 23 Mar 2019 17:19:49 +0000 Subject: [petsc-users] MPI Communication times In-Reply-To: References: Message-ID: Before further looking into it, can you try these: * It seems you used petsc 3.9.4. Could you update to petsc master branch? We have an optimization (after 3.9.4) that is very useful for VecScatter on DMDA vectors. * To measure performance, you do not want that many printfs. * Only measure the parallel part of your program, i.e., skip the init and I/O part. You can use petsc stages, see src/vec/vscat/examples/ex4.c for an example * Since your grid is 3000 x 200 x 100, so can you measure with 60 and 240 processors? It is easy to do analysis with balanced partition. Thanks. --Junchao Zhang On Fri, Mar 22, 2019 at 6:53 PM Manuel Valera > wrote: This is a 3D fluid dynamics code, it uses arakawa C type grids and curvilinear coordinates in nonhydrostatic navier stokes, we also add realistic stratification (Temperature / Density) and subgrid scale for turbulence. What we are solving here is just a seamount with a velocity forcing from one side and is just 5 pressure solvers or iterations. PETSc is used via the DMDAs to set up the grids and arrays and do (almost) every calculation in a distributed manner, the pressure solver is implicit and carried out with the KSP module. I/O is still serial. I am attaching the run outputs with the format 60mNP.txt with NP the number of processors used. These are large files you can read with tail -n 140 [filename] for the -log_view part Thanks for your help, On Fri, Mar 22, 2019 at 3:40 PM Zhang, Junchao > wrote: On Fri, Mar 22, 2019 at 4:55 PM Manuel Valera > wrote: No, is the same problem running with different number of processors, i have data from 1 to 20 processors in increments of 20 processors/1 node, and additionally for 1 processor. That means you used strong scaling. If we combine VecScatterBegin/End, from 20 cores, to 100, 200 cores, it took 2%, 13%, 18% of the execution time respectively. It looks very unscalable. I do not know why. VecScatterBegin took the same time with 100 and 200 cores. My explanation is VecScatterBegin just packs data and then calls non-blocking MPI_Isend. However, VecScatterEnd has to wait for data to come. Could you tell us more about your problem, for example, is it 2D or 3D, what is the communication pattern, how many neighbors each rank has. 
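[Editors' note: the "petsc stages" suggested above for timing only the parallel part look roughly like the following sketch; the stage name and surrounding routine are illustrative, not the author's code.]

#include <petscsys.h>

/* Sketch: wrap only the solver phase in a log stage so -log_view reports it
   separately from initialization and serial I/O. */
static PetscErrorCode RunTimedSolves(void)
{
  PetscLogStage  solveStage;
  PetscErrorCode ierr;

  /* ... PetscInitialize, grid setup, reading input: outside the stage ... */

  ierr = PetscLogStageRegister("Pressure solve",&solveStage);CHKERRQ(ierr);
  ierr = PetscLogStagePush(solveStage);CHKERRQ(ierr);
  /* ... the pressure solves (KSPSolve) and the DMDA halo updates around them ... */
  ierr = PetscLogStagePop();CHKERRQ(ierr);

  /* run with:  -log_view              per-stage performance tables
                -log_view -log_sync    also separate load imbalance from transfer time */
  return 0;
}

With the stage in place, the -log_view table for that stage excludes initialization and serial I/O, so the VecScatter rows reflect only the solver phase.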
Also attach the whole log files for -log_view so that we can know the problem better. Thanks. On Fri, Mar 22, 2019 at 2:48 PM Zhang, Junchao > wrote: Did you change problem size with different runs? On Fri, Mar 22, 2019 at 4:09 PM Manuel Valera > wrote: Hello, I repeated the timings with the -log_sync option and now i get for 200 processors / 20 nodes: ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ VecScatterBarrie 3014 1.0 5.6771e+01 3.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 5 0 0 0 0 5 0 0 0 0 0 VecScatterBegin 3014 1.0 3.1684e+01 2.0 0.00e+00 0.0 4.2e+06 1.1e+06 2.8e+01 4 0 63 56 0 4 0 63 56 0 0 VecScatterEnd 2976 1.0 1.1383e+02 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 14 0 0 0 0 14 0 0 0 0 0 With 100 processors / 10 nodes: VecScatterBarrie 3010 1.0 7.4430e+01 5.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 7 0 0 0 0 7 0 0 0 0 0 VecScatterBegin 3010 1.0 3.8504e+01 2.4 0.00e+00 0.0 1.6e+06 2.0e+06 2.8e+01 4 0 71 66 0 4 0 71 66 0 0 VecScatterEnd 2972 1.0 8.5158e+01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 9 0 0 0 0 9 0 0 0 0 0 And with 20 processors / 1 node: VecScatterBarrie 2596 1.0 4.0614e+01 7.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 0 VecScatterBegin 2596 1.0 1.4970e+01 1.3 0.00e+00 0.0 1.2e+05 4.0e+06 3.0e+01 1 0 81 61 0 1 0 81 61 0 0 VecScatterEnd 2558 1.0 1.4903e+01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 Can you help me interpret this? what i see is the End portion taking more relative time and Begin staying the same beyond one node, also Barrier and Begin counts are the same every time, but how do i estimate communication times from here? Thanks, On Wed, Mar 20, 2019 at 3:24 PM Zhang, Junchao > wrote: Forgot to mention long VecScatter time might also due to local memory copies. If the communication pattern has large local to local (self to self) scatter, which often happens thanks to locality, then the memory copy time is counted in VecScatter. You can analyze your code's communication pattern to see if it is the case. --Junchao Zhang On Wed, Mar 20, 2019 at 4:44 PM Zhang, Junchao via petsc-users > wrote: On Wed, Mar 20, 2019 at 4:18 PM Manuel Valera > wrote: Thanks for your answer, so for example i have a log for 200 cores across 10 nodes that reads: ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ---------------------------------------------------------------------------------------------------------------------- VecScatterBegin 3014 1.0 4.5550e+01 2.6 0.00e+00 0.0 4.2e+06 1.1e+06 2.8e+01 4 0 63 56 0 4 0 63 56 0 0 VecScatterEnd 2976 1.0 1.2143e+02 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 14 0 0 0 0 14 0 0 0 0 0 While for 20 nodes at one node i have: What does that mean? VecScatterBegin 2596 1.0 2.9142e+01 2.1 0.00e+00 0.0 1.2e+05 4.0e+06 3.0e+01 2 0 81 61 0 2 0 81 61 0 0 VecScatterEnd 2558 1.0 8.0344e+01 7.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 Where do i see the max/min ratio in here? 
and why End step is all 0.0e00 in both but still grows from 3% to 14% of total time? It seems i would need to run again with the -log_sync option, is this correct? e.g., 2.1, 7.9. MPI send/recv are in VecScatterBegin(). VecScatterEnd() only does MPI_Wait. That is why it has zero messages. Yes, run with -log_sync and see what happens. Different question, can't i estimate the total communication time if i had a typical communication time per MPI message times the number of MPI messages reported in the log? or it doesn't work like that? Probably not work because you have multiple processes doing send/recv at the same time. They might saturate the bandwidth. Petsc also does computation/communication overlapping. Thanks. On Wed, Mar 20, 2019 at 2:02 PM Zhang, Junchao > wrote: See the "Mess AvgLen Reduct" number in each log stage. Mess is the total number of messages sent in an event over all processes. AvgLen is average message len. Reduct is the number of global reduction. Each event like VecScatterBegin/End has a maximal execution time over all processes, and a max/min ratio. %T is sum(execution time of the event on each process)/sum(execution time of the stage on each process). %T indicates how expensive the event is. It is a number you should pay attention to. If your code is imbalanced (i.e., with a big max/min ratio), then the performance number is skewed and becomes misleading because some processes are just waiting for others. Then, besides -log_view, you can add -log_sync, which adds an extra MPI_Barrier for each event to let them start at the same time. With that, it is easier to interpret the number. src/vec/vscat/examples/ex4.c is a tiny example for VecScatter logging. --Junchao Zhang On Wed, Mar 20, 2019 at 2:58 PM Manuel Valera via petsc-users > wrote: Hello, I am working on timing my model, which we made MPI scalable using petsc DMDAs, i want to know more about the output log and how to calculate a total communication times for my runs, so far i see we have "MPI Messages" and "MPI Messages Lengths" in the log, along VecScatterEnd and VecScatterBegin reports. My question is, how do i interpret these number to get a rough estimate on how much overhead we have just from MPI communications times in my model runs? Thanks, -------------- next part -------------- An HTML attachment was scrubbed... URL: From maahi.buet at gmail.com Sat Mar 23 13:12:21 2019 From: maahi.buet at gmail.com (Maahi Talukder) Date: Sat, 23 Mar 2019 14:12:21 -0400 Subject: [petsc-users] About Configuring PETSc In-Reply-To: References: Message-ID: I think I didn't build with PETSC-ARCH=arch-opt. So just to make sure, now I just run the command - ./configure PETSC_ARCH = arch-opt - and it will create the missing directory and I can switch between different architectures, right? Thanks, Maahi Talukder On Fri, Mar 22, 2019 at 3:12 PM Balay, Satish wrote: > Did you build PETSc with PETSC_ARCH=arch-opt? > > Btw - I used PETSC_ARCH=arch-debug to illustrate - but you already > have a build with PETSC_ARCH=arch-linux2-c-debug - so you should stick > to that. > > Satish > > On Fri, 22 Mar 2019, Maahi Talukder via petsc-users wrote: > > > Hi, > > > > I tried to run the command 'make PETSC_ARCH=arch-opt wholetest1' but it > > shows me the following error- > > > > > .................................................................................................................................................. 
> > [maahi at CB272PP-THINK1 Desktop]$ make PETSC_ARCH=arch-opt wholetest1 > > /home/maahi/petsc/lib/petsc/conf/rules:960: > > */home/maahi/petsc/arch-opt/lib/petsc/conf/petscrules: > > No such file or directory* > > make: *** No rule to make target > > '/home/maahi/petsc/arch-opt/lib/petsc/conf/petscrules'. Stop. > > > ........................................................................................................................................................ > > > > My .bashrc is the following - > > > > ..................................................................... > > > > PATH=$PATH:$HOME/.local/bin:$HOME/bin:/usr/lib64/openmpi/bin > > > > export PATH > > > > export PETSC_DIR=$HOME/petsc > > export PETSC_ARCH=arch-linux2-c-debug > > #export PETSC_ARCH=arch-debug > > > > > .......................................................................................... > > > > Could you please tell me what went wrong? > > > > Regards, > > Maahi Talukder > > > > > > On Thu, Mar 21, 2019 at 11:55 PM Maahi Talukder > > wrote: > > > > > Thank you so much for your reply. That clear things up! > > > > > > On Thu, Mar 21, 2019 at 10:43 PM Balay, Satish > wrote: > > > > > >> On Thu, 21 Mar 2019, Maahi Talukder via petsc-users wrote: > > >> > > >> > Thank you for your reply. > > >> > > > >> > So do I need to set the value of PETSC_ARCH as needed in .bashrc > as I > > >> did > > >> > in case of PETSC_DIR ? > > >> > > >> You can specify PETSC_ARCH as an option to make. You can have a > default > > >> value set in .bashrc - and change to a different value on command > line. > > >> > > >> For ex: in .bashrc > > >> > > >> export PETSC_ARCH=arch-debug > > >> > > >> Now if you want to build with debug libraries: > > >> > > >> make wholetest1 > > >> > > >> Now If you want to build with optimized libraries: > > >> > > >> make PETSC_ARCH=arch-opt wholetest1 > > >> > > >> > > >> > And by PETSC_ARCH=arch-opt, do you mean the > > >> > non-debugging mode? > > >> > > >> Yes. You can use whatever name you think is appropriate here. > > >> > > >> ./configure PETSC_ARCH=a-name-i-can-easily-associate-with-this-build > > >> [other configure options.] > > >> > > >> > > > >> > And I am using the following makefile with my code- > > >> > > > >> > CFLAGS = > > >> > FFLAGS =-I/home/maahi/petsc/include > > >> > -I/home/maahi/petsc/arch-linux2-c-debug/include -cpp -mcmodel=large > > >> > > >> Hm - you shouldn't be needing these options here. You should switch > your > > >> source files from .f to .F and .f90 to .F90 - and remove the above > FFLAGS > > >> > > >> Satish > > >> > > >> > CPPFLAGS = > > >> > FPPFLAGS = > > >> > > > >> > > > >> > include ${PETSC_DIR}/lib/petsc/conf/variables > > >> > include ${PETSC_DIR}/lib/petsc/conf/rules > > >> > > > >> > wholetest1: wholetest1.o > > >> > -${FLINKER} -o wholetest1 wholetest1.o ${PETSC_LIB} > > >> > ${RM} wholetest1.o > > >> > > > >> > So where do I add that PETSC_ARCH? > > >> > > > >> > Thanks, > > >> > Maahi Talukder > > >> > > > >> > On Thu, Mar 21, 2019 at 10:14 PM Balay, Satish > > >> wrote: > > >> > > > >> > > PETSc uses the concept of PETSC_ARCH to enable multiple in-place > > >> > > builds. > > >> > > > > >> > > So you can have a debug build with PETSC_ARCH=arch-debug, and a > > >> > > optimized build with PETSC_ARCH=arch-opt etc. > > >> > > > > >> > > And if you are using a petsc formatted makefile with your code - > you > > >> > > can switch between these builds by just switching PETSC_ARCH. 
> > >> > > > > >> > > Satish > > >> > > > > >> > > On Thu, 21 Mar 2019, Maahi Talukder via petsc-users wrote: > > >> > > > > >> > > > Dear All, > > >> > > > > > >> > > > Currently, I am running PETSc with debugging option. And it says > > >> that if > > >> > > I > > >> > > > run ./configure --with-debugging=no, the performance would be > > >> faster. My > > >> > > > question is: what would I do if I want to go back to debugging > > >> mode, and > > >> > > If > > >> > > > I configure it now with no debugging option, would it make any > > >> changes to > > >> > > > my current setting? > > >> > > > > > >> > > > Regards, > > >> > > > Maahi Talukder > > >> > > > > > >> > > > > >> > > > > >> > > > >> > > >> > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Sat Mar 23 13:29:14 2019 From: mfadams at lbl.gov (Mark Adams) Date: Sat, 23 Mar 2019 14:29:14 -0400 Subject: [petsc-users] About Configuring PETSc In-Reply-To: References: Message-ID: On Sat, Mar 23, 2019 at 2:13 PM Maahi Talukder via petsc-users < petsc-users at mcs.anl.gov> wrote: > I think I didn't build with PETSC-ARCH=arch-opt. > So just to make sure, now I just run the command - ./configure PETSC_ARCH > = arch-opt - and it will create the missing directory and I can switch > between different architectures, right? > I would not put any spaces around "=", and you can also just look at what PETSc defaulted to and use that directory. PETSc will pick a default architecture name and you should see an "arch..." directory in your PETSc directory. > > Thanks, > Maahi Talukder > > On Fri, Mar 22, 2019 at 3:12 PM Balay, Satish wrote: > >> Did you build PETSc with PETSC_ARCH=arch-opt? >> >> Btw - I used PETSC_ARCH=arch-debug to illustrate - but you already >> have a build with PETSC_ARCH=arch-linux2-c-debug - so you should stick >> to that. >> >> Satish >> >> On Fri, 22 Mar 2019, Maahi Talukder via petsc-users wrote: >> >> > Hi, >> > >> > I tried to run the command 'make PETSC_ARCH=arch-opt wholetest1' but it >> > shows me the following error- >> > >> > >> .................................................................................................................................................. >> > [maahi at CB272PP-THINK1 Desktop]$ make PETSC_ARCH=arch-opt wholetest1 >> > /home/maahi/petsc/lib/petsc/conf/rules:960: >> > */home/maahi/petsc/arch-opt/lib/petsc/conf/petscrules: >> > No such file or directory* >> > make: *** No rule to make target >> > '/home/maahi/petsc/arch-opt/lib/petsc/conf/petscrules'. Stop. >> > >> ........................................................................................................................................................ >> > >> > My .bashrc is the following - >> > >> > ..................................................................... >> > >> > PATH=$PATH:$HOME/.local/bin:$HOME/bin:/usr/lib64/openmpi/bin >> > >> > export PATH >> > >> > export PETSC_DIR=$HOME/petsc >> > export PETSC_ARCH=arch-linux2-c-debug >> > #export PETSC_ARCH=arch-debug >> > >> > >> .......................................................................................... >> > >> > Could you please tell me what went wrong? >> > >> > Regards, >> > Maahi Talukder >> > >> > >> > On Thu, Mar 21, 2019 at 11:55 PM Maahi Talukder >> > wrote: >> > >> > > Thank you so much for your reply. That clear things up! 
>> > > >> > > On Thu, Mar 21, 2019 at 10:43 PM Balay, Satish >> wrote: >> > > >> > >> On Thu, 21 Mar 2019, Maahi Talukder via petsc-users wrote: >> > >> >> > >> > Thank you for your reply. >> > >> > >> > >> > So do I need to set the value of PETSC_ARCH as needed in .bashrc >> as I >> > >> did >> > >> > in case of PETSC_DIR ? >> > >> >> > >> You can specify PETSC_ARCH as an option to make. You can have a >> default >> > >> value set in .bashrc - and change to a different value on command >> line. >> > >> >> > >> For ex: in .bashrc >> > >> >> > >> export PETSC_ARCH=arch-debug >> > >> >> > >> Now if you want to build with debug libraries: >> > >> >> > >> make wholetest1 >> > >> >> > >> Now If you want to build with optimized libraries: >> > >> >> > >> make PETSC_ARCH=arch-opt wholetest1 >> > >> >> > >> >> > >> > And by PETSC_ARCH=arch-opt, do you mean the >> > >> > non-debugging mode? >> > >> >> > >> Yes. You can use whatever name you think is appropriate here. >> > >> >> > >> ./configure PETSC_ARCH=a-name-i-can-easily-associate-with-this-build >> > >> [other configure options.] >> > >> >> > >> > >> > >> > And I am using the following makefile with my code- >> > >> > >> > >> > CFLAGS = >> > >> > FFLAGS =-I/home/maahi/petsc/include >> > >> > -I/home/maahi/petsc/arch-linux2-c-debug/include -cpp -mcmodel=large >> > >> >> > >> Hm - you shouldn't be needing these options here. You should switch >> your >> > >> source files from .f to .F and .f90 to .F90 - and remove the above >> FFLAGS >> > >> >> > >> Satish >> > >> >> > >> > CPPFLAGS = >> > >> > FPPFLAGS = >> > >> > >> > >> > >> > >> > include ${PETSC_DIR}/lib/petsc/conf/variables >> > >> > include ${PETSC_DIR}/lib/petsc/conf/rules >> > >> > >> > >> > wholetest1: wholetest1.o >> > >> > -${FLINKER} -o wholetest1 wholetest1.o ${PETSC_LIB} >> > >> > ${RM} wholetest1.o >> > >> > >> > >> > So where do I add that PETSC_ARCH? >> > >> > >> > >> > Thanks, >> > >> > Maahi Talukder >> > >> > >> > >> > On Thu, Mar 21, 2019 at 10:14 PM Balay, Satish >> > >> wrote: >> > >> > >> > >> > > PETSc uses the concept of PETSC_ARCH to enable multiple in-place >> > >> > > builds. >> > >> > > >> > >> > > So you can have a debug build with PETSC_ARCH=arch-debug, and a >> > >> > > optimized build with PETSC_ARCH=arch-opt etc. >> > >> > > >> > >> > > And if you are using a petsc formatted makefile with your code - >> you >> > >> > > can switch between these builds by just switching PETSC_ARCH. >> > >> > > >> > >> > > Satish >> > >> > > >> > >> > > On Thu, 21 Mar 2019, Maahi Talukder via petsc-users wrote: >> > >> > > >> > >> > > > Dear All, >> > >> > > > >> > >> > > > Currently, I am running PETSc with debugging option. And it >> says >> > >> that if >> > >> > > I >> > >> > > > run ./configure --with-debugging=no, the performance would be >> > >> faster. My >> > >> > > > question is: what would I do if I want to go back to debugging >> > >> mode, and >> > >> > > If >> > >> > > > I configure it now with no debugging option, would it make any >> > >> changes to >> > >> > > > my current setting? >> > >> > > > >> > >> > > > Regards, >> > >> > > > Maahi Talukder >> > >> > > > >> > >> > > >> > >> > > >> > >> > >> > >> >> > >> >> > >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mfadams at lbl.gov Sat Mar 23 13:55:43 2019 From: mfadams at lbl.gov (Mark Adams) Date: Sat, 23 Mar 2019 14:55:43 -0400 Subject: [petsc-users] About Configuring PETSc In-Reply-To: References: Message-ID: On Sat, Mar 23, 2019 at 2:51 PM Maahi Talukder wrote: > Thank you for your reply. > > In my PETSC directory, there is only one "arch..." directory called > arch-linux2-c-debug. And with that one, I can only run my code in debugging > mode. But I want to run them in non-debugging mode, so I was wondering what > to do. > So should I run the command ' ./configure --with-debugging=no'? > I use 0. I don't know if 'no' works. I would guess that PETSc would pick arch-linux2-c-opt, or something like that if you do not provide a PETSC_ARCH. > On Sat, Mar 23, 2019 at 2:29 PM Mark Adams wrote: > >> >> >> On Sat, Mar 23, 2019 at 2:13 PM Maahi Talukder via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> I think I didn't build with PETSC-ARCH=arch-opt. >>> So just to make sure, now I just run the command - ./configure >>> PETSC_ARCH = arch-opt - and it will create the missing directory and I can >>> switch between different architectures, right? >>> >> >> I would not put any spaces around "=", and you can also just look at what >> PETSc defaulted to and use that directory. PETSc will pick a default >> architecture name and you should see an "arch..." directory in your PETSc >> directory. >> >> >>> >>> Thanks, >>> Maahi Talukder >>> >>> On Fri, Mar 22, 2019 at 3:12 PM Balay, Satish wrote: >>> >>>> Did you build PETSc with PETSC_ARCH=arch-opt? >>>> >>>> Btw - I used PETSC_ARCH=arch-debug to illustrate - but you already >>>> have a build with PETSC_ARCH=arch-linux2-c-debug - so you should stick >>>> to that. >>>> >>>> Satish >>>> >>>> On Fri, 22 Mar 2019, Maahi Talukder via petsc-users wrote: >>>> >>>> > Hi, >>>> > >>>> > I tried to run the command 'make PETSC_ARCH=arch-opt wholetest1' but >>>> it >>>> > shows me the following error- >>>> > >>>> > >>>> .................................................................................................................................................. >>>> > [maahi at CB272PP-THINK1 Desktop]$ make PETSC_ARCH=arch-opt wholetest1 >>>> > /home/maahi/petsc/lib/petsc/conf/rules:960: >>>> > */home/maahi/petsc/arch-opt/lib/petsc/conf/petscrules: >>>> > No such file or directory* >>>> > make: *** No rule to make target >>>> > '/home/maahi/petsc/arch-opt/lib/petsc/conf/petscrules'. Stop. >>>> > >>>> ........................................................................................................................................................ >>>> > >>>> > My .bashrc is the following - >>>> > >>>> > ..................................................................... >>>> > >>>> > PATH=$PATH:$HOME/.local/bin:$HOME/bin:/usr/lib64/openmpi/bin >>>> > >>>> > export PATH >>>> > >>>> > export PETSC_DIR=$HOME/petsc >>>> > export PETSC_ARCH=arch-linux2-c-debug >>>> > #export PETSC_ARCH=arch-debug >>>> > >>>> > >>>> .......................................................................................... >>>> > >>>> > Could you please tell me what went wrong? >>>> > >>>> > Regards, >>>> > Maahi Talukder >>>> > >>>> > >>>> > On Thu, Mar 21, 2019 at 11:55 PM Maahi Talukder >>> > >>>> > wrote: >>>> > >>>> > > Thank you so much for your reply. That clear things up! 
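For completeness, piecing together Satish's and Mark's advice: the arch-opt build has to be configured and compiled once before your application makefile can use it. That means running ./configure PETSC_ARCH=arch-opt --with-debugging=0 [other configure options] in $PETSC_DIR, and then the "make PETSC_DIR=... PETSC_ARCH=arch-opt all" line that configure prints at the end. Only after that will "make PETSC_ARCH=arch-opt wholetest1" find arch-opt/lib/petsc/conf/petscrules; the earlier "No such file or directory" error simply means that arch directory was never created. (I am assuming here that you want the same configure options as your existing arch-linux2-c-debug build, just without debugging.)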
>>>> > > >>>> > > On Thu, Mar 21, 2019 at 10:43 PM Balay, Satish >>>> wrote: >>>> > > >>>> > >> On Thu, 21 Mar 2019, Maahi Talukder via petsc-users wrote: >>>> > >> >>>> > >> > Thank you for your reply. >>>> > >> > >>>> > >> > So do I need to set the value of PETSC_ARCH as needed in >>>> .bashrc as I >>>> > >> did >>>> > >> > in case of PETSC_DIR ? >>>> > >> >>>> > >> You can specify PETSC_ARCH as an option to make. You can have a >>>> default >>>> > >> value set in .bashrc - and change to a different value on command >>>> line. >>>> > >> >>>> > >> For ex: in .bashrc >>>> > >> >>>> > >> export PETSC_ARCH=arch-debug >>>> > >> >>>> > >> Now if you want to build with debug libraries: >>>> > >> >>>> > >> make wholetest1 >>>> > >> >>>> > >> Now If you want to build with optimized libraries: >>>> > >> >>>> > >> make PETSC_ARCH=arch-opt wholetest1 >>>> > >> >>>> > >> >>>> > >> > And by PETSC_ARCH=arch-opt, do you mean the >>>> > >> > non-debugging mode? >>>> > >> >>>> > >> Yes. You can use whatever name you think is appropriate here. >>>> > >> >>>> > >> ./configure >>>> PETSC_ARCH=a-name-i-can-easily-associate-with-this-build >>>> > >> [other configure options.] >>>> > >> >>>> > >> > >>>> > >> > And I am using the following makefile with my code- >>>> > >> > >>>> > >> > CFLAGS = >>>> > >> > FFLAGS =-I/home/maahi/petsc/include >>>> > >> > -I/home/maahi/petsc/arch-linux2-c-debug/include -cpp >>>> -mcmodel=large >>>> > >> >>>> > >> Hm - you shouldn't be needing these options here. You should >>>> switch your >>>> > >> source files from .f to .F and .f90 to .F90 - and remove the above >>>> FFLAGS >>>> > >> >>>> > >> Satish >>>> > >> >>>> > >> > CPPFLAGS = >>>> > >> > FPPFLAGS = >>>> > >> > >>>> > >> > >>>> > >> > include ${PETSC_DIR}/lib/petsc/conf/variables >>>> > >> > include ${PETSC_DIR}/lib/petsc/conf/rules >>>> > >> > >>>> > >> > wholetest1: wholetest1.o >>>> > >> > -${FLINKER} -o wholetest1 wholetest1.o ${PETSC_LIB} >>>> > >> > ${RM} wholetest1.o >>>> > >> > >>>> > >> > So where do I add that PETSC_ARCH? >>>> > >> > >>>> > >> > Thanks, >>>> > >> > Maahi Talukder >>>> > >> > >>>> > >> > On Thu, Mar 21, 2019 at 10:14 PM Balay, Satish < >>>> balay at mcs.anl.gov> >>>> > >> wrote: >>>> > >> > >>>> > >> > > PETSc uses the concept of PETSC_ARCH to enable multiple >>>> in-place >>>> > >> > > builds. >>>> > >> > > >>>> > >> > > So you can have a debug build with PETSC_ARCH=arch-debug, and a >>>> > >> > > optimized build with PETSC_ARCH=arch-opt etc. >>>> > >> > > >>>> > >> > > And if you are using a petsc formatted makefile with your code >>>> - you >>>> > >> > > can switch between these builds by just switching PETSC_ARCH. >>>> > >> > > >>>> > >> > > Satish >>>> > >> > > >>>> > >> > > On Thu, 21 Mar 2019, Maahi Talukder via petsc-users wrote: >>>> > >> > > >>>> > >> > > > Dear All, >>>> > >> > > > >>>> > >> > > > Currently, I am running PETSc with debugging option. And it >>>> says >>>> > >> that if >>>> > >> > > I >>>> > >> > > > run ./configure --with-debugging=no, the performance would be >>>> > >> faster. My >>>> > >> > > > question is: what would I do if I want to go back to >>>> debugging >>>> > >> mode, and >>>> > >> > > If >>>> > >> > > > I configure it now with no debugging option, would it make >>>> any >>>> > >> changes to >>>> > >> > > > my current setting? 
>>>> > >> > > > >>>> > >> > > > Regards, >>>> > >> > > > Maahi Talukder >>>> > >> > > > >>>> > >> > > >>>> > >> > > >>>> > >> > >>>> > >> >>>> > >> >>>> > >>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Sun Mar 24 03:27:22 2019 From: mfadams at lbl.gov (Mark Adams) Date: Sun, 24 Mar 2019 04:27:22 -0400 Subject: [petsc-users] [petsc-maint] PETSc bugs In-Reply-To: <000f01d4e1e5$f48e3db0$ddaab910$@com> References: <000a01d22637$6489b990$2d9d2cb0$@com> <3C14830F-F87B-4233-A9BF-08888968DCA2@mcs.anl.gov> <000501d2296e$ebc5dd00$c3519700$@com> <565C44D9-BC6B-4A75-B036-A509FCA0EB8A@mcs.anl.gov> <000001d22991$e484c970$ad8e5c50$@com> <14799612-6C57-416A-A7C0-4DDE344C9118@mcs.anl.gov> <000001d22f09$eb99c4e0$c2cd4ea0$@com> <001801d22fa2$cfdc3b90$6f94b2b0$@com> <000001d4dfee$43ce6fc0$cb6b4f40$@com> <000301d4e002$696aa460$3c3fed20$@com> <000001d4e19b$50cba1d0$f262e570$@com> <000601d4e1e1$bdffb4d0$39ff1e70$@com> <000f01d4e1e5$f48e3db0$ddaab910$@com> Message-ID: On Sat, Mar 23, 2019 at 10:04 PM Dian Han wrote: > Thanks, Matt. Is there any PETSc pre-conditioner able to handle ?0? > diagonal case? > (Matt told you to use svd.) Fast solvers exploit structure in your problem. If your problems are not arbitrary but have a source in some application then you should do a literature search and see what others have done from application like yours. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 14056 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image002.png Type: image/png Size: 18442 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image003.png Type: image/png Size: 55086 bytes Desc: not available URL: From fdkong.jd at gmail.com Sun Mar 24 17:36:53 2019 From: fdkong.jd at gmail.com (Fande Kong) Date: Sun, 24 Mar 2019 16:36:53 -0600 Subject: [petsc-users] How to use khash in PETSc? Message-ID: Hi All, Since PetscTable will be replaced by khash in the future somehow, it is better to use khash for new implementations. I was wondering where I can find some examples that use khash? Do we have any petsc wrappers of khash? Thanks, Fande, -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Mar 24 18:39:28 2019 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 24 Mar 2019 13:39:28 -1000 Subject: [petsc-users] How to use khash in PETSc? In-Reply-To: References: Message-ID: On Sun, Mar 24, 2019 at 12:38 PM Fande Kong via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi All, > > Since PetscTable will be replaced by khash in the future somehow, it is > better to use khash for new implementations. I was wondering where I can > find some examples that use khash? Do we have any petsc wrappers of khash? > First look here: https://bitbucket.org/petsc/petsc/src/master/include/petsc/private/hashseti.h and the other associated files (setij, mapi, mapij). Lisandro did a good job organizing these, and we might have already defined what you want. Thanks, Matt > Thanks, > > Fande, > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From dalcinl at gmail.com Mon Mar 25 04:37:07 2019 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Mon, 25 Mar 2019 12:37:07 +0300 Subject: [petsc-users] How to use khash in PETSc? In-Reply-To: References: Message-ID: On Mon, 25 Mar 2019 at 02:40, Matthew Knepley via petsc-users < petsc-users at mcs.anl.gov> wrote: > On Sun, Mar 24, 2019 at 12:38 PM Fande Kong via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Hi All, >> >> Since PetscTable will be replaced by khash in the future somehow, it is >> better to use khash for new implementations. I was wondering where I can >> find some examples that use khash? Do we have any petsc wrappers of khash? >> > > First look here: > > > https://bitbucket.org/petsc/petsc/src/master/include/petsc/private/hashseti.h > > > and the other associated files (setij, mapi, mapij). Lisandro did a good > job organizing > these, and we might have already defined what you want. > > It is even documented, which is extremely unusual coming from me!!! -- Lisandro Dalcin ============ Research Scientist Extreme Computing Research Center (ECRC) King Abdullah University of Science and Technology (KAUST) http://ecrc.kaust.edu.sa/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From eda.oktay at metu.edu.tr Mon Mar 25 05:37:38 2019 From: eda.oktay at metu.edu.tr (Eda Oktay) Date: Mon, 25 Mar 2019 13:37:38 +0300 Subject: [petsc-users] Argument out of range error in MatPermute Message-ID: Hello, I am trying to permute a vector A using following lines: ierr = ISCreateGeneral(PETSC_COMM_SELF,siz,idx,PETSC_COPY_VALUES,&is);CHKERRQ(ierr); ierr = ISSetPermutation(is);CHKERRQ(ierr); ierr = ISDuplicate(is,&newIS);CHKERRQ(ierr); ierr = MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE);CHKERRQ(ierr); ierr = MatPermute(A,is,newIS,&PL);CHKERRQ(ierr); However, in MatPermute line, I get the following error even if I used MatSetOption before this line: [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Argument out of range [0]PETSC ERROR: New nonzero at (0,485) caused a malloc Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.10.3, Dec, 18, 2018 [0]PETSC ERROR: ./DENEME_TEMIZ_ENYENI_FINAL on a arch-linux2-c-debug named 1232.wls.metu.edu.tr by edaoktay Mon Mar 25 12:15:14 2019 [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-cxx-dialect=C++11 --download-openblas --download-metis --download-parmetis --download-superlu_dist --download-slepc --download-mpich [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 579 in /home/edaoktay/petsc-3.10.3/src/mat/impls/aij/mpi/mpiaij.c [0]PETSC ERROR: #2 MatAssemblyEnd_MPIAIJ() line 807 in /home/edaoktay/petsc-3.10.3/src/mat/impls/aij/mpi/mpiaij.c [0]PETSC ERROR: #3 MatAssemblyEnd() line 5340 in /home/edaoktay/petsc-3.10.3/src/mat/interface/matrix.c [0]PETSC ERROR: #4 MatPermute_MPIAIJ() line 1723 in /home/edaoktay/petsc-3.10.3/src/mat/impls/aij/mpi/mpiaij.c [0]PETSC ERROR: #5 MatPermute() line 4997 in /home/edaoktay/petsc-3.10.3/src/mat/interface/matrix.c [0]PETSC ERROR: #6 main() line 285 in /home/edaoktay/petsc-3.10.3/arch-linux2-c-debug/share/slepc/examples/src/eda/DENEME_TEMIZ_ENYENI_FINAL.c [0]PETSC ERROR: PETSc Option Table entries: [0]PETSC ERROR: -f /home/edaoktay/petsc-3.10.3/share/petsc/datafiles/matrices/binary_files/airfoil1_binary [0]PETSC ERROR: -mat_partitioning_type parmetis [0]PETSC ERROR: -weighted [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- I'll be glad if you can help me. Thanks! Eda -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Mar 25 05:40:57 2019 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 25 Mar 2019 00:40:57 -1000 Subject: [petsc-users] Argument out of range error in MatPermute In-Reply-To: References: Message-ID: That should not happen. Can you send in a small example that we can debug. Thanks, Matt On Mon, Mar 25, 2019 at 12:38 AM Eda Oktay via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello, > > I am trying to permute a vector A using following lines: > > ierr = > ISCreateGeneral(PETSC_COMM_SELF,siz,idx,PETSC_COPY_VALUES,&is);CHKERRQ(ierr); > ierr = ISSetPermutation(is);CHKERRQ(ierr); > ierr = ISDuplicate(is,&newIS);CHKERRQ(ierr); > ierr = > MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE);CHKERRQ(ierr); > ierr = MatPermute(A,is,newIS,&PL);CHKERRQ(ierr); > > However, in MatPermute line, I get the following error even if I used > MatSetOption before this line: > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Argument out of range > [0]PETSC ERROR: New nonzero at (0,485) caused a malloc > Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn > off this check > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.10.3, Dec, 18, 2018 > [0]PETSC ERROR: ./DENEME_TEMIZ_ENYENI_FINAL on a arch-linux2-c-debug named > 1232.wls.metu.edu.tr by edaoktay Mon Mar 25 12:15:14 2019 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-cxx-dialect=C++11 --download-openblas > --download-metis --download-parmetis --download-superlu_dist > --download-slepc --download-mpich > [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 579 in > /home/edaoktay/petsc-3.10.3/src/mat/impls/aij/mpi/mpiaij.c > [0]PETSC ERROR: #2 MatAssemblyEnd_MPIAIJ() line 807 in > /home/edaoktay/petsc-3.10.3/src/mat/impls/aij/mpi/mpiaij.c > [0]PETSC ERROR: #3 MatAssemblyEnd() line 5340 in > /home/edaoktay/petsc-3.10.3/src/mat/interface/matrix.c > [0]PETSC ERROR: #4 MatPermute_MPIAIJ() line 1723 in > /home/edaoktay/petsc-3.10.3/src/mat/impls/aij/mpi/mpiaij.c > [0]PETSC ERROR: #5 MatPermute() line 4997 in > /home/edaoktay/petsc-3.10.3/src/mat/interface/matrix.c > [0]PETSC ERROR: #6 main() line 285 in > /home/edaoktay/petsc-3.10.3/arch-linux2-c-debug/share/slepc/examples/src/eda/DENEME_TEMIZ_ENYENI_FINAL.c > [0]PETSC ERROR: PETSc Option Table entries: > [0]PETSC ERROR: -f > /home/edaoktay/petsc-3.10.3/share/petsc/datafiles/matrices/binary_files/airfoil1_binary > [0]PETSC ERROR: -mat_partitioning_type parmetis > [0]PETSC ERROR: -weighted > [0]PETSC ERROR: ----------------End of Error Message -------send entire > error message to petsc-maint at mcs.anl.gov---------- > > I'll be glad if you can help me. > > Thanks! > > Eda > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From eda.oktay at metu.edu.tr Mon Mar 25 05:53:18 2019 From: eda.oktay at metu.edu.tr (Eda Oktay) Date: Mon, 25 Mar 2019 13:53:18 +0300 Subject: [petsc-users] Argument out of range error in MatPermute In-Reply-To: References: Message-ID: I attached whole program I wrote where the problem is in line 285. One of the matrices I used was airfoil1_binary, included in the folder. Also, I included makefile. Is that what you want? Matthew Knepley , 25 Mar 2019 Pzt, 13:41 tarihinde ?unu yazd?: > That should not happen. Can you send in a small example that we can debug. > > Thanks, > > Matt > > On Mon, Mar 25, 2019 at 12:38 AM Eda Oktay via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Hello, >> >> I am trying to permute a vector A using following lines: >> >> ierr = >> ISCreateGeneral(PETSC_COMM_SELF,siz,idx,PETSC_COPY_VALUES,&is);CHKERRQ(ierr); >> ierr = ISSetPermutation(is);CHKERRQ(ierr); >> ierr = ISDuplicate(is,&newIS);CHKERRQ(ierr); >> ierr = >> MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE);CHKERRQ(ierr); >> ierr = MatPermute(A,is,newIS,&PL);CHKERRQ(ierr); >> >> However, in MatPermute line, I get the following error even if I used >> MatSetOption before this line: >> >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [0]PETSC ERROR: Argument out of range >> [0]PETSC ERROR: New nonzero at (0,485) caused a malloc >> Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn >> off this check >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for trouble shooting. 
>> [0]PETSC ERROR: Petsc Release Version 3.10.3, Dec, 18, 2018 >> [0]PETSC ERROR: ./DENEME_TEMIZ_ENYENI_FINAL on a arch-linux2-c-debug >> named 1232.wls.metu.edu.tr by edaoktay Mon Mar 25 12:15:14 2019 >> [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ >> --with-fc=gfortran --with-cxx-dialect=C++11 --download-openblas >> --download-metis --download-parmetis --download-superlu_dist >> --download-slepc --download-mpich >> [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 579 in >> /home/edaoktay/petsc-3.10.3/src/mat/impls/aij/mpi/mpiaij.c >> [0]PETSC ERROR: #2 MatAssemblyEnd_MPIAIJ() line 807 in >> /home/edaoktay/petsc-3.10.3/src/mat/impls/aij/mpi/mpiaij.c >> [0]PETSC ERROR: #3 MatAssemblyEnd() line 5340 in >> /home/edaoktay/petsc-3.10.3/src/mat/interface/matrix.c >> [0]PETSC ERROR: #4 MatPermute_MPIAIJ() line 1723 in >> /home/edaoktay/petsc-3.10.3/src/mat/impls/aij/mpi/mpiaij.c >> [0]PETSC ERROR: #5 MatPermute() line 4997 in >> /home/edaoktay/petsc-3.10.3/src/mat/interface/matrix.c >> [0]PETSC ERROR: #6 main() line 285 in >> /home/edaoktay/petsc-3.10.3/arch-linux2-c-debug/share/slepc/examples/src/eda/DENEME_TEMIZ_ENYENI_FINAL.c >> [0]PETSC ERROR: PETSc Option Table entries: >> [0]PETSC ERROR: -f >> /home/edaoktay/petsc-3.10.3/share/petsc/datafiles/matrices/binary_files/airfoil1_binary >> [0]PETSC ERROR: -mat_partitioning_type parmetis >> [0]PETSC ERROR: -weighted >> [0]PETSC ERROR: ----------------End of Error Message -------send entire >> error message to petsc-maint at mcs.anl.gov---------- >> >> I'll be glad if you can help me. >> >> Thanks! >> >> Eda >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: example.zip Type: application/zip Size: 42504 bytes Desc: not available URL: From m.colera at upm.es Mon Mar 25 07:06:53 2019 From: m.colera at upm.es (Manuel Colera Rico) Date: Mon, 25 Mar 2019 13:06:53 +0100 Subject: [petsc-users] Solving block systems with some null diagonal blocks Message-ID: <8b48a306-74ec-674c-cefe-e19bbfd7679e@upm.es> Hello, I would like to solve a N*N block system (with N>2) in which some of the diagonal blocks are null. My system matrix is defined as a MatNest. As N>2, I can't use "pc_fieldsplit_type schur" nor "pc_fieldsplit_detect_saddle_point". The other algorithms ("additive", "multiplicative" and "symmetric_multiplicative") don't work either as they need each A_ii to be non-zero. Is there any built-in function in PETSc for this? If not, could you please suggest a workaround? Thanks and kind regards, Manuel --- From myriam.peyrounette at idris.fr Mon Mar 25 09:54:29 2019 From: myriam.peyrounette at idris.fr (Myriam Peyrounette) Date: Mon, 25 Mar 2019 15:54:29 +0100 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: References: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> <7b104336-a067-e679-23ec-2a89e0ba9bc4@idris.fr> <8925b24f-62dd-1e45-5658-968491e51205@idris.fr> <00d471e0-ed2a-51fa-3031-a6b63c3a96e1@idris.fr> Message-ID: <75a2f7b1-9e7a-843f-1a83-efff8e56f797@idris.fr> Hi, thanks for the explanations. I tried the last PETSc version (commit fbc5705bc518d02a4999f188aad4ccff5f754cbf), which includes the patch you talked about. 
But the memory scaling shows no improvement (see scaling attached), even when using the "scalable" options :( I had a look at the PETSc functions MatPtAPNumeric_MPIAIJ_MPIAIJ and MatPtAPSymbolic_MPIAIJ_MPIAIJ (especially at the differences before and after the first "bad" commit), but I can't find what induced this memory issue. Myriam Le 03/20/19 ? 17:38, Fande Kong a ?crit?: > Hi?Myriam, > > There are three algorithms in PETSc to do PtAP (?const char? ? ? ? ? > *algTypes[3] = {"scalable","nonscalable","hypre"};), and can be > specified using the petsc options:?-matptap_via xxxx. > > (1) -matptap_via hypre: This call the hypre package to do the PtAP > trough an all-at-once triple product. In our experiences, it is the > most memory efficient, but could be slow. > > (2)? -matptap_via scalable: This involves a row-wise algorithm plus an > outer product.? This will use more memory than hypre, but way faster. > This used to have a bug that could take all your memory, and I have a > fix > at?https://bitbucket.org/petsc/petsc/pull-requests/1452/mpiptap-enable-large-scale-simulations/diff.? > When using this option, we may want to have extra options such as? > ?-inner_offdiag_matmatmult_via scalable -inner_diag_matmatmult_via > scalable? to select inner scalable algorithms. > > (3)??-matptap_via nonscalable:? Suppose to be even faster, but use > more memory. It does dense matrix operations. > > > Thanks, > > Fande Kong > > > > > On Wed, Mar 20, 2019 at 10:06 AM Myriam Peyrounette via petsc-users > > wrote: > > More precisely: something happens when upgrading the functions > MatPtAPNumeric_MPIAIJ_MPIAIJ and/or MatPtAPSymbolic_MPIAIJ_MPIAIJ. > > Unfortunately, there are a lot of differences between the old and > new versions of these functions. I keep investigating but if you > have any idea, please let me know. > > Best, > > Myriam > > > Le 03/20/19 ? 13:48, Myriam Peyrounette a ?crit?: >> >> Hi all, >> >> I used git bisect to determine when the memory need increased. I >> found that the first "bad" commit is ? >> aa690a28a7284adb519c28cb44eae20a2c131c85. >> >> Barry was right, this commit seems to be about an evolution of >> MatPtAPSymbolic_MPIAIJ_MPIAIJ. You mentioned the option >> "-matptap_via scalable" but I can't find any information about >> it. Can you tell me more? >> >> Thanks >> >> Myriam >> >> >> Le 03/11/19 ? 14:40, Mark Adams a ?crit?: >>> Is there a difference in memory usage on your tiny problem? I >>> assume no. >>> >>> I don't see anything that could come from GAMG other than the >>> RAP stuff that you have discussed already. >>> >>> On Mon, Mar 11, 2019 at 9:32 AM Myriam Peyrounette >>> >> > wrote: >>> >>> The code I am using here is the example 42 of PETSc >>> (https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html). >>> Indeed it solves the Stokes equation. I thought it was a >>> good idea to use an example you might know (and didn't find >>> any that uses GAMG functions). I just changed the PCMG setup >>> so that the memory problem appears. And it appears when >>> adding PCGAMG. >>> >>> I don't care about the performance or even the result >>> rightness here, but only about the difference in memory use >>> between 3.6 and 3.10. Do you think finding a more adapted >>> script would help? >>> >>> I used the threshold of 0.1 only once, at the beginning, to >>> test its influence. I used the default threshold (of 0, I >>> guess) for all the other runs. >>> >>> Myriam >>> >>> >>> Le 03/11/19 ? 
13:52, Mark Adams a ?crit?: >>>> In looking at this larger scale run ... >>>> >>>> * Your eigen estimates are much lower than your tiny test >>>> problem.? But this is Stokes apparently and it should not >>>> work anyway. Maybe you have a small time step that adds a >>>> lot of mass that brings the eigen estimates down. And your >>>> min eigenvalue (not used) is positive. I would expect >>>> negative for Stokes ... >>>> >>>> * You seem to be setting a threshold value of 0.1 -- that >>>> is very high >>>> >>>> * v3.6 says "using nonzero initial guess" but this is not >>>> in v3.10. Maybe we just stopped printing that. >>>> >>>> * There were some changes to coasening parameters in going >>>> from v3.6 but it does not look like your problem was >>>> effected. (The coarsening algo is non-deterministic by >>>> default and you can see small difference on different runs) >>>> >>>> * We may have also added a "noisy" RHS for eigen estimates >>>> by default from v3.6. >>>> >>>> * And for non-symetric problems you can try >>>> -pc_gamg_agg_nsmooths 0, but again GAMG is not built for >>>> Stokes anyway. >>>> >>>> >>>> On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette >>>> >>> > wrote: >>>> >>>> I used PCView to display the size of the linear system >>>> in each level of the MG. You'll find the outputs >>>> attached to this mail (zip file) for both the default >>>> threshold value and a value of 0.1, and for both 3.6 >>>> and 3.10 PETSc versions. >>>> >>>> For convenience, I summarized the information in a >>>> graph, also attached (png file). >>>> >>>> As you can see, there are slight differences between >>>> the two versions but none is critical, in my opinion. >>>> Do you see anything suspicious in the outputs? >>>> >>>> + I can't find the default threshold value. Do you know >>>> where I can find it? >>>> >>>> Thanks for the follow-up >>>> >>>> Myriam >>>> >>>> >>>> Le 03/05/19 ? 14:06, Matthew Knepley a ?crit?: >>>>> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette >>>>> >>>> > wrote: >>>>> >>>>> Hi Matt, >>>>> >>>>> I plotted the memory scalings using different >>>>> threshold values. The two scalings are slightly >>>>> translated (from -22 to -88 mB) but this gain is >>>>> neglectable. The 3.6-scaling keeps being robust >>>>> while the 3.10-scaling deteriorates. >>>>> >>>>> Do you have any other suggestion? >>>>> >>>>> Mark, what is the option she can give to output all >>>>> the GAMG data? >>>>> >>>>> Also, run using -ksp_view. GAMG will report all the >>>>> sizes of its grids, so it should be easy to see >>>>> if the coarse grid sizes are increasing, and also what >>>>> the effect of the threshold value is. >>>>> >>>>> ? Thanks, >>>>> >>>>> ? ? ?Matt? >>>>> >>>>> Thanks >>>>> >>>>> Myriam >>>>> >>>>> Le 03/02/19 ? 02:27, Matthew Knepley a ?crit?: >>>>>> On Fri, Mar 1, 2019 at 10:53 AM Myriam >>>>>> Peyrounette via petsc-users >>>>>> >>>>> > wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> I used to run my code with PETSc 3.6. Since I >>>>>> upgraded the PETSc version >>>>>> to 3.10, this code has a bad memory scaling. >>>>>> >>>>>> To report this issue, I took the PETSc script >>>>>> ex42.c and slightly >>>>>> modified it so that the KSP and PC >>>>>> configurations are the same as in my >>>>>> code. In particular, I use a "personnalised" >>>>>> multi-grid method. The >>>>>> modifications are indicated by the keyword >>>>>> "TopBridge" in the attached >>>>>> scripts. 
>>>>>> >>>>>> To plot the memory (weak) scaling, I ran four >>>>>> calculations for each >>>>>> script with increasing problem sizes and >>>>>> computations cores: >>>>>> >>>>>> 1. 100,000 elts on 4 cores >>>>>> 2. 1 million elts on 40 cores >>>>>> 3. 10 millions elts on 400 cores >>>>>> 4. 100 millions elts on 4,000 cores >>>>>> >>>>>> The resulting graph is also attached. The >>>>>> scaling using PETSc 3.10 >>>>>> clearly deteriorates for large cases, while >>>>>> the one using PETSc 3.6 is >>>>>> robust. >>>>>> >>>>>> After a few tests, I found that the scaling >>>>>> is mostly sensitive to the >>>>>> use of the AMG method for the coarse grid >>>>>> (line 1780 in >>>>>> main_ex42_petsc36.cc). In particular, the >>>>>> performance strongly >>>>>> deteriorates when commenting lines 1777 to >>>>>> 1790 (in main_ex42_petsc36.cc). >>>>>> >>>>>> Do you have any idea of what changed between >>>>>> version 3.6 and version >>>>>> 3.10 that may imply such degradation? >>>>>> >>>>>> >>>>>> I believe the default values for PCGAMG changed >>>>>> between versions. It sounds like the coarsening rate >>>>>> is not great enough, so that these grids are too >>>>>> large. This can be set using: >>>>>> >>>>>> ??https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html >>>>>> >>>>>> There is some explanation of this effect on that >>>>>> page. Let us know if setting this does not >>>>>> correct the situation. >>>>>> >>>>>> ? Thanks, >>>>>> >>>>>> ? ? ?Matt >>>>>> ? >>>>>> >>>>>> Let me know if you need further information. >>>>>> >>>>>> Best, >>>>>> >>>>>> Myriam Peyrounette >>>>>> >>>>>> >>>>>> -- >>>>>> Myriam Peyrounette >>>>>> CNRS/IDRIS - HLST >>>>>> -- >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before >>>>>> they begin their experiments is infinitely more >>>>>> interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>> >>>>> -- >>>>> Myriam Peyrounette >>>>> CNRS/IDRIS - HLST >>>>> -- >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they >>>>> begin their experiments is infinitely more interesting >>>>> than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>> >>>> -- >>>> Myriam Peyrounette >>>> CNRS/IDRIS - HLST >>>> -- >>>> >>> >>> -- >>> Myriam Peyrounette >>> CNRS/IDRIS - HLST >>> -- >>> >> >> -- >> Myriam Peyrounette >> CNRS/IDRIS - HLST >> -- > > -- > Myriam Peyrounette > CNRS/IDRIS - HLST > -- > -- Myriam Peyrounette CNRS/IDRIS - HLST -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: mem_scaling_newPetscVersion.png Type: image/png Size: 20729 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: smime.p7s Type: application/pkcs7-signature Size: 2975 bytes Desc: Signature cryptographique S/MIME URL: From knepley at gmail.com Mon Mar 25 12:24:41 2019 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 25 Mar 2019 13:24:41 -0400 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: <75a2f7b1-9e7a-843f-1a83-efff8e56f797@idris.fr> References: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> <7b104336-a067-e679-23ec-2a89e0ba9bc4@idris.fr> <8925b24f-62dd-1e45-5658-968491e51205@idris.fr> <00d471e0-ed2a-51fa-3031-a6b63c3a96e1@idris.fr> <75a2f7b1-9e7a-843f-1a83-efff8e56f797@idris.fr> Message-ID: On Mon, Mar 25, 2019 at 10:54 AM Myriam Peyrounette via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi, > > thanks for the explanations. I tried the last PETSc version (commit > fbc5705bc518d02a4999f188aad4ccff5f754cbf), which includes the patch you > talked about. But the memory scaling shows no improvement (see scaling > attached), even when using the "scalable" options :( > > I had a look at the PETSc functions MatPtAPNumeric_MPIAIJ_MPIAIJ and > MatPtAPSymbolic_MPIAIJ_MPIAIJ (especially at the differences before and > after the first "bad" commit), but I can't find what induced this memory > issue. > > Are you sure that the option was used? It just looks suspicious to me that they use exactly the same amount of memory. It should be different, even if it does not solve the problem. Thanks, Matt > Myriam > > > > > Le 03/20/19 ? 17:38, Fande Kong a ?crit : > > Hi Myriam, > > There are three algorithms in PETSc to do PtAP ( const char > *algTypes[3] = {"scalable","nonscalable","hypre"};), and can be specified > using the petsc options: -matptap_via xxxx. > > (1) -matptap_via hypre: This call the hypre package to do the PtAP trough > an all-at-once triple product. In our experiences, it is the most memory > efficient, but could be slow. > > (2) -matptap_via scalable: This involves a row-wise algorithm plus an > outer product. This will use more memory than hypre, but way faster. This > used to have a bug that could take all your memory, and I have a fix at > https://bitbucket.org/petsc/petsc/pull-requests/1452/mpiptap-enable-large-scale-simulations/diff. > When using this option, we may want to have extra options such as > -inner_offdiag_matmatmult_via scalable -inner_diag_matmatmult_via > scalable to select inner scalable algorithms. > > (3) -matptap_via nonscalable: Suppose to be even faster, but use more > memory. It does dense matrix operations. > > > Thanks, > > Fande Kong > > > > > On Wed, Mar 20, 2019 at 10:06 AM Myriam Peyrounette via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> More precisely: something happens when upgrading the functions >> MatPtAPNumeric_MPIAIJ_MPIAIJ and/or MatPtAPSymbolic_MPIAIJ_MPIAIJ. >> >> Unfortunately, there are a lot of differences between the old and new >> versions of these functions. I keep investigating but if you have any idea, >> please let me know. >> >> Best, >> >> Myriam >> >> Le 03/20/19 ? 13:48, Myriam Peyrounette a ?crit : >> >> Hi all, >> >> I used git bisect to determine when the memory need increased. I found >> that the first "bad" commit is aa690a28a7284adb519c28cb44eae20a2c131c85. >> >> Barry was right, this commit seems to be about an evolution of MatPtAPSymbolic_MPIAIJ_MPIAIJ. >> You mentioned the option "-matptap_via scalable" but I can't find any >> information about it. Can you tell me more? >> >> Thanks >> >> Myriam >> >> >> Le 03/11/19 ? 
14:40, Mark Adams a ?crit : >> >> Is there a difference in memory usage on your tiny problem? I assume no. >> >> I don't see anything that could come from GAMG other than the RAP stuff >> that you have discussed already. >> >> On Mon, Mar 11, 2019 at 9:32 AM Myriam Peyrounette < >> myriam.peyrounette at idris.fr> wrote: >> >>> The code I am using here is the example 42 of PETSc ( >>> https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html). >>> Indeed it solves the Stokes equation. I thought it was a good idea to use >>> an example you might know (and didn't find any that uses GAMG functions). I >>> just changed the PCMG setup so that the memory problem appears. And it >>> appears when adding PCGAMG. >>> >>> I don't care about the performance or even the result rightness here, >>> but only about the difference in memory use between 3.6 and 3.10. Do you >>> think finding a more adapted script would help? >>> >>> I used the threshold of 0.1 only once, at the beginning, to test its >>> influence. I used the default threshold (of 0, I guess) for all the other >>> runs. >>> >>> Myriam >>> >>> Le 03/11/19 ? 13:52, Mark Adams a ?crit : >>> >>> In looking at this larger scale run ... >>> >>> * Your eigen estimates are much lower than your tiny test problem. But >>> this is Stokes apparently and it should not work anyway. Maybe you have a >>> small time step that adds a lot of mass that brings the eigen estimates >>> down. And your min eigenvalue (not used) is positive. I would expect >>> negative for Stokes ... >>> >>> * You seem to be setting a threshold value of 0.1 -- that is very high >>> >>> * v3.6 says "using nonzero initial guess" but this is not in v3.10. >>> Maybe we just stopped printing that. >>> >>> * There were some changes to coasening parameters in going from v3.6 but >>> it does not look like your problem was effected. (The coarsening algo is >>> non-deterministic by default and you can see small difference on different >>> runs) >>> >>> * We may have also added a "noisy" RHS for eigen estimates by default >>> from v3.6. >>> >>> * And for non-symetric problems you can try -pc_gamg_agg_nsmooths 0, but >>> again GAMG is not built for Stokes anyway. >>> >>> >>> On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette < >>> myriam.peyrounette at idris.fr> wrote: >>> >>>> I used PCView to display the size of the linear system in each level of >>>> the MG. You'll find the outputs attached to this mail (zip file) for both >>>> the default threshold value and a value of 0.1, and for both 3.6 and 3.10 >>>> PETSc versions. >>>> >>>> For convenience, I summarized the information in a graph, also attached >>>> (png file). >>>> >>>> As you can see, there are slight differences between the two versions >>>> but none is critical, in my opinion. Do you see anything suspicious in the >>>> outputs? >>>> >>>> + I can't find the default threshold value. Do you know where I can >>>> find it? >>>> >>>> Thanks for the follow-up >>>> >>>> Myriam >>>> >>>> Le 03/05/19 ? 14:06, Matthew Knepley a ?crit : >>>> >>>> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette < >>>> myriam.peyrounette at idris.fr> wrote: >>>> >>>>> Hi Matt, >>>>> >>>>> I plotted the memory scalings using different threshold values. The >>>>> two scalings are slightly translated (from -22 to -88 mB) but this gain is >>>>> neglectable. The 3.6-scaling keeps being robust while the 3.10-scaling >>>>> deteriorates. >>>>> >>>>> Do you have any other suggestion? 
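One way to experiment without touching the source, assuming these option names carry over to your build and keeping in mind that they may need the prefix of the coarse-grid PC (something like -mg_coarse_pc_gamg_threshold in a PCMG setup), is to rerun one of the larger cases with the coarsening threshold and the scalable PtAP algorithm selected at run time, e.g. -pc_gamg_threshold 0.05 -matptap_via scalable -inner_diag_matmatmult_via scalable -inner_offdiag_matmatmult_via scalable, together with -ksp_view -log_view -memory_view. -ksp_view prints the operator size on every GAMG level and -memory_view reports the high-water memory mark, so you can check both that the options were actually picked up and whether the coarse grids shrink. The 0.05 value is only an example; you would sweep it.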
>>>>> >>>> Mark, what is the option she can give to output all the GAMG data? >>>> >>>> Also, run using -ksp_view. GAMG will report all the sizes of its grids, >>>> so it should be easy to see >>>> if the coarse grid sizes are increasing, and also what the effect of >>>> the threshold value is. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>>> Thanks >>>>> Myriam >>>>> >>>>> Le 03/02/19 ? 02:27, Matthew Knepley a ?crit : >>>>> >>>>> On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via petsc-users < >>>>> petsc-users at mcs.anl.gov> wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> I used to run my code with PETSc 3.6. Since I upgraded the PETSc >>>>>> version >>>>>> to 3.10, this code has a bad memory scaling. >>>>>> >>>>>> To report this issue, I took the PETSc script ex42.c and slightly >>>>>> modified it so that the KSP and PC configurations are the same as in >>>>>> my >>>>>> code. In particular, I use a "personnalised" multi-grid method. The >>>>>> modifications are indicated by the keyword "TopBridge" in the attached >>>>>> scripts. >>>>>> >>>>>> To plot the memory (weak) scaling, I ran four calculations for each >>>>>> script with increasing problem sizes and computations cores: >>>>>> >>>>>> 1. 100,000 elts on 4 cores >>>>>> 2. 1 million elts on 40 cores >>>>>> 3. 10 millions elts on 400 cores >>>>>> 4. 100 millions elts on 4,000 cores >>>>>> >>>>>> The resulting graph is also attached. The scaling using PETSc 3.10 >>>>>> clearly deteriorates for large cases, while the one using PETSc 3.6 is >>>>>> robust. >>>>>> >>>>>> After a few tests, I found that the scaling is mostly sensitive to the >>>>>> use of the AMG method for the coarse grid (line 1780 in >>>>>> main_ex42_petsc36.cc). In particular, the performance strongly >>>>>> deteriorates when commenting lines 1777 to 1790 (in >>>>>> main_ex42_petsc36.cc). >>>>>> >>>>>> Do you have any idea of what changed between version 3.6 and version >>>>>> 3.10 that may imply such degradation? >>>>>> >>>>> >>>>> I believe the default values for PCGAMG changed between versions. It >>>>> sounds like the coarsening rate >>>>> is not great enough, so that these grids are too large. This can be >>>>> set using: >>>>> >>>>> >>>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html >>>>> >>>>> There is some explanation of this effect on that page. Let us know if >>>>> setting this does not correct the situation. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Let me know if you need further information. >>>>>> >>>>>> Best, >>>>>> >>>>>> Myriam Peyrounette >>>>>> >>>>>> >>>>>> -- >>>>>> Myriam Peyrounette >>>>>> CNRS/IDRIS - HLST >>>>>> -- >>>>>> >>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>>> >>>>> -- >>>>> Myriam Peyrounette >>>>> CNRS/IDRIS - HLST >>>>> -- >>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. 
>>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>>> >>>> -- >>>> Myriam Peyrounette >>>> CNRS/IDRIS - HLST >>>> -- >>>> >>>> >>> -- >>> Myriam Peyrounette >>> CNRS/IDRIS - HLST >>> -- >>> >>> >> -- >> Myriam Peyrounette >> CNRS/IDRIS - HLST >> -- >> >> >> -- >> Myriam Peyrounette >> CNRS/IDRIS - HLST >> -- >> >> > -- > Myriam Peyrounette > CNRS/IDRIS - HLST > -- > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Mar 25 12:27:44 2019 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 25 Mar 2019 13:27:44 -0400 Subject: [petsc-users] Solving block systems with some null diagonal blocks In-Reply-To: <8b48a306-74ec-674c-cefe-e19bbfd7679e@upm.es> References: <8b48a306-74ec-674c-cefe-e19bbfd7679e@upm.es> Message-ID: On Mon, Mar 25, 2019 at 8:07 AM Manuel Colera Rico via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello, > > I would like to solve a N*N block system (with N>2) in which some of the > diagonal blocks are null. My system matrix is defined as a MatNest. As > N>2, I can't use "pc_fieldsplit_type schur" nor > "pc_fieldsplit_detect_saddle_point". The other algorithms ("additive", > "multiplicative" and "symmetric_multiplicative") don't work either as > they need each A_ii to be non-zero. > > Is there any built-in function in PETSc for this? If not, could you > please suggest a workaround? > You can just shove all of the rows with nonzero diagonal in one field, and all with zero diagonal in another, and do Schur. This is what -pc_fieldsplit_detect_saddle_point does. However, you have to understand the Schur complement to solve it efficiently. More generally, you can recursively split the matrix, which is what I do for many multiphysics problems. Thanks, Matt > Thanks and kind regards, > > Manuel > > --- > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexlindsay239 at gmail.com Mon Mar 25 15:32:03 2019 From: alexlindsay239 at gmail.com (Alexander Lindsay) Date: Mon, 25 Mar 2019 14:32:03 -0600 Subject: [petsc-users] Problems about GMRES restart and Scaling In-Reply-To: References: Message-ID: We also see this behavior quite frequently in MOOSE applications that have physics that generate residuals of largely different scales. Like Matt said non-dimensionalization would help a lot. Without proper scaling for some of these types of problems, even when the GMRES iteration converges the non-linear step can be garbage for the reason that Barry stated; running with -ksp_monitor_true_residual for these troublesome cases would illustrate that we weren't converging the true residual. The failure of matrix-free under these conditions was one of the biggest motivators for us to add automatic differentiation to MOOSE. Sometimes biting the bullet and working on the Jacobian can be worth it. On Thu, Mar 21, 2019 at 9:02 AM Yingjie Wu via petsc-users < petsc-users at mcs.anl.gov> wrote: > Thanks for all the reply. > The model I simulated is a thermal model that contains multiple physical > fields(eg. 
temperature, pressure, velocity). In PDEs, these variables are > preceded by some physical parameters, which in turn are functions of these > variables(eg. density is a function of pressure and temperature.). Due to > the complexity of these physical parameter functions, we cannot explicitly > construct Jacobian matrices for this problem. So I use -snes_mf_operator. > > My preconditioner is to treat these physical parameters as constants. At > the beginning of each nonlinear step(SNES), the Jacobian matrix is updated > with the result of the previous nonlinear step output(the physical > parameters are updated). > > > After setting a large KSP restart step, about 60 KSP can converge(ksp_rtol > = 1.e-5). > > > I have a feeling that my initial values are too large to cause this > phenomenon. > > > Snes/ex19 is actually a lot like my example, setting up: -da_grid_x 200 > -da_grid_y 200 -snes_mf > > There will also be a residual rise in step 1290 of KSP > > But not all examples will produce this phenomenon. > > > Thanks, Yingjie > > Smith, Barry F. ?2019?3?21??? ??1:18??? > >> >> >> > On Mar 20, 2019, at 5:52 AM, Yingjie Wu via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> > >> > Dear PETSc developers: >> > Hi, >> > Recently, I used PETSc to solve a non-linear PDEs for thermodynamic >> problems. In the process of solving, I found the following two phenomena, >> hoping to get some help and suggestions. >> > >> > 1. Because my problem involves a lot of physical parameters, it needs >> to call a series of functions, and can not analytically construct Jacobian >> matrix, so I use - snes_mf_operator to solve it, and give an approximate >> Jacobian matrix as a preconditioner. Because of the large dimension of the >> problem and the magnitude difference of the physical variables involved, it >> is found that the linear step residuals will increase at each restart >> (default 30th linear step) . This problem can be solved by setting a large >> number of restart steps. I would like to ask the reasons for this >> phenomenon? What knowledge or articles should I learn if I want to find out >> this problem? >> >> I've seen this behavior. I think in your case it is likely the >> -snes_mf_operator is not really producing an "accurate enough" >> Jacobian-Vector product (and the "solution" being generated by GMRES may be >> garbage). Run with -ksp_monitor_true_residual >> >> If your residual function has if () statements in it or other very >> sharp changes (discontinuities) then it may not even have a true Jacobian >> at the locations it is being evaluated at. In the sense that the >> "Jacobian" you are applying via finite differences is not a linear operator >> and hence GMRES will fail on it. >> >> What are you using for a preconditioner? And roughly how many KSP >> iterations are being used. >> >> Barry >> >> > >> > >> > 2. In my problem model, there are many physical fields (variables are >> realized by finite difference method), and the magnitude of variables >> varies greatly. Is there any Scaling interface or function in Petsc? >> > >> > Thanks, >> > Yingjie >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From friedmud at gmail.com Mon Mar 25 15:43:16 2019 From: friedmud at gmail.com (Derek Gaston) Date: Mon, 25 Mar 2019 14:43:16 -0600 Subject: [petsc-users] Valgrind Issue With Ghosted Vectors In-Reply-To: <9AAE0599-D4B2-4FCF-ADFE-011417C5CAF6@gmail.com> References: <9AAE0599-D4B2-4FCF-ADFE-011417C5CAF6@gmail.com> Message-ID: Stefano: the stupidity was all mine and had nothing to do with PETSc. Valgrind helped me track down a memory corruption issue that ultimately was just about a bad input file to my code (and obviously not enough error checking for input files!). The issue is fixed. Now - I'd like to understand a bit more about what happened here on the PETSc side. Was this valgrind issue something that was known and you already had a fix for it - but it wasn't on maint yet? Or was it just that I was using too old of a version of PETSc so I didn't have the fix? Derek On Fri, Mar 22, 2019 at 4:29 AM Stefano Zampini wrote: > > > On Mar 21, 2019, at 7:59 PM, Derek Gaston wrote: > > It sounds like you already tracked this down... but for completeness here > is what track-origins gives: > > ==262923== Conditional jump or move depends on uninitialised value(s) > ==262923== at 0x73C6548: VecScatterMemcpyPlanCreate_Index (vscat.c:294) > ==262923== by 0x73DBD97: VecScatterMemcpyPlanCreate_PtoP > (vpscat_mpi1.c:312) > ==262923== by 0x73DE6AE: VecScatterCreateCommon_PtoS_MPI1 > (vpscat_mpi1.c:2328) > ==262923== by 0x73DFFEA: VecScatterCreateLocal_PtoS_MPI1 > (vpscat_mpi1.c:2202) > ==262923== by 0x73C7A51: VecScatterCreate_PtoS (vscat.c:608) > ==262923== by 0x73C9E8A: VecScatterSetUp_vectype_private (vscat.c:857) > ==262923== by 0x73CBE5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) > ==262923== by 0x7413D39: VecScatterSetUp (vscatfce.c:212) > ==262923== by 0x7412D73: VecScatterCreateWithData (vscreate.c:333) > ==262923== by 0x747A232: VecCreateGhostWithArray (pbvec.c:685) > ==262923== by 0x747A90D: VecCreateGhost (pbvec.c:741) > ==262923== by 0x5C7FFD6: libMesh::PetscVector::init(unsigned > long, unsigned long, std::vector long> > const&, bool, libMesh::ParallelType) (petsc_vector.h:752) > ==262923== Uninitialised value was created by a heap allocation > ==262923== at 0x402DDC6: memalign (vg_replace_malloc.c:899) > ==262923== by 0x7359702: PetscMallocAlign (mal.c:41) > ==262923== by 0x7359C70: PetscMallocA (mal.c:390) > ==262923== by 0x73DECF0: VecScatterCreateLocal_PtoS_MPI1 > (vpscat_mpi1.c:2061) > ==262923== by 0x73C7A51: VecScatterCreate_PtoS (vscat.c:608) > ==262923== by 0x73C9E8A: VecScatterSetUp_vectype_private (vscat.c:857) > ==262923== by 0x73CBE5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) > ==262923== by 0x7413D39: VecScatterSetUp (vscatfce.c:212) > ==262923== by 0x7412D73: VecScatterCreateWithData (vscreate.c:333) > ==262923== by 0x747A232: VecCreateGhostWithArray (pbvec.c:685) > ==262923== by 0x747A90D: VecCreateGhost (pbvec.c:741) > ==262923== by 0x5C7FFD6: libMesh::PetscVector::init(unsigned > long, unsigned long, std::vector long> > const&, bool, libMesh::ParallelType) (petsc_vector.h:752) > > > BTW: This turned out not to be my actual problem. My actual problem was > just some stupidity on my part... just a simple input parameter issue to my > code (should have had better error checking!). > > But: It sounds like my digging may have uncovered something real here... > so it wasn't completely useless :-) > > > Derek, > > I don?t understand. Is your problem fixed or not? 
Would be nice to > understand what was the ?some stupidity on your part?, and if it was still > leading to valid PETSc code or just to a broken setup. > In the first case, then we should investigate the valgrind error you > reported. > In the second case, this is not a PETSc issue. > > > Thanks for your help everyone! > > Derek > > > > On Thu, Mar 21, 2019 at 10:38 AM Stefano Zampini < > stefano.zampini at gmail.com> wrote: > >> >> >> Il giorno mer 20 mar 2019 alle ore 23:40 Derek Gaston via petsc-users < >> petsc-users at mcs.anl.gov> ha scritto: >> >>> Trying to track down some memory corruption I'm seeing on larger scale >>> runs (3.5B+ unknowns). >>> >> >> Uhm.... are you using 32bit indices? is it possible there's integer >> overflow somewhere? >> >> >> >>> Was able to run Valgrind on it... and I'm seeing quite a lot of >>> uninitialized value errors coming from ghost updating. Here are some of >>> the traces: >>> >>> ==87695== Conditional jump or move depends on uninitialised value(s) >>> ==87695== at 0x73236D3: PetscMallocAlign (mal.c:28) >>> ==87695== by 0x7323C70: PetscMallocA (mal.c:390) >>> ==87695== by 0x739048E: VecScatterMemcpyPlanCreate_Index (vscat.c:284) >>> ==87695== by 0x73A5D97: VecScatterMemcpyPlanCreate_PtoP >>> (vpscat_mpi1.c:312) >>> ==64730== by 0x7393E8A: VecScatterSetUp_vectype_private (vscat.c:857) >>> ==64730== by 0x7395E5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) >>> ==64730== by 0x73DDD39: VecScatterSetUp (vscatfce.c:212) >>> ==64730== by 0x73DCD73: VecScatterCreateWithData (vscreate.c:333) >>> ==64730== by 0x7444232: VecCreateGhostWithArray (pbvec.c:685) >>> ==64730== by 0x744490D: VecCreateGhost (pbvec.c:741) >>> >>> ==133582== Conditional jump or move depends on uninitialised value(s) >>> ==133582== at 0x4030384: memcpy@@GLIBC_2.14 >>> (vg_replace_strmem.c:1034) >>> ==133582== by 0x739E4F9: PetscMemcpy (petscsys.h:1649) >>> ==133582== by 0x739E4F9: VecScatterMemcpyPlanExecute_Pack >>> (vecscatterimpl.h:150) >>> ==133582== by 0x739E4F9: VecScatterBeginMPI1_1 (vpscat_mpi1.h:69) >>> ==133582== by 0x73DD964: VecScatterBegin (vscatfce.c:110) >>> ==133582== by 0x744E195: VecGhostUpdateBegin (commonmpvec.c:225) >>> >>> This is from a Git checkout of PETSc... the hash I branched from is: >>> 0e667e8fea4aa from December 23rd (updating would be really hard at this >>> point as I've completed 90% of my dissertation with this version... and >>> changing PETSc now would be pretty painful!). >>> >>> Any ideas? Is it possible it's in my code? Is it possible that there >>> are later PETSc commits that already fix this? >>> >>> Thanks for any help, >>> Derek >>> >>> >> >> -- >> Stefano >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Mon Mar 25 15:59:26 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Mon, 25 Mar 2019 20:59:26 +0000 Subject: [petsc-users] Valgrind Issue With Ghosted Vectors In-Reply-To: References: <9AAE0599-D4B2-4FCF-ADFE-011417C5CAF6@gmail.com> Message-ID: I see you are using " 0e667e8fea4aa from December 23rd" - which is old petsc 'master' snapshot. 1. After your fix for 'bad input file' - do you still get these valgrind messages? 2. You should be able to easily apply Stefano's potential fix to your snapshot [without upgrading to latest petsc]. git cherry-pick 0b85991cae8259fd283ce3f99b399b38f1dcd7b4 And then rebuild petsc and - rerun with valgrind - and see if the messages persist. 
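For context, the routines appearing in the valgrind stacks in this thread (VecCreateGhost, VecScatterSetUp, VecGhostUpdateBegin) are exercised by the standard ghosted-vector update pattern. Below is a generic sketch of that pattern; it is not Derek's or libMesh's code, the local size and ghost indices are invented, it assumes at least two MPI ranks, and error checking is omitted.

    #include <petscvec.h>

    int main(int argc,char **argv)
    {
      Vec         vg,vl;
      PetscMPIInt rank,size;
      PetscInt    nlocal = 6,ghosts[1];

      PetscInitialize(&argc,&argv,NULL,NULL);
      MPI_Comm_rank(PETSC_COMM_WORLD,&rank);
      MPI_Comm_size(PETSC_COMM_WORLD,&size);
      ghosts[0] = ((rank+1)%size)*nlocal;   /* ghost the first entry owned by the next rank */

      /* Global vector with extra local storage for one ghost entry per rank */
      VecCreateGhost(PETSC_COMM_WORLD,nlocal,PETSC_DECIDE,1,ghosts,&vg);
      VecSet(vg,(PetscScalar)(rank+1));

      /* Scatter owned values into the ghost slots on neighbouring ranks */
      VecGhostUpdateBegin(vg,INSERT_VALUES,SCATTER_FORWARD);
      VecGhostUpdateEnd(vg,INSERT_VALUES,SCATTER_FORWARD);

      /* The local form exposes owned plus ghost entries as one sequential vector */
      VecGhostGetLocalForm(vg,&vl);
      /* ... read vl with VecGetArrayRead()/VecRestoreArrayRead() ... */
      VecGhostRestoreLocalForm(vg,&vl);

      VecDestroy(&vg);
      PetscFinalize();
      return 0;
    }
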
Satish On Mon, 25 Mar 2019, Derek Gaston via petsc-users wrote: > Stefano: the stupidity was all mine and had nothing to do with PETSc. > Valgrind helped me track down a memory corruption issue that ultimately was > just about a bad input file to my code (and obviously not enough error > checking for input files!). > > The issue is fixed. > > Now - I'd like to understand a bit more about what happened here on the > PETSc side. Was this valgrind issue something that was known and you > already had a fix for it - but it wasn't on maint yet? Or was it just that > I was using too old of a version of PETSc so I didn't have the fix? > > Derek > > On Fri, Mar 22, 2019 at 4:29 AM Stefano Zampini > wrote: > > > > > > > On Mar 21, 2019, at 7:59 PM, Derek Gaston wrote: > > > > It sounds like you already tracked this down... but for completeness here > > is what track-origins gives: > > > > ==262923== Conditional jump or move depends on uninitialised value(s) > > ==262923== at 0x73C6548: VecScatterMemcpyPlanCreate_Index (vscat.c:294) > > ==262923== by 0x73DBD97: VecScatterMemcpyPlanCreate_PtoP > > (vpscat_mpi1.c:312) > > ==262923== by 0x73DE6AE: VecScatterCreateCommon_PtoS_MPI1 > > (vpscat_mpi1.c:2328) > > ==262923== by 0x73DFFEA: VecScatterCreateLocal_PtoS_MPI1 > > (vpscat_mpi1.c:2202) > > ==262923== by 0x73C7A51: VecScatterCreate_PtoS (vscat.c:608) > > ==262923== by 0x73C9E8A: VecScatterSetUp_vectype_private (vscat.c:857) > > ==262923== by 0x73CBE5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) > > ==262923== by 0x7413D39: VecScatterSetUp (vscatfce.c:212) > > ==262923== by 0x7412D73: VecScatterCreateWithData (vscreate.c:333) > > ==262923== by 0x747A232: VecCreateGhostWithArray (pbvec.c:685) > > ==262923== by 0x747A90D: VecCreateGhost (pbvec.c:741) > > ==262923== by 0x5C7FFD6: libMesh::PetscVector::init(unsigned > > long, unsigned long, std::vector > long> > const&, bool, libMesh::ParallelType) (petsc_vector.h:752) > > ==262923== Uninitialised value was created by a heap allocation > > ==262923== at 0x402DDC6: memalign (vg_replace_malloc.c:899) > > ==262923== by 0x7359702: PetscMallocAlign (mal.c:41) > > ==262923== by 0x7359C70: PetscMallocA (mal.c:390) > > ==262923== by 0x73DECF0: VecScatterCreateLocal_PtoS_MPI1 > > (vpscat_mpi1.c:2061) > > ==262923== by 0x73C7A51: VecScatterCreate_PtoS (vscat.c:608) > > ==262923== by 0x73C9E8A: VecScatterSetUp_vectype_private (vscat.c:857) > > ==262923== by 0x73CBE5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) > > ==262923== by 0x7413D39: VecScatterSetUp (vscatfce.c:212) > > ==262923== by 0x7412D73: VecScatterCreateWithData (vscreate.c:333) > > ==262923== by 0x747A232: VecCreateGhostWithArray (pbvec.c:685) > > ==262923== by 0x747A90D: VecCreateGhost (pbvec.c:741) > > ==262923== by 0x5C7FFD6: libMesh::PetscVector::init(unsigned > > long, unsigned long, std::vector > long> > const&, bool, libMesh::ParallelType) (petsc_vector.h:752) > > > > > > BTW: This turned out not to be my actual problem. My actual problem was > > just some stupidity on my part... just a simple input parameter issue to my > > code (should have had better error checking!). > > > > But: It sounds like my digging may have uncovered something real here... > > so it wasn't completely useless :-) > > > > > > Derek, > > > > I don?t understand. Is your problem fixed or not? Would be nice to > > understand what was the ?some stupidity on your part?, and if it was still > > leading to valid PETSc code or just to a broken setup. 
> > In the first case, then we should investigate the valgrind error you > > reported. > > In the second case, this is not a PETSc issue. > > > > > > Thanks for your help everyone! > > > > Derek > > > > > > > > On Thu, Mar 21, 2019 at 10:38 AM Stefano Zampini < > > stefano.zampini at gmail.com> wrote: > > > >> > >> > >> Il giorno mer 20 mar 2019 alle ore 23:40 Derek Gaston via petsc-users < > >> petsc-users at mcs.anl.gov> ha scritto: > >> > >>> Trying to track down some memory corruption I'm seeing on larger scale > >>> runs (3.5B+ unknowns). > >>> > >> > >> Uhm.... are you using 32bit indices? is it possible there's integer > >> overflow somewhere? > >> > >> > >> > >>> Was able to run Valgrind on it... and I'm seeing quite a lot of > >>> uninitialized value errors coming from ghost updating. Here are some of > >>> the traces: > >>> > >>> ==87695== Conditional jump or move depends on uninitialised value(s) > >>> ==87695== at 0x73236D3: PetscMallocAlign (mal.c:28) > >>> ==87695== by 0x7323C70: PetscMallocA (mal.c:390) > >>> ==87695== by 0x739048E: VecScatterMemcpyPlanCreate_Index (vscat.c:284) > >>> ==87695== by 0x73A5D97: VecScatterMemcpyPlanCreate_PtoP > >>> (vpscat_mpi1.c:312) > >>> ==64730== by 0x7393E8A: VecScatterSetUp_vectype_private (vscat.c:857) > >>> ==64730== by 0x7395E5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) > >>> ==64730== by 0x73DDD39: VecScatterSetUp (vscatfce.c:212) > >>> ==64730== by 0x73DCD73: VecScatterCreateWithData (vscreate.c:333) > >>> ==64730== by 0x7444232: VecCreateGhostWithArray (pbvec.c:685) > >>> ==64730== by 0x744490D: VecCreateGhost (pbvec.c:741) > >>> > >>> ==133582== Conditional jump or move depends on uninitialised value(s) > >>> ==133582== at 0x4030384: memcpy@@GLIBC_2.14 > >>> (vg_replace_strmem.c:1034) > >>> ==133582== by 0x739E4F9: PetscMemcpy (petscsys.h:1649) > >>> ==133582== by 0x739E4F9: VecScatterMemcpyPlanExecute_Pack > >>> (vecscatterimpl.h:150) > >>> ==133582== by 0x739E4F9: VecScatterBeginMPI1_1 (vpscat_mpi1.h:69) > >>> ==133582== by 0x73DD964: VecScatterBegin (vscatfce.c:110) > >>> ==133582== by 0x744E195: VecGhostUpdateBegin (commonmpvec.c:225) > >>> > >>> This is from a Git checkout of PETSc... the hash I branched from is: > >>> 0e667e8fea4aa from December 23rd (updating would be really hard at this > >>> point as I've completed 90% of my dissertation with this version... and > >>> changing PETSc now would be pretty painful!). > >>> > >>> Any ideas? Is it possible it's in my code? Is it possible that there > >>> are later PETSc commits that already fix this? > >>> > >>> Thanks for any help, > >>> Derek > >>> > >>> > >> > >> -- > >> Stefano > >> > > > > > From stefano.zampini at gmail.com Mon Mar 25 16:47:50 2019 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Mon, 25 Mar 2019 22:47:50 +0100 Subject: [petsc-users] Valgrind Issue With Ghosted Vectors In-Reply-To: References: <9AAE0599-D4B2-4FCF-ADFE-011417C5CAF6@gmail.com> Message-ID: Il giorno lun 25 mar 2019 alle ore 21:40 Derek Gaston ha scritto: > Stefano: the stupidity was all mine and had nothing to do with PETSc. > Valgrind helped me track down a memory corruption issue that ultimately was > just about a bad input file to my code (and obviously not enough error > checking for input files!). > > The issue is fixed. > > Now - I'd like to understand a bit more about what happened here on the > PETSc side. Was this valgrind issue something that was known and you > already had a fix for it - but it wasn't on maint yet? 
Or was it just that > I was using too old of a version of PETSc so I didn't have the fix? > None of this. My fix is for a separate code path (ADD_VALUES, and not INSERT_VALUES). I cannot tell what happened on your side without knowing the details. > > Derek > > On Fri, Mar 22, 2019 at 4:29 AM Stefano Zampini > wrote: > >> >> >> On Mar 21, 2019, at 7:59 PM, Derek Gaston wrote: >> >> It sounds like you already tracked this down... but for completeness here >> is what track-origins gives: >> >> ==262923== Conditional jump or move depends on uninitialised value(s) >> ==262923== at 0x73C6548: VecScatterMemcpyPlanCreate_Index (vscat.c:294) >> ==262923== by 0x73DBD97: VecScatterMemcpyPlanCreate_PtoP >> (vpscat_mpi1.c:312) >> ==262923== by 0x73DE6AE: VecScatterCreateCommon_PtoS_MPI1 >> (vpscat_mpi1.c:2328) >> ==262923== by 0x73DFFEA: VecScatterCreateLocal_PtoS_MPI1 >> (vpscat_mpi1.c:2202) >> ==262923== by 0x73C7A51: VecScatterCreate_PtoS (vscat.c:608) >> ==262923== by 0x73C9E8A: VecScatterSetUp_vectype_private (vscat.c:857) >> ==262923== by 0x73CBE5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) >> ==262923== by 0x7413D39: VecScatterSetUp (vscatfce.c:212) >> ==262923== by 0x7412D73: VecScatterCreateWithData (vscreate.c:333) >> ==262923== by 0x747A232: VecCreateGhostWithArray (pbvec.c:685) >> ==262923== by 0x747A90D: VecCreateGhost (pbvec.c:741) >> ==262923== by 0x5C7FFD6: libMesh::PetscVector::init(unsigned >> long, unsigned long, std::vector> long> > const&, bool, libMesh::ParallelType) (petsc_vector.h:752) >> ==262923== Uninitialised value was created by a heap allocation >> ==262923== at 0x402DDC6: memalign (vg_replace_malloc.c:899) >> ==262923== by 0x7359702: PetscMallocAlign (mal.c:41) >> ==262923== by 0x7359C70: PetscMallocA (mal.c:390) >> ==262923== by 0x73DECF0: VecScatterCreateLocal_PtoS_MPI1 >> (vpscat_mpi1.c:2061) >> ==262923== by 0x73C7A51: VecScatterCreate_PtoS (vscat.c:608) >> ==262923== by 0x73C9E8A: VecScatterSetUp_vectype_private (vscat.c:857) >> ==262923== by 0x73CBE5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) >> ==262923== by 0x7413D39: VecScatterSetUp (vscatfce.c:212) >> ==262923== by 0x7412D73: VecScatterCreateWithData (vscreate.c:333) >> ==262923== by 0x747A232: VecCreateGhostWithArray (pbvec.c:685) >> ==262923== by 0x747A90D: VecCreateGhost (pbvec.c:741) >> ==262923== by 0x5C7FFD6: libMesh::PetscVector::init(unsigned >> long, unsigned long, std::vector> long> > const&, bool, libMesh::ParallelType) (petsc_vector.h:752) >> >> >> BTW: This turned out not to be my actual problem. My actual problem was >> just some stupidity on my part... just a simple input parameter issue to my >> code (should have had better error checking!). >> >> But: It sounds like my digging may have uncovered something real here... >> so it wasn't completely useless :-) >> >> >> Derek, >> >> I don?t understand. Is your problem fixed or not? Would be nice to >> understand what was the ?some stupidity on your part?, and if it was still >> leading to valid PETSc code or just to a broken setup. >> In the first case, then we should investigate the valgrind error you >> reported. >> In the second case, this is not a PETSc issue. >> >> >> Thanks for your help everyone! 
>> >> Derek >> >> >> >> On Thu, Mar 21, 2019 at 10:38 AM Stefano Zampini < >> stefano.zampini at gmail.com> wrote: >> >>> >>> >>> Il giorno mer 20 mar 2019 alle ore 23:40 Derek Gaston via petsc-users < >>> petsc-users at mcs.anl.gov> ha scritto: >>> >>>> Trying to track down some memory corruption I'm seeing on larger scale >>>> runs (3.5B+ unknowns). >>>> >>> >>> Uhm.... are you using 32bit indices? is it possible there's integer >>> overflow somewhere? >>> >>> >>> >>>> Was able to run Valgrind on it... and I'm seeing quite a lot of >>>> uninitialized value errors coming from ghost updating. Here are some of >>>> the traces: >>>> >>>> ==87695== Conditional jump or move depends on uninitialised value(s) >>>> ==87695== at 0x73236D3: PetscMallocAlign (mal.c:28) >>>> ==87695== by 0x7323C70: PetscMallocA (mal.c:390) >>>> ==87695== by 0x739048E: VecScatterMemcpyPlanCreate_Index >>>> (vscat.c:284) >>>> ==87695== by 0x73A5D97: VecScatterMemcpyPlanCreate_PtoP >>>> (vpscat_mpi1.c:312) >>>> ==64730== by 0x7393E8A: VecScatterSetUp_vectype_private (vscat.c:857) >>>> ==64730== by 0x7395E5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543) >>>> ==64730== by 0x73DDD39: VecScatterSetUp (vscatfce.c:212) >>>> ==64730== by 0x73DCD73: VecScatterCreateWithData (vscreate.c:333) >>>> ==64730== by 0x7444232: VecCreateGhostWithArray (pbvec.c:685) >>>> ==64730== by 0x744490D: VecCreateGhost (pbvec.c:741) >>>> >>>> ==133582== Conditional jump or move depends on uninitialised value(s) >>>> ==133582== at 0x4030384: memcpy@@GLIBC_2.14 >>>> (vg_replace_strmem.c:1034) >>>> ==133582== by 0x739E4F9: PetscMemcpy (petscsys.h:1649) >>>> ==133582== by 0x739E4F9: VecScatterMemcpyPlanExecute_Pack >>>> (vecscatterimpl.h:150) >>>> ==133582== by 0x739E4F9: VecScatterBeginMPI1_1 (vpscat_mpi1.h:69) >>>> ==133582== by 0x73DD964: VecScatterBegin (vscatfce.c:110) >>>> ==133582== by 0x744E195: VecGhostUpdateBegin (commonmpvec.c:225) >>>> >>>> This is from a Git checkout of PETSc... the hash I branched from is: >>>> 0e667e8fea4aa from December 23rd (updating would be really hard at this >>>> point as I've completed 90% of my dissertation with this version... and >>>> changing PETSc now would be pretty painful!). >>>> >>>> Any ideas? Is it possible it's in my code? Is it possible that there >>>> are later PETSc commits that already fix this? >>>> >>>> Thanks for any help, >>>> Derek >>>> >>>> >>> >>> -- >>> Stefano >>> >> >> -- Stefano -------------- next part -------------- An HTML attachment was scrubbed... URL: From KJiao at slb.com Mon Mar 25 21:50:12 2019 From: KJiao at slb.com (Kun Jiao) Date: Tue, 26 Mar 2019 02:50:12 +0000 Subject: [petsc-users] error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 Message-ID: Hi Petsc Experts, Is MatCreateMPIAIJMKL retired in 3.10.4? I got this error with my code which works fine in 3.8.3 version. Regards, Kun Schlumberger-Private -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.colera at upm.es Tue Mar 26 04:23:25 2019 From: m.colera at upm.es (Manuel Colera Rico) Date: Tue, 26 Mar 2019 10:23:25 +0100 Subject: [petsc-users] Solving block systems with some null diagonal blocks In-Reply-To: References: <8b48a306-74ec-674c-cefe-e19bbfd7679e@upm.es> Message-ID: <551a1541-4160-73cb-4ccf-d32b0bc9058e@upm.es> OK, thank you Matt. 
Manuel --- On 3/25/19 6:27 PM, Matthew Knepley wrote: > On Mon, Mar 25, 2019 at 8:07 AM Manuel Colera Rico via petsc-users > > wrote: > > Hello, > > I would like to solve a N*N block system (with N>2) in which some > of the > diagonal blocks are null. My system matrix is defined as a > MatNest. As > N>2, I can't use "pc_fieldsplit_type schur" nor > "pc_fieldsplit_detect_saddle_point". The other algorithms > ("additive", > "multiplicative" and "symmetric_multiplicative") don't work either as > they need each A_ii to be non-zero. > > Is there any built-in function in PETSc for this? If not, could you > please suggest a workaround? > > > You can just shove all of the rows with nonzero diagonal in one field, > and all with zero diagonal in another, and do Schur. This is what > > ? -pc_fieldsplit_detect_saddle_point > > does. However, you have to understand the Schur complement to solve it > efficiently. More generally, you can recursively split the matrix, > which is what I do for many multiphysics problems. > > ? Thanks, > > ? ? Matt > > Thanks and kind regards, > > Manuel > > --- > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From myriam.peyrounette at idris.fr Tue Mar 26 04:52:37 2019 From: myriam.peyrounette at idris.fr (Myriam Peyrounette) Date: Tue, 26 Mar 2019 10:52:37 +0100 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: References: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> <7b104336-a067-e679-23ec-2a89e0ba9bc4@idris.fr> <8925b24f-62dd-1e45-5658-968491e51205@idris.fr> <00d471e0-ed2a-51fa-3031-a6b63c3a96e1@idris.fr> <75a2f7b1-9e7a-843f-1a83-efff8e56f797@idris.fr> Message-ID: How can I be sure they are indeed used? Can I print this information in some log file? Thanks in advance Myriam Le 03/25/19 ? 18:24, Matthew Knepley a ?crit?: > On Mon, Mar 25, 2019 at 10:54 AM Myriam Peyrounette via petsc-users > > wrote: > > Hi, > > thanks for the explanations. I tried the last PETSc version > (commit fbc5705bc518d02a4999f188aad4ccff5f754cbf), which includes > the patch you talked about. But the memory scaling shows no > improvement (see scaling attached), even when using the "scalable" > options :( > > I had a look at the PETSc functions MatPtAPNumeric_MPIAIJ_MPIAIJ > and MatPtAPSymbolic_MPIAIJ_MPIAIJ (especially at the differences > before and after the first "bad" commit), but I can't find what > induced this memory issue. > > Are you sure that the option was used? It just looks suspicious to me > that they use exactly the same amount of memory. It should be > different, even if it does not solve the problem. > > ? ?Thanks, > > ? ? ?Matt? > > Myriam > > > > > Le 03/20/19 ? 17:38, Fande Kong a ?crit?: >> Hi?Myriam, >> >> There are three algorithms in PETSc to do PtAP (?const char? ? ? >> ? ? *algTypes[3] = {"scalable","nonscalable","hypre"};), and can >> be specified using the petsc options:?-matptap_via xxxx. >> >> (1) -matptap_via hypre: This call the hypre package to do the >> PtAP trough an all-at-once triple product. In our experiences, it >> is the most memory efficient, but could be slow. >> >> (2)? -matptap_via scalable: This involves a row-wise algorithm >> plus an outer product.? This will use more memory than hypre, but >> way faster. 
This used to have a bug that could take all your >> memory, and I have a fix >> at?https://bitbucket.org/petsc/petsc/pull-requests/1452/mpiptap-enable-large-scale-simulations/diff.? >> When using this option, we may want to have extra options such >> as? ?-inner_offdiag_matmatmult_via scalable >> -inner_diag_matmatmult_via scalable? to select inner scalable >> algorithms. >> >> (3)??-matptap_via nonscalable:? Suppose to be even faster, but >> use more memory. It does dense matrix operations. >> >> >> Thanks, >> >> Fande Kong >> >> >> >> >> On Wed, Mar 20, 2019 at 10:06 AM Myriam Peyrounette via >> petsc-users > > wrote: >> >> More precisely: something happens when upgrading the >> functions MatPtAPNumeric_MPIAIJ_MPIAIJ and/or >> MatPtAPSymbolic_MPIAIJ_MPIAIJ. >> >> Unfortunately, there are a lot of differences between the old >> and new versions of these functions. I keep investigating but >> if you have any idea, please let me know. >> >> Best, >> >> Myriam >> >> >> Le 03/20/19 ? 13:48, Myriam Peyrounette a ?crit?: >>> >>> Hi all, >>> >>> I used git bisect to determine when the memory need >>> increased. I found that the first "bad" commit is ? >>> aa690a28a7284adb519c28cb44eae20a2c131c85. >>> >>> Barry was right, this commit seems to be about an evolution >>> of MatPtAPSymbolic_MPIAIJ_MPIAIJ. You mentioned the option >>> "-matptap_via scalable" but I can't find any information >>> about it. Can you tell me more? >>> >>> Thanks >>> >>> Myriam >>> >>> >>> Le 03/11/19 ? 14:40, Mark Adams a ?crit?: >>>> Is there a difference in memory usage on your tiny problem? >>>> I assume no. >>>> >>>> I don't see anything that could come from GAMG other than >>>> the RAP stuff that you have discussed already. >>>> >>>> On Mon, Mar 11, 2019 at 9:32 AM Myriam Peyrounette >>>> >>> > wrote: >>>> >>>> The code I am using here is the example 42 of PETSc >>>> (https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html). >>>> Indeed it solves the Stokes equation. I thought it was >>>> a good idea to use an example you might know (and >>>> didn't find any that uses GAMG functions). I just >>>> changed the PCMG setup so that the memory problem >>>> appears. And it appears when adding PCGAMG. >>>> >>>> I don't care about the performance or even the result >>>> rightness here, but only about the difference in memory >>>> use between 3.6 and 3.10. Do you think finding a more >>>> adapted script would help? >>>> >>>> I used the threshold of 0.1 only once, at the >>>> beginning, to test its influence. I used the default >>>> threshold (of 0, I guess) for all the other runs. >>>> >>>> Myriam >>>> >>>> >>>> Le 03/11/19 ? 13:52, Mark Adams a ?crit?: >>>>> In looking at this larger scale run ... >>>>> >>>>> * Your eigen estimates are much lower than your tiny >>>>> test problem.? But this is Stokes apparently and it >>>>> should not work anyway. Maybe you have a small time >>>>> step that adds a lot of mass that brings the eigen >>>>> estimates down. And your min eigenvalue (not used) is >>>>> positive. I would expect negative for Stokes ... >>>>> >>>>> * You seem to be setting a threshold value of 0.1 -- >>>>> that is very high >>>>> >>>>> * v3.6 says "using nonzero initial guess" but this is >>>>> not in v3.10. Maybe we just stopped printing that. >>>>> >>>>> * There were some changes to coasening parameters in >>>>> going from v3.6 but it does not look like your problem >>>>> was effected. 
(The coarsening algo is >>>>> non-deterministic by default and you can see small >>>>> difference on different runs) >>>>> >>>>> * We may have also added a "noisy" RHS for eigen >>>>> estimates by default from v3.6. >>>>> >>>>> * And for non-symetric problems you can try >>>>> -pc_gamg_agg_nsmooths 0, but again GAMG is not built >>>>> for Stokes anyway. >>>>> >>>>> >>>>> On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette >>>>> >>>> > wrote: >>>>> >>>>> I used PCView to display the size of the linear >>>>> system in each level of the MG. You'll find the >>>>> outputs attached to this mail (zip file) for both >>>>> the default threshold value and a value of 0.1, >>>>> and for both 3.6 and 3.10 PETSc versions. >>>>> >>>>> For convenience, I summarized the information in a >>>>> graph, also attached (png file). >>>>> >>>>> As you can see, there are slight differences >>>>> between the two versions but none is critical, in >>>>> my opinion. Do you see anything suspicious in the >>>>> outputs? >>>>> >>>>> + I can't find the default threshold value. Do you >>>>> know where I can find it? >>>>> >>>>> Thanks for the follow-up >>>>> >>>>> Myriam >>>>> >>>>> >>>>> Le 03/05/19 ? 14:06, Matthew Knepley a ?crit?: >>>>>> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette >>>>>> >>>>> > wrote: >>>>>> >>>>>> Hi Matt, >>>>>> >>>>>> I plotted the memory scalings using different >>>>>> threshold values. The two scalings are >>>>>> slightly translated (from -22 to -88 mB) but >>>>>> this gain is neglectable. The 3.6-scaling >>>>>> keeps being robust while the 3.10-scaling >>>>>> deteriorates. >>>>>> >>>>>> Do you have any other suggestion? >>>>>> >>>>>> Mark, what is the option she can give to output >>>>>> all the GAMG data? >>>>>> >>>>>> Also, run using -ksp_view. GAMG will report all >>>>>> the sizes of its grids, so it should be easy to see >>>>>> if the coarse grid sizes are increasing, and also >>>>>> what the effect of the threshold value is. >>>>>> >>>>>> ? Thanks, >>>>>> >>>>>> ? ? ?Matt? >>>>>> >>>>>> Thanks >>>>>> >>>>>> Myriam >>>>>> >>>>>> Le 03/02/19 ? 02:27, Matthew Knepley a ?crit?: >>>>>>> On Fri, Mar 1, 2019 at 10:53 AM Myriam >>>>>>> Peyrounette via petsc-users >>>>>>> >>>>>> > wrote: >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I used to run my code with PETSc 3.6. >>>>>>> Since I upgraded the PETSc version >>>>>>> to 3.10, this code has a bad memory scaling. >>>>>>> >>>>>>> To report this issue, I took the PETSc >>>>>>> script ex42.c and slightly >>>>>>> modified it so that the KSP and PC >>>>>>> configurations are the same as in my >>>>>>> code. In particular, I use a >>>>>>> "personnalised" multi-grid method. The >>>>>>> modifications are indicated by the >>>>>>> keyword "TopBridge" in the attached >>>>>>> scripts. >>>>>>> >>>>>>> To plot the memory (weak) scaling, I ran >>>>>>> four calculations for each >>>>>>> script with increasing problem sizes and >>>>>>> computations cores: >>>>>>> >>>>>>> 1. 100,000 elts on 4 cores >>>>>>> 2. 1 million elts on 40 cores >>>>>>> 3. 10 millions elts on 400 cores >>>>>>> 4. 100 millions elts on 4,000 cores >>>>>>> >>>>>>> The resulting graph is also attached. >>>>>>> The scaling using PETSc 3.10 >>>>>>> clearly deteriorates for large cases, >>>>>>> while the one using PETSc 3.6 is >>>>>>> robust. >>>>>>> >>>>>>> After a few tests, I found that the >>>>>>> scaling is mostly sensitive to the >>>>>>> use of the AMG method for the coarse >>>>>>> grid (line 1780 in >>>>>>> main_ex42_petsc36.cc). 
In particular, >>>>>>> the performance strongly >>>>>>> deteriorates when commenting lines 1777 >>>>>>> to 1790 (in main_ex42_petsc36.cc). >>>>>>> >>>>>>> Do you have any idea of what changed >>>>>>> between version 3.6 and version >>>>>>> 3.10 that may imply such degradation? >>>>>>> >>>>>>> >>>>>>> I believe the default values for PCGAMG >>>>>>> changed between versions. It sounds like the >>>>>>> coarsening rate >>>>>>> is not great enough, so that these grids are >>>>>>> too large. This can be set using: >>>>>>> >>>>>>> ??https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html >>>>>>> >>>>>>> There is some explanation of this effect on >>>>>>> that page. Let us know if setting this does >>>>>>> not correct the situation. >>>>>>> >>>>>>> ? Thanks, >>>>>>> >>>>>>> ? ? ?Matt >>>>>>> ? >>>>>>> >>>>>>> Let me know if you need further information. >>>>>>> >>>>>>> Best, >>>>>>> >>>>>>> Myriam Peyrounette >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Myriam Peyrounette >>>>>>> CNRS/IDRIS - HLST >>>>>>> -- >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted >>>>>>> before they begin their experiments is >>>>>>> infinitely more interesting than any results >>>>>>> to which their experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>> >>>>>> -- >>>>>> Myriam Peyrounette >>>>>> CNRS/IDRIS - HLST >>>>>> -- >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before >>>>>> they begin their experiments is infinitely more >>>>>> interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>> >>>>> -- >>>>> Myriam Peyrounette >>>>> CNRS/IDRIS - HLST >>>>> -- >>>>> >>>> >>>> -- >>>> Myriam Peyrounette >>>> CNRS/IDRIS - HLST >>>> -- >>>> >>> >>> -- >>> Myriam Peyrounette >>> CNRS/IDRIS - HLST >>> -- >> >> -- >> Myriam Peyrounette >> CNRS/IDRIS - HLST >> -- >> > > -- > Myriam Peyrounette > CNRS/IDRIS - HLST > -- > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -- Myriam Peyrounette CNRS/IDRIS - HLST -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2975 bytes Desc: Signature cryptographique S/MIME URL: From dave.mayhem23 at gmail.com Tue Mar 26 05:10:09 2019 From: dave.mayhem23 at gmail.com (Dave May) Date: Tue, 26 Mar 2019 10:10:09 +0000 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: References: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> <7b104336-a067-e679-23ec-2a89e0ba9bc4@idris.fr> <8925b24f-62dd-1e45-5658-968491e51205@idris.fr> <00d471e0-ed2a-51fa-3031-a6b63c3a96e1@idris.fr> <75a2f7b1-9e7a-843f-1a83-efff8e56f797@idris.fr> Message-ID: On Tue, 26 Mar 2019 at 09:52, Myriam Peyrounette via petsc-users < petsc-users at mcs.anl.gov> wrote: > How can I be sure they are indeed used? Can I print this information in > some log file? > Yes. Re-run the job with the command line option -options_left true This will report all options parsed, and importantly, will also indicate if any options were unused. Thanks Dave Thanks in advance > > Myriam > > Le 03/25/19 ? 
18:24, Matthew Knepley a ?crit : > > On Mon, Mar 25, 2019 at 10:54 AM Myriam Peyrounette via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Hi, >> >> thanks for the explanations. I tried the last PETSc version (commit >> fbc5705bc518d02a4999f188aad4ccff5f754cbf), which includes the patch you >> talked about. But the memory scaling shows no improvement (see scaling >> attached), even when using the "scalable" options :( >> >> I had a look at the PETSc functions MatPtAPNumeric_MPIAIJ_MPIAIJ and >> MatPtAPSymbolic_MPIAIJ_MPIAIJ (especially at the differences before and >> after the first "bad" commit), but I can't find what induced this memory >> issue. >> > Are you sure that the option was used? It just looks suspicious to me that > they use exactly the same amount of memory. It should be different, even if > it does not solve the problem. > > Thanks, > > Matt > >> Myriam >> >> >> >> >> Le 03/20/19 ? 17:38, Fande Kong a ?crit : >> >> Hi Myriam, >> >> There are three algorithms in PETSc to do PtAP ( const char >> *algTypes[3] = {"scalable","nonscalable","hypre"};), and can be specified >> using the petsc options: -matptap_via xxxx. >> >> (1) -matptap_via hypre: This call the hypre package to do the PtAP trough >> an all-at-once triple product. In our experiences, it is the most memory >> efficient, but could be slow. >> >> (2) -matptap_via scalable: This involves a row-wise algorithm plus an >> outer product. This will use more memory than hypre, but way faster. This >> used to have a bug that could take all your memory, and I have a fix at >> https://bitbucket.org/petsc/petsc/pull-requests/1452/mpiptap-enable-large-scale-simulations/diff. >> When using this option, we may want to have extra options such as >> -inner_offdiag_matmatmult_via scalable -inner_diag_matmatmult_via >> scalable to select inner scalable algorithms. >> >> (3) -matptap_via nonscalable: Suppose to be even faster, but use more >> memory. It does dense matrix operations. >> >> >> Thanks, >> >> Fande Kong >> >> >> >> >> On Wed, Mar 20, 2019 at 10:06 AM Myriam Peyrounette via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> More precisely: something happens when upgrading the functions >>> MatPtAPNumeric_MPIAIJ_MPIAIJ and/or MatPtAPSymbolic_MPIAIJ_MPIAIJ. >>> >>> Unfortunately, there are a lot of differences between the old and new >>> versions of these functions. I keep investigating but if you have any idea, >>> please let me know. >>> >>> Best, >>> >>> Myriam >>> >>> Le 03/20/19 ? 13:48, Myriam Peyrounette a ?crit : >>> >>> Hi all, >>> >>> I used git bisect to determine when the memory need increased. I found >>> that the first "bad" commit is aa690a28a7284adb519c28cb44eae20a2c131c85. >>> >>> Barry was right, this commit seems to be about an evolution of MatPtAPSymbolic_MPIAIJ_MPIAIJ. >>> You mentioned the option "-matptap_via scalable" but I can't find any >>> information about it. Can you tell me more? >>> >>> Thanks >>> >>> Myriam >>> >>> >>> Le 03/11/19 ? 14:40, Mark Adams a ?crit : >>> >>> Is there a difference in memory usage on your tiny problem? I assume no. >>> >>> I don't see anything that could come from GAMG other than the RAP stuff >>> that you have discussed already. >>> >>> On Mon, Mar 11, 2019 at 9:32 AM Myriam Peyrounette < >>> myriam.peyrounette at idris.fr> wrote: >>> >>>> The code I am using here is the example 42 of PETSc ( >>>> https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html). >>>> Indeed it solves the Stokes equation. 
I thought it was a good idea to use >>>> an example you might know (and didn't find any that uses GAMG functions). I >>>> just changed the PCMG setup so that the memory problem appears. And it >>>> appears when adding PCGAMG. >>>> >>>> I don't care about the performance or even the result rightness here, >>>> but only about the difference in memory use between 3.6 and 3.10. Do you >>>> think finding a more adapted script would help? >>>> >>>> I used the threshold of 0.1 only once, at the beginning, to test its >>>> influence. I used the default threshold (of 0, I guess) for all the other >>>> runs. >>>> >>>> Myriam >>>> >>>> Le 03/11/19 ? 13:52, Mark Adams a ?crit : >>>> >>>> In looking at this larger scale run ... >>>> >>>> * Your eigen estimates are much lower than your tiny test problem. But >>>> this is Stokes apparently and it should not work anyway. Maybe you have a >>>> small time step that adds a lot of mass that brings the eigen estimates >>>> down. And your min eigenvalue (not used) is positive. I would expect >>>> negative for Stokes ... >>>> >>>> * You seem to be setting a threshold value of 0.1 -- that is very high >>>> >>>> * v3.6 says "using nonzero initial guess" but this is not in v3.10. >>>> Maybe we just stopped printing that. >>>> >>>> * There were some changes to coasening parameters in going from v3.6 >>>> but it does not look like your problem was effected. (The coarsening algo >>>> is non-deterministic by default and you can see small difference on >>>> different runs) >>>> >>>> * We may have also added a "noisy" RHS for eigen estimates by default >>>> from v3.6. >>>> >>>> * And for non-symetric problems you can try -pc_gamg_agg_nsmooths 0, >>>> but again GAMG is not built for Stokes anyway. >>>> >>>> >>>> On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette < >>>> myriam.peyrounette at idris.fr> wrote: >>>> >>>>> I used PCView to display the size of the linear system in each level >>>>> of the MG. You'll find the outputs attached to this mail (zip file) for >>>>> both the default threshold value and a value of 0.1, and for both 3.6 and >>>>> 3.10 PETSc versions. >>>>> >>>>> For convenience, I summarized the information in a graph, also >>>>> attached (png file). >>>>> >>>>> As you can see, there are slight differences between the two versions >>>>> but none is critical, in my opinion. Do you see anything suspicious in the >>>>> outputs? >>>>> >>>>> + I can't find the default threshold value. Do you know where I can >>>>> find it? >>>>> >>>>> Thanks for the follow-up >>>>> >>>>> Myriam >>>>> >>>>> Le 03/05/19 ? 14:06, Matthew Knepley a ?crit : >>>>> >>>>> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette < >>>>> myriam.peyrounette at idris.fr> wrote: >>>>> >>>>>> Hi Matt, >>>>>> >>>>>> I plotted the memory scalings using different threshold values. The >>>>>> two scalings are slightly translated (from -22 to -88 mB) but this gain is >>>>>> neglectable. The 3.6-scaling keeps being robust while the 3.10-scaling >>>>>> deteriorates. >>>>>> >>>>>> Do you have any other suggestion? >>>>>> >>>>> Mark, what is the option she can give to output all the GAMG data? >>>>> >>>>> Also, run using -ksp_view. GAMG will report all the sizes of its >>>>> grids, so it should be easy to see >>>>> if the coarse grid sizes are increasing, and also what the effect of >>>>> the threshold value is. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>>> Thanks >>>>>> Myriam >>>>>> >>>>>> Le 03/02/19 ? 
02:27, Matthew Knepley a ?crit : >>>>>> >>>>>> On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via petsc-users < >>>>>> petsc-users at mcs.anl.gov> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I used to run my code with PETSc 3.6. Since I upgraded the PETSc >>>>>>> version >>>>>>> to 3.10, this code has a bad memory scaling. >>>>>>> >>>>>>> To report this issue, I took the PETSc script ex42.c and slightly >>>>>>> modified it so that the KSP and PC configurations are the same as in >>>>>>> my >>>>>>> code. In particular, I use a "personnalised" multi-grid method. The >>>>>>> modifications are indicated by the keyword "TopBridge" in the >>>>>>> attached >>>>>>> scripts. >>>>>>> >>>>>>> To plot the memory (weak) scaling, I ran four calculations for each >>>>>>> script with increasing problem sizes and computations cores: >>>>>>> >>>>>>> 1. 100,000 elts on 4 cores >>>>>>> 2. 1 million elts on 40 cores >>>>>>> 3. 10 millions elts on 400 cores >>>>>>> 4. 100 millions elts on 4,000 cores >>>>>>> >>>>>>> The resulting graph is also attached. The scaling using PETSc 3.10 >>>>>>> clearly deteriorates for large cases, while the one using PETSc 3.6 >>>>>>> is >>>>>>> robust. >>>>>>> >>>>>>> After a few tests, I found that the scaling is mostly sensitive to >>>>>>> the >>>>>>> use of the AMG method for the coarse grid (line 1780 in >>>>>>> main_ex42_petsc36.cc). In particular, the performance strongly >>>>>>> deteriorates when commenting lines 1777 to 1790 (in >>>>>>> main_ex42_petsc36.cc). >>>>>>> >>>>>>> Do you have any idea of what changed between version 3.6 and version >>>>>>> 3.10 that may imply such degradation? >>>>>>> >>>>>> >>>>>> I believe the default values for PCGAMG changed between versions. It >>>>>> sounds like the coarsening rate >>>>>> is not great enough, so that these grids are too large. This can be >>>>>> set using: >>>>>> >>>>>> >>>>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html >>>>>> >>>>>> There is some explanation of this effect on that page. Let us know if >>>>>> setting this does not correct the situation. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Let me know if you need further information. >>>>>>> >>>>>>> Best, >>>>>>> >>>>>>> Myriam Peyrounette >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Myriam Peyrounette >>>>>>> CNRS/IDRIS - HLST >>>>>>> -- >>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Myriam Peyrounette >>>>>> CNRS/IDRIS - HLST >>>>>> -- >>>>>> >>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>>> >>>>> -- >>>>> Myriam Peyrounette >>>>> CNRS/IDRIS - HLST >>>>> -- >>>>> >>>>> >>>> -- >>>> Myriam Peyrounette >>>> CNRS/IDRIS - HLST >>>> -- >>>> >>>> >>> -- >>> Myriam Peyrounette >>> CNRS/IDRIS - HLST >>> -- >>> >>> >>> -- >>> Myriam Peyrounette >>> CNRS/IDRIS - HLST >>> -- >>> >>> >> -- >> Myriam Peyrounette >> CNRS/IDRIS - HLST >> -- >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -- > Myriam Peyrounette > CNRS/IDRIS - HLST > -- > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From myriam.peyrounette at idris.fr Tue Mar 26 05:35:56 2019 From: myriam.peyrounette at idris.fr (Myriam Peyrounette) Date: Tue, 26 Mar 2019 11:35:56 +0100 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: References: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> <7b104336-a067-e679-23ec-2a89e0ba9bc4@idris.fr> <8925b24f-62dd-1e45-5658-968491e51205@idris.fr> <00d471e0-ed2a-51fa-3031-a6b63c3a96e1@idris.fr> <75a2f7b1-9e7a-843f-1a83-efff8e56f797@idris.fr> Message-ID: <788f0293-4a5e-bae3-4a8d-10d92d0a16af@idris.fr> Oh you were right, the three options are unsused (-matptap_via scalable, -inner_offdiag_matmatmult_via scalable and -inner_diag_matmatmult_via scalable). Does this mean I am not using the associated PtAP functions? Myriam Le 03/26/19 ? 11:10, Dave May a ?crit?: > > On Tue, 26 Mar 2019 at 09:52, Myriam Peyrounette via petsc-users > > wrote: > > How can I be sure they are indeed used? Can I print this > information in some log file? > > Yes. Re-run the job with the command line option > > -options_left true > > This will report all options parsed, and importantly, will also > indicate if any options were unused. > ? > > Thanks > Dave > > Thanks in advance > > Myriam > > > Le 03/25/19 ? 18:24, Matthew Knepley a ?crit?: >> On Mon, Mar 25, 2019 at 10:54 AM Myriam Peyrounette via >> petsc-users > > wrote: >> >> Hi, >> >> thanks for the explanations. I tried the last PETSc version >> (commit fbc5705bc518d02a4999f188aad4ccff5f754cbf), which >> includes the patch you talked about. But the memory scaling >> shows no improvement (see scaling attached), even when using >> the "scalable" options :( >> >> I had a look at the PETSc functions >> MatPtAPNumeric_MPIAIJ_MPIAIJ and >> MatPtAPSymbolic_MPIAIJ_MPIAIJ (especially at the differences >> before and after the first "bad" commit), but I can't find >> what induced this memory issue. >> >> Are you sure that the option was used? It just looks suspicious >> to me that they use exactly the same amount of memory. It should >> be different, even if it does not solve the problem. >> >> ? ?Thanks, >> >> ? ? ?Matt? >> >> Myriam >> >> >> >> >> Le 03/20/19 ? 17:38, Fande Kong a ?crit?: >>> Hi?Myriam, >>> >>> There are three algorithms in PETSc to do PtAP (?const char? >>> ? ? ? ? *algTypes[3] = {"scalable","nonscalable","hypre"};), >>> and can be specified using the petsc options:?-matptap_via xxxx. >>> >>> (1) -matptap_via hypre: This call the hypre package to do >>> the PtAP trough an all-at-once triple product. In our >>> experiences, it is the most memory efficient, but could be slow. >>> >>> (2)? -matptap_via scalable: This involves a row-wise >>> algorithm plus an outer product.? This will use more memory >>> than hypre, but way faster. This used to have a bug that >>> could take all your memory, and I have a fix >>> at?https://bitbucket.org/petsc/petsc/pull-requests/1452/mpiptap-enable-large-scale-simulations/diff.? >>> When using this option, we may want to have extra options >>> such as? ?-inner_offdiag_matmatmult_via scalable >>> -inner_diag_matmatmult_via scalable? to select inner >>> scalable algorithms. >>> >>> (3)??-matptap_via nonscalable:? Suppose to be even faster, >>> but use more memory. It does dense matrix operations. 
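To make the three choices listed above concrete: they are plain options-database entries, so they can be passed on the command line or inserted from code before the preconditioner is set up. A minimal sketch follows, assuming the matrices carry no options prefix (if they do, the prefix must be prepended, as discussed further down the thread):

    #include <petscsys.h>

    /* Select the memory-scalable PtAP algorithm before PCSetUp()/KSPSetUp() runs;
       equivalent to passing the same flags on the command line. */
    PetscErrorCode SelectScalablePtAP(void)
    {
      PetscErrorCode ierr;
      ierr = PetscOptionsSetValue(NULL,"-matptap_via","scalable");CHKERRQ(ierr);
      ierr = PetscOptionsSetValue(NULL,"-inner_offdiag_matmatmult_via","scalable");CHKERRQ(ierr);
      ierr = PetscOptionsSetValue(NULL,"-inner_diag_matmatmult_via","scalable");CHKERRQ(ierr);
      return 0;
    }

Running with -options_left true afterwards, as Dave suggests above, confirms whether these entries were actually consumed during setup.
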
>>> >>> >>> Thanks, >>> >>> Fande Kong >>> >>> >>> >>> >>> On Wed, Mar 20, 2019 at 10:06 AM Myriam Peyrounette via >>> petsc-users >> > wrote: >>> >>> More precisely: something happens when upgrading the >>> functions MatPtAPNumeric_MPIAIJ_MPIAIJ and/or >>> MatPtAPSymbolic_MPIAIJ_MPIAIJ. >>> >>> Unfortunately, there are a lot of differences between >>> the old and new versions of these functions. I keep >>> investigating but if you have any idea, please let me know. >>> >>> Best, >>> >>> Myriam >>> >>> >>> Le 03/20/19 ? 13:48, Myriam Peyrounette a ?crit?: >>>> >>>> Hi all, >>>> >>>> I used git bisect to determine when the memory need >>>> increased. I found that the first "bad" commit is ? >>>> aa690a28a7284adb519c28cb44eae20a2c131c85. >>>> >>>> Barry was right, this commit seems to be about an >>>> evolution of MatPtAPSymbolic_MPIAIJ_MPIAIJ. You >>>> mentioned the option "-matptap_via scalable" but I >>>> can't find any information about it. Can you tell me more? >>>> >>>> Thanks >>>> >>>> Myriam >>>> >>>> >>>> Le 03/11/19 ? 14:40, Mark Adams a ?crit?: >>>>> Is there a difference in memory usage on your tiny >>>>> problem? I assume no. >>>>> >>>>> I don't see anything that could come from GAMG other >>>>> than the RAP stuff that you have discussed already. >>>>> >>>>> On Mon, Mar 11, 2019 at 9:32 AM Myriam Peyrounette >>>>> >>>> > wrote: >>>>> >>>>> The code I am using here is the example 42 of >>>>> PETSc >>>>> (https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html). >>>>> Indeed it solves the Stokes equation. I thought it >>>>> was a good idea to use an example you might know >>>>> (and didn't find any that uses GAMG functions). I >>>>> just changed the PCMG setup so that the memory >>>>> problem appears. And it appears when adding PCGAMG. >>>>> >>>>> I don't care about the performance or even the >>>>> result rightness here, but only about the >>>>> difference in memory use between 3.6 and 3.10. Do >>>>> you think finding a more adapted script would help? >>>>> >>>>> I used the threshold of 0.1 only once, at the >>>>> beginning, to test its influence. I used the >>>>> default threshold (of 0, I guess) for all the >>>>> other runs. >>>>> >>>>> Myriam >>>>> >>>>> >>>>> Le 03/11/19 ? 13:52, Mark Adams a ?crit?: >>>>>> In looking at this larger scale run ... >>>>>> >>>>>> * Your eigen estimates are much lower than your >>>>>> tiny test problem.? But this is Stokes apparently >>>>>> and it should not work anyway. Maybe you have a >>>>>> small time step that adds a lot of mass that >>>>>> brings the eigen estimates down. And your min >>>>>> eigenvalue (not used) is positive. I would expect >>>>>> negative for Stokes ... >>>>>> >>>>>> * You seem to be setting a threshold value of 0.1 >>>>>> -- that is very high >>>>>> >>>>>> * v3.6 says "using nonzero initial guess" but >>>>>> this is not in v3.10. Maybe we just stopped >>>>>> printing that. >>>>>> >>>>>> * There were some changes to coasening parameters >>>>>> in going from v3.6 but it does not look like your >>>>>> problem was effected. (The coarsening algo is >>>>>> non-deterministic by default and you can see >>>>>> small difference on different runs) >>>>>> >>>>>> * We may have also added a "noisy" RHS for eigen >>>>>> estimates by default from v3.6. >>>>>> >>>>>> * And for non-symetric problems you can try >>>>>> -pc_gamg_agg_nsmooths 0, but again GAMG is not >>>>>> built for Stokes anyway. 
>>>>>> >>>>>> >>>>>> On Tue, Mar 5, 2019 at 11:53 AM Myriam >>>>>> Peyrounette >>>>> > wrote: >>>>>> >>>>>> I used PCView to display the size of the >>>>>> linear system in each level of the MG. You'll >>>>>> find the outputs attached to this mail (zip >>>>>> file) for both the default threshold value >>>>>> and a value of 0.1, and for both 3.6 and 3.10 >>>>>> PETSc versions. >>>>>> >>>>>> For convenience, I summarized the information >>>>>> in a graph, also attached (png file). >>>>>> >>>>>> As you can see, there are slight differences >>>>>> between the two versions but none is >>>>>> critical, in my opinion. Do you see anything >>>>>> suspicious in the outputs? >>>>>> >>>>>> + I can't find the default threshold value. >>>>>> Do you know where I can find it? >>>>>> >>>>>> Thanks for the follow-up >>>>>> >>>>>> Myriam >>>>>> >>>>>> >>>>>> Le 03/05/19 ? 14:06, Matthew Knepley a ?crit?: >>>>>>> On Tue, Mar 5, 2019 at 7:14 AM Myriam >>>>>>> Peyrounette >>>>>> > wrote: >>>>>>> >>>>>>> Hi Matt, >>>>>>> >>>>>>> I plotted the memory scalings using >>>>>>> different threshold values. The two >>>>>>> scalings are slightly translated (from >>>>>>> -22 to -88 mB) but this gain is >>>>>>> neglectable. The 3.6-scaling keeps being >>>>>>> robust while the 3.10-scaling deteriorates. >>>>>>> >>>>>>> Do you have any other suggestion? >>>>>>> >>>>>>> Mark, what is the option she can give to >>>>>>> output all the GAMG data? >>>>>>> >>>>>>> Also, run using -ksp_view. GAMG will report >>>>>>> all the sizes of its grids, so it should be >>>>>>> easy to see >>>>>>> if the coarse grid sizes are increasing, and >>>>>>> also what the effect of the threshold value is. >>>>>>> >>>>>>> ? Thanks, >>>>>>> >>>>>>> ? ? ?Matt? >>>>>>> >>>>>>> Thanks >>>>>>> >>>>>>> Myriam >>>>>>> >>>>>>> Le 03/02/19 ? 02:27, Matthew Knepley a >>>>>>> ?crit?: >>>>>>>> On Fri, Mar 1, 2019 at 10:53 AM Myriam >>>>>>>> Peyrounette via petsc-users >>>>>>>> >>>>>>> > wrote: >>>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> I used to run my code with PETSc >>>>>>>> 3.6. Since I upgraded the PETSc version >>>>>>>> to 3.10, this code has a bad memory >>>>>>>> scaling. >>>>>>>> >>>>>>>> To report this issue, I took the >>>>>>>> PETSc script ex42.c and slightly >>>>>>>> modified it so that the KSP and PC >>>>>>>> configurations are the same as in my >>>>>>>> code. In particular, I use a >>>>>>>> "personnalised" multi-grid method. The >>>>>>>> modifications are indicated by the >>>>>>>> keyword "TopBridge" in the attached >>>>>>>> scripts. >>>>>>>> >>>>>>>> To plot the memory (weak) scaling, >>>>>>>> I ran four calculations for each >>>>>>>> script with increasing problem >>>>>>>> sizes and computations cores: >>>>>>>> >>>>>>>> 1. 100,000 elts on 4 cores >>>>>>>> 2. 1 million elts on 40 cores >>>>>>>> 3. 10 millions elts on 400 cores >>>>>>>> 4. 100 millions elts on 4,000 cores >>>>>>>> >>>>>>>> The resulting graph is also >>>>>>>> attached. The scaling using PETSc 3.10 >>>>>>>> clearly deteriorates for large >>>>>>>> cases, while the one using PETSc 3.6 is >>>>>>>> robust. >>>>>>>> >>>>>>>> After a few tests, I found that the >>>>>>>> scaling is mostly sensitive to the >>>>>>>> use of the AMG method for the >>>>>>>> coarse grid (line 1780 in >>>>>>>> main_ex42_petsc36.cc). In >>>>>>>> particular, the performance strongly >>>>>>>> deteriorates when commenting lines >>>>>>>> 1777 to 1790 (in main_ex42_petsc36.cc). >>>>>>>> >>>>>>>> Do you have any idea of what >>>>>>>> changed between version 3.6 and version >>>>>>>> 3.10 that may imply such degradation? 
>>>>>>>> >>>>>>>> >>>>>>>> I believe the default values for PCGAMG >>>>>>>> changed between versions. It sounds >>>>>>>> like the coarsening rate >>>>>>>> is not great enough, so that these >>>>>>>> grids are too large. This can be set using: >>>>>>>> >>>>>>>> ??https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html >>>>>>>> >>>>>>>> There is some explanation of this >>>>>>>> effect on that page. Let us know if >>>>>>>> setting this does not correct the >>>>>>>> situation. >>>>>>>> >>>>>>>> ? Thanks, >>>>>>>> >>>>>>>> ? ? ?Matt >>>>>>>> ? >>>>>>>> >>>>>>>> Let me know if you need further >>>>>>>> information. >>>>>>>> >>>>>>>> Best, >>>>>>>> >>>>>>>> Myriam Peyrounette >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Myriam Peyrounette >>>>>>>> CNRS/IDRIS - HLST >>>>>>>> -- >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for >>>>>>>> granted before they begin their >>>>>>>> experiments is infinitely more >>>>>>>> interesting than any results to which >>>>>>>> their experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Myriam Peyrounette >>>>>>> CNRS/IDRIS - HLST >>>>>>> -- >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted >>>>>>> before they begin their experiments is >>>>>>> infinitely more interesting than any results >>>>>>> to which their experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>> >>>>>> -- >>>>>> Myriam Peyrounette >>>>>> CNRS/IDRIS - HLST >>>>>> -- >>>>>> >>>>> >>>>> -- >>>>> Myriam Peyrounette >>>>> CNRS/IDRIS - HLST >>>>> -- >>>>> >>>> >>>> -- >>>> Myriam Peyrounette >>>> CNRS/IDRIS - HLST >>>> -- >>> >>> -- >>> Myriam Peyrounette >>> CNRS/IDRIS - HLST >>> -- >>> >> >> -- >> Myriam Peyrounette >> CNRS/IDRIS - HLST >> -- >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> > > -- > Myriam Peyrounette > CNRS/IDRIS - HLST > -- > -- Myriam Peyrounette CNRS/IDRIS - HLST -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2975 bytes Desc: Signature cryptographique S/MIME URL: From dave.mayhem23 at gmail.com Tue Mar 26 05:55:40 2019 From: dave.mayhem23 at gmail.com (Dave May) Date: Tue, 26 Mar 2019 10:55:40 +0000 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: <788f0293-4a5e-bae3-4a8d-10d92d0a16af@idris.fr> References: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> <7b104336-a067-e679-23ec-2a89e0ba9bc4@idris.fr> <8925b24f-62dd-1e45-5658-968491e51205@idris.fr> <00d471e0-ed2a-51fa-3031-a6b63c3a96e1@idris.fr> <75a2f7b1-9e7a-843f-1a83-efff8e56f797@idris.fr> <788f0293-4a5e-bae3-4a8d-10d92d0a16af@idris.fr> Message-ID: On Tue, 26 Mar 2019 at 10:36, Myriam Peyrounette < myriam.peyrounette at idris.fr> wrote: > Oh you were right, the three options are unsused (-matptap_via scalable, > -inner_offdiag_matmatmult_via scalable and -inner_diag_matmatmult_via > scalable). Does this mean I am not using the associated PtAP functions? > No - not necessarily. All it means is the options were not parsed. If your matrices have an option prefix associated with them (e.g. 
abc) , then you need to provide the option as -abc_matptap_via scalable If you are not sure if you matrices have a prefix, look at the result of -ksp_view (see below for an example) Mat Object: 2 MPI processes type: mpiaij rows=363, cols=363, bs=3 total: nonzeros=8649, allocated nonzeros=8649 total number of mallocs used during MatSetValues calls =0 Mat Object: (B_) 2 MPI processes type: mpiaij rows=363, cols=363, bs=3 total: nonzeros=8649, allocated nonzeros=8649 total number of mallocs used during MatSetValues calls =0 The first matrix has no options prefix, but the second does and it's called "B_". > Myriam > > Le 03/26/19 ? 11:10, Dave May a ?crit : > > > On Tue, 26 Mar 2019 at 09:52, Myriam Peyrounette via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> How can I be sure they are indeed used? Can I print this information in >> some log file? >> > Yes. Re-run the job with the command line option > > -options_left true > > This will report all options parsed, and importantly, will also indicate > if any options were unused. > > > Thanks > Dave > > Thanks in advance >> >> Myriam >> >> Le 03/25/19 ? 18:24, Matthew Knepley a ?crit : >> >> On Mon, Mar 25, 2019 at 10:54 AM Myriam Peyrounette via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> Hi, >>> >>> thanks for the explanations. I tried the last PETSc version (commit >>> fbc5705bc518d02a4999f188aad4ccff5f754cbf), which includes the patch you >>> talked about. But the memory scaling shows no improvement (see scaling >>> attached), even when using the "scalable" options :( >>> >>> I had a look at the PETSc functions MatPtAPNumeric_MPIAIJ_MPIAIJ and >>> MatPtAPSymbolic_MPIAIJ_MPIAIJ (especially at the differences before and >>> after the first "bad" commit), but I can't find what induced this memory >>> issue. >>> >> Are you sure that the option was used? It just looks suspicious to me >> that they use exactly the same amount of memory. It should be different, >> even if it does not solve the problem. >> >> Thanks, >> >> Matt >> >>> Myriam >>> >>> >>> >>> >>> Le 03/20/19 ? 17:38, Fande Kong a ?crit : >>> >>> Hi Myriam, >>> >>> There are three algorithms in PETSc to do PtAP ( const char >>> *algTypes[3] = {"scalable","nonscalable","hypre"};), and can be specified >>> using the petsc options: -matptap_via xxxx. >>> >>> (1) -matptap_via hypre: This call the hypre package to do the PtAP >>> trough an all-at-once triple product. In our experiences, it is the most >>> memory efficient, but could be slow. >>> >>> (2) -matptap_via scalable: This involves a row-wise algorithm plus an >>> outer product. This will use more memory than hypre, but way faster. This >>> used to have a bug that could take all your memory, and I have a fix at >>> https://bitbucket.org/petsc/petsc/pull-requests/1452/mpiptap-enable-large-scale-simulations/diff. >>> When using this option, we may want to have extra options such as >>> -inner_offdiag_matmatmult_via scalable -inner_diag_matmatmult_via >>> scalable to select inner scalable algorithms. >>> >>> (3) -matptap_via nonscalable: Suppose to be even faster, but use more >>> memory. It does dense matrix operations. >>> >>> >>> Thanks, >>> >>> Fande Kong >>> >>> >>> >>> >>> On Wed, Mar 20, 2019 at 10:06 AM Myriam Peyrounette via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> >>>> More precisely: something happens when upgrading the functions >>>> MatPtAPNumeric_MPIAIJ_MPIAIJ and/or MatPtAPSymbolic_MPIAIJ_MPIAIJ. 
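Relating this back to Dave's note on option prefixes: a prefix is attached to a matrix with MatSetOptionsPrefix, and whatever string is attached is what must be prepended to -matptap_via and related options. A small sketch is below; the prefix "abc_" simply mirrors Dave's hypothetical example and is not taken from Myriam's code.

    #include <petscmat.h>

    /* Attach an options prefix to a matrix so that prefixed options such as
       -abc_matptap_via scalable are the ones read for products involving it.
       "abc_" is only an illustrative prefix. */
    PetscErrorCode TagMatrixWithPrefix(Mat A)
    {
      PetscErrorCode ierr;
      const char    *prefix;

      ierr = MatSetOptionsPrefix(A,"abc_");CHKERRQ(ierr);
      ierr = MatGetOptionsPrefix(A,&prefix);CHKERRQ(ierr);
      ierr = PetscPrintf(PETSC_COMM_WORLD,"matrix options prefix: %s\n",prefix ? prefix : "(none)");CHKERRQ(ierr);
      return 0;
    }
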
>>>> >>>> Unfortunately, there are a lot of differences between the old and new >>>> versions of these functions. I keep investigating but if you have any idea, >>>> please let me know. >>>> >>>> Best, >>>> >>>> Myriam >>>> >>>> Le 03/20/19 ? 13:48, Myriam Peyrounette a ?crit : >>>> >>>> Hi all, >>>> >>>> I used git bisect to determine when the memory need increased. I found >>>> that the first "bad" commit is aa690a28a7284adb519c28cb44eae20a2c131c85. >>>> >>>> Barry was right, this commit seems to be about an evolution of MatPtAPSymbolic_MPIAIJ_MPIAIJ. >>>> You mentioned the option "-matptap_via scalable" but I can't find any >>>> information about it. Can you tell me more? >>>> >>>> Thanks >>>> >>>> Myriam >>>> >>>> >>>> Le 03/11/19 ? 14:40, Mark Adams a ?crit : >>>> >>>> Is there a difference in memory usage on your tiny problem? I assume >>>> no. >>>> >>>> I don't see anything that could come from GAMG other than the RAP stuff >>>> that you have discussed already. >>>> >>>> On Mon, Mar 11, 2019 at 9:32 AM Myriam Peyrounette < >>>> myriam.peyrounette at idris.fr> wrote: >>>> >>>>> The code I am using here is the example 42 of PETSc ( >>>>> https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html). >>>>> Indeed it solves the Stokes equation. I thought it was a good idea to use >>>>> an example you might know (and didn't find any that uses GAMG functions). I >>>>> just changed the PCMG setup so that the memory problem appears. And it >>>>> appears when adding PCGAMG. >>>>> >>>>> I don't care about the performance or even the result rightness here, >>>>> but only about the difference in memory use between 3.6 and 3.10. Do you >>>>> think finding a more adapted script would help? >>>>> >>>>> I used the threshold of 0.1 only once, at the beginning, to test its >>>>> influence. I used the default threshold (of 0, I guess) for all the other >>>>> runs. >>>>> >>>>> Myriam >>>>> >>>>> Le 03/11/19 ? 13:52, Mark Adams a ?crit : >>>>> >>>>> In looking at this larger scale run ... >>>>> >>>>> * Your eigen estimates are much lower than your tiny test problem. >>>>> But this is Stokes apparently and it should not work anyway. Maybe you have >>>>> a small time step that adds a lot of mass that brings the eigen estimates >>>>> down. And your min eigenvalue (not used) is positive. I would expect >>>>> negative for Stokes ... >>>>> >>>>> * You seem to be setting a threshold value of 0.1 -- that is very high >>>>> >>>>> * v3.6 says "using nonzero initial guess" but this is not in v3.10. >>>>> Maybe we just stopped printing that. >>>>> >>>>> * There were some changes to coasening parameters in going from v3.6 >>>>> but it does not look like your problem was effected. (The coarsening algo >>>>> is non-deterministic by default and you can see small difference on >>>>> different runs) >>>>> >>>>> * We may have also added a "noisy" RHS for eigen estimates by default >>>>> from v3.6. >>>>> >>>>> * And for non-symetric problems you can try -pc_gamg_agg_nsmooths 0, >>>>> but again GAMG is not built for Stokes anyway. >>>>> >>>>> >>>>> On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette < >>>>> myriam.peyrounette at idris.fr> wrote: >>>>> >>>>>> I used PCView to display the size of the linear system in each level >>>>>> of the MG. You'll find the outputs attached to this mail (zip file) for >>>>>> both the default threshold value and a value of 0.1, and for both 3.6 and >>>>>> 3.10 PETSc versions. 
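As a companion to the PCView output discussed in the quoted message above, here is a minimal sketch of one way to print the operator size on every level of a PCMG/PCGAMG hierarchy from code; it assumes the KSP has already been set up, and the function and variable names are illustrative:

    #include <petscksp.h>

    static PetscErrorCode ReportLevelSizes(KSP ksp)
    {
      PC             pc;
      PetscInt       nlevels, l, M, N;
      PetscErrorCode ierr;

      PetscFunctionBeginUser;
      ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
      /* Valid once PCSetUp/KSPSetUp has run; GAMG builds on the MG interface. */
      ierr = PCMGGetLevels(pc, &nlevels);CHKERRQ(ierr);
      for (l = 0; l < nlevels; l++) {
        KSP smoother;
        Mat A;
        ierr = PCMGGetSmoother(pc, l, &smoother);CHKERRQ(ierr);
        ierr = KSPGetOperators(smoother, &A, NULL);CHKERRQ(ierr);
        ierr = MatGetSize(A, &M, &N);CHKERRQ(ierr);
        ierr = PetscPrintf(PetscObjectComm((PetscObject)ksp),
                           "level %D: %D x %D\n", l, M, N);CHKERRQ(ierr);
      }
      PetscFunctionReturn(0);
    }

The same information is printed by -ksp_view, so the helper is only useful when the sizes need to be post-processed (e.g. for the graphs mentioned above).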
>>>>>> >>>>>> For convenience, I summarized the information in a graph, also >>>>>> attached (png file). >>>>>> >>>>>> As you can see, there are slight differences between the two versions >>>>>> but none is critical, in my opinion. Do you see anything suspicious in the >>>>>> outputs? >>>>>> >>>>>> + I can't find the default threshold value. Do you know where I can >>>>>> find it? >>>>>> >>>>>> Thanks for the follow-up >>>>>> >>>>>> Myriam >>>>>> >>>>>> Le 03/05/19 ? 14:06, Matthew Knepley a ?crit : >>>>>> >>>>>> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette < >>>>>> myriam.peyrounette at idris.fr> wrote: >>>>>> >>>>>>> Hi Matt, >>>>>>> >>>>>>> I plotted the memory scalings using different threshold values. The >>>>>>> two scalings are slightly translated (from -22 to -88 mB) but this gain is >>>>>>> neglectable. The 3.6-scaling keeps being robust while the 3.10-scaling >>>>>>> deteriorates. >>>>>>> >>>>>>> Do you have any other suggestion? >>>>>>> >>>>>> Mark, what is the option she can give to output all the GAMG data? >>>>>> >>>>>> Also, run using -ksp_view. GAMG will report all the sizes of its >>>>>> grids, so it should be easy to see >>>>>> if the coarse grid sizes are increasing, and also what the effect of >>>>>> the threshold value is. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>>> Thanks >>>>>>> Myriam >>>>>>> >>>>>>> Le 03/02/19 ? 02:27, Matthew Knepley a ?crit : >>>>>>> >>>>>>> On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via petsc-users < >>>>>>> petsc-users at mcs.anl.gov> wrote: >>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> I used to run my code with PETSc 3.6. Since I upgraded the PETSc >>>>>>>> version >>>>>>>> to 3.10, this code has a bad memory scaling. >>>>>>>> >>>>>>>> To report this issue, I took the PETSc script ex42.c and slightly >>>>>>>> modified it so that the KSP and PC configurations are the same as >>>>>>>> in my >>>>>>>> code. In particular, I use a "personnalised" multi-grid method. The >>>>>>>> modifications are indicated by the keyword "TopBridge" in the >>>>>>>> attached >>>>>>>> scripts. >>>>>>>> >>>>>>>> To plot the memory (weak) scaling, I ran four calculations for each >>>>>>>> script with increasing problem sizes and computations cores: >>>>>>>> >>>>>>>> 1. 100,000 elts on 4 cores >>>>>>>> 2. 1 million elts on 40 cores >>>>>>>> 3. 10 millions elts on 400 cores >>>>>>>> 4. 100 millions elts on 4,000 cores >>>>>>>> >>>>>>>> The resulting graph is also attached. The scaling using PETSc 3.10 >>>>>>>> clearly deteriorates for large cases, while the one using PETSc 3.6 >>>>>>>> is >>>>>>>> robust. >>>>>>>> >>>>>>>> After a few tests, I found that the scaling is mostly sensitive to >>>>>>>> the >>>>>>>> use of the AMG method for the coarse grid (line 1780 in >>>>>>>> main_ex42_petsc36.cc). In particular, the performance strongly >>>>>>>> deteriorates when commenting lines 1777 to 1790 (in >>>>>>>> main_ex42_petsc36.cc). >>>>>>>> >>>>>>>> Do you have any idea of what changed between version 3.6 and version >>>>>>>> 3.10 that may imply such degradation? >>>>>>>> >>>>>>> >>>>>>> I believe the default values for PCGAMG changed between versions. It >>>>>>> sounds like the coarsening rate >>>>>>> is not great enough, so that these grids are too large. This can be >>>>>>> set using: >>>>>>> >>>>>>> >>>>>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html >>>>>>> >>>>>>> There is some explanation of this effect on that page. Let us know >>>>>>> if setting this does not correct the situation. 
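For the coarsening-rate advice quoted just above, a minimal sketch of setting the GAMG threshold; the value 0.01 is purely illustrative, and recent PETSc releases take an array of per-level values, so the exact signature should be checked against the PCGAMGSetThreshold manual page for the version in use:

    #include <petscksp.h>

    static PetscErrorCode SetCoarseningThreshold(KSP ksp)
    {
      PC             pc;
      PetscReal      th[1] = {0.01};   /* illustrative value only */
      PetscErrorCode ierr;

      PetscFunctionBeginUser;
      ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
      /* Larger thresholds drop more weak couplings and coarsen faster. */
      ierr = PCGAMGSetThreshold(pc, th, 1);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }

The same experiment can be run without recompiling via -pc_gamg_threshold 0.01, together with -ksp_view to inspect the resulting grid sizes.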
>>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> Let me know if you need further information. >>>>>>>> >>>>>>>> Best, >>>>>>>> >>>>>>>> Myriam Peyrounette >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Myriam Peyrounette >>>>>>>> CNRS/IDRIS - HLST >>>>>>>> -- >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Myriam Peyrounette >>>>>>> CNRS/IDRIS - HLST >>>>>>> -- >>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Myriam Peyrounette >>>>>> CNRS/IDRIS - HLST >>>>>> -- >>>>>> >>>>>> >>>>> -- >>>>> Myriam Peyrounette >>>>> CNRS/IDRIS - HLST >>>>> -- >>>>> >>>>> >>>> -- >>>> Myriam Peyrounette >>>> CNRS/IDRIS - HLST >>>> -- >>>> >>>> >>>> -- >>>> Myriam Peyrounette >>>> CNRS/IDRIS - HLST >>>> -- >>>> >>>> >>> -- >>> Myriam Peyrounette >>> CNRS/IDRIS - HLST >>> -- >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> -- >> Myriam Peyrounette >> CNRS/IDRIS - HLST >> -- >> >> > -- > Myriam Peyrounette > CNRS/IDRIS - HLST > -- > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Tue Mar 26 06:48:18 2019 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 26 Mar 2019 07:48:18 -0400 Subject: [petsc-users] error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 In-Reply-To: References: Message-ID: Please send the output of the error (runtime, compile time, link time?) On Mon, Mar 25, 2019 at 10:50 PM Kun Jiao via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi Petsc Experts, > > > > Is MatCreateMPIAIJMKL retired in 3.10.4? > > > > I got this error with my code which works fine in 3.8.3 version. > > > > Regards, > > Kun > > > > Schlumberger-Private > -------------- next part -------------- An HTML attachment was scrubbed... URL: From myriam.peyrounette at idris.fr Tue Mar 26 08:26:54 2019 From: myriam.peyrounette at idris.fr (Myriam Peyrounette) Date: Tue, 26 Mar 2019 14:26:54 +0100 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: References: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> <7b104336-a067-e679-23ec-2a89e0ba9bc4@idris.fr> <8925b24f-62dd-1e45-5658-968491e51205@idris.fr> <00d471e0-ed2a-51fa-3031-a6b63c3a96e1@idris.fr> <75a2f7b1-9e7a-843f-1a83-efff8e56f797@idris.fr> <788f0293-4a5e-bae3-4a8d-10d92d0a16af@idris.fr> Message-ID: I checked with -ksp_view (attached) but no prefix is associated with the matrix. Some are associated to the KSP and PC, but none to the Mat. Le 03/26/19 ? 11:55, Dave May a ?crit?: > > > On Tue, 26 Mar 2019 at 10:36, Myriam Peyrounette > > wrote: > > Oh you were right, the three options are unsused (-matptap_via > scalable, -inner_offdiag_matmatmult_via scalable and > -inner_diag_matmatmult_via scalable). Does this mean I am not > using the associated PtAP functions? > > > No - not necessarily. 
All it means is the options were not parsed.? > > If your matrices have an option prefix associated with them (e.g. abc) > , then you need to provide the option as > ? -abc_matptap_via scalable > > If you are not sure if you matrices have a prefix, look at the result > of -ksp_view (see below for an example) > > ??Mat Object: 2 MPI processes > > ? ? type: mpiaij > > ? ? rows=363, cols=363, bs=3 > > ? ? total: nonzeros=8649, allocated nonzeros=8649 > > ? ? total number of mallocs used during MatSetValues calls =0 > > ? Mat Object: (B_) 2 MPI processes > > ? ? type: mpiaij > > ? ? rows=363, cols=363, bs=3 > > ? ? total: nonzeros=8649, allocated nonzeros=8649 > > ? ? total number of mallocs used during MatSetValues calls =0 > > > The first matrix has no options prefix, but the second does and it's > called "B_". > > > > ? > > Myriam > > > Le 03/26/19 ? 11:10, Dave May a ?crit?: >> >> On Tue, 26 Mar 2019 at 09:52, Myriam Peyrounette via petsc-users >> > wrote: >> >> How can I be sure they are indeed used? Can I print this >> information in some log file? >> >> Yes. Re-run the job with the command line option >> >> -options_left true >> >> This will report all options parsed, and importantly, will also >> indicate if any options were unused. >> ? >> >> Thanks >> Dave >> >> Thanks in advance >> >> Myriam >> >> >> Le 03/25/19 ? 18:24, Matthew Knepley a ?crit?: >>> On Mon, Mar 25, 2019 at 10:54 AM Myriam Peyrounette via >>> petsc-users >> > wrote: >>> >>> Hi, >>> >>> thanks for the explanations. I tried the last PETSc >>> version (commit >>> fbc5705bc518d02a4999f188aad4ccff5f754cbf), which >>> includes the patch you talked about. But the memory >>> scaling shows no improvement (see scaling attached), >>> even when using the "scalable" options :( >>> >>> I had a look at the PETSc functions >>> MatPtAPNumeric_MPIAIJ_MPIAIJ and >>> MatPtAPSymbolic_MPIAIJ_MPIAIJ (especially at the >>> differences before and after the first "bad" commit), >>> but I can't find what induced this memory issue. >>> >>> Are you sure that the option was used? It just looks >>> suspicious to me that they use exactly the same amount of >>> memory. It should be different, even if it does not solve >>> the problem. >>> >>> ? ?Thanks, >>> >>> ? ? ?Matt? >>> >>> Myriam >>> >>> >>> >>> >>> Le 03/20/19 ? 17:38, Fande Kong a ?crit?: >>>> Hi?Myriam, >>>> >>>> There are three algorithms in PETSc to do PtAP (?const >>>> char? ? ? ? ? *algTypes[3] = >>>> {"scalable","nonscalable","hypre"};), and can be >>>> specified using the petsc options:?-matptap_via xxxx. >>>> >>>> (1) -matptap_via hypre: This call the hypre package to >>>> do the PtAP trough an all-at-once triple product. In >>>> our experiences, it is the most memory efficient, but >>>> could be slow. >>>> >>>> (2)? -matptap_via scalable: This involves a row-wise >>>> algorithm plus an outer product.? This will use more >>>> memory than hypre, but way faster. This used to have a >>>> bug that could take all your memory, and I have a fix >>>> at?https://bitbucket.org/petsc/petsc/pull-requests/1452/mpiptap-enable-large-scale-simulations/diff.? >>>> When using this option, we may want to have extra >>>> options such as? ?-inner_offdiag_matmatmult_via >>>> scalable -inner_diag_matmatmult_via scalable? to select >>>> inner scalable algorithms. >>>> >>>> (3)??-matptap_via nonscalable:? Suppose to be even >>>> faster, but use more memory. It does dense matrix >>>> operations. 
>>>> >>>> >>>> Thanks, >>>> >>>> Fande Kong >>>> >>>> >>>> >>>> >>>> On Wed, Mar 20, 2019 at 10:06 AM Myriam Peyrounette via >>>> petsc-users >>> > wrote: >>>> >>>> More precisely: something happens when upgrading >>>> the functions MatPtAPNumeric_MPIAIJ_MPIAIJ and/or >>>> MatPtAPSymbolic_MPIAIJ_MPIAIJ. >>>> >>>> Unfortunately, there are a lot of differences >>>> between the old and new versions of these >>>> functions. I keep investigating but if you have any >>>> idea, please let me know. >>>> >>>> Best, >>>> >>>> Myriam >>>> >>>> >>>> Le 03/20/19 ? 13:48, Myriam Peyrounette a ?crit?: >>>>> >>>>> Hi all, >>>>> >>>>> I used git bisect to determine when the memory >>>>> need increased. I found that the first "bad" >>>>> commit is ? aa690a28a7284adb519c28cb44eae20a2c131c85. >>>>> >>>>> Barry was right, this commit seems to be about an >>>>> evolution of MatPtAPSymbolic_MPIAIJ_MPIAIJ. You >>>>> mentioned the option "-matptap_via scalable" but I >>>>> can't find any information about it. Can you tell >>>>> me more? >>>>> >>>>> Thanks >>>>> >>>>> Myriam >>>>> >>>>> >>>>> Le 03/11/19 ? 14:40, Mark Adams a ?crit?: >>>>>> Is there a difference in memory usage on your >>>>>> tiny problem? I assume no. >>>>>> >>>>>> I don't see anything that could come from GAMG >>>>>> other than the RAP stuff that you have discussed >>>>>> already. >>>>>> >>>>>> On Mon, Mar 11, 2019 at 9:32 AM Myriam >>>>>> Peyrounette >>>>> > wrote: >>>>>> >>>>>> The code I am using here is the example 42 of >>>>>> PETSc >>>>>> (https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html). >>>>>> Indeed it solves the Stokes equation. I >>>>>> thought it was a good idea to use an example >>>>>> you might know (and didn't find any that uses >>>>>> GAMG functions). I just changed the PCMG >>>>>> setup so that the memory problem appears. And >>>>>> it appears when adding PCGAMG. >>>>>> >>>>>> I don't care about the performance or even >>>>>> the result rightness here, but only about the >>>>>> difference in memory use between 3.6 and >>>>>> 3.10. Do you think finding a more adapted >>>>>> script would help? >>>>>> >>>>>> I used the threshold of 0.1 only once, at the >>>>>> beginning, to test its influence. I used the >>>>>> default threshold (of 0, I guess) for all the >>>>>> other runs. >>>>>> >>>>>> Myriam >>>>>> >>>>>> >>>>>> Le 03/11/19 ? 13:52, Mark Adams a ?crit?: >>>>>>> In looking at this larger scale run ... >>>>>>> >>>>>>> * Your eigen estimates are much lower than >>>>>>> your tiny test problem.? But this is Stokes >>>>>>> apparently and it should not work anyway. >>>>>>> Maybe you have a small time step that adds a >>>>>>> lot of mass that brings the eigen estimates >>>>>>> down. And your min eigenvalue (not used) is >>>>>>> positive. I would expect negative for Stokes ... >>>>>>> >>>>>>> * You seem to be setting a threshold value >>>>>>> of 0.1 -- that is very high >>>>>>> >>>>>>> * v3.6 says "using nonzero initial guess" >>>>>>> but this is not in v3.10. Maybe we just >>>>>>> stopped printing that. >>>>>>> >>>>>>> * There were some changes to coasening >>>>>>> parameters in going from v3.6 but it does >>>>>>> not look like your problem was effected. >>>>>>> (The coarsening algo is non-deterministic by >>>>>>> default and you can see small difference on >>>>>>> different runs) >>>>>>> >>>>>>> * We may have also added a "noisy" RHS for >>>>>>> eigen estimates by default from v3.6. 
>>>>>>> >>>>>>> * And for non-symetric problems you can try >>>>>>> -pc_gamg_agg_nsmooths 0, but again GAMG is >>>>>>> not built for Stokes anyway. >>>>>>> >>>>>>> >>>>>>> On Tue, Mar 5, 2019 at 11:53 AM Myriam >>>>>>> Peyrounette >>>>>> > wrote: >>>>>>> >>>>>>> I used PCView to display the size of the >>>>>>> linear system in each level of the MG. >>>>>>> You'll find the outputs attached to this >>>>>>> mail (zip file) for both the default >>>>>>> threshold value and a value of 0.1, and >>>>>>> for both 3.6 and 3.10 PETSc versions. >>>>>>> >>>>>>> For convenience, I summarized the >>>>>>> information in a graph, also attached >>>>>>> (png file). >>>>>>> >>>>>>> As you can see, there are slight >>>>>>> differences between the two versions but >>>>>>> none is critical, in my opinion. Do you >>>>>>> see anything suspicious in the outputs? >>>>>>> >>>>>>> + I can't find the default threshold >>>>>>> value. Do you know where I can find it? >>>>>>> >>>>>>> Thanks for the follow-up >>>>>>> >>>>>>> Myriam >>>>>>> >>>>>>> >>>>>>> Le 03/05/19 ? 14:06, Matthew Knepley a >>>>>>> ?crit?: >>>>>>>> On Tue, Mar 5, 2019 at 7:14 AM Myriam >>>>>>>> Peyrounette >>>>>>>> >>>>>>> > >>>>>>>> wrote: >>>>>>>> >>>>>>>> Hi Matt, >>>>>>>> >>>>>>>> I plotted the memory scalings using >>>>>>>> different threshold values. The two >>>>>>>> scalings are slightly translated >>>>>>>> (from -22 to -88 mB) but this gain >>>>>>>> is neglectable. The 3.6-scaling >>>>>>>> keeps being robust while the >>>>>>>> 3.10-scaling deteriorates. >>>>>>>> >>>>>>>> Do you have any other suggestion? >>>>>>>> >>>>>>>> Mark, what is the option she can give >>>>>>>> to output all the GAMG data? >>>>>>>> >>>>>>>> Also, run using -ksp_view. GAMG will >>>>>>>> report all the sizes of its grids, so >>>>>>>> it should be easy to see >>>>>>>> if the coarse grid sizes are >>>>>>>> increasing, and also what the effect of >>>>>>>> the threshold value is. >>>>>>>> >>>>>>>> ? Thanks, >>>>>>>> >>>>>>>> ? ? ?Matt? >>>>>>>> >>>>>>>> Thanks >>>>>>>> >>>>>>>> Myriam >>>>>>>> >>>>>>>> Le 03/02/19 ? 02:27, Matthew >>>>>>>> Knepley a ?crit?: >>>>>>>>> On Fri, Mar 1, 2019 at 10:53 AM >>>>>>>>> Myriam Peyrounette via petsc-users >>>>>>>>> >>>>>>>> > >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> I used to run my code with >>>>>>>>> PETSc 3.6. Since I upgraded >>>>>>>>> the PETSc version >>>>>>>>> to 3.10, this code has a bad >>>>>>>>> memory scaling. >>>>>>>>> >>>>>>>>> To report this issue, I took >>>>>>>>> the PETSc script ex42.c and >>>>>>>>> slightly >>>>>>>>> modified it so that the KSP >>>>>>>>> and PC configurations are the >>>>>>>>> same as in my >>>>>>>>> code. In particular, I use a >>>>>>>>> "personnalised" multi-grid >>>>>>>>> method. The >>>>>>>>> modifications are indicated by >>>>>>>>> the keyword "TopBridge" in the >>>>>>>>> attached >>>>>>>>> scripts. >>>>>>>>> >>>>>>>>> To plot the memory (weak) >>>>>>>>> scaling, I ran four >>>>>>>>> calculations for each >>>>>>>>> script with increasing problem >>>>>>>>> sizes and computations cores: >>>>>>>>> >>>>>>>>> 1. 100,000 elts on 4 cores >>>>>>>>> 2. 1 million elts on 40 cores >>>>>>>>> 3. 10 millions elts on 400 cores >>>>>>>>> 4. 100 millions elts on 4,000 >>>>>>>>> cores >>>>>>>>> >>>>>>>>> The resulting graph is also >>>>>>>>> attached. The scaling using >>>>>>>>> PETSc 3.10 >>>>>>>>> clearly deteriorates for large >>>>>>>>> cases, while the one using >>>>>>>>> PETSc 3.6 is >>>>>>>>> robust. 
>>>>>>>>> >>>>>>>>> After a few tests, I found >>>>>>>>> that the scaling is mostly >>>>>>>>> sensitive to the >>>>>>>>> use of the AMG method for the >>>>>>>>> coarse grid (line 1780 in >>>>>>>>> main_ex42_petsc36.cc). In >>>>>>>>> particular, the performance >>>>>>>>> strongly >>>>>>>>> deteriorates when commenting >>>>>>>>> lines 1777 to 1790 (in >>>>>>>>> main_ex42_petsc36.cc). >>>>>>>>> >>>>>>>>> Do you have any idea of what >>>>>>>>> changed between version 3.6 >>>>>>>>> and version >>>>>>>>> 3.10 that may imply such >>>>>>>>> degradation? >>>>>>>>> >>>>>>>>> >>>>>>>>> I believe the default values for >>>>>>>>> PCGAMG changed between versions. >>>>>>>>> It sounds like the coarsening rate >>>>>>>>> is not great enough, so that these >>>>>>>>> grids are too large. This can be >>>>>>>>> set using: >>>>>>>>> >>>>>>>>> ??https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html >>>>>>>>> >>>>>>>>> There is some explanation of this >>>>>>>>> effect on that page. Let us know >>>>>>>>> if setting this does not correct >>>>>>>>> the situation. >>>>>>>>> >>>>>>>>> ? Thanks, >>>>>>>>> >>>>>>>>> ? ? ?Matt >>>>>>>>> ? >>>>>>>>> >>>>>>>>> Let me know if you need >>>>>>>>> further information. >>>>>>>>> >>>>>>>>> Best, >>>>>>>>> >>>>>>>>> Myriam Peyrounette >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Myriam Peyrounette >>>>>>>>> CNRS/IDRIS - HLST >>>>>>>>> -- >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for >>>>>>>>> granted before they begin their >>>>>>>>> experiments is infinitely more >>>>>>>>> interesting than any results to >>>>>>>>> which their experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Myriam Peyrounette >>>>>>>> CNRS/IDRIS - HLST >>>>>>>> -- >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for >>>>>>>> granted before they begin their >>>>>>>> experiments is infinitely more >>>>>>>> interesting than any results to which >>>>>>>> their experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Myriam Peyrounette >>>>>>> CNRS/IDRIS - HLST >>>>>>> -- >>>>>>> >>>>>> >>>>>> -- >>>>>> Myriam Peyrounette >>>>>> CNRS/IDRIS - HLST >>>>>> -- >>>>>> >>>>> >>>>> -- >>>>> Myriam Peyrounette >>>>> CNRS/IDRIS - HLST >>>>> -- >>>> >>>> -- >>>> Myriam Peyrounette >>>> CNRS/IDRIS - HLST >>>> -- >>>> >>> >>> -- >>> Myriam Peyrounette >>> CNRS/IDRIS - HLST >>> -- >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin >>> their experiments is infinitely more interesting than any >>> results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >> >> -- >> Myriam Peyrounette >> CNRS/IDRIS - HLST >> -- >> > > -- > Myriam Peyrounette > CNRS/IDRIS - HLST > -- > -- Myriam Peyrounette CNRS/IDRIS - HLST -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ksp.log Type: text/x-log Size: 22348 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: smime.p7s Type: application/pkcs7-signature Size: 2975 bytes Desc: Signature cryptographique S/MIME URL: From dave.mayhem23 at gmail.com Tue Mar 26 08:30:01 2019 From: dave.mayhem23 at gmail.com (Dave May) Date: Tue, 26 Mar 2019 13:30:01 +0000 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: References: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> <7b104336-a067-e679-23ec-2a89e0ba9bc4@idris.fr> <8925b24f-62dd-1e45-5658-968491e51205@idris.fr> <00d471e0-ed2a-51fa-3031-a6b63c3a96e1@idris.fr> <75a2f7b1-9e7a-843f-1a83-efff8e56f797@idris.fr> <788f0293-4a5e-bae3-4a8d-10d92d0a16af@idris.fr> Message-ID: Could you also post the result from -log_view? On Tue, 26 Mar 2019 at 13:27, Myriam Peyrounette < myriam.peyrounette at idris.fr> wrote: > I checked with -ksp_view (attached) but no prefix is associated with the > matrix. Some are associated to the KSP and PC, but none to the Mat. > > Le 03/26/19 ? 11:55, Dave May a ?crit : > > > > On Tue, 26 Mar 2019 at 10:36, Myriam Peyrounette < > myriam.peyrounette at idris.fr> wrote: > >> Oh you were right, the three options are unsused (-matptap_via scalable, >> -inner_offdiag_matmatmult_via scalable and -inner_diag_matmatmult_via >> scalable). Does this mean I am not using the associated PtAP functions? >> > > No - not necessarily. All it means is the options were not parsed. > > If your matrices have an option prefix associated with them (e.g. abc) , > then you need to provide the option as > -abc_matptap_via scalable > > If you are not sure if you matrices have a prefix, look at the result of > -ksp_view (see below for an example) > > Mat Object: 2 MPI processes > > type: mpiaij > > rows=363, cols=363, bs=3 > > total: nonzeros=8649, allocated nonzeros=8649 > > total number of mallocs used during MatSetValues calls =0 > > Mat Object: (B_) 2 MPI processes > > type: mpiaij > > rows=363, cols=363, bs=3 > > total: nonzeros=8649, allocated nonzeros=8649 > > total number of mallocs used during MatSetValues calls =0 > > The first matrix has no options prefix, but the second does and it's > called "B_". > > > > > >> Myriam >> >> Le 03/26/19 ? 11:10, Dave May a ?crit : >> >> >> On Tue, 26 Mar 2019 at 09:52, Myriam Peyrounette via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> How can I be sure they are indeed used? Can I print this information in >>> some log file? >>> >> Yes. Re-run the job with the command line option >> >> -options_left true >> >> This will report all options parsed, and importantly, will also indicate >> if any options were unused. >> >> >> Thanks >> Dave >> >> Thanks in advance >>> >>> Myriam >>> >>> Le 03/25/19 ? 18:24, Matthew Knepley a ?crit : >>> >>> On Mon, Mar 25, 2019 at 10:54 AM Myriam Peyrounette via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> >>>> Hi, >>>> >>>> thanks for the explanations. I tried the last PETSc version (commit >>>> fbc5705bc518d02a4999f188aad4ccff5f754cbf), which includes the patch you >>>> talked about. But the memory scaling shows no improvement (see scaling >>>> attached), even when using the "scalable" options :( >>>> >>>> I had a look at the PETSc functions MatPtAPNumeric_MPIAIJ_MPIAIJ and >>>> MatPtAPSymbolic_MPIAIJ_MPIAIJ (especially at the differences before and >>>> after the first "bad" commit), but I can't find what induced this memory >>>> issue. >>>> >>> Are you sure that the option was used? It just looks suspicious to me >>> that they use exactly the same amount of memory. It should be different, >>> even if it does not solve the problem. 
>>> >>> Thanks, >>> >>> Matt >>> >>>> Myriam >>>> >>>> >>>> >>>> >>>> Le 03/20/19 ? 17:38, Fande Kong a ?crit : >>>> >>>> Hi Myriam, >>>> >>>> There are three algorithms in PETSc to do PtAP ( const char >>>> *algTypes[3] = {"scalable","nonscalable","hypre"};), and can be specified >>>> using the petsc options: -matptap_via xxxx. >>>> >>>> (1) -matptap_via hypre: This call the hypre package to do the PtAP >>>> trough an all-at-once triple product. In our experiences, it is the most >>>> memory efficient, but could be slow. >>>> >>>> (2) -matptap_via scalable: This involves a row-wise algorithm plus an >>>> outer product. This will use more memory than hypre, but way faster. This >>>> used to have a bug that could take all your memory, and I have a fix at >>>> https://bitbucket.org/petsc/petsc/pull-requests/1452/mpiptap-enable-large-scale-simulations/diff. >>>> When using this option, we may want to have extra options such as >>>> -inner_offdiag_matmatmult_via scalable -inner_diag_matmatmult_via >>>> scalable to select inner scalable algorithms. >>>> >>>> (3) -matptap_via nonscalable: Suppose to be even faster, but use more >>>> memory. It does dense matrix operations. >>>> >>>> >>>> Thanks, >>>> >>>> Fande Kong >>>> >>>> >>>> >>>> >>>> On Wed, Mar 20, 2019 at 10:06 AM Myriam Peyrounette via petsc-users < >>>> petsc-users at mcs.anl.gov> wrote: >>>> >>>>> More precisely: something happens when upgrading the functions >>>>> MatPtAPNumeric_MPIAIJ_MPIAIJ and/or MatPtAPSymbolic_MPIAIJ_MPIAIJ. >>>>> >>>>> Unfortunately, there are a lot of differences between the old and new >>>>> versions of these functions. I keep investigating but if you have any idea, >>>>> please let me know. >>>>> >>>>> Best, >>>>> >>>>> Myriam >>>>> >>>>> Le 03/20/19 ? 13:48, Myriam Peyrounette a ?crit : >>>>> >>>>> Hi all, >>>>> >>>>> I used git bisect to determine when the memory need increased. I found >>>>> that the first "bad" commit is aa690a28a7284adb519c28cb44eae20a2c131c85. >>>>> >>>>> Barry was right, this commit seems to be about an evolution of MatPtAPSymbolic_MPIAIJ_MPIAIJ. >>>>> You mentioned the option "-matptap_via scalable" but I can't find any >>>>> information about it. Can you tell me more? >>>>> >>>>> Thanks >>>>> >>>>> Myriam >>>>> >>>>> >>>>> Le 03/11/19 ? 14:40, Mark Adams a ?crit : >>>>> >>>>> Is there a difference in memory usage on your tiny problem? I assume >>>>> no. >>>>> >>>>> I don't see anything that could come from GAMG other than the RAP >>>>> stuff that you have discussed already. >>>>> >>>>> On Mon, Mar 11, 2019 at 9:32 AM Myriam Peyrounette < >>>>> myriam.peyrounette at idris.fr> wrote: >>>>> >>>>>> The code I am using here is the example 42 of PETSc ( >>>>>> https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html). >>>>>> Indeed it solves the Stokes equation. I thought it was a good idea to use >>>>>> an example you might know (and didn't find any that uses GAMG functions). I >>>>>> just changed the PCMG setup so that the memory problem appears. And it >>>>>> appears when adding PCGAMG. >>>>>> >>>>>> I don't care about the performance or even the result rightness here, >>>>>> but only about the difference in memory use between 3.6 and 3.10. Do you >>>>>> think finding a more adapted script would help? >>>>>> >>>>>> I used the threshold of 0.1 only once, at the beginning, to test its >>>>>> influence. I used the default threshold (of 0, I guess) for all the other >>>>>> runs. >>>>>> >>>>>> Myriam >>>>>> >>>>>> Le 03/11/19 ? 
13:52, Mark Adams a ?crit : >>>>>> >>>>>> In looking at this larger scale run ... >>>>>> >>>>>> * Your eigen estimates are much lower than your tiny test problem. >>>>>> But this is Stokes apparently and it should not work anyway. Maybe you have >>>>>> a small time step that adds a lot of mass that brings the eigen estimates >>>>>> down. And your min eigenvalue (not used) is positive. I would expect >>>>>> negative for Stokes ... >>>>>> >>>>>> * You seem to be setting a threshold value of 0.1 -- that is very high >>>>>> >>>>>> * v3.6 says "using nonzero initial guess" but this is not in v3.10. >>>>>> Maybe we just stopped printing that. >>>>>> >>>>>> * There were some changes to coasening parameters in going from v3.6 >>>>>> but it does not look like your problem was effected. (The coarsening algo >>>>>> is non-deterministic by default and you can see small difference on >>>>>> different runs) >>>>>> >>>>>> * We may have also added a "noisy" RHS for eigen estimates by default >>>>>> from v3.6. >>>>>> >>>>>> * And for non-symetric problems you can try -pc_gamg_agg_nsmooths 0, >>>>>> but again GAMG is not built for Stokes anyway. >>>>>> >>>>>> >>>>>> On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette < >>>>>> myriam.peyrounette at idris.fr> wrote: >>>>>> >>>>>>> I used PCView to display the size of the linear system in each level >>>>>>> of the MG. You'll find the outputs attached to this mail (zip file) for >>>>>>> both the default threshold value and a value of 0.1, and for both 3.6 and >>>>>>> 3.10 PETSc versions. >>>>>>> >>>>>>> For convenience, I summarized the information in a graph, also >>>>>>> attached (png file). >>>>>>> >>>>>>> As you can see, there are slight differences between the two >>>>>>> versions but none is critical, in my opinion. Do you see anything >>>>>>> suspicious in the outputs? >>>>>>> >>>>>>> + I can't find the default threshold value. Do you know where I can >>>>>>> find it? >>>>>>> >>>>>>> Thanks for the follow-up >>>>>>> >>>>>>> Myriam >>>>>>> >>>>>>> Le 03/05/19 ? 14:06, Matthew Knepley a ?crit : >>>>>>> >>>>>>> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette < >>>>>>> myriam.peyrounette at idris.fr> wrote: >>>>>>> >>>>>>>> Hi Matt, >>>>>>>> >>>>>>>> I plotted the memory scalings using different threshold values. The >>>>>>>> two scalings are slightly translated (from -22 to -88 mB) but this gain is >>>>>>>> neglectable. The 3.6-scaling keeps being robust while the 3.10-scaling >>>>>>>> deteriorates. >>>>>>>> >>>>>>>> Do you have any other suggestion? >>>>>>>> >>>>>>> Mark, what is the option she can give to output all the GAMG data? >>>>>>> >>>>>>> Also, run using -ksp_view. GAMG will report all the sizes of its >>>>>>> grids, so it should be easy to see >>>>>>> if the coarse grid sizes are increasing, and also what the effect of >>>>>>> the threshold value is. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>>> Thanks >>>>>>>> Myriam >>>>>>>> >>>>>>>> Le 03/02/19 ? 02:27, Matthew Knepley a ?crit : >>>>>>>> >>>>>>>> On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via petsc-users < >>>>>>>> petsc-users at mcs.anl.gov> wrote: >>>>>>>> >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> I used to run my code with PETSc 3.6. Since I upgraded the PETSc >>>>>>>>> version >>>>>>>>> to 3.10, this code has a bad memory scaling. >>>>>>>>> >>>>>>>>> To report this issue, I took the PETSc script ex42.c and slightly >>>>>>>>> modified it so that the KSP and PC configurations are the same as >>>>>>>>> in my >>>>>>>>> code. 
In particular, I use a "personnalised" multi-grid method. The >>>>>>>>> modifications are indicated by the keyword "TopBridge" in the >>>>>>>>> attached >>>>>>>>> scripts. >>>>>>>>> >>>>>>>>> To plot the memory (weak) scaling, I ran four calculations for each >>>>>>>>> script with increasing problem sizes and computations cores: >>>>>>>>> >>>>>>>>> 1. 100,000 elts on 4 cores >>>>>>>>> 2. 1 million elts on 40 cores >>>>>>>>> 3. 10 millions elts on 400 cores >>>>>>>>> 4. 100 millions elts on 4,000 cores >>>>>>>>> >>>>>>>>> The resulting graph is also attached. The scaling using PETSc 3.10 >>>>>>>>> clearly deteriorates for large cases, while the one using PETSc >>>>>>>>> 3.6 is >>>>>>>>> robust. >>>>>>>>> >>>>>>>>> After a few tests, I found that the scaling is mostly sensitive to >>>>>>>>> the >>>>>>>>> use of the AMG method for the coarse grid (line 1780 in >>>>>>>>> main_ex42_petsc36.cc). In particular, the performance strongly >>>>>>>>> deteriorates when commenting lines 1777 to 1790 (in >>>>>>>>> main_ex42_petsc36.cc). >>>>>>>>> >>>>>>>>> Do you have any idea of what changed between version 3.6 and >>>>>>>>> version >>>>>>>>> 3.10 that may imply such degradation? >>>>>>>>> >>>>>>>> >>>>>>>> I believe the default values for PCGAMG changed between versions. >>>>>>>> It sounds like the coarsening rate >>>>>>>> is not great enough, so that these grids are too large. This can be >>>>>>>> set using: >>>>>>>> >>>>>>>> >>>>>>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html >>>>>>>> >>>>>>>> There is some explanation of this effect on that page. Let us know >>>>>>>> if setting this does not correct the situation. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> >>>>>>>>> Let me know if you need further information. >>>>>>>>> >>>>>>>>> Best, >>>>>>>>> >>>>>>>>> Myriam Peyrounette >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Myriam Peyrounette >>>>>>>>> CNRS/IDRIS - HLST >>>>>>>>> -- >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Myriam Peyrounette >>>>>>>> CNRS/IDRIS - HLST >>>>>>>> -- >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Myriam Peyrounette >>>>>>> CNRS/IDRIS - HLST >>>>>>> -- >>>>>>> >>>>>>> >>>>>> -- >>>>>> Myriam Peyrounette >>>>>> CNRS/IDRIS - HLST >>>>>> -- >>>>>> >>>>>> >>>>> -- >>>>> Myriam Peyrounette >>>>> CNRS/IDRIS - HLST >>>>> -- >>>>> >>>>> >>>>> -- >>>>> Myriam Peyrounette >>>>> CNRS/IDRIS - HLST >>>>> -- >>>>> >>>>> >>>> -- >>>> Myriam Peyrounette >>>> CNRS/IDRIS - HLST >>>> -- >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. 
>>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >>> >>> -- >>> Myriam Peyrounette >>> CNRS/IDRIS - HLST >>> -- >>> >>> >> -- >> Myriam Peyrounette >> CNRS/IDRIS - HLST >> -- >> >> > -- > Myriam Peyrounette > CNRS/IDRIS - HLST > -- > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Mar 26 08:30:27 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 26 Mar 2019 09:30:27 -0400 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: References: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> <7b104336-a067-e679-23ec-2a89e0ba9bc4@idris.fr> <8925b24f-62dd-1e45-5658-968491e51205@idris.fr> <00d471e0-ed2a-51fa-3031-a6b63c3a96e1@idris.fr> <75a2f7b1-9e7a-843f-1a83-efff8e56f797@idris.fr> <788f0293-4a5e-bae3-4a8d-10d92d0a16af@idris.fr> Message-ID: On Tue, Mar 26, 2019 at 9:27 AM Myriam Peyrounette < myriam.peyrounette at idris.fr> wrote: > I checked with -ksp_view (attached) but no prefix is associated with the > matrix. Some are associated to the KSP and PC, but none to the Mat > Another thing that could prevent options being used is that *SetFromOptions() is not called for the object. Thanks, Matt > Le 03/26/19 ? 11:55, Dave May a ?crit : > > > > On Tue, 26 Mar 2019 at 10:36, Myriam Peyrounette < > myriam.peyrounette at idris.fr> wrote: > >> Oh you were right, the three options are unsused (-matptap_via scalable, >> -inner_offdiag_matmatmult_via scalable and -inner_diag_matmatmult_via >> scalable). Does this mean I am not using the associated PtAP functions? >> > > No - not necessarily. All it means is the options were not parsed. > > If your matrices have an option prefix associated with them (e.g. abc) , > then you need to provide the option as > -abc_matptap_via scalable > > If you are not sure if you matrices have a prefix, look at the result of > -ksp_view (see below for an example) > > Mat Object: 2 MPI processes > > type: mpiaij > > rows=363, cols=363, bs=3 > > total: nonzeros=8649, allocated nonzeros=8649 > > total number of mallocs used during MatSetValues calls =0 > > Mat Object: (B_) 2 MPI processes > > type: mpiaij > > rows=363, cols=363, bs=3 > > total: nonzeros=8649, allocated nonzeros=8649 > > total number of mallocs used during MatSetValues calls =0 > > The first matrix has no options prefix, but the second does and it's > called "B_". > > > > > >> Myriam >> >> Le 03/26/19 ? 11:10, Dave May a ?crit : >> >> >> On Tue, 26 Mar 2019 at 09:52, Myriam Peyrounette via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> How can I be sure they are indeed used? Can I print this information in >>> some log file? >>> >> Yes. Re-run the job with the command line option >> >> -options_left true >> >> This will report all options parsed, and importantly, will also indicate >> if any options were unused. >> >> >> Thanks >> Dave >> >> Thanks in advance >>> >>> Myriam >>> >>> Le 03/25/19 ? 18:24, Matthew Knepley a ?crit : >>> >>> On Mon, Mar 25, 2019 at 10:54 AM Myriam Peyrounette via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> >>>> Hi, >>>> >>>> thanks for the explanations. I tried the last PETSc version (commit >>>> fbc5705bc518d02a4999f188aad4ccff5f754cbf), which includes the patch you >>>> talked about. 
But the memory scaling shows no improvement (see scaling >>>> attached), even when using the "scalable" options :( >>>> >>>> I had a look at the PETSc functions MatPtAPNumeric_MPIAIJ_MPIAIJ and >>>> MatPtAPSymbolic_MPIAIJ_MPIAIJ (especially at the differences before and >>>> after the first "bad" commit), but I can't find what induced this memory >>>> issue. >>>> >>> Are you sure that the option was used? It just looks suspicious to me >>> that they use exactly the same amount of memory. It should be different, >>> even if it does not solve the problem. >>> >>> Thanks, >>> >>> Matt >>> >>>> Myriam >>>> >>>> >>>> >>>> >>>> Le 03/20/19 ? 17:38, Fande Kong a ?crit : >>>> >>>> Hi Myriam, >>>> >>>> There are three algorithms in PETSc to do PtAP ( const char >>>> *algTypes[3] = {"scalable","nonscalable","hypre"};), and can be specified >>>> using the petsc options: -matptap_via xxxx. >>>> >>>> (1) -matptap_via hypre: This call the hypre package to do the PtAP >>>> trough an all-at-once triple product. In our experiences, it is the most >>>> memory efficient, but could be slow. >>>> >>>> (2) -matptap_via scalable: This involves a row-wise algorithm plus an >>>> outer product. This will use more memory than hypre, but way faster. This >>>> used to have a bug that could take all your memory, and I have a fix at >>>> https://bitbucket.org/petsc/petsc/pull-requests/1452/mpiptap-enable-large-scale-simulations/diff. >>>> When using this option, we may want to have extra options such as >>>> -inner_offdiag_matmatmult_via scalable -inner_diag_matmatmult_via >>>> scalable to select inner scalable algorithms. >>>> >>>> (3) -matptap_via nonscalable: Suppose to be even faster, but use more >>>> memory. It does dense matrix operations. >>>> >>>> >>>> Thanks, >>>> >>>> Fande Kong >>>> >>>> >>>> >>>> >>>> On Wed, Mar 20, 2019 at 10:06 AM Myriam Peyrounette via petsc-users < >>>> petsc-users at mcs.anl.gov> wrote: >>>> >>>>> More precisely: something happens when upgrading the functions >>>>> MatPtAPNumeric_MPIAIJ_MPIAIJ and/or MatPtAPSymbolic_MPIAIJ_MPIAIJ. >>>>> >>>>> Unfortunately, there are a lot of differences between the old and new >>>>> versions of these functions. I keep investigating but if you have any idea, >>>>> please let me know. >>>>> >>>>> Best, >>>>> >>>>> Myriam >>>>> >>>>> Le 03/20/19 ? 13:48, Myriam Peyrounette a ?crit : >>>>> >>>>> Hi all, >>>>> >>>>> I used git bisect to determine when the memory need increased. I found >>>>> that the first "bad" commit is aa690a28a7284adb519c28cb44eae20a2c131c85. >>>>> >>>>> Barry was right, this commit seems to be about an evolution of MatPtAPSymbolic_MPIAIJ_MPIAIJ. >>>>> You mentioned the option "-matptap_via scalable" but I can't find any >>>>> information about it. Can you tell me more? >>>>> >>>>> Thanks >>>>> >>>>> Myriam >>>>> >>>>> >>>>> Le 03/11/19 ? 14:40, Mark Adams a ?crit : >>>>> >>>>> Is there a difference in memory usage on your tiny problem? I assume >>>>> no. >>>>> >>>>> I don't see anything that could come from GAMG other than the RAP >>>>> stuff that you have discussed already. >>>>> >>>>> On Mon, Mar 11, 2019 at 9:32 AM Myriam Peyrounette < >>>>> myriam.peyrounette at idris.fr> wrote: >>>>> >>>>>> The code I am using here is the example 42 of PETSc ( >>>>>> https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html). >>>>>> Indeed it solves the Stokes equation. I thought it was a good idea to use >>>>>> an example you might know (and didn't find any that uses GAMG functions). 
I >>>>>> just changed the PCMG setup so that the memory problem appears. And it >>>>>> appears when adding PCGAMG. >>>>>> >>>>>> I don't care about the performance or even the result rightness here, >>>>>> but only about the difference in memory use between 3.6 and 3.10. Do you >>>>>> think finding a more adapted script would help? >>>>>> >>>>>> I used the threshold of 0.1 only once, at the beginning, to test its >>>>>> influence. I used the default threshold (of 0, I guess) for all the other >>>>>> runs. >>>>>> >>>>>> Myriam >>>>>> >>>>>> Le 03/11/19 ? 13:52, Mark Adams a ?crit : >>>>>> >>>>>> In looking at this larger scale run ... >>>>>> >>>>>> * Your eigen estimates are much lower than your tiny test problem. >>>>>> But this is Stokes apparently and it should not work anyway. Maybe you have >>>>>> a small time step that adds a lot of mass that brings the eigen estimates >>>>>> down. And your min eigenvalue (not used) is positive. I would expect >>>>>> negative for Stokes ... >>>>>> >>>>>> * You seem to be setting a threshold value of 0.1 -- that is very high >>>>>> >>>>>> * v3.6 says "using nonzero initial guess" but this is not in v3.10. >>>>>> Maybe we just stopped printing that. >>>>>> >>>>>> * There were some changes to coasening parameters in going from v3.6 >>>>>> but it does not look like your problem was effected. (The coarsening algo >>>>>> is non-deterministic by default and you can see small difference on >>>>>> different runs) >>>>>> >>>>>> * We may have also added a "noisy" RHS for eigen estimates by default >>>>>> from v3.6. >>>>>> >>>>>> * And for non-symetric problems you can try -pc_gamg_agg_nsmooths 0, >>>>>> but again GAMG is not built for Stokes anyway. >>>>>> >>>>>> >>>>>> On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette < >>>>>> myriam.peyrounette at idris.fr> wrote: >>>>>> >>>>>>> I used PCView to display the size of the linear system in each level >>>>>>> of the MG. You'll find the outputs attached to this mail (zip file) for >>>>>>> both the default threshold value and a value of 0.1, and for both 3.6 and >>>>>>> 3.10 PETSc versions. >>>>>>> >>>>>>> For convenience, I summarized the information in a graph, also >>>>>>> attached (png file). >>>>>>> >>>>>>> As you can see, there are slight differences between the two >>>>>>> versions but none is critical, in my opinion. Do you see anything >>>>>>> suspicious in the outputs? >>>>>>> >>>>>>> + I can't find the default threshold value. Do you know where I can >>>>>>> find it? >>>>>>> >>>>>>> Thanks for the follow-up >>>>>>> >>>>>>> Myriam >>>>>>> >>>>>>> Le 03/05/19 ? 14:06, Matthew Knepley a ?crit : >>>>>>> >>>>>>> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette < >>>>>>> myriam.peyrounette at idris.fr> wrote: >>>>>>> >>>>>>>> Hi Matt, >>>>>>>> >>>>>>>> I plotted the memory scalings using different threshold values. The >>>>>>>> two scalings are slightly translated (from -22 to -88 mB) but this gain is >>>>>>>> neglectable. The 3.6-scaling keeps being robust while the 3.10-scaling >>>>>>>> deteriorates. >>>>>>>> >>>>>>>> Do you have any other suggestion? >>>>>>>> >>>>>>> Mark, what is the option she can give to output all the GAMG data? >>>>>>> >>>>>>> Also, run using -ksp_view. GAMG will report all the sizes of its >>>>>>> grids, so it should be easy to see >>>>>>> if the coarse grid sizes are increasing, and also what the effect of >>>>>>> the threshold value is. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>>> Thanks >>>>>>>> Myriam >>>>>>>> >>>>>>>> Le 03/02/19 ? 
02:27, Matthew Knepley a ?crit : >>>>>>>> >>>>>>>> On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via petsc-users < >>>>>>>> petsc-users at mcs.anl.gov> wrote: >>>>>>>> >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> I used to run my code with PETSc 3.6. Since I upgraded the PETSc >>>>>>>>> version >>>>>>>>> to 3.10, this code has a bad memory scaling. >>>>>>>>> >>>>>>>>> To report this issue, I took the PETSc script ex42.c and slightly >>>>>>>>> modified it so that the KSP and PC configurations are the same as >>>>>>>>> in my >>>>>>>>> code. In particular, I use a "personnalised" multi-grid method. The >>>>>>>>> modifications are indicated by the keyword "TopBridge" in the >>>>>>>>> attached >>>>>>>>> scripts. >>>>>>>>> >>>>>>>>> To plot the memory (weak) scaling, I ran four calculations for each >>>>>>>>> script with increasing problem sizes and computations cores: >>>>>>>>> >>>>>>>>> 1. 100,000 elts on 4 cores >>>>>>>>> 2. 1 million elts on 40 cores >>>>>>>>> 3. 10 millions elts on 400 cores >>>>>>>>> 4. 100 millions elts on 4,000 cores >>>>>>>>> >>>>>>>>> The resulting graph is also attached. The scaling using PETSc 3.10 >>>>>>>>> clearly deteriorates for large cases, while the one using PETSc >>>>>>>>> 3.6 is >>>>>>>>> robust. >>>>>>>>> >>>>>>>>> After a few tests, I found that the scaling is mostly sensitive to >>>>>>>>> the >>>>>>>>> use of the AMG method for the coarse grid (line 1780 in >>>>>>>>> main_ex42_petsc36.cc). In particular, the performance strongly >>>>>>>>> deteriorates when commenting lines 1777 to 1790 (in >>>>>>>>> main_ex42_petsc36.cc). >>>>>>>>> >>>>>>>>> Do you have any idea of what changed between version 3.6 and >>>>>>>>> version >>>>>>>>> 3.10 that may imply such degradation? >>>>>>>>> >>>>>>>> >>>>>>>> I believe the default values for PCGAMG changed between versions. >>>>>>>> It sounds like the coarsening rate >>>>>>>> is not great enough, so that these grids are too large. This can be >>>>>>>> set using: >>>>>>>> >>>>>>>> >>>>>>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html >>>>>>>> >>>>>>>> There is some explanation of this effect on that page. Let us know >>>>>>>> if setting this does not correct the situation. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> >>>>>>>>> Let me know if you need further information. >>>>>>>>> >>>>>>>>> Best, >>>>>>>>> >>>>>>>>> Myriam Peyrounette >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Myriam Peyrounette >>>>>>>>> CNRS/IDRIS - HLST >>>>>>>>> -- >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Myriam Peyrounette >>>>>>>> CNRS/IDRIS - HLST >>>>>>>> -- >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. 
>>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Myriam Peyrounette >>>>>>> CNRS/IDRIS - HLST >>>>>>> -- >>>>>>> >>>>>>> >>>>>> -- >>>>>> Myriam Peyrounette >>>>>> CNRS/IDRIS - HLST >>>>>> -- >>>>>> >>>>>> >>>>> -- >>>>> Myriam Peyrounette >>>>> CNRS/IDRIS - HLST >>>>> -- >>>>> >>>>> >>>>> -- >>>>> Myriam Peyrounette >>>>> CNRS/IDRIS - HLST >>>>> -- >>>>> >>>>> >>>> -- >>>> Myriam Peyrounette >>>> CNRS/IDRIS - HLST >>>> -- >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >>> >>> -- >>> Myriam Peyrounette >>> CNRS/IDRIS - HLST >>> -- >>> >>> >> -- >> Myriam Peyrounette >> CNRS/IDRIS - HLST >> -- >> >> > -- > Myriam Peyrounette > CNRS/IDRIS - HLST > -- > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From KJiao at slb.com Tue Mar 26 09:07:56 2019 From: KJiao at slb.com (Kun Jiao) Date: Tue, 26 Mar 2019 14:07:56 +0000 Subject: [petsc-users] [Ext] Re: error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 In-Reply-To: References: Message-ID: It is compiling error, error message is: error: identifier "MatCreateMPIAIJMKL" is undefined. From: Mark Adams Sent: Tuesday, March 26, 2019 6:48 AM To: Kun Jiao Cc: petsc-users at mcs.anl.gov Subject: [Ext] Re: [petsc-users] error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 Please send the output of the error (runtime, compile time, link time?) On Mon, Mar 25, 2019 at 10:50 PM Kun Jiao via petsc-users > wrote: Hi Petsc Experts, Is MatCreateMPIAIJMKL retired in 3.10.4? I got this error with my code which works fine in 3.8.3 version. Regards, Kun Schlumberger-Private Schlumberger-Private -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Tue Mar 26 09:21:51 2019 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 26 Mar 2019 10:21:51 -0400 Subject: [petsc-users] [Ext] Re: error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 In-Reply-To: References: Message-ID: I assume the whole error message will have the line of code. Please send the whole error message and line of offending code if not included. On Tue, Mar 26, 2019 at 10:08 AM Kun Jiao wrote: > It is compiling error, error message is: > > > > error: identifier "MatCreateMPIAIJMKL" is undefined. > > > > > > > > > > > > *From:* Mark Adams > *Sent:* Tuesday, March 26, 2019 6:48 AM > *To:* Kun Jiao > *Cc:* petsc-users at mcs.anl.gov > *Subject:* [Ext] Re: [petsc-users] error: identifier "MatCreateMPIAIJMKL" > is undefined in 3.10.4 > > > > Please send the output of the error (runtime, compile time, link time?) > > > > On Mon, Mar 25, 2019 at 10:50 PM Kun Jiao via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hi Petsc Experts, > > > > Is MatCreateMPIAIJMKL retired in 3.10.4? > > > > I got this error with my code which works fine in 3.8.3 version. > > > > Regards, > > Kun > > > > > > Schlumberger-Private > > > > Schlumberger-Private > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From KJiao at slb.com Tue Mar 26 09:59:53 2019 From: KJiao at slb.com (Kun Jiao) Date: Tue, 26 Mar 2019 14:59:53 +0000 Subject: [petsc-users] [Ext] Re: error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 In-Reply-To: References: Message-ID: [kjiao at hyi0016 src/lsqr]% make [ 50%] Building CXX object lsqr/CMakeFiles/p_lsqr.dir/lsqr.cc.o /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.cc(318): error: identifier "MatCreateMPIAIJMKL" is undefined ierr = MatCreateMPIAIJMKL(comm,m,n,M,N,maxnz,dialens,maxnz,offlens,&A);CHKERRQ(ierr); ^ /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.cc(578): error: identifier "MatCreateMPIAIJMKL" is undefined ierr = MatCreateMPIAIJMKL(comm,m,n,M,N,maxnz,dialens,maxnz,offlens,&A);CHKERRQ(ierr); ^ compilation aborted for /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.cc (code 2) Thanks. From: Mark Adams Sent: Tuesday, March 26, 2019 9:22 AM To: Kun Jiao Cc: petsc-users at mcs.anl.gov Subject: Re: [Ext] Re: [petsc-users] error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 I assume the whole error message will have the line of code. Please send the whole error message and line of offending code if not included. On Tue, Mar 26, 2019 at 10:08 AM Kun Jiao > wrote: It is compiling error, error message is: error: identifier "MatCreateMPIAIJMKL" is undefined. From: Mark Adams > Sent: Tuesday, March 26, 2019 6:48 AM To: Kun Jiao > Cc: petsc-users at mcs.anl.gov Subject: [Ext] Re: [petsc-users] error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 Please send the output of the error (runtime, compile time, link time?) On Mon, Mar 25, 2019 at 10:50 PM Kun Jiao via petsc-users > wrote: Hi Petsc Experts, Is MatCreateMPIAIJMKL retired in 3.10.4? I got this error with my code which works fine in 3.8.3 version. Regards, Kun Schlumberger-Private Schlumberger-Private Schlumberger-Private -------------- next part -------------- An HTML attachment was scrubbed... URL: From myriam.peyrounette at idris.fr Tue Mar 26 10:16:32 2019 From: myriam.peyrounette at idris.fr (Myriam Peyrounette) Date: Tue, 26 Mar 2019 16:16:32 +0100 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: References: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> <7b104336-a067-e679-23ec-2a89e0ba9bc4@idris.fr> <8925b24f-62dd-1e45-5658-968491e51205@idris.fr> <00d471e0-ed2a-51fa-3031-a6b63c3a96e1@idris.fr> <75a2f7b1-9e7a-843f-1a83-efff8e56f797@idris.fr> <788f0293-4a5e-bae3-4a8d-10d92d0a16af@idris.fr> Message-ID: <0e5a0f3b-ebb8-849c-7ce9-ec9c23108576@idris.fr> *SetFromOptions() was not called indeed... Thanks! The code performance is better now with regard to memory usage! I still have to plot the memory scaling on bigger cases to see if it has the same good behaviour as when using the 3.6 version. I'll let ou know as soon as I have plotted it. Thanks again Myriam Le 03/26/19 ? 14:30, Matthew Knepley a ?crit?: > On Tue, Mar 26, 2019 at 9:27 AM Myriam Peyrounette > > wrote: > > I checked with -ksp_view (attached) but no prefix is associated > with the matrix. Some are associated to the KSP and PC, but none > to the Mat > > Another thing that could prevent options being used is that > *SetFromOptions() is not called for the object. > > ? Thanks, > > ? ? ?Matt > ? > > Le 03/26/19 ? 
11:55, Dave May a ?crit?: >> >> >> On Tue, 26 Mar 2019 at 10:36, Myriam Peyrounette >> > > wrote: >> >> Oh you were right, the three options are unsused >> (-matptap_via scalable, -inner_offdiag_matmatmult_via >> scalable and -inner_diag_matmatmult_via scalable). Does this >> mean I am not using the associated PtAP functions? >> >> >> No - not necessarily. All it means is the options were not parsed.? >> >> If your matrices have an option prefix associated with them (e.g. >> abc) , then you need to provide the option as >> ? -abc_matptap_via scalable >> >> If you are not sure if you matrices have a prefix, look at the >> result of -ksp_view (see below for an example) >> >> ??Mat Object: 2 MPI processes >> >> ? ? type: mpiaij >> >> ? ? rows=363, cols=363, bs=3 >> >> ? ? total: nonzeros=8649, allocated nonzeros=8649 >> >> ? ? total number of mallocs used during MatSetValues calls =0 >> >> ? Mat Object: (B_) 2 MPI processes >> >> ? ? type: mpiaij >> >> ? ? rows=363, cols=363, bs=3 >> >> ? ? total: nonzeros=8649, allocated nonzeros=8649 >> >> ? ? total number of mallocs used during MatSetValues calls =0 >> >> >> The first matrix has no options prefix, but the second does and >> it's called "B_". >> >> >> >> ? >> >> Myriam >> >> >> Le 03/26/19 ? 11:10, Dave May a ?crit?: >>> >>> On Tue, 26 Mar 2019 at 09:52, Myriam Peyrounette via >>> petsc-users >> > wrote: >>> >>> How can I be sure they are indeed used? Can I print this >>> information in some log file? >>> >>> Yes. Re-run the job with the command line option >>> >>> -options_left true >>> >>> This will report all options parsed, and importantly, will >>> also indicate if any options were unused. >>> ? >>> >>> Thanks >>> Dave >>> >>> Thanks in advance >>> >>> Myriam >>> >>> >>> Le 03/25/19 ? 18:24, Matthew Knepley a ?crit?: >>>> On Mon, Mar 25, 2019 at 10:54 AM Myriam Peyrounette via >>>> petsc-users >>> > wrote: >>>> >>>> Hi, >>>> >>>> thanks for the explanations. I tried the last PETSc >>>> version (commit >>>> fbc5705bc518d02a4999f188aad4ccff5f754cbf), which >>>> includes the patch you talked about. But the memory >>>> scaling shows no improvement (see scaling >>>> attached), even when using the "scalable" options :( >>>> >>>> I had a look at the PETSc functions >>>> MatPtAPNumeric_MPIAIJ_MPIAIJ and >>>> MatPtAPSymbolic_MPIAIJ_MPIAIJ (especially at the >>>> differences before and after the first "bad" >>>> commit), but I can't find what induced this memory >>>> issue. >>>> >>>> Are you sure that the option was used? It just looks >>>> suspicious to me that they use exactly the same amount >>>> of memory. It should be different, even if it does not >>>> solve the problem. >>>> >>>> ? ?Thanks, >>>> >>>> ? ? ?Matt? >>>> >>>> Myriam >>>> >>>> >>>> >>>> >>>> Le 03/20/19 ? 17:38, Fande Kong a ?crit?: >>>>> Hi?Myriam, >>>>> >>>>> There are three algorithms in PETSc to do PtAP >>>>> (?const char? ? ? ? ? *algTypes[3] = >>>>> {"scalable","nonscalable","hypre"};), and can be >>>>> specified using the petsc options:?-matptap_via xxxx. >>>>> >>>>> (1) -matptap_via hypre: This call the hypre >>>>> package to do the PtAP trough an all-at-once >>>>> triple product. In our experiences, it is the most >>>>> memory efficient, but could be slow. >>>>> >>>>> (2)? -matptap_via scalable: This involves a >>>>> row-wise algorithm plus an outer product.? This >>>>> will use more memory than hypre, but way faster. 
>>>>> This used to have a bug that could take all your >>>>> memory, and I have a fix >>>>> at?https://bitbucket.org/petsc/petsc/pull-requests/1452/mpiptap-enable-large-scale-simulations/diff.? >>>>> When using this option, we may want to have extra >>>>> options such as? ?-inner_offdiag_matmatmult_via >>>>> scalable -inner_diag_matmatmult_via scalable? to >>>>> select inner scalable algorithms. >>>>> >>>>> (3)??-matptap_via nonscalable:? Suppose to be even >>>>> faster, but use more memory. It does dense matrix >>>>> operations. >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Fande Kong >>>>> >>>>> >>>>> >>>>> >>>>> On Wed, Mar 20, 2019 at 10:06 AM Myriam >>>>> Peyrounette via petsc-users >>>>> >>>> > wrote: >>>>> >>>>> More precisely: something happens when >>>>> upgrading the functions >>>>> MatPtAPNumeric_MPIAIJ_MPIAIJ and/or >>>>> MatPtAPSymbolic_MPIAIJ_MPIAIJ. >>>>> >>>>> Unfortunately, there are a lot of differences >>>>> between the old and new versions of these >>>>> functions. I keep investigating but if you >>>>> have any idea, please let me know. >>>>> >>>>> Best, >>>>> >>>>> Myriam >>>>> >>>>> >>>>> Le 03/20/19 ? 13:48, Myriam Peyrounette a ?crit?: >>>>>> >>>>>> Hi all, >>>>>> >>>>>> I used git bisect to determine when the >>>>>> memory need increased. I found that the first >>>>>> "bad" commit is ? >>>>>> aa690a28a7284adb519c28cb44eae20a2c131c85. >>>>>> >>>>>> Barry was right, this commit seems to be >>>>>> about an evolution of >>>>>> MatPtAPSymbolic_MPIAIJ_MPIAIJ. You mentioned >>>>>> the option "-matptap_via scalable" but I >>>>>> can't find any information about it. Can you >>>>>> tell me more? >>>>>> >>>>>> Thanks >>>>>> >>>>>> Myriam >>>>>> >>>>>> >>>>>> Le 03/11/19 ? 14:40, Mark Adams a ?crit?: >>>>>>> Is there a difference in memory usage on >>>>>>> your tiny problem? I assume no. >>>>>>> >>>>>>> I don't see anything that could come from >>>>>>> GAMG other than the RAP stuff that you have >>>>>>> discussed already. >>>>>>> >>>>>>> On Mon, Mar 11, 2019 at 9:32 AM Myriam >>>>>>> Peyrounette >>>>>> > wrote: >>>>>>> >>>>>>> The code I am using here is the example >>>>>>> 42 of PETSc >>>>>>> (https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html). >>>>>>> Indeed it solves the Stokes equation. I >>>>>>> thought it was a good idea to use an >>>>>>> example you might know (and didn't find >>>>>>> any that uses GAMG functions). I just >>>>>>> changed the PCMG setup so that the >>>>>>> memory problem appears. And it appears >>>>>>> when adding PCGAMG. >>>>>>> >>>>>>> I don't care about the performance or >>>>>>> even the result rightness here, but only >>>>>>> about the difference in memory use >>>>>>> between 3.6 and 3.10. Do you think >>>>>>> finding a more adapted script would help? >>>>>>> >>>>>>> I used the threshold of 0.1 only once, >>>>>>> at the beginning, to test its influence. >>>>>>> I used the default threshold (of 0, I >>>>>>> guess) for all the other runs. >>>>>>> >>>>>>> Myriam >>>>>>> >>>>>>> >>>>>>> Le 03/11/19 ? 13:52, Mark Adams a ?crit?: >>>>>>>> In looking at this larger scale run ... >>>>>>>> >>>>>>>> * Your eigen estimates are much lower >>>>>>>> than your tiny test problem.? But this >>>>>>>> is Stokes apparently and it should not >>>>>>>> work anyway. Maybe you have a small >>>>>>>> time step that adds a lot of mass that >>>>>>>> brings the eigen estimates down. And >>>>>>>> your min eigenvalue (not used) is >>>>>>>> positive. I would expect negative for >>>>>>>> Stokes ... 
>>>>>>>> >>>>>>>> * You seem to be setting a threshold >>>>>>>> value of 0.1 -- that is very high >>>>>>>> >>>>>>>> * v3.6 says "using nonzero initial >>>>>>>> guess" but this is not in v3.10. Maybe >>>>>>>> we just stopped printing that. >>>>>>>> >>>>>>>> * There were some changes to coasening >>>>>>>> parameters in going from v3.6 but it >>>>>>>> does not look like your problem was >>>>>>>> effected. (The coarsening algo is >>>>>>>> non-deterministic by default and you >>>>>>>> can see small difference on different runs) >>>>>>>> >>>>>>>> * We may have also added a "noisy" RHS >>>>>>>> for eigen estimates by default from v3.6. >>>>>>>> >>>>>>>> * And for non-symetric problems you can >>>>>>>> try -pc_gamg_agg_nsmooths 0, but again >>>>>>>> GAMG is not built for Stokes anyway. >>>>>>>> >>>>>>>> >>>>>>>> On Tue, Mar 5, 2019 at 11:53 AM Myriam >>>>>>>> Peyrounette >>>>>>>> >>>>>>> > >>>>>>>> wrote: >>>>>>>> >>>>>>>> I used PCView to display the size >>>>>>>> of the linear system in each level >>>>>>>> of the MG. You'll find the outputs >>>>>>>> attached to this mail (zip file) >>>>>>>> for both the default threshold >>>>>>>> value and a value of 0.1, and for >>>>>>>> both 3.6 and 3.10 PETSc versions. >>>>>>>> >>>>>>>> For convenience, I summarized the >>>>>>>> information in a graph, also >>>>>>>> attached (png file). >>>>>>>> >>>>>>>> As you can see, there are slight >>>>>>>> differences between the two >>>>>>>> versions but none is critical, in >>>>>>>> my opinion. Do you see anything >>>>>>>> suspicious in the outputs? >>>>>>>> >>>>>>>> + I can't find the default >>>>>>>> threshold value. Do you know where >>>>>>>> I can find it? >>>>>>>> >>>>>>>> Thanks for the follow-up >>>>>>>> >>>>>>>> Myriam >>>>>>>> >>>>>>>> >>>>>>>> Le 03/05/19 ? 14:06, Matthew >>>>>>>> Knepley a ?crit?: >>>>>>>>> On Tue, Mar 5, 2019 at 7:14 AM >>>>>>>>> Myriam Peyrounette >>>>>>>>> >>>>>>>> > >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> Hi Matt, >>>>>>>>> >>>>>>>>> I plotted the memory scalings >>>>>>>>> using different threshold >>>>>>>>> values. The two scalings are >>>>>>>>> slightly translated (from -22 >>>>>>>>> to -88 mB) but this gain is >>>>>>>>> neglectable. The 3.6-scaling >>>>>>>>> keeps being robust while the >>>>>>>>> 3.10-scaling deteriorates. >>>>>>>>> >>>>>>>>> Do you have any other suggestion? >>>>>>>>> >>>>>>>>> Mark, what is the option she can >>>>>>>>> give to output all the GAMG data? >>>>>>>>> >>>>>>>>> Also, run using -ksp_view. GAMG >>>>>>>>> will report all the sizes of its >>>>>>>>> grids, so it should be easy to see >>>>>>>>> if the coarse grid sizes are >>>>>>>>> increasing, and also what the >>>>>>>>> effect of the threshold value is. >>>>>>>>> >>>>>>>>> ? Thanks, >>>>>>>>> >>>>>>>>> ? ? ?Matt? >>>>>>>>> >>>>>>>>> Thanks >>>>>>>>> >>>>>>>>> Myriam >>>>>>>>> >>>>>>>>> Le 03/02/19 ? 02:27, Matthew >>>>>>>>> Knepley a ?crit?: >>>>>>>>>> On Fri, Mar 1, 2019 at 10:53 >>>>>>>>>> AM Myriam Peyrounette via >>>>>>>>>> petsc-users >>>>>>>>>> >>>>>>>>> > >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> I used to run my code >>>>>>>>>> with PETSc 3.6. Since I >>>>>>>>>> upgraded the PETSc version >>>>>>>>>> to 3.10, this code has a >>>>>>>>>> bad memory scaling. >>>>>>>>>> >>>>>>>>>> To report this issue, I >>>>>>>>>> took the PETSc script >>>>>>>>>> ex42.c and slightly >>>>>>>>>> modified it so that the >>>>>>>>>> KSP and PC configurations >>>>>>>>>> are the same as in my >>>>>>>>>> code. In particular, I >>>>>>>>>> use a "personnalised" >>>>>>>>>> multi-grid method. 
The >>>>>>>>>> modifications are >>>>>>>>>> indicated by the keyword >>>>>>>>>> "TopBridge" in the attached >>>>>>>>>> scripts. >>>>>>>>>> >>>>>>>>>> To plot the memory (weak) >>>>>>>>>> scaling, I ran four >>>>>>>>>> calculations for each >>>>>>>>>> script with increasing >>>>>>>>>> problem sizes and >>>>>>>>>> computations cores: >>>>>>>>>> >>>>>>>>>> 1. 100,000 elts on 4 cores >>>>>>>>>> 2. 1 million elts on 40 cores >>>>>>>>>> 3. 10 millions elts on >>>>>>>>>> 400 cores >>>>>>>>>> 4. 100 millions elts on >>>>>>>>>> 4,000 cores >>>>>>>>>> >>>>>>>>>> The resulting graph is >>>>>>>>>> also attached. The >>>>>>>>>> scaling using PETSc 3.10 >>>>>>>>>> clearly deteriorates for >>>>>>>>>> large cases, while the >>>>>>>>>> one using PETSc 3.6 is >>>>>>>>>> robust. >>>>>>>>>> >>>>>>>>>> After a few tests, I >>>>>>>>>> found that the scaling is >>>>>>>>>> mostly sensitive to the >>>>>>>>>> use of the AMG method for >>>>>>>>>> the coarse grid (line 1780 in >>>>>>>>>> main_ex42_petsc36.cc). In >>>>>>>>>> particular, the >>>>>>>>>> performance strongly >>>>>>>>>> deteriorates when >>>>>>>>>> commenting lines 1777 to >>>>>>>>>> 1790 (in >>>>>>>>>> main_ex42_petsc36.cc). >>>>>>>>>> >>>>>>>>>> Do you have any idea of >>>>>>>>>> what changed between >>>>>>>>>> version 3.6 and version >>>>>>>>>> 3.10 that may imply such >>>>>>>>>> degradation? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I believe the default values >>>>>>>>>> for PCGAMG changed between >>>>>>>>>> versions. It sounds like the >>>>>>>>>> coarsening rate >>>>>>>>>> is not great enough, so that >>>>>>>>>> these grids are too large. >>>>>>>>>> This can be set using: >>>>>>>>>> >>>>>>>>>> ??https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html >>>>>>>>>> >>>>>>>>>> There is some explanation of >>>>>>>>>> this effect on that page. Let >>>>>>>>>> us know if setting this does >>>>>>>>>> not correct the situation. >>>>>>>>>> >>>>>>>>>> ? Thanks, >>>>>>>>>> >>>>>>>>>> ? ? ?Matt >>>>>>>>>> ? >>>>>>>>>> >>>>>>>>>> Let me know if you need >>>>>>>>>> further information. >>>>>>>>>> >>>>>>>>>> Best, >>>>>>>>>> >>>>>>>>>> Myriam Peyrounette >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Myriam Peyrounette >>>>>>>>>> CNRS/IDRIS - HLST >>>>>>>>>> -- >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> What most experimenters take >>>>>>>>>> for granted before they begin >>>>>>>>>> their experiments is >>>>>>>>>> infinitely more interesting >>>>>>>>>> than any results to which >>>>>>>>>> their experiments lead. >>>>>>>>>> -- Norbert Wiener >>>>>>>>>> >>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Myriam Peyrounette >>>>>>>>> CNRS/IDRIS - HLST >>>>>>>>> -- >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for >>>>>>>>> granted before they begin their >>>>>>>>> experiments is infinitely more >>>>>>>>> interesting than any results to >>>>>>>>> which their experiments lead. 
>>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Myriam Peyrounette >>>>>>>> CNRS/IDRIS - HLST >>>>>>>> -- >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Myriam Peyrounette >>>>>>> CNRS/IDRIS - HLST >>>>>>> -- >>>>>>> >>>>>> >>>>>> -- >>>>>> Myriam Peyrounette >>>>>> CNRS/IDRIS - HLST >>>>>> -- >>>>> >>>>> -- >>>>> Myriam Peyrounette >>>>> CNRS/IDRIS - HLST >>>>> -- >>>>> >>>> >>>> -- >>>> Myriam Peyrounette >>>> CNRS/IDRIS - HLST >>>> -- >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they >>>> begin their experiments is infinitely more interesting >>>> than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>> >>> -- >>> Myriam Peyrounette >>> CNRS/IDRIS - HLST >>> -- >>> >> >> -- >> Myriam Peyrounette >> CNRS/IDRIS - HLST >> -- >> > > -- > Myriam Peyrounette > CNRS/IDRIS - HLST > -- > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -- Myriam Peyrounette CNRS/IDRIS - HLST -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2975 bytes Desc: Signature cryptographique S/MIME URL: From balay at mcs.anl.gov Tue Mar 26 10:18:52 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Tue, 26 Mar 2019 15:18:52 +0000 Subject: [petsc-users] [Ext] Re: error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 In-Reply-To: References: Message-ID: >>>>>>> balay at sb /home/balay/petsc (maint=) $ git grep MatCreateMPIAIJMKL maint-3.8 maint-3.8:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c: MatCreateMPIAIJMKL - Creates a sparse parallel matrix whose local maint-3.8:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:PetscErrorCode MatCreateMPIAIJMKL(MPI_Comm comm,PetscInt m,PetscInt n,PetscInt M,PetscInt N,PetscInt d_nz,const PetscInt d_nnz[],PetscInt o_nz,const PetscInt o_nnz[],Mat *A) maint-3.8:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:.seealso: MatCreateMPIAIJMKL(), MATSEQAIJMKL, MATMPIAIJMKL maint-3.8:src/mat/impls/aij/seq/aijmkl/aijmkl.c:.seealso: MatCreate(), MatCreateMPIAIJMKL(), MatSetValues() balay at sb /home/balay/petsc (maint=) $ git grep MatCreateMPIAIJMKL maint maint:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c: MatCreateMPIAIJMKL - Creates a sparse parallel matrix whose local maint:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:PetscErrorCode MatCreateMPIAIJMKL(MPI_Comm comm,PetscInt m,PetscInt n,PetscInt M,PetscInt N,PetscInt d_nz,const PetscInt d_nnz[],PetscInt o_nz,const PetscInt o_nnz[],Mat *A) maint:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:.seealso: MatCreateMPIAIJMKL(), MATSEQAIJMKL, MATMPIAIJMKL maint:src/mat/impls/aij/seq/aijmkl/aijmkl.c:.seealso: MatCreate(), MatCreateMPIAIJMKL(), MatSetValues() balay at sb /home/balay/petsc (maint=) $ <<<<<<<<<<< MatCreateMPIAIJMKL() exists in both petsc-3.8 and petsc-3.10. However the public interface is missing from both of these versions. So I'm surprised you don't get the same error with petsc-3.8 Can you try the following change? 
diff --git a/include/petscmat.h b/include/petscmat.h index 1b8ac69377..c66f727994 100644 --- a/include/petscmat.h +++ b/include/petscmat.h @@ -223,7 +223,8 @@ typedef enum {DIFFERENT_NONZERO_PATTERN,SUBSET_NONZERO_PATTERN,SAME_NONZERO_PATT #if defined PETSC_HAVE_MKL_SPARSE PETSC_EXTERN PetscErrorCode MatCreateBAIJMKL(MPI_Comm,PetscInt,PetscInt,PetscInt,PetscInt,PetscInt,PetscInt,const PetscInt[],PetscInt,const PetscInt[],Mat*); -PETSC_EXTERN PetscErrorCode MatCreateSeqBAIJMKL(MPI_Comm comm,PetscInt bs,PetscInt m,PetscInt n,PetscInt nz,const PetscInt nnz[],Mat *A); +PETSC_EXTERN PetscErrorCode MatCreateSeqBAIJMKL(MPI_Comm,PetscInt,PetscInt,PetscInt,PetscInt,const PetscInt[],Mat*); +PETSC_EXTERN PetscErrorCode MatCreateMPIAIJMKL(MPI_Comm,PetscInt,PetscInt,PetscInt,PetscInt,PetscInt,const PetscInt[],PetscInt,const PetscInt[],Mat*); #endif PETSC_EXTERN PetscErrorCode MatCreateSeqSELL(MPI_Comm,PetscInt,PetscInt,PetscInt,const PetscInt[],Mat*); Also note: - this routine is available only when PETSc is built with Intel MKL Satish On Tue, 26 Mar 2019, Kun Jiao via petsc-users wrote: > [kjiao at hyi0016 src/lsqr]% make > [ 50%] Building CXX object lsqr/CMakeFiles/p_lsqr.dir/lsqr.cc.o > /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.cc(318): error: identifier "MatCreateMPIAIJMKL" is undefined > ierr = MatCreateMPIAIJMKL(comm,m,n,M,N,maxnz,dialens,maxnz,offlens,&A);CHKERRQ(ierr); > ^ > > /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.cc(578): error: identifier "MatCreateMPIAIJMKL" is undefined > ierr = MatCreateMPIAIJMKL(comm,m,n,M,N,maxnz,dialens,maxnz,offlens,&A);CHKERRQ(ierr); > ^ > > compilation aborted for /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.cc (code 2) > > Thanks. > > > From: Mark Adams > Sent: Tuesday, March 26, 2019 9:22 AM > To: Kun Jiao > Cc: petsc-users at mcs.anl.gov > Subject: Re: [Ext] Re: [petsc-users] error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 > > I assume the whole error message will have the line of code. Please send the whole error message and line of offending code if not included. > > On Tue, Mar 26, 2019 at 10:08 AM Kun Jiao > wrote: > It is compiling error, error message is: > > error: identifier "MatCreateMPIAIJMKL" is undefined. > > > > > > From: Mark Adams > > Sent: Tuesday, March 26, 2019 6:48 AM > To: Kun Jiao > > Cc: petsc-users at mcs.anl.gov > Subject: [Ext] Re: [petsc-users] error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 > > Please send the output of the error (runtime, compile time, link time?) > > On Mon, Mar 25, 2019 at 10:50 PM Kun Jiao via petsc-users > wrote: > Hi Petsc Experts, > > Is MatCreateMPIAIJMKL retired in 3.10.4? > > I got this error with my code which works fine in 3.8.3 version. > > Regards, > Kun > > > > Schlumberger-Private > > > Schlumberger-Private > > > Schlumberger-Private > From mfadams at lbl.gov Tue Mar 26 10:22:33 2019 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 26 Mar 2019 11:22:33 -0400 Subject: [petsc-users] [Ext] Re: error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 In-Reply-To: References: Message-ID: So this works with v3.8? I don't see any differences (I see Satish figured this out and has suggestions). 
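If rebuilding PETSc with that header change is not convenient right away, one possible stop-gap (just a sketch, and it assumes the build really was configured with MKL sparse support, so the symbol is already in the library and only the declaration is missing) is to repeat the prototype from mpiaijmkl.c in the calling source after including petscmat.h:

   #include <petscmat.h>
   /* copied from src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c; remove once petscmat.h exports it */
   PETSC_EXTERN PetscErrorCode MatCreateMPIAIJMKL(MPI_Comm,PetscInt,PetscInt,PetscInt,PetscInt,PetscInt,const PetscInt[],PetscInt,const PetscInt[],Mat*);

PETSC_EXTERN should also give the declaration C linkage when the file is compiled as C++, which matters for the lsqr.cc build shown above.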
You could also work around it with code like this: ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr); ierr = MatSetType(A,MATAIJMKL);CHKERRQ(ierr); ierr = MatMPIAIJSetPreallocation(A,0,ourlens,0,offlens); On Tue, Mar 26, 2019 at 11:00 AM Kun Jiao wrote: > [kjiao at hyi0016 src/lsqr]% make > > [ 50%] Building CXX object lsqr/CMakeFiles/p_lsqr.dir/lsqr.cc.o > > /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.cc(318): > error: identifier "MatCreateMPIAIJMKL" is undefined > > ierr = > MatCreateMPIAIJMKL(comm,m,n,M,N,maxnz,dialens,maxnz,offlens,&A);CHKERRQ(ierr); > > ^ > > > > /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.cc(578): > error: identifier "MatCreateMPIAIJMKL" is undefined > > ierr = > MatCreateMPIAIJMKL(comm,m,n,M,N,maxnz,dialens,maxnz,offlens,&A);CHKERRQ(ierr); > > ^ > > > > compilation aborted for > /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.cc (code > 2) > > > > Thanks. > > > > > > *From:* Mark Adams > *Sent:* Tuesday, March 26, 2019 9:22 AM > *To:* Kun Jiao > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [Ext] Re: [petsc-users] error: identifier > "MatCreateMPIAIJMKL" is undefined in 3.10.4 > > > > I assume the whole error message will have the line of code. Please send > the whole error message and line of offending code if not included. > > > > On Tue, Mar 26, 2019 at 10:08 AM Kun Jiao wrote: > > It is compiling error, error message is: > > > > error: identifier "MatCreateMPIAIJMKL" is undefined. > > > > > > > > > > > > *From:* Mark Adams > *Sent:* Tuesday, March 26, 2019 6:48 AM > *To:* Kun Jiao > *Cc:* petsc-users at mcs.anl.gov > *Subject:* [Ext] Re: [petsc-users] error: identifier "MatCreateMPIAIJMKL" > is undefined in 3.10.4 > > > > Please send the output of the error (runtime, compile time, link time?) > > > > On Mon, Mar 25, 2019 at 10:50 PM Kun Jiao via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hi Petsc Experts, > > > > Is MatCreateMPIAIJMKL retired in 3.10.4? > > > > I got this error with my code which works fine in 3.8.3 version. > > > > Regards, > > Kun > > > > > > Schlumberger-Private > > > > Schlumberger-Private > > > > Schlumberger-Private > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From colin.cotter at imperial.ac.uk Tue Mar 26 13:53:59 2019 From: colin.cotter at imperial.ac.uk (Cotter, Colin J) Date: Tue, 26 Mar 2019 18:53:59 +0000 Subject: [petsc-users] Confusing Schur preconditioner behaviour In-Reply-To: References: , Message-ID: Hi Dave, Thanks for the tip - you were right, and this works better for higher resolutions now. all the best --Colin ________________________________ From: Dave May Sent: 19 March 2019 11:25:11 To: Cotter, Colin J Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Confusing Schur preconditioner behaviour Hi Colin, On Tue, 19 Mar 2019 at 09:33, Cotter, Colin J > wrote: Hi Dave, >If you are doing that, then you need to tell fieldsplit to use the Amat to define the splits otherwise it will define the Schur compliment as >S = B22 - B21 inv(B11) B12 >preconditiones with B22, where as what you want is >S = A22 - A21 inv(A11) A12 >preconditioned with B22. >If your operators are set up this way and you didn't indicate to use Amat to define S this would definitely explain why preonly works but iterating on Schur does not. Yes, thanks - this solves it! I need pc_use_amat. Okay great. But doesn't that option eradicate your custom Schur complement object which you inserted into the Bmat in the (2,2) slot? 
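For concreteness, the kind of setup being referred to looks roughly like this (a sketch with made-up names, not the actual code), where Shat is the user-assembled approximation placed in the (2,2) slot of the Bmat so that fieldsplit uses it to precondition S:

   Mat Asub[4] = {A00, A01, A10, A11};    /* blocks of the true operator */
   Mat Bsub[4] = {A00, A01, A10, Shat};   /* same blocks, but Shat replaces A11 */
   Mat Amat, Bmat;
   ierr = MatCreateNest(comm,2,NULL,2,NULL,Asub,&Amat);CHKERRQ(ierr);
   ierr = MatCreateNest(comm,2,NULL,2,NULL,Bsub,&Bmat);CHKERRQ(ierr);
   ierr = KSPSetOperators(ksp,Amat,Bmat);CHKERRQ(ierr);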
I thought you would use the option -pc_fieldsplit_diag_use_amat In general for fieldsplit (Schur) I found that the best way to manage user defined Schur complement preconditioners is via PCFieldSplitSetSchurPre(). https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetSchurPre.html#PCFieldSplitSetSchurPre Also, for solver debugging purposes with fieldsplit and MatNest, I find it incredibly useful to attach textual names to all the matrices going to into FieldSplit. You can use PetscObjectSetName() with each of your sub-matrices in the Amat and the Bmat, and any schur complement operators. The textual names will be displayed in KSP view. In that way you have a better chance of understanding which operators are being used where. (Note that this trick is less useful with the Amat and Bmat are AIJ matrices). Below is an example KSPView associated with 2x2 block system where I've attached the names Auu,Aup,Apu,App, and S* to the Amat sub-matices and the schur complement preconditioner. PC Object:(dcy_) 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses (lumped, if requested) A00's diagonal's inverse Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (dcy_fieldsplit_u_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (dcy_fieldsplit_u_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 0., needed 0. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=85728, cols=85728 package used to perform factorization: umfpack total: nonzeros=0, allocated nonzeros=0 total number of mallocs used during MatSetValues calls =0 not using I-node routines UMFPACK run parameters: Control[UMFPACK_PRL]: 1. Control[UMFPACK_STRATEGY]: 0. Control[UMFPACK_DENSE_COL]: 0.2 Control[UMFPACK_DENSE_ROW]: 0.2 Control[UMFPACK_AMD_DENSE]: 10. Control[UMFPACK_BLOCK_SIZE]: 32. Control[UMFPACK_FIXQ]: 0. Control[UMFPACK_AGGRESSIVE]: 1. Control[UMFPACK_PIVOT_TOLERANCE]: 0.1 Control[UMFPACK_SYM_PIVOT_TOLERANCE]: 0.001 Control[UMFPACK_SCALE]: 1. Control[UMFPACK_ALLOC_INIT]: 0.7 Control[UMFPACK_DROPTOL]: 0. Control[UMFPACK_IRSTEP]: 0. Control[UMFPACK_ORDERING]: AMD (not using the PETSc ordering) linear system matrix = precond matrix: Mat Object: Auu (dcy_fieldsplit_u_) 1 MPI processes type: seqaij rows=85728, cols=85728 total: nonzeros=1028736, allocated nonzeros=1028736 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 21432 nodes, limit used is 5 KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (dcy_fieldsplit_p_) 1 MPI processes type: fgmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=300, initial guess is zero tolerances: relative=0.01, absolute=1e-50, divergence=10000. right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (dcy_fieldsplit_p_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 0., needed 0. 
Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=32542, cols=32542 package used to perform factorization: umfpack total: nonzeros=0, allocated nonzeros=0 total number of mallocs used during MatSetValues calls =0 not using I-node routines UMFPACK run parameters: Control[UMFPACK_PRL]: 1. Control[UMFPACK_STRATEGY]: 0. Control[UMFPACK_DENSE_COL]: 0.2 Control[UMFPACK_DENSE_ROW]: 0.2 Control[UMFPACK_AMD_DENSE]: 10. Control[UMFPACK_BLOCK_SIZE]: 32. Control[UMFPACK_FIXQ]: 0. Control[UMFPACK_AGGRESSIVE]: 1. Control[UMFPACK_PIVOT_TOLERANCE]: 0.1 Control[UMFPACK_SYM_PIVOT_TOLERANCE]: 0.001 Control[UMFPACK_SCALE]: 1. Control[UMFPACK_ALLOC_INIT]: 0.7 Control[UMFPACK_DROPTOL]: 0. Control[UMFPACK_IRSTEP]: 0. Control[UMFPACK_ORDERING]: AMD (not using the PETSc ordering) linear system matrix followed by preconditioner matrix: Mat Object: (dcy_fieldsplit_p_) 1 MPI processes type: schurcomplement rows=32542, cols=32542 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: App (dcy_fieldsplit_p_) 1 MPI processes type: seqaij rows=32542, cols=32542 total: nonzeros=548482, allocated nonzeros=548482 total number of mallocs used during MatSetValues calls =0 not using I-node routines A10 Mat Object: Apu 1 MPI processes type: seqaij rows=32542, cols=85728 total: nonzeros=857280, allocated nonzeros=921990 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (dcy_fieldsplit_u_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (dcy_fieldsplit_u_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 0., needed 0. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=85728, cols=85728 package used to perform factorization: umfpack total: nonzeros=0, allocated nonzeros=0 total number of mallocs used during MatSetValues calls =0 not using I-node routines UMFPACK run parameters: Control[UMFPACK_PRL]: 1. Control[UMFPACK_STRATEGY]: 0. Control[UMFPACK_DENSE_COL]: 0.2 Control[UMFPACK_DENSE_ROW]: 0.2 Control[UMFPACK_AMD_DENSE]: 10. Control[UMFPACK_BLOCK_SIZE]: 32. Control[UMFPACK_FIXQ]: 0. Control[UMFPACK_AGGRESSIVE]: 1. Control[UMFPACK_PIVOT_TOLERANCE]: 0.1 Control[UMFPACK_SYM_PIVOT_TOLERANCE]: 0.001 Control[UMFPACK_SCALE]: 1. Control[UMFPACK_ALLOC_INIT]: 0.7 Control[UMFPACK_DROPTOL]: 0. Control[UMFPACK_IRSTEP]: 0. 
Control[UMFPACK_ORDERING]: AMD (not using the PETSc ordering) linear system matrix = precond matrix: Mat Object: Auu (dcy_fieldsplit_u_) 1 MPI processes type: seqaij rows=85728, cols=85728 total: nonzeros=1028736, allocated nonzeros=1028736 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 21432 nodes, limit used is 5 A01 Mat Object: Aup 1 MPI processes type: seqaij rows=85728, cols=32542 total: nonzeros=857280, allocated nonzeros=942634 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 21432 nodes, limit used is 5 Mat Object: S* 1 MPI processes type: seqaij rows=32542, cols=32542 total: nonzeros=548482, allocated nonzeros=548482 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: nest rows=118270, cols=118270 has attached null space Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : name="Auu", prefix="dcy_fieldsplit_u_", type=seqaij, rows=85728, cols=85728 (0,1) : name="Aup", type=seqaij, rows=85728, cols=32542 (1,0) : name="Apu", type=seqaij, rows=32542, cols=85728 (1,1) : name="App", prefix="dcy_fieldsplit_p_", type=seqaij, rows=32542, cols=32542 all the best --Colin -------------- next part -------------- An HTML attachment was scrubbed... URL: From KJiao at slb.com Tue Mar 26 14:00:13 2019 From: KJiao at slb.com (Kun Jiao) Date: Tue, 26 Mar 2019 19:00:13 +0000 Subject: [petsc-users] [Ext] Re: error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 In-Reply-To: References: Message-ID: Strange things, when I compile my code in the test dir in PETSC, it works. After I "make install" PETSC, and try to compile my code against the installed PETSC, it doesn't work any more. I guess this is what you means. Is there any way to reenable MatCreateMPIAIJMKL public interface? 
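If it helps, the MatSetType() route Mark showed earlier would, as far as I can tell, look roughly like the following with the same comm/m/n/M/N/maxnz/dialens/offlens arguments as my MatCreateMPIAIJMKL() call (an untested sketch, not verified code):

   ierr = MatCreate(comm,&A);CHKERRQ(ierr);
   ierr = MatSetSizes(A,m,n,M,N);CHKERRQ(ierr);
   ierr = MatSetType(A,MATAIJMKL);CHKERRQ(ierr);
   ierr = MatMPIAIJSetPreallocation(A,maxnz,dialens,maxnz,offlens);CHKERRQ(ierr);  /* takes effect when A is a parallel matrix */
   ierr = MatSeqAIJSetPreallocation(A,maxnz,dialens);CHKERRQ(ierr);                /* takes effect when A is a sequential matrix */

That would avoid the missing prototype entirely, since only the type name MATAIJMKL from petscmat.h is needed at compile time.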
And, I am using intel MKL, here is my configure option: Configure Options: --configModules=PETSc.Configure --optionsModule=config.compilerOptions PETSC_ARCH=linux-gnu-intel --with-precision=single --with-cc=mpiicc --with-cxx=mpiicc --with-fc=mpiifort --with-mpi-include=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mpi/intel64/include --with-mpi-lib="-L/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel//compilers_and_libraries_2019.2.187/linux/mpi/intel64/lib -lmpifort -lmpi_ilp64" --with-blaslapack-lib="-L/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl/lib/intel64 -Wl, --no-as-needed -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -ldl" --with-scalapack-lib="-L/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl/lib/intel64 -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64" --with-scalapack-include=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl/include --with-mkl_pardiso-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl --with-mkl_sparse=1 --with-mkl_sparse-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl --with-mkl_cpardiso=1 --with-mkl_cpardiso-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl --with-mkl_sparse_optimize=1 --with-mkl_sparse_optimize-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl --with-mkl_sparse_sp2m=1 --with-mkl_sparse_sp2m-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl --with-cmake=1 --prefix=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/petsc_3.9.4 --known-endian=big --with-debugging=0 --COPTFLAGS=" -Ofast -xHost" --CXXOPTFLAGS=" -Ofast -xHost" --FOPTFLAGS=" -Ofast -xHost" --with-x=0 Working directory: /wgdisk/hy3300/source_code_dev/imaging/kjiao/petsc-3.10.4 Schlumberger-Private -----Original Message----- From: Balay, Satish Sent: Tuesday, March 26, 2019 10:19 AM To: Kun Jiao Cc: Mark Adams ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] [Ext] Re: error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 >>>>>>> balay at sb /home/balay/petsc (maint=) $ git grep MatCreateMPIAIJMKL maint-3.8 maint-3.8:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c: MatCreateMPIAIJMKL - Creates a sparse parallel matrix whose local maint-3.8:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:PetscErrorCode MatCreateMPIAIJMKL(MPI_Comm comm,PetscInt m,PetscInt n,PetscInt M,PetscInt N,PetscInt d_nz,const PetscInt d_nnz[],PetscInt o_nz,const PetscInt o_nnz[],Mat *A) maint-3.8:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:.seealso: MatCreateMPIAIJMKL(), MATSEQAIJMKL, MATMPIAIJMKL maint-3.8:src/mat/impls/aij/seq/aijmkl/aijmkl.c:.seealso: MatCreate(), MatCreateMPIAIJMKL(), MatSetValues() balay at sb /home/balay/petsc (maint=) $ git grep MatCreateMPIAIJMKL maint maint:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c: MatCreateMPIAIJMKL - Creates a sparse parallel matrix whose local maint:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:PetscErrorCode MatCreateMPIAIJMKL(MPI_Comm comm,PetscInt m,PetscInt n,PetscInt M,PetscInt N,PetscInt d_nz,const PetscInt d_nnz[],PetscInt o_nz,const PetscInt o_nnz[],Mat *A) maint:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:.seealso: 
MatCreateMPIAIJMKL(), MATSEQAIJMKL, MATMPIAIJMKL maint:src/mat/impls/aij/seq/aijmkl/aijmkl.c:.seealso: MatCreate(), MatCreateMPIAIJMKL(), MatSetValues() balay at sb /home/balay/petsc (maint=) $ <<<<<<<<<<< MatCreateMPIAIJMKL() exists in both petsc-3.8 and petsc-3.10. However the public interface is missing from both of these versions. So I'm surprised you don't get the same error with petsc-3.8 Can you try the following change? diff --git a/include/petscmat.h b/include/petscmat.h index 1b8ac69377..c66f727994 100644 --- a/include/petscmat.h +++ b/include/petscmat.h @@ -223,7 +223,8 @@ typedef enum {DIFFERENT_NONZERO_PATTERN,SUBSET_NONZERO_PATTERN,SAME_NONZERO_PATT #if defined PETSC_HAVE_MKL_SPARSE PETSC_EXTERN PetscErrorCode MatCreateBAIJMKL(MPI_Comm,PetscInt,PetscInt,PetscInt,PetscInt,PetscInt,PetscInt,const PetscInt[],PetscInt,const PetscInt[],Mat*); -PETSC_EXTERN PetscErrorCode MatCreateSeqBAIJMKL(MPI_Comm comm,PetscInt bs,PetscInt m,PetscInt n,PetscInt nz,const PetscInt nnz[],Mat *A); +PETSC_EXTERN PetscErrorCode +MatCreateSeqBAIJMKL(MPI_Comm,PetscInt,PetscInt,PetscInt,PetscInt,const +PetscInt[],Mat*); PETSC_EXTERN PetscErrorCode +MatCreateMPIAIJMKL(MPI_Comm,PetscInt,PetscInt,PetscInt,PetscInt,PetscIn +t,const PetscInt[],PetscInt,const PetscInt[],Mat*); #endif PETSC_EXTERN PetscErrorCode MatCreateSeqSELL(MPI_Comm,PetscInt,PetscInt,PetscInt,const PetscInt[],Mat*); Also note: - this routine is available only when PETSc is built with Intel MKL Satish On Tue, 26 Mar 2019, Kun Jiao via petsc-users wrote: > [kjiao at hyi0016 src/lsqr]% make > [ 50%] Building CXX object lsqr/CMakeFiles/p_lsqr.dir/lsqr.cc.o > /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.cc(318): error: identifier "MatCreateMPIAIJMKL" is undefined > ierr = MatCreateMPIAIJMKL(comm,m,n,M,N,maxnz,dialens,maxnz,offlens,&A);CHKERRQ(ierr); > ^ > > /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.cc(578): error: identifier "MatCreateMPIAIJMKL" is undefined > ierr = MatCreateMPIAIJMKL(comm,m,n,M,N,maxnz,dialens,maxnz,offlens,&A);CHKERRQ(ierr); > ^ > > compilation aborted for > /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.cc > (code 2) > > Thanks. > > > From: Mark Adams > Sent: Tuesday, March 26, 2019 9:22 AM > To: Kun Jiao > Cc: petsc-users at mcs.anl.gov > Subject: Re: [Ext] Re: [petsc-users] error: identifier > "MatCreateMPIAIJMKL" is undefined in 3.10.4 > > I assume the whole error message will have the line of code. Please send the whole error message and line of offending code if not included. > > On Tue, Mar 26, 2019 at 10:08 AM Kun Jiao > wrote: > It is compiling error, error message is: > > error: identifier "MatCreateMPIAIJMKL" is undefined. > > > > > > From: Mark Adams > > Sent: Tuesday, March 26, 2019 6:48 AM > To: Kun Jiao > > Cc: petsc-users at mcs.anl.gov > Subject: [Ext] Re: [petsc-users] error: identifier > "MatCreateMPIAIJMKL" is undefined in 3.10.4 > > Please send the output of the error (runtime, compile time, link > time?) > > On Mon, Mar 25, 2019 at 10:50 PM Kun Jiao via petsc-users > wrote: > Hi Petsc Experts, > > Is MatCreateMPIAIJMKL retired in 3.10.4? > > I got this error with my code which works fine in 3.8.3 version. 
> > Regards, > Kun > > > > Schlumberger-Private > > > Schlumberger-Private > > > Schlumberger-Private > From mfadams at lbl.gov Tue Mar 26 15:37:53 2019 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 26 Mar 2019 16:37:53 -0400 Subject: [petsc-users] [Ext] Re: error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 In-Reply-To: References: Message-ID: On Tue, Mar 26, 2019 at 3:00 PM Kun Jiao wrote: > Strange things, when I compile my code in the test dir in PETSC, it works. > After I "make install" PETSC, and try to compile my code against the > installed PETSC, it doesn't work any more. > I'm not sure I follow what you are doing exactly but look at the compile lines (good and bad) and compare them. If one works and one does not then they must be different. Anyway, as Satish said this interface was not enabled in any version that we see. (So we are puzzled that any version works.) You can wait for a fix to get pushed but using the method that I showed you should work now. > > I guess this is what you means. > > Is there any way to reenable MatCreateMPIAIJMKL public interface? > > And, I am using intel MKL, here is my configure option: > > Configure Options: --configModules=PETSc.Configure > --optionsModule=config.compilerOptions PETSC_ARCH=linux-gnu-intel > --with-precision=single --with-cc=mpiicc --with-cxx=mpiicc > --with-fc=mpiifort > --with-mpi-include=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mpi/intel64/include > --with-mpi-lib="-L/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel//compilers_and_libraries_2019.2.187/linux/mpi/intel64/lib > -lmpifort -lmpi_ilp64" > --with-blaslapack-lib="-L/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl/lib/intel64 > -Wl, --no-as-needed -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread > -lm -ldl" > --with-scalapack-lib="-L/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl/lib/intel64 > -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64" > --with-scalapack-include=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl/include > --with-mkl_pardiso-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl > --with-mkl_sparse=1 > --with-mkl_sparse-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl > --with-mkl_cpardiso=1 > --with-mkl_cpardiso-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl > --with-mkl_sparse_optimize=1 > --with-mkl_sparse_optimize-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl > --with-mkl_sparse_sp2m=1 > --with-mkl_sparse_sp2m-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl > --with-cmake=1 > --prefix=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/petsc_3.9.4 > --known-endian=big --with-debugging=0 --COPTFLAGS=" -Ofast -xHost" > --CXXOPTFLAGS=" -Ofast -xHost" --FOPTFLAGS=" -Ofast -xHost" --with-x=0 > Working directory: > /wgdisk/hy3300/source_code_dev/imaging/kjiao/petsc-3.10.4 > > > > Schlumberger-Private > > -----Original Message----- > From: Balay, Satish > Sent: Tuesday, March 26, 2019 10:19 AM > To: Kun Jiao > Cc: Mark Adams ; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] [Ext] Re: error: 
identifier > "MatCreateMPIAIJMKL" is undefined in 3.10.4 > > >>>>>>> > balay at sb /home/balay/petsc (maint=) > $ git grep MatCreateMPIAIJMKL maint-3.8 > maint-3.8:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c: MatCreateMPIAIJMKL - > Creates a sparse parallel matrix whose local > maint-3.8:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:PetscErrorCode > MatCreateMPIAIJMKL(MPI_Comm comm,PetscInt m,PetscInt n,PetscInt M,PetscInt > N,PetscInt d_nz,const PetscInt d_nnz[],PetscInt o_nz,const PetscInt > o_nnz[],Mat *A) > maint-3.8:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:.seealso: > MatCreateMPIAIJMKL(), MATSEQAIJMKL, MATMPIAIJMKL > maint-3.8:src/mat/impls/aij/seq/aijmkl/aijmkl.c:.seealso: MatCreate(), > MatCreateMPIAIJMKL(), MatSetValues() balay at sb /home/balay/petsc (maint=) > $ git grep MatCreateMPIAIJMKL maint > maint:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c: MatCreateMPIAIJMKL - > Creates a sparse parallel matrix whose local > maint:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:PetscErrorCode > MatCreateMPIAIJMKL(MPI_Comm comm,PetscInt m,PetscInt n,PetscInt M,PetscInt > N,PetscInt d_nz,const PetscInt d_nnz[],PetscInt o_nz,const PetscInt > o_nnz[],Mat *A) > maint:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:.seealso: > MatCreateMPIAIJMKL(), MATSEQAIJMKL, MATMPIAIJMKL > maint:src/mat/impls/aij/seq/aijmkl/aijmkl.c:.seealso: MatCreate(), > MatCreateMPIAIJMKL(), MatSetValues() balay at sb /home/balay/petsc (maint=) > $ <<<<<<<<<<< > > MatCreateMPIAIJMKL() exists in both petsc-3.8 and petsc-3.10. However the > public interface is missing from both of these versions. So I'm surprised > you don't get the same error with petsc-3.8 > > Can you try the following change? > > diff --git a/include/petscmat.h b/include/petscmat.h index > 1b8ac69377..c66f727994 100644 > --- a/include/petscmat.h > +++ b/include/petscmat.h > @@ -223,7 +223,8 @@ typedef enum > {DIFFERENT_NONZERO_PATTERN,SUBSET_NONZERO_PATTERN,SAME_NONZERO_PATT > > #if defined PETSC_HAVE_MKL_SPARSE > PETSC_EXTERN PetscErrorCode > MatCreateBAIJMKL(MPI_Comm,PetscInt,PetscInt,PetscInt,PetscInt,PetscInt,PetscInt,const > PetscInt[],PetscInt,const PetscInt[],Mat*); -PETSC_EXTERN PetscErrorCode > MatCreateSeqBAIJMKL(MPI_Comm comm,PetscInt bs,PetscInt m,PetscInt > n,PetscInt nz,const PetscInt nnz[],Mat *A); > +PETSC_EXTERN PetscErrorCode > +MatCreateSeqBAIJMKL(MPI_Comm,PetscInt,PetscInt,PetscInt,PetscInt,const > +PetscInt[],Mat*); PETSC_EXTERN PetscErrorCode > +MatCreateMPIAIJMKL(MPI_Comm,PetscInt,PetscInt,PetscInt,PetscInt,PetscIn > +t,const PetscInt[],PetscInt,const PetscInt[],Mat*); > #endif > > PETSC_EXTERN PetscErrorCode > MatCreateSeqSELL(MPI_Comm,PetscInt,PetscInt,PetscInt,const PetscInt[],Mat*); > > > Also note: - this routine is available only when PETSc is built with Intel > MKL > > Satish > > On Tue, 26 Mar 2019, Kun Jiao via petsc-users wrote: > > > [kjiao at hyi0016 src/lsqr]% make > > [ 50%] Building CXX object lsqr/CMakeFiles/p_lsqr.dir/lsqr.cc.o > > > /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.cc(318): > error: identifier "MatCreateMPIAIJMKL" is undefined > > ierr = > MatCreateMPIAIJMKL(comm,m,n,M,N,maxnz,dialens,maxnz,offlens,&A);CHKERRQ(ierr); > > ^ > > > > > /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.cc(578): > error: identifier "MatCreateMPIAIJMKL" is undefined > > ierr = > MatCreateMPIAIJMKL(comm,m,n,M,N,maxnz,dialens,maxnz,offlens,&A);CHKERRQ(ierr); > > ^ > > > > compilation aborted for > > /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.cc > > (code 2) > > > > Thanks. 
> > > > > > From: Mark Adams > > Sent: Tuesday, March 26, 2019 9:22 AM > > To: Kun Jiao > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [Ext] Re: [petsc-users] error: identifier > > "MatCreateMPIAIJMKL" is undefined in 3.10.4 > > > > I assume the whole error message will have the line of code. Please send > the whole error message and line of offending code if not included. > > > > On Tue, Mar 26, 2019 at 10:08 AM Kun Jiao KJiao at slb.com>> wrote: > > It is compiling error, error message is: > > > > error: identifier "MatCreateMPIAIJMKL" is undefined. > > > > > > > > > > > > From: Mark Adams > > > Sent: Tuesday, March 26, 2019 6:48 AM > > To: Kun Jiao > > > Cc: petsc-users at mcs.anl.gov > > Subject: [Ext] Re: [petsc-users] error: identifier > > "MatCreateMPIAIJMKL" is undefined in 3.10.4 > > > > Please send the output of the error (runtime, compile time, link > > time?) > > > > On Mon, Mar 25, 2019 at 10:50 PM Kun Jiao via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hi Petsc Experts, > > > > Is MatCreateMPIAIJMKL retired in 3.10.4? > > > > I got this error with my code which works fine in 3.8.3 version. > > > > Regards, > > Kun > > > > > > > > Schlumberger-Private > > > > > > Schlumberger-Private > > > > > > Schlumberger-Private > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Tue Mar 26 15:41:44 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Tue, 26 Mar 2019 20:41:44 +0000 Subject: [petsc-users] [Ext] Re: error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 In-Reply-To: References: Message-ID: Please apply the patch I sent earlier and retry. Satish On Tue, 26 Mar 2019, Kun Jiao via petsc-users wrote: > Strange things, when I compile my code in the test dir in PETSC, it works. After I "make install" PETSC, and try to compile my code against the installed PETSC, it doesn't work any more. > > I guess this is what you means. > > Is there any way to reenable MatCreateMPIAIJMKL public interface? 
> > And, I am using intel MKL, here is my configure option: > > Configure Options: --configModules=PETSc.Configure --optionsModule=config.compilerOptions PETSC_ARCH=linux-gnu-intel --with-precision=single --with-cc=mpiicc --with-cxx=mpiicc --with-fc=mpiifort --with-mpi-include=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mpi/intel64/include --with-mpi-lib="-L/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel//compilers_and_libraries_2019.2.187/linux/mpi/intel64/lib -lmpifort -lmpi_ilp64" --with-blaslapack-lib="-L/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl/lib/intel64 -Wl, --no-as-needed -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -ldl" --with-scalapack-lib="-L/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl/lib/intel64 -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64" --with-scalapack-include=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl/include --with-mkl_pardiso-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl --with-mkl_sparse=1 --with-mkl_sparse-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl --with-mkl_cpardiso=1 --with-mkl_cpardiso-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl --with-mkl_sparse_optimize=1 --with-mkl_sparse_optimize-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl --with-mkl_sparse_sp2m=1 --with-mkl_sparse_sp2m-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl --with-cmake=1 --prefix=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/petsc_3.9.4 --known-endian=big --with-debugging=0 --COPTFLAGS=" -Ofast -xHost" --CXXOPTFLAGS=" -Ofast -xHost" --FOPTFLAGS=" -Ofast -xHost" --with-x=0 > Working directory: /wgdisk/hy3300/source_code_dev/imaging/kjiao/petsc-3.10.4 > > > > Schlumberger-Private > > -----Original Message----- > From: Balay, Satish > Sent: Tuesday, March 26, 2019 10:19 AM > To: Kun Jiao > Cc: Mark Adams ; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] [Ext] Re: error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 > > >>>>>>> > balay at sb /home/balay/petsc (maint=) > $ git grep MatCreateMPIAIJMKL maint-3.8 > maint-3.8:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c: MatCreateMPIAIJMKL - Creates a sparse parallel matrix whose local > maint-3.8:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:PetscErrorCode MatCreateMPIAIJMKL(MPI_Comm comm,PetscInt m,PetscInt n,PetscInt M,PetscInt N,PetscInt d_nz,const PetscInt d_nnz[],PetscInt o_nz,const PetscInt o_nnz[],Mat *A) > maint-3.8:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:.seealso: MatCreateMPIAIJMKL(), MATSEQAIJMKL, MATMPIAIJMKL > maint-3.8:src/mat/impls/aij/seq/aijmkl/aijmkl.c:.seealso: MatCreate(), MatCreateMPIAIJMKL(), MatSetValues() balay at sb /home/balay/petsc (maint=) $ git grep MatCreateMPIAIJMKL maint > maint:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c: MatCreateMPIAIJMKL - Creates a sparse parallel matrix whose local > maint:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:PetscErrorCode MatCreateMPIAIJMKL(MPI_Comm comm,PetscInt m,PetscInt n,PetscInt M,PetscInt N,PetscInt d_nz,const PetscInt d_nnz[],PetscInt o_nz,const PetscInt o_nnz[],Mat *A) > 
maint:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:.seealso: MatCreateMPIAIJMKL(), MATSEQAIJMKL, MATMPIAIJMKL > maint:src/mat/impls/aij/seq/aijmkl/aijmkl.c:.seealso: MatCreate(), MatCreateMPIAIJMKL(), MatSetValues() balay at sb /home/balay/petsc (maint=) $ <<<<<<<<<<< > > MatCreateMPIAIJMKL() exists in both petsc-3.8 and petsc-3.10. However the public interface is missing from both of these versions. So I'm surprised you don't get the same error with petsc-3.8 > > Can you try the following change? > > diff --git a/include/petscmat.h b/include/petscmat.h index 1b8ac69377..c66f727994 100644 > --- a/include/petscmat.h > +++ b/include/petscmat.h > @@ -223,7 +223,8 @@ typedef enum {DIFFERENT_NONZERO_PATTERN,SUBSET_NONZERO_PATTERN,SAME_NONZERO_PATT > > #if defined PETSC_HAVE_MKL_SPARSE > PETSC_EXTERN PetscErrorCode MatCreateBAIJMKL(MPI_Comm,PetscInt,PetscInt,PetscInt,PetscInt,PetscInt,PetscInt,const PetscInt[],PetscInt,const PetscInt[],Mat*); -PETSC_EXTERN PetscErrorCode MatCreateSeqBAIJMKL(MPI_Comm comm,PetscInt bs,PetscInt m,PetscInt n,PetscInt nz,const PetscInt nnz[],Mat *A); > +PETSC_EXTERN PetscErrorCode > +MatCreateSeqBAIJMKL(MPI_Comm,PetscInt,PetscInt,PetscInt,PetscInt,const > +PetscInt[],Mat*); PETSC_EXTERN PetscErrorCode > +MatCreateMPIAIJMKL(MPI_Comm,PetscInt,PetscInt,PetscInt,PetscInt,PetscIn > +t,const PetscInt[],PetscInt,const PetscInt[],Mat*); > #endif > > PETSC_EXTERN PetscErrorCode MatCreateSeqSELL(MPI_Comm,PetscInt,PetscInt,PetscInt,const PetscInt[],Mat*); > > > Also note: - this routine is available only when PETSc is built with Intel MKL > > Satish > > On Tue, 26 Mar 2019, Kun Jiao via petsc-users wrote: > > > [kjiao at hyi0016 src/lsqr]% make > > [ 50%] Building CXX object lsqr/CMakeFiles/p_lsqr.dir/lsqr.cc.o > > /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.cc(318): error: identifier "MatCreateMPIAIJMKL" is undefined > > ierr = MatCreateMPIAIJMKL(comm,m,n,M,N,maxnz,dialens,maxnz,offlens,&A);CHKERRQ(ierr); > > ^ > > > > /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.cc(578): error: identifier "MatCreateMPIAIJMKL" is undefined > > ierr = MatCreateMPIAIJMKL(comm,m,n,M,N,maxnz,dialens,maxnz,offlens,&A);CHKERRQ(ierr); > > ^ > > > > compilation aborted for > > /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.cc > > (code 2) > > > > Thanks. > > > > > > From: Mark Adams > > Sent: Tuesday, March 26, 2019 9:22 AM > > To: Kun Jiao > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [Ext] Re: [petsc-users] error: identifier > > "MatCreateMPIAIJMKL" is undefined in 3.10.4 > > > > I assume the whole error message will have the line of code. Please send the whole error message and line of offending code if not included. > > > > On Tue, Mar 26, 2019 at 10:08 AM Kun Jiao > wrote: > > It is compiling error, error message is: > > > > error: identifier "MatCreateMPIAIJMKL" is undefined. > > > > > > > > > > > > From: Mark Adams > > > Sent: Tuesday, March 26, 2019 6:48 AM > > To: Kun Jiao > > > Cc: petsc-users at mcs.anl.gov > > Subject: [Ext] Re: [petsc-users] error: identifier > > "MatCreateMPIAIJMKL" is undefined in 3.10.4 > > > > Please send the output of the error (runtime, compile time, link > > time?) > > > > On Mon, Mar 25, 2019 at 10:50 PM Kun Jiao via petsc-users > wrote: > > Hi Petsc Experts, > > > > Is MatCreateMPIAIJMKL retired in 3.10.4? > > > > I got this error with my code which works fine in 3.8.3 version. 
> > > > Regards, > > Kun > > > > > > > > Schlumberger-Private > > > > > > Schlumberger-Private > > > > > > Schlumberger-Private > > > From KJiao at slb.com Tue Mar 26 16:29:59 2019 From: KJiao at slb.com (Kun Jiao) Date: Tue, 26 Mar 2019 21:29:59 +0000 Subject: [petsc-users] [Ext] Re: error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 In-Reply-To: References: Message-ID: One strange thing I just found out. Compile a *.c file make it work. mpiicc -o ex5.o -c -fPIC -wd1572 -Ofast -xHost -I /NFS/home/home3/kjiao/software/petsc_3.10.4/include/ -I/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl/include -I/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mpi/intel64/include lsqr.c Compile a *.cpp file does not work, even though .cpp file is exactly same as .c file. mpiicc -o ex5.o -c -Ofast -xHost -I /NFS/home/home3/kjiao/software/petsc_3.10.4/include/ -I/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl/include -I/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mpi/intel64/include lsqr.cpp From: Mark Adams Sent: Tuesday, March 26, 2019 3:38 PM To: Kun Jiao Cc: petsc-users Subject: Re: [petsc-users] [Ext] Re: error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 On Tue, Mar 26, 2019 at 3:00 PM Kun Jiao > wrote: Strange things, when I compile my code in the test dir in PETSC, it works. After I "make install" PETSC, and try to compile my code against the installed PETSC, it doesn't work any more. I'm not sure I follow what you are doing exactly but look at the compile lines (good and bad) and compare them. If one works and one does not then they must be different. Anyway, as Satish said this interface was not enabled in any version that we see. (So we are puzzled that any version works.) You can wait for a fix to get pushed but using the method that I showed you should work now. I guess this is what you means. Is there any way to reenable MatCreateMPIAIJMKL public interface? 
And, I am using intel MKL, here is my configure option: Configure Options: --configModules=PETSc.Configure --optionsModule=config.compilerOptions PETSC_ARCH=linux-gnu-intel --with-precision=single --with-cc=mpiicc --with-cxx=mpiicc --with-fc=mpiifort --with-mpi-include=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mpi/intel64/include --with-mpi-lib="-L/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel//compilers_and_libraries_2019.2.187/linux/mpi/intel64/lib -lmpifort -lmpi_ilp64" --with-blaslapack-lib="-L/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl/lib/intel64 -Wl, --no-as-needed -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -ldl" --with-scalapack-lib="-L/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl/lib/intel64 -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64" --with-scalapack-include=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl/include --with-mkl_pardiso-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl --with-mkl_sparse=1 --with-mkl_sparse-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl --with-mkl_cpardiso=1 --with-mkl_cpardiso-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl --with-mkl_sparse_optimize=1 --with-mkl_sparse_optimize-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl --with-mkl_sparse_sp2m=1 --with-mkl_sparse_sp2m-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl --with-cmake=1 --prefix=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/petsc_3.9.4 --known-endian=big --with-debugging=0 --COPTFLAGS=" -Ofast -xHost" --CXXOPTFLAGS=" -Ofast -xHost" --FOPTFLAGS=" -Ofast -xHost" --with-x=0 Working directory: /wgdisk/hy3300/source_code_dev/imaging/kjiao/petsc-3.10.4 Schlumberger-Private -----Original Message----- From: Balay, Satish > Sent: Tuesday, March 26, 2019 10:19 AM To: Kun Jiao > Cc: Mark Adams >; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] [Ext] Re: error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 >>>>>>> balay at sb /home/balay/petsc (maint=) $ git grep MatCreateMPIAIJMKL maint-3.8 maint-3.8:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c: MatCreateMPIAIJMKL - Creates a sparse parallel matrix whose local maint-3.8:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:PetscErrorCode MatCreateMPIAIJMKL(MPI_Comm comm,PetscInt m,PetscInt n,PetscInt M,PetscInt N,PetscInt d_nz,const PetscInt d_nnz[],PetscInt o_nz,const PetscInt o_nnz[],Mat *A) maint-3.8:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:.seealso: MatCreateMPIAIJMKL(), MATSEQAIJMKL, MATMPIAIJMKL maint-3.8:src/mat/impls/aij/seq/aijmkl/aijmkl.c:.seealso: MatCreate(), MatCreateMPIAIJMKL(), MatSetValues() balay at sb /home/balay/petsc (maint=) $ git grep MatCreateMPIAIJMKL maint maint:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c: MatCreateMPIAIJMKL - Creates a sparse parallel matrix whose local maint:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:PetscErrorCode MatCreateMPIAIJMKL(MPI_Comm comm,PetscInt m,PetscInt n,PetscInt M,PetscInt N,PetscInt d_nz,const PetscInt d_nnz[],PetscInt o_nz,const PetscInt o_nnz[],Mat *A) maint:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:.seealso: 
MatCreateMPIAIJMKL(), MATSEQAIJMKL, MATMPIAIJMKL maint:src/mat/impls/aij/seq/aijmkl/aijmkl.c:.seealso: MatCreate(), MatCreateMPIAIJMKL(), MatSetValues() balay at sb /home/balay/petsc (maint=) $ <<<<<<<<<<< MatCreateMPIAIJMKL() exists in both petsc-3.8 and petsc-3.10. However the public interface is missing from both of these versions. So I'm surprised you don't get the same error with petsc-3.8 Can you try the following change? diff --git a/include/petscmat.h b/include/petscmat.h index 1b8ac69377..c66f727994 100644 --- a/include/petscmat.h +++ b/include/petscmat.h @@ -223,7 +223,8 @@ typedef enum {DIFFERENT_NONZERO_PATTERN,SUBSET_NONZERO_PATTERN,SAME_NONZERO_PATT #if defined PETSC_HAVE_MKL_SPARSE PETSC_EXTERN PetscErrorCode MatCreateBAIJMKL(MPI_Comm,PetscInt,PetscInt,PetscInt,PetscInt,PetscInt,PetscInt,const PetscInt[],PetscInt,const PetscInt[],Mat*); -PETSC_EXTERN PetscErrorCode MatCreateSeqBAIJMKL(MPI_Comm comm,PetscInt bs,PetscInt m,PetscInt n,PetscInt nz,const PetscInt nnz[],Mat *A); +PETSC_EXTERN PetscErrorCode +MatCreateSeqBAIJMKL(MPI_Comm,PetscInt,PetscInt,PetscInt,PetscInt,const +PetscInt[],Mat*); PETSC_EXTERN PetscErrorCode +MatCreateMPIAIJMKL(MPI_Comm,PetscInt,PetscInt,PetscInt,PetscInt,PetscIn +t,const PetscInt[],PetscInt,const PetscInt[],Mat*); #endif PETSC_EXTERN PetscErrorCode MatCreateSeqSELL(MPI_Comm,PetscInt,PetscInt,PetscInt,const PetscInt[],Mat*); Also note: - this routine is available only when PETSc is built with Intel MKL Satish On Tue, 26 Mar 2019, Kun Jiao via petsc-users wrote: > [kjiao at hyi0016 src/lsqr]% make > [ 50%] Building CXX object lsqr/CMakeFiles/p_lsqr.dir/lsqr.cc.o > /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.cc(318): error: identifier "MatCreateMPIAIJMKL" is undefined > ierr = MatCreateMPIAIJMKL(comm,m,n,M,N,maxnz,dialens,maxnz,offlens,&A);CHKERRQ(ierr); > ^ > > /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.cc(578): error: identifier "MatCreateMPIAIJMKL" is undefined > ierr = MatCreateMPIAIJMKL(comm,m,n,M,N,maxnz,dialens,maxnz,offlens,&A);CHKERRQ(ierr); > ^ > > compilation aborted for > /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.cc > (code 2) > > Thanks. > > > From: Mark Adams > > Sent: Tuesday, March 26, 2019 9:22 AM > To: Kun Jiao > > Cc: petsc-users at mcs.anl.gov > Subject: Re: [Ext] Re: [petsc-users] error: identifier > "MatCreateMPIAIJMKL" is undefined in 3.10.4 > > I assume the whole error message will have the line of code. Please send the whole error message and line of offending code if not included. > > On Tue, Mar 26, 2019 at 10:08 AM Kun Jiao >> wrote: > It is compiling error, error message is: > > error: identifier "MatCreateMPIAIJMKL" is undefined. > > > > > > From: Mark Adams >> > Sent: Tuesday, March 26, 2019 6:48 AM > To: Kun Jiao >> > Cc: petsc-users at mcs.anl.gov> > Subject: [Ext] Re: [petsc-users] error: identifier > "MatCreateMPIAIJMKL" is undefined in 3.10.4 > > Please send the output of the error (runtime, compile time, link > time?) > > On Mon, Mar 25, 2019 at 10:50 PM Kun Jiao via petsc-users >> wrote: > Hi Petsc Experts, > > Is MatCreateMPIAIJMKL retired in 3.10.4? > > I got this error with my code which works fine in 3.8.3 version. > > Regards, > Kun > > > > Schlumberger-Private > > > Schlumberger-Private > > > Schlumberger-Private > Schlumberger-Private -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From KJiao at slb.com Tue Mar 26 16:37:59 2019 From: KJiao at slb.com (Kun Jiao) Date: Tue, 26 Mar 2019 21:37:59 +0000 Subject: [petsc-users] [Ext] Re: error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 In-Reply-To: References: Message-ID: And yes, by applying the patch in the petscmat.h, everything works. Thanks for the help. Regards, Kun Schlumberger-Private -----Original Message----- From: Balay, Satish Sent: Tuesday, March 26, 2019 3:42 PM To: Kun Jiao Cc: petsc-users Subject: Re: [petsc-users] [Ext] Re: error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 Please apply the patch I sent earlier and retry. Satish On Tue, 26 Mar 2019, Kun Jiao via petsc-users wrote: > Strange things, when I compile my code in the test dir in PETSC, it works. After I "make install" PETSC, and try to compile my code against the installed PETSC, it doesn't work any more. > > I guess this is what you means. > > Is there any way to reenable MatCreateMPIAIJMKL public interface? > > And, I am using intel MKL, here is my configure option: > > Configure Options: --configModules=PETSc.Configure > --optionsModule=config.compilerOptions PETSC_ARCH=linux-gnu-intel > --with-precision=single --with-cc=mpiicc --with-cxx=mpiicc > --with-fc=mpiifort > --with-mpi-include=/wgdisk/hy3300/source_code_dev/imaging/kjiao/softwa > re/intel/compilers_and_libraries_2019.2.187/linux/mpi/intel64/include > --with-mpi-lib="-L/wgdisk/hy3300/source_code_dev/imaging/kjiao/softwar > e/intel//compilers_and_libraries_2019.2.187/linux/mpi/intel64/lib > -lmpifort -lmpi_ilp64" > --with-blaslapack-lib="-L/wgdisk/hy3300/source_code_dev/imaging/kjiao/ > software/intel/compilers_and_libraries_2019.2.187/linux/mkl/lib/intel6 > 4 -Wl, --no-as-needed -lmkl_intel_lp64 -lmkl_sequential -lmkl_core > -lpthread -lm -ldl" > --with-scalapack-lib="-L/wgdisk/hy3300/source_code_dev/imaging/kjiao/s > oftware/intel/compilers_and_libraries_2019.2.187/linux/mkl/lib/intel64 > -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64" > --with-scalapack-include=/wgdisk/hy3300/source_code_dev/imaging/kjiao/ > software/intel/compilers_and_libraries_2019.2.187/linux/mkl/include > --with-mkl_pardiso-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/so > ftware/intel/compilers_and_libraries_2019.2.187/linux/mkl > --with-mkl_sparse=1 > --with-mkl_sparse-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/sof > tware/intel/compilers_and_libraries_2019.2.187/linux/mkl > --with-mkl_cpardiso=1 > --with-mkl_cpardiso-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/s > oftware/intel/compilers_and_libraries_2019.2.187/linux/mkl > --with-mkl_sparse_optimize=1 > --with-mkl_sparse_optimize-dir=/wgdisk/hy3300/source_code_dev/imaging/ > kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl > --with-mkl_sparse_sp2m=1 > --with-mkl_sparse_sp2m-dir=/wgdisk/hy3300/source_code_dev/imaging/kjia > o/software/intel/compilers_and_libraries_2019.2.187/linux/mkl > --with-cmake=1 > --prefix=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/petsc_3 > .9.4 --known-endian=big --with-debugging=0 --COPTFLAGS=" -Ofast > -xHost" --CXXOPTFLAGS=" -Ofast -xHost" --FOPTFLAGS=" -Ofast -xHost" > --with-x=0 Working directory: > /wgdisk/hy3300/source_code_dev/imaging/kjiao/petsc-3.10.4 > > > > Schlumberger-Private > > -----Original Message----- > From: Balay, Satish > Sent: Tuesday, March 26, 2019 10:19 AM > To: Kun Jiao > Cc: Mark Adams ; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] [Ext] Re: error: identifier > "MatCreateMPIAIJMKL" is undefined in 3.10.4 > > >>>>>>> > balay 
at sb /home/balay/petsc (maint=) > $ git grep MatCreateMPIAIJMKL maint-3.8 > maint-3.8:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c: MatCreateMPIAIJMKL - Creates a sparse parallel matrix whose local > maint-3.8:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:PetscErrorCode > MatCreateMPIAIJMKL(MPI_Comm comm,PetscInt m,PetscInt n,PetscInt > M,PetscInt N,PetscInt d_nz,const PetscInt d_nnz[],PetscInt o_nz,const > PetscInt o_nnz[],Mat *A) > maint-3.8:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:.seealso: > MatCreateMPIAIJMKL(), MATSEQAIJMKL, MATMPIAIJMKL > maint-3.8:src/mat/impls/aij/seq/aijmkl/aijmkl.c:.seealso: MatCreate(), MatCreateMPIAIJMKL(), MatSetValues() balay at sb /home/balay/petsc (maint=) $ git grep MatCreateMPIAIJMKL maint > maint:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c: MatCreateMPIAIJMKL - Creates a sparse parallel matrix whose local > maint:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:PetscErrorCode > MatCreateMPIAIJMKL(MPI_Comm comm,PetscInt m,PetscInt n,PetscInt > M,PetscInt N,PetscInt d_nz,const PetscInt d_nnz[],PetscInt o_nz,const > PetscInt o_nnz[],Mat *A) > maint:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:.seealso: > MatCreateMPIAIJMKL(), MATSEQAIJMKL, MATMPIAIJMKL > maint:src/mat/impls/aij/seq/aijmkl/aijmkl.c:.seealso: MatCreate(), > MatCreateMPIAIJMKL(), MatSetValues() balay at sb /home/balay/petsc > (maint=) $ <<<<<<<<<<< > > MatCreateMPIAIJMKL() exists in both petsc-3.8 and petsc-3.10. However > the public interface is missing from both of these versions. So I'm > surprised you don't get the same error with petsc-3.8 > > Can you try the following change? > > diff --git a/include/petscmat.h b/include/petscmat.h index > 1b8ac69377..c66f727994 100644 > --- a/include/petscmat.h > +++ b/include/petscmat.h > @@ -223,7 +223,8 @@ typedef enum > {DIFFERENT_NONZERO_PATTERN,SUBSET_NONZERO_PATTERN,SAME_NONZERO_PATT > > #if defined PETSC_HAVE_MKL_SPARSE > PETSC_EXTERN PetscErrorCode > MatCreateBAIJMKL(MPI_Comm,PetscInt,PetscInt,PetscInt,PetscInt,PetscInt > ,PetscInt,const PetscInt[],PetscInt,const PetscInt[],Mat*); > -PETSC_EXTERN PetscErrorCode MatCreateSeqBAIJMKL(MPI_Comm > comm,PetscInt bs,PetscInt m,PetscInt n,PetscInt nz,const PetscInt > nnz[],Mat *A); > +PETSC_EXTERN PetscErrorCode > +MatCreateSeqBAIJMKL(MPI_Comm,PetscInt,PetscInt,PetscInt,PetscInt,cons > +t PetscInt[],Mat*); PETSC_EXTERN PetscErrorCode > +MatCreateMPIAIJMKL(MPI_Comm,PetscInt,PetscInt,PetscInt,PetscInt,Petsc > +In t,const PetscInt[],PetscInt,const PetscInt[],Mat*); > #endif > > PETSC_EXTERN PetscErrorCode > MatCreateSeqSELL(MPI_Comm,PetscInt,PetscInt,PetscInt,const > PetscInt[],Mat*); > > > Also note: - this routine is available only when PETSc is built with > Intel MKL > > Satish > > On Tue, 26 Mar 2019, Kun Jiao via petsc-users wrote: > > > [kjiao at hyi0016 src/lsqr]% make > > [ 50%] Building CXX object lsqr/CMakeFiles/p_lsqr.dir/lsqr.cc.o > > /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.cc(318): error: identifier "MatCreateMPIAIJMKL" is undefined > > ierr = MatCreateMPIAIJMKL(comm,m,n,M,N,maxnz,dialens,maxnz,offlens,&A);CHKERRQ(ierr); > > ^ > > > > /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.cc(578): error: identifier "MatCreateMPIAIJMKL" is undefined > > ierr = MatCreateMPIAIJMKL(comm,m,n,M,N,maxnz,dialens,maxnz,offlens,&A);CHKERRQ(ierr); > > ^ > > > > compilation aborted for > > /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.c > > c > > (code 2) > > > > Thanks. 
> > > > > > From: Mark Adams > > Sent: Tuesday, March 26, 2019 9:22 AM > > To: Kun Jiao > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [Ext] Re: [petsc-users] error: identifier > > "MatCreateMPIAIJMKL" is undefined in 3.10.4 > > > > I assume the whole error message will have the line of code. Please send the whole error message and line of offending code if not included. > > > > On Tue, Mar 26, 2019 at 10:08 AM Kun Jiao > wrote: > > It is compiling error, error message is: > > > > error: identifier "MatCreateMPIAIJMKL" is undefined. > > > > > > > > > > > > From: Mark Adams > > > Sent: Tuesday, March 26, 2019 6:48 AM > > To: Kun Jiao > > > Cc: petsc-users at mcs.anl.gov > > Subject: [Ext] Re: [petsc-users] error: identifier > > "MatCreateMPIAIJMKL" is undefined in 3.10.4 > > > > Please send the output of the error (runtime, compile time, link > > time?) > > > > On Mon, Mar 25, 2019 at 10:50 PM Kun Jiao via petsc-users > wrote: > > Hi Petsc Experts, > > > > Is MatCreateMPIAIJMKL retired in 3.10.4? > > > > I got this error with my code which works fine in 3.8.3 version. > > > > Regards, > > Kun > > > > > > > > Schlumberger-Private > > > > > > Schlumberger-Private > > > > > > Schlumberger-Private > > > From rtmills at anl.gov Tue Mar 26 20:11:16 2019 From: rtmills at anl.gov (Mills, Richard Tran) Date: Wed, 27 Mar 2019 01:11:16 +0000 Subject: [petsc-users] [Ext] Re: error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 In-Reply-To: References: Message-ID: Hi Kun, I'm the author of most of the AIJMKL stuff in PETSc. My apologies for having inadvertently omitted these function prototypes for these interfaces; I'm glad that Satish's patch has fixed this. I want to point out that -- though I can envision some scenarios in which one would want to use the MatCreateXXXAIJMKL interfaces -- most of the time I would recommend against using these directly. Instead, I would recommend simply creating AIJ matrices and then setting them to the AIJMKL sub-types via the PETSc options database. (Via the command line, this could be done by specifying something like "-mat_seqaij_type seqaijmkl" to indicate that all of the "sequential" AIJ matrices that make up an "MPI" AIJ matrix should be of type SEQAIJMKL.) Because this is how I usually do things, my testing had not uncovered the missing function prototypes. Best regards, Richard On 3/26/19 2:37 PM, Kun Jiao via petsc-users wrote: And yes, by applying the patch in the petscmat.h, everything works. Thanks for the help. Regards, Kun Schlumberger-Private -----Original Message----- From: Balay, Satish Sent: Tuesday, March 26, 2019 3:42 PM To: Kun Jiao Cc: petsc-users Subject: Re: [petsc-users] [Ext] Re: error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 Please apply the patch I sent earlier and retry. Satish On Tue, 26 Mar 2019, Kun Jiao via petsc-users wrote: Strange things, when I compile my code in the test dir in PETSC, it works. After I "make install" PETSC, and try to compile my code against the installed PETSC, it doesn't work any more. I guess this is what you means. Is there any way to reenable MatCreateMPIAIJMKL public interface? 
And, I am using intel MKL, here is my configure option: Configure Options: --configModules=PETSc.Configure --optionsModule=config.compilerOptions PETSC_ARCH=linux-gnu-intel --with-precision=single --with-cc=mpiicc --with-cxx=mpiicc --with-fc=mpiifort --with-mpi-include=/wgdisk/hy3300/source_code_dev/imaging/kjiao/softwa re/intel/compilers_and_libraries_2019.2.187/linux/mpi/intel64/include --with-mpi-lib="-L/wgdisk/hy3300/source_code_dev/imaging/kjiao/softwar e/intel//compilers_and_libraries_2019.2.187/linux/mpi/intel64/lib -lmpifort -lmpi_ilp64" --with-blaslapack-lib="-L/wgdisk/hy3300/source_code_dev/imaging/kjiao/ software/intel/compilers_and_libraries_2019.2.187/linux/mkl/lib/intel6 4 -Wl, --no-as-needed -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -ldl" --with-scalapack-lib="-L/wgdisk/hy3300/source_code_dev/imaging/kjiao/s oftware/intel/compilers_and_libraries_2019.2.187/linux/mkl/lib/intel64 -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64" --with-scalapack-include=/wgdisk/hy3300/source_code_dev/imaging/kjiao/ software/intel/compilers_and_libraries_2019.2.187/linux/mkl/include --with-mkl_pardiso-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/so ftware/intel/compilers_and_libraries_2019.2.187/linux/mkl --with-mkl_sparse=1 --with-mkl_sparse-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/sof tware/intel/compilers_and_libraries_2019.2.187/linux/mkl --with-mkl_cpardiso=1 --with-mkl_cpardiso-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/s oftware/intel/compilers_and_libraries_2019.2.187/linux/mkl --with-mkl_sparse_optimize=1 --with-mkl_sparse_optimize-dir=/wgdisk/hy3300/source_code_dev/imaging/ kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl --with-mkl_sparse_sp2m=1 --with-mkl_sparse_sp2m-dir=/wgdisk/hy3300/source_code_dev/imaging/kjia o/software/intel/compilers_and_libraries_2019.2.187/linux/mkl --with-cmake=1 --prefix=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/petsc_3 .9.4 --known-endian=big --with-debugging=0 --COPTFLAGS=" -Ofast -xHost" --CXXOPTFLAGS=" -Ofast -xHost" --FOPTFLAGS=" -Ofast -xHost" --with-x=0 Working directory: /wgdisk/hy3300/source_code_dev/imaging/kjiao/petsc-3.10.4 Schlumberger-Private -----Original Message----- From: Balay, Satish Sent: Tuesday, March 26, 2019 10:19 AM To: Kun Jiao Cc: Mark Adams ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] [Ext] Re: error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 balay at sb /home/balay/petsc (maint=) $ git grep MatCreateMPIAIJMKL maint-3.8 maint-3.8:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c: MatCreateMPIAIJMKL - Creates a sparse parallel matrix whose local maint-3.8:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:PetscErrorCode MatCreateMPIAIJMKL(MPI_Comm comm,PetscInt m,PetscInt n,PetscInt M,PetscInt N,PetscInt d_nz,const PetscInt d_nnz[],PetscInt o_nz,const PetscInt o_nnz[],Mat *A) maint-3.8:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:.seealso: MatCreateMPIAIJMKL(), MATSEQAIJMKL, MATMPIAIJMKL maint-3.8:src/mat/impls/aij/seq/aijmkl/aijmkl.c:.seealso: MatCreate(), MatCreateMPIAIJMKL(), MatSetValues() balay at sb /home/balay/petsc (maint=) $ git grep MatCreateMPIAIJMKL maint maint:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c: MatCreateMPIAIJMKL - Creates a sparse parallel matrix whose local maint:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:PetscErrorCode MatCreateMPIAIJMKL(MPI_Comm comm,PetscInt m,PetscInt n,PetscInt M,PetscInt N,PetscInt d_nz,const PetscInt d_nnz[],PetscInt o_nz,const PetscInt o_nnz[],Mat *A) maint:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:.seealso: 
MatCreateMPIAIJMKL(), MATSEQAIJMKL, MATMPIAIJMKL maint:src/mat/impls/aij/seq/aijmkl/aijmkl.c:.seealso: MatCreate(), MatCreateMPIAIJMKL(), MatSetValues() balay at sb /home/balay/petsc (maint=) $ <<<<<<<<<<< MatCreateMPIAIJMKL() exists in both petsc-3.8 and petsc-3.10. However the public interface is missing from both of these versions. So I'm surprised you don't get the same error with petsc-3.8 Can you try the following change? diff --git a/include/petscmat.h b/include/petscmat.h index 1b8ac69377..c66f727994 100644 --- a/include/petscmat.h +++ b/include/petscmat.h @@ -223,7 +223,8 @@ typedef enum {DIFFERENT_NONZERO_PATTERN,SUBSET_NONZERO_PATTERN,SAME_NONZERO_PATT #if defined PETSC_HAVE_MKL_SPARSE PETSC_EXTERN PetscErrorCode MatCreateBAIJMKL(MPI_Comm,PetscInt,PetscInt,PetscInt,PetscInt,PetscInt ,PetscInt,const PetscInt[],PetscInt,const PetscInt[],Mat*); -PETSC_EXTERN PetscErrorCode MatCreateSeqBAIJMKL(MPI_Comm comm,PetscInt bs,PetscInt m,PetscInt n,PetscInt nz,const PetscInt nnz[],Mat *A); +PETSC_EXTERN PetscErrorCode +MatCreateSeqBAIJMKL(MPI_Comm,PetscInt,PetscInt,PetscInt,PetscInt,cons +t PetscInt[],Mat*); PETSC_EXTERN PetscErrorCode +MatCreateMPIAIJMKL(MPI_Comm,PetscInt,PetscInt,PetscInt,PetscInt,Petsc +In t,const PetscInt[],PetscInt,const PetscInt[],Mat*); #endif PETSC_EXTERN PetscErrorCode MatCreateSeqSELL(MPI_Comm,PetscInt,PetscInt,PetscInt,const PetscInt[],Mat*); Also note: - this routine is available only when PETSc is built with Intel MKL Satish On Tue, 26 Mar 2019, Kun Jiao via petsc-users wrote: [kjiao at hyi0016 src/lsqr]% make [ 50%] Building CXX object lsqr/CMakeFiles/p_lsqr.dir/lsqr.cc.o /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.cc(318): error: identifier "MatCreateMPIAIJMKL" is undefined ierr = MatCreateMPIAIJMKL(comm,m,n,M,N,maxnz,dialens,maxnz,offlens,&A);CHKERRQ(ierr); ^ /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.cc(578): error: identifier "MatCreateMPIAIJMKL" is undefined ierr = MatCreateMPIAIJMKL(comm,m,n,M,N,maxnz,dialens,maxnz,offlens,&A);CHKERRQ(ierr); ^ compilation aborted for /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.c c (code 2) Thanks. From: Mark Adams Sent: Tuesday, March 26, 2019 9:22 AM To: Kun Jiao Cc: petsc-users at mcs.anl.gov Subject: Re: [Ext] Re: [petsc-users] error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 I assume the whole error message will have the line of code. Please send the whole error message and line of offending code if not included. On Tue, Mar 26, 2019 at 10:08 AM Kun Jiao > wrote: It is compiling error, error message is: error: identifier "MatCreateMPIAIJMKL" is undefined. From: Mark Adams > Sent: Tuesday, March 26, 2019 6:48 AM To: Kun Jiao > Cc: petsc-users at mcs.anl.gov Subject: [Ext] Re: [petsc-users] error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 Please send the output of the error (runtime, compile time, link time?) On Mon, Mar 25, 2019 at 10:50 PM Kun Jiao via petsc-users > wrote: Hi Petsc Experts, Is MatCreateMPIAIJMKL retired in 3.10.4? I got this error with my code which works fine in 3.8.3 version. Regards, Kun Schlumberger-Private Schlumberger-Private Schlumberger-Private -------------- next part -------------- An HTML attachment was scrubbed... URL: From KJiao at slb.com Tue Mar 26 20:15:04 2019 From: KJiao at slb.com (Kun Jiao) Date: Wed, 27 Mar 2019 01:15:04 +0000 Subject: [petsc-users] [Ext] Re: error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 In-Reply-To: References: Message-ID: Hi Richard, Understood! 
Thanks very much for you advice. Regards, Kun Schlumberger-Private From: Mills, Richard Tran Sent: Tuesday, March 26, 2019 8:11 PM To: petsc-users at mcs.anl.gov Cc: Kun Jiao Subject: Re: [petsc-users] [Ext] Re: error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 Hi Kun, I'm the author of most of the AIJMKL stuff in PETSc. My apologies for having inadvertently omitted these function prototypes for these interfaces; I'm glad that Satish's patch has fixed this. I want to point out that -- though I can envision some scenarios in which one would want to use the MatCreateXXXAIJMKL interfaces -- most of the time I would recommend against using these directly. Instead, I would recommend simply creating AIJ matrices and then setting them to the AIJMKL sub-types via the PETSc options database. (Via the command line, this could be done by specifying something like "-mat_seqaij_type seqaijmkl" to indicate that all of the "sequential" AIJ matrices that make up an "MPI" AIJ matrix should be of type SEQAIJMKL.) Because this is how I usually do things, my testing had not uncovered the missing function prototypes. Best regards, Richard On 3/26/19 2:37 PM, Kun Jiao via petsc-users wrote: And yes, by applying the patch in the petscmat.h, everything works. Thanks for the help. Regards, Kun Schlumberger-Private -----Original Message----- From: Balay, Satish Sent: Tuesday, March 26, 2019 3:42 PM To: Kun Jiao Cc: petsc-users Subject: Re: [petsc-users] [Ext] Re: error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 Please apply the patch I sent earlier and retry. Satish On Tue, 26 Mar 2019, Kun Jiao via petsc-users wrote: Strange things, when I compile my code in the test dir in PETSC, it works. After I "make install" PETSC, and try to compile my code against the installed PETSC, it doesn't work any more. I guess this is what you means. Is there any way to reenable MatCreateMPIAIJMKL public interface? 
And, I am using intel MKL, here is my configure option: Configure Options: --configModules=PETSc.Configure --optionsModule=config.compilerOptions PETSC_ARCH=linux-gnu-intel --with-precision=single --with-cc=mpiicc --with-cxx=mpiicc --with-fc=mpiifort --with-mpi-include=/wgdisk/hy3300/source_code_dev/imaging/kjiao/softwa re/intel/compilers_and_libraries_2019.2.187/linux/mpi/intel64/include --with-mpi-lib="-L/wgdisk/hy3300/source_code_dev/imaging/kjiao/softwar e/intel//compilers_and_libraries_2019.2.187/linux/mpi/intel64/lib -lmpifort -lmpi_ilp64" --with-blaslapack-lib="-L/wgdisk/hy3300/source_code_dev/imaging/kjiao/ software/intel/compilers_and_libraries_2019.2.187/linux/mkl/lib/intel6 4 -Wl, --no-as-needed -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -ldl" --with-scalapack-lib="-L/wgdisk/hy3300/source_code_dev/imaging/kjiao/s oftware/intel/compilers_and_libraries_2019.2.187/linux/mkl/lib/intel64 -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64" --with-scalapack-include=/wgdisk/hy3300/source_code_dev/imaging/kjiao/ software/intel/compilers_and_libraries_2019.2.187/linux/mkl/include --with-mkl_pardiso-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/so ftware/intel/compilers_and_libraries_2019.2.187/linux/mkl --with-mkl_sparse=1 --with-mkl_sparse-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/sof tware/intel/compilers_and_libraries_2019.2.187/linux/mkl --with-mkl_cpardiso=1 --with-mkl_cpardiso-dir=/wgdisk/hy3300/source_code_dev/imaging/kjiao/s oftware/intel/compilers_and_libraries_2019.2.187/linux/mkl --with-mkl_sparse_optimize=1 --with-mkl_sparse_optimize-dir=/wgdisk/hy3300/source_code_dev/imaging/ kjiao/software/intel/compilers_and_libraries_2019.2.187/linux/mkl --with-mkl_sparse_sp2m=1 --with-mkl_sparse_sp2m-dir=/wgdisk/hy3300/source_code_dev/imaging/kjia o/software/intel/compilers_and_libraries_2019.2.187/linux/mkl --with-cmake=1 --prefix=/wgdisk/hy3300/source_code_dev/imaging/kjiao/software/petsc_3 .9.4 --known-endian=big --with-debugging=0 --COPTFLAGS=" -Ofast -xHost" --CXXOPTFLAGS=" -Ofast -xHost" --FOPTFLAGS=" -Ofast -xHost" --with-x=0 Working directory: /wgdisk/hy3300/source_code_dev/imaging/kjiao/petsc-3.10.4 Schlumberger-Private -----Original Message----- From: Balay, Satish Sent: Tuesday, March 26, 2019 10:19 AM To: Kun Jiao Cc: Mark Adams ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] [Ext] Re: error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 balay at sb /home/balay/petsc (maint=) $ git grep MatCreateMPIAIJMKL maint-3.8 maint-3.8:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c: MatCreateMPIAIJMKL - Creates a sparse parallel matrix whose local maint-3.8:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:PetscErrorCode MatCreateMPIAIJMKL(MPI_Comm comm,PetscInt m,PetscInt n,PetscInt M,PetscInt N,PetscInt d_nz,const PetscInt d_nnz[],PetscInt o_nz,const PetscInt o_nnz[],Mat *A) maint-3.8:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:.seealso: MatCreateMPIAIJMKL(), MATSEQAIJMKL, MATMPIAIJMKL maint-3.8:src/mat/impls/aij/seq/aijmkl/aijmkl.c:.seealso: MatCreate(), MatCreateMPIAIJMKL(), MatSetValues() balay at sb /home/balay/petsc (maint=) $ git grep MatCreateMPIAIJMKL maint maint:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c: MatCreateMPIAIJMKL - Creates a sparse parallel matrix whose local maint:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:PetscErrorCode MatCreateMPIAIJMKL(MPI_Comm comm,PetscInt m,PetscInt n,PetscInt M,PetscInt N,PetscInt d_nz,const PetscInt d_nnz[],PetscInt o_nz,const PetscInt o_nnz[],Mat *A) maint:src/mat/impls/aij/mpi/aijmkl/mpiaijmkl.c:.seealso: 
MatCreateMPIAIJMKL(), MATSEQAIJMKL, MATMPIAIJMKL maint:src/mat/impls/aij/seq/aijmkl/aijmkl.c:.seealso: MatCreate(), MatCreateMPIAIJMKL(), MatSetValues() balay at sb /home/balay/petsc (maint=) $ <<<<<<<<<<< MatCreateMPIAIJMKL() exists in both petsc-3.8 and petsc-3.10. However the public interface is missing from both of these versions. So I'm surprised you don't get the same error with petsc-3.8 Can you try the following change? diff --git a/include/petscmat.h b/include/petscmat.h index 1b8ac69377..c66f727994 100644 --- a/include/petscmat.h +++ b/include/petscmat.h @@ -223,7 +223,8 @@ typedef enum {DIFFERENT_NONZERO_PATTERN,SUBSET_NONZERO_PATTERN,SAME_NONZERO_PATT #if defined PETSC_HAVE_MKL_SPARSE PETSC_EXTERN PetscErrorCode MatCreateBAIJMKL(MPI_Comm,PetscInt,PetscInt,PetscInt,PetscInt,PetscInt ,PetscInt,const PetscInt[],PetscInt,const PetscInt[],Mat*); -PETSC_EXTERN PetscErrorCode MatCreateSeqBAIJMKL(MPI_Comm comm,PetscInt bs,PetscInt m,PetscInt n,PetscInt nz,const PetscInt nnz[],Mat *A); +PETSC_EXTERN PetscErrorCode +MatCreateSeqBAIJMKL(MPI_Comm,PetscInt,PetscInt,PetscInt,PetscInt,cons +t PetscInt[],Mat*); PETSC_EXTERN PetscErrorCode +MatCreateMPIAIJMKL(MPI_Comm,PetscInt,PetscInt,PetscInt,PetscInt,Petsc +In t,const PetscInt[],PetscInt,const PetscInt[],Mat*); #endif PETSC_EXTERN PetscErrorCode MatCreateSeqSELL(MPI_Comm,PetscInt,PetscInt,PetscInt,const PetscInt[],Mat*); Also note: - this routine is available only when PETSc is built with Intel MKL Satish On Tue, 26 Mar 2019, Kun Jiao via petsc-users wrote: [kjiao at hyi0016 src/lsqr]% make [ 50%] Building CXX object lsqr/CMakeFiles/p_lsqr.dir/lsqr.cc.o /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.cc(318): error: identifier "MatCreateMPIAIJMKL" is undefined ierr = MatCreateMPIAIJMKL(comm,m,n,M,N,maxnz,dialens,maxnz,offlens,&A);CHKERRQ(ierr); ^ /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.cc(578): error: identifier "MatCreateMPIAIJMKL" is undefined ierr = MatCreateMPIAIJMKL(comm,m,n,M,N,maxnz,dialens,maxnz,offlens,&A);CHKERRQ(ierr); ^ compilation aborted for /wgdisk/hy3300/source_code_dev/imaging/kjiao/src/git/src/lsqr/lsqr.c c (code 2) Thanks. From: Mark Adams Sent: Tuesday, March 26, 2019 9:22 AM To: Kun Jiao Cc: petsc-users at mcs.anl.gov Subject: Re: [Ext] Re: [petsc-users] error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 I assume the whole error message will have the line of code. Please send the whole error message and line of offending code if not included. On Tue, Mar 26, 2019 at 10:08 AM Kun Jiao > wrote: It is compiling error, error message is: error: identifier "MatCreateMPIAIJMKL" is undefined. From: Mark Adams > Sent: Tuesday, March 26, 2019 6:48 AM To: Kun Jiao > Cc: petsc-users at mcs.anl.gov Subject: [Ext] Re: [petsc-users] error: identifier "MatCreateMPIAIJMKL" is undefined in 3.10.4 Please send the output of the error (runtime, compile time, link time?) On Mon, Mar 25, 2019 at 10:50 PM Kun Jiao via petsc-users > wrote: Hi Petsc Experts, Is MatCreateMPIAIJMKL retired in 3.10.4? I got this error with my code which works fine in 3.8.3 version. Regards, Kun Schlumberger-Private Schlumberger-Private Schlumberger-Private -------------- next part -------------- An HTML attachment was scrubbed... 
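For reference, a minimal sketch of the options-database approach Richard describes above, as an alternative to calling MatCreateMPIAIJMKL() directly. This is only an illustration and assumes a PETSc build configured with MKL sparse support; the sizes are placeholders and the exact option spelling (Richard quotes "-mat_seqaij_type seqaijmkl") should be checked against the installed version.

/* Sketch: create an ordinary AIJ matrix and let the options database switch
   it to the MKL sub-type at runtime, e.g.
     ./app -mat_seqaij_type seqaijmkl
   Assumes PETSc was configured with MKL sparse support; sizes are placeholders. */
#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat            A;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, 1000, 1000);CHKERRQ(ierr);
  ierr = MatSetType(A, MATAIJ);CHKERRQ(ierr);   /* plain AIJ ...                        */
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);    /* ... sub-type picked up from options  */
  ierr = MatSetUp(A);CHKERRQ(ierr);
  /* fill with MatSetValues() and assemble with MatAssemblyBegin/End() as usual */
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}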
URL: From myriam.peyrounette at idris.fr Wed Mar 27 04:10:02 2019 From: myriam.peyrounette at idris.fr (Myriam Peyrounette) Date: Wed, 27 Mar 2019 10:10:02 +0100 Subject: [petsc-users] Bad memory scaling with PETSc 3.10 In-Reply-To: <0e5a0f3b-ebb8-849c-7ce9-ec9c23108576@idris.fr> References: <3daa411c-d2c4-53d3-ff7e-c14429b40e49@idris.fr> <7b104336-a067-e679-23ec-2a89e0ba9bc4@idris.fr> <8925b24f-62dd-1e45-5658-968491e51205@idris.fr> <00d471e0-ed2a-51fa-3031-a6b63c3a96e1@idris.fr> <75a2f7b1-9e7a-843f-1a83-efff8e56f797@idris.fr> <788f0293-4a5e-bae3-4a8d-10d92d0a16af@idris.fr> <0e5a0f3b-ebb8-849c-7ce9-ec9c23108576@idris.fr> Message-ID: <00dfa074-cc41-6a3b-257d-28a089e90617@idris.fr> Hi all, for your information, you'll find attached the comparison of the weak memory scalings when using : - PETSc 3.6.4 (reference) - PETSc 3.10.4 without specific options - PETSc 3.10.4 with the three scalability options you mentionned Using the scalability options does improve the memory scaling. However, the 3.6 version still has a better one... Myriam Le 03/26/19 ? 16:16, Myriam Peyrounette a ?crit?: > > *SetFromOptions() was not called indeed... Thanks! The code > performance is better now with regard to memory usage! > > I still have to plot the memory scaling on bigger cases to see if it > has the same good behaviour as when using the 3.6 version. > > I'll let ou know as soon as I have plotted it. > > Thanks again > > Myriam > > > Le 03/26/19 ? 14:30, Matthew Knepley a ?crit?: >> On Tue, Mar 26, 2019 at 9:27 AM Myriam Peyrounette >> > wrote: >> >> I checked with -ksp_view (attached) but no prefix is associated >> with the matrix. Some are associated to the KSP and PC, but none >> to the Mat >> >> Another thing that could prevent options being used is that >> *SetFromOptions() is not called for the object. >> >> ? Thanks, >> >> ? ? ?Matt >> ? >> >> Le 03/26/19 ? 11:55, Dave May a ?crit?: >>> >>> >>> On Tue, 26 Mar 2019 at 10:36, Myriam Peyrounette >>> >> > wrote: >>> >>> Oh you were right, the three options are unsused >>> (-matptap_via scalable, -inner_offdiag_matmatmult_via >>> scalable and -inner_diag_matmatmult_via scalable). Does this >>> mean I am not using the associated PtAP functions? >>> >>> >>> No - not necessarily. All it means is the options were not parsed.? >>> >>> If your matrices have an option prefix associated with them >>> (e.g. abc) , then you need to provide the option as >>> ? -abc_matptap_via scalable >>> >>> If you are not sure if you matrices have a prefix, look at the >>> result of -ksp_view (see below for an example) >>> >>> ??Mat Object: 2 MPI processes >>> >>> ? ? type: mpiaij >>> >>> ? ? rows=363, cols=363, bs=3 >>> >>> ? ? total: nonzeros=8649, allocated nonzeros=8649 >>> >>> ? ? total number of mallocs used during MatSetValues calls =0 >>> >>> ? Mat Object: (B_) 2 MPI processes >>> >>> ? ? type: mpiaij >>> >>> ? ? rows=363, cols=363, bs=3 >>> >>> ? ? total: nonzeros=8649, allocated nonzeros=8649 >>> >>> ? ? total number of mallocs used during MatSetValues calls =0 >>> >>> >>> The first matrix has no options prefix, but the second does and >>> it's called "B_". >>> >>> >>> >>> ? >>> >>> Myriam >>> >>> >>> Le 03/26/19 ? 11:10, Dave May a ?crit?: >>>> >>>> On Tue, 26 Mar 2019 at 09:52, Myriam Peyrounette via >>>> petsc-users >>> > wrote: >>>> >>>> How can I be sure they are indeed used? Can I print >>>> this information in some log file? >>>> >>>> Yes. 
Re-run the job with the command line option >>>> >>>> -options_left true >>>> >>>> This will report all options parsed, and importantly, will >>>> also indicate if any options were unused. >>>> ? >>>> >>>> Thanks >>>> Dave >>>> >>>> Thanks in advance >>>> >>>> Myriam >>>> >>>> >>>> Le 03/25/19 ? 18:24, Matthew Knepley a ?crit?: >>>>> On Mon, Mar 25, 2019 at 10:54 AM Myriam Peyrounette >>>>> via petsc-users >>>> > wrote: >>>>> >>>>> Hi, >>>>> >>>>> thanks for the explanations. I tried the last >>>>> PETSc version (commit >>>>> fbc5705bc518d02a4999f188aad4ccff5f754cbf), which >>>>> includes the patch you talked about. But the >>>>> memory scaling shows no improvement (see scaling >>>>> attached), even when using the "scalable" options :( >>>>> >>>>> I had a look at the PETSc functions >>>>> MatPtAPNumeric_MPIAIJ_MPIAIJ and >>>>> MatPtAPSymbolic_MPIAIJ_MPIAIJ (especially at the >>>>> differences before and after the first "bad" >>>>> commit), but I can't find what induced this memory >>>>> issue. >>>>> >>>>> Are you sure that the option was used? It just looks >>>>> suspicious to me that they use exactly the same amount >>>>> of memory. It should be different, even if it does not >>>>> solve the problem. >>>>> >>>>> ? ?Thanks, >>>>> >>>>> ? ? ?Matt? >>>>> >>>>> Myriam >>>>> >>>>> >>>>> >>>>> >>>>> Le 03/20/19 ? 17:38, Fande Kong a ?crit?: >>>>>> Hi?Myriam, >>>>>> >>>>>> There are three algorithms in PETSc to do PtAP >>>>>> (?const char? ? ? ? ? *algTypes[3] = >>>>>> {"scalable","nonscalable","hypre"};), and can be >>>>>> specified using the petsc options:?-matptap_via xxxx. >>>>>> >>>>>> (1) -matptap_via hypre: This call the hypre >>>>>> package to do the PtAP trough an all-at-once >>>>>> triple product. In our experiences, it is the >>>>>> most memory efficient, but could be slow. >>>>>> >>>>>> (2)? -matptap_via scalable: This involves a >>>>>> row-wise algorithm plus an outer product.? This >>>>>> will use more memory than hypre, but way faster. >>>>>> This used to have a bug that could take all your >>>>>> memory, and I have a fix >>>>>> at?https://bitbucket.org/petsc/petsc/pull-requests/1452/mpiptap-enable-large-scale-simulations/diff.? >>>>>> When using this option, we may want to have extra >>>>>> options such as? ?-inner_offdiag_matmatmult_via >>>>>> scalable -inner_diag_matmatmult_via scalable? to >>>>>> select inner scalable algorithms. >>>>>> >>>>>> (3)??-matptap_via nonscalable:? Suppose to be >>>>>> even faster, but use more memory. It does dense >>>>>> matrix operations. >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Fande Kong >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On Wed, Mar 20, 2019 at 10:06 AM Myriam >>>>>> Peyrounette via petsc-users >>>>>> >>>>> > wrote: >>>>>> >>>>>> More precisely: something happens when >>>>>> upgrading the functions >>>>>> MatPtAPNumeric_MPIAIJ_MPIAIJ and/or >>>>>> MatPtAPSymbolic_MPIAIJ_MPIAIJ. >>>>>> >>>>>> Unfortunately, there are a lot of differences >>>>>> between the old and new versions of these >>>>>> functions. I keep investigating but if you >>>>>> have any idea, please let me know. >>>>>> >>>>>> Best, >>>>>> >>>>>> Myriam >>>>>> >>>>>> >>>>>> Le 03/20/19 ? 13:48, Myriam Peyrounette a ?crit?: >>>>>>> >>>>>>> Hi all, >>>>>>> >>>>>>> I used git bisect to determine when the >>>>>>> memory need increased. I found that the >>>>>>> first "bad" commit is ? >>>>>>> aa690a28a7284adb519c28cb44eae20a2c131c85. >>>>>>> >>>>>>> Barry was right, this commit seems to be >>>>>>> about an evolution of >>>>>>> MatPtAPSymbolic_MPIAIJ_MPIAIJ. 
You mentioned >>>>>>> the option "-matptap_via scalable" but I >>>>>>> can't find any information about it. Can you >>>>>>> tell me more? >>>>>>> >>>>>>> Thanks >>>>>>> >>>>>>> Myriam >>>>>>> >>>>>>> >>>>>>> Le 03/11/19 ? 14:40, Mark Adams a ?crit?: >>>>>>>> Is there a difference in memory usage on >>>>>>>> your tiny problem? I assume no. >>>>>>>> >>>>>>>> I don't see anything that could come from >>>>>>>> GAMG other than the RAP stuff that you have >>>>>>>> discussed already. >>>>>>>> >>>>>>>> On Mon, Mar 11, 2019 at 9:32 AM Myriam >>>>>>>> Peyrounette >>>>>>> > wrote: >>>>>>>> >>>>>>>> The code I am using here is the example >>>>>>>> 42 of PETSc >>>>>>>> (https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html). >>>>>>>> Indeed it solves the Stokes equation. I >>>>>>>> thought it was a good idea to use an >>>>>>>> example you might know (and didn't find >>>>>>>> any that uses GAMG functions). I just >>>>>>>> changed the PCMG setup so that the >>>>>>>> memory problem appears. And it appears >>>>>>>> when adding PCGAMG. >>>>>>>> >>>>>>>> I don't care about the performance or >>>>>>>> even the result rightness here, but >>>>>>>> only about the difference in memory use >>>>>>>> between 3.6 and 3.10. Do you think >>>>>>>> finding a more adapted script would help? >>>>>>>> >>>>>>>> I used the threshold of 0.1 only once, >>>>>>>> at the beginning, to test its >>>>>>>> influence. I used the default threshold >>>>>>>> (of 0, I guess) for all the other runs. >>>>>>>> >>>>>>>> Myriam >>>>>>>> >>>>>>>> >>>>>>>> Le 03/11/19 ? 13:52, Mark Adams a ?crit?: >>>>>>>>> In looking at this larger scale run ... >>>>>>>>> >>>>>>>>> * Your eigen estimates are much lower >>>>>>>>> than your tiny test problem.? But this >>>>>>>>> is Stokes apparently and it should not >>>>>>>>> work anyway. Maybe you have a small >>>>>>>>> time step that adds a lot of mass that >>>>>>>>> brings the eigen estimates down. And >>>>>>>>> your min eigenvalue (not used) is >>>>>>>>> positive. I would expect negative for >>>>>>>>> Stokes ... >>>>>>>>> >>>>>>>>> * You seem to be setting a threshold >>>>>>>>> value of 0.1 -- that is very high >>>>>>>>> >>>>>>>>> * v3.6 says "using nonzero initial >>>>>>>>> guess" but this is not in v3.10. Maybe >>>>>>>>> we just stopped printing that. >>>>>>>>> >>>>>>>>> * There were some changes to coasening >>>>>>>>> parameters in going from v3.6 but it >>>>>>>>> does not look like your problem was >>>>>>>>> effected. (The coarsening algo is >>>>>>>>> non-deterministic by default and you >>>>>>>>> can see small difference on different >>>>>>>>> runs) >>>>>>>>> >>>>>>>>> * We may have also added a "noisy" RHS >>>>>>>>> for eigen estimates by default from v3.6. >>>>>>>>> >>>>>>>>> * And for non-symetric problems you >>>>>>>>> can try -pc_gamg_agg_nsmooths 0, but >>>>>>>>> again GAMG is not built for Stokes anyway. >>>>>>>>> >>>>>>>>> >>>>>>>>> On Tue, Mar 5, 2019 at 11:53 AM Myriam >>>>>>>>> Peyrounette >>>>>>>>> >>>>>>>> > >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> I used PCView to display the size >>>>>>>>> of the linear system in each level >>>>>>>>> of the MG. You'll find the outputs >>>>>>>>> attached to this mail (zip file) >>>>>>>>> for both the default threshold >>>>>>>>> value and a value of 0.1, and for >>>>>>>>> both 3.6 and 3.10 PETSc versions. >>>>>>>>> >>>>>>>>> For convenience, I summarized the >>>>>>>>> information in a graph, also >>>>>>>>> attached (png file). 
>>>>>>>>> >>>>>>>>> As you can see, there are slight >>>>>>>>> differences between the two >>>>>>>>> versions but none is critical, in >>>>>>>>> my opinion. Do you see anything >>>>>>>>> suspicious in the outputs? >>>>>>>>> >>>>>>>>> + I can't find the default >>>>>>>>> threshold value. Do you know where >>>>>>>>> I can find it? >>>>>>>>> >>>>>>>>> Thanks for the follow-up >>>>>>>>> >>>>>>>>> Myriam >>>>>>>>> >>>>>>>>> >>>>>>>>> Le 03/05/19 ? 14:06, Matthew >>>>>>>>> Knepley a ?crit?: >>>>>>>>>> On Tue, Mar 5, 2019 at 7:14 AM >>>>>>>>>> Myriam Peyrounette >>>>>>>>>> >>>>>>>>> > >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> Hi Matt, >>>>>>>>>> >>>>>>>>>> I plotted the memory scalings >>>>>>>>>> using different threshold >>>>>>>>>> values. The two scalings are >>>>>>>>>> slightly translated (from -22 >>>>>>>>>> to -88 mB) but this gain is >>>>>>>>>> neglectable. The 3.6-scaling >>>>>>>>>> keeps being robust while the >>>>>>>>>> 3.10-scaling deteriorates. >>>>>>>>>> >>>>>>>>>> Do you have any other suggestion? >>>>>>>>>> >>>>>>>>>> Mark, what is the option she can >>>>>>>>>> give to output all the GAMG data? >>>>>>>>>> >>>>>>>>>> Also, run using -ksp_view. GAMG >>>>>>>>>> will report all the sizes of its >>>>>>>>>> grids, so it should be easy to see >>>>>>>>>> if the coarse grid sizes are >>>>>>>>>> increasing, and also what the >>>>>>>>>> effect of the threshold value is. >>>>>>>>>> >>>>>>>>>> ? Thanks, >>>>>>>>>> >>>>>>>>>> ? ? ?Matt? >>>>>>>>>> >>>>>>>>>> Thanks >>>>>>>>>> >>>>>>>>>> Myriam >>>>>>>>>> >>>>>>>>>> Le 03/02/19 ? 02:27, Matthew >>>>>>>>>> Knepley a ?crit?: >>>>>>>>>>> On Fri, Mar 1, 2019 at 10:53 >>>>>>>>>>> AM Myriam Peyrounette via >>>>>>>>>>> petsc-users >>>>>>>>>>> >>>>>>>>>> > >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> Hi, >>>>>>>>>>> >>>>>>>>>>> I used to run my code >>>>>>>>>>> with PETSc 3.6. Since I >>>>>>>>>>> upgraded the PETSc version >>>>>>>>>>> to 3.10, this code has a >>>>>>>>>>> bad memory scaling. >>>>>>>>>>> >>>>>>>>>>> To report this issue, I >>>>>>>>>>> took the PETSc script >>>>>>>>>>> ex42.c and slightly >>>>>>>>>>> modified it so that the >>>>>>>>>>> KSP and PC >>>>>>>>>>> configurations are the >>>>>>>>>>> same as in my >>>>>>>>>>> code. In particular, I >>>>>>>>>>> use a "personnalised" >>>>>>>>>>> multi-grid method. The >>>>>>>>>>> modifications are >>>>>>>>>>> indicated by the keyword >>>>>>>>>>> "TopBridge" in the attached >>>>>>>>>>> scripts. >>>>>>>>>>> >>>>>>>>>>> To plot the memory >>>>>>>>>>> (weak) scaling, I ran >>>>>>>>>>> four calculations for each >>>>>>>>>>> script with increasing >>>>>>>>>>> problem sizes and >>>>>>>>>>> computations cores: >>>>>>>>>>> >>>>>>>>>>> 1. 100,000 elts on 4 cores >>>>>>>>>>> 2. 1 million elts on 40 >>>>>>>>>>> cores >>>>>>>>>>> 3. 10 millions elts on >>>>>>>>>>> 400 cores >>>>>>>>>>> 4. 100 millions elts on >>>>>>>>>>> 4,000 cores >>>>>>>>>>> >>>>>>>>>>> The resulting graph is >>>>>>>>>>> also attached. The >>>>>>>>>>> scaling using PETSc 3.10 >>>>>>>>>>> clearly deteriorates for >>>>>>>>>>> large cases, while the >>>>>>>>>>> one using PETSc 3.6 is >>>>>>>>>>> robust. >>>>>>>>>>> >>>>>>>>>>> After a few tests, I >>>>>>>>>>> found that the scaling >>>>>>>>>>> is mostly sensitive to the >>>>>>>>>>> use of the AMG method >>>>>>>>>>> for the coarse grid >>>>>>>>>>> (line 1780 in >>>>>>>>>>> main_ex42_petsc36.cc). >>>>>>>>>>> In particular, the >>>>>>>>>>> performance strongly >>>>>>>>>>> deteriorates when >>>>>>>>>>> commenting lines 1777 to >>>>>>>>>>> 1790 (in >>>>>>>>>>> main_ex42_petsc36.cc). 
>>>>>>>>>>> >>>>>>>>>>> Do you have any idea of >>>>>>>>>>> what changed between >>>>>>>>>>> version 3.6 and version >>>>>>>>>>> 3.10 that may imply such >>>>>>>>>>> degradation? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I believe the default values >>>>>>>>>>> for PCGAMG changed between >>>>>>>>>>> versions. It sounds like the >>>>>>>>>>> coarsening rate >>>>>>>>>>> is not great enough, so that >>>>>>>>>>> these grids are too large. >>>>>>>>>>> This can be set using: >>>>>>>>>>> >>>>>>>>>>> ??https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html >>>>>>>>>>> >>>>>>>>>>> There is some explanation of >>>>>>>>>>> this effect on that page. >>>>>>>>>>> Let us know if setting this >>>>>>>>>>> does not correct the situation. >>>>>>>>>>> >>>>>>>>>>> ? Thanks, >>>>>>>>>>> >>>>>>>>>>> ? ? ?Matt >>>>>>>>>>> ? >>>>>>>>>>> >>>>>>>>>>> Let me know if you need >>>>>>>>>>> further information. >>>>>>>>>>> >>>>>>>>>>> Best, >>>>>>>>>>> >>>>>>>>>>> Myriam Peyrounette >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Myriam Peyrounette >>>>>>>>>>> CNRS/IDRIS - HLST >>>>>>>>>>> -- >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> What most experimenters take >>>>>>>>>>> for granted before they >>>>>>>>>>> begin their experiments is >>>>>>>>>>> infinitely more interesting >>>>>>>>>>> than any results to which >>>>>>>>>>> their experiments lead. >>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>> >>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Myriam Peyrounette >>>>>>>>>> CNRS/IDRIS - HLST >>>>>>>>>> -- >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> What most experimenters take for >>>>>>>>>> granted before they begin their >>>>>>>>>> experiments is infinitely more >>>>>>>>>> interesting than any results to >>>>>>>>>> which their experiments lead. >>>>>>>>>> -- Norbert Wiener >>>>>>>>>> >>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Myriam Peyrounette >>>>>>>>> CNRS/IDRIS - HLST >>>>>>>>> -- >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Myriam Peyrounette >>>>>>>> CNRS/IDRIS - HLST >>>>>>>> -- >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Myriam Peyrounette >>>>>>> CNRS/IDRIS - HLST >>>>>>> -- >>>>>> >>>>>> -- >>>>>> Myriam Peyrounette >>>>>> CNRS/IDRIS - HLST >>>>>> -- >>>>>> >>>>> >>>>> -- >>>>> Myriam Peyrounette >>>>> CNRS/IDRIS - HLST >>>>> -- >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they >>>>> begin their experiments is infinitely more interesting >>>>> than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>> >>>> -- >>>> Myriam Peyrounette >>>> CNRS/IDRIS - HLST >>>> -- >>>> >>> >>> -- >>> Myriam Peyrounette >>> CNRS/IDRIS - HLST >>> -- >>> >> >> -- >> Myriam Peyrounette >> CNRS/IDRIS - HLST >> -- >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> > > -- > Myriam Peyrounette > CNRS/IDRIS - HLST > -- -- Myriam Peyrounette CNRS/IDRIS - HLST -- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: memScaling_scalable.png Type: image/png Size: 53792 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
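A short sketch of how the scalable PtAP options discussed above can be supplied programmatically, equivalent to passing them on the command line. The option names (-matptap_via, -inner_offdiag_matmatmult_via, -inner_diag_matmatmult_via) are the ones quoted in the thread and depend on the PETSc version/branch; if the matrices carry an options prefix (visible in -ksp_view output), the prefix must be prepended, and -options_left true reports any option that was not consumed.

/* Sketch: equivalent to the command line
     -matptap_via scalable -inner_offdiag_matmatmult_via scalable
     -inner_diag_matmatmult_via scalable -options_left true
   Call after PetscInitialize() and before the solver objects are configured.
   Option names follow the thread above and may differ between PETSc versions;
   prepend the matrix options prefix if one is set. */
#include <petscsys.h>

PetscErrorCode SetScalablePtAPOptions(void)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PetscOptionsSetValue(NULL, "-matptap_via", "scalable");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue(NULL, "-inner_offdiag_matmatmult_via", "scalable");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue(NULL, "-inner_diag_matmatmult_via", "scalable");CHKERRQ(ierr);
  /* As noted above, the options only take effect if the corresponding
     *SetFromOptions() calls are made for the objects involved. */
  PetscFunctionReturn(0);
}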
Name: smime.p7s Type: application/pkcs7-signature Size: 2975 bytes Desc: Signature cryptographique S/MIME URL: From eda.oktay at metu.edu.tr Wed Mar 27 07:33:34 2019 From: eda.oktay at metu.edu.tr (Eda Oktay) Date: Wed, 27 Mar 2019 15:33:34 +0300 Subject: [petsc-users] Local and global size of IS Message-ID: Hello, I am trying to permute a matrix A(of size 2n*2n) according to a sorted eigenvector vr (of size 2n) in parallel using 2 processors (processor number can change). However, I get an error in MatPermute line stating that arguments are out of range and a new nonzero caused a malloc even if I used MatSetOption. I discovered that this may be because of the unequality of local sizes of is (and newIS) and local size of A. Since I allocate index set idx according to size of the vector vr, global size of is becomes 2n and the local size is also 2n (I think it should be n since both A and vr has local sizes n because of the number of processors). If I change the size of idx, then because of VecGetArrayRead, I think is is created wrongly. So, how can I make both global and local sizes of is,newIS and A? Below, you can see the part of my program. Thanks, Eda ierr = VecGetSize(vr,&siz);CHKERRQ(ierr); ierr = PetscMalloc1(siz,&idx);CHKERRQ(ierr); for (i=0; i From knepley at gmail.com Wed Mar 27 07:48:35 2019 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 27 Mar 2019 08:48:35 -0400 Subject: [petsc-users] Local and global size of IS In-Reply-To: References: Message-ID: On Wed, Mar 27, 2019 at 8:33 AM Eda Oktay via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello, > > I am trying to permute a matrix A(of size 2n*2n) according to a sorted > eigenvector vr (of size 2n) in parallel using 2 processors (processor > number can change). However, I get an error in MatPermute line stating that > arguments are out of range and a new nonzero caused a malloc even if I used > MatSetOption. > > I discovered that this may be because of the unequality of local sizes of > is (and newIS) and local size of A. > > Since I allocate index set idx according to size of the vector vr, global > size of is becomes 2n and the local size is also 2n (I think it should be n > since both A and vr has local sizes n because of the number of processors). > If I change the size of idx, then because of VecGetArrayRead, I think is is > created wrongly. > > So, how can I make both global and local sizes of is,newIS and A? > > Below, you can see the part of my program. > > Thanks, > > Eda > > ierr = VecGetSize(vr,&siz);CHKERRQ(ierr); > Here siz is the length of the parallel Vec. > ierr = PetscMalloc1(siz,&idx);CHKERRQ(ierr); > for (i=0; i ierr = VecGetArrayRead(vr,&avr);CHKERRQ(ierr); > ierr = PetscSortRealWithPermutation(siz,avr,idx);CHKERRQ(ierr); > > ierr = > ISCreateGeneral(PETSC_COMM_SELF,siz,idx,PETSC_COPY_VALUES,&is);CHKERRQ(ierr); > Here siz is the length of the _serial_ IS. You want the local size of the Vec here. Matt > ierr = ISSetPermutation(is);CHKERRQ(ierr); > ierr = ISDuplicate(is,&newIS);CHKERRQ(ierr); > MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE);CHKERRQ(ierr); > ierr = MatPermute(A,is,newIS,&PL);CHKERRQ(ierr); > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
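A small sketch of the fix Matt describes above: the index set handed to MatPermute() needs the same parallel layout as A and vr, so each rank should contribute only its locally owned share of the permutation. Variable names follow the snippet in the question; it is assumed here (as in the original code) that idx[] holds the full global permutation of length siz on every rank, which is an application detail.

/* Sketch: build the IS from the locally owned slice so that its local size
   matches the local size of A and vr (n per rank in the question). */
PetscInt rstart, rend, nlocal;
IS       is, newIS;

ierr = VecGetOwnershipRange(vr, &rstart, &rend);CHKERRQ(ierr);
nlocal = rend - rstart;
ierr = ISCreateGeneral(PETSC_COMM_WORLD, nlocal, idx + rstart,
                       PETSC_COPY_VALUES, &is);CHKERRQ(ierr);
ierr = ISSetPermutation(is);CHKERRQ(ierr);
ierr = ISDuplicate(is, &newIS);CHKERRQ(ierr);
ierr = MatPermute(A, is, newIS, &PL);CHKERRQ(ierr);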
URL: From knepley at gmail.com Wed Mar 27 18:27:26 2019 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 27 Mar 2019 19:27:26 -0400 Subject: [petsc-users] Consistent domain decomposition between DMDA and DMPLEX In-Reply-To: References: Message-ID: On Fri, Mar 22, 2019 at 1:41 PM Swarnava Ghosh wrote: > Hi Mark and Matt, > > Thank you for your responses. > "They may have elements on the unstructured mesh that intersect with any > number of processor domains on the structured mesh. But the unstructured > mesh vertices are in the structured mesh set of vertices" > Yes, that is correct. We would want a vertex partitioning. > Okay, I need to understand better what you want. A vertex partition of a mesh does not make sense to me. What kind of mesh do you have, and how do you plan to use the partitioned mesh? Thanks, Matt > Sincerely, > Swarnava > > On Fri, Mar 22, 2019 at 4:08 PM Mark Adams wrote: > >> Matt, >> I think they want a vertex partitioning. They may have elements on the >> unstructured mesh that intersect with any number of processor domains on >> the structured mesh. But the unstructured mesh vertices are in the >> structured mesh set of vertices. They want the partition of the >> unstructured mesh vertices (ie, matrices) to be slaved to the partitioning >> of the structured mesh. >> Do I have that right Swarnava? >> Mark >> >> On Fri, Mar 22, 2019 at 6:56 PM Matthew Knepley via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> On Thu, Mar 21, 2019 at 8:20 PM Swarnava Ghosh via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> >>>> Dear PETSc users and developers, >>>> >>>> I am new to DMPLEX and had a query regarding setting up a consistent >>>> domain decomposition of two meshes in PETSc. >>>> I have a structured finite difference grid, managed through DMDA. I >>>> have another unstructured finite element mesh managed through DMPLEX. Now >>>> all the nodes in the unstructured finite element mesh also belong to the >>>> set of nodes in the structured finite difference mesh (but not necessarily >>>> vice-versa), and the number of nodes in DMPLEX mesh is less than the number >>>> of nodes in DMDA mesh. How can I guarantee a consistent domain >>>> decomposition of the two meshes? By consistent, I mean that if a process >>>> has a set of nodes P from DMDA, and the same process has the set of nodes Q >>>> from DMPLEX, then Q is a subset of P. >>>> >>> >>> Okay, this is not hard. DMPlexDistribute() basically distributes >>> according to a cell partition. You can use PetscPartitionerShell() to stick >>> in whatever cell partition you want. You can see me doing this here: >>> >>> >>> https://bitbucket.org/petsc/petsc/src/e2aefa968a094f48dc384fffc7d599a60aeeb591/src/dm/impls/plex/examples/tests/ex1.c#lines-261 >>> >>> Will that work for you? >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> I look forward to your response. >>>> >>>> Sincerely, >>>> Swarnava >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
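For reference, a minimal sketch of the PetscPartitionerShell() approach Matt points to above, patterned after the DMPlex test he links. It only illustrates the calls involved; computing the cell-to-rank assignment that makes the DMPlex decomposition follow the DMDA decomposition (and gathering it into sizes[] and points[]) is application-specific and assumed to be done elsewhere.

/* Sketch: impose a user-computed cell partition before DMPlexDistribute().
   sizes[p] = number of cells assigned to rank p,
   points[] = cell numbers grouped by rank (per-rank lists concatenated).
   Both arrays are assumed to have been filled from the DMDA ownership layout;
   commSize is the communicator size. */
PetscPartitioner part;
DM               dmDist = NULL;

ierr = DMPlexGetPartitioner(dm, &part);CHKERRQ(ierr);
ierr = PetscPartitionerSetType(part, PETSCPARTITIONERSHELL);CHKERRQ(ierr);
ierr = PetscPartitionerShellSetPartition(part, commSize, sizes, points);CHKERRQ(ierr);
ierr = DMPlexDistribute(dm, 0, NULL, &dmDist);CHKERRQ(ierr);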
URL: From mfadams at lbl.gov Wed Mar 27 19:13:11 2019 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 27 Mar 2019 20:13:11 -0400 Subject: [petsc-users] Consistent domain decomposition between DMDA and DMPLEX In-Reply-To: References: Message-ID: On Wed, Mar 27, 2019 at 7:27 PM Matthew Knepley wrote: > On Fri, Mar 22, 2019 at 1:41 PM Swarnava Ghosh > wrote: > >> Hi Mark and Matt, >> >> Thank you for your responses. >> "They may have elements on the unstructured mesh that intersect with any >> number of processor domains on the structured mesh. But the unstructured >> mesh vertices are in the structured mesh set of vertices" >> Yes, that is correct. We would want a vertex partitioning. >> > > Okay, I need to understand better what you want. A vertex partition of a > mesh does not make sense to me. What kind > of mesh do you have, and how do you plan to use the partitioned mesh? > I would guess they want a vertex partitioning to make an MatMPIAIJ. They could inject fine (structured) grid points into coarse (unstructured) points/vertices w/o communication. That's my best guess. > > Thanks, > > Matt > > >> Sincerely, >> Swarnava >> >> On Fri, Mar 22, 2019 at 4:08 PM Mark Adams wrote: >> >>> Matt, >>> I think they want a vertex partitioning. They may have elements on the >>> unstructured mesh that intersect with any number of processor domains on >>> the structured mesh. But the unstructured mesh vertices are in the >>> structured mesh set of vertices. They want the partition of the >>> unstructured mesh vertices (ie, matrices) to be slaved to the partitioning >>> of the structured mesh. >>> Do I have that right Swarnava? >>> Mark >>> >>> On Fri, Mar 22, 2019 at 6:56 PM Matthew Knepley via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> >>>> On Thu, Mar 21, 2019 at 8:20 PM Swarnava Ghosh via petsc-users < >>>> petsc-users at mcs.anl.gov> wrote: >>>> >>>>> Dear PETSc users and developers, >>>>> >>>>> I am new to DMPLEX and had a query regarding setting up a consistent >>>>> domain decomposition of two meshes in PETSc. >>>>> I have a structured finite difference grid, managed through DMDA. I >>>>> have another unstructured finite element mesh managed through DMPLEX. Now >>>>> all the nodes in the unstructured finite element mesh also belong to the >>>>> set of nodes in the structured finite difference mesh (but not necessarily >>>>> vice-versa), and the number of nodes in DMPLEX mesh is less than the number >>>>> of nodes in DMDA mesh. How can I guarantee a consistent domain >>>>> decomposition of the two meshes? By consistent, I mean that if a process >>>>> has a set of nodes P from DMDA, and the same process has the set of nodes Q >>>>> from DMPLEX, then Q is a subset of P. >>>>> >>>> >>>> Okay, this is not hard. DMPlexDistribute() basically distributes >>>> according to a cell partition. You can use PetscPartitionerShell() to stick >>>> in whatever cell partition you want. You can see me doing this here: >>>> >>>> >>>> https://bitbucket.org/petsc/petsc/src/e2aefa968a094f48dc384fffc7d599a60aeeb591/src/dm/impls/plex/examples/tests/ex1.c#lines-261 >>>> >>>> Will that work for you? >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> I look forward to your response. >>>>> >>>>> Sincerely, >>>>> Swarnava >>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. 
>>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Mar 27 19:27:56 2019 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 27 Mar 2019 20:27:56 -0400 Subject: [petsc-users] Consistent domain decomposition between DMDA and DMPLEX In-Reply-To: References: Message-ID: On Wed, Mar 27, 2019 at 8:13 PM Mark Adams wrote: > On Wed, Mar 27, 2019 at 7:27 PM Matthew Knepley wrote: > >> On Fri, Mar 22, 2019 at 1:41 PM Swarnava Ghosh >> wrote: >> >>> Hi Mark and Matt, >>> >>> Thank you for your responses. >>> "They may have elements on the unstructured mesh that intersect with >>> any number of processor domains on the structured mesh. But the >>> unstructured mesh vertices are in the structured mesh set of vertices" >>> Yes, that is correct. We would want a vertex partitioning. >>> >> >> Okay, I need to understand better what you want. A vertex partition of a >> mesh does not make sense to me. What kind >> of mesh do you have, and how do you plan to use the partitioned mesh? >> > > I would guess they want a vertex partitioning to make an MatMPIAIJ. They > could inject fine (structured) grid points into coarse (unstructured) > points/vertices w/o communication. That's my best guess. > That seems like a bad tradeoff. You avoid one communication during injection for at least that much or more during FE assembly on that cell partition? Matt > >> Thanks, >> >> Matt >> >> >>> Sincerely, >>> Swarnava >>> >>> On Fri, Mar 22, 2019 at 4:08 PM Mark Adams wrote: >>> >>>> Matt, >>>> I think they want a vertex partitioning. They may have elements on the >>>> unstructured mesh that intersect with any number of processor domains on >>>> the structured mesh. But the unstructured mesh vertices are in the >>>> structured mesh set of vertices. They want the partition of the >>>> unstructured mesh vertices (ie, matrices) to be slaved to the partitioning >>>> of the structured mesh. >>>> Do I have that right Swarnava? >>>> Mark >>>> >>>> On Fri, Mar 22, 2019 at 6:56 PM Matthew Knepley via petsc-users < >>>> petsc-users at mcs.anl.gov> wrote: >>>> >>>>> On Thu, Mar 21, 2019 at 8:20 PM Swarnava Ghosh via petsc-users < >>>>> petsc-users at mcs.anl.gov> wrote: >>>>> >>>>>> Dear PETSc users and developers, >>>>>> >>>>>> I am new to DMPLEX and had a query regarding setting up a consistent >>>>>> domain decomposition of two meshes in PETSc. >>>>>> I have a structured finite difference grid, managed through DMDA. I >>>>>> have another unstructured finite element mesh managed through DMPLEX. Now >>>>>> all the nodes in the unstructured finite element mesh also belong to the >>>>>> set of nodes in the structured finite difference mesh (but not necessarily >>>>>> vice-versa), and the number of nodes in DMPLEX mesh is less than the number >>>>>> of nodes in DMDA mesh. How can I guarantee a consistent domain >>>>>> decomposition of the two meshes? By consistent, I mean that if a process >>>>>> has a set of nodes P from DMDA, and the same process has the set of nodes Q >>>>>> from DMPLEX, then Q is a subset of P. >>>>>> >>>>> >>>>> Okay, this is not hard. DMPlexDistribute() basically distributes >>>>> according to a cell partition. 
You can use PetscPartitionerShell() to stick >>>>> in whatever cell partition you want. You can see me doing this here: >>>>> >>>>> >>>>> https://bitbucket.org/petsc/petsc/src/e2aefa968a094f48dc384fffc7d599a60aeeb591/src/dm/impls/plex/examples/tests/ex1.c#lines-261 >>>>> >>>>> Will that work for you? >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> I look forward to your response. >>>>>> >>>>>> Sincerely, >>>>>> Swarnava >>>>>> >>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From sajidsyed2021 at u.northwestern.edu Wed Mar 27 20:07:43 2019 From: sajidsyed2021 at u.northwestern.edu (Sajid Ali) Date: Wed, 27 Mar 2019 20:07:43 -0500 Subject: [petsc-users] Converting complex PDE to real for KNL performance ? Message-ID: Hi, I'm able to solve the following equation using complex numbers (with ts_type cn and pc_type gamg) : u_t = A*u'' + F_t*u; (where A = -1j/(2k) amd u'' refers to u_xx+u_yy implemented with the familiar 5-point stencil) Now, I want to solve the same problem using real numbers. The equivalent equations are: u_t_real = 1/(2k) * u''_imag + F_real*u_real - F_imag*u_imag u_t_imag = -1/(2k) * u''_real + F_imag*u_real - F_real*u_imag Thus, if we now take our new u vector to have twice the length of the problem we're solving, keeping the first half as real and the second half as imaginary, we'd get a matrix that had matrices computing the laplacian via the 5-point stencil in the top-right and bottom-left corners and a diagonal [F_real+F_imag, F_real-F_imag] term. I tried doing this and the gamg preconditioner complains about an unsymmetric matrix. If i use the default preconditioner, I get DIVERGED_NONLINEAR_SOLVE. Is there a way to better organize the matrix ? PS: I'm trying to do this using only real numbers because I realized that the optimized avx-512 kernels for KNL are not implemented for complex numbers. Would that be implemented soon ? Thank You, Sajid Ali Applied Physics Northwestern University -------------- next part -------------- An HTML attachment was scrubbed... URL: From tisaac at cc.gatech.edu Wed Mar 27 20:15:28 2019 From: tisaac at cc.gatech.edu (Isaac, Tobin G) Date: Thu, 28 Mar 2019 01:15:28 +0000 Subject: [petsc-users] Registration open: PETSc users meeting Georgia Tech, Atlanta, June 5-7, 2019 Message-ID: <20190328011525.q4ki6ls5yjaaf26z@gatech.edu> We are pleased to announce that the fifth annual PETSc user meeting will take place at Georgia Tech, Atlanta, June 5-7, 2019. Registration is now open on eventbrite: https://www.eventbrite.com/e/petsc-19-user-meeting-tickets-59465955273 Early bird registration for students (until April 19) is $50, after which it will be $75. General admission is $100. Some support for student travel is anticipated, with more details coming shortly. 
We are still accepting abstracts for talks and posters at https://easychair.org/cfp/petsc19. More details can be found at the conference website, which is still http://www.mcs.anl.gov/petsc/meetings/2019/index.html. We look forward to seeing you there! The organizing committee From jed at jedbrown.org Wed Mar 27 21:36:50 2019 From: jed at jedbrown.org (Jed Brown) Date: Wed, 27 Mar 2019 20:36:50 -0600 Subject: [petsc-users] Converting complex PDE to real for KNL performance ? In-Reply-To: References: Message-ID: <87tvfnvi65.fsf@jedbrown.org> When you roll your own equivalent real formulation, PETSc has no way of knowing what conjugate transpose might mean, thus symmetry is lost. I would suggest just using the AVX2 implementation for now and putting in a request (or contributing a patch) for AVX-512 complex optimizations. Sajid Ali via petsc-users writes: > Hi, > > I'm able to solve the following equation using complex numbers (with > ts_type cn and pc_type gamg) : > u_t = A*u'' + F_t*u; > (where A = -1j/(2k) amd u'' refers to u_xx+u_yy implemented with the > familiar 5-point stencil) > > Now, I want to solve the same problem using real numbers. The equivalent > equations are: > u_t_real = 1/(2k) * u''_imag + F_real*u_real - F_imag*u_imag > u_t_imag = -1/(2k) * u''_real + F_imag*u_real - F_real*u_imag > > Thus, if we now take our new u vector to have twice the length of the > problem we're solving, keeping the first half as real and the second half > as imaginary, we'd get a matrix that had matrices computing the laplacian > via the 5-point stencil in the top-right and bottom-left corners and a > diagonal [F_real+F_imag, F_real-F_imag] term. > > I tried doing this and the gamg preconditioner complains about an > unsymmetric matrix. If i use the default preconditioner, I get > DIVERGED_NONLINEAR_SOLVE. > > Is there a way to better organize the matrix ? > > PS: I'm trying to do this using only real numbers because I realized that > the optimized avx-512 kernels for KNL are not implemented for complex > numbers. Would that be implemented soon ? > > Thank You, > Sajid Ali > Applied Physics > Northwestern University From mfadams at lbl.gov Thu Mar 28 07:55:53 2019 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 28 Mar 2019 08:55:53 -0400 Subject: [petsc-users] Consistent domain decomposition between DMDA and DMPLEX In-Reply-To: References: Message-ID: > > > > That seems like a bad tradeoff. You avoid one communication during > injection for at least that much or more during > FE assembly on that cell partition? > > I am just guessing about the purpose as a way to describing what they are asking for. > Matt > > >> >>> Thanks, >>> >>> Matt >>> >>> >>>> Sincerely, >>>> Swarnava >>>> >>>> On Fri, Mar 22, 2019 at 4:08 PM Mark Adams wrote: >>>> >>>>> Matt, >>>>> I think they want a vertex partitioning. They may have elements on the >>>>> unstructured mesh that intersect with any number of processor domains on >>>>> the structured mesh. But the unstructured mesh vertices are in the >>>>> structured mesh set of vertices. They want the partition of the >>>>> unstructured mesh vertices (ie, matrices) to be slaved to the partitioning >>>>> of the structured mesh. >>>>> Do I have that right Swarnava? 
>>>>> Mark >>>>> >>>>> On Fri, Mar 22, 2019 at 6:56 PM Matthew Knepley via petsc-users < >>>>> petsc-users at mcs.anl.gov> wrote: >>>>> >>>>>> On Thu, Mar 21, 2019 at 8:20 PM Swarnava Ghosh via petsc-users < >>>>>> petsc-users at mcs.anl.gov> wrote: >>>>>> >>>>>>> Dear PETSc users and developers, >>>>>>> >>>>>>> I am new to DMPLEX and had a query regarding setting up a consistent >>>>>>> domain decomposition of two meshes in PETSc. >>>>>>> I have a structured finite difference grid, managed through DMDA. I >>>>>>> have another unstructured finite element mesh managed through DMPLEX. Now >>>>>>> all the nodes in the unstructured finite element mesh also belong to the >>>>>>> set of nodes in the structured finite difference mesh (but not necessarily >>>>>>> vice-versa), and the number of nodes in DMPLEX mesh is less than the number >>>>>>> of nodes in DMDA mesh. How can I guarantee a consistent domain >>>>>>> decomposition of the two meshes? By consistent, I mean that if a process >>>>>>> has a set of nodes P from DMDA, and the same process has the set of nodes Q >>>>>>> from DMPLEX, then Q is a subset of P. >>>>>>> >>>>>> >>>>>> Okay, this is not hard. DMPlexDistribute() basically distributes >>>>>> according to a cell partition. You can use PetscPartitionerShell() to stick >>>>>> in whatever cell partition you want. You can see me doing this here: >>>>>> >>>>>> >>>>>> https://bitbucket.org/petsc/petsc/src/e2aefa968a094f48dc384fffc7d599a60aeeb591/src/dm/impls/plex/examples/tests/ex1.c#lines-261 >>>>>> >>>>>> Will that work for you? >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> I look forward to your response. >>>>>>> >>>>>>> Sincerely, >>>>>>> Swarnava >>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL:
From ncreati at inogs.it Thu Mar 28 04:17:53 2019 From: ncreati at inogs.it (Nicola Creati) Date: Thu, 28 Mar 2019 10:17:53 +0100 Subject: [petsc-users] petsc4py - Convert c example code to Python Message-ID: Hello, I'm trying to write in Python one of the TS PETSc tutorial example:
https://www.mcs.anl.gov/petsc/petsc-current/src/ts/examples/tutorials/ex13.c.html
I'm not able to fill the Jacobian matrix in the right way in Python. Maybe there are some other conversion errors. Might someone help, please? This is my code:

# Example 13 petsc TS
import sys, petsc4py
petsc4py.init(sys.argv)

from petsc4py import PETSc
from mpi4py import MPI
import numpy as np
import math

def RHS_func(ts, t, X, xdot, F):
    da = ts.getDM()

    mx, my = da.getSizes()

    hx, hy = [1.0/(m-1) for m in [mx, my]]
    sx = 1.0/(hx*hx)
    sy = 1.0/(hy*hy)

    x = X.getArray(readonly=1).reshape(8, 8, order='C')
    f = F.getArray(readonly=0).reshape(8, 8, order='C')

    (xs, xm), (ys, ym) = da.getRanges()
    aa = np.zeros((8,8))
    for j in range(ys, ym):
        for i in range(xs, xm):
            if i == 0 or j == 0 or i == (mx-1) or j == (my-1):
                f[i,j] = x[i,j]
                continue
            u = x[i,j]
            uxx = (-2.0 * u + x[i, j-1] + x[i, j+1]) * sx
            uyy = (-2.0 * u + x[i-1, j] + x[i+1, j])* sy
            f[i, j] = uxx + uyy
    F.assemble()

def Jacobian_func(ts, t, x, xdot, a, A, B):
    mx, my = da.getSizes()
    hx = 1.0/(mx-1)
    hy = 1.0/(my-1)
    sx = 1.0/(hx*hx)
    sy = 1.0/(hy*hy)

    B.zeroEntries()
    (i0, i1), (j0, j1) = da.getRanges()
    row = PETSc.Mat.Stencil()
    col = PETSc.Mat.Stencil()

    for i in range(j0, j1):
        for j in range(i0, i1):
            row.index = (i,j)
            row.field = 0
            if i == 0 or j== 0 or i==(my-1) or j==(mx-1):
                B.setValueStencil(row, row, 1.0)
            else:
                for index, value in [
                    ((i-1, j),   sx),
                    ((i+1, j),   sx),
                    ((i,   j-1), sy),
                    ((i-1, j+1), sy),
                    ((i,   j),  -2*sx - 2*sy)]:
                    col.index = index
                    col.field = 0
                    B.setValueStencil(row, col, value)

    B.assemble()
    if A != B: B.assemble()
    return PETSc.Mat.Structure.SAME_NONZERO_PATTERN

def make_initial_solution(da, U, c):
    mx, my = da.getSizes()
    hx = 1.0/(mx-1)
    hy = 1.0/(my-1)
    (xs, xm), (ys, ym) = da.getRanges()

    u = U.getArray(readonly=0).reshape(8, 8, order='C')

    for j in range(ys, ym):
        y = j*hy
        for i in range(xs, xm):
            x = i*hx
            r = math.sqrt((x-0.5)**2+(y-0.5)**2)
            if r < (0.125):
                u[i, j] = math.exp(c*r*r*r)
            else:
                u[i, j] = 0.0
    U.assemble()

def monitor(ts, i, t, x):
    xx = x[:].tolist()
    history.append((i, t, xx))

history = []
nx = 8
ny = 8
da = PETSc.DMDA().create([nx, ny], stencil_type= PETSc.DA.StencilType.STAR)
da.setFromOptions()
da.setUp()

u = da.createGlobalVec()
f = u.duplicate()

c = -30.0

ts = PETSc.TS().create()
ts.setDM(da)
ts.setType(ts.Type.BEULER)

ts.setIFunction(RHS_func, f)
ts.setIJacobian(Jacobian_func)

ftime = 1.0
ts.setMaxTime(ftime)
ts.setExactFinalTime(PETSc.TS.ExactFinalTime.STEPOVER)

make_initial_solution(da, u, c)
dt = 0.01
ts.setMonitor(monitor)
ts.setTimeStep(dt)
ts.setFromOptions()
ts.solve(u)

ftime = ts.getSolveTime()
steps = ts.getStepNumber()

Thanks. Nicola -- Nicola Creati Istituto Nazionale di Oceanografia e di Geofisica Sperimentale - OGS www.inogs.it Dipartimento di Geofisica della Litosfera Geophysics of Lithosphere Department CARS (Cartography and Remote Sensing) Research Group http://www.inogs.it/Cars/ Borgo Grotta Gigante 42/c 34010 Sgonico - Trieste - ITALY ncreati at ogs.trieste.it off. +39 040 2140 213 fax. +39 040 327307 _____________________________________________________________________ This communication, that may contain confidential and/or legally privileged information, is intended solely for the use of the intended addressees. Opinions, conclusions and other information contained in this message, that do not relate to the official business of OGS, shall be considered as not given or endorsed by it. Every opinion or advice contained in this communication is subject to the terms and conditions provided by the agreement governing the engagement with such a client.
Any use, disclosure, copying or distribution of the contents of this communication by a not-intended recipient or in violation of the purposes of this communication is strictly prohibited and may be unlawful. For Italy only: Ai sensi del D.Lgs.196/2003 - "T.U. sulla Privacy" si precisa che le informazioni contenute in questo messaggio sono riservate ed a uso esclusivo del destinatario. From swarnava89 at gmail.com Thu Mar 28 18:23:46 2019 From: swarnava89 at gmail.com (Swarnava Ghosh) Date: Thu, 28 Mar 2019 16:23:46 -0700 Subject: [petsc-users] Consistent domain decomposition between DMDA and DMPLEX In-Reply-To: References: Message-ID: Hi Mark and Matt, I calculate my unknown fields at the nodes of my coarse unstructured mesh, and then project the solution at some of the fine structured mesh nodes. The only global matrix I form is on the unstructured coarse mesh to do a Poisson solve. Sincerely, Swarnava On Thu, Mar 28, 2019 at 5:56 AM Mark Adams wrote: > >> >> That seems like a bad tradeoff. You avoid one communication during >> injection for at least that much or more during >> FE assembly on that cell partition? >> >> > I am just guessing about the purpose as a way to describing what they are > asking for. > > >> Matt >> >> >>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Sincerely, >>>>> Swarnava >>>>> >>>>> On Fri, Mar 22, 2019 at 4:08 PM Mark Adams wrote: >>>>> >>>>>> Matt, >>>>>> I think they want a vertex partitioning. They may have elements on >>>>>> the unstructured mesh that intersect with any number of processor domains >>>>>> on the structured mesh. But the unstructured mesh vertices are in the >>>>>> structured mesh set of vertices. They want the partition of the >>>>>> unstructured mesh vertices (ie, matrices) to be slaved to the partitioning >>>>>> of the structured mesh. >>>>>> Do I have that right Swarnava? >>>>>> Mark >>>>>> >>>>>> On Fri, Mar 22, 2019 at 6:56 PM Matthew Knepley via petsc-users < >>>>>> petsc-users at mcs.anl.gov> wrote: >>>>>> >>>>>>> On Thu, Mar 21, 2019 at 8:20 PM Swarnava Ghosh via petsc-users < >>>>>>> petsc-users at mcs.anl.gov> wrote: >>>>>>> >>>>>>>> Dear PETSc users and developers, >>>>>>>> >>>>>>>> I am new to DMPLEX and had a query regarding setting up a >>>>>>>> consistent domain decomposition of two meshes in PETSc. >>>>>>>> I have a structured finite difference grid, managed through DMDA. I >>>>>>>> have another unstructured finite element mesh managed through DMPLEX. Now >>>>>>>> all the nodes in the unstructured finite element mesh also belong to the >>>>>>>> set of nodes in the structured finite difference mesh (but not necessarily >>>>>>>> vice-versa), and the number of nodes in DMPLEX mesh is less than the number >>>>>>>> of nodes in DMDA mesh. How can I guarantee a consistent domain >>>>>>>> decomposition of the two meshes? By consistent, I mean that if a process >>>>>>>> has a set of nodes P from DMDA, and the same process has the set of nodes Q >>>>>>>> from DMPLEX, then Q is a subset of P. >>>>>>>> >>>>>>> >>>>>>> Okay, this is not hard. DMPlexDistribute() basically distributes >>>>>>> according to a cell partition. You can use PetscPartitionerShell() to stick >>>>>>> in whatever cell partition you want. You can see me doing this here: >>>>>>> >>>>>>> >>>>>>> https://bitbucket.org/petsc/petsc/src/e2aefa968a094f48dc384fffc7d599a60aeeb591/src/dm/impls/plex/examples/tests/ex1.c#lines-261 >>>>>>> >>>>>>> Will that work for you? >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> I look forward to your response. 
>>>>>>>> >>>>>>>> Sincerely, >>>>>>>> Swarnava >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Mar 28 19:09:10 2019 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 28 Mar 2019 20:09:10 -0400 Subject: [petsc-users] Consistent domain decomposition between DMDA and DMPLEX In-Reply-To: References: Message-ID: On Thu, Mar 28, 2019 at 7:24 PM Swarnava Ghosh wrote: > Hi Mark and Matt, > > I calculate my unknown fields at the nodes of my coarse unstructured mesh, > and then project the solution at some of the fine structured mesh nodes. > The only global matrix I form is on the unstructured coarse mesh to do a > Poisson solve. > Is this a finite element Poisson solve on your coarse unstructured mesh? Thanks, Matt > Sincerely, > Swarnava > > On Thu, Mar 28, 2019 at 5:56 AM Mark Adams wrote: > >> >>> >>> That seems like a bad tradeoff. You avoid one communication during >>> injection for at least that much or more during >>> FE assembly on that cell partition? >>> >>> >> I am just guessing about the purpose as a way to describing what they are >> asking for. >> >> >>> Matt >>> >>> >>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Sincerely, >>>>>> Swarnava >>>>>> >>>>>> On Fri, Mar 22, 2019 at 4:08 PM Mark Adams wrote: >>>>>> >>>>>>> Matt, >>>>>>> I think they want a vertex partitioning. They may have elements on >>>>>>> the unstructured mesh that intersect with any number of processor domains >>>>>>> on the structured mesh. But the unstructured mesh vertices are in the >>>>>>> structured mesh set of vertices. They want the partition of the >>>>>>> unstructured mesh vertices (ie, matrices) to be slaved to the partitioning >>>>>>> of the structured mesh. >>>>>>> Do I have that right Swarnava? >>>>>>> Mark >>>>>>> >>>>>>> On Fri, Mar 22, 2019 at 6:56 PM Matthew Knepley via petsc-users < >>>>>>> petsc-users at mcs.anl.gov> wrote: >>>>>>> >>>>>>>> On Thu, Mar 21, 2019 at 8:20 PM Swarnava Ghosh via petsc-users < >>>>>>>> petsc-users at mcs.anl.gov> wrote: >>>>>>>> >>>>>>>>> Dear PETSc users and developers, >>>>>>>>> >>>>>>>>> I am new to DMPLEX and had a query regarding setting up a >>>>>>>>> consistent domain decomposition of two meshes in PETSc. >>>>>>>>> I have a structured finite difference grid, managed through DMDA. >>>>>>>>> I have another unstructured finite element mesh managed through DMPLEX. Now >>>>>>>>> all the nodes in the unstructured finite element mesh also belong to the >>>>>>>>> set of nodes in the structured finite difference mesh (but not necessarily >>>>>>>>> vice-versa), and the number of nodes in DMPLEX mesh is less than the number >>>>>>>>> of nodes in DMDA mesh. 
How can I guarantee a consistent domain >>>>>>>>> decomposition of the two meshes? By consistent, I mean that if a process >>>>>>>>> has a set of nodes P from DMDA, and the same process has the set of nodes Q >>>>>>>>> from DMPLEX, then Q is a subset of P. >>>>>>>>> >>>>>>>> >>>>>>>> Okay, this is not hard. DMPlexDistribute() basically distributes >>>>>>>> according to a cell partition. You can use PetscPartitionerShell() to stick >>>>>>>> in whatever cell partition you want. You can see me doing this here: >>>>>>>> >>>>>>>> >>>>>>>> https://bitbucket.org/petsc/petsc/src/e2aefa968a094f48dc384fffc7d599a60aeeb591/src/dm/impls/plex/examples/tests/ex1.c#lines-261 >>>>>>>> >>>>>>>> Will that work for you? >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> >>>>>>>>> I look forward to your response. >>>>>>>>> >>>>>>>>> Sincerely, >>>>>>>>> Swarnava >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>> >>>>>>>> >>>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From swarnava89 at gmail.com Thu Mar 28 19:50:16 2019 From: swarnava89 at gmail.com (Swarnava Ghosh) Date: Thu, 28 Mar 2019 17:50:16 -0700 Subject: [petsc-users] Consistent domain decomposition between DMDA and DMPLEX In-Reply-To: References: Message-ID: "Is this a finite element Poisson solve on your coarse unstructured mesh?" > Yes, this is a finite element Poisson solve. Sincerely, Swarnava On Thu, Mar 28, 2019 at 5:09 PM Matthew Knepley wrote: > On Thu, Mar 28, 2019 at 7:24 PM Swarnava Ghosh > wrote: > >> Hi Mark and Matt, >> >> I calculate my unknown fields at the nodes of my coarse unstructured >> mesh, and then project the solution at some of the fine structured mesh >> nodes. >> The only global matrix I form is on the unstructured coarse mesh to do a >> Poisson solve. >> > > Is this a finite element Poisson solve on your coarse unstructured mesh? > > Thanks, > > Matt > > >> Sincerely, >> Swarnava >> >> On Thu, Mar 28, 2019 at 5:56 AM Mark Adams wrote: >> >>> >>>> >>>> That seems like a bad tradeoff. You avoid one communication during >>>> injection for at least that much or more during >>>> FE assembly on that cell partition? >>>> >>>> >>> I am just guessing about the purpose as a way to describing what they >>> are asking for. 
>>> >>> >>>> Matt >>>> >>>> >>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Sincerely, >>>>>>> Swarnava >>>>>>> >>>>>>> On Fri, Mar 22, 2019 at 4:08 PM Mark Adams wrote: >>>>>>> >>>>>>>> Matt, >>>>>>>> I think they want a vertex partitioning. They may have elements on >>>>>>>> the unstructured mesh that intersect with any number of processor domains >>>>>>>> on the structured mesh. But the unstructured mesh vertices are in the >>>>>>>> structured mesh set of vertices. They want the partition of the >>>>>>>> unstructured mesh vertices (ie, matrices) to be slaved to the partitioning >>>>>>>> of the structured mesh. >>>>>>>> Do I have that right Swarnava? >>>>>>>> Mark >>>>>>>> >>>>>>>> On Fri, Mar 22, 2019 at 6:56 PM Matthew Knepley via petsc-users < >>>>>>>> petsc-users at mcs.anl.gov> wrote: >>>>>>>> >>>>>>>>> On Thu, Mar 21, 2019 at 8:20 PM Swarnava Ghosh via petsc-users < >>>>>>>>> petsc-users at mcs.anl.gov> wrote: >>>>>>>>> >>>>>>>>>> Dear PETSc users and developers, >>>>>>>>>> >>>>>>>>>> I am new to DMPLEX and had a query regarding setting up a >>>>>>>>>> consistent domain decomposition of two meshes in PETSc. >>>>>>>>>> I have a structured finite difference grid, managed through DMDA. >>>>>>>>>> I have another unstructured finite element mesh managed through DMPLEX. Now >>>>>>>>>> all the nodes in the unstructured finite element mesh also belong to the >>>>>>>>>> set of nodes in the structured finite difference mesh (but not necessarily >>>>>>>>>> vice-versa), and the number of nodes in DMPLEX mesh is less than the number >>>>>>>>>> of nodes in DMDA mesh. How can I guarantee a consistent domain >>>>>>>>>> decomposition of the two meshes? By consistent, I mean that if a process >>>>>>>>>> has a set of nodes P from DMDA, and the same process has the set of nodes Q >>>>>>>>>> from DMPLEX, then Q is a subset of P. >>>>>>>>>> >>>>>>>>> >>>>>>>>> Okay, this is not hard. DMPlexDistribute() basically distributes >>>>>>>>> according to a cell partition. You can use PetscPartitionerShell() to stick >>>>>>>>> in whatever cell partition you want. You can see me doing this here: >>>>>>>>> >>>>>>>>> >>>>>>>>> https://bitbucket.org/petsc/petsc/src/e2aefa968a094f48dc384fffc7d599a60aeeb591/src/dm/impls/plex/examples/tests/ex1.c#lines-261 >>>>>>>>> >>>>>>>>> Will that work for you? >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>>> >>>>>>>>>> I look forward to your response. >>>>>>>>>> >>>>>>>>>> Sincerely, >>>>>>>>>> Swarnava >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>> experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. 
>>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Mar 28 21:08:21 2019 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 28 Mar 2019 22:08:21 -0400 Subject: [petsc-users] Consistent domain decomposition between DMDA and DMPLEX In-Reply-To: References: Message-ID: On Thu, Mar 28, 2019 at 8:51 PM Swarnava Ghosh wrote: > "Is this a finite element Poisson solve on your coarse unstructured mesh?" > > Yes, this is a finite element Poisson solve. > You know the vertex division from DMDA. You can easily make a cell partition by assigning a cell to the process of its first vertex (or any vertex really). You can do this by using a PlexPartitionerShell as in the example I linked. However, you also want to force the vertex partition afterwards, so you can copy values locally. In DMPlexCreatePointSF(), we decide who will own the parts of the boundary. Normally, everyone votes for the points that they have, and the highest rank process wins (we also have a randomized version). However, here you can just substitute a function that only votes for a vertex if that vertex is in its DMDA partition. Would not be more than an hour of coding/testing I think. Could you guys make a small example? Thanks, Matt Sincerely, > Swarnava > > On Thu, Mar 28, 2019 at 5:09 PM Matthew Knepley wrote: > >> On Thu, Mar 28, 2019 at 7:24 PM Swarnava Ghosh >> wrote: >> >>> Hi Mark and Matt, >>> >>> I calculate my unknown fields at the nodes of my coarse unstructured >>> mesh, and then project the solution at some of the fine structured mesh >>> nodes. >>> The only global matrix I form is on the unstructured coarse mesh to do a >>> Poisson solve. >>> >> >> Is this a finite element Poisson solve on your coarse unstructured mesh? >> >> Thanks, >> >> Matt >> >> >>> Sincerely, >>> Swarnava >>> >>> On Thu, Mar 28, 2019 at 5:56 AM Mark Adams wrote: >>> >>>> >>>>> >>>>> That seems like a bad tradeoff. You avoid one communication during >>>>> injection for at least that much or more during >>>>> FE assembly on that cell partition? >>>>> >>>>> >>>> I am just guessing about the purpose as a way to describing what they >>>> are asking for. >>>> >>>> >>>>> Matt >>>>> >>>>> >>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> Sincerely, >>>>>>>> Swarnava >>>>>>>> >>>>>>>> On Fri, Mar 22, 2019 at 4:08 PM Mark Adams wrote: >>>>>>>> >>>>>>>>> Matt, >>>>>>>>> I think they want a vertex partitioning. They may have elements on >>>>>>>>> the unstructured mesh that intersect with any number of processor domains >>>>>>>>> on the structured mesh. But the unstructured mesh vertices are in the >>>>>>>>> structured mesh set of vertices. They want the partition of the >>>>>>>>> unstructured mesh vertices (ie, matrices) to be slaved to the partitioning >>>>>>>>> of the structured mesh. >>>>>>>>> Do I have that right Swarnava? 
>>>>>>>>> Mark >>>>>>>>> >>>>>>>>> On Fri, Mar 22, 2019 at 6:56 PM Matthew Knepley via petsc-users < >>>>>>>>> petsc-users at mcs.anl.gov> wrote: >>>>>>>>> >>>>>>>>>> On Thu, Mar 21, 2019 at 8:20 PM Swarnava Ghosh via petsc-users < >>>>>>>>>> petsc-users at mcs.anl.gov> wrote: >>>>>>>>>> >>>>>>>>>>> Dear PETSc users and developers, >>>>>>>>>>> >>>>>>>>>>> I am new to DMPLEX and had a query regarding setting up a >>>>>>>>>>> consistent domain decomposition of two meshes in PETSc. >>>>>>>>>>> I have a structured finite difference grid, managed through >>>>>>>>>>> DMDA. I have another unstructured finite element mesh managed through >>>>>>>>>>> DMPLEX. Now all the nodes in the unstructured finite element mesh also >>>>>>>>>>> belong to the set of nodes in the structured finite difference mesh (but >>>>>>>>>>> not necessarily vice-versa), and the number of nodes in DMPLEX mesh is less >>>>>>>>>>> than the number of nodes in DMDA mesh. How can I guarantee a consistent >>>>>>>>>>> domain decomposition of the two meshes? By consistent, I mean that if a >>>>>>>>>>> process has a set of nodes P from DMDA, and the same process has the set of >>>>>>>>>>> nodes Q from DMPLEX, then Q is a subset of P. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Okay, this is not hard. DMPlexDistribute() basically distributes >>>>>>>>>> according to a cell partition. You can use PetscPartitionerShell() to stick >>>>>>>>>> in whatever cell partition you want. You can see me doing this here: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> https://bitbucket.org/petsc/petsc/src/e2aefa968a094f48dc384fffc7d599a60aeeb591/src/dm/impls/plex/examples/tests/ex1.c#lines-261 >>>>>>>>>> >>>>>>>>>> Will that work for you? >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Matt >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> I look forward to your response. >>>>>>>>>>> >>>>>>>>>>> Sincerely, >>>>>>>>>>> Swarnava >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>> experiments lead. >>>>>>>>>> -- Norbert Wiener >>>>>>>>>> >>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >>>>>>> >>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cpraveen at gmail.com Fri Mar 29 09:59:26 2019 From: cpraveen at gmail.com (Praveen C) Date: Fri, 29 Mar 2019 15:59:26 +0100 Subject: [petsc-users] Unstructured grid info Message-ID: Dear all I have a rather basic query. I want to write a cell-centered code on 2d hybrid (tri+quad) unstructured grids. It will be only explicit time stepping but I want to use petsc to manage the grid, solutions variables and parallelization. Will use gmsh for meshing. Could you tell me where to look for some help/tutorials/examples for such an application ? Thanks praveen From knepley at gmail.com Fri Mar 29 10:06:33 2019 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 29 Mar 2019 11:06:33 -0400 Subject: [petsc-users] Unstructured grid info In-Reply-To: References: Message-ID: On Fri, Mar 29, 2019 at 10:59 AM Praveen C via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear all > I have a rather basic query. I want to write a cell-centered code on 2d > hybrid (tri+quad) unstructured grids. It will be only explicit time > stepping but I want to use petsc to manage the grid, solutions variables > and parallelization. Will use gmsh for meshing. Could you tell me where to > look for some help/tutorials/examples for such an application ? > We do not have any examples like this. It will likely involve some debugging, but not too much. 1) Make sure we can read your GMsh. You want the latest release (coming out on Monday) or the master branch for this, since the GMsh reader is greatly improved. 2) If you only want PETSc to layout data and manage parallelism, then all you have to do is create a PetscSection for your data layout and call DMSetSection(). Everything else should work, although as I said we have no tests of hybrid meshes like this right now. 3) Consider contributing a simple example to ts/examples/tutorials since then everything would be tested, and it would provide an easy way for me to run similar problems and fix any bugs you find. Thanks, Matt > Thanks > praveen -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Mar 29 23:31:02 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Sat, 30 Mar 2019 04:31:02 +0000 Subject: [petsc-users] PETSc 3.11 Release Message-ID: <292C8A09-56E2-4ED3-AC17-6E9BAE6AE7B8@mcs.anl.gov> We are pleased to announce the release of PETSc version 3.11 at http://www.mcs.anl.gov/petsc The major changes and updates can be found at http://www.mcs.anl.gov/petsc/documentation/changes/311.html We recommend upgrading to PETSc 3.11 soon. As always, please report problems to petsc-maint at mcs.anl.gov and ask questions at petsc-users at mcs.anl.gov This release includes contributions from Albert Cowie Alejandro Lamas Davi?a Alp Dener Andreas Fink Andreas Selinger Barry Smith Bas van 't Hof Carola Kruse Chris Golinski Dave May David Hansol Suh Fande Kong Florian Wechsung Francesco Ballarin Greg Meyer Hansol Suh Hong Zhang Hong Zhang Jakub Kruzik Jed Brown Joseph Pusztay Junchao Zhang Karl Rupp Kaushik Kulkarni Lawrence Mitchell Lisandro Dalcin Marek Pecha Marius Buerkle Mark Adams Martin Diehl Matthew G. 
Knepley Patrick Farrell Patrick Sanan Pierre Jolivet Pieter Ghysels Richard Tran Mills Satish Balay Scott Kruger Shri Abhyankar Stefano Zampini Toby Isaac Todd Munson Tristan Konolige V?clav Hapla Valeria Barra Xiang Huang Xiang (Shawn) Huang and bug reports/patches/proposed improvements received from Adrian Croucher Alp Dener Amneet Pal Bhalla Andreas Selinger Bernard ck Wong Blaise A Bourdin Sophie Blondel Boris Kaus Brian Van Straalen Carl Steefel Chris Coutinho Chris Eldred DAFNAKIS PANAGIOTIS Dave A. May Denis Davydov Devon Powell Fande Kong Febrian Setianto Glenn E Hammond Hapla Vaclav Huy Vo Toby Isaac Jaroslaw Piwonski (CAU) Jason Guo Jason Miller Jean-Luc Guermond Jiaoyan Li Jose E. Roman Josh L Juan Franco Kun Jiao Lisandro Dalcin Manuel Colera Rico Marco Tiberga Mark Adams Martin Diehl Martin J Kuehn Matthew Overholt Mauro Cacace Oana Marin Oleksandr Koshkarov Patrick Farrell Phani Motamarri Randall Mackie Richard Tran Mills Robert Nourgaliev Sajid Ali Satish Balay S?bastien Gilles Stefano Zampini "Syed, Sajid Ali" Tim Steinhoff Tobias Goerler Tristan Konolige Vendel Szeremi Victor Eijkhout William Perkins Yingjie Wu As always, thanks for your support, Barry From daniel.s.kokron at nasa.gov Sat Mar 30 15:28:25 2019 From: daniel.s.kokron at nasa.gov (Kokron, Daniel S. (ARC-606.2)[InuTeq, LLC]) Date: Sat, 30 Mar 2019 20:28:25 +0000 Subject: [petsc-users] 3.11 configure error on pleiades Message-ID: Last time I built PETSc on Pleiades it was version 3.8.3. Using the same build procedure with the same compilers and MPI libraries with 3.11 does not work. Is there a way to enable more verbose diagnostics during the configure phase so I can figure out what executable was being run and how it was compiled? PBS r147i6n10 24> ./configure --prefix=/nobackupp8/XXX /Projects/CHEM/BoA_Case/Codes-2018.3.222/binaries/petsc-3.11+ --with-debugging=0 --with-shared-libraries=1 --with-cc=mpicc --with-fc=mpif90 --with-cxx=mpicxx --with-blas-lapack-dir=$MKLROOT/lib/intel64 --with-scalapack-include=$MKLROOT/include --with-scalapack-lib="$MKLROOT/lib/intel64/libmkl_scalapack_lp64.so $MKLROOT/lib/intel64/libmkl_blacs_sgimpt_lp64.so" --with-cpp=/usr/bin/cpp --with-gnu-compilers=0 --with-vendor-compilers=intel -COPTFLAGS="-g -O3 -xCORE-AVX2 -diag-disable=cpu-dispatch" -CXXOPTFLAGS="-g -O3 -xCORE-AVX2 -diag-disable=cpu-dispatch" -FOPTFLAGS="-g -O3 -xCORE-AVX2 -diag-disable=cpu-dispatch" --with-mpi=true --with-mpi-exec=mpiexec --with-mpi-compilers=1 --with-precision=double --with-scalar-type=real --with-x=0 --with-x11=0 --with-memalign=32 I get this which usually means that an executable was linked with libmpi, but was not launched with mpiexec. TESTING: configureMPITypes from config.packages.MPI(/nobackupp8/dkokron/Projects/CHEM/BoA_Case/Codes-2018.3.222/petsc/config/BuildSystem/config/packages/MPI.py:283) ????????CMPT ERROR: mpiexec_mpt must be used to launch all MPI applications ????????CMPT ERROR: mpiexec_mpt must be used to launch all MPI applications ????????CMPT ERROR: mpiexec_mpt must be used to launch all MPI applications If I let it continue, configure reports that MPI is empty. 
make: BLAS/LAPACK: -Wl,-rpath,/nasa/intel/Compiler/2018.3.222/compilers_and_libraries_2018.3.222/linux/mkl/lib/intel64 -L/nasa/intel/Compiler/2018.3.222/compilers_and_libraries_2018.3.222/linux/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread MPI: cmake: pthread: scalapack: Daniel Kokron Redline Performance Solutions SciCon/APP group -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Sat Mar 30 15:35:33 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Sat, 30 Mar 2019 20:35:33 +0000 Subject: [petsc-users] 3.11 configure error on pleiades In-Reply-To: References: Message-ID: configure creates configure.log with all the debugging details. Its best to compare configure.log from the successful 3.8.3 with the current one - and see what changed between these 2 builds [you can send us both logs at petsc-maint] Satish On Sat, 30 Mar 2019, Kokron, Daniel S. (ARC-606.2)[InuTeq, LLC] via petsc-users wrote: > Last time I built PETSc on Pleiades it was version 3.8.3. Using the same build procedure with the same compilers and MPI libraries with 3.11 does not work. Is there a way to enable more verbose diagnostics during the configure phase so I can figure out what executable was being run and how it was compiled? > > PBS r147i6n10 24> ./configure --prefix=/nobackupp8/XXX /Projects/CHEM/BoA_Case/Codes-2018.3.222/binaries/petsc-3.11+ --with-debugging=0 --with-shared-libraries=1 --with-cc=mpicc --with-fc=mpif90 --with-cxx=mpicxx --with-blas-lapack-dir=$MKLROOT/lib/intel64 --with-scalapack-include=$MKLROOT/include --with-scalapack-lib="$MKLROOT/lib/intel64/libmkl_scalapack_lp64.so $MKLROOT/lib/intel64/libmkl_blacs_sgimpt_lp64.so" --with-cpp=/usr/bin/cpp --with-gnu-compilers=0 --with-vendor-compilers=intel -COPTFLAGS="-g -O3 -xCORE-AVX2 -diag-disable=cpu-dispatch" -CXXOPTFLAGS="-g -O3 -xCORE-AVX2 -diag-disable=cpu-dispatch" -FOPTFLAGS="-g -O3 -xCORE-AVX2 -diag-disable=cpu-dispatch" --with-mpi=true --with-mpi-exec=mpiexec --with-mpi-compilers=1 --with-precision=double --with-scalar-type=real --with-x=0 --with-x11=0 --with-memalign=32 > > I get this which usually means that an executable was linked with libmpi, but was not launched with mpiexec. > > TESTING: configureMPITypes from config.packages.MPI(/nobackupp8/dkokron/Projects/CHEM/BoA_Case/Codes-2018.3.222/petsc/config/BuildSystem/config/packages/MPI.py:283) > ????????CMPT ERROR: mpiexec_mpt must be used to launch all MPI applications > ????????CMPT ERROR: mpiexec_mpt must be used to launch all MPI applications > ????????CMPT ERROR: mpiexec_mpt must be used to launch all MPI applications > > If I let it continue, configure reports that MPI is empty. > > make: > BLAS/LAPACK: -Wl,-rpath,/nasa/intel/Compiler/2018.3.222/compilers_and_libraries_2018.3.222/linux/mkl/lib/intel64 -L/nasa/intel/Compiler/2018.3.222/compilers_and_libraries_2018.3.222/linux/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread > MPI: > cmake: > pthread: > scalapack: > > Daniel Kokron > Redline Performance Solutions > SciCon/APP group > -- > > From knepley at gmail.com Sat Mar 30 15:41:46 2019 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 30 Mar 2019 16:41:46 -0400 Subject: [petsc-users] 3.11 configure error on pleiades In-Reply-To: References: Message-ID: On Sat, Mar 30, 2019 at 4:31 PM Kokron, Daniel S. (ARC-606.2)[InuTeq, LLC] via petsc-users wrote: > Last time I built PETSc on Pleiades it was version 3.8.3. 
Using the same > build procedure with the same compilers and MPI libraries with 3.11 does > not work. Is there a way to enable more verbose diagnostics during the > configure phase so I can figure out what executable was being run and how > it was compiled? > This is not the right option: --with-mpi-exec=mpiexec it is --with-mpiexec=mpiexec Thanks, Matt PBS r147i6n10 24> ./configure --prefix=/nobackupp8/XXX > /Projects/CHEM/BoA_Case/Codes-2018.3.222/binaries/petsc-3.11+ > --with-debugging=0 --with-shared-libraries=1 --with-cc=mpicc > --with-fc=mpif90 --with-cxx=mpicxx > --with-blas-lapack-dir=$MKLROOT/lib/intel64 > --with-scalapack-include=$MKLROOT/include > --with-scalapack-lib="$MKLROOT/lib/intel64/libmkl_scalapack_lp64.so > $MKLROOT/lib/intel64/libmkl_blacs_sgimpt_lp64.so" --with-cpp=/usr/bin/cpp > --with-gnu-compilers=0 --with-vendor-compilers=intel -COPTFLAGS="-g -O3 > -xCORE-AVX2 -diag-disable=cpu-dispatch" -CXXOPTFLAGS="-g -O3 -xCORE-AVX2 > -diag-disable=cpu-dispatch" -FOPTFLAGS="-g -O3 -xCORE-AVX2 > -diag-disable=cpu-dispatch" --with-mpi=true --with-mpi-exec=mpiexec > --with-mpi-compilers=1 --with-precision=double --with-scalar-type=real > --with-x=0 --with-x11=0 --with-memalign=32 > > > > I get this which usually means that an executable was linked with libmpi, > but was not launched with mpiexec. > > > > TESTING: configureMPITypes from > config.packages.MPI(/nobackupp8/dkokron/Projects/CHEM/BoA_Case/Codes-2018.3.222/petsc/config/BuildSystem/config/packages/MPI.py:283) > > ????????CMPT ERROR: mpiexec_mpt must be used to launch all MPI applications > > ????????CMPT ERROR: mpiexec_mpt must be used to launch all MPI applications > > ????????CMPT ERROR: mpiexec_mpt must be used to launch all MPI applications > > > > If I let it continue, configure reports that MPI is empty. > > > > make: > > BLAS/LAPACK: > -Wl,-rpath,/nasa/intel/Compiler/2018.3.222/compilers_and_libraries_2018.3.222/linux/mkl/lib/intel64 > -L/nasa/intel/Compiler/2018.3.222/compilers_and_libraries_2018.3.222/linux/mkl/lib/intel64 > -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread > > MPI: > > cmake: > > pthread: > > scalapack: > > > > Daniel Kokron > Redline Performance Solutions > SciCon/APP group > > -- > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Sat Mar 30 16:03:39 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Sat, 30 Mar 2019 21:03:39 +0000 Subject: [petsc-users] 3.11 configure error on pleiades In-Reply-To: References: Message-ID: Hm - with-mpi-exec is wrong and ignored - but configure should default to 'mpiexec' from PATH anyway. It would be good to check configure.log for details on this error. Satish On Sat, 30 Mar 2019, Matthew Knepley via petsc-users wrote: > On Sat, Mar 30, 2019 at 4:31 PM Kokron, Daniel S. (ARC-606.2)[InuTeq, LLC] > via petsc-users wrote: > > > Last time I built PETSc on Pleiades it was version 3.8.3. Using the same > > build procedure with the same compilers and MPI libraries with 3.11 does > > not work. Is there a way to enable more verbose diagnostics during the > > configure phase so I can figure out what executable was being run and how > > it was compiled? 
> > > > This is not the right option: > > --with-mpi-exec=mpiexec > > it is > > --with-mpiexec=mpiexec > > Thanks, > > Matt > > PBS r147i6n10 24> ./configure --prefix=/nobackupp8/XXX > > /Projects/CHEM/BoA_Case/Codes-2018.3.222/binaries/petsc-3.11+ > > --with-debugging=0 --with-shared-libraries=1 --with-cc=mpicc > > --with-fc=mpif90 --with-cxx=mpicxx > > --with-blas-lapack-dir=$MKLROOT/lib/intel64 > > --with-scalapack-include=$MKLROOT/include > > --with-scalapack-lib="$MKLROOT/lib/intel64/libmkl_scalapack_lp64.so > > $MKLROOT/lib/intel64/libmkl_blacs_sgimpt_lp64.so" --with-cpp=/usr/bin/cpp > > --with-gnu-compilers=0 --with-vendor-compilers=intel -COPTFLAGS="-g -O3 > > -xCORE-AVX2 -diag-disable=cpu-dispatch" -CXXOPTFLAGS="-g -O3 -xCORE-AVX2 > > -diag-disable=cpu-dispatch" -FOPTFLAGS="-g -O3 -xCORE-AVX2 > > -diag-disable=cpu-dispatch" --with-mpi=true --with-mpi-exec=mpiexec > > --with-mpi-compilers=1 --with-precision=double --with-scalar-type=real > > --with-x=0 --with-x11=0 --with-memalign=32 > > > > > > > > I get this which usually means that an executable was linked with libmpi, > > but was not launched with mpiexec. > > > > > > > > TESTING: configureMPITypes from > > config.packages.MPI(/nobackupp8/dkokron/Projects/CHEM/BoA_Case/Codes-2018.3.222/petsc/config/BuildSystem/config/packages/MPI.py:283) > > > > ????????CMPT ERROR: mpiexec_mpt must be used to launch all MPI applications > > > > ????????CMPT ERROR: mpiexec_mpt must be used to launch all MPI applications > > > > ????????CMPT ERROR: mpiexec_mpt must be used to launch all MPI applications > > > > > > > > If I let it continue, configure reports that MPI is empty. > > > > > > > > make: > > > > BLAS/LAPACK: > > -Wl,-rpath,/nasa/intel/Compiler/2018.3.222/compilers_and_libraries_2018.3.222/linux/mkl/lib/intel64 > > -L/nasa/intel/Compiler/2018.3.222/compilers_and_libraries_2018.3.222/linux/mkl/lib/intel64 > > -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread > > > > MPI: > > > > cmake: > > > > pthread: > > > > scalapack: > > > > > > > > Daniel Kokron > > Redline Performance Solutions > > SciCon/APP group > > > > -- > > > > > > > > > From bsmith at mcs.anl.gov Sat Mar 30 17:17:33 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Sat, 30 Mar 2019 22:17:33 +0000 Subject: [petsc-users] petsc4py - Convert c example code to Python In-Reply-To: References: Message-ID: <68C4BED4-5AA2-4DD2-A28D-491EC05DD9F6@anl.gov> Nicola, What goes wrong with your code? Does it crash? Not converge in the same way as the C code? Produce wrong answers? If you think specifically the Jacobian is wrong you can run both codes with -mat_view to see any differences in the Jacobian For the computation of the right hand side you can call VecView() and compare it to the same output from the C code. I would debug by running each code next to each other for a very small problem looking at intermediate values computed and try to find the first location where the two are producing different results. Barry > On Mar 28, 2019, at 4:17 AM, Nicola Creati via petsc-users wrote: > > Hello, I'm trying to write in Python one of the TS PETSc tutorial example: > > https://www.mcs.anl.gov/petsc/petsc-current/src/ts/examples/tutorials/ex13.c.html > > I'm not able to fill the Jacobian matrix in the right way in Python. Maybe there are some other conversion errors. > Might someone help, please? 
> > This is my code: > > # Example 13 petsc TS > import sys, petsc4py > petsc4py.init(sys.argv) > > from petsc4py import PETSc > from mpi4py import MPI > import numpy as np > import math > > def RHS_func(ts, t, X, xdot, F): > da = ts.getDM() > > mx, my = da.getSizes() > > hx, hy = [1.0/(m-1) for m in [mx, my]] > sx = 1.0/(hx*hx) > sy = 1.0/(hy*hy) > > x = X.getArray(readonly=1).reshape(8, 8, order='C') > f = F.getArray(readonly=0).reshape(8, 8, order='C') > > (xs, xm), (ys, ym) = da.getRanges() > aa = np.zeros((8,8)) > for j in range(ys, ym): > for i in range(xs, xm): > if i == 0 or j == 0 or i == (mx-1) or j == (my-1): > f[i,j] = x[i,j] > continue > u = x[i,j] > uxx = (-2.0 * u + x[i, j-1] + x[i, j+1]) * sx > uyy = (-2.0 * u + x[i-1, j] + x[i+1, j])* sy > f[i, j] = uxx + uyy > F.assemble() > > def Jacobian_func(ts, t, x, xdot, a, A, B): > mx, my = da.getSizes() > hx = 1.0/(mx-1) > hy = 1.0/(my-1) > sx = 1.0/(hx*hx) > sy = 1.0/(hy*hy) > > B.zeroEntries() > (i0, i1), (j0, j1) = da.getRanges() > row = PETSc.Mat.Stencil() > col = PETSc.Mat.Stencil() > > for i in range(j0, j1): > for j in range(i0, i1): > row.index = (i,j) > row.field = 0 > if i == 0 or j== 0 or i==(my-1) or j==(mx-1): > B.setValueStencil(row, row, 1.0) > else: > for index, value in [ > ((i-1, j), sx), > ((i+1, j), sx), > ((i, j-1), sy), > ((i-1, j+1), sy), > ((i, j), -2*sx - 2*sy)]: > col.index = index > col.field = 0 > B.setValueStencil(row, col, value) > > B.assemble() > if A != B: B.assemble() > return PETSc.Mat.Structure.SAME_NONZERO_PATTERN > > def make_initial_solution(da, U, c): > mx, my = da.getSizes() > hx = 1.0/(mx-1) > hy = 1.0/(my-1) > (xs, xm), (ys, ym) = da.getRanges() > > u = U.getArray(readonly=0).reshape(8, 8, order='C') > > for j in range(ys, ym): > y = j*hy > for i in range(xs, xm): > x = i*hx > r = math.sqrt((x-0.5)**2+(y-0.5)**2) > if r < (0.125): > u[i, j] = math.exp(c*r*r*r) > else: > u[i, j] = 0.0 > U.assemble() > > def monitor(ts, i, t, x): > xx = x[:].tolist() > history.append((i, t, xx)) > > history = [] > nx = 8 > ny = 8 > da = PETSc.DMDA().create([nx, ny], stencil_type= PETSc.DA.StencilType.STAR) > da.setFromOptions() > da.setUp() > > u = da.createGlobalVec() > f = u.duplicate() > > c = -30.0 > > ts = PETSc.TS().create() > ts.setDM(da) > ts.setType(ts.Type.BEULER) > > ts.setIFunction(RHS_func, f) > ts.setIJacobian(Jacobian_func) > > ftime = 1.0 > ts.setMaxTime(ftime) > ts.setExactFinalTime(PETSc.TS.ExactFinalTime.STEPOVER) > > make_initial_solution(da, u, c) > dt = 0.01 > ts.setMonitor(monitor) > ts.setTimeStep(dt) > ts.setFromOptions() > ts.solve(u) > > ftime = ts.getSolveTime() > steps = ts.getStepNumber() > > Thanks. > > Nicola > > -- > > Nicola Creati > Istituto Nazionale di Oceanografia e di Geofisica Sperimentale - OGS www.inogs.it > Dipartimento di Geofisica della Litosfera Geophysics of Lithosphere Department > CARS (Cartography and Remote Sensing) Research Group http://www.inogs.it/Cars/ > Borgo Grotta Gigante 42/c 34010 Sgonico - Trieste - ITALY ncreati at ogs.trieste.it > off. +39 040 2140 213 > fax. +39 040 327307 > > _____________________________________________________________________ > This communication, that may contain confidential and/or legally privileged information, is intended solely for the use of the intended addressees. Opinions, conclusions and other information contained in this message, that do not relate to the official business of OGS, shall be considered as not given or endorsed by it. 
Every opinion or advice contained in this communication is subject to the terms and conditions provided by the agreement governing the engagement with such a client. Any use, disclosure, copying or distribution of the contents of this communication by a not-intended recipient or in violation of the purposes of this communication is strictly prohibited and may be unlawful. For Italy only: Ai sensi del D.Lgs.196/2003 - "T.U. sulla Privacy" si precisa che le informazioni contenute in questo messaggio sono riservate ed a uso esclusivo del destinatario. >
From bsmith at mcs.anl.gov Sat Mar 30 17:46:45 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Sat, 30 Mar 2019 22:46:45 +0000 Subject: [petsc-users] Local and global size of IS In-Reply-To: References: Message-ID: <202187C0-0CD5-46F6-BE73-4F48243BE119@anl.gov>

> On Mar 27, 2019, at 7:33 AM, Eda Oktay via petsc-users wrote: > > Hello, > > I am trying to permute a matrix A(of size 2n*2n) according to a sorted eigenvector vr (of size 2n) in parallel using 2 processors (processor number can change).

I don't understand your rationale for wanting to permute the matrix rows/columns based on a sorted eigenvector?

> However, I get an error in MatPermute line stating that arguments are out of range and a new nonzero caused a malloc even if I used MatSetOption. > > I discovered that this may be because of the unequality of local sizes of is (and newIS) and local size of A. > > Since I allocate index set idx according to size of the vector vr, global size of is becomes 2n and the local size is also 2n (I think it should be n since both A and vr has local sizes n because of the number of processors). If I change the size of idx, then because of VecGetArrayRead, I think is is created wrongly. > > So, how can I make both global and local sizes of is,newIS and A? > > Below, you can see the part of my program. > > Thanks, > > Eda > > ierr = VecGetSize(vr,&siz);CHKERRQ(ierr);
> ierr = PetscMalloc1(siz,&idx);CHKERRQ(ierr);
> for (i=0; i<siz; i++) idx[i] = i;
> ierr = VecGetArrayRead(vr,&avr);CHKERRQ(ierr);

The sort routine is sequential; it knows nothing about parallelism so each process is just sorting its part. Hence this code won't work as expected

> ierr = PetscSortRealWithPermutation(siz,avr,idx);CHKERRQ(ierr);

If you need to sort the parallel vector and get the entire permutation then you need to do the following

1) communicate the entire vector to each process, you can use VecScatterCreateToAll() for this
2) call VecGetArray on the new vector (that contains all entries on each process)
3) Call PetscSortRealWithPermutation() on the entire vector on each process
4) select out a piece of the resulting indices idx on each process; for example with two processes I think rank = 0 would get the first half of the idx and rank = 1 would get the second half.

> > ierr = ISCreateGeneral(PETSC_COMM_SELF,siz,idx,PETSC_COPY_VALUES,&is);CHKERRQ(ierr);
> ierr = ISSetPermutation(is);CHKERRQ(ierr);

You don't need to duplicate the is, just pass it in twice.

> ierr = ISDuplicate(is,&newIS);CHKERRQ(ierr);

You should definitely not need the next line. The A matrix is untouched by the subroutine call so you don't need to change its allocation options

> MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE);CHKERRQ(ierr);
> ierr = MatPermute(A,is,newIS,&PL);CHKERRQ(ierr);
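Putting the four numbered steps above together, a minimal untested sketch in C of what Barry describes could look like the following. It assumes a real-scalar PETSc build (so the entries of vr can be handed directly to PetscSortRealWithPermutation()) and that A and vr share the same row distribution; the names ctx, allvr, rstart and rend are introduced only for this illustration, while A, vr, idx, is and PL are reused from the fragment above.

  VecScatter        ctx;
  Vec               allvr;                 /* sequential copy of vr, one per process */
  const PetscScalar *avr;
  PetscInt          siz,rstart,rend,i,*idx;
  IS                is;
  Mat               PL;

  /* 1) communicate the entire vector to each process */
  ierr = VecScatterCreateToAll(vr,&ctx,&allvr);CHKERRQ(ierr);
  ierr = VecScatterBegin(ctx,vr,allvr,INSERT_VALUES,SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterEnd(ctx,vr,allvr,INSERT_VALUES,SCATTER_FORWARD);CHKERRQ(ierr);

  /* 2) get the full array and 3) compute the same global permutation on every process */
  ierr = VecGetSize(vr,&siz);CHKERRQ(ierr);
  ierr = PetscMalloc1(siz,&idx);CHKERRQ(ierr);
  for (i=0; i<siz; i++) idx[i] = i;
  ierr = VecGetArrayRead(allvr,&avr);CHKERRQ(ierr);
  ierr = PetscSortRealWithPermutation(siz,avr,idx);CHKERRQ(ierr);
  ierr = VecRestoreArrayRead(allvr,&avr);CHKERRQ(ierr);

  /* 4) each process keeps only the slice of idx that matches its rows of A */
  ierr = MatGetOwnershipRange(A,&rstart,&rend);CHKERRQ(ierr);
  ierr = ISCreateGeneral(PETSC_COMM_WORLD,rend-rstart,idx+rstart,PETSC_COPY_VALUES,&is);CHKERRQ(ierr);
  ierr = ISSetPermutation(is);CHKERRQ(ierr);
  ierr = MatPermute(A,is,is,&PL);CHKERRQ(ierr);   /* same IS passed twice, as suggested above */

  ierr = ISDestroy(&is);CHKERRQ(ierr);
  ierr = PetscFree(idx);CHKERRQ(ierr);
  ierr = VecDestroy(&allvr);CHKERRQ(ierr);
  ierr = VecScatterDestroy(&ctx);CHKERRQ(ierr);

The point of the scatter is that every process sorts an identical full copy of vr, so all processes agree on one global permutation; each process then hands MatPermute() only the slice of idx corresponding to its locally owned rows, which keeps the local sizes of the IS and of A consistent.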